PLC server redundancy

Redundancy

Redundancy is intended to prevent outages and consequential damage in plants when components fail.

Basics for Redundancy

Ideally, every component is redundant, including the SCADA system with its staff and the machine itself. For cost reasons, machines, teams and SCADA systems are seldom made redundant. The ideal is two complete systems, placed at different locations for safety reasons.

Once cost is taken into account, very extensive redundancy is rarely economically viable. The few exceptions where far-reaching redundancy is used include nuclear power plants and railway signalling.

It makes sense to make redundant those components that fail most often or whose failure causes the highest costs. These are network cables, switches, and sometimes the network adapters in the PC and the controller. The main reason is the wide physical extent of networks. Many kinds of disturbance threaten these components: construction work, overload, aging contacts and animals damaging the cables.
Wireless networks cannot be made truly redundant. If the "air cable" is disturbed, there is no second air to fall back on. Multiple senders and receivers rarely help. At most, they reduce the number of failures slightly.
The power supply can also be made redundant. This is not covered here.
Redundant SCADA systems are not covered here, nor are multiple controllers or complete duplicate plants. High-availability controllers can be part of a redundancy concept.

Limitations of Redundancy

For best results, the information would have to be fetched simultaneously over multiple connections and the data checked for the best value. Such logic puts a heavy load on the controller, so it is not usable in most plants. The main reason is that controllers are optimized for fast reaction to the plant, not for heavy network communication. In addition, fetching the data over multiple connections leads to inconsistent data. Data is read with many small requests. If the data changes between requests, and especially if it is fetched over multiple connections, no reliable consistency check is possible. In real plants the data changes very quickly.
If systematic errors (inherent in the construction or design) are to be worked around, different technologies must be used. One example is using two different controller types from different manufacturers on one machine. Another is computers with different technology, such as one PC with an Intel CPU and Windows and another with an ARM CPU and Linux. Such measures are not covered here.

Interpretation of Network Redundancy

The network has the most failures, so making the network redundant reduces downtime significantly.
A supervisory control machine needs at least two network adapters, and so does the controller.
Two separate networks need to be built.
The two networks need different IP address ranges. A station with a standard operating system cannot handle overlapping network masks on multiple network adapters. If both networks used the same subnet, a broken line would not cause traffic to switch to the remaining cable system fast enough.
The cables of the two networks should not be laid in the same cable duct. The switches of the two networks need to be mounted in different cabinets. The cables need to be shielded against one another. Fire protection between the two cable runs is a good idea.
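
The requirement that the two networks use non-overlapping IP address ranges can be checked before commissioning. The following is a minimal sketch using Python's standard `ipaddress` module. The adapter names and addresses are made-up examples, and the helper `check_redundant_subnets` is hypothetical, not part of any Tani product.

```python
import ipaddress


def check_redundant_subnets(interfaces):
    """Return the pairs of adapters whose subnets overlap.

    interfaces maps an adapter name to its address in CIDR notation,
    for example {"adapter1": "192.168.10.5/24"}.
    """
    # Derive the network (address range) each adapter belongs to.
    nets = {name: ipaddress.ip_interface(cidr).network
            for name, cidr in interfaces.items()}
    names = sorted(nets)
    conflicts = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if nets[a].overlaps(nets[b]):
                conflicts.append((a, b))
    return conflicts


# Example values only (assumed, not from the product documentation).
good = {"adapter1": "192.168.10.5/24", "adapter2": "192.168.20.5/24"}
bad = {"adapter1": "192.168.10.5/24", "adapter2": "192.168.10.6/16"}

print(check_redundant_subnets(good))  # []
print(check_redundant_subnets(bad))   # [('adapter1', 'adapter2')]
```

In the second example the /16 mask of adapter2 covers the whole /24 range of adapter1, which is exactly the overlapping-mask situation described above.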

Rule: Prevent a "Single Point Of Failure". Exceptions are the starting point (the user management station) and the end point (the machine).

Interpretation of Redundancy in Tani Products

In the Tani OPC server and PLC Engine Collect, the connection to a controller is defined as redundant. Internally, such a connection consists of two or more single connections. Each single connection is configured so that it works over only one of the network adapters. The controller also has two network adapters. Each single connection of the redundant connection has its own destination IP address. As a result, each connection uses its defined network adapter.
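
How a single connection can be tied to one adapter is illustrated below with plain TCP sockets. This is only a sketch of the general technique and says nothing about how the Tani products implement it internally. The function `open_via_adapter` and the demo are hypothetical. Binding the source address fixes the local IP address of the connection. Which physical adapter carries the packets is still decided by the routing table, which is one reason the two networks need separate address ranges.

```python
import socket


def open_via_adapter(local_ip, remote_ip, remote_port, timeout=1.0):
    """Open a TCP connection whose local address is pinned to local_ip."""
    # (local_ip, 0): use this local address, let the OS choose the port.
    return socket.create_connection(
        (remote_ip, remote_port),
        timeout=timeout,
        source_address=(local_ip, 0),
    )


def loopback_demo():
    """Demonstrate the call on the loopback interface only."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: any free port
    server.listen(1)
    port = server.getsockname()[1]
    try:
        conn = open_via_adapter("127.0.0.1", "127.0.0.1", port)
        try:
            # The local end of the connection uses the pinned address.
            return conn.getsockname()[0]
        finally:
            conn.close()
    finally:
        server.close()


print(loopback_demo())
```

In a real redundant setup there would be one such call per network, each with the local address of one PC adapter and the destination address of the matching controller adapter.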

Configuration of redundancy in connections

The connection has a checkbox for redundancy. If it is checked, a further set of entry fields for connection parameters opens.

How it works in the OPC server

The OPC application does not need to be changed for redundant connections to the controllers, and it is not affected by them.

When data acquisition starts, both connections are created, established and checked. The data is handled over one connection only, which prevents the controller from being overloaded. This connection is now the master connection.
If the master connection does not respond within the requested time, or if it breaks, all items are switched active on the second connection. Once data arrives there, it becomes the master connection. The previous connection becomes the slave, and all items are deactivated on it. None of the items reports a quality error to the OPC client. The switch is listed in the diagnostics logger.
Only if all connections are disturbed do the elements deliver bad quality. The OPC client can detect a switch of connection through the item "System.Topics.<plc connection>.Redundancy", whose value changes.
If the slave connection fails, this is written to the diagnostics logger. In addition, the item "System.Topics.<plc connection>.Status<n>" is set to failure.
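
The master/slave behaviour described above can be modelled in a few lines. The following is a simplified simulation, not the server's actual implementation: the classes `Link` and `RedundantConnection`, the `healthy` flag that stands in for "responds within the requested time", and the item name are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    name: str
    healthy: bool = True  # simulated: does this link answer in time?


@dataclass
class RedundantConnection:
    links: list
    master: int = 0                           # index of the current master
    log: list = field(default_factory=list)   # stands in for the diagnostics logger

    def read(self, item):
        """Return (value, quality) for one item, switching master if needed."""
        # Try the current master first, then the remaining links in order.
        order = [self.master] + [i for i in range(len(self.links))
                                 if i != self.master]
        for idx in order:
            link = self.links[idx]
            if link.healthy:
                if idx != self.master:
                    old = self.links[self.master].name
                    self.log.append(f"master switched: {old} -> {link.name}")
                    self.master = idx
                return (f"{item}@{link.name}", "good")
        # Only when every link is disturbed does the item go bad.
        return (None, "bad")


conn = RedundantConnection([Link("net1"), Link("net2")])
first = conn.read("Temperature")      # served by net1
conn.links[0].healthy = False         # master line breaks
second = conn.read("Temperature")     # served by net2, still good quality
conn.links[1].healthy = False         # both lines broken
third = conn.read("Temperature")      # bad quality

print(first, second, third)
print(conn.log)
```

The client sees good quality across the switch and bad quality only once no connection is left, which mirrors the behaviour described for the OPC items.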

With three redundant connections, if the master connection fails, some items are switched active on both slave connections. The faster connection wins and becomes the master, and all remaining items are then used on it. The slower connection returns to the slave state.
The item "Redundancy" shows which connection is currently the master. The items "Status1" and "Status2" (optionally "Status3" or more) show the current state of the connections. They should all be zero if all connections exist.
Writing to the item "Redundancy" switches the master immediately. This is intended for testing or maintenance.
The delay when switching the redundant connection is made up of

Monitoring timeout: the application timeout. It should be three times the controller reaction time, but at least one second.
Reconnect timeout: this value takes effect when the connection is interrupted.

The smaller of the two values is used.
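
The two rules above can be written as a short calculation. This is a sketch of the arithmetic only. The function names and the example times are assumptions, and treating the result as the effective detection delay is an interpretation of the text, not a documented guarantee.

```python
def monitoring_timeout(controller_reaction_s):
    """Recommended monitoring timeout: 3 x reaction time, at least 1 second."""
    return max(3.0 * controller_reaction_s, 1.0)


def switch_delay(monitoring_timeout_s, reconnect_timeout_s):
    """The smaller of the two timeouts determines the switching delay."""
    return min(monitoring_timeout_s, reconnect_timeout_s)


# Example 1: fast controller (0.2 s reaction time), reconnect timeout 5 s.
fast = monitoring_timeout(0.2)          # 3 x 0.2 is below 1 s, so 1.0 s applies
print(fast, switch_delay(fast, 5.0))    # 1.0 1.0

# Example 2: slower controller (0.5 s), reconnect timeout 1.2 s.
slow = monitoring_timeout(0.5)          # 1.5 s
print(slow, switch_delay(slow, 1.2))    # 1.5 1.2
```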
If the connection itself reports an error, for example because the network cable was unplugged from the adapter, the redundant connection switches immediately.
In the SCADA system or in the controller, the connection state should be used to raise warning messages for the plant personnel, so that they can find the cause of the failure and fix it.
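
A warning of this kind can be derived from the status items. The sketch below assumes that some OPC client has already read the items into a dictionary. The item name pattern "System.Topics.<plc connection>.Status<n>" and the rule that all status values should be zero come from this document, while the connection name "Press1", the helper functions and the message text are hypothetical.

```python
def status_item(connection, n):
    """Build the status item name for single connection n (1-based)."""
    return f"System.Topics.{connection}.Status{n}"


def connection_warnings(connection, values, count):
    """Return one warning text per single connection that is not at zero.

    values maps item names to the values read by an OPC client.
    A missing value is also treated as a problem.
    """
    messages = []
    for n in range(1, count + 1):
        value = values.get(status_item(connection, n))
        if value != 0:
            messages.append(
                f"{connection}: connection {n} reports status {value}")
    return messages


# Example snapshot (assumed values): the second line has failed.
snapshot = {
    "System.Topics.Press1.Status1": 0,
    "System.Topics.Press1.Status2": 1,
}
print(connection_warnings("Press1", snapshot, 2))
# ['Press1: connection 2 reports status 1']
```

Because a failed slave connection does not affect item quality, a check like this is the only way the operator learns that the plant is currently running without a spare line.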

How it works in the PLC Engine

PLC Engine Collect uses the same connection and item management as the OPC server. The redundancy functionality is used in the same manner. The status variables should be used in logic tables so that a broken connection is signalled to the plant personnel. This can be done by writing to an element in the controller or to a database.