1 Introduction

The ongoing developments summarized under the term Industrie 4.0 (I4.0) promise more flexibility, higher productivity, higher quality of manufactured products, and in general a more sophisticated customer experience. This creates a need for adaptive network architectures and communication structures in the industrial automation domain. At the same time, today’s installations are clearly hybrid: they combine wired, wireless, isolated, and often legacy technologies such as Profibus or CANopen [1]. In addition, technologies that are still being standardized, such as Time-Sensitive Networking (TSN) or 5G, are about to be integrated into industrial systems [2]. This prevalent heterogeneity demands considerable time and resources for the configuration, monitoring, and overall management of the underlying industrial communication systems [3]. Additionally, these heterogeneous landscapes and architectures demand specialized human engineering experts with detailed know-how and sophisticated domain-specific skills [4].

To address the fact that brownfield environments are more common than greenfield ones, this work provides a solution that increases the redundancy capabilities of legacy technologies like CANopen by means of state-of-the-art technologies like TSN. The proposed concept of the “CANopen Flying Master over TSN” was developed within the “FIND – Future Industrial Network Architecture” research project and is presented in the following. The goal is to provide a robust and reliable combination of legacy and up-to-date communication technologies and to enable the adaptive reconfiguration of CANopen networks by offering redundancy through an additional master and the synchronisation of the corresponding masters.

Section 2 reviews current redundancy solutions from both worlds, IT and OT. Section 3 presents the theoretical approach of our concept, including the required background information. Section 4 describes the implementation of the concept and gives further details of the CANopen Flying Master over TSN. Section 5 concludes the work and outlines future work.

2 State of the Art

This section introduces a selection of current redundancy solutions and redundancy management approaches, including representatives from both IT and OT.

2.1 CANopen Flying Master

The Flying Master functionality is specified by the Controller Area Network (CAN) in Automation (CiA) group [6] and introduces the concept of redundant master capable devices within a CANopen network. Whenever the active Network Management (NMT) master currently serving the application fails, e.g. because that device goes offline, another NMT master capable device takes over responsibility for the application. The active NMT master in the network is determined by the so-called “NMT Flying Master Negotiation”. Any new NMT master capable device first checks whether an NMT master is already active in the network. To this end, the Flying Master capable devices exchange multiple CANopen services to retrieve each other’s “priority level” and “node id”. These values are the inputs to the master ranking algorithm of the new NMT master capable device. The ranking is made autonomously based on the following characteristics, listed in decreasing order of impact:

  • The master’s “priority level” (value range 0 to 2, lower value means higher priority)

  • The state of the master capable device (i.e. an active master will not be deposed if a different master capable device has the same “priority level” but a lower “node id”).

  • The “node id” of the master capable device (lower value means higher priority).

If the ranking based on the comparison suggests a change of the active NMT master, the “NMT Flying Master Negotiation” gets triggered and a new active NMT master gets determined. Values, such as “priority level” and “node id”, are pre-engineered and configured beforehand.
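The ranking rules listed above can be sketched as a sort key. This is a minimal illustration, not the CiA-specified negotiation: the names `MasterCandidate`, `ranking_key`, and `select_master` are hypothetical, and the exchange of CANopen services that delivers the values is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MasterCandidate:
    node_id: int              # pre-engineered CANopen node id
    priority_level: int       # 0..2, lower value means higher priority
    is_active: bool = False   # True for the currently active NMT master


def ranking_key(candidate: MasterCandidate) -> tuple:
    """Smaller tuples rank higher; fields are ordered by decreasing impact."""
    return (
        candidate.priority_level,
        0 if candidate.is_active else 1,  # an active master wins a priority tie
        candidate.node_id,
    )


def select_master(candidates: list) -> MasterCandidate:
    """Return the candidate that should hold the active NMT master role."""
    return min(candidates, key=ranking_key)


active = MasterCandidate(node_id=5, priority_level=1, is_active=True)
newcomer = MasterCandidate(node_id=2, priority_level=1)
urgent = MasterCandidate(node_id=9, priority_level=0)

print(select_master([active, newcomer]).node_id)          # 5: active master keeps its role
print(select_master([active, newcomer, urgent]).node_id)  # 9: higher priority level wins
```

Encoding the rules as a tuple keeps the order of impact explicit: the node id only matters when priority level and state do not already decide the ranking.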

In addition to the “NMT Flying Master Negotiation” algorithm, the “Heartbeat Monitoring” mechanism is specified by CiA [7] to monitor the CANopen network and especially the Flying Master capable devices. CANopen participants are configured to send heartbeat messages regularly, and the observing devices expect each heartbeat within a configurable consumer time. If the currently active master fails to produce its heartbeat in time, the observing device enters the resetting state to trigger the “NMT Flying Master Negotiation” algorithm again and determine the new active NMT master.
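The timeout check behind the heartbeat monitoring can be sketched as follows. The class `HeartbeatConsumer` and the function `needs_renegotiation` are hypothetical names, and timestamps are passed in explicitly so the example stays deterministic; a real CANopen stack would drive this from received CAN frames and a timer.

```python
class HeartbeatConsumer:
    """Tracks the most recent heartbeat of one monitored node."""

    def __init__(self, node_id: int, consumer_time_s: float, now_s: float):
        self.node_id = node_id
        self.consumer_time_s = consumer_time_s  # configurable monitoring window
        self.last_seen_s = now_s

    def on_heartbeat(self, now_s: float) -> None:
        """Record the arrival time of a heartbeat message."""
        self.last_seen_s = now_s

    def expired(self, now_s: float) -> bool:
        """True if no heartbeat arrived within the consumer time."""
        return (now_s - self.last_seen_s) > self.consumer_time_s


def needs_renegotiation(master_monitor: HeartbeatConsumer, now_s: float) -> bool:
    """The observer restarts the master negotiation when the master is silent."""
    return master_monitor.expired(now_s)


monitor = HeartbeatConsumer(node_id=1, consumer_time_s=0.5, now_s=0.0)
monitor.on_heartbeat(now_s=0.4)
print(needs_renegotiation(monitor, now_s=0.8))  # False: last heartbeat is recent
print(needs_renegotiation(monitor, now_s=1.0))  # True: master missed its window
```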

2.2 PROFINET IO Redundancy

Profinet is an industrial Ethernet-based communication standard designed for data exchange among devices in industrial automation networks. It is standardized in IEC 61158 and IEC 61784 [9]. Industrial automation networks run multiple time-critical applications, which makes network availability and reliability two of the most important properties of the system. With a redundancy mechanism, a reliable network can be turned into a highly available system. In Profinet, the redundancy features are categorized into four types, namely device redundancy, media redundancy, network redundancy, and controller redundancy (see Fig. 1) [10].

Fig. 1 Types of redundancy in Profinet [10]

Of these four categories, all except controller redundancy can be handled by the network directly. To achieve these redundancies, multiple Network Access Points (NAP) for IO devices and more than one physical connection between controller and IO devices are necessary. Because redundant systems add cost and complexity, the end users’ requirements should be taken into account before system deployment.

To achieve media redundancy, Profinet implements the Media Redundancy Protocol (MRP) as defined in IEC 62439 [9]. This mechanism uses a ring topology in which one switch acts as the Media Redundancy Manager (MRM) while the remaining switches act as Media Redundancy Clients (MRC). The MRM sends test packets with special MAC addresses into the ring, and the MRCs forward them. When the MRM receives the test packets on both ports, it concludes that the ring is closed and starts normal operation by blocking one port and forwarding data through the other one [8]. During normal operation, the ring therefore behaves like a line topology, and the blocked port is used only to receive test frames and other configuration-related packets. If the test packets do not arrive within the specified interval because of a line failure, the MRM opens the blocked port, thus acting as a relay, and continues to operate this way until the line failure is fixed. MRP is used for normal TCP/IP and real-time (RT) packets and has a reconfiguration time of less than 200 ms, provided that no more than 50 nodes are present in the ring.

For Profinet Isochronous Real-Time (IRT) communication, the concept of Media Redundancy for Planned Duplication (MRPD), described in IEC 61158, is used [9]. In this case, the IO controller loads all possible communication paths and their schedules into every node in the system during start-up. The senders transmit data telegrams to the receiver over both paths, so that if one path fails, the telegram still arrives over the redundant path. MRPD thus enables the smooth transition from one path to another that IRT communication requires.
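The MRM decision described above for MRP, blocking one ring port while test packets return on both ports and opening it otherwise, can be sketched as follows. This is a simplified illustration that follows the textual description only; `RingManager`, its port names, and the timeout parameter are hypothetical and do not reproduce the IEC 62439 state machine.

```python
class RingManager:
    """Toy model of a Media Redundancy Manager with two ring ports."""

    PORTS = ("primary", "secondary")

    def __init__(self, test_timeout_s: float):
        self.test_timeout_s = test_timeout_s
        self.last_rx_s = {port: None for port in self.PORTS}
        self.secondary_blocked = False

    def on_test_frame(self, port: str, now_s: float) -> None:
        """Record that one of the MRM's own test frames came back on a port."""
        if port not in self.last_rx_s:
            raise ValueError(f"unknown ring port: {port}")
        self.last_rx_s[port] = now_s

    def update(self, now_s: float) -> bool:
        """Block the secondary port only while the ring is known to be closed."""
        ring_closed = all(
            t is not None and (now_s - t) <= self.test_timeout_s
            for t in self.last_rx_s.values()
        )
        self.secondary_blocked = ring_closed
        return self.secondary_blocked


mrm = RingManager(test_timeout_s=0.02)
mrm.on_test_frame("primary", now_s=0.0)
mrm.on_test_frame("secondary", now_s=0.0)
print(mrm.update(now_s=0.01))  # True: ring closed, behaves like a line
print(mrm.update(now_s=0.05))  # False: test frames missing, port opened
```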

2.3 IEEE 802.1CB

The IEEE 802.1CB standard specifies how communication redundancy is applied and managed in 802.1 networks. It does not specify how multiple paths are created, as this is already covered by the Multiple Spanning Tree Protocol (MSTP) specified in IEEE 802.1Q. The redundancy mechanism is called Frame Replication and Elimination for Reliability (FRER). As its name reveals, it is responsible for duplicating frames at the source of a stream and eliminating the replicated ones at the destination. It is designed in such a way that a set of end stations conforming to FRER obtains most or all of the benefits even when connected through a network that is not aware of FRER. The same applies to unaware end stations that are connected to a FRER capable network, see Fig. 2 with two end stations and five bridges.

Fig. 2 Redundancy in IEEE 802.1CB: the Sequence Generation Function (yellow circle) duplicates the frames and sends them over both possible paths to both Sequence Recovery Functions (blue circles). These eliminate duplicates and forward the frames towards the destination. A third Sequence Recovery Function in the bridge performs the final elimination for an end station with a single connection
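The replication and elimination idea of FRER can be illustrated with a short sketch. It is deliberately simplified: `replicate` and `SequenceRecovery` are hypothetical names, the recovery uses a plain bounded history of sequence numbers rather than the recovery algorithms defined in IEEE 802.1CB, and the path labels are placeholders.

```python
from collections import deque


def replicate(payloads, paths=("A", "B")):
    """Sequence generation: number each payload and copy it onto every path."""
    for seq, payload in enumerate(payloads):
        for path in paths:
            yield path, seq, payload


class SequenceRecovery:
    """Passes the first copy of each sequence number and drops later copies."""

    def __init__(self, history: int = 32):
        self._order = deque(maxlen=history)  # remembers the newest numbers
        self._seen = set()

    def accept(self, seq: int) -> bool:
        if seq in self._seen:
            return False  # duplicate from the other path
        if len(self._order) == self._order.maxlen:
            self._seen.discard(self._order[0])  # forget the oldest entry
        self._order.append(seq)
        self._seen.add(seq)
        return True


recovery = SequenceRecovery()
delivered = []
for path, seq, payload in replicate(["x", "y", "z"]):
    if path == "A" and seq == 1:
        continue  # simulate a frame lost on path A
    if recovery.accept(seq):
        delivered.append(payload)

print(delivered)  # ['x', 'y', 'z']: the loss on path A is masked by path B
```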

2.4 Industrial 5G

The fifth generation of cellular networks (5G) is one of the biggest trends in industry today [11]. It primarily focuses on machine-to-machine communication and the Internet of Things. 5G promises three essential types of communication, namely Massive Machine Type Communication (mMTC), Enhanced Mobile Broadband (eMBB), and Ultra-Reliable Low Latency Communication (URLLC), of which mMTC and URLLC are the two most important for industrial automation networks. URLLC can achieve a latency of 1 ms with a reliability of 99.999% over 5G radio networks [11]. This type of communication requires higher availability, which can be achieved through modulation and coding schemes and through diversity/redundancy techniques. Multiple antennas, multi-carrier connectivity, multiple transmission points, and frequency and time diversity are some of the techniques that can be used to provide redundant data transmission paths. 5G also allows packet duplication in the Packet Data Convergence Protocol (PDCP) layer: the PDCP layer at the transmitter duplicates the data packets before sending them over the network, and the PDCP layer at the receiver is responsible for eliminating the duplicates [12].
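Why duplication raises reliability can be shown with a short calculation. The sketch below assumes that the copies are lost independently of each other, which is an idealisation not claimed by the cited sources (real radio paths are often correlated); the function name `residual_loss` and the example loss rate are illustrative only.

```python
def residual_loss(per_path_loss: float, copies: int) -> float:
    """Probability that every copy is lost, assuming independent paths."""
    if not 0.0 <= per_path_loss <= 1.0:
        raise ValueError("per_path_loss must be a probability")
    if copies < 1:
        raise ValueError("at least one copy is required")
    return per_path_loss ** copies


single = residual_loss(1e-3, copies=1)
duplicated = residual_loss(1e-3, copies=2)
print(f"single path loss:     {single:.0e}")
print(f"duplicated path loss: {duplicated:.0e}")
```

Under this assumption, a packet is lost only if all copies are lost, so the loss probability falls from p to p squared when a packet is sent twice.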

3 Concept of Flying Master Over TSN

The Flying Master feature in CANopen is limited to the legacy technology and to one single CANopen network. However, when the growing heterogeneity of industrial communication is addressed by combining state-of-the-art communication technologies like TSN with legacy technologies such as CANopen, e.g. through gateways, additional opportunities arise. Following the example of 5G, the combined use of different media can help to increase the redundancy capabilities of industrial networks in a way that pure state-of-the-art technologies like TSN cannot.

3.1 FIND Abstraction Concepts

In FIND networks, the management of the communication is logically centralized in the FIND Controller of Controllers (FIND CoC). The idea is to use the existing management functionality of the different communication technologies from a higher-level point of view. Each network has its own controller, and the heterogeneous compound of networks has the FIND CoC to coordinate those dedicated controllers and enable inter-operation in the FIND network, see Fig. 3. As not all communication technologies offer the same functionality, they were grouped into observable and controllable networks. CANopen is only an observable network, meaning that no re-configuration is possible during runtime and everything is pre-engineered, whereas the TSN backbone is controllable during runtime [14].

Fig. 3 Architectural view of the CoC approach

To manage the heterogeneity that results from mixing legacy and state-of-the-art networks, an abstracted way of defining the communication was developed. The functions directly related to the productive activity of an industrial system are called Automation Functions (AtFs). The relations between AtFs are called Application Relations (ARs), and their communication-related parts are the Communication Relations (CRs). An AR/CR represents the communication requirements between two or more AtFs. With this abstraction, the FIND CoC is able to detect whether an AR/CR needs to be re-routed. In this case, it can re-configure the TSN backbone and start the TSN streams between the corresponding gateways to re-establish the application. These gateways can host the CANopen master application and extend the capabilities of the redundant masters with the ability to communicate and synchronize over TSN. In that way, the application can be kept running even if the CANopen line is cut.

3.2 Extending the CANopen Redundancy

Of the redundancy approaches introduced above, this section discusses one example of redundant interconnection and communication in heterogeneous networks in more detail: the CANopen Flying Master technology combined with a TSN backbone network.

To realize the expected redundancy functionality across the heterogeneous networks, the CANopen master application is extended with a gateway capability between the CANopen network and the TSN backbone, as shown in Fig. 4. Thus, communication takes place either via CANopen or via the TSN Talker/Listener and TSN streams. By adopting the FIND concept of AR/CR communication, i.e. by abstracting each communication within CANopen as an AR/CR pair, the CANopen application itself is extended to handle the communication independently of the interface to TSN or CANopen. In general, a slave device is addressed through CANopen when that slave is online and the master is active on the CANopen network (as decided by the Flying Master negotiation). Whenever a slave is not actively seen on the CANopen network, communication needs to be re-routed through TSN.

Fig. 4 General overview of the extended gateway functionality that provides Flying Master capability through the TSN backbone system with regard to the FIND CoC concept
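The interface choice described for the extended master application can be sketched per AR/CR. The sketch assumes that the gateway knows which node ids it currently sees on its CANopen segment; the names `Interface`, `select_interface`, and `route_relations`, as well as the mapping of AR/CR names to node ids, are hypothetical.

```python
from enum import Enum


class Interface(Enum):
    CANOPEN = "canopen"
    TSN = "tsn"


def select_interface(slave_visible: bool, master_active: bool) -> Interface:
    """Use CANopen only if the slave is online and the master is active there."""
    if slave_visible and master_active:
        return Interface.CANOPEN
    return Interface.TSN


def route_relations(relations: dict, visible_nodes: set, master_active: bool) -> dict:
    """Map each AR/CR name to the outgoing interface for its target node id."""
    return {
        name: select_interface(node_id in visible_nodes, master_active)
        for name, node_id in relations.items()
    }


routes = route_relations(
    relations={"AR/CR1": 1, "AR/CR2": 2, "AR/CR3": 3},
    visible_nodes={1, 2},  # node 3 is no longer seen on this CANopen segment
    master_active=True,
)
print(routes["AR/CR1"].value)  # canopen
print(routes["AR/CR3"].value)  # tsn
```

Keeping the decision in one function means the application logic above it only deals with AR/CRs and never with the concrete interface.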

Here, the FIND CoC comes into play by processing the CANopen network state and the CANopen device states, which are actively provided by the gateway functionality of the CANopen master applications. Thus, the FIND CoC is able to determine the availability of devices, the topology of the heterogeneous network, and whether there is any issue on the communication path. Monitoring and diagnosis data are communicated through extended capabilities of the FIND CoC, indicated by the FIND Adapter in Fig. 4 [13].

While AR/CRs cannot be reconfigured within CANopen, as this technology is only observable (and the behavior of the application is therefore pre-engineered), the FIND CoC actively manages the TSN backbone network. Whenever AR/CRs need to be re-routed through both gateways, e.g. due to a cut in the CAN line, the TSN Talker/Listeners on the gateways are activated and the TSN streams are deployed to the TSN network. All resources needed to fulfil the communication requirements of the AR/CRs are reserved. The activation of the TSN Talker/Listener streams on the gateways also triggers the CANopen application to configure the outgoing interface for the communication data. The concept is described in more detail in the following section.

4 Flying Master Over TSN Implementation

The proof-of-concept example is based on a simple CANopen application. Three slave devices provide a synchronized LED running light. One of the devices generates the counter value and thus acts as a sensor device on the network. The active NMT master of the network reads the counter value from the sensor and deploys it to all slave devices on the network. Thus, the running light is realized and synchronized on all devices. A redundant master capable device is available to take over responsibility for this application if an error occurs. All CANopen slaves are separate devices in addition to the masters. The CANopen NMT capable masters are each hosted on a gateway to the TSN backbone, as shown in Fig. 4.

While performing the described Flying Master algorithm on the CANopen network, one of the gateways gets promoted to the active CANopen NMT master and will also be classified as functional master of the CANopen application, as shown in Fig. 5. The functional master is responsible for all AR/CRs of the CANopen synchronized LED application regardless of any potential issue in the network. The second gateway hosting the other CANopen NMT capable master is the backup for redundancy and may act as a simple data gateway between CANopen and TSN, represented by the “GW” functionality in Fig. 5.

Fig. 5 Overview of the superposition of the application and the physical planes. The dashed lines belong to the application plane and show the relations (AR/CR). The continuous lines belong to the physical plane and represent the hardware. The yellow connection represents the cable failure

The breakdown of this simple CANopen application into AtFs, ARs, and CRs can be seen in Fig. 5. Both gateways implement the same two functions: AtF0 retrieves the current counter value from the sensor of slave 2, and AtF1 deploys the new value to all slaves. AtF2 represents the sensor activity providing the counter value. Each slave also has the functionality to retrieve the data from the master and provide it to the digital outputs driving the LEDs, which is represented by AtF3-5. Each connection is abstracted as a single AR and one corresponding CR. Therefore, AR/CR0 represents the application functionality of reading the sensor and retrieving the current state of the counter. AR/CR1-3 are the connections to each slave that deploy the counter to the LED running light.
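The breakdown above can be written down as a small data model. This is only a sketch: the classes `AtF` and `Relation` are hypothetical, and the assignment of AtF3-5 and AR/CR1-3 to slaves 1 to 3 in ascending order is an assumption, since the text does not state the exact mapping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtF:
    name: str
    host: str   # device that runs the function
    role: str


@dataclass(frozen=True)
class Relation:
    name: str    # one AR together with its corresponding CR
    source: str  # name of the producing AtF
    sink: str    # name of the consuming AtF


ATFS = {
    a.name: a
    for a in [
        AtF("AtF0", "functional master", "read counter from sensor"),
        AtF("AtF1", "functional master", "deploy counter to all slaves"),
        AtF("AtF2", "slave 2", "sensor providing the counter value"),
        AtF("AtF3", "slave 1", "drive LEDs"),  # slave numbering is assumed
        AtF("AtF4", "slave 2", "drive LEDs"),
        AtF("AtF5", "slave 3", "drive LEDs"),
    ]
}

RELATIONS = [
    Relation("AR/CR0", "AtF2", "AtF0"),  # sensor value to the master
    Relation("AR/CR1", "AtF1", "AtF3"),
    Relation("AR/CR2", "AtF1", "AtF4"),
    Relation("AR/CR3", "AtF1", "AtF5"),
]


def relations_of(host: str) -> list:
    """Names of all AR/CRs that terminate on the given device."""
    return [
        r.name
        for r in RELATIONS
        if host in (ATFS[r.source].host, ATFS[r.sink].host)
    ]


print(relations_of("slave 2"))  # ['AR/CR0', 'AR/CR2']
print(relations_of("slave 3"))  # ['AR/CR3']
```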

A consistent implementation of the AtFs and ARs in the CANopen application covers two redundancy situations at runtime:

  • If the active NMT master of the CANopen network, and thus the functional master of the CANopen application, fails, e.g. because it goes offline, AtF0-1 are transferred to the other gateway on the network. The redundant master capable device takes over responsibility for the CANopen application and is promoted to functional master. This is the standard behavior of the CANopen Flying Master capability. The FIND CoC only needs monitoring capabilities to detect the new CANopen NMT master and to see that all slaves are now connected to the redundant master of the network.

  • The more complex situation arises whenever a cable break is detected. In this situation, the CANopen network is split into two separate CANopen networks, and each device ends up in one of them. The FIND CoC has to detect the new network hierarchy to determine whether AR/CRs need to be re-routed to re-establish the original functionality of the application. In this situation, the original functional master still exists, but not all slaves are connected to its network.

In the second situation, we assume that Gateway/Master 1 is the functional master of the overall CANopen application. If a cable break is detected in line “c” between slave 2 and slave 3, as shown in Fig. 5, AR/CR3 is automatically flagged as interrupted and no longer takes part in the original network of the functional master. Instead, the corresponding device shows up in another network behind Gateway/Master 2. Thus, Gateway/Master 2 needs to implement the gateway functionality and communicate with slave 3 while transferring the data from and to the TSN backbone network. The functional master itself needs to communicate AR/CR3 via the TSN backbone network. Finally, the FIND CoC needs to configure the controllable TSN network to establish the communication path between both gateways and thus re-route the communication to slave 3.
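The cable-break case can be traced with a small sketch of the detection step. It assumes the CANopen line is an ordered chain from Gateway/Master 1 via slaves 1 to 3 to Gateway/Master 2, which matches the described break between slave 2 and slave 3 but is otherwise an assumption; the function names and the mapping of AR/CRs to slaves are hypothetical as well.

```python
def split_line(line: list, left: str, right: str) -> tuple:
    """Split an ordered CAN line at the cable between two neighbouring nodes."""
    i = line.index(left)
    if i + 1 >= len(line) or line[i + 1] != right:
        raise ValueError(f"{left} and {right} are not direct neighbours")
    return line[: i + 1], line[i + 1:]


def relations_to_reroute(relations: dict, master_segment: list) -> list:
    """AR/CRs whose slave is no longer in the functional master's segment."""
    return sorted(
        name for name, slave in relations.items() if slave not in master_segment
    )


LINE = ["gateway 1", "slave 1", "slave 2", "slave 3", "gateway 2"]
RELATION_TARGETS = {
    "AR/CR0": "slave 2",  # sensor read
    "AR/CR1": "slave 1",
    "AR/CR2": "slave 2",
    "AR/CR3": "slave 3",
}

master_side, backup_side = split_line(LINE, "slave 2", "slave 3")
print(master_side)   # ['gateway 1', 'slave 1', 'slave 2']
print(backup_side)   # ['slave 3', 'gateway 2']
print(relations_to_reroute(RELATION_TARGETS, master_side))  # ['AR/CR3']
```

In this sketch only AR/CR3 leaves the functional master's segment, so only that relation would need a TSN stream between the two gateways.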

5 Conclusion

The CANopen Flying Master over TSN represents a feasible solution for the reliable integration of legacy and future communication technologies in modern industrial systems. Its successful implementation opens the door to future improvements in redundant and flexible networking within the industrial automation domain. The concept of the CANopen Flying Master over TSN was presented and related to state-of-the-art solutions with regard to their redundancy capabilities. It allows the redundant interconnection and integration of legacy field buses with modern Industrial Internet technologies. The use of the FIND CoC and the corresponding abstraction concepts also opens the implementation of the flying master to legacy technologies other than CANopen. The CANopen Flying Master over TSN is a milestone in the integration of IT and OT technologies and a promising approach for integrating brownfield environments into the upcoming developments of the industry.

The concept was successfully implemented and demonstrated in a laboratory setup, where its feasibility was shown and the different implementation options were assessed with regard to their functionality. Future investigations will test the timing behaviour of this approach and evaluate the application of the flying master in productive industrial setups. Another open question is the optimal configuration of both the legacy technology and the TSN network to achieve a resource-efficient and adequate communication architecture.