Cyphal/UDP redundancy

The current specification defines redundant transports abstractly and redundant CAN transports concretely. What is the story for UDP? It’s interesting because Cyphal/TSN can use 802.1CB to meet this requirement but I’m not sure what the plain-old-UDP story should be. Any thoughts on this out there?

This may not be entirely accurate. The Specification defines redundancy in terms of the behaviors observable on the network and deliberately avoids a concrete, procedural-level description of the kind found in v0. The relevant sections are:

  • 4.1.2 Redundant transports
  • Transmission over redundant transports
  • Behaviors

The most pertinent fragment comes from the last of these sections:

The last paragraph essentially requires no-fail-over behavior for monotonic transfer-ID transports, such as UDP. The tactics chosen to achieve the required behavior are outside the scope of the Specification. One possible approach is described in the documentation of the pycyphal.transport.redundant package (PyCyphal 1.11.1).

The important thing to notice is that transport redundancy is independent of the kind of transport used, aside from one property: whether the transfer-ID is expected to overflow while the network is operational (cyclic transfer-ID) or not (monotonic transfer-ID). The former requires fail-over, while the latter allows the network to utilize all transports simultaneously and thus achieve zero fail-over delay, similar to 802.1CB.
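To make the distinction concrete, here is a minimal receive-side sketch in Python. It is not taken from the Specification or from PyCyphal; the class names, the `iface` index, and the `failover_timeout` parameter are all hypothetical, and the monotonic variant ignores sender restarts (a real implementation would also need a transfer-ID timeout).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MonotonicDeduplicator:
    """Hypothetical sketch: accept the first copy of each transfer from any interface."""

    last_transfer_id: Optional[int] = None

    def should_accept(self, iface: int, transfer_id: int) -> bool:
        # iface is deliberately unused: since the transfer-ID never overflows,
        # it alone tells us whether this transfer has been seen already.
        if self.last_transfer_id is None or transfer_id > self.last_transfer_id:
            self.last_transfer_id = transfer_id
            return True
        return False  # duplicate or stale copy delivered by a slower interface


@dataclass
class CyclicDeduplicator:
    """Hypothetical sketch: listen to one interface, fail over after a period of silence."""

    failover_timeout: float  # seconds of silence before switching (assumed parameter)
    active_iface: Optional[int] = None
    last_accept_time: float = 0.0

    def should_accept(self, iface: int, timestamp: float) -> bool:
        # The transfer-ID wraps around, so it cannot order transfers across
        # interfaces; only the currently active interface is trusted.
        silent = timestamp - self.last_accept_time > self.failover_timeout
        if self.active_iface is None or iface == self.active_iface or silent:
            self.active_iface = iface
            self.last_accept_time = timestamp
            return True
        return False
```

With the monotonic variant, a transfer lost on one interface is simply taken from the other with no added delay. With the cyclic variant, the receiver has to wait out `failover_timeout` before it starts accepting from the other interface.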

The strong decoupling between the specific transports and the logic that facilitates redundancy enables heterogeneous transport redundancy. This has already been tested with PyCyphal, for example: its integration tests utilize Cyphal/UDP alongside Cyphal/serial in a single redundant group.
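The decoupling can be illustrated with a small transmit-side sketch. This is an assumption-laden toy, not PyCyphal's API: `RedundantGroup`, `attach`, and the `SendFn` callback signature are hypothetical names. The group knows nothing about UDP or serial; it only hands the same transfer-ID and payload to every attached interface.

```python
from typing import Callable, List

# Hypothetical callback: (transfer_id, payload) -> True if the interface took the transfer.
SendFn = Callable[[int, bytes], bool]


class RedundantGroup:
    """Hypothetical sketch of a transport-agnostic redundant sender."""

    def __init__(self) -> None:
        self._inferiors: List[SendFn] = []
        self._next_transfer_id = 0  # monotonic: assumed never to overflow

    def attach(self, send: SendFn) -> None:
        # Any transport can join as long as it provides a compatible callback.
        self._inferiors.append(send)

    def send(self, payload: bytes) -> bool:
        transfer_id = self._next_transfer_id
        self._next_transfer_id += 1
        results = []
        for send in self._inferiors:
            try:
                # Every interface carries an identical copy with the same transfer-ID.
                results.append(send(transfer_id, payload))
            except OSError:
                # A failure of one interface must not affect the others.
                results.append(False)
        # In this sketch the transfer counts as sent if at least one interface accepted it.
        return any(results)
```

In such a design, one callback could wrap a UDP socket and another a serial port, and the receiving side would deduplicate by transfer-ID regardless of which interface delivered the copy.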

For Cyphal/TSN specifically, a redundant TSN network might be seen by Cyphal as a single non-redundant transport, since the underlying layers would take care of redundancy management. This remains to be researched, though.

In conversations I’ve had on this topic the concept of redundancy has been at issue. I submit the below figure to help focus the conversation:

Here we see three different ways of providing network redundancy by OSI layer. Layer 1 is opaque to the end system and is therefore not considered. Layer 7 is undefined by Cyphal. It is only Layer 4 that we are discussing.

Sorry if this is painfully obvious, but it does become interesting when we consider 802.1CB, which provides layer 2 redundancy in a way that won’t be entirely opaque to our specification. That, however, is future work. For Cyphal/UDP, I’m reading that you expect a node to handle multiple Ethernet MACs as part of a redundant group. I think that’s something we should prototype to be sure it works as expected.

Certainly. I have done some rudimentary prototyping with PyCyphal already. We should revisit this once again after this is in: