Cyphal/UDP C Library Repository

I’m working on a prototype implementation of a Cyphal/UDP C library based on libcanard and the current specification from the PyCyphal/UDP implementation. How can I go about creating a repository in the OpenCyphal Garage for this prototype? Can I get one of the maintainers to create the repository so that I can submit a PR?

Currently I’m running under the assumption that the library would be called libethard. Does anyone have a different preference? I think I’ve seen libudpard floated around in talks with Scott, but I’m curious if there is a different direction we want to take the name.

Please share your GitHub username, and we will add you to the team. But before we do that, can we please speak about the architecture a bit? Basing this work on libcanard directly may lead to a suboptimal result. Should we organize a call about this, or should we start a new thread here instead?

Regarding the name, I assumed that it would be libudpard because libethard seems lower-level than necessary and may result in confusion with a possible future libtsnard. I am open to any option, though.

I’m using the GitHub username SchoberMJ

Regarding naming, it should be simple enough to refactor to UDPARD, etc., while still in the Garage.

See the updated UML-ish diagram (from libcanard to libudpard/libethard) below. It is a little different from what I actually ended up with, but it should be close enough. It ended up being more of a proof of concept. I anticipate changes to anything official, but I wanted to have something that would transmit Cyphal frames over UDP, so I went with libcanard plus some changes.

I do agree that there were some pain-points in using libcanard directly as a basis, particularly around managing the “SessionSpecifier”. And in the prototype/proof of concept I limited the ETHARD_NODE_ID_MAX to what it was with libcanard (CANARD_NODE_ID_MAX) to limit the total potential number of internal sessions per node. We should probably set up some sort of call to discuss and come back with an updated UML diagram of how we want something official to look.

Thanks, I sent you the invitation. Is the library designed to operate on top of the regular Berkeley sockets or at the level of raw Ethernet frames? If it’s the latter, is the below model accurate? Do we envision scenarios where the library is used on an embedded platform without the full UDP stack?

Regardless of the layering underneath the library, do you consider it feasible to bring the API closer to the following (disregard the naming difference for now):

int8_t udpardServe(Udpard* const      ins,
                   UdpardIface* const iface);

int32_t udpardTxPush(Udpard* const                       ins,
                     UdpardIface* const                  iface,
                     const UdpardMicrosecond             tx_deadline_usec,
                     const UdpardTransferMetadata* const metadata,
                     const size_t                        payload_size,
                     const void* const                   payload);

int8_t udpardRxAccept(Udpard* const                ins,
                      UdpardIface* const           iface,
                      const UdpardMicrosecond      timestamp_usec,
                      const UdpardFrame* const     ethernet_frame,
                      UdpardRxTransfer* const      out_transfer,
                      UdpardRxSubscription** const out_subscription);

I would like to have a call. Let’s coordinate on Matrix.

I was considering having the application handle the actual sockets, while the library would only handle the Cyphal frame plus some translation functions between Cyphal concepts and UDP/IP concepts (IP addresses, ports, etc.)

The user of the library would use something like socket(2) - Linux manual page (Berkeley sockets, I think?), set it up as a UDP socket, and then bind based on the SessionSpecifier. The user would need to construct a SessionSpecifier from the source address structure received during a recvfrom call.
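
To illustrate that application-side setup (this is not code from the prototype; the multicast group and port here are placeholders for whatever the SessionSpecifier translation functions would produce), a minimal Linux sketch might look like this:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: opens a UDP socket bound to 'port' and joins the
// multicast group derived from the Cyphal session specifier. Returns the
// file descriptor on success or a negative value on failure.
static int open_rx_socket(const char* const group, const uint16_t port)
{
    const int fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (fd < 0)
    {
        return -1;
    }
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port        = htons(port);
    if (bind(fd, (struct sockaddr*) &addr, sizeof(addr)) < 0)
    {
        close(fd);
        return -1;
    }
    // Joining the group makes the kernel's IGMP implementation announce our
    // membership so that multicast traffic is delivered to this socket.
    struct ip_mreq mreq;
    memset(&mreq, 0, sizeof(mreq));
    mreq.imr_multiaddr.s_addr = inet_addr(group);
    mreq.imr_interface.s_addr = htonl(INADDR_ANY);
    if (setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq)) < 0)
    {
        close(fd);
        return -1;
    }
    return fd;
}

The application would then recvfrom(2) on this descriptor and hand each received datagram to the library for reassembly.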

With regards to the API, what is the function of udpardServe and UdpardIface? Would this be passing the file_descriptor of the socket to the library to handle sending and receiving?


On a side note:
Would we anticipate that the frame layout for TSN, TCP, etc. would be the same as for UDP, where Cyphal/ETH has the format:

typedef struct
{
    uint8_t  version;          // Header format version.
    uint8_t  priority;         // Transfer priority level.
    uint16_t _reserved_a;
    uint32_t frame_index_eot;  // 31-bit frame index combined with the 1-bit end-of-transfer flag.
    uint64_t transfer_id;      // Monotonically increasing transfer-ID.
    uint64_t _reserved_b;
} EthardFrameHeader;

typedef struct
{
    size_t            payload_size;   // Size of the payload in bytes.
    EthardFrameHeader cyphal_header;  // Cyphal frame header as defined above.
    const void*       payload;        // Pointer to the serialized payload.
} EthardFrame;

If that is or could be the case, then we could have libethard be an underlying library that queues and reassembles EthardFrames, and libudpard, libtcpard, and libtsnard would handle their own UDP header, TSN header, etc. while packing the EthardFrame as their payload?

Okay. Does it mean that the actual implementation of the socket layer is decoupled from the library entirely, i.e., I can use it on an embedded system with no standard IP & IGMP stack at all?

The udpardServe() is only meaningful if the library is to implement the UDP/IP and IGMP layers directly, in which case it would manage periodic IGMP announcements. If the socket layer is kept separate, it is not needed.

The UdpardIface is intended to encapsulate a redundant interface, similar to CanardTxQueue but bidirectional rather than TX-only. Again, if the library is built on sockets, this is probably not directly applicable, since each socket effectively acts as a dedicated session-specific interface.

Are there thoughts on how to implement redundant transports? Redundant transmission is trivial (just repeat each tx transfer per redundant transport socket), but redundant reception requires transfer deduplication. It is easy to do though.

TCP – no, see Cyphal/serial for that (works via TCP, too).

TSN – not sure.

“Cyphal/ETH” does not seem like a good choice of name because it implies a lower-level transport implementation (as I wrote earlier).

That is correct; the only part that is somewhat coupled is that messages use the multicast prefix defined here: pycyphal.transport.udp package — PyCyphal 1.9.0 documentation, which means whatever IP stack is used would need to support multicast.

I actually had a thought on this regarding multiframe transfers and how we currently handle the EOT Frame Index field.

This may be better as a separate forum post, but what are your thoughts on these 2 proposals (for others, EOT = End of Transfer, SOT = Start of Transfer):

  1. We split the 1-bit EOT flag and 31-bit frame index into 16 bits for the total number of frames (expected for the transfer) and 16 bits for the current frame in the transfer (a rough layout is sketched after this list). This way we have all the information we need in each frame to slot it into the appropriate location. This does reduce the total frames allowed per transfer, but it potentially allows for easier redundant reassembly and requests for missing frames.
  2. We change the EOT bit to an SOT bit and start with the largest frame index, counting down to 1 (current frame index == 1 means EOT). We would need to require that the frame with the SOT bit set is received first, but then we would have the total expected number of frames to preconstruct an array for slotting the frames.
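
Purely to illustrate proposal 1 (the field names are hypothetical and this is not a spec change), the 32-bit field of the existing header could be split like this:

// Hypothetical header layout for proposal 1: the 32-bit frame_index_eot
// field is replaced by an explicit total-frame count and a current index.
typedef struct
{
    uint8_t  version;
    uint8_t  priority;
    uint16_t _reserved_a;
    uint16_t frame_count;  // Total number of frames expected in this transfer.
    uint16_t frame_index;  // Index of this frame within the transfer.
    uint64_t transfer_id;
    uint64_t _reserved_b;
} EthardFrameHeaderProposal1;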

Both of those proposals require a change to the Cyphal/UDP specification, however.

Without a change to the specification, we could do something more complicated, like calculating the expected number of frames from the MTU and the extent of the message.
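
As a rough sketch of that calculation (ignoring the transfer CRC and any per-frame header overhead, both of which a real implementation would have to account for):

#include <stddef.h>

// Upper bound on the number of frames in a transfer, derived from the extent
// (maximum serialized size) of the message and the per-frame payload MTU.
static size_t expected_frame_count(const size_t extent, const size_t mtu)
{
    return (mtu == 0U) ? 0U : ((extent + mtu - 1U) / mtu);  // Ceiling division.
}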

Makes sense, it is easy enough to change the name in the current state. Cyphal/UDP is what we have been using so libudpard would fit that theme better than libethard.

Deduplication should be performed at the transfer level, not frame level, to avoid these issues. Please look here: pycyphal.transport.redundant package — PyCyphal 1.9.0 documentation

If you deduplicate on the frame level you would transitively be deduplicating on the transfer level.

E.g., if for a given transfer_id you accept each frame index only once, you will never have more than one full copy of that transfer. This is assuming that you use one libudpard Udpard instance for multiple sockets. This also assumes that the transfer_id only increases monotonically.

If you don’t share a Udpard instance between sessions then you need to maintain some expected transfer_id per subscribed/published message that only increments monotonically and rejects duplicates.

edit:
Am I misunderstanding the deduplication? Looking at the link, it seems to show that both transfer-level and frame-level redundancy are viable options:

There exist two approaches to implementing transport-layer redundancy. The differences are confined to the specifics of a particular implementation, they are not manifested on the bus – nodes exhibit identical behavior regardless of the chosen strategy: […] Frame-level redundancy […] Transfer-level redundancy

This is all true. Both strategies are self-sufficient.

Deduplication at the transfer level requires keeping a dedicated state per input port that stores the last accepted transfer-ID. All reassembled transfers are then compared against that transfer-ID and if the value is not greater, the transfer is rejected as a duplicate. The state is reset to zero when the transfer-ID timeout condition is encountered (this is the delay specified in the Specification that Scott understandably doesn’t like). The sessions operating beneath the deduplicator need not be concerned with transport redundancy at all.
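
As a rough C sketch of that logic (simplified: on timeout this just accepts the transfer and restarts the state rather than literally resetting it to zero; the type and function names are made up for illustration):

#include <stdbool.h>
#include <stdint.h>

// Per-input-port deduplication state: the last accepted transfer-ID and the
// timestamp of the last accepted transfer.
typedef struct
{
    uint64_t last_transfer_id;
    uint64_t last_timestamp_usec;
} DedupState;

// Returns true if the reassembled transfer should be accepted.
// The transfer-ID timeout prevents a restarted remote node (whose
// transfer-ID counter starts over) from being rejected forever.
static bool dedup_accept(DedupState* const state,
                         const uint64_t    transfer_id,
                         const uint64_t    timestamp_usec,
                         const uint64_t    transfer_id_timeout_usec)
{
    const bool timed_out = (timestamp_usec - state->last_timestamp_usec) > transfer_id_timeout_usec;
    if (timed_out || (transfer_id > state->last_transfer_id))
    {
        state->last_transfer_id    = transfer_id;
        state->last_timestamp_usec = timestamp_usec;
        return true;
    }
    return false;  // Duplicate (or stale) transfer.
}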

Deduplication at the frame level is more convoluted because you need to match the MTU (which may be potentially limiting) or resort to the additional logic you mentioned in your previous post. Additionally, there are certain differences in their failure modes (I will skip this for now).

Also, frame-level deduplication cannot be used to facilitate heterogeneous transports, unlike the transfer-level strategy.

Heterogeneous transports will likely need their specific version of the library, canard vs. udpard, etc.
I haven’t fully read the link you sent (I’ll be reading it at some point soon); does it cover how to deduplicate transfers from heterogeneous transports where one transport has cyclic transfer-IDs and the other monotonic? When the state is reset to 0, does that mean the monotonic IDs will be reused?

On a separate note: what are your thoughts on the reassembly of out-of-order frames? Should we always assume we will receive frame 1 (SOT) first and the EOT frame last? I’m not entirely sure how the queued messages would get out of order if we aren’t doing redundant deduplication at the frame level.

I don’t think it must be so. Transfer-level deduplication can be generalized and abstracted away from the specifics of the transport layer, such that it is usable even if the deduplicated transfers come from different implementations.

Yes, it says that it is impossible.

A core assumption of Cyphal is that the transport layer does not reorder frames, so as long as your reassembler only accepts frames from one interface (i.e., non-redundant), there is no need to address this case. IIRC, the reassembler I implemented for PyCyphal might actually support this, but it is atypical for the protocol.

This seems to contradict:

I think there is some misunderstanding somewhere. I’ll read the doc and see if I can find a different way to phrase the question.

Er, yes, the underlying transport implementation can be arbitrary as long as its transfer-ID is monotonic.

First off, here are some links to the current progress of creating libudpard from libcanard and existing Cyphal/UDP specifications: libudpard
Note that there are some limitations here:

  1. It currently relies on the user to create and maintain UDP/IP sockets
  2. It relies on IP-level fragmentation of the UDP datagram for messages larger than the MTU (it essentially only supports single-frame transfers currently).
  3. It limits the node-ID to what libcanard had (IIRC, 127).
  4. It currently uses the naming ETH, Eth, eth instead of UDP, Udp, udp. This will change at a later time.

Here is a simple demo of setting up some sockets and sending a message: libudpard_demo


Additionally, there were some discussion points raised during the August 5th weekly OpenCyphal meeting:

  1. @scottdixon is great
  2. We should consider using a tree for the internal session allocation. This will still have deterministic memory consumption for the static systems that would be using this library.
  3. We should re-architect this library to include a small UDP/IP stack and build the entire Ethernet frame for the user. The user would then only need to maintain raw sockets, which should make this library more portable to embedded systems (a rough sketch of the header composition this implies follows this list).
  4. If we consider the above (3), then we will need to take care to ensure we account for IGMP in whatever design we implement (we need to publish multicast membership announcements; see the IGMP RFC).
  5. We could attempt to make the library compatible with Berkeley sockets, similar to the current prototype and demo, alongside providing an optional UDP/IP stack.
  6. Transfer-ID deduplication at the transfer level still seems desirable; however, there are still concerns about the reset timeout. We (the newer folks on the project) should try to brainstorm some solutions to deduplication.
  7. We need to converse with @scottdixon about the uavcan library and how it might utilize libcanard, libudpard, libserard, etc.
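
As a rough sketch of what point 3 implies (assuming IPv4 without options and leaving the UDP checksum zeroed, which is permitted for UDP over IPv4; this is only an illustration, not a proposed implementation):

#include <stddef.h>
#include <stdint.h>
#include <string.h>

// One's-complement checksum over the 20-byte IPv4 header (checksum field zeroed).
static uint16_t ipv4HeaderChecksum(const uint8_t* const header)
{
    uint32_t sum = 0U;
    for (size_t i = 0U; i < 20U; i += 2U)
    {
        sum += ((uint32_t) header[i] << 8U) | header[i + 1U];
    }
    while (sum > 0xFFFFU)
    {
        sum = (sum & 0xFFFFU) + (sum >> 16U);
    }
    return (uint16_t) ~sum;
}

// Serializes an IPv4 header (20 bytes) plus a UDP header (8 bytes) plus the
// payload into 'out', which must hold at least 28 + payload_size bytes.
// Multi-byte fields are written big-endian per the wire format.
// Returns the total number of bytes written.
static size_t writeUdpIpDatagram(uint8_t* const    out,
                                 const uint32_t    src_ip,
                                 const uint32_t    dst_ip,
                                 const uint16_t    src_port,
                                 const uint16_t    dst_port,
                                 const void* const payload,
                                 const size_t      payload_size)
{
    const uint16_t udp_length   = (uint16_t) (8U + payload_size);
    const uint16_t total_length = (uint16_t) (20U + udp_length);
    uint8_t*       p            = out;
    *p++ = 0x45U;                           // Version 4, IHL 5 (no options).
    *p++ = 0U;                              // DSCP/ECN.
    *p++ = (uint8_t) (total_length >> 8U);  // Total length, big-endian.
    *p++ = (uint8_t) (total_length & 0xFFU);
    *p++ = 0U;
    *p++ = 0U;                              // Identification.
    *p++ = 0x40U;
    *p++ = 0U;                              // Flags (don't fragment), fragment offset 0.
    *p++ = 64U;                             // TTL.
    *p++ = 17U;                             // Protocol: UDP.
    *p++ = 0U;
    *p++ = 0U;                              // Header checksum placeholder (filled in below).
    *p++ = (uint8_t) (src_ip >> 24U);
    *p++ = (uint8_t) (src_ip >> 16U);
    *p++ = (uint8_t) (src_ip >> 8U);
    *p++ = (uint8_t) (src_ip >> 0U);
    *p++ = (uint8_t) (dst_ip >> 24U);
    *p++ = (uint8_t) (dst_ip >> 16U);
    *p++ = (uint8_t) (dst_ip >> 8U);
    *p++ = (uint8_t) (dst_ip >> 0U);
    const uint16_t checksum = ipv4HeaderChecksum(out);
    out[10] = (uint8_t) (checksum >> 8U);
    out[11] = (uint8_t) (checksum & 0xFFU);
    *p++ = (uint8_t) (src_port >> 8U);
    *p++ = (uint8_t) (src_port & 0xFFU);
    *p++ = (uint8_t) (dst_port >> 8U);
    *p++ = (uint8_t) (dst_port & 0xFFU);
    *p++ = (uint8_t) (udp_length >> 8U);
    *p++ = (uint8_t) (udp_length & 0xFFU);
    *p++ = 0U;
    *p++ = 0U;                              // UDP checksum: zero (not computed).
    memcpy(p, payload, payload_size);
    return total_length;
}

The Ethernet frame itself (destination multicast MAC, EtherType 0x0800) would still need to be composed in front of this, and IGMP membership reports would have to be generated separately, as noted in point 4.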

@pavel.kirienko or @scottdixon

Do we have a preference between taking a dependency on a minimal UDP/IP stack and implementing a minimal UDP/IP stack ourselves? Is there a benefit to implementing it here?

Would we want to include the UDP/IP stack within libudpard directly or would we create a cyphaludpip library that libudpard would rely on?

Generally we don’t like dependencies in OpenCyphal. One of the tenets (I can’t remember if we wrote this down anywhere) is to minimise dependencies since OpenCyphal implementations are supposed to be leaf nodes in a system’s software stack (with only drivers beneath things like libcanard, etc).

It’s a compelling story to have sloccount libudpard be the full and complete accounting of source needed to enable Cyphal on top of an Ethernet MAC driver.


Here’s what I was describing during the meeting, @pavel.kirienko. Some behaviors of the stack have to be handled below libudpard for various automated responses: My Terrible Drawing

Yep. Sounds like we’ll have to switch from this model adopted in libcanard:

+---------------------------------+
|           Application           |
+-------+-----------------+-------+
        |                 |
+-------+-------+ +-------+-------+
|   Libcanard   | |  Media layer  |
+---------------+ +-------+-------+
                          |
                  +-------+-------+
                  |    Hardware   |
                  +---------------+

To this model reminiscent of the old libuavcan v0:

+---------------+
|  Application  |
+-------+-------+
        |
+-------+-------+
|   Libudpard   |
+---------------+
        |
+-------+-------+
|   NIC driver  |
+-------+-------+
        |
+-------+-------+
|    Hardware   |
+---------------+

The RxAccept call will have to be removed and some polling call will need to be added, I presume.
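
Purely as a hypothetical illustration of what such a polling call might look like (the name and exact signature are invented here, reusing the types from the earlier API sketch):

// Hypothetical polling entry point; not part of any agreed design.
// The application would call this periodically (or when the NIC driver
// signals activity); the library would drain received Ethernet frames from
// the driver, run its internal UDP/IP and IGMP processing, and report a
// completed transfer via out_transfer when one becomes available.
int8_t udpardPoll(Udpard* const           ins,
                  const UdpardMicrosecond now_usec,
                  UdpardRxTransfer* const out_transfer);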

RxAccept could probably stay, but as a private function that is called by whatever is running the IP stack. You may even keep it public in case someone wants to bring their own stack.