[Cyphal/UDP] Architectural issues caused by the dependency between the node's IP address and its identity

I think I now see what you were trying to achieve with the variable-size Cyphal header: the header size field would change depending on whether the payload incorporates the timestamp as the first field or not; if it does, the header size would be minimal (24 bytes per our examples above). Otherwise, the header timestamp would be injected at the same offset from the origin of the Cyphal frame where it would have been if it were part of the serialized payload, and the Cyphal header length would be increased by 7 bytes such that the injected timestamp is not deserialized as part of the message. Except that in your case, the CHL is said to be a multiple of 4 bytes. Is this more or less in line with your goal, at least on a high level?

With a fixed-size header, the objective of “without incurring the overhead of including the same timestamp twice” is not achievable because there will always be some reserved space in the header if the timestamp is not provided.

One possible option here that doesn’t require variable-size headers and yet allows the transport layer to reach the timestamp is to replace your timestamp indication flag with an optional offset:

uint8 timestamp_offset_from_header_origin  # Zero if not timestamped.
  1. If the serialized object is timestamped, this field would be set to the header size (32 bytes).

  2. If the serialized object is not timestamped but the header timestamp is given in the reserved field, this field would be set to the offset of the reserved field (8 bytes per my proposal, 24 bytes per your proposal).

  3. If neither is present, this field would be set to zero, indicating the lack of timestamps.

The obvious disadvantage is that we always incur the overhead of transferring the reserved field in the header so that kind of defeats the point.
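
To make the three cases above concrete, here is a minimal sketch of how a receiver might resolve the field. Everything in it is an assumption made for illustration: the function name, the 7-byte little-endian timestamp width (taken from the 7-byte figure discussed above), and the frame contents.

```python
from typing import Optional

# Assumed: a 56-bit little-endian timestamp, per the 7-byte figure discussed above.
TIMESTAMP_SIZE = 7


def read_timestamp(frame: bytes, timestamp_offset_from_header_origin: int) -> Optional[int]:
    """Return the timestamp found at the given offset, or None if the frame is not timestamped."""
    if timestamp_offset_from_header_origin == 0:
        return None  # Case 3: neither the payload nor the header carries a timestamp.
    start = timestamp_offset_from_header_origin
    raw = frame[start:start + TIMESTAMP_SIZE]
    if len(raw) != TIMESTAMP_SIZE:
        raise ValueError("frame too short for the declared timestamp offset")
    return int.from_bytes(raw, "little")


# Case 1: the timestamp is the first field of the payload, right after a 32-byte header.
HEADER_SIZE = 32
example_frame = bytes(HEADER_SIZE) + (123456).to_bytes(TIMESTAMP_SIZE, "little") + b"rest"
example_timestamp = read_timestamp(example_frame, HEADER_SIZE)
```

The transport layer never has to know whether the bytes it reads belong to the header or to the payload, which is the property this option is meant to provide.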

yes


uint8 timestamp_offset_from_header_origin  # Zero if not timestamped.

This works for me; however, the checksum becomes an issue. My proposal has the property of including the timestamp in the header checksum. DPI can’t use header data if the header is invalid, so it must be part of this checksum.


My proposal does require 8 more bytes to expand the header, but it never transmits those bytes on the wire as additional header data. What we’re doing is borrowing 8 bytes, sometimes, from the data payload to save bandwidth, but we still require anyone reading the header to always read those 8 bytes when calculating the checksum, and to allocate 8 bytes if using the timestamp without deserializing the message.

We must choose: either to retain the fixed-size header or to sacrifice intelligent timestamping.

To this end, I would like to discuss how a Cyphal/UDP implementation would deduce whether the serialized object contains a timestamp, so that it can decide whether to add an explicit timestamp to the header or not. Do you have some API solution in mind for this? I presume that a generic implementation that always adds the header timestamp would also defeat the point of this variable-size-header approach.

My assumption was that this would be an API where the sender tells the library that there is a timestamp and, if we go that route, its offset.

Okay. I certainly see the value in this but I fear that we might accidentally build another DDS unless we scrutinize every feature we add to the protocol.

I suspect that the situation where the application layer and the transport layer view the timestamping problem differently is not entirely impossible. From the transport layer standpoint, a timestamp could be leveraged for queueing policy implementations, discarding of obsolete data, and rate limiting, and we can expect it to represent the point in time where the transfer is emitted to the network. At the application layer, a timestamp could potentially refer to a point in time that precedes the formation of the transfer (e.g., it is common for state estimators to timestamp published estimates with the timestamp of the most recently received sensor feed message). This creates a certain danger that the transport layer might misuse an application-layer timestamp for transport purposes where it would not be appropriate. If you accept this, then perhaps you might see how the danger arises out of our attempt to save seven bytes per transfer (sic! not per frame) by mixing two distinct layers of the communication stack.

The risk is indeed low (I expect the cases where the application layer timestamp is ill-suited for transport purposes to be rare), but then so is the reward (saving just 7 bytes per transfer). Considering our commitment to simplicity, I am mildly inclined towards the option of using a simpler and less efficient fixed-size header with an optional transport layer timestamp that does not invite implementations to make assumptions about the contents of the serialized payload.

Do you think this is sound or am I missing something?

Per the dev call, this is what I think we all agreed to:

uint4 version                      # <- 1
void4
 
@assert _offset_ == {8}
uint3 priority                     # Duplicates QoS for ease of access; 0 -- highest, 7 -- lowest.
void5
 
@assert _offset_ == {16}
uint16 source_node_id
uint16 destination_node_id
uint16 data_specifier              # Like in Cyphal/serial: subject-ID | (service-ID + request/response discriminator).
 
@assert _offset_ == {64}
uint64 transfer_id
 
@assert _offset_ == {128}
uint31 frame_index
bool end_of_transfer
 
uint16 user_data
# Opaque application-specific data with user-defined semantics. Generic implementations should ignore this field.
 
@assert _offset_ % 16 == {0}
uint8[2] header_crc16_big_endian
 
@assert _offset_ / 8 == {24}       # Fixed-size 24-byte header with natural alignment for each field ensured.
@sealed

  1. This is an optimization for UDP/IP on Ethernet. By limiting the multicast group ID to the least significant 23 bits, Ethernet hosts can avoid additional filtering responsibilities above layer 2.
  2. RFC 2365, Section 6.2.1 reserves 239.0.0.0/10 and 239.64.0.0/10 for future use (because of footnote 1, Cyphal/UDP does not have access to the 239.128.0.0/10 scope). Cyphal/UDP uses this bit to isolate IP header version 0 traffic (note that the IP header version is not necessarily the same as the Cyphal header version) to the 239.0.0.0/10 scope, but we can enable the 239.64.0.0/10 scope in the future.
  3. SNM (Service, Not Message): If set then this is an RPC request or response and the 16 LSbs of the destination IP address are the full-range destination node identifier. If not set then the 15 LSbs of the destination IP address are a subject identifier for a pub/sub message and the 16th LSb is 0.
  4. Zero on transmit, discard on receipt unless zero.
  5. This is a temporary UDP port. We’ll register an official one later.
  6. Per RFC 1112, the default TTL is 1, which is unacceptable. Therefore, publishers should use the TTL value of 16 by default, which is chosen as a sensible default suitable for any intravehicular network.
  7. (comment removed)
  8. The data specifier is taken directly from Cyphal/Serial (pycyphal.transport.serial package — PyCyphal 1.15.0 documentation)
  9. If the SNM bit (see note 10) is set then this is a 10-bit service identifier with a 1-bit IRNR flag (see note 11), otherwise it is a 13-bit subject identifier.
  10. SNM (Service, Not Message). Same value as found in the destination IP header (SNM, see note 3).
  11. IRNR (Is Request Not Response) if SNM (see note 10) is set.
  12. (comment removed)
  13. Like in CAN: 0 – highest priority, 7 – lowest priority. This data is duplicated from lower-layer QoS fields but provided in the Cyphal header to simplify transfer forwarding where the QoS data is not readily available above the transport layer.
  14. 0xFFFF == anonymous transfer
  15. The 31 bit frame index within the current transfer.
  16. EOT (End Of Transfer): the most significant bit (bit 31) of the 32-bit frame index field is set if the current frame is the last frame of the transfer.
  17. If EOT (see note 16) is set then this is the CRC of the reassembled transfer (header data excluded). This also applies to single-frame transfers, where EOT is always set. In this case the CRC covers just the single frame, which differs from CAN, where single-frame transfers do not use a Cyphal CRC because they rely exclusively on the CAN CRC. Because the UDP checksum is weak, Cyphal/UDP relies on it only as an optimization for multi-frame transfers (where a UDP checksum failure can catch an error before the transfer is reassembled and the strong Cyphal CRC is applied).
  18. 0xFFFF == broadcast
  19. Header CRC is CRC-16/CCITT-FALSE (aka CRC-16/AUTOSAR) and is encoded as a two-byte, big-endian integer.
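
For implementers, the layout above can be sketched in Python with the standard library alone. This is an illustration, not a reference implementation: the names `pack_header` and `crc16_ccitt_false` are hypothetical, and the sketch assumes that the header CRC covers all 22 bytes that precede it (including `user_data`) and that fields are laid out little-endian with the least significant bit first, as DSDL prescribes.

```python
import binascii
import struct

# All fields except the trailing CRC: version, priority, source node-ID,
# destination node-ID, data specifier, transfer-ID, frame index (EOT in the MSB), user data.
_HEADER_WITHOUT_CRC = struct.Struct("<BBHHHQIH")
HEADER_SIZE = _HEADER_WITHOUT_CRC.size + 2  # 22 bytes of fields plus the 2-byte CRC.


def crc16_ccitt_false(data: bytes) -> int:
    # binascii.crc_hqx uses the CCITT polynomial 0x1021; seeding it with 0xFFFF
    # yields CRC-16/CCITT-FALSE.
    return binascii.crc_hqx(data, 0xFFFF)


def pack_header(priority: int, source_node_id: int, destination_node_id: int,
                data_specifier: int, transfer_id: int, frame_index: int,
                end_of_transfer: bool, user_data: int = 0, version: int = 1) -> bytes:
    if not 0 <= priority <= 7:
        raise ValueError("priority must fit in 3 bits")
    if not 0 <= frame_index < 2**31:
        raise ValueError("frame_index must fit in 31 bits")
    index_eot = frame_index | (int(end_of_transfer) << 31)
    body = _HEADER_WITHOUT_CRC.pack(version & 0x0F, priority, source_node_id,
                                    destination_node_id, data_specifier,
                                    transfer_id, index_eot, user_data)
    # The CRC is appended big-endian, as stated in note 19.
    return body + crc16_ccitt_false(body).to_bytes(2, "big")


example_header = pack_header(priority=4, source_node_id=42, destination_node_id=0xFFFF,
                             data_specifier=1234, transfer_id=7, frame_index=0,
                             end_of_transfer=True)
```

Because the CRC is stored big-endian, running the same CRC over all 24 bytes of `example_header` yields zero.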


Yes, all seems to be correct, except that the user_data not being included in the CRC computation is surprising. Why do you want it to be this way? If this is indeed to be so, then shouldn’t it be swapped places with the CRC field?

fixed

:tada:

@maksim.drachov seems like we’ve reached a tentative consensus here. Someone else might still raise objections, but generally, I would say it is now safe to proceed with the implementation of the new header layout and the refined multicast address structure. The new header layout is defined in the previous post by Scott: [Cyphal/UDP] Architectural issues caused by the dependency between the node's IP address and its identity - #60 by scottdixon

The new IP address format is defined here: [Cyphal/UDP] Architectural issues caused by the dependency between the node's IP address and its identity - #46 by scottdixon

The transfer-CRC is now to be present not only in multi-frame transfers but also in single-frame transfers, always.

I think these are all the changes. Scott, please correct me if I missed anything.


I updated #60 to be a single post with the entire result of this thread’s negotiations.


The header CRC function is CRC-16-CCITT-FALSE. Some details are available in the chat room here:

This is now available in PyCyphal v1.13, brought to us by @maksim.drachov


@scottdixon @schoberm @maksim.drachov is the header CRC supposed to be serialized in little-endian or big-endian byte order? Considering that DSDL prescribes little-endian, the answer seems obvious, but the native byte ordering for CRC16-CCITT is big-endian, and the native byte ordering is used in the Cyphal/CAN multi-frame transfer CRC.

>>> from pycyphal.transport.commons.crc import CRC16CCITT
>>> CRC16CCITT.new(b'123456789').value
10673

>>> CRC16CCITT.new(b'123456789').value_as_bytes.hex()  # Big endian
'29b1'

>>> hex(10673)
'0x29b1'

>>> CRC16CCITT.new(b'123456789\x29\xb1').value  # Correct residue
0

>>> CRC16CCITT.new(b'123456789\xb1\x29').value  # Incorrect residue
37875

The byte ordering currently implemented in PyCyphal UDP transport by @maksim.drachov is little endian.

The byte ordering currently implemented in @samcrow’s Canadensis is little endian.

The byte ordering currently implemented in libudpard is, uhm, platform-dependent.

P.S.: The native byte ordering of CRC32C is little-endian, so there is no ambiguity there.
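
To illustrate the P.S., here is a dependency-free sketch, assuming the transfer-CRC is CRC-32C as implied above. The function names are hypothetical and the bitwise loop is chosen for brevity, not speed.

```python
def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def append_transfer_crc(payload: bytes) -> bytes:
    # Little-endian is the native ordering for this reflected CRC.
    return payload + crc32c(payload).to_bytes(4, "little")


def transfer_crc_is_valid(payload_with_crc: bytes) -> bool:
    if len(payload_with_crc) < 4:
        return False
    payload, received = payload_with_crc[:-4], payload_with_crc[-4:]
    return crc32c(payload) == int.from_bytes(received, "little")


# With the CRC appended little-endian, the CRC of the whole sequence is the same
# constant for any payload, so a receiver can also validate by checking the residue.
residue_a = crc32c(append_transfer_crc(b"123456789"))
residue_b = crc32c(append_transfer_crc(b"another payload"))
```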

Currently the next MR should move Cyphal/Serial to little-endianness, so that only leaves Cyphal/CAN to be ported eventually. (Speaking about pycyphal here.)

Also: all other values in the header are encoded as little-endian, so I see no reason to make an exception for CRC16-CCITT.

Cyphal/CAN will not be ported because that would break wire compatibility. Either we move the header CRC field in Cyphal/UDP and Cyphal/serial to big-endian now, or we retain the inconsistency forever.

Well…

In that case big-endian :sob:

Can you point me to the governing standard here?

It follows from the definition of this algorithm; more info here: Catalogue of parametrised CRC algorithms. You can see in the example I provided that the correct residue is obtained only if the bytes are fed in big-endian order.

Right, but that’s the endianness of that algorithm. We’re only concerned with the representation of the 2-byte result in our UDP header. Does the standard actually control the endianness of that value on the wire or should the user simply assume network byte order?

No. It’s our value and we can do whatever we want with it. The problem here is that the process of verification becomes a bit (a few cycles) more complex because you need to compute the CRC for the entire header sans the last two bytes and then compare the result against the value contained in the header. If the value were in big-endian byte order, you would simply run the CRC algorithm over the entire header and then ensure that the result is zero. This is not a significant shortcoming by any margin but still.
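
The two verification paths can be compared side by side in a short sketch. It uses `binascii.crc_hqx` from the Python standard library seeded with 0xFFFF, which gives CRC-16/CCITT-FALSE; the function names are hypothetical.

```python
import binascii


def crc16_ccitt_false(data: bytes) -> int:
    # CCITT polynomial 0x1021 with the initial value 0xFFFF.
    return binascii.crc_hqx(data, 0xFFFF)


def verify_big_endian(header: bytes) -> bool:
    """CRC stored big-endian: run the CRC over the whole header and expect zero."""
    return crc16_ccitt_false(header) == 0


def verify_little_endian(header: bytes) -> bool:
    """CRC stored little-endian: recompute over the body and compare with the stored value."""
    body, stored = header[:-2], header[-2:]
    return crc16_ccitt_false(body) == int.from_bytes(stored, "little")


body = b"123456789"  # Stand-in for the 22 header bytes that precede the CRC.
crc = crc16_ccitt_false(body)
ok_big = verify_big_endian(body + crc.to_bytes(2, "big"))
ok_little = verify_little_endian(body + crc.to_bytes(2, "little"))
```

Both paths detect the same errors; the big-endian variant merely avoids slicing the header and the explicit comparison.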