New node type proposal: Control reception

The_Alchemist · April 5, 2023, 3:03am

I propose adding control channel data reception to the Cyphal protocol. This is a common use case, and would be a good fit for various vehicle types that use Cyphal. This is a collaboration between Hydra and myself.

Requirements:

The device transmits control channel data with low latency. The canonical use case is a radio transmitter connected to a manual controller, with 4 high-resolution channels for pitch, roll, yaw, and throttle, and several more lower-resolution “aux” channels. The aux channels are intended for buttons and switches that have fixed positions. They might be used for changing flight mode, cycling steer points, activating payloads, arming motors, etc. This is a common example use, but a more general solution is desired.
The device sends link statistics, such as RSSI, the portion of packets received recently, the transmitter's nominal power level, etc. The note about a general solution applies here as well.
The device sends metadata, including an immediate message if the link is lost. (Note: This could also be handled by the receiver, but I think a backup from the RF node itself would be nice.) It may also send a periodic heartbeat containing system status.

I’m unclear on the exact capabilities of DroneCAN and Cyphal, and on the use cases that may arise. Below are two starting example implementations that represent different ends of a flexibility spectrum. Both would be sufficient for the canonical use case described above, but they differ in applications beyond it:

#1: A simple, partially hard-coded standard, analogous to the CRSF protocol. I imagine this will integrate in a straightforward way with DroneCAN. I’ve put this together by following examples on the DroneCAN List of standard data types. Of note, it sends all control channel data in the same packet, without distinguishing priority.

Example packet, using the DroneCAN List of standard data types:

uint16 [<=10] channel_high_precision     # High precision channel data. For example, throttle or pitch

uint8 [<=10] channel_low_precision        # Low precision channel data. For example, arm status or flight mode

Of note, I’m not sure how to make precision flexible using the example standard data types; in this example, I’ve hard-coded 2 precision levels. Compared to existing standards that use 11-13 bits of data for control channel data, this is higher precision, so uses more bandwidth/larger packet size than required. How would you handle this, using the DroneCAN spec?
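
To make the precision trade-off above concrete, here is a minimal C sketch. It assumes a source value of 11 to 13 bits (as in the existing standards mentioned) that is carried in a fixed uint16 field, and compares the raw payload sizes. The names widen_to_u16 and payload_bits are hypothetical and not part of DroneCAN or Cyphal.

```c
#include <assert.h>
#include <stdint.h>

/* Rescale an n-bit channel value (n in 1..16) to the full uint16 range.
 * Hypothetical helper for illustration only. */
static uint16_t widen_to_u16(uint16_t raw, unsigned bits)
{
    assert(bits >= 1 && bits <= 16);
    const uint32_t max_in = (uint32_t) ((1UL << bits) - 1UL);
    assert(raw <= max_in);
    /* Round to nearest so that 0 maps to 0 and max_in maps to 65535. */
    return (uint16_t) (((uint32_t) raw * 65535UL + max_in / 2U) / max_in);
}

/* Raw payload bits needed for `count` channels of `bits` bits each,
 * ignoring any array-length prefix or padding. */
static unsigned payload_bits(unsigned count, unsigned bits)
{
    return count * bits;
}
```

With 10 channels, payload_bits(10, 16) gives 160 bits versus 110 bits for payload_bits(10, 11), so the fixed 16-bit layout costs roughly 6 extra bytes per message in this example.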

Example link stats packet, sent at a slower rate, hard-coded values and precision; this is what CRSF uses. I am not proposing it specifically, but it’s an example of hard-coding, using a current standard. Of note, it is inflexible:

uint32 timestamp
uint8 uplink_rssi_1
uint8 uplink_rssi_2
uint7 uplink_link_quality
int8 uplink_snr
uint8 active_antenna
int8 rf_mode
int4 uplink_tx_power
uint8 downlink_rssi
uint8 downlink_link_quality
int8 downlink_snr

If the link is lost at any point, the node immediately broadcasts a terse message of high priority.

Downsides of the above:

Everything is hard-coded and inflexible.
It is impossible to add new data types or telemetry items without defining entire new frames.
Any change breaks backwards compatibility and introduces run-time or compile-time logic into the flight-controller codebase.
It looks to the past, only at what exists now, not to the future, which stifles development and experimentation.

#2: An ideal, flexible standard. I would prefer something like this, but am not sure if it fits the model of DroneCAN and/or Cyphal:

By default, the node is silent. The entity receiving data (e.g. the flight controller) submits a discovery request. The Rx responds with a list of information it can supply, including channel indices, available resolutions, and available data rates for each channel. Given this info, the FC can choose which data it wants to receive based on the Rx’s capabilities, which it knows from the discovery response. The subscribe request will always succeed, since the flight controller has the information it needs to make it.

Example of available information: #1: The channel number (to be interpreted by the Rx node). #2: Data rate. #3: Resolution. The node responds with a success or error message, depending on its ability to serve the request. On success, the node broadcasts the data as requested until it receives a cancellation request. Example subscription requests:

[
    Subscription index/reference = 1,
    Rate: 100Hz,
    Channel 1, 12 bits,
    Channel 2, 12 bits
]

[
    Subscription index/reference = 2,
    Rate: 50Hz,
    Channel 5, 2 bits,
]

[
    Subscription index/reference = 3,
    Rate: 10Hz,
    Channel 6, 2 bits,
    Channel 17, 1 bit,
    Channel 18, 1 bit,
]

Example data messages sent in response to these subscriptions:

[
  Subscription index/reference = 1,
  <12 bits of data for channel 1>,
  <12 bits of data for channel 2>,
]

[
  Subscription index/reference = 2,
  <2 bits of data for channel 5>,
]


[
  Subscription index/reference = 3,
  <2 bits of data for channel 6>,
  <1 bit of data for channel 17>,
  <1 bit of data for channel 18>,
]

The sequence of packets will be something like this:
1,1,1,1…,2,1,1,1…,2,1,1,1… eventually 3,1,1,1…, 2,1,1,1…
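
The interleaving above can be sketched as a small per-subscription deadline scheduler. This is only an illustration under stated assumptions: the Subscription struct, the poll_due function, and the microsecond time base are all hypothetical, and the rates are the 100 Hz, 50 Hz, and 10 Hz from the example requests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one active subscription. */
typedef struct {
    uint8_t  reference;   /* subscription index/reference */
    uint32_t period_us;   /* 1e6 / requested rate in Hz */
    uint32_t next_due_us; /* time of the next transmission */
} Subscription;

/* Write the references of all subscriptions due at now_us into out
 * (at most out_cap entries) and advance their deadlines.
 * Returns the number of references written. */
static size_t poll_due(Subscription *subs, size_t n, uint32_t now_us,
                       uint8_t *out, size_t out_cap)
{
    assert(subs != NULL && out != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        /* The signed difference tolerates wraparound of the time counter. */
        const int32_t late_by = (int32_t) (now_us - subs[i].next_due_us);
        if (late_by >= 0 && count < out_cap) {
            out[count++] = subs[i].reference;
            subs[i].next_due_us += subs[i].period_us;
        }
    }
    return count;
}
```

Polled every 10 ms with periods of 10000, 20000, and 100000 µs, subscription 1 is emitted on every poll, subscription 2 on every second poll, and subscription 3 on every tenth.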

The flexible approach would be robust to changing requirements across use cases, and would allow the subscriber (e.g. the flight controller) to receive only the data it needs, at a rate it can handle. It would minimize the amount of data sent, saving bandwidth. Of note, the FC could make multiple subscriptions, each with a reference that is included in the subscribe request; the Rx returns the same reference with the data, so the FC knows which data it is receiving and how to decode it.
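
Decoding by reference implies that both sides agree on a bit layout for each subscription. A minimal sketch of such packing follows, assuming an 8-bit reference followed by the channel values in request order, least significant bit first. The layout and the put_bits/get_bits helpers are assumptions for illustration, not an existing DroneCAN or Cyphal encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Append the low `bits` bits of `value` at bit offset *pos, LSB first.
 * The buffer must be zero-initialized and large enough. */
static void put_bits(uint8_t *buf, size_t *pos, uint32_t value, unsigned bits)
{
    assert(buf != NULL && pos != NULL && bits <= 32);
    for (unsigned i = 0; i < bits; i++, (*pos)++) {
        if ((value >> i) & 1U) {
            buf[*pos / 8] |= (uint8_t) (1U << (*pos % 8));
        }
    }
}

/* Read `bits` bits starting at bit offset *pos, LSB first. */
static uint32_t get_bits(const uint8_t *buf, size_t *pos, unsigned bits)
{
    assert(buf != NULL && pos != NULL && bits <= 32);
    uint32_t value = 0;
    for (unsigned i = 0; i < bits; i++, (*pos)++) {
        if ((buf[*pos / 8] >> (*pos % 8)) & 1U) {
            value |= (uint32_t) 1U << i;
        }
    }
    return value;
}
```

For subscription 1 in the example, the sender would call put_bits with widths 8, 12, and 12, producing a 4-byte payload; the FC reads the reference first, looks up the widths it requested, and calls get_bits with the same widths.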

I think some combination of the approaches may be viable. For example, a flexible subscription setup, but restricting the number of data rates to 2.

Of note, I think for many cases of channel data, 2+ frames may be required on FDCAN. If this addition is made available for basic CAN, multi-frame packets would be required in all cases. This raises the question: Should this be backwards compatible with basic CAN, or should it be FDCAN only? Suggested approach: A discovery agreement between the Rx and FC, allowing either to be used. The subscriber can choose the frame format (CAN/FDCAN) too.
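
A rough way to estimate the frame count for a given payload is sketched below. It assumes Cyphal/CAN-style framing, where each frame gives up one byte to a tail byte and a multi-frame transfer carries an additional 2-byte transfer CRC; CAN FD padding is ignored because it does not change the count. The function name frames_needed is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { MTU_CLASSIC = 8, MTU_FD = 64 }; /* data bytes per CAN frame */

/* Estimate the number of CAN frames for one transfer.
 * Assumption: 1 tail byte per frame, 2-byte CRC on multi-frame transfers. */
static size_t frames_needed(size_t payload_bytes, size_t mtu_bytes)
{
    assert(mtu_bytes >= 8);
    const size_t per_frame = mtu_bytes - 1U;      /* room left after the tail byte */
    if (payload_bytes <= per_frame) {
        return 1;                                 /* single-frame transfer, no CRC */
    }
    const size_t total = payload_bytes + 2U;      /* payload plus transfer CRC */
    return (total + per_frame - 1U) / per_frame;  /* ceiling division */
}
```

Under these assumptions, a payload of about 30 bytes (ten 16-bit plus ten 8-bit channels, ignoring array-length prefixes) takes 5 frames at MTU_CLASSIC and a single frame at MTU_FD.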

Relevant discussion on the TBS GitHub about improving the CRSF spec, which is immediately applicable here: CRSF Protocol Repo · Issue #26 · tbs-fpv/freedomtx · GitHub

Would appreciate any and all input on establishing a standard that’s simple and flexible. Of note, I already have a working CAN receiver, using an ExpressLRS circuit integrated with an STM32 MCU that acts as the FDCAN node, and a CAN transceiver. Using this for specific hardware where both Rx and flight controller are cooperative is easy; my intent here is establishing a common API that will allow CAN Rxes to be swapped arbitrarily, with no change to flight-controller code.

Key design requirements/features:
Flexible
Additions to telemetry data points should not require a protocol change.
Additions to Rx data points should not require a protocol change.
Should not force users into legacy data items, such as the notion of ‘µs-based’ channel values.
Minimal or zero error handling requirements in the subscriber/flight-controller core.
Would be nice:
Re-use of existing code/technologies in the flight-controller core.
Infallible subscription requests (requires conveying the maximum number of subscribe requests, maximum unique data rates, etc., in the ‘discover’ phase).

pavel.kirienko · April 9, 2023, 6:14am

Hi David! Thank you for starting this thread, and apologies for taking my time to get back to you – am traveling.

As I indicated on Matrix, this matter is of interest but it is outside of the scope of the Cyphal standard (which is not to say that it can’t be discussed on this forum). Some time ago one would have said that it is within the domain of UDRAL, but UDRAL itself is currently being transferred to a completely separate effort that is maintained on GitHub here by @Dima:

Hence what we are discussing here is within the domain of DS-015. You will see that many of your valid concerns are already sensibly addressed by Cyphal out of the box.

In Cyphal, one should normally tend towards clarity and simplicity over bandwidth optimization. The idiomatic approach here is to use fixed larger types, like int16 or float32 (or float16 if you can tolerate ~10 bit resolution), for all analog channels. Variable-width encoding is not supported in Cyphal (nor DroneCAN). The topic of good interface design is covered at length in the Guide.

Cyphal (unlike DroneCAN) allows you to add/remove fields to a data type definition without breaking backward compatibility. Practical examples are covered both in the Guide linked above and in the Specification, section 3.8 “Compatibility and versioning”. This feature has limitations but the use case you outlined could serve as a perfect example of data type versioning.

On the high level, this description begs the objection of being too stateful and complex. Some systems are certain to benefit from the added flexibility, while others where functional safety guarantees are of interest will be difficult to reconcile with it. For an in-depth discussion, refer to the section of the Guide dedicated to stateful interfaces.

However, if you choose to express your configurables via dedicated topics, you will see that the logic you described can be implemented in Cyphal (not DroneCAN) out of the box without the need to reinvent custom negotiation protocols. The key here is to choose a particular topic naming format that encodes the required information in the topic name. For example, channels_0to8_13bit_200hz, etc (this is just a crude example; at this point I see little reason to segregate options by resolution). Then, in the case of a high-integrity system, the integrator will be able to choose one (or several) topic once and configure it manually, while in the case of a more flexible PnP scenario the flight controller would simply list the available topics and find the one that suits its configuration best, and then enable it selectively. The other topics would remain, of course, unused, thus consuming zero resources from the network and from the publisher node.
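
As an illustration of the PnP case, the sketch below parses topic names in the crude channels_0to8_13bit_200hz format from the example and picks the slowest topic that still meets a minimum rate. The name format comes from the example above; TopicInfo, parse_topic_name, and pick_topic are hypothetical names, and how the list of available topics is obtained is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Fields encoded in a topic name such as "channels_0to8_13bit_200hz". */
typedef struct {
    unsigned first_channel;
    unsigned last_channel;
    unsigned bits;
    unsigned rate_hz;
} TopicInfo;

/* Returns true only if the whole name matches the assumed format. */
static bool parse_topic_name(const char *name, TopicInfo *out)
{
    assert(name != NULL && out != NULL);
    int consumed = 0;
    const int fields = sscanf(name, "channels_%uto%u_%ubit_%uhz%n",
                              &out->first_channel, &out->last_channel,
                              &out->bits, &out->rate_hz, &consumed);
    /* %n is stored only if the trailing "hz" matched as well. */
    return fields == 4 && consumed > 0 && name[consumed] == '\0';
}

/* Index of the slowest topic with rate >= min_rate_hz, or -1 if none. */
static int pick_topic(const char *const *names, int count, unsigned min_rate_hz)
{
    int best = -1;
    unsigned best_rate = 0;
    for (int i = 0; i < count; i++) {
        TopicInfo info;
        if (parse_topic_name(names[i], &info) && info.rate_hz >= min_rate_hz &&
            (best < 0 || info.rate_hz < best_rate)) {
            best = i;
            best_rate = info.rate_hz;
        }
    }
    return best;
}
```

In a high-integrity system none of this runs at all: the integrator picks the topic once and configures it manually, as described above.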

A closely related topic is that of the PnP node configuration. Please read this:

You can further enhance your message definition somewhat to take advantage of the different, receiver-defined update rates per channel (and possibly resolution, although as I said earlier this may not be justifiable). To do this, you could define a separate data type that carries the channel value and its index, let’s call it Channel.1.0:

# Channel.1.0
uint8 channel_index
Value.1.0 value
@extent 64

In the fixed-resolution case, Value.1.0 would be simply a native numeric type like int16 or float32. In the variable-resolution case, you would make it a union like this:

# Value.1.0
@union
uint16 high_resolution
uint8 low_resolution
@sealed

Then you fold it into an array in the top-level type that is actually published by the receiver:

Channel.1.0[<=16] channels
# The array contains only those channels that are updated in this cycle.

To compensate for the possible message loss, the receiver could force an update every 100 ms or so regardless of whether there’s been an actual change.
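
On the publisher side, the "changed channels plus forced refresh" logic could look roughly like the C sketch below. It assumes the fixed-resolution case with a uint16 value, 16 channels, and a 100 ms forced-update period; the struct and function names are hypothetical, and serialization into the actual message is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CHANNEL_COUNT   16U
#define FORCE_PERIOD_MS 100U

/* Plain C mirror of Channel.1.0, assuming a uint16 value. */
typedef struct {
    uint8_t  channel_index;
    uint16_t value;
} Channel;

/* What was last published for each channel, and when. */
typedef struct {
    uint16_t last_sent[CHANNEL_COUNT];
    uint32_t last_sent_ms[CHANNEL_COUNT];
    bool     ever_sent[CHANNEL_COUNT];
} PublisherState;

/* Fill `out` (capacity CHANNEL_COUNT) with the channels that changed or
 * were not published for FORCE_PERIOD_MS. Returns the number of entries. */
static size_t collect_updates(PublisherState *st, const uint16_t *current,
                              uint32_t now_ms, Channel *out)
{
    assert(st != NULL && current != NULL && out != NULL);
    size_t n = 0;
    for (size_t i = 0; i < CHANNEL_COUNT; i++) {
        const bool changed = !st->ever_sent[i] || current[i] != st->last_sent[i];
        const bool stale = (uint32_t) (now_ms - st->last_sent_ms[i]) >= FORCE_PERIOD_MS;
        if (changed || stale) {
            out[n].channel_index = (uint8_t) i;
            out[n].value = current[i];
            n++;
            st->last_sent[i] = current[i];
            st->last_sent_ms[i] = now_ms;
            st->ever_sent[i] = true;
        }
    }
    return n;
}
```

The returned entries would populate the channels array of the top-level message; a result of zero means nothing needs to be published in this cycle.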

Cyphal does not distinguish between Classic CAN and CAN FD (sic, FDCAN is not the correct spelling and may hinder forum searchability). As a designer of the network service, you should focus on the higher-level objectives and leave the matter of transport management to the transport layer. I recommend that you think in terms of network services, domain objects, and topics, and purposefully forget about the low-level aspects until the final stages of the design process.

The_Alchemist · April 10, 2023, 3:04am

Thank you for the detailed insights!

In Cyphal, one should normally tend towards clarity and simplicity over bandwidth optimization. The idiomatic approach here is to use fixed larger types, like int16 or float32 (or float16 if you can tolerate ~10 bit resolution), for all analog channels. Variable-width encoding is not supported in Cyphal (nor DroneCAN). The topic of good interface design is covered at length in the Guide.

I like that approach. Would still prefer customization if given a choice, but would probably default to something at least 16 bit wide if there’s no choice.

On the high level, this description begs the objection of being too stateful and complex. Some systems are certain to benefit from the added flexibility, while others where functional safety guarantees are of interest will be difficult to reconcile with it. For an in-depth discussion, refer to the section of the Guide dedicated to stateful interfaces.

Yea - I agree. I think what I posted is probably too complex as you point out.

2 points in particular leave me with an open question -

Cyphal does not distinguish between Classic CAN and CAN FD (sic, FDCAN is not the correct spelling and may hinder forum searchability). As a designer of the network service, you should focus on the higher-level objectives and leave the matter of transport management to the transport layer. I recommend that you think in terms of network services, domain objects, and topics, and purposefully forget about the low-level aspects until the final stages of the design process.

As I indicated on Matrix, this matter is of interest but it is outside of the scope of the Cyphal standard (which is not to say that it can’t be discussed on this forum). Some time ago one would have said that it is within the domain of UDRAL, but UDRAL itself is currently being transferred to a completely separate effort that is maintained on GitHub here by @Dima

These, from my inexperienced perspective, seem to make up most of the node’s behavior. So, if Cyphal doesn’t specify either of these, it makes me wonder what I need to write to make the node Cyphal compliant. I’m sure this will become apparent once I try to integrate it on an existing bus with other devices. (Right now I’m prototyping with just an FC and the node together.) For example, I may say “This is a new device; Basic CAN is unsupported so the control channel data will fit on 2 frames”. Or probably not, because it’s simple enough to code it to split into different frame counts depending on the CAN version used, but that’s an example of a road we could go down.

I think my approach will be this: Make it work using custom firmware. Err on the side of simplicity over flexibility at first. Submit PRs to PX4 and AP to make it work, compatible with Cyphal and DC (DroneCAN) on the former, and with DC on the latter. My misunderstandings and questions will likely be cleared up at that point. Overall, I’m going to look at it as “make it compatible with common OSS firmwares” rather than “make it compatible with DroneCAN and Cyphal”, since I’m having trouble understanding what Cyphal and DC are.

The limited degrees of freedom of code and hardware design expose the significance of abstractions. If I make the device work, the project requirements are met. There are many pages of Cyphal and DC docs and forum posts. There will be a few hundred lines of node code.

pavel.kirienko · April 10, 2023, 4:57pm

This behavior is managed by configuring the MTU correctly in libcanard:

github.com/OpenCyphal/libcanard

libcanard/canard.h

5c69d451a


      
          /// The transport-layer maximum transmission unit (MTU). The value can be changed arbitrarily at any time between
          /// pushes. It defines the maximum number of data bytes per CAN data frame in outgoing transfers via this queue.
          ///
          /// Only the standard values should be used as recommended by the specification;
          /// otherwise, networking interoperability issues may arise. See recommended values CANARD_MTU_*.
          ///
          /// Valid values are any valid CAN frame data length value not smaller than 8.
          /// Invalid values are treated as the nearest valid value. The default is the maximum valid value.
          size_t mtu_bytes;

Example:

github.com/OpenCyphal-Garage/demos

differential_pressure_sensor/src/main.c

598be7aad


      
          // should define "uavcan.can.bitrate" of type natural32[2]; the second value is 0/ignored if CAN FD not supported.
          const int sock[CAN_REDUNDANCY_FACTOR] = {
              socketcanOpen("vcan0", val.natural16.value.elements[0] > CANARD_MTU_CAN_CLASSIC)  //
          };
          for (uint8_t ifidx = 0; ifidx < CAN_REDUNDANCY_FACTOR; ifidx++)
          {
              if (sock[ifidx] < 0)
              {
                  return -sock[ifidx];
              }
              state.canard_tx_queues[ifidx] = canardTxInit(CAN_TX_QUEUE_CAPACITY, val.natural16.value.elements[0]);
          }
          
          // Load the port-IDs from the registers. You can implement hot-reloading at runtime if desired.
          // Publications:
          state.port_id.pub.differential_pressure =
              getPublisherSubjectID("airspeed.differential_pressure",
                                    uavcan_si_unit_pressure_Scalar_1_0_FULL_NAME_AND_VERSION_);
          state.port_id.pub.static_air_temperature =
              getPublisherSubjectID("airspeed.static_air_temperature",
                                    uavcan_si_unit_temperature_Scalar_1_0_FULL_NAME_AND_VERSION_);

The_Alchemist · April 11, 2023, 1:50pm

Thanks! Will check out libcanard as an example!