Configuration of Node-ID and Port-ID | DHCP-esque static IP behavior in Cyphal

Scoeerg · January 22, 2025, 9:14am

Hi,

tl;dr:

What’s “the” solution to uniquely identifying message/node-combinations consistently over powercycles? I have some motivation, my own understanding and a proposed solution based off both using Allocatee-Allocator with Dynamic node ID allocation as fallback.

Motivation:

I am building STM32-based compute modules which will then drive sensors, actuators, monitors, etc. Each STM32 will be one Cyphal node in a Cyphal network that uses CAN as the transport layer, and each node will subscribe to and publish multiple messages/services.

I will flash the same software on any given STM32+Peripherals combination. Example:

STM32 + H-Bridge + Motor (STM-H-M)

Now, if I build a robot/drone/whatever that has two of those Cyphal nodes (front-left and front-right), how do I make sure that even through powercycles I can identify all messages front-left and front-right? Remember: they run the same software, I will not hardcode e.g. Node-IDs.

My Understanding (Correct me if wrong, still learning):

related documentation:

Cyphal currently supports Plug-and-Play node ID allocation, which will not be persistent (through powercycles), as well as Allocatee-Allocator exchange. For understanding, an analogy:

Allocator: DHCP-Server
Allocatee: DHCP-client
Allocatee-unique-ID: DHCP-client MAC-address

But using Allocatee-Allocator also means Cyphal will not be a decentralized network anymore, so there must be a fallback strategy.

My approach:

I would implement:

On Allocatee-Side:

Set STM32 in anonymous state
Request Node id as allocatee with STM32-hardware built in 96bit unique ID, where the preferred ID would be identical on all STM32 with the same peripherals (e.g. motor-node will prefer ID 20, IMU-nodes will prefer ID 30 etc.).
If there is no allocator on the network after some time (What time is recommended?), fall back to dynamic PnP allocation.

On Allocator-Side (e.g. a Linux-machine)

Start Allocator
Look for a lookup-table of Node-ID/Unique-ID pairs, if it does not exist generate one
Assign node IDs based on lookup-table, if unique-ID not in table write in table and save to file

Since the lookup-table will persist through powercycles, we have a solution.
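
The allocator-side table described above could look roughly like the following. This is a minimal sketch in plain Python, not PyCyphal's allocator: the class name `AllocationTable`, the JSON file format, and the "preferred ID first, then count upward" search order are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict

MAX_NODE_ID = 127  # highest node-ID available on Cyphal/CAN


class AllocationTable:
    """Persistent unique-ID -> node-ID table (hypothetical helper, not a PyCyphal API)."""

    def __init__(self, path: Path) -> None:
        self._path = path
        # Reuse the saved table if the file exists, otherwise start with an empty one.
        self._table: Dict[str, int] = json.loads(path.read_text()) if path.exists() else {}

    def allocate(self, unique_id: bytes, preferred: int) -> int:
        key = unique_id.hex()
        if key in self._table:
            return self._table[key]  # known hardware keeps its node-ID
        taken = set(self._table.values())
        # Try the preferred ID first, then count upward with wrap-around.
        # This search order is an arbitrary choice made for the sketch.
        for offset in range(MAX_NODE_ID + 1):
            candidate = (preferred + offset) % (MAX_NODE_ID + 1)
            if candidate not in taken:
                self._table[key] = candidate
                self._path.write_text(json.dumps(self._table, indent=2))
                return candidate
        raise RuntimeError("no free node-ID left")


with tempfile.TemporaryDirectory() as tmp:
    table_file = Path(tmp) / "allocation_table.json"
    table = AllocationTable(table_file)
    # Two motor nodes with different (made-up) unique-IDs both prefer node-ID 20.
    left = table.allocate(bytes(range(16)), preferred=20)
    right = table.allocate(bytes(range(16, 32)), preferred=20)
    # Simulate an allocator restart: a fresh object reads the saved file.
    left_after_restart = AllocationTable(table_file).allocate(bytes(range(16)), preferred=20)
```

Because the table is written to disk on every new entry, the same unique-ID maps to the same node-ID after the allocator restarts.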

Open Side-Quests:

A. Documentation asks for a 128bit unique ID. But that is max. length, so 96bit is fine? If not, just fill up the 96bit with zeros?
B. What time is recommended to wait for an allocator before falling back to dynamic node ID?

pavel.kirienko · January 23, 2025, 5:08pm

Any faithfully implemented decentralized publish-subscribe system (DCPS) should not make any assumptions, or assign any significance, to the identity of the actors publishing and consuming data. In practical terms, for Cyphal, it means that a node receiving data should not look at the node-ID of its publisher; a node publishing data should not have any expectations regarding who is listening and what is being done to that data.

To answer your question specifically: it doesn’t matter what the node-IDs of your nodes are. What matters is which subjects they publish their messages to. Your left-side and right-side nodes should use different subject-IDs for their publications.
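
As an illustration of that point, identical firmware can still publish on different subjects if the subject-ID is read from per-unit configuration. The sketch below is plain Python with no Cyphal library involved. The register name follows the `uavcan.pub.PORT_NAME.id` convention with 65535 meaning "unset" as I understand it; the port name `motor_status` and the subject-IDs 1234 and 1235 are made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

PORT_ID_UNSET = 65535  # assumed "port not configured" value


@dataclass
class MotorNode:
    """Stand-in for one STM-H-M unit; every unit runs exactly this code."""

    registers: Dict[str, int] = field(default_factory=dict)

    def publication_subject_id(self, port_name: str) -> Optional[int]:
        # The subject-ID comes from configuration, not from the firmware image.
        value = self.registers.get(f"uavcan.pub.{port_name}.id", PORT_ID_UNSET)
        return None if value == PORT_ID_UNSET else value


# Same software, different register values written by the integrator.
front_left = MotorNode({"uavcan.pub.motor_status.id": 1234})
front_right = MotorNode({"uavcan.pub.motor_status.id": 1235})
unconfigured = MotorNode()

left_subject = front_left.publication_subject_id("motor_status")
right_subject = front_right.publication_subject_id("motor_status")
unset_subject = unconfigured.publication_subject_id("motor_status")
```

A subscriber then only needs to know that 1234 is "front-left" and 1235 is "front-right"; the node-IDs never enter the picture.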

Next, this is not directly related to the problem at hand, but it is recommended to store dynamically allocated node-IDs into a nonvolatile memory, such that at the next bootup the node can commence operations immediately instead of waiting for the allocator to respond. Otherwise, if your node experiences a power surge or some other abnormality causing it to restart, its ability to return back to duty will be conditional on the availability and performance of the fundamentally nondeterministic PnP service.
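
A rough sketch of that boot logic, in plain Python with a JSON file standing in for the nonvolatile memory. The names `load_node_id`, `boot`, and `fake_allocator` are hypothetical, and on an STM32 the store would be flash or EEPROM rather than a file.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, List, Optional


def load_node_id(store: Path) -> Optional[int]:
    """Return the node-ID saved by a previous boot, or None if there is none."""
    if not store.exists():
        return None
    return int(json.loads(store.read_text())["node_id"])


def boot(store: Path, request_allocation: Callable[[], int]) -> int:
    node_id = load_node_id(store)
    if node_id is None:
        # Only a node without a stored ID has to wait for the PnP allocator.
        node_id = request_allocation()
        store.write_text(json.dumps({"node_id": node_id}))
    return node_id


allocation_requests: List[int] = []


def fake_allocator() -> int:
    # Stand-in for the real PnP exchange; records how often it was needed.
    allocation_requests.append(1)
    return 125


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "node_id.json"
    first_boot = boot(store, fake_allocator)   # asks the allocator
    second_boot = boot(store, fake_allocator)  # starts immediately from storage
```

In this sketch the allocator is contacted once; the second boot does not depend on it at all.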

Documentation asks for a 128bit unique ID. But that is max. length, so 96bit is fine? If not, just fill up the 96bit with zeros?

Yes.
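
Assuming the "Yes" covers zero-filling, extending a 96-bit hardware ID to the 16-byte unique-ID field (as used by uavcan.node.GetInfo) could be sketched like this. The helper name `pad_unique_id` and the example bytes are made up, and padding at the end rather than the front is an arbitrary choice.

```python
def pad_unique_id(uid: bytes, length: int = 16) -> bytes:
    """Right-pad a shorter hardware ID with zero bytes up to `length` bytes."""
    if len(uid) > length:
        raise ValueError("hardware ID is longer than the unique-ID field")
    return uid + bytes(length - len(uid))


stm32_uid = bytes(range(1, 13))  # made-up 96-bit (12-byte) hardware ID
padded = pad_unique_id(stm32_uid)
```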

What time is recommended to wait for an allocator before falling back to dynamic node ID?

This doesn’t make sense. You need the allocator to be available only if you’re trying to get a node-ID allocated. If you already have a node-ID, you don’t need to wait for anything.

Scoeerg · January 24, 2025, 8:35am

Hi,

Thanks for the response. It really clarifies things, but I’ve got some more work to do. I’ll share my progress here to help out any future newcomers like myself.

Scoeerg · January 29, 2025, 3:16pm

I made some progress, which I believe will work in my case. There is a “central” node monitoring the entire network (written in PyCyphal, but it should not matter). It requires nodes to have

7509 - heartbeat publishing (already a requirement for all Cyphal nodes anyway)
7510 - List publishing all publishers/subscribers/services/clients
430 - getInfo() service (already highly recommended to have implemented)

# scanner heard the first heartbeat Subject-ID 7509 of the Cyphal Node
INFO:root:Node 67 has appeared for the first time.
# scanner successfully asked the Node for a Service-ID 430 GetInfo
INFO:root:Node 67 has responded to getInfo service.
# scanner heard a List Subject-ID 7510 of the Cyphal Node which contains info about the node's 
# publishers, subscribers, clients and servers
INFO:root:Node 67 has unique-ID [129, 133, 164, 45, 37, 244, 253, 221, 251, 239, 157, 232, 95, 188, 101, 40]
INFO:root:Node 67 has a publisher with Subject-ID uavcan.node.port.SubjectID.1.0(value=7509)
INFO:root:Node 67 has a publisher with Subject-ID uavcan.node.port.SubjectID.1.0(value=7510)
INFO:root:Node 67 has a subscriber with Subject-ID uavcan.node.port.SubjectID.1.0(value=7509)
INFO:root:Node 67 has a server with Service-ID 384
INFO:root:Node 67 has a server with Service-ID 385
INFO:root:Node 67 has a server with Service-ID 430
INFO:root:Node 67 has a client with Service-ID 112
# scanner is done registering all the Subject/Service IDs
INFO:root:Node 67 has registered all Subject/Service IDs.
INFO:root:Node 67 has published its port list.
# scanner tracks the online-time of the node to be 2s332µs since it heard the nodes first heartbeat
INFO:root:Node 67 has been online for 0d/0h/0min/2s/332µs.
# scanner will print online-message, publishers, subscribers, etc. every couple seconds. 
INFO:root:Node 67 has subscribers for the following subjects: [uavcan.node.port.SubjectID.1.0(value=7509)]
INFO:root:Node 67 has publishers for the following subjects: [uavcan.node.port.SubjectID.1.0(value=7509), uavcan.node.port.SubjectID.1.0(value=7510)]
INFO:root:Node 67 has servers with service IDs: [384, 385, 430]
INFO:root:Node 67 has been online for 0d/0h/0min/7s/86µs.
INFO:root:Node 67 has subscribers for the following subjects: [uavcan.node.port.SubjectID.1.0(value=7509)]
INFO:root:Node 67 has publishers for the following subjects: [uavcan.node.port.SubjectID.1.0(value=7509), uavcan.node.port.SubjectID.1.0(value=7510)]
INFO:root:Node 67 has servers with service IDs: [384, 385, 430]
# scanner notices the node has gone offline (shutdown or disconnected from CAN-bus) since there are no heartbeats from it anymore
INFO:root:Node 67 has been offline for 0d/0h/0min/5s/111276µs.
# some time later ... scanner notices the node is back online with the same unique ID and node-ID
INFO:root:Node 67 has reappeared. 
INFO:root:Node 67 already registered all Subject/Service IDs.
INFO:root:Node 67 has published its port list.
# scanner tracks the node since it went back online
INFO:root:Node 67 has been online for 0d/0h/0min/2s/1725µs.

Notice that online-time is NOT the node’s idea of how long it’s been online, but the central node’s monitoring time of said node. Now anybody wondering who is publishing what where and why can just ask the scanner for said information, and everything can be mapped with unique-IDs.

e.g. I know unique-ID [123, …, 123] is my front-left motor, I can just map this to current dynamic node-ID, and start pub/sub/client/service based on the information provided by the central scanner. If you have no interest in having ONE central scanner, this could of course be implemented in every node.
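
The unique-ID to node-ID mapping kept by the scanner can be sketched as follows. This is plain Python, not my actual PyCyphal code: the class `UniqueIdMap`, its method names, and the example unique-ID are made up, timestamps are passed in explicitly to keep the example deterministic, and the 3-second offline timeout is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional

OFFLINE_TIMEOUT = 3.0  # assumed seconds of heartbeat silence before a node counts as offline


@dataclass
class NodeRecord:
    node_id: int
    last_heartbeat: float


class UniqueIdMap:
    def __init__(self) -> None:
        self._records: Dict[bytes, NodeRecord] = {}

    def on_get_info(self, unique_id: bytes, node_id: int, now: float) -> None:
        # Called once a node has answered GetInfo, so its unique-ID is known.
        self._records[unique_id] = NodeRecord(node_id, now)

    def on_heartbeat(self, node_id: int, now: float) -> None:
        for record in self._records.values():
            if record.node_id == node_id:
                record.last_heartbeat = now

    def node_id_of(self, unique_id: bytes, now: float) -> Optional[int]:
        """Current node-ID of the hardware, or None if unknown or offline."""
        record = self._records.get(unique_id)
        if record is None or now - record.last_heartbeat > OFFLINE_TIMEOUT:
            return None
        return record.node_id


FRONT_LEFT_UID = bytes(range(16))  # made-up unique-ID of the front-left motor

scanner = UniqueIdMap()
scanner.on_get_info(FRONT_LEFT_UID, node_id=67, now=0.0)
scanner.on_heartbeat(67, now=1.0)
while_online = scanner.node_id_of(FRONT_LEFT_UID, now=2.0)
while_offline = scanner.node_id_of(FRONT_LEFT_UID, now=10.0)
# The node comes back with a different dynamically allocated node-ID.
scanner.on_get_info(FRONT_LEFT_UID, node_id=42, now=11.0)
after_reallocation = scanner.node_id_of(FRONT_LEFT_UID, now=11.5)
```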

pavel.kirienko · January 29, 2025, 3:54pm

Consider abstracting your system away from the identities of the actors. Normally, it shouldn’t matter where the data is coming from and where it is going. That can enable much simpler and maintainable designs. Commonly, this is referred to as DCPS, as I mentioned earlier; Lars-Berno Fredriksson explains it as follows:

Source: https://forum.opencyphal.org/uploads/short-url/5lAC88jvBjQYCF2QdLNRBCMLhFC.pdf

I am worried that your usage of Cyphal may be unidiomatic and may lead to a poor design.

Scoeerg · February 26, 2025, 12:01pm

Alright. As promised. The problem persists, see the original motivation. Here’s another idea for a solution, which I believe will be idiomatic - mostly because it uses only Cyphal-Built-In services and the Plug-And-Play solution provided via PyCyphal.

The system integrator knows which hardware is built into the application and defines a config: the future node on hardware 3 will subscribe to some topic of known type published by hardware 2, which has unique ID B. But at what node ID and subject ID?!

Step 0: There exists an allocator in the network, which lives in Node 1 running on hardware 1 with 128bit unique ID A. Hardware 2 with unique ID B at start only has an anonymous node which only serves as a Node-ID allocatee. It requests a node ID from the allocator with its unique ID B and gets the response: 125.

Step 1: The anonymous node in hardware 2 stops and starts a non-anonymous node with ID 125. This node now has:

Heartbeat (mandatory)
430.GetInfo.1.0-Server
Some publisher with subject ID 1234

hardware 3 runs through the same process as hardware 2, yet receives node ID 124.

Step 2:

The anonymous node on hardware 3 stops and starts node 124. Both nodes 125 and 124 are running.

Step 3:

Node 124 (see above from system integrator config) still knows it needs to subscribe to the subject of hardware 2. So it asks the allocator: who is unique-ID B and receives response node ID 125.

Step 4:

Node 124 now asks node ID 125 via GetInfo: what publishers do you have (and especially what are the subject IDs?). Gets full response including subject ID 1234.

Step 5:

Node 124 subscribes to subject ID 1234.

I have this running in PyCyphal on a virtual CAN bus. Works splendidly.
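
Steps 0 through 5 boil down to two lookups, which can be sketched without any Cyphal library. Everything here is a stand-in: the `Allocator` class, its `lookup` method (the "who is unique-ID B" question from step 3), the `published` dictionary replacing the port query from step 4, and the label "C" for hardware 3's unique ID are assumptions for illustration. Counting down from 125 is chosen only to match the numbers in the example.

```python
from typing import Dict, List


class Allocator:
    """Toy allocator that hands out node-IDs counting down from 125."""

    def __init__(self, first_node_id: int = 125) -> None:
        self._next = first_node_id
        self._by_unique_id: Dict[str, int] = {}

    def allocate(self, unique_id: str) -> int:
        if unique_id not in self._by_unique_id:
            self._by_unique_id[unique_id] = self._next
            self._next -= 1
        return self._by_unique_id[unique_id]

    def lookup(self, unique_id: str) -> int:
        # Step 3: "who is unique-ID B?"
        return self._by_unique_id[unique_id]


allocator = Allocator()

# Steps 0 to 2: both boards obtain a node-ID and start their real nodes.
node_hw2 = allocator.allocate("B")
node_hw3 = allocator.allocate("C")

# Port information each node would report when asked (stand-in for step 4).
published: Dict[int, List[int]] = {node_hw2: [1234]}

# Step 3: resolve the configured unique-ID to the current node-ID.
publisher_node = allocator.lookup("B")
# Step 4: ask that node which subjects it publishes.
subject_id = published[publisher_node][0]
# Step 5: the node on hardware 3 subscribes to the discovered subject.
subscriptions: Dict[int, List[int]] = {node_hw3: [subject_id]}
```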

Discussion points:

Even for identically compiled and flashed software nodes on different hardware all publishing their identical message types into subject ID 1234, the subscriber can still tell them apart by knowing the publisher’s node ID via the logic described above. So the example of left/right motor is solved.
Network is “centralized” around the allocator, which is seemingly OK since it’s a built-in feature of Cyphal. Also, multiple allocators can exist in the same network. This raises the question: how are multiple allocators synchronized?
The Plug-And-Play allocator/allocatee scheme works fine in PyCyphal. For this to work in my specific case, I require anonymous nodes and an allocatee running on STM32, so libcanard. Any pointers on how to achieve this?
Per this forum.opencyphal.org discussion, the subject ID (1234 in my example) can be set by the integrator via the register API. The vendor simply makes it configurable with a default value.

Scoeerg · February 26, 2025, 6:00pm

Never mind. Reading the libcanard header, anonymous nodes seem to be a no-brainer since

#define CANARD_NODE_ID_MAX 127U
...
/// This value represents an undefined node-ID: broadcast destination or anonymous source.
/// Library functions treat all values above CANARD_NODE_ID_MAX as anonymous.
#define CANARD_NODE_ID_UNSET 255U

Also, this forum.opencyphal.org/t/automatic-configuration-of-port-identifiers/840/3 discussion was most helpful and closely related to my problem.

pavel.kirienko · March 14, 2025, 11:19am

As I covered above, this is a highly unorthodox solution. It’s good if it works for you, though.

An idiomatic solution would assign a dedicated subject-ID to each publisher and obviate the need to distinguish and know the node-ID, and especially unique-ID, of network participants.