Configuration of Node-ID and Port-ID | DHCP-esque static IP behavior in Cyphal

Hi,

tl;dr:

What is “the” solution for identifying message/node combinations consistently across power cycles? Below are my motivation, my current understanding, and a proposed solution that uses the allocatee-allocator exchange with dynamic node-ID allocation as a fallback.

Motivation:

I am building STM32-based compute modules that will drive sensors, actuators, monitors, etc. Each STM32 will be one Cyphal node in a Cyphal network that uses CAN as the transport layer, and each node will publish/subscribe to multiple messages and use multiple services.

I will flash the same software on any given STM32+Peripherals combination. Example:

STM32 + H-Bridge + Motor (STM-H-M)

Now, if I build a robot/drone/whatever that has two of those Cyphal nodes (front-left and front-right), how do I make sure that I can still tell front-left messages from front-right messages across power cycles? Remember: they run the same software, so I will not hardcode anything such as node-IDs.

My Understanding (Correct me if wrong, still learning):

Cyphal currently supports Plug-and-Play node-ID allocation, which will not be persistent across power cycles, as well as the allocatee-allocator exchange. An analogy, to check my understanding:

Allocator: DHCP-Server
Allocatee: DHCP-client
Allocatee-unique-ID: DHCP-client MAC-address

But using Allocatee-Allocator also means Cyphal will not be a decentralized network anymore, so there must be a fallback strategy.

My approach:

I would implement:

On Allocatee-Side:

  1. Start the STM32 in anonymous mode (no node-ID yet).
  2. Request a node-ID as allocatee, using the STM32's built-in 96-bit unique ID. The preferred node-ID would be identical on all STM32s with the same peripherals (e.g. motor nodes prefer ID 20, IMU nodes prefer ID 30, etc.).
  3. If no allocator has answered after some time (what time is recommended?), fall back to dynamic PnP allocation.

On Allocator-Side (e.g. a Linux-machine)

  1. Start Allocator
  2. Look for a lookup table of node-ID/unique-ID pairs; if it does not exist, create one
  3. Assign node-IDs based on the lookup table; if a unique-ID is not in the table yet, add it and save the table to file

Since the lookup-table will persist through powercycles, we have a solution.
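
The allocator-side steps above could be sketched roughly as follows. This is only an illustration in plain Python: `AllocationTable`, the JSON file layout, and the nearest-free-ID search are my own assumptions and not a PyCyphal API, and the actual PnP message exchange is left out entirely.

```python
import json
from pathlib import Path

MIN_NODE_ID = 0
MAX_NODE_ID = 127  # Cyphal/CAN node-IDs are 7 bits wide


class AllocationTable:
    """Persistent unique-ID -> node-ID table (hypothetical helper)."""

    def __init__(self, path):
        self.path = Path(path)
        # Step 2: load the table if the file exists, otherwise start empty.
        self.table = json.loads(self.path.read_text()) if self.path.exists() else {}

    def allocate(self, unique_id: bytes, preferred: int = MAX_NODE_ID) -> int:
        """Step 3: return the stored node-ID, or pick a free one and save it."""
        key = unique_id.hex()
        if key in self.table:
            return self.table[key]
        preferred = min(max(preferred, MIN_NODE_ID), MAX_NODE_ID)
        taken = set(self.table.values())
        # Assumed policy: try the preferred ID, then search downward, then upward.
        candidates = list(range(preferred, MIN_NODE_ID - 1, -1))
        candidates += list(range(preferred + 1, MAX_NODE_ID + 1))
        for node_id in candidates:
            if node_id not in taken:
                self.table[key] = node_id
                self.path.write_text(json.dumps(self.table, indent=2))
                return node_id
        raise RuntimeError("no free node-ID left")
```

With this policy, two motor nodes that both prefer ID 20 would end up with two different IDs, and each would get the same ID again after a power cycle of either the node or the allocator, as long as the file survives.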

Open Side-Quests:

A. Documentation asks for a 128bit unique ID. But that is max. length, so 96bit is fine? If not, just fill up the 96bit with zeros?
B. What time is recommended to wait for an allocator before falling back to dynamic node ID?

Any faithfully implemented decentralized publish-subscribe system (DCPS) should not make any assumptions about, or assign any significance to, the identity of the actors publishing and consuming data. In practical terms, for Cyphal, it means that a node receiving data should not look at the node-ID of its publisher; a node publishing data should not have any expectations regarding who is listening and what is being done to that data.

To answer your question specifically: it doesn’t matter what the node-IDs of your nodes are. What matters is which subjects they publish their messages to. Your left-side and right-side nodes should use different subject-IDs for their publications.
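
As a rough illustration of that idea, here is a minimal plain-Python sketch in which two units run identical firmware and differ only in their stored port configuration. The key names follow the `uavcan.pub.<port_name>.id` / `uavcan.sub.<port_name>.id` register naming convention; the port names `setpoint` and `feedback`, the subject-ID values, and the helper function are hypothetical.

```python
# Per-unit configuration written once at provisioning time (hypothetical values).
UNITS = {
    "front-left": {"uavcan.sub.setpoint.id": 1100, "uavcan.pub.feedback.id": 1101},
    "front-right": {"uavcan.sub.setpoint.id": 1200, "uavcan.pub.feedback.id": 1201},
}


def build_feedback_routing(units):
    """Map feedback subject-ID -> logical role; node-IDs are never consulted."""
    routing = {}
    for role, registers in units.items():
        subject_id = registers["uavcan.pub.feedback.id"]
        if subject_id in routing:
            raise ValueError(
                f"subject-ID {subject_id} is used by both {routing[subject_id]} and {role}"
            )
        routing[subject_id] = role
    return routing


# A consumer only needs to know which subject a message arrived on.
ROUTING = build_feedback_routing(UNITS)
```

A consumer that receives a message on subject 1101 knows it is front-left feedback regardless of which node-ID the publisher currently holds.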

Next, this is not directly related to the problem at hand, but it is recommended to store dynamically allocated node-IDs in nonvolatile memory, such that at the next bootup the node can commence operations immediately instead of waiting for the allocator to respond. Otherwise, if your node experiences a power surge or some other abnormality causing it to restart, its ability to return to duty will be conditional on the availability and performance of the fundamentally nondeterministic PnP service.
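
A minimal sketch of that boot sequence, assuming a small file stands in for the node's nonvolatile memory and that `request_allocation` is a hypothetical callable that performs the PnP exchange and returns the allocated node-ID:

```python
import json
from pathlib import Path


def load_stored_node_id(path):
    """Return the node-ID saved by a previous boot, or None if there is none."""
    try:
        value = json.loads(Path(path).read_text())["node_id"]
    except (FileNotFoundError, ValueError, KeyError, TypeError):
        return None  # missing or corrupted storage: treat as "never allocated"
    # 0..127 is the 7-bit node-ID range of Cyphal/CAN.
    return value if isinstance(value, int) and 0 <= value <= 127 else None


def bring_up(path, request_allocation):
    """Use the stored node-ID if present; otherwise allocate once and store it."""
    node_id = load_stored_node_id(path)
    if node_id is None:
        node_id = request_allocation()  # only this branch depends on the allocator
        Path(path).write_text(json.dumps({"node_id": node_id}))
    return node_id
```

Only the very first boot (or a boot after the storage has been erased) waits for the allocator; every later restart proceeds immediately.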

Documentation asks for a 128bit unique ID. But that is max. length, so 96bit is fine? If not, just fill up the 96bit with zeros?

Yes.
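
For illustration, the zero-padding variant could look like the sketch below. The function name is hypothetical; the only assumption about Cyphal is that the unique-ID field is 16 bytes long, which is what the 128-bit requirement amounts to.

```python
def to_cyphal_unique_id(stm32_uid: bytes) -> bytes:
    """Extend the 96-bit (12-byte) STM32 device ID to 128 bits with zero bytes."""
    if len(stm32_uid) != 12:
        raise ValueError("expected the 12-byte STM32 unique device ID")
    # The padded value stays unique as long as every node pads the same way.
    return stm32_uid + bytes(4)
```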

What time is recommended to wait for an allocator before falling back to dynamic node ID?

This doesn’t make sense. You need the allocator to be available only if you’re trying to get a node-ID allocated. If you already have a node-ID, you don’t need to wait for anything.

Hi,

Thanks for the response. It really clarifies things, but I’ve got some more work to do. I’ll share my progress here to help out any future newcomers like myself.

I made some progress that I believe will work in my case. There is a “central” node that monitors the entire network (written in PyCyphal, but that should not matter). It requires every node to provide:

  • 7509 - Heartbeat publication (already a requirement for all Cyphal nodes anyway)
  • 7510 - port List publication, listing all of the node's publishers/subscribers/clients/servers
  • 430 - GetInfo service (already highly recommended to implement)

Example output of the scanner (lines starting with # are my annotations):

# scanner heard the first heartbeat (Subject-ID 7509) of the Cyphal node
INFO:root:Node 67 has appeared for the first time.
# scanner successfully asked the Node for a Service-ID 430 GetInfo
INFO:root:Node 67 has responded to getInfo service.
# scanner heard a List Subject-ID 7510 of the Cyphal Node which contains info about the node's 
# publishers, subscribers, clients and servers
INFO:root:Node 67 has unique-ID [129, 133, 164, 45, 37, 244, 253, 221, 251, 239, 157, 232, 95, 188, 101, 40]
INFO:root:Node 67 has a publisher with Subject-ID uavcan.node.port.SubjectID.1.0(value=7509)
INFO:root:Node 67 has a publisher with Subject-ID uavcan.node.port.SubjectID.1.0(value=7510)
INFO:root:Node 67 has a subscriber with Subject-ID uavcan.node.port.SubjectID.1.0(value=7509)
INFO:root:Node 67 has a server with Service-ID 384
INFO:root:Node 67 has a server with Service-ID 385
INFO:root:Node 67 has a server with Service-ID 430
INFO:root:Node 67 has a client with Service-ID 112
# scanner is done registering all the Subject/Service IDs
INFO:root:Node 67 has registered all Subject/Service IDs.
INFO:root:Node 67 has published its port list.
# scanner tracks the online time of the node to be 2 s 332 µs since it heard the node's first heartbeat
INFO:root:Node 67 has been online for 0d/0h/0min/2s/332µs.
# scanner will print online-message, publishers, subscribers, etc. every couple seconds. 
INFO:root:Node 67 has subscribers for the following subjects: [uavcan.node.port.SubjectID.1.0(value=7509)]
INFO:root:Node 67 has publishers for the following subjects: [uavcan.node.port.SubjectID.1.0(value=7509), uavcan.node.port.SubjectID.1.0(value=7510)]
INFO:root:Node 67 has servers with service IDs: [384, 385, 430]
INFO:root:Node 67 has been online for 0d/0h/0min/7s/86µs.
INFO:root:Node 67 has subscribers for the following subjects: [uavcan.node.port.SubjectID.1.0(value=7509)]
INFO:root:Node 67 has publishers for the following subjects: [uavcan.node.port.SubjectID.1.0(value=7509), uavcan.node.port.SubjectID.1.0(value=7510)]
INFO:root:Node 67 has servers with service IDs: [384, 385, 430]
# scanner notices the node has gone offline (shutdown or disconnected from CAN-bus) since there are no heartbeats from it anymore
INFO:root:Node 67 has been offline for 0d/0h/0min/5s/111276µs.
# some time later ... scanner notices the node is back online with the same unique ID and node-ID
INFO:root:Node 67 has reappeared. 
INFO:root:Node 67 already registered all Subject/Service IDs.
INFO:root:Node 67 has published its port list.
# scanner tracks the node since it went back online
INFO:root:Node 67 has been online for 0d/0h/0min/2s/1725µs.

Note that the online time is NOT the node’s own idea of how long it has been online, but how long the central node has been observing it. Anybody wondering who is publishing what, and where, can now just ask the scanner for that information, and everything can be mapped via unique-IDs.
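
The online/offline bookkeeping described above can be sketched in plain Python as follows. This is not my actual scanner code: `NodeTracker` is a hypothetical helper, timestamps are passed in explicitly as seconds, and the 3-second timeout is an assumed value.

```python
class NodeTracker:
    """Tracks per-node online time as observed by the monitor (hypothetical helper)."""

    def __init__(self, offline_timeout=3.0):
        self.offline_timeout = offline_timeout  # seconds without heartbeat (assumed)
        self._online_since = {}  # node-ID -> first heartbeat of the current session
        self._last_seen = {}     # node-ID -> most recent heartbeat

    def is_online(self, node_id, now):
        last = self._last_seen.get(node_id)
        return last is not None and now - last <= self.offline_timeout

    def on_heartbeat(self, node_id, now):
        """Record a heartbeat; return 'appeared', 'reappeared', or None."""
        known = node_id in self._last_seen
        was_online = self.is_online(node_id, now)
        self._last_seen[node_id] = now
        if was_online:
            return None
        self._online_since[node_id] = now  # a new session starts here
        return "reappeared" if known else "appeared"

    def online_time(self, node_id, now):
        """Seconds since the monitor first heard this session, or None if offline."""
        if not self.is_online(node_id, now):
            return None
        return now - self._online_since[node_id]
```

Because the session start is the time of the first heartbeat the monitor heard, the reported value restarts whenever the node reappears, independently of the node's own uptime counter.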

For example, if I know that unique-ID [123, …, 123] is my front-left motor, I can map it to its current dynamic node-ID and set up publishers/subscribers/clients/servers based on the information provided by the central scanner. If you have no interest in having ONE central scanner, this could of course be implemented in every node.
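
A minimal sketch of that mapping, in plain Python and independent of PyCyphal: `UniqueIdDirectory` and its method names are hypothetical, and the unique-ID is assumed to come from the node's GetInfo response.

```python
class UniqueIdDirectory:
    """Maps known unique-IDs to logical roles and their current node-IDs."""

    def __init__(self, roles):
        # roles: unique-ID (bytes or list of ints) -> logical role name
        self._role_by_uid = {bytes(uid): role for uid, role in roles.items()}
        self._node_by_role = {}

    def on_get_info(self, node_id, unique_id):
        """Record which node-ID a unique-ID currently uses; return its role or None."""
        role = self._role_by_uid.get(bytes(unique_id))
        if role is None:
            return None  # unknown hardware: nothing to map
        # A node-ID may have been reassigned; drop any stale role pointing at it.
        for other, current in list(self._node_by_role.items()):
            if current == node_id and other != role:
                del self._node_by_role[other]
        self._node_by_role[role] = node_id
        return role

    def node_id_of(self, role):
        """Current node-ID of a role, or None if that unit has not been seen yet."""
        return self._node_by_role.get(role)
```

The directory is refreshed on every GetInfo response, so a unit that comes back with a different dynamic node-ID is picked up under the same role.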

Consider abstracting your system away from the identities of the actors. Normally, it shouldn’t matter where the data is coming from and where it is going. That can enable much simpler and more maintainable designs. Commonly, this is referred to as DCPS, as I mentioned earlier; Lars-Berno Fredriksson explains it as follows:

Source: https://forum.opencyphal.org/uploads/short-url/5lAC88jvBjQYCF2QdLNRBCMLhFC.pdf

I am worried that your usage of Cyphal may be unidiomatic and may lead to a poor design.