[Cyphal/UDP] Architectural issues caused by the dependency between the node's IP address and its identity

pavel.kirienko · October 2, 2022, 2:57pm

At the last dev call, I mentioned that I wanted to discuss an architectural deficiency in the current working draft of the Cyphal/UDP transport specification that requires attention. The deficiency is not a major one and is not likely to jeopardize the utility of the transport at large, but it may lead to unnecessary friction in its deployment and long-term maintenance. I am not the first to call attention to this property of the protocol; the first one to question it was someone from the Amazon team (sorry, can’t recall the exact person).

This post provides an exposition of the problem, suggests a possible solution, and lists its advantages and known disadvantages at the end. I suggest implementing this proposal on a separate branch of PyCyphal for evaluation and testing purposes.

CC @scottdixon @ASMik @schoberm @lydiagh

Problem

The current draft of the Cyphal/UDP transport reifies the Cyphal node-ID through the IP address of the node as follows:

xxxxxxxx.xddddddd.nnnnnnnn.nnnnnnnn
\________/\_____/ \_______________/
 (9 bits) (7 bits)     (16 bits)
  prefix  subnet-ID     node-ID

The prefix is fixed, and the subnet-ID bears little relevance, so assume it is also fixed. The node-ID replaces the two least significant octets of the address.
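
For illustration, the mapping above can be sketched in Python. This is a minimal sketch rather than part of the draft: the function names are made up here, and since the concrete value of the 9-bit prefix is not given in this post, it is passed in as a parameter.

```python
import ipaddress


def node_ip(prefix9: int, subnet_id: int, node_id: int) -> ipaddress.IPv4Address:
    """Compose a node's unicast IPv4 address from the layout shown above.

    prefix9 is the fixed 9-bit prefix; its real value is not given in this
    post, so the caller has to supply it (an assumption of this sketch).
    """
    if not 0 <= prefix9 < 2**9:
        raise ValueError("prefix must fit in 9 bits")
    if not 0 <= subnet_id < 2**7:
        raise ValueError("subnet-ID must fit in 7 bits")
    if not 0 <= node_id < 2**16:
        raise ValueError("node-ID must fit in 16 bits")
    return ipaddress.IPv4Address((prefix9 << 23) | (subnet_id << 16) | node_id)


def node_id_from_ip(address: str) -> int:
    """Recover the node-ID: the two least significant octets of the address."""
    return int(ipaddress.IPv4Address(address)) & 0xFFFF
```

With an arbitrarily chosen prefix of 20 (which lands in 10.0.0.0/9), node_ip(20, 5, 1234) gives 10.5.4.210, and node_id_from_ip("10.5.4.210") gives 1234 back.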

One can determine the origin of a given Cyphal transfer (whether message or service) by evaluating the source IP address. The destination address depends on the kind of transfer: message transfers are directed towards their multicast group address (the rules of its composition are not covered here), while service transfers are directed towards the IP address of the recipient (composed as described above).

This approach was chosen as it allows the system to be decentralized above OSI layer 3 (no need to maintain a central agent responsible for brokering the connections between nodes) and, at the same time, allows a node to commence operation immediately without the need to discover the network configuration dynamically. These requirements are based on the core design principles of Cyphal and, as such, are not questioned.

The specification focuses on IPv4. Extending it to cover IPv6 is trivial, but the benefits of such an extension remain unclear.

While the approach is simple and functional, as indicated by limited testing in lab conditions, it suffers from several issues that appear to be cheap to address. They are reviewed below.

1. Leaky abstraction

The protocol stack is layered as follows:

L5…7	Cyphal/UDP
L4	UDP
L3	IP

The Cyphal node-ID is a property of the top-layer protocol, but it is manifested at L3 directly. This leads to practical complications (discussed below) and has the potential to make the evolution of the protocol difficult in the longer term.

2. Difficulty supporting multiple nodes per local NIC

This problem does not affect simple, deeply embedded devices that run one logical Cyphal node per hardware unit, but it is significant for higher-level devices that may run several nodes concurrently on the same machine, which may or may not require connectivity to remote nodes over the network. A Cyphal node in this case may be hosted by a separate application running in a higher-level OS; multiple nodes per application are also possible.

Generally, the configuration of the network connectivity of a higher-level OS should not affect the application software executed in it. Say, a ROS node launched on one computer can communicate with its peers regardless of how the network is configured and whether the peers are located on the same machine or are remote. The erasure of the distinction between local and remote resources is a critical architectural feature of networked OS that should be respected to maximize the utility of the protocol. In the current draft, however, Cyphal/UDP is incompatible with these principles, as the application-level parameters of a node (specifically its ID) derive from the low-level aspects of the network configuration (the IP address). Further, to run more than one node on the same machine, one would have to apply an uncommon configuration of the OS’s networking stack to enable multiple IP addresses per NIC.

Ideally, it should be possible to run an arbitrary software node on any networked OS without the need to alter or query the network configuration at all.

3. Minor inner inconsistency

The current design relies on IP multicast for subjects but on IP unicast for services. This seems logical at first, but considering that IP multicast is effectively a distinct protocol (very different from IP unicast), one might argue that the Cyphal/UDP transport could be simplified by focusing on IP multicast exclusively.

4. Boundary node-IDs are not usable

From the node-ID to IP mapping above, one might see that node-ID values of 0 and 4095 (possibly others that are one less than a power of 2) may be unusable depending on the network mask setting.
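
To see why, here is a small sketch using the standard Python ipaddress module. The 10.5.x.x addresses and the helper name are illustrative assumptions. It checks whether the address derived from a node-ID coincides with the network or broadcast address of its IP subnet, neither of which can be assigned to a host.

```python
import ipaddress


def collides_with_reserved(address: str, prefix_len: int) -> bool:
    """True if the address is the network or broadcast address of its subnet."""
    iface = ipaddress.IPv4Interface(f"{address}/{prefix_len}")
    net = iface.network
    return iface.ip in (net.network_address, net.broadcast_address)


# Node-ID 4095 occupies the two low octets as 15.255; node-ID 0 as 0.0.
print(collides_with_reserved("10.5.15.255", 20))  # node-ID 4095, /20 mask
print(collides_with_reserved("10.5.15.255", 16))  # node-ID 4095, /16 mask
print(collides_with_reserved("10.5.0.0", 16))     # node-ID 0, /16 mask
```

Under a /20 mask, node-ID 4095 lands on the subnet's broadcast address; under a /16 mask it does not. Node-ID 0 lands on the network address in both cases.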

5. ARP is required

Either a static or dynamic ARP table is required for sending IP unicast datagrams.

Solution

In general terms, the solution is to break the hard link between the Cyphal node-ID and the IP address of a node. A conventional solution would be to rely on a central broker that manages the routing and keeps the mapping between IP addresses and node identities (see ROS1). In the case of Cyphal, this is incompatible with the core requirements, but it is possible to rely on IP multicast to attain virtually the same result.

In the existing draft, a service transfer is performed by sending a unicast IP datagram to the IP address computed as shown earlier. This proposal changes that so that service transfers are also multicast transfers, where the destination address is computed as follows:

    fixed          service
   (9 bits)  res.  selector
   ________      ||
  /        \     vv
  11101111.0ddddd01.nnnnnnnn.nnnnnnnn
  \__/      \___/   \_______________/
(4 bits)   (5 bits)     (16 bits)
  IPv4     subnet-ID     node-ID
multicast   \_______________________/
 prefix             (23 bits)
            collision-free multicast
               addressing limit of
              Ethernet MAC for IPv4

The subject multicast group address is modified as follows (the two least significant bits of the subnet-ID are replaced with the message selector bit and one reserved zero bit):

    fixed   message  reserved
   (9 bits) select.  (3 bits)
   ________   res.|  _
  /        \     vv / \
  11101111.0ddddd00.000sssss.ssssssss
  \__/      \___/      \____________/
(4 bits)   (5 bits)       (13 bits)
  IPv4     subnet-ID      subject-ID
multicast   \_______________________/
 prefix             (23 bits)
            collision-free multicast
               addressing limit of
              Ethernet MAC for IPv4
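
The two layouts above can be sketched in Python as follows. This is a minimal sketch, not a reference implementation; the function names are assumptions of the sketch. The last helper applies the standard IPv4-multicast-to-Ethernet mapping (01:00:5e followed by the low 23 bits of the group address), which is why only the low 23 bits are used for addressing.

```python
import ipaddress

FIXED_PREFIX = 0b111011110 << 23  # the 9 fixed bits (239.x.x.x, next bit zero)
SERVICE_SELECTOR = 1 << 16        # selector bit: 1 for services, 0 for subjects


def _base(subnet_id: int) -> int:
    if not 0 <= subnet_id < 2**5:
        raise ValueError("subnet-ID must fit in 5 bits")
    return FIXED_PREFIX | (subnet_id << 18)


def service_group(subnet_id: int, node_id: int) -> ipaddress.IPv4Address:
    """Multicast group that carries service transfers addressed to node_id."""
    if not 0 <= node_id < 2**16:
        raise ValueError("node-ID must fit in 16 bits")
    return ipaddress.IPv4Address(_base(subnet_id) | SERVICE_SELECTOR | node_id)


def subject_group(subnet_id: int, subject_id: int) -> ipaddress.IPv4Address:
    """Multicast group that carries messages published on subject_id."""
    if not 0 <= subject_id < 2**13:
        raise ValueError("subject-ID must fit in 13 bits")
    return ipaddress.IPv4Address(_base(subnet_id) | subject_id)


def ethernet_mac(group: ipaddress.IPv4Address) -> str:
    """Ethernet multicast MAC: 01:00:5e plus the low 23 bits of the address."""
    low23 = int(group) & 0x7FFFFF
    return "01:00:5e:%02x:%02x:%02x" % (low23 >> 16, (low23 >> 8) & 0xFF, low23 & 0xFF)
```

For example, service_group(3, 0x1234) yields 239.13.18.52 (MAC 01:00:5e:0d:12:34), and subject_group(3, 7509) yields 239.12.29.85.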

Since the unicast address of a node is no longer connected to its Cyphal identity, the Cyphal node-ID of the origin of a given transfer needs to be communicated using some other means. It is therefore proposed to modify the Cyphal/UDP header as follows:

-uint8 version           # =0 in this revision; ignore frame otherwise.
+uint8 version           # =1 in this revision; ignore frame otherwise.
 uint8 priority          # Like in CAN: 0 -- highest priority, 7 -- lowest priority.
-void16                  # Set to zero when transmitting, ignore when receiving.
+uint16 source_node_id   # Cyphal node-ID of the origin.
 uint32 frame_index_eot  # MSB is set if the current frame is the last frame of the transfer.
 uint64 transfer_id      # The transfer-ID never overflows.
 void64                  # This space may be used later for runtime type identification.

The existing draft does not support anonymous Cyphal/UDP transfers for two reasons: first, that would require anonymous IP transfers, which are not supported by the Internet Protocol; second, the problem of automatic address assignment is already covered by DHCP. This reasoning no longer applies to the updated architecture. Therefore, it may be desirable to introduce support for anonymous transfers with a trivial change: reserving the maximum source node-ID value of 65535 to indicate that the source has no valid node-ID. This is also in line with the Cyphal/serial (aka Cyphal/TCP) specification draft. An alternative would be to compact the version+priority fields to provide a few free bits for a new flags field, where one of the flags would indicate anonymity.
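
As a rough illustration of the modified header together with the anonymous sentinel, here is a Python sketch built on the standard struct module. The little-endian byte order, the names, and the choice to represent an anonymous source as None are assumptions of this sketch and are not stated in the post.

```python
import struct
from typing import NamedTuple, Optional

# version, priority, source_node_id, frame_index_eot, transfer_id, void64
HEADER = struct.Struct("<BBHIQ8x")  # 24 bytes; little-endian is an assumption
VERSION = 1
ANONYMOUS = 0xFFFF  # proposed sentinel: the source has no valid node-ID
EOT_FLAG = 1 << 31  # MSB of frame_index_eot marks the last frame


class Header(NamedTuple):
    priority: int
    source_node_id: Optional[int]  # None stands for an anonymous source
    frame_index: int
    end_of_transfer: bool
    transfer_id: int


def pack_header(h: Header) -> bytes:
    node = ANONYMOUS if h.source_node_id is None else h.source_node_id
    index_eot = h.frame_index | (EOT_FLAG if h.end_of_transfer else 0)
    return HEADER.pack(VERSION, h.priority, node, index_eot, h.transfer_id)


def unpack_header(frame: bytes) -> Optional[Header]:
    """Return the parsed header, or None if the frame should be ignored."""
    if len(frame) < HEADER.size:
        return None
    version, priority, node, index_eot, transfer_id = HEADER.unpack_from(frame)
    if version != VERSION:
        return None  # "ignore frame otherwise"
    return Header(
        priority,
        None if node == ANONYMOUS else node,
        index_eot & (EOT_FLAG - 1),
        bool(index_eot & EOT_FLAG),
        transfer_id,
    )
```

A receiver would take the origin from source_node_id in this header instead of from the source IP address of the datagram.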

The UDP port number assignment is not altered by this proposal.

Advantages

Cleaner architecture. The link between the IP address of a node and its Cyphal identity is completely eliminated.
Compatibility with software nodes. One can execute an arbitrary software node on any system without the need to alter or inspect the configuration of its networking stack. This brings the Cyphal experience on par with high-level pub/sub frameworks.
The resulting solution relies solely on IP multicast for all types of communication. Neither IP unicast nor broadcast is used; ARP is therefore also unnecessary.
The entire range of Cyphal node-ID is usable as IP-related restrictions are no longer relevant.
Implementations are expected to be simplified due to the removal of IP unicast communication (only one mode of communication is left).

Disadvantages

The network routers will have to manage a larger number of IP multicast groups, equal to the number of subjects plus the number of nodes on the network.
IGMP level 2 implementation will be required for all nodes. Previously, IGMP level 2 implementation was only needed for nodes that subscribe to at least one subject.
The number of Cyphal subnets (domains) is reduced from 128 to 32.
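
The IGMP requirement listed above follows from the fact that every node has to join at least one multicast group (the one carrying service transfers addressed to it) in order to receive anything. On a POSIX-like OS, this is typically done through the socket API, and the OS then emits the membership reports. A minimal sketch with the standard Python socket module; the group address 239.1.0.42 and the helper names are illustrative assumptions:

```python
import socket


def membership_request(group: str, local_iface: str = "0.0.0.0") -> bytes:
    """Build the ip_mreq structure: group address, then local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(local_iface)


def join_group(sock: socket.socket, group: str, local_iface: str = "0.0.0.0") -> None:
    """Ask the OS to join the group on the given interface (0.0.0.0 = default)."""
    sock.setsockopt(
        socket.IPPROTO_IP,
        socket.IP_ADD_MEMBERSHIP,
        membership_request(group, local_iface),
    )


# A node would call, e.g., join_group(sock, "239.1.0.42") once for its own
# service group and once per subject it subscribes to.
```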

maksim.drachov · October 3, 2022, 4:43pm

I will try to implement the proposed changes on this branch.

After reading pycyphal.transport.udp docs, I have a small question.

There’s this table:

Supported transfers	Unicast	Broadcast
Message	No	Yes
Service	Yes	Banned by Spec.

“Broadcast” should have been “Multicast”, right?

pavel.kirienko · October 3, 2022, 4:59pm

Yes. It’s Cyphal parlance for multicast. Maybe we should change the wording to make this less confusing.

schoberm · October 4, 2022, 8:54pm

This is a big win and should simplify the implementations and improve the readability of the specification!

scottdixon · October 5, 2022, 9:18pm

This is an acceptable proposal worthy of a prototype.

Sergey_Andreev · October 21, 2022, 4:39pm

The proposal looks unnecessarily overcomplicated. Why not use the standard approach of dynamic ports for multi-node configurations on the same IP? Can we just make Cyphal/UDP a multi-port service by design? You can define one fixed UDP port and a range of dynamic ports for testing/development purposes, and it will be easy to implement yet robust and reliable.

We can even make it an optional production feature as follows:

Single node per IP - no changes needed, just use the fixed known UDP port.
Multiple nodes on the same IP: a brokerage service has to start first on the fixed UDP port and then manage the allocation/deallocation of dynamic UDP ports for regular Cyphal nodes on the same IP.

pavel.kirienko · October 21, 2022, 5:08pm

The proposal actually simplifies the current draft, as I attempted to illustrate, since it reduces the design from two modes of communication down to one. The introduction of the local brokerage service would add further complexity to the original design with unclear benefits. Could you elaborate on what advantages you see in the multiport design based on the local brokerage service?

Sergey_Andreev · October 21, 2022, 5:58pm

Main reasons:

There are already two use cases: a single node on a single IP, and multiple nodes on a single IP. The former is for actual operation of the system in production; the latter is more for testing/development/simulation. It is important to stay focused and not mix additional/optional features (i.e. nice-to-haves) into the main workflow. That’s why I would suggest clearly separating the two use cases from the beginning. The proposed dynamic ports concept will be fully optional and not a dependency for single-node operation. I.e. Cyphal/UDP development and production usage for nodes will not be affected by that additional feature at all: not in quality, not in schedules, not in spec requirements, etc.; everything stays as is. Multi-node support will be flexible and replaceable, i.e. we can implement various brokerage APIs and port allocation schemes when (and if) needed. In the simplest case, it can be just a hardcoded port number for each service on the multi-node computer, with no need for a brokerage service at all. More advanced implementations will need nodes to obtain a port number dynamically (from a file, shared memory, some API, maybe a networked call, etc.), but the concept still stays extremely simple and natural.
Not reinventing the wheel. Dynamic ports and multi-port services are quite standard ways to do this. Moreover, the concept of ports was created exactly to support multiple services on a single IP. Why on earth should we avoid this in Cyphal?
Not overcomplicating the Cyphal spec, and removing very custom dependencies and prerequisites on dynamic multicast pre-configuration. It looks really fragile to rely on such advanced configuration just to let a basic system work. That multicast configuration concept is really complicated by design, and who knows what it would be like in the implementation and in real life?
Again, simplicity: keep the main production workflow simple, straightforward, and not dependent on the quite artificial use case of testing/dev/simulation with multiple nodes…
Extensibility. Hardcoding the initial concept freezes the spec and the low-level implementation, but what if we decide to extend, change, or even remove that multi-node feature? The previous version of Cyphal will be broken, and it will again be a big-bang backwards-incompatible change in v.2 of Cyphal. But if we decide on just a very generic concept of dynamic ports, then it becomes compatible with a vast variety of implementations (from no-op, to hardcoded port IDs, to port IDs read from a file, to an API call, etc.). And the changes won’t be breaking; different implementations can even coexist in the same system on different computers!

pavel.kirienko · October 21, 2022, 7:40pm

I am not convinced so far, but it could be that I am missing something.

Per my proposal, multi-tenant nodes are intended as a key capability for production use rather than as a development-only extension, as there are valid use cases that require collaboration between low-level (nearly) baremetal nodes and their higher-level counterparts running on a higher-level POSIX OS.

Both the original draft and my new proposal require multicasting for pub/sub. Are you proposing to use some other transport for pub/sub that is not IP multicast? Unless you do, the dependency on multicasting is going to stay regardless of the design chosen.

Sergey_Andreev · October 21, 2022, 9:09pm

Can you please clarify that use case of an enormous “Uber” single computer with numerous virtual nodes in it, requiring several separate multicast groups talking to different publishers (or subscribers) over Ethernet outside and inside the computer, all on just a single IP address? I don’t see any real production use case for this example; it looks truly artificial to me. Why is there such a deficit of IPs and Ethernet cards for such a huge (and I believe already very expensive) computer/server? If it were a really monstrous super compute node in the system, then installing multiple network cards to support real networking needs looks like the most appropriate solution. Or else, run software routing and make a virtual subnet (private network) inside that machine. But again, this is very low-level networking already available in Ethernet standards, so why should we reinvent a wheel here and go to the low level modifying how standard IP multicast works?

Yes, this huge system will most likely be needed for simulation and testing, i.e. placing the entire system with all the nodes on a single computer and simulating communications between them. But all the real h/w nodes look like fairly modular devices, with functionality decoupled per device. So, one IP per h/w node looks very reasonable and is the most common production use case. We can easily create a workaround for testing/simulation: software routing, a virtual private network, or just multiple Ethernet cards physically connected together.

My proposal/example with different port numbers on a single IP was for rather realistic use cases when a few nodes are within the same multicast (or broadcast, unicast) group on the same IP. For example, a multi-motor control board, some aggregated battery management system, etc. Having simple small groups of the same devices is much better for reliability, redundancy, and reducing complexity than making spaghetti-like big centralized systems with huge functionality and numerous tricky, complicated communication paths. Such big systems just do not scale and require huge development and maintenance effort. They are almost impossible to extend, improve, and evolve in the future. You can see a lot of such examples in web apps and enterprise systems: everybody prefers to go to microservices, simple small IoT devices, decentralized systems, simple RESTful interfaces, NoSQL databases without huge central storage, and other lightweight protocols.

So, my proposal is simple: let’s follow standards and decades of Ethernet industry knowledge; that community has already verified a lot of concepts. If it looks like we want something too complicated and breaking existing low-level Ethernet standards, then most likely we are wrong, not the entire Ethernet community and the experts who have already addressed the vast majority of use cases. Probably, we should Invent and Simplify here? I would highly recommend limiting the scope of Cyphal and starting with something small but robust, reliable, standard, and extensible by design from the beginning.

pavel.kirienko · October 22, 2022, 7:15pm

The case of multi-tenant nodes is based on the core design principles (the relevant part is “Cyphal targets a wide variety of embedded systems, from high-performance on-board computers to extremely resource-constrained microcontrollers”). The design principles are not going to be reviewed, hence, this requirement is going to stay. The existing draft addresses the multi-tenant case poorly, therefore, it is inferior compared to the new proposal; the same holds for the alternative solution based on the local brokerage service. The problem is not that the local IP addresses or NICs are scarce resources; it’s that they complicate the node configuration process, and this extra complexity is avoidable as illustrated in my proposal.

I don’t understand a few things that you are saying. Could you please elaborate on these so that we are on the same page:

“why should we reinvent a wheel here and go to the low level modifying how standard IP multicast work?
<…>
If it looks we want something too complicated and breaking existing low-level Ethernet standards”

I don’t understand what you are referring to. My proposal is based entirely on the existing standards and does not call for modification of the standard IP multicast mechanism (or any other standard). It is implementable using existing off-the-shelf technologies and standard APIs without the need for customization (unlike, say, AFDX).

“My proposal/example with different port numbers on a single IP was for rather realistic use cases when a few nodes are within the same multicast (or broadcast, unicast) group on the same IP”

What are unicast groups?

We have node A and node B, each running on a single-tenant hardware unit. Node A publishes a message on subject X; node B is a subscriber and hence should receive the message. Please explain in detail how the said message is to be transferred per your proposal. Same for the case if A and B share the same multi-tenant hardware unit.

“let’s follow standards and decades of Ethernet industry knowledge”

We are on the same page here.

Sergey_Andreev · October 24, 2022, 3:55pm

Can we make a data-driven decision with a proof from real products/technologies and based on real needs from the industry?
It would be great to collect feedback from several experts from various industries and consider pros/cons data points.

If you are saying that the proposed hack with hidden bits in multicast group addresses and the complete elimination of unicast and broadcast communications are very standard, please provide proof: working products, systems, or specifications of standards that use such an approach. No such evidence has been provided so far. My feedback - it is overcomplicated, on top of the already complex concept of multicast groups. The proposed communication scheme is very non-standard and thus potentially a source of bugs, human errors, and misconfiguration.
If the use case of a highly powerful embedded device with numerous unrelated nodes in it but with one and only one IP address is advertised as a must, again, please provide proof that such a product exists. From the discussion above, it is clear that it is difficult even to imagine or artificially construct such a device. My feedback - such a use case does not exist in production and is not needed for my industry (moreover, it smells of bad embedded design: a huge single embedded device with multiple unrelated responsibilities and a single point of failure).
For your concerns on testing and lab research - can you please provide proof/evidence that the problem cannot be solved with multiple network cards?
My feedback - there are cheap multiport Ethernet cards that easily address the need to simulate multiple embedded devices on a single server:
Amazon.com - 2 ports
https://www.amazon.com/dp/B01HH6WETO - 4 ports
And can you provide any data points showing that the already existing standard approach of using multiple/dynamic ports on UDP did not work? My feedback - the concept of ports on a single IP has been specifically designed and proven over decades for exactly this situation of simultaneous communications of different sorts on a single IP address. As the inventor of a new/custom idea, you need to provide data points showing that existing standards do not work. I am strongly against reinventing the wheel and in favor of relying on already available, reliable, and proven approaches. My data points collected so far on standard multi-port networking services for multiple nodes on a single IP address:

Simple hardcoded port numbers can work just fine
File based configuration of different ports is more advanced
Reading dynamically changed config files with port numbers - even more powerful
A brokerage API - covers all the needs and beyond

pavel.kirienko · October 24, 2022, 4:22pm

I have to restate a few things that might have been lost in this discussion:

The core design goals are not going to be reviewed. Any discussion around them is off-topic.
The scarcity of local NICs or IP addresses is a non-issue. The issue is the complexity of commissioning a node due to the coupling between the local node-ID and its IP address. This is covered in my proposal.

I understand that you dislike IP multicasting, but I don’t understand what alternative you are proposing. Could you please answer the questions I asked in the previous post so we can move the discussion forward?

I do not share the perception that IP multicasting as technology is inherently complex. The underlying principles are simple and well-understood. It is also unclear how one could avoid the reliance on multicasting without a centralized broker or a dedicated peer discovery mechanism (in the style of RTPS perhaps).
Off-topic.
The problem can be solved with multiple NICs or multiple local IPs per NIC, but this is beside the point. See above.
I don’t understand what you are proposing here. Please provide a detailed description of your intended design as I requested in my previous post so that we can discuss it.

Sergey_Andreev · October 24, 2022, 6:19pm

Again, opinions, data points, and proof from different industry experts are still needed on this topic. Unfortunately, only a subjective opinion (and now a sort of executive order) has been provided in support of this spec change request so far… Meanwhile, real/strong data points, industry feedback, and proof from existing network standards are not heard but simply rejected without any reason and without any opposing evidence/proof…
Is there any data, proof, evidence, standard, etc. as grounds for this change request, apart from saying that it is the right thing to do because it is a correct approach?

Answering your questions that I might have missed above:

“What are unicast groups?”
A 1-to-1 group of two nodes talking directly to each other. It would be too bad if we did not support that basic use case in Cyphal at all…

Regarding my proposed alternative designs:
Option 1. Do nothing in s/w.
The problem does not exist; we can just add several h/w Ethernet NIC ports to that “uber” embedded device as needed.

Option 2. Use the concept of dynamically allocated multiple ports on a single IP.
Instead of hiding child node info in multicast group addresses on the same single IP, we can explicitly group nodes by different port numbers if they overlap on the same IP.
I.e. if there are nodes A, B, and C on the same IP address and they belong to different pub/sub groups, then assign specific port numbers to them, say port 10001 to node A, 10002 → node B, 10003 → node C. External traffic from other IP addresses to nodes A, B, C will be routed by the local Ethernet driver according to the port number. Such an assignment is crystal clear and is a standard way to separate different unrelated consumers of network traffic on the same IP. There are a lot of options for configuring port numbers per node (hardcoding, static config file, dynamic config file, shared memory, API call to a local broker service, network call to a remote broker service, etc.).
In the case when communication between nodes is needed locally (i.e. pubs and subs are on the same IP inside that “uber” device), use the localhost IP address (i.e. 127.0.0.1). That’s the standard concept of loopback networking.
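
A minimal sketch of this port-based separation on the loopback address, using the standard Python socket module. The helper names are assumptions, and the sketch lets the OS pick free ports (port 0) instead of the fixed 10001/10002 from the example so that it can run anywhere. It shows only unicast delivery between processes on one host and does not cover multicast subjects.

```python
import socket


def open_node_socket(port: int = 0) -> socket.socket:
    """Bind a UDP socket for one node on the loopback address.

    Port 0 asks the OS for any free port; a real setup would use the
    configured per-node port instead.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", port))
    sock.settimeout(2.0)
    return sock


def demo() -> tuple:
    """Send one datagram to each of two nodes that share the same IP."""
    node_a = open_node_socket()
    node_b = open_node_socket()
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sender.sendto(b"for A", node_a.getsockname())
        sender.sendto(b"for B", node_b.getsockname())
        # Each socket receives only the datagram sent to its own port.
        return node_a.recvfrom(64)[0], node_b.recvfrom(64)[0]
    finally:
        for sock in (node_a, node_b, sender):
            sock.close()
```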

pavel.kirienko · October 24, 2022, 7:09pm

The rationale is given in my proposal. I did not mean to imply that the proposed approach is the only correct way; if my write-up reads this way, I apologize as it is not intentional. I am open to considering competing proposals, but so far there have been none. I asked you repeatedly to answer my specific questions so that the discussion could progress. Allow me to be redundant and restate the question once more:

Multicast IP packets are transferred point-to-point (switching notwithstanding) as long as the source and destination share the same L2 domain, so in this sense, multicast is equivalent to unicast.

The concept of dynamic port assignment is trivial. What is not trivial is how one could avoid IP multicasting without centralized brokering or active network discovery. Please see my question above.

Sergey_Andreev · October 24, 2022, 9:10pm

I probably did not mention that in your example with two nodes on different IPs, no changes would be needed. Just use multicast for messages and unicast for service requests, as in the current Cyphal spec.
For the case of two nodes on the same IP - my feedback is as follows.

This use case is not based on data, i.e. there are no real requests from the industry to support it. Just do not do it; provide a separate IP address for each node instead.
If the need arises in the future, then that unusual use case can be supported by standard features, i.e. a static definition of the network and a few added ports for the unicast and multicast UDP requests. Different port numbers are a standard discriminator for network traffic routing.
If someone says we need auto-discovery and automatic network configuration, the answer is again that this use case does not exist and has never been required or justified by requests from industry experts. But anyway, if someone wants it, a bunch of standard tools are already available, from a shared file, shared memory, or other custom communication between nodes on that single IP to a more sophisticated broker service. For example, what’s wrong with a lightweight SSDP?
And finally, if someone asks not to use static network config or brokerage services at all, but some magic hacks with a few bits inside a multicast group address, then they need to prove that the request is valid and explain why nobody has asked for or implemented that so far after decades of networking protocol evolution…

pavel.kirienko · October 25, 2022, 7:57am

I think you are overemphasizing the importance of peer-to-peer (unicast) communication for a DCPS system, where p2p is considered irrelevant or an anti-pattern, depending on how you squint. In a typical Cyphal deployment, the service traffic may account for less than 1% of the total traffic, plus it is possible to implement a valid Cyphal network without reliance on the service traffic at all (more on this in the Guide), so optimizing the solution for the benefit of this rather marginal case is unwise.

The advantage of my proposal over the original draft is that it eliminates the extra entities that are currently used to support this 1% of use cases (i.e., unicast exchanges) by unifying their implementation with the other part of the protocol that supports the remaining 99% of the usage (i.e., multicast exchanges). The problem with your intended proposal, which I still have not seen but it seems like a safe guess, is that it brings extra complexity to support the part of the protocol that is of low importance to most applications.

The part where you speak about port-based discrimination still seems hand-wavy; if you could provide a comprehensive description of the scenario that I asked about earlier (nodes A&B, subject X), that would be appreciated.

I recommend that we avoid emotionally loaded language like “magic hacks” because it communicates no useful information and does not bring us closer to consensus. The multicast-based solution I proposed does not use any “hacks” but is based on the correct utilization of a well-known and simple technology. I am tempted to describe your proposed alternative based on local brokering the same way, but you might see perhaps how that would be counter-productive to our discussion.

Sergey_Andreev · October 25, 2022, 5:20pm

Thanks a lot for clarifying that the main intent of this request is the removal of unicast communications from Cyphal.
I am really glad that we are close to the deal-breaking point. At Amazon, we treat data, industrial proof, and research/docs as the main basis for technical decisions.
Some emotions are actually fine, as we are humans and not robots (yet).

Can you please defend those really bold statements with data, proof links, opinions from a few industry experts, any other similar industrial systems? You stated:

Unicast/p2p is unimportant, irrelevant, and an anti-pattern for Cyphal
It is possible to implement a Cyphal network without service traffic (i.e. p2p) at all
Unicast/p2p is 1% of use cases

The data points against those statements (i.e. concrete use cases that require unicast p2p communication) are as follows:

Security
Authentication and authorization require p2p communication. Otherwise, it is an immediate security threat that invites man-in-the-middle attacks.
Moreover, multicast groups do not even allow one to determine the COUNT of participants in the group. So, it is like allowing a group of random people to stand around you at the ATM without knowing the size of that group or even whether it is present at all!
Can you please get feedback from industry experts regarding a multicast scenario for authentication/authorization in a highly secure and sensitive system?
My feedback - it makes no sense by definition: a single authority approves a specific device via direct and unaltered communication.

Encryption
Maintaining keys and security certificates requires p2p communication. We do not want keys to be leaked or, even worse, replaced (!) by a man-in-the-middle in a pub/sub group.

Firmware update
Firmware is a sensitive part of the system and has to be delivered directly to the specific device. Moreover, the host needs to control the process of updating each node, which implies p2p communication.
Security is paramount here as well, and we do not want the firmware update process to be intercepted or altered.

Configuration Management
Also highly sensitive and very device-specific. This is p2p communication. There is nothing pub/sub about it; it is direct communication between a single authority and the specific device being configured.

Device Calibration
This is specific 1-to-1 communication that follows a special procedure (i.e., a state machine) to calibrate an individual device, in other words, a series of service messages (commands and responses). It also needs to be highly secure, protected, and unaltered.

Retrieving logs
A simple use case: collecting/persisting all network communications within the embedded system and then uploading them for offline analysis, auditing, online analytics, etc.
That's essentially the entire 100% of production traffic being delivered via a unicast p2p log retrieval process. I really don’t see any basis for your claim that unicast communication accounts for only 1%…
And that process can run at any time: either constantly in real time or at specific moments (e.g., after completing a major part of an operation).

File transfers
It is close to the log retrieval use case above, but more generic. This is precisely p2p communication of sensitive information.

"Safety device"
Most aerospace systems (and some others) have the concept of a redundant safety device that is used for post-mortem analysis after a crash of the main system. Recovering data from this device is highly secure p2p communication, and here too we do not want the communication to be altered or intercepted.

External Systems integration: cloud, other networks
Connecting to a bridge, cloud, or other external network generally requires unicast p2p access. I’ve provided some specific integration examples above, and they can be extrapolated to any external cloud, network, orchestration system, etc.

pavel.kirienko · October 25, 2022, 5:40pm

This is not correct, and I said nothing that could be interpreted as such. The main intent is the improvement of the Cyphal/UDP transport definition. Please actually do read my proposal.

Please read The Cyphal Guide - Applications & Usage - OpenCyphal Forum, where you will find detailed elaboration and references. Pay specific attention to the fact that Cyphal is, in fact, a DCPS solution, and point-to-point communication is not its focus. If your application strictly requires p2p, Cyphal is not for you and there is nothing more to discuss.

Security and encryption are out of scope at this stage of the project. This is something that we have on the longer-term roadmap, though. I don’t see how these objectives can be used to argue against the DCPS architecture unless you are willing to claim that all pub/sub architectures are inherently flawed by design.

Do we really need to continue this conversation? It is tiring.

Sergey_Andreev · October 25, 2022, 7:58pm

I am fine with continuing the discussion and seeing feedback from various industries and experts, because so far no real data, proof, or feedback from other people has been provided to support this change, apart from a personal opinion…

If my requests for data, proof, and real industry needs behind the proposed MAJOR change in Cyphal make for a “tiring discussion”, I am very sorry about that. But that is how successful commercial projects work. Successful industrial-grade products are neither abstract theories nor math exercises; they are relevant to actual customers and their real needs.

I hope the Cyphal project is not turning into abstract academic research or a personal toy project, although one could find some red flags pointing in that direction.
If Cyphal is open to arbitrary major changes that are made without consensus with the industry while all valid concerns and data points are rejected, then it looks less like open source and more like proprietary software, with decisions locked to a single party.

Is there a voting process, or a way to bring in more people from various industries to provide feedback on this request?
Can you please consider keeping the original specification, especially as this request was originally presented as “minor”?