Overview
One of the core design principles of Cyphal is simplicity. That means two things: the protocol is constructed in a straightforward way and it is trivial to apply. We believe that this goal is achieved by the current design; yet, there were several occasions where our fellow humans would exclaim:
But if your protocol is so simple, how come it takes over a hundred pages to describe?
That question could arise out of a misunderstanding of how design specifications work. Among other objectives, Cyphal is designed to facilitate the robust interoperability of equipment from different vendors in high-integrity applications. That requires that every detail of the protocol is meticulously specified to ensure that, first, there are no unforeseen behaviors that might jeopardize a safety-critical application; second, that every implementation of the protocol can interoperate successfully with any other spec-compliant implementation, possibly from a different vendor. The formal project documentation also enables the integration of Cyphal into high-assurance V-model systems design workflows.
As a result, a concept that takes one sentence to explain using regular daily speech takes two pages in the Specification. This chapter provides a legalese-free description of the protocol as a response to the misperception of Cyphal's complexity.
While we are at it, we also explain how to migrate from the earlier revision of the protocol known as UAVCAN/DroneCAN. We are not going to cover how the design decisions were made because that would take forever considering that even a simple idea might turn out to be a huge can of worms if explored in-depth. If you want the background, feel free to search the forum because it is the place where many of the decisions were discussed and agreed upon (which often involved much back-and-forth bikeshedding).
Basics
Cyphal is a layered composition of three decoupled parts: the application layer, the presentation layer, and the transport layer (the physical layer is purposefully omitted). The application layer defines some generic high-level functions and concepts that are expected to be useful in different application domains, like diagnostics, configuration, basic physical quantities, etc. The presentation layer is modeled by DSDL, meaning data structure description language, which describes data formats and how that data is to be serialized and interpreted. The transport layer determines how to transfer serialized data structures over the network.
There are different kinds of data structures:
-
Messages. A node may publish a message using a specific numerical subject identifier. Using that subject identifier, another node or several may subscribe to specific messages. This is the standard publish-subscribe pattern. This is typically the main mode of communication in Cyphal applications and it is possible to implement a system using only messages.
-
Services. A node may send another node a request identified by a specific numerical service identifier. That other node would receive the request and then maybe send a response back. This is the standard client-server model.
When discussing subject or service identifiers generically we call them port identifiers. There's nothing special about this term and any port identifier is always either a subject or service identifier in concrete terms.
Messages and service calls are exchanged between nodes. A given hardware unit could implement one node (e.g. an air-speed sensor might be a single Cyphal node) or many nodes (e.g. a flight controller might have several functions that each act as a Cyphal node). Cyphal is a stateless protocol, meaning a node can join the network and begin operation immediately upon powering on without any kind of registration or preparatory data exchange with other participants. This is an important feature as it enables highly deterministic fault-tolerant systems. Cyphal is a peer-to-peer (democratic) protocol, meaning that there is no such thing as a "master" or any other kind of centralized intelligence; all nodes are equal.
Application
Cyphal itself is designed to meet the needs of a wide spectrum of vehicular computing applications, so as far as the application-level capabilities go, it doesn't offer much beyond the very foundation that domain-specific and application-specific entities can be built upon. This includes some common functions like diagnostics, introspection, configuration management, etc.
Special interest groups (SIG) or individual adopters can define domain- or application-specific standards or conventions on top of Cyphal.
Presentation
Anybody familiar with C-like languages should immediately feel at home. Suppose there's a file my_project/MyMessageType.1.0.dsdl, where my_project is a namespace directory and 1.0 is the version number of the data type:
uint16 VALUE_LOW = 1000
uint16 VALUE_HIGH = 2000
uint16 VALUE_MID = (VALUE_HIGH + VALUE_LOW) / 2 # Exact rational arithmetics!
uint16 value
uint8[<=100] key # Variable-length array, at most 100 items.
@extent 128 * 8
The wire representation is straightforward. The byte order is little-endian and variable-length arrays (like key here) are preceded by a length prefix which is either uint8, uint16, uint32, or uint64:
$ pip install pycyphal # Using PyCyphal for this demo.
$ python
>>> import pycyphal
>>> pycyphal.dsdl.compile('dsdl_src/my_project') # Compile DSDL...
>>> import my_project # ...and use it.
>>> serialized = pycyphal.dsdl.serialize(my_project.MyMessageType_1_0(
... value=1234, key='Hello world!'))
>>> b''.join(serialized)
b'\xd2\x04\x0cHello world!'
Here's what we see: 1234 = 0x04D2, so it is serialized as [0xD2, 0x04]. The greeting encoded in UTF-8 is 12 bytes long, which is 0x0C, which is immediately followed by Hello world!. You can generate and parse such serialized representations using auto-generated code (Nunavut will help you here) or you can just twiddle bytes manually if you don't want to get your hands dirty with automatic transcompilers.
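For those who prefer to twiddle bytes by hand, the layout described above can be reproduced with Python's standard struct module. This is only a sketch for this one data type: the helper name serialize_my_message is hypothetical, and the one-byte length prefix is an assumption taken from the dump above.

```python
import struct


def serialize_my_message(value: int, key: bytes) -> bytes:
    """Manually serialize my_project.MyMessageType.1.0 (illustrative helper)."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value does not fit into uint16")
    if len(key) > 100:
        raise ValueError("key is limited to 100 items")
    # "<H" is a little-endian uint16; "B" is the uint8 length prefix of the array.
    return struct.pack("<HB", value, len(key)) + key


serialized = serialize_my_message(1234, "Hello world!".encode("utf-8"))
print(serialized)
```

The result matches the bytes produced by PyCyphal in the session above, which is the whole point: nothing but the agreed-upon layout is needed at runtime.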
The crucial thing to note here is that DSDL does not exist at runtime. As a specification language, DSDL can be read by humans to serialize and deserialize objects manually, and by machines to generate such serialization and deserialization code automatically. An embedded system does not know anything about DSDL, because at the time it is deployed DSDL has already done its job.
Suppose you manufactured a gazillion devices using the above definition and then you suddenly realized that the definition is deficient. You can't just migrate all devices to a newer version at once because 1/4 gazillion of these devices are already in the field (sales have been brisk)! At this point the concept of semantic compatibility will become extremely prominent in your life. The Cyphal designers endured two years of occasionally heated debates about data type versioning and in the end, they summoned the implicit truncation rule and the implicit zero extension rule into existence. Here's how they work:
# my_project/MyMessageType.1.1.dsdl
uint16 VALUE_LOW = 1000
uint16 VALUE_HIGH = 2000
uint16 VALUE_MID = (VALUE_HIGH + VALUE_LOW) * 0.5 # Rational arithmetics.
uint16 value
# This definition has no key. Who needs keys anyway?
@extent 128 * 8
# my_project/MyMessageType.1.2.dsdl
uint16 value
uint8[<=100] key # The key is back.
float64 extra_value # A new field!
@extent 128 * 8
We have two new versions of the same type which look quite different, but they are all semantically compatible. A node may publish my_project/MyMessageType.1.1, another node may subscribe and deserialize the message using my_project/MyMessageType.1.2, and they would communicate just fine thanks to the implicit zero extension rule. The rule says that if the deserializer expects more data than there is, it shall assume that it's just zeros all the way down, so the key would look empty and the extra field would be zero. If the nodes reversed their roles, the implicit truncation rule would enter the scene, which says that if there's more data than the node expected, it should pretend that the extra data isn't there at all. At the DSDL level, a related concept is structural polymorphism or structural sub-typing.
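Both rules can be illustrated with a hand-written deserializer. The sketch below is not generated code; the function names are hypothetical, and it assumes byte-aligned fields with a uint8 length prefix, as in the definitions above.

```python
import struct


def read_zero_extended(buf: bytes, offset: int, size: int) -> bytes:
    """Return `size` bytes starting at `offset`, padding with zeros past the end."""
    chunk = buf[offset:offset + size]
    return chunk + bytes(size - len(chunk))


def deserialize_v1_1(buf: bytes) -> int:
    # Implicit truncation: anything after the uint16 value is simply ignored.
    (value,) = struct.unpack("<H", read_zero_extended(buf, 0, 2))
    return value


def deserialize_v1_2(buf: bytes) -> tuple[int, bytes, float]:
    # Implicit zero extension: missing bytes are read as zeros.
    (value,) = struct.unpack("<H", read_zero_extended(buf, 0, 2))
    (key_length,) = struct.unpack("<B", read_zero_extended(buf, 2, 1))
    if key_length > 100:
        raise ValueError("length prefix exceeds the array capacity")
    key = read_zero_extended(buf, 3, key_length)
    (extra_value,) = struct.unpack("<d", read_zero_extended(buf, 3 + key_length, 8))
    return value, key, extra_value


# A v1.1 publisher sends only the value; a v1.2 subscriber sees an empty key and 0.0.
v1_1_payload = struct.pack("<H", 1234)
print(deserialize_v1_2(v1_1_payload))

# A v1.2 publisher sends everything; a v1.1 subscriber ignores the surplus.
v1_2_payload = struct.pack("<HB", 1234, 3) + b"abc" + struct.pack("<d", 1.5)
print(deserialize_v1_1(v1_2_payload))
```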
There are two other things that are sometimes relevant: tagged unions and service types. A tagged union is a way of encoding one value out of several possible options (like std::variant<> in C++ or enumerations in Rust; there is no equivalent in C); the encoded value is prepended with a byte that says which one it is:
@union # This directive adds one byte in front of the message.
uint16 integer # If the tag is zero, it's an integer.
uint8[<=100] string # If the tag is one... You know the drill.
my_project.MyMessageType.1.2 my_object # Yeah, composition.
@extent 200 * 8
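As a rough illustration of the tag byte, here is a hand-rolled encoder and decoder for the first two options of the union above. The function names are hypothetical, and the nested my_object option is deliberately left out because nested composites need more machinery than fits in a short sketch.

```python
import struct


def serialize_union_integer(value: int) -> bytes:
    # Tag 0 selects the first option, `integer`, a little-endian uint16.
    return struct.pack("<BH", 0, value)


def serialize_union_string(text: bytes) -> bytes:
    if len(text) > 100:
        raise ValueError("string is limited to 100 items")
    # Tag 1 selects `string`: a uint8 length prefix followed by the items.
    return struct.pack("<BB", 1, len(text)) + text


def deserialize_union(buf: bytes):
    tag = buf[0]
    if tag == 0:
        (value,) = struct.unpack_from("<H", buf, 1)
        return "integer", value
    if tag == 1:
        length = buf[1]
        return "string", buf[2:2 + length]
    raise ValueError(f"option with tag {tag} is not handled by this sketch")


print(serialize_union_integer(1234))
print(deserialize_union(serialize_union_string(b"hi")))
```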
A service type is defined by inserting three minus characters (---) somewhere into the definition, which separates the service request schema from the response schema. If you have experience with ROS, you already know everything there is to know.
Looking at the examples here might help:
The public regulated data types define certain standard application-level functions such as the heartbeat message uavcan.node.Heartbeat, the only application-level function that every Cyphal node is required to support. Except for publishing a Heartbeat once a second, every other application-level function is optional and can be implemented at the discretion (or lack thereof) of the designer. The documentation for such application-level behaviors is provided right in the comments of the respective DSDL definitions so that everything is kept conveniently in one place.
The last thing you need to know about the presentation layer of Cyphal is how subject and service identifiers (aka port identifiers) are assigned unique numbers. Suppose there is a node that publishes messages of type my_project.MyMessageType.1.2 or provides a service of such and such type. How does it know what exact port to use? The vendor of the node could go the DroneCAN way and just hard-code a specific identifier, but you might see, perhaps, how this could get out of hand? Another vendor would do the same thing; collisions galore! So we say this:
-
If the vendor really needs that fixed port identifier, it should send a pull request with the new data type definition to the public regulated data types repository linked above. The Cyphal maintainers will be picky about which types are allowed into the regulated data type set; if the proposed type serves a very specific use case of a small vendor, it might get rejected. Think of it like the USB standard classes or CANopen standard profiles.
-
If the above does not hold (it rarely does), the vendor shall provide the ability to reconfigure the subject/service identifier by the end-user or integrator (such identifiers are called non-fixed identifiers). Failure to do so will render the device not Cyphal-compliant and will cause many headaches for the customer.
-
If the vendor happens to be using Cyphal in a closed project with no exposure to the outside world, the vendor can do as it pleases (in most countries). Nobody cares, really.
Seems restrictive? That's the cost of robust interoperability.
The specification and the public regulated data types repository document the ranges of port identifiers that can be used with fixed and non-fixed identifiers; the former are called regulated identifiers in the Cyphal parlance, and the latter are called unregulated.
This is a case where the specification might actually be pretty clear so we'll repeat table 2.1 here to sum up this section:
| | regulated | unregulated |
| public | robust interoperability | no fixed port identifiers/must be configurable |
| private | (nope, not a thing) | sure, just keep it to yourself |
Transport
The job of the transport layer is to ferry serialized objects around the network (such occurrences are called transfers) and to facilitate topic-based filtering. The transports are designed with the requirements of high-integrity applications in mind, which include strict temporal predictability guarantees, redundant interfaces, and more exotic concepts like tunable reliability controls for very special snowflakes.
There are several transport protocols designed on top of different networking technologies such as CAN (FD) (called Cyphal/CAN) or UDP/IP (called Cyphal/UDP); they are replaceable, meaning that the presentation layer and the application on top of it are isolated from the specifics of the transport and can be migrated from one transport to another easily.
The transport protocols are designed to support topic-based data filtering in hardware, such that when the application requests a particular subscription or a service, the transport layer configures the underlying hardware to accept the relevant messages/requests/responses and to reject the rest automatically. All common implementations of high-speed and/or real-time networking hardware, like CAN controllers or Ethernet adapters, provide the necessary functionality out of the box so the software doesnāt need to sift through copious amounts of data in real-time.
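To give a feel for what such hardware filtering amounts to on CAN, here is a sketch that computes an acceptance filter (an identifier/mask pair) for one subject. The bit positions are assumptions based on the Cyphal/CAN message identifier layout (13-bit subject-ID at bits 8 to 20, service-not-message flag at bit 25) and should be checked against the Specification; the function names are hypothetical. The two CAN IDs used at the end are taken from the candump listing later in this chapter.

```python
SUBJECT_ID_SHIFT = 8                     # assumed position of the subject-ID field
SUBJECT_ID_MASK = 0x1FFF << SUBJECT_ID_SHIFT
SERVICE_NOT_MESSAGE_BIT = 1 << 25        # assumed: 0 for messages, 1 for services


def make_subject_filter(subject_id: int) -> tuple[int, int]:
    """Return (filter_id, filter_mask) that accepts messages of one subject."""
    if not 0 <= subject_id <= 0x1FFF:
        raise ValueError("subject-ID must fit into 13 bits")
    filter_id = subject_id << SUBJECT_ID_SHIFT  # service-not-message bit stays zero
    filter_mask = SUBJECT_ID_MASK | SERVICE_NOT_MESSAGE_BIT
    return filter_id, filter_mask


def accepts(filter_id: int, filter_mask: int, frame_can_id: int) -> bool:
    # This is the comparison a typical CAN controller performs in hardware.
    return (frame_can_id & filter_mask) == (filter_id & filter_mask)


filter_id, filter_mask = make_subject_filter(4919)
print(accepts(filter_id, filter_mask, 0x1013373B))  # frame of subject 4919
print(accepts(filter_id, filter_mask, 0x107D553B))  # heartbeat frame
```

Priority and source node-ID bits are left out of the mask on purpose, so the filter passes the subject regardless of who publishes it and how urgently.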
Cyphal/CAN
The Cyphal/CAN transport is the direct successor of UAVCAN/DroneCAN extended with CAN FD support. One familiar with DroneCAN will have no trouble migrating to Cyphal because, in essence, it relies on the same core concepts with just a few bits shifted around. The specification of Cyphal/CAN is hardly five pages long, so implementing it from scratch should generally be a no-brainer that takes barely more than a few hundred lines of code. Even that is unlikely to be necessary, given that portable MIT-licensed implementations are available.
Not to replicate the specification but for clarity's sake, the CAN transport treats both Classic CAN and CAN FD equivalently, the only difference being the maximum transmission unit (MTU, the amount of data per CAN frame). Most of the metadata (such as the subject/service ID, source node ID, and priority) is packed into the CAN ID in the most obvious way possible, except for four things: the transfer-ID and the three flags which are start-of-transfer, end-of-transfer, and the toggle bit. These four things go into the last byte of the frame payload, also known as the tail byte.
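The following sketch unpacks that metadata from a message frame. The field positions (priority at bits 26 to 28, subject-ID at bits 8 to 20, source node-ID at bits 0 to 6; tail byte flags at bits 7, 6, 5 and a 5-bit transfer-ID) are assumptions to be verified against the Specification, and the class and function names are hypothetical. The sample frame is the single-frame transfer from the candump listing later in this chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MessageFrameInfo:
    priority: int
    subject_id: int
    source_node_id: int
    start_of_transfer: bool
    end_of_transfer: bool
    toggle: bool
    transfer_id: int


def parse_message_frame(can_id: int, data: bytes) -> MessageFrameInfo:
    tail = data[-1]  # the tail byte is always the last byte of the frame payload
    return MessageFrameInfo(
        priority=(can_id >> 26) & 0x07,
        subject_id=(can_id >> 8) & 0x1FFF,
        source_node_id=can_id & 0x7F,
        start_of_transfer=bool(tail & 0x80),
        end_of_transfer=bool(tail & 0x40),
        toggle=bool(tail & 0x20),
        transfer_id=tail & 0x1F,
    )


frame_data = bytes.fromhex("D2 04 0C 48 65 6C 6C 6F 20 77 6F 72 6C 64 21 E0")
info = parse_message_frame(0x1013373B, frame_data)
print(info)
```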
The transfer-ID is a peculiarity of the Cyphal jargon; most other protocols call it the sequence number. It is an integer that is incremented every time a message of a specific subject is published or a specific service is invoked, and it is quite paramount for many functions of the protocol.
The flags are only used when the serialized entity does not fit into a single frame, which means that the transfer is a multi-frame transfer. Multi-frame transfers may appear convoluted but essentially they are a zero-cost feature because any other implementation that takes into account all relevant edge cases (which are many) will end up being functionally similar. The purpose of the flags should be evident: the start-of-transfer and the end-of-transfer flags demarcate the first and the last frame of the transfer, respectively, and the toggle bit alternates from frame to frame, starting from one.
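A minimal sketch of the sending side may make this concrete. It assumes Classic CAN (MTU of 8 bytes, no padding), a tail byte laid out as start-of-transfer at bit 7, end-of-transfer at bit 6, toggle at bit 5, and a 5-bit transfer-ID, and it expects the transfer CRC to be appended to the payload already. The function name is hypothetical; the sample bytes are taken from the multi-frame candump listing later in this chapter.

```python
def split_into_frames(payload_with_crc: bytes, transfer_id: int, mtu: int = 8) -> list[bytes]:
    """Split a serialized transfer into CAN frame payloads, each ending with a tail byte."""
    chunk_size = mtu - 1  # one byte per frame is reserved for the tail byte
    chunks = [
        payload_with_crc[i:i + chunk_size]
        for i in range(0, len(payload_with_crc), chunk_size)
    ] or [b""]
    frames = []
    toggle = True  # the toggle bit starts from one and alternates with every frame
    for index, chunk in enumerate(chunks):
        start_of_transfer = index == 0
        end_of_transfer = index == len(chunks) - 1
        tail = (
            (start_of_transfer << 7)
            | (end_of_transfer << 6)
            | (toggle << 5)
            | (transfer_id & 0x1F)
        )
        frames.append(chunk + bytes([tail]))
        toggle = not toggle
    return frames


payload = bytes.fromhex("D2 04 0C") + b"Hello world!" + bytes.fromhex("F9 02")
for frame in split_into_frames(payload, transfer_id=0):
    print(frame.hex(" ").upper())
```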
It is well known that one console dump is worth 1024 words. Suppose we start up the Yakut CLI tool and use it to publish a message of the type we defined above over a GNU/Linux SocketCAN interface using the subject-ID 4919 (0x1337 in hex) from node-ID 59. First, we need to ensure that the custom DSDL definitions are found in CYPHAL_PATH so that Yakut can see them (more on this in the Yakut user guide). Then we go like:
# Publish a message over CAN bus (could be UDP or serial if you change the next line):
$ export UAVCAN__CAN__IFACE='socketcan:vcan1'
$ export UAVCAN__NODE__ID=59
$ yakut pub 4919:my_project.MyMessageType '{value: 1234, key: "Hello world!"}'
Meanwhile, having started the candump utility in another terminal, we observe the following developments:
# Columns: timestamp, iface, flags (B means BRS), CAN ID, [payload size], payload.
$ candump -decaxta any
(1.365) vcan1 TX B - 107D553B [08] 00 00 00 00 20 89 00 E0 '.... ...'
(1.366) vcan1 TX B - 1013373B [16] D2 04 0C 48 65 6C 6C 6F 20 77 6F 72 6C 64 21 E0
The first frame here is a heartbeat from the CLI tool (remember, they are mandatory for all nodes). There may be additional frames related to various high-level protocol features, like uavcan.node.port.List, for instance; they are not shown here for brevity. In the following example, we equip a pair of sunglasses and publish the same message but the MTU is set to 8 bytes, forcing the publisher to resort to a multi-frame transfer:
$ export UAVCAN__CAN__MTU=8
$ yakut pub 4919:my_project.MyMessageType.1.0 '{value: 1234, key: "Hello world!"}'
$ candump -decaxta any
(7.925) vcan2 TX - - 107D553B [8] 00 00 00 00 20 3D 01 E0 '.... =..'
(7.925) vcan2 TX - - 1013373B [8] D2 04 0C 48 65 6C 6C A0 '...Hell.'
(7.925) vcan2 TX - - 1013373B [8] 6F 20 77 6F 72 6C 64 00 'o world.'
(7.925) vcan2 TX - - 1013373B [4] 21 F9 02 60 '!..`'
Look at it go. The last two bytes at the end of the transfer (right after the exclamation mark !) are the multi-frame transfer CRC: the CRC-16-CCITT function of the serialized representation. It is needed to let the receiver ensure that the received multi-frame transfer is reassembled correctly.
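The CRC bytes in the dump can be reproduced with a few lines of code. The sketch below assumes the CRC-16/CCITT-FALSE variant (polynomial 0x1021, initial value 0xFFFF, no bit reflection, no final XOR) and big-endian placement of the CRC on the wire, which is consistent with the F9 02 bytes seen above; the function name is hypothetical.

```python
def crc16_ccitt_false(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC-16/CCITT-FALSE: polynomial 0x1021, MSB first."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


serialized = bytes.fromhex("D2 04 0C") + b"Hello world!"
transfer_crc = crc16_ccitt_false(serialized)
print(hex(transfer_crc))

# A receiver can run the same function over the payload plus the received CRC;
# a zero residue means the transfer was reassembled correctly.
residue = crc16_ccitt_false(serialized + transfer_crc.to_bytes(2, "big"))
print(residue)
```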
Cyphal/UDP
Cyphal/UDP is a first-class transport intended for low-latency, high-throughput intravehicular Ethernet networks with complex topologies, which may be switched, multi-drop, or mixed. A network utilizing Cyphal/UDP can be built with standards-compliant commercial off-the-shelf networking equipment and software.
Cyphal/UDP heavily relies on IP multicasting; for example, each subject is modeled as a separate multicast group, which allows implementations to offload the network traffic handling to the standard IP stack and the underlying off-the-shelf networking hardware.
There are no hands-on examples in this section but it is adequately covered in the Specification. LibUDPard provides a minimalistic implementation for deeply embedded high-integrity systems.
Cyphal/serial
Cyphal/serial operates on top of raw byte-level communication channels, such as TCP/IP connections, SSL, UART, RS-232, RS-422, USB CDC ACM, and any similar communication links that allow the exchange of unstructured byte streams. Cyphal/serial may also be used to store Cyphal frames in files.
Likewise, interested readers should proceed to the Specification for details.
Cyphal/whatever
Cyphal is designed to be portable to various underlying transport protocols. One wishing to define a custom transport can consult with the following topic where some general pointers are provided:
Key concepts graphically
The following diagram summarizes the key ideas of a Cyphal-based publish-subscribe system:
When a new component needs to be integrated into the network, its port-identifiers must be configured so that it knows where to find data it needs and where to put data it produces:
Migration from DroneCAN
The protocol has been simplified compared to its predecessor DroneCAN, and several design issues have been addressed. Here is the full list of substantial changes:
-
The Data Type ID has been removed. Without going much into detail, it was coupling the syntax of data (a data type definition) with its semantics (how it was used). It was the cause of certain architectural imperfections in the applications that relied on DroneCAN, so proceeding further without resolving that issue was considered undesirable. In v1, this is resolved with the new concepts of Subjects and Services, which are decoupled from the type identity and permit surjective mapping of subjects or services onto types, rendering Cyphal architecturally identical to conventional publish-subscribe frameworks.
-
The Data Type Signature went the same way. It was shown to make data type definitions unnecessarily difficult to evolve. The new design permits polymorphic subtyping and arbitrary modification of data types so that deployed systems can be upgraded incrementally. This is the second most important upgrade after the syntax-semantics decoupling shown above.
-
The transfer CRC is no longer pre-seeded with anything. This is a direct consequence of the above. Also, in Cyphal/CAN the CRC has been moved towards the end of the transfer.
-
Data type definitions can now be explicitly versioned and evolved sensibly. Messed up a type? No problem, just release a new version.
-
Tail Array Optimization is removed. Every array has a length prefix now, always.
-
The implicit fields (array length and union tag) are now either 8, 16, 32, or 64 bits wide. They used to have odd sizes like uint3. This change simplifies data type design and serialization.
-
The byte order is kept little-endian but the bits are now populated LSB-to-MSB, not the other way around. This change provides enhanced compatibility with 3rd-party tools and enables faster serialization and deserialization on conventional little-endian microarchitectures (big-endian platforms shall convert the byte order during serialization and deserialization).
-
The CAN ID bit layout of Cyphal/CAN is different and the toggle bit starts with 1 instead of 0. The toggle bit change is to make DroneCAN and Cyphal distinguishable at runtime, enabling their coexistence in the same application.
There are no changes that affect the hardware. A single unit or a whole system can migrate from DroneCAN to Cyphal by a trivial software update. Said software update amounts to a few things:
-
Replace your old DroneCAN library with its Cyphal equivalent. The API will be slightly different but architecturally they are all alike. Some things got new names; like, Data Type ID is now the subject/service-ID. Some things are completely removed thus making development easier; for example, no more Data Type Signature and Tail Array Optimization.
-
Don't forget to RTFM! All libraries are supplied with extensive documentation.
-
Add version numbers to your DSDL definitions and remove the manual padding before variable-length arrays and unions. Read the design guidelines in the public regulated DSDL repository and consider the recommendations about idempotency, if applicable. Don't forget that there's no tail array optimization anymore.
It doesn't take much effort on the software side and there are zero repercussions for the hardware. Most importantly, the new implementations are built to a much higher quality standard. The only valid reason for hodling onto DroneCAN is the legacy, and we are actively working with vendors to ensure their speedy convergence on Cyphal.
Further information
The best place to start for a newcomer is probably the PyCyphal demo, as it allows experimenting on the local machine without much preparation. The key concepts are easily transferable to other implementations such as the deeply embedded real-time Libcanard (which implements the full Cyphal/CAN in a little over 1000 lines of code and works in extremely resource-constrained microcontrollers with ca. 32K ROM, 8K RAM, taking only a few kibibytes of ROM for itself).