Port type safety enforcement

pavel.kirienko · June 19, 2021, 7:45pm

Took me a while but the PoC is now ready. It’s a bit crude but I hope it’s good enough to illustrate the concept:

It is a mix of the original idea I shared at Automatic configuration of port identifiers - #7 by pavel.kirienko and the cookie register I mentioned in the previous post. Before I explain how it works, let me explain its behavior.

What it does

You can grab the demo from the repo, install Yakut (pip install yakut), and run it:

yakut compile -O . https://github.com/UAVCAN/public_regulated_data_types/archive/master.zip \
                   https://github.com/Zubax/zubax_dsdl/archive/master.zip
export UAVCAN__CAN__IFACE="socketcan:vcan0"
export UAVCAN__NODE__ID=1                    # Any node-ID will do
./udral_pnp.py

Be sure to configure a virtual CAN interface or use a real one.

Run Yakut monitor with the node-ID allocator:

export UAVCAN__CAN__IFACE="socketcan:vcan0"
export UAVCAN__NODE__ID=127                  # Any other node-ID will do
y mon -P node_id_allocation_table.db

The PoC will emit a few messages saying that it’s detected a new node (the monitor), but the node doesn’t seem to be UDRAL PnP-capable because there is no cookie register in it:

root:       Detected new online node 127
node_proxy: Started auto-configuration of node 127
node_proxy: Node 127 is not UDRAL-PnP-capable, please configure it manually

Now you can grab the old DS-015 demos extended with the cookie register and build them; here is the relevant diff:

@@ -645,6 +649,13 @@ int main(const int argc, char* const argv[])
     val._string.value.count = 0;
     registerRead("uavcan.node.description", &val);  // We don't need the value, we just need to ensure it exists.
 
+    // The UDRAL cookie is used to mark nodes that are auto-configured by a specific auto-configuration authority.
+    // We don't use this value, it is managed by remote nodes; our only responsibility is to persist it across reboots.
+    // This register is entirely optional though; if not provided, the node will have to be configured manually.
+    uavcan_register_Value_1_0_select_string_(&val);
+    val._string.value.count = 0;  // The value should be empty by default, meaning that the node is not configured.
+    registerRead("udral.pnp.cookie", &val);
+
     // Configure the transport by reading the appropriate standard registers.
     uavcan_register_Value_1_0_select_natural16_(&val);
     val.natural16.value.count       = 1;

Now, if you run either demo (or both), the PoC will auto-configure them immediately. For example, when I start the differential pressure demo for the first time, I get roughly this:

root:       Detected new online node 118
node_proxy: Started auto-configuration of node 118
node_proxy: Node 118 requires autoconfiguration because cookie '' != expected 'autoconfigured 4b65eb38'
node_proxy: Registers available on node 118: ['uavcan.node.id', 'uavcan.node.description', 'udral.pnp.cookie', 'uavcan.can.mtu', 'uavcan.pub.airspeed.differential_pressure.id', 'uavcan.pub.airspeed.differential_pressure.type', 'uavcan.pub.airspeed.static_air_temperature.id', 'uavcan.pub.airspeed.static_air_temperature.type', 'uavcan.node.unique_id']
node_proxy: Node 118: currently configured ports: PortAssignment(pub={'airspeed.differential_pressure': 65535, 'airspeed.static_air_temperature': 65535}, sub={}, cln={}, srv={})
root:       Allocating services of remote node 118; available ports: PortAssignment(pub={'airspeed.differential_pressure': 65535, 'airspeed.static_air_temperature': 65535}, sub={}, cln={}, srv={})
root:       Detected services on node 118: {'airspeed': {'': PortSuffixMapping(pub={'differential_pressure': 'airspeed.differential_pressure', 'static_air_temperature': 'airspeed.static_air_temperature'}, sub={}, cln={}, srv={})}}
root:       New airspeed client of node 118: AirspeedClient(diff_pressure=Subscriber(dtype=uavcan.si.sample.pressure.Scalar.1.0, transport_session=CANInputSession(InputSessionSpecifier(data_specifier=MessageDataSpecifier(subject_id=6010), remote_node_id=None), PayloadMetadata(extent_bytes=11))), temperature=Subscriber(dtype=uavcan.si.sample.temperature.Scalar.1.0, transport_session=CANInputSession(InputSessionSpecifier(data_specifier=MessageDataSpecifier(subject_id=6020), remote_node_id=None), PayloadMetadata(extent_bytes=11))))
node_proxy: Node 118: new ports: PortAssignment(pub={'airspeed.differential_pressure': 6010, 'airspeed.static_air_temperature': 6020}, sub={}, cln={}, srv={})
node_proxy: Writing registers of node 118: {'uavcan.pub.airspeed.differential_pressure.id': uavcan.register.Value.1.0(natural16=uavcan.primitive.array.Natural16.1.0(value=[6010])), 'uavcan.pub.airspeed.static_air_temperature.id': uavcan.register.Value.1.0(natural16=uavcan.primitive.array.Natural16.1.0(value=[6020])), 'udral.pnp.cookie': uavcan.register.Value.1.0(string=uavcan.primitive.String.1.0(value='autoconfigured 4b65eb38'))}
node_proxy: Node 118: command uavcan.node.ExecuteCommand.Request.1.1(command=65530, parameter='') response: uavcan.node.ExecuteCommand.Response.1.1(status=0)
node_proxy: Node 118: command uavcan.node.ExecuteCommand.Request.1.1(command=65535, parameter='') response: uavcan.node.ExecuteCommand.Response.1.1(status=0)
node_proxy: Node 118 configured successfully

These messages are followed by the output of the simulated sensor feed after the node is configured. The servo demo is integrated in a similar fashion; the PoC supports up to 2 airspeed sensors and up to 2 servos.

The following screenshot shows both demos running, with the auto-allocated ports being 5000, 5050, 6010, and 6020:

How it works

The basic algorithm is already explained in Automatic configuration of port identifiers - #7 by pavel.kirienko. While working on its implementation, I noticed that it is not necessary to introduce additional registers for network service identification, since this information is trivially deducible from the names of the available ports, especially if they are mandated to follow a specific pattern. Here is a high-level overview of the process:

The configurator node (e.g., the flight controller) has a certain token that identifies the current configuration of the vehicle. In the PoC, this is simply a 32-bit random number that is generated once and stored in a non-volatile register. If the vehicle needs to be re-configured, the token can be changed to trigger reconfiguration of all nodes connected to the vehicle from now on.
Whenever a new node is detected on the network, the PnP configurator reads the value of its string-typed register udral.pnp.cookie. If the register does not exist (or is of a wrong type), the node does not support auto-configuration, in which case the process ends here. The user may then be prompted to configure the node manually.
If the value of the cookie matches the value of the configuration token, the node was auto-configured earlier, so it needs no further processing. If the cookie contains some unexpected value (e.g., a string like manual), auto-configuration is also skipped on the assumption that the human prefers to configure the device manually. Notice that this hot path (one register request) is executed every time a node comes online.
By this point we have determined that the device supports and requires auto-configuration. Next, we need to determine which standard network services it supports (like servo control or airspeed sensor data publication). If we find any standard services that we can accommodate automatically, that’s fine; there may be other services that need to be configured manually (like vendor extensions, application-specific services, or other standards). Let’s call this process service discovery. It is very simple: the first thing to do is to read all the registers that are available on the node using uavcan.register.List.
Of all registers exposed by the node, we are only interested in those that configure port-IDs. Per the standard, they follow a pattern like uavcan.(pub|sub|cln|srv).[a-z0-9_.]+.id. For example, the register uavcan.pub.airspeed.differential_pressure.id defines the ID of the published subject airspeed.differential_pressure.
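
The cookie check from the first steps can be sketched roughly as follows. This is not the PoC's code: the names CookieAction, expected_cookie, and classify_cookie are made up for illustration, and the rule for telling a stale auto-configuration cookie apart from a manual marker is an assumption. The post does not spell it out, so the sketch relies on the "autoconfigured " prefix and the hexadecimal token formatting visible in the log output above.

```python
import enum

# Assumption: cookies written by a configurator start with this prefix,
# as seen in the log output ('autoconfigured 4b65eb38').
COOKIE_PREFIX = "autoconfigured "


class CookieAction(enum.Enum):
    UNSUPPORTED = enum.auto()         # No usable cookie register; configure manually.
    ALREADY_CONFIGURED = enum.auto()  # Cookie matches the token; nothing to do.
    MANUAL = enum.auto()              # Foreign value (e.g., "manual"); leave the node alone.
    AUTOCONFIGURE = enum.auto()       # Empty or stale cookie; run auto-configuration.


def expected_cookie(token: int) -> str:
    """Render the 32-bit configuration token as the cookie string (assumed format)."""
    return f"{COOKIE_PREFIX}{token:08x}"


def classify_cookie(cookie, expected: str) -> CookieAction:
    """
    cookie is the value read from udral.pnp.cookie, or None if the register
    is missing or is not string-typed.
    """
    if not isinstance(cookie, str):
        return CookieAction.UNSUPPORTED
    if cookie == expected:
        return CookieAction.ALREADY_CONFIGURED
    if cookie == "" or cookie.startswith(COOKIE_PREFIX):
        return CookieAction.AUTOCONFIGURE
    return CookieAction.MANUAL
```

Under these assumptions, changing the token makes every previously configured node fall into the AUTOCONFIGURE branch the next time it comes online, while a node whose cookie was set to something like manual is left untouched.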

Extract port names from the register names. At this stage, we end up with a list like:

PortAssignment(
    pub={'servo.feedback': 1234, 'servo.status': 65535, 'servo.power': 1236, 'servo.dynamics': 1237},
    sub={'servo.setpoint': 1238, 'servo.readiness': 65535, 'some_vendor_specific_thing': 3333},
    cln={'another.vendor_specific_thing': 123},
    srv={},
)
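
For illustration, the extraction step could look roughly like this. It is a sketch, not the PoC's implementation: the PortAssignment class below is a stand-in whose definition is only guessed from the printed output, extract_ports is a hypothetical helper, and the input is assumed to be a plain dict of register names mapped to already-decoded values.

```python
import dataclasses
import re

# Matches port-ID registers only; ".type" registers and unrelated ones are skipped.
_PORT_ID_REGISTER = re.compile(r"uavcan\.(pub|sub|cln|srv)\.([a-z0-9_.]+)\.id")


@dataclasses.dataclass
class PortAssignment:
    """Stand-in for the PoC class of the same name (assumed layout)."""
    pub: dict = dataclasses.field(default_factory=dict)
    sub: dict = dataclasses.field(default_factory=dict)
    cln: dict = dataclasses.field(default_factory=dict)
    srv: dict = dataclasses.field(default_factory=dict)


def extract_ports(registers: dict) -> PortAssignment:
    """Collect {port name: port-ID} per port kind from a register dump."""
    out = PortAssignment()
    for name, value in registers.items():
        match = _PORT_ID_REGISTER.fullmatch(name)
        if match:
            kind, port_name = match.groups()
            getattr(out, kind)[port_name] = value
    return out
```

Fed with the registers of node 118 from the log above, this yields pub={'airspeed.differential_pressure': 65535, 'airspeed.static_air_temperature': 65535} and leaves the other three dicts empty.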

Next we check if there are any port names that look familiar (follow the naming convention we define for UDRAL). Ports that don’t follow the convention are simply ignored (either to be manually configured or to be auto-configured using some other means). Suppose that UDRAL-compliant network services name their ports following a pattern like service_name.instance_name.suffix, where the service name defines which kind of service it is (e.g., “airspeed”, “servo”, “gnss”, “esc”, etc.), the instance name is provided if there is more than one instance implemented by the node (e.g., “first”, or just “0”), and the suffix reflects the purpose (e.g., differential_pressure). For example, servo.left.dynamics, or airspeed.static_air_temperature. Here is how this logic is implemented in Python:

import dataclasses
from typing import Iterable, Tuple

@dataclasses.dataclass(frozen=True)
class PortSuffixMapping:
    pub: dict[str, str] = dataclasses.field(default_factory=dict)
    sub: dict[str, str] = dataclasses.field(default_factory=dict)
    cln: dict[str, str] = dataclasses.field(default_factory=dict)
    srv: dict[str, str] = dataclasses.field(default_factory=dict)

def detect_service_instances(pub: Iterable[str] = (),
                             sub: Iterable[str] = (),
                             cln: Iterable[str] = (),
                             srv: Iterable[str] = ()) -> dict[str, dict[str, PortSuffixMapping]]:
    out: dict[str, dict[str, PortSuffixMapping]] = {}

    def psm(s: str, i: str) -> PortSuffixMapping:
        return out.setdefault(s, {}).setdefault(i, PortSuffixMapping())

    for svc, ins, suf, port in _split(pub): psm(svc, ins).pub[suf] = port
    for svc, ins, suf, port in _split(sub): psm(svc, ins).sub[suf] = port
    for svc, ins, suf, port in _split(cln): psm(svc, ins).cln[suf] = port
    for svc, ins, suf, port in _split(srv): psm(svc, ins).srv[suf] = port
    return out

def _split(port_names: Iterable[str]) -> Iterable[Tuple[str, str, str, str]]:
    for pn in port_names:
        p = pn.split(".", 2)
        if   len(p) > 2: yield p[0], p[1], p[2], pn  # e.g., "servo.first.dynamics"
        elif len(p) > 1: yield p[0],   "", p[1], pn  # e.g., "servo.dynamics" (no instance name)
        else:            yield   "",   "", p[0], pn  # e.g., "dynamics" (no service/instance name)

>>> pub = [
...     "airspeed.foo.differential_pressure",   # Service "airspeed", instance "foo"
...     "airspeed.foo.static_air_temperature",  # Service "airspeed", instance "foo"
...     "airspeed.bar.differential_pressure",   # Service "airspeed", instance "bar"
...     "servo.feedback",                       # Service "servo", anonymous instance (singleton)
...     "servo.status",                         # etc.
...     "servo.power",
...     "servo.dynamics",
... ]
>>> sub = [
...     "airspeed.bar.heater.state",            # Service "airspeed", instance "bar" (see above)
...     "servo.setpoint",                       # Service "servo", anonymous instance (see above)
...     "servo.readiness",
...     "unrelated.subscription",               # Application-specific or vendor-specific subject, non-standard
... ]
>>> srv = [
...     "unrelated.server",                     # Application-specific or vendor-specific server, non-standard
...     "standalone_server",                    # Not part of a service, non-standard
... ]
>>> result = detect_service_instances(pub=pub, sub=sub, srv=srv)
>>> list(result)
['airspeed', 'servo', 'unrelated', '']
>>> result["airspeed"]
{'foo': PortSuffixMapping(pub={'differential_pressure':  'airspeed.foo.differential_pressure',
                               'static_air_temperature': 'airspeed.foo.static_air_temperature'},
                          sub={},
                          cln={},
                          srv={}),
 'bar': PortSuffixMapping(pub={'differential_pressure': 'airspeed.bar.differential_pressure'},
                          sub={'heater.state':          'airspeed.bar.heater.state'},
                          cln={},
                          srv={})}
>>> result[""]
{'': PortSuffixMapping(pub={},
                       sub={},
                       cln={},
                       srv={'standalone_server': 'standalone_server'})}

In this example, we assume that the airspeed service defines the following ports:

Publisher differential_pressure of type uavcan.si.sample.pressure.Scalar
Publisher static_air_temperature of type uavcan.si.sample.temperature.Scalar

There are similar conventions for the servo service.
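
To make the convention concrete, here is a small sketch of how a detected instance could be checked against the airspeed definition above. The table AIRSPEED_PUBLISHERS and the helper missing_publishers are hypothetical names, and the post does not say how the PoC treats an instance that lacks one of the publishers, so the decision is left to the caller.

```python
# Suffix -> data type name, copied from the airspeed service definition above.
AIRSPEED_PUBLISHERS = {
    "differential_pressure": "uavcan.si.sample.pressure.Scalar",
    "static_air_temperature": "uavcan.si.sample.temperature.Scalar",
}


def missing_publishers(pub_suffixes, required=AIRSPEED_PUBLISHERS):
    """Return the required publisher suffixes that the instance does not offer."""
    return sorted(set(required) - set(pub_suffixes))
```

For the earlier example, the keys of result["airspeed"]["foo"].pub give an empty list, while those of result["airspeed"]["bar"].pub give ['static_air_temperature'].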

We instantiate service clients locally by picking arbitrary unoccupied port-IDs. The demo uses 6010…6019 for differential pressure, 6020…6029 for the static air temperature, 5000…5049 for servo dynamics, and so on. Naturally, such ranges do not need to be mentioned in the standard; leaving them out avoids encouraging poor design. Type safety is ensured by segregating different functions by ID ranges, although, in the case of manual configuration, the user is still responsible for setting the types correctly.
Having allocated the identifiers, we update the PortAssignment instance (see above) and update the remote node configuration accordingly. We also rewrite the cookie to indicate that the configuration is now installed.
At the last step, we issue uavcan.node.ExecuteCommand with COMMAND_STORE_PERSISTENT_STATES (in case the node does not implement an automatic update of the storage) and then COMMAND_RESTART (to apply the new settings).
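
The last three steps can be sketched as follows. The ranges are the ones quoted above and the command codes are the ones visible in the log (65530, then 65535); everything else is an assumption made for illustration, including the names RANGES, allocate, and finalize, and the execute_command callable that stands in for the actual service call and returns the response status.

```python
# Port-ID ranges per (service, suffix), as quoted in the text (upper bound exclusive here).
RANGES = {
    ("airspeed", "differential_pressure"): range(6010, 6020),
    ("airspeed", "static_air_temperature"): range(6020, 6030),
    ("servo", "dynamics"): range(5000, 5050),
}

# Command codes of uavcan.node.ExecuteCommand, matching the log output above.
COMMAND_STORE_PERSISTENT_STATES = 65530
COMMAND_RESTART = 65535


def allocate(service: str, suffix: str, occupied: set) -> int:
    """Pick the first unoccupied port-ID from the range of the given function."""
    for port_id in RANGES[(service, suffix)]:
        if port_id not in occupied:
            occupied.add(port_id)
            return port_id
    raise LookupError(f"No free port-ID left for {service}.{suffix}")


def finalize(execute_command) -> bool:
    """
    Store the new configuration, then restart the node to apply it.
    execute_command(code) is a hypothetical callable returning the response status.
    """
    for command in (COMMAND_STORE_PERSISTENT_STATES, COMMAND_RESTART):
        if execute_command(command) != 0:  # Status 0 means success.
            return False
    return True
```

Because each function draws from its own range, two subjects of different types cannot end up sharing a port-ID as long as every identifier comes from this allocator.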

Note that this particular demo does not handle indexed group command messages, such as ESC commands. While this is trivial to implement, I decided not to overcomplicate the PoC for now. If you find this direction sensible, then it might be better to start experimenting with the actual flight controller codebase rather than with Python scripts.

The advantage of this approach is that it is architecturally clean, unlike fixed port identifiers, and that it can automatically configure multiple instances of function nodes.

The downside of this approach is the added complexity on the flight controller (but not on the other nodes, which only have to implement the cookie register to support this). The Python PoC is a little under 400 lines of code. I expect that on an embedded system it would take about two thousand lines of C++.