Sounds acceptable, but considering the previous feedback on the original DS-015 message set, we should wait until everyone is actually content with the draft before merging (and I don’t think everyone is currently), which may or may not take until after Oct 15. The point is to make a satisfactory message set, not meet a deadline, if that means ripping it up later on.
I’ll provide my own neutral review soon when I get a chance.
This objective is likely to be unreachable within a realistic time frame, which is why having a soft deadline is important. If one prefers a v0-style design, that is simply not going to happen regardless of the amount of time allocated for the debate.
The issues raised here: https://forum.opencyphal.org/t/meeting-minutes-july-21-2021-utc-udral-call/1365
and https://forum.opencyphal.org/t/uavcan-drone-application-layer-sig-guidelines/1280/16 provide a good summary of the requirements a UDRAL proposal should meet. The definitions in the PR are nothing but find/replace drone/udral on the DS-015 definitions. With no attempt to address the shortfalls identified in that message set, I see no reason to expect different feedback on them.
As I’ve mentioned before, I think we need to get the design agreed and the tooling in place before publishing message definitions. I don’t think we’re at that point yet.
I’d also suggest that as definitions are developed we leave them in a branch until they’re iterated, agreed, and stable. MAVLink has demonstrated many times that WIP messages have a habit of making their way into production systems, making it very difficult to iterate and change them. That’s why in MAVLink we’ve moved away from WIP messages in favour of development.xml (which is annoying, but a necessary compromise given the workflows in that project).
The proposed design addresses every significant issue raised on this forum so far, which I explained in the OP post. If you find this to be untrue, please, provide a detailed response instead of making vague references to past discussions. The UDRAL DSDL codebase did not see significant change compared to DS-015 because none was necessary to address the known issues.
Regarding the risk of publishing WIP types: I see your point, but I imagine that UAVCAN is reasonably shielded from this risk by virtue of incorporating a well-defined version number with every definition. One should not expect a data type with a version number of v0.x to be stable. Yet, having WIP types available in the main branch is helpful as it encourages early-stage experimental adoption.
Re WIP / versioning. WIP messages in mavlink were explicitly tagged as WIP in the xml. Didn’t stop people deploying them. I understand the intent to make prototyping easier, but we also need to protect users as much as we can. Any dev can build off a development branch, which in my view is a safer approach. Perhaps we could ask the SIG/consortium?
Re specific points, if you’d prefer I copy them in here, sure:
Andrew Tridgell (tridge), UAVCAN Consortium member representative:
yes, we should have simple ones that just reflect the real data. I don’t think we should add the “air_data_computer” unless there is going to be real hardware that will really be used that needs it.
we should just stop using covariance matrices. It just encourages developers to make up meaningless numbers. We shouldn’t be wasting bandwidth on stuff that is just made up.
A few guiding principles in message design:
- don’t add fields unless there is a real need for them
- don’t add fields that force the developer to make stuff up that they don’t really know
closer, but should remove a bunch of fields.
- I don’t think the timestamp really has value on this message. Timestamps have enormous value on messages like GNSS position and velocity, but on differential pressure used for airspeed I don’t think it is useful. The time of arrival is fine.
- remove both filter_delay and the filtered differential pressure. It would only make sense if we were greatly reducing the sample rate on the bus and I don’t think we are likely to be doing that.
- get rid of the variance, as the sensor is unlikely to really have a good measure of that
- only have one temperature
We should aim to get it down to a single CAN frame if possible.
but that doesn’t tell you what this reading actually is. It just says it is a difference between two pressures and a temperature (plus a pointless timestamp and covariance). We need it to be broadcast in a form that says “this is from a pitot tube, if you want to get a pitot based airspeed you can use this”.
To summarise, this PR does not address the fundamental problem:
Vadim has attempted to address the port identification/type safety issues, but the other significant rub points are untouched.
The DS-015/UDRAL messages assume high level functionality in each node, and this simply isn’t the reality for most of the systems that currently use UAVCAN. For some nodes, such as actuators or gimbals, this level of abstraction can be made to work. For many sensors, such as GNSS and other low level devices, it can’t without introducing risk. It is not reasonable to expect that this can be overcome by a one-off configuration by the integrator, or by transferring complexity to another system (ie the autopilot).
Developing a standard based on an idealised expectation of some future reality doesn’t make sense. Having the scope/flexibility within the standard to adapt to a future state is important, but for the standard to have any chance at success it needs to also be able to functionally replace v0 (particularly given that you’ve deprecated v0).
The reality is that if UAVCANv1/UDRAL doesn’t achieve that, it fails. With what you’ve presented here, it fails.
I provided a detailed review of the benefits of SOA in this post: https://forum.opencyphal.org/t/port-type-safety-enforcement/1303/73?u=pavel.kirienko. The bandwidth concerns (along with the derived issues, such as timestamping and covariance) appear to be unsubstantiated, as illustrated in the OP post.
As for the other points, they are already covered in the OP post, and I see little value in restating them again.
I agree and disagree with some of the points made here. Note that I also made some notes in the UDRAL planning document based on a compromise, and I’d like to see some of those decisions implemented here (I think they were quite reasonable).
Developing a standard based on an idealised expectation of some future reality doesn’t make sense. Having the scope/flexibility within the standard to adapt to a future state is important, but for the standard to have any chance at success it needs to also be able to functionally replace v0 (particularly given that you’ve deprecated v0).
The reality is that if UAVCANv1/UDRAL doesn’t achieve that, it fails. With what you’ve presented here, it fails.
Re flexibility: My original design draft aimed to create usable low level data types as well as high level services for each service class. The idea was to make it very practical to use low level types while providing a path to transition to high level services. Most of this is alright currently, but I believe some of the physics namespace types are meant for higher level services and don’t really promote the use of direct publishing.
we should just stop using covariance matrices. It just encourages developers to make up meaningless numbers. We shouldn’t be wasting bandwidth on stuff that is just made up.
I agree with this. Unfortunately no one has convinced me yet that they are useful enough (or useful at all) to justify the bandwidth they are taking, regardless of whether the bandwidth can be spared.
We should aim to get it down to a single CAN frame if possible.
This is not a necessary goal for a GNSS or air data frame. However, it is definitely a necessary goal for an ESC setpoint - something that I believe I outlined in the planning document which was not implemented in the message draft.
Developing a standard based on an idealised expectation of some future reality doesn’t make sense.
This is only partly true - standards tend to (and should) outlive specific implementations and design choices of the day. The best we can do is think hard about the current “best practices” in the hope they are somewhat future proof. I think UAVCANv1 does that correctly.
The bulk of the call was basically to-ing and fro-ing about “how do we use an architecture designed for high level distributed computing as a low level sensor network”.
We don’t - the idea (in my mind) was to provide a set of solid low level types that vendors can use temporarily without concern, while providing higher level abstract services to migrate to that are more practical in a more complex system.
For many sensors, such as GNSS and other low level devices, it can’t without introducing risk.
How so? I believe all the proven issues were taken care of by @VadimZ’s proposals.
In any case, it’s good to have a nice example fleshed out for a reasonable VTOL vehicle, and still have the bandwidth utilization for v1 on a 1Mbps CAN 2.0B bus be < 30%. I think the bandwidth for our (Volansi’s) vehicles might be higher, but on the other hand, we may split devices between two buses, and further, we’re planning to adopt CAN-FD as soon as we can, so it will be a non-issue either way.
I like the idea here - but the question of what vehicle to use as a standard remains. UDRAL currently takes up more bandwidth than I’d like on a 1Mbps CAN 2.0B bus, which is the reality of the hardware most of us are working on. For instance, my team develops an octocopter, and we’d really like to be able to fit that as well…
I think the approach of presenting abstract arguments (or pointers to past arguments) is unlikely to bring consensus here.
I would like to propose a two-part approach:
- Try reviewing transport, service discovery and tooling part (service registers, nunaweb, type signature etc) separately from message formats. Here the basic requirements are likely to be uncontroversial, so it should be possible to find common ground.
- When discussing the message formats, it would be more productive to focus on comparing specific, fully described alternatives of message implementation, before their abstract properties. It might also make sense to postpone this discussion until after some common ground is established on part 1.
@auturgy, are you seeing remaining rub points outside of message design (with several sub-issues such as granularity, nesting, timestamping and bandwidth optimization)?
That’s a good starting point. Shall we examine a GNSS + magnetometer device case in more detail? We are actually manufacturing those now, so I could meaningfully participate…
When discussing the message formats, it would be more productive to focus on comparing specific, fully described alternatives of message implementation, before their abstract properties.
Sure! Let me propose something simple below then, related to the actuator setpoint service, since it’s a very simple case to start out with:
Currently the generic setpoint messages look like this:
float16[<n>] value
@extent 16 * 256
I propose modifying it to use 14 bit integer setpoints (same as V0):
int14[<n>] value
@extent 16 * 256
The above has the benefit of halving bandwidth usage on quadcopters when using Classic CAN (7 byte payload * 8 bits / 4 setpoint values = 14 bits, fits in a single frame), which are likely one of the most common use cases. Octocopters also see a decrease in bandwidth usage. Since ESC setpoints are relatively heavy (high rate), this is a significant improvement, and also allows the integrator to potentially raise the ESC setpoint frequency. There’s also minimal loss in semantic meaning: they are still normalized/scaled ratiometric setpoints with sufficient resolution [-8192, 8191].
- Approve (14 bit scaled integer setpoint)
- Disapprove (keep the 16 bit float setpoint)
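To make the arithmetic behind the proposal concrete, here is a minimal sketch of packing four 14-bit setpoints into 7 bytes and counting Classic CAN frames. It is an illustration only: the function names are hypothetical, the bit layout is a simplified little-endian packing of a fixed-length array (no length prefix), and the frame model assumes 7 usable bytes per frame with a 2-byte CRC added only to multi-frame transfers.

```python
import math

# Assumptions (not taken from the PR): one tail byte per Classic CAN frame
# leaves 7 payload bytes, and multi-frame transfers carry a 2-byte CRC.
CLASSIC_CAN_PAYLOAD = 7
TRANSFER_CRC_BYTES = 2


def pack_int14(values):
    """Pack signed 14-bit integers back to back, least significant bits first."""
    bits = 0
    for i, v in enumerate(values):
        if not -8192 <= v <= 8191:
            raise ValueError(f"setpoint {v} does not fit in int14")
        bits |= (v & 0x3FFF) << (14 * i)  # two's complement, 14 bits wide
    nbytes = math.ceil(14 * len(values) / 8)
    return bits.to_bytes(nbytes, "little")


def unpack_int14(data, count):
    """Inverse of pack_int14: recover `count` signed 14-bit integers."""
    bits = int.from_bytes(data, "little")
    out = []
    for i in range(count):
        raw = (bits >> (14 * i)) & 0x3FFF
        out.append(raw - 0x4000 if raw & 0x2000 else raw)  # sign-extend
    return out


def frames_needed(payload_bytes):
    """Number of Classic CAN frames under the assumptions stated above."""
    if payload_bytes <= CLASSIC_CAN_PAYLOAD:
        return 1
    return math.ceil((payload_bytes + TRANSFER_CRC_BYTES) / CLASSIC_CAN_PAYLOAD)
```

Under these assumptions a quadcopter setpoint is 7 bytes with int14 (one frame) versus 8 bytes with float16, which spills into a second frame.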
I think I like this idea, though for slightly different reasons.
The change from floating point to fixed point would require explicit scaling parameter for non-ratiometric control variables (those with physical unit such as current or rotation speed). Having this extra parameter slightly increases configuration complexity, however it makes the quantization uniform, and more importantly, predictable and explicit.
FP16 would provide an illusion of infinite range, and then surprise with unexpected non-uniform quantization effects.
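The non-uniform quantization can be checked numerically. The sketch below (helper names are hypothetical) measures the gap between a positive float16 value and its next representable neighbour, and compares it with the constant step of a 14-bit fixed-point setpoint whose full scale is an explicit parameter.

```python
import struct


def f16_step(x):
    """Gap between positive finite x (rounded to float16) and the next float16."""
    (bits,) = struct.unpack("<H", struct.pack("<e", x))
    (cur,) = struct.unpack("<e", struct.pack("<H", bits))
    (nxt,) = struct.unpack("<e", struct.pack("<H", bits + 1))
    return nxt - cur


def int14_step(full_scale):
    """Uniform step of a signed 14-bit setpoint scaled to +/- full_scale."""
    return full_scale / 8192


if __name__ == "__main__":
    for x in (0.01, 1.0, 100.0, 1024.0):
        print(f"float16 step near {x}: {f16_step(x)}")
    print(f"int14 step for full scale 1.0: {int14_step(1.0)}")
```

The float16 step grows with magnitude, while the fixed-point step depends only on the chosen scale.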
I think I can live with the general high level design and defining services where each subject is mapped to a particular register and all the required pieces to generate code end to end.
The number of subjects per servo seems a bit excessive (4 publications, 2 subscriptions in the demo), but I suppose with appropriate tooling it could be tolerable.
Isn’t a reg.udral.physics.kinematics.translation.Linear.0.1 sent to each servo kind of excessive? What about having some flexibility in the services so that a manufacturer can expose what’s even supported in the first place.
Instead of having so much flexibility per service why not simply carry different services for variations that actually exist? Leaving things open to interpretation with multiple options and details buried in a comment essay seems like a good way to ensure every vendor will carry their own set of implementation quirks.
Do you understand the motivation for such design though, and do you recognize its validity (assuming that we are not discussing the most trivial applications)?
Observe that the complexity of the new design scales with the application. The most basic systems are unlikely to require, say, the feedback, status, and power subjects, which leaves only three, which is comparable to v0.
On a typical UAV, it probably is excessive, which is why there is a simpler alternative provided. The full kinematics message is intended for applications that require controlling the motion profile (such as limiting the acceleration).
We can certainly discuss the extra flexibility and feature reporting but right now I suggest focusing on the bare MVP.
This is partly due to the fact that UDRAL provides a much more detailed specification compared to v0, which also boosts the illusion of added complexity.
Before we solve a problem, let us ensure that it is actually there. Can we please model the subjects of a highly complex vehicle that would run out of bandwidth with UDRAL, and then optimize for that? I am not questioning the existence of such configurations, obviously, but I am questioning their relevance for UDRAL over Classic CAN.
So far, your proposal seems inferior to the current design in two points:
- It introduces extra complexity for non-ratiometric modes (requires additional scaling as Vadim wrote).
- It somewhat undermines the composability for the case where the actuator group contains one actuator (or a number of them commanded in lockstep) because there is no primitive-typed message for int14 while there is one for float16.
You’re saying the setpoint subject can be different types? I thought that had been settled in the type safety discussion, but perhaps I’ve misunderstood? How is that supposed to work safely? How do you even know which type of setpoint your servo can use on that subject? I feel like we’re going in circles.
Right now the overwhelming majority of vehicles in our ecosystem use dumb PWM servos with no feedback, status, etc. That’s the basic problem we’ve so far failed to solve and where I think we should be focused before anything else.
The type is chosen based on the configuration of the node, so the type safety is not affected:
Alternatively, there could be a separate port for the second type of setpoint.
indeed.
I’ve been looking through the battery, ESC and servo messages in the PR, and evaluating them against the needs of ArduPilot (and presumably PX4). I was going to do one post to summarize my findings for all 3, but it turns out to be much too long, so I will do battery first, then go onto ESCs and servos.
To facilitate this analysis I wrote a little hackish tool to allow me to ‘unwind’ a type from the DSDL in v1. This just automates what I did manually for my posts on GNSS and other messages on the DS-015 thread. Perhaps there was already a tool to do this (in which case a link would be appreciated!), but otherwise if anyone wants it my ugly hack tool is here:
http://uav.tridgell.net/tmp/showtype.py
Battery Messages
The README.md in reg/udral explains the battery service as the main example for udral structure. The first thing we notice is it puts the concept of how the battery will be used inside the node providing the service. I think this is a mistake. The example given shows a battery with registered names of “battery.primary” and “main_drive”. It is a bad idea for this “how it will be used” information to be inside the node itself as it increases the amount of information you need to configure in the nodes, and doesn’t lend itself to automatic configuration. I would much prefer that nodes just present an integer ID for a battery provided by the node and do all of the assignment in the flight controller. Putting the smarts in the node makes it much harder to mix technologies (eg. analog batteries, SMBus/I2C batteries, v0 batteries and others).
Adding a generic ability for a node to have an optional string label set by the user would be OK, but this shouldn’t really be linked into the data type.
Diving into the messages themselves, we have 3 top level messages for batteries.
energy_source reg.udral.physics.electricity.SourceTs (at between 1 and 100Hz)
status reg.udral.service.battery.Status (at around 1Hz)
parameters reg.udral.service.battery.Parameters (at around 0.2Hz)
Let’s expand the energy_source message using the showtype tool:
reg.udral.physics.electricity.SourceTs.0.1 length=23
timestamp: uavcan.time.SynchronizedTimestamp.1.0 length=7
uavcan.time.SynchronizedTimestamp.1.0 length=7
microsecond: truncated uint56 length=7
value: reg.udral.physics.electricity.Source.0.1 length=16
reg.udral.physics.electricity.Source.0.1 length=16
power: reg.udral.physics.electricity.Power.0.1 length=8
reg.udral.physics.electricity.Power.0.1 length=8
current: uavcan.si.unit.electric_current.Scalar.1.0 length=4
uavcan.si.unit.electric_current.Scalar.1.0 length=4
ampere: saturated float32 length=4
voltage: uavcan.si.unit.voltage.Scalar.1.0 length=4
uavcan.si.unit.voltage.Scalar.1.0 length=4
volt: saturated float32 length=4
energy: uavcan.si.unit.energy.Scalar.1.0 length=4
uavcan.si.unit.energy.Scalar.1.0 length=4
joule: saturated float32 length=4
full_energy: uavcan.si.unit.energy.Scalar.1.0 length=4
uavcan.si.unit.energy.Scalar.1.0 length=4
joule: saturated float32 length=4
The good news is this only has one timestamp, although I’d argue that even that is too many. I don’t think time-stamping this data is really useful, as the integration time for this sort of data is quite long and the transport delays won’t be significant.
In the expansion, we have current, voltage, energy and full_energy. Unfortunately there is no information on what to give if the node doesn’t have the information. Low end CAN battery monitors (where they just connect to a battery over an XT60 for example) don’t know the full_energy or energy. In that case it would know the amount of energy used so far, but not the full_energy, which means it could not fill in either field.
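For readers who want to reproduce this kind of expansion, here is a toy sketch of unwinding a nested type into its leaf fields. It is not the showtype tool: the type table is transcribed by hand from the SourceTs listing, the function names are hypothetical, and padding or alignment rules are ignored (every field here is a whole number of bytes).

```python
# Bit widths of the primitive types that appear in the listing.
PRIMITIVE_BITS = {"uint56": 56, "float32": 32}

# Hand-transcribed composition of the types, one (field, type) pair per field.
TYPES = {
    "uavcan.time.SynchronizedTimestamp.1.0": [("microsecond", "uint56")],
    "uavcan.si.unit.electric_current.Scalar.1.0": [("ampere", "float32")],
    "uavcan.si.unit.voltage.Scalar.1.0": [("volt", "float32")],
    "uavcan.si.unit.energy.Scalar.1.0": [("joule", "float32")],
    "reg.udral.physics.electricity.Power.0.1": [
        ("current", "uavcan.si.unit.electric_current.Scalar.1.0"),
        ("voltage", "uavcan.si.unit.voltage.Scalar.1.0"),
    ],
    "reg.udral.physics.electricity.Source.0.1": [
        ("power", "reg.udral.physics.electricity.Power.0.1"),
        ("energy", "uavcan.si.unit.energy.Scalar.1.0"),
        ("full_energy", "uavcan.si.unit.energy.Scalar.1.0"),
    ],
    "reg.udral.physics.electricity.SourceTs.0.1": [
        ("timestamp", "uavcan.time.SynchronizedTimestamp.1.0"),
        ("value", "reg.udral.physics.electricity.Source.0.1"),
    ],
}


def flatten(type_name, prefix=""):
    """Yield (dotted_field_path, primitive_type) for every leaf field."""
    for field, field_type in TYPES[type_name]:
        path = prefix + field
        if field_type in PRIMITIVE_BITS:
            yield path, field_type
        else:
            yield from flatten(field_type, path + ".")


def byte_length(type_name):
    """Total serialized size in bytes, ignoring padding and alignment."""
    return sum(PRIMITIVE_BITS[t] for _, t in flatten(type_name)) // 8


if __name__ == "__main__":
    root = "reg.udral.physics.electricity.SourceTs.0.1"
    for path, prim in flatten(root):
        print(f"{path}: {prim}")
    print("length =", byte_length(root))
```

Flattening SourceTs this way yields five leaf fields (one timestamp and four float32 values), which matches the 23-byte length in the listing.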
Now let’s look at the status message (sent at about 1Hz). Expanding the reg.udral.service.battery.Status message we see:
reg.udral.service.battery.Status.0.2 length=604
heartbeat: reg.udral.service.common.Heartbeat.0.1 length=2
reg.udral.service.common.Heartbeat.0.1 length=2
readiness: reg.udral.service.common.Readiness.0.1 length=1
reg.udral.service.common.Readiness.0.1 length=1
value: truncated uint2 length=1
health: uavcan.node.Health.1.0 length=1
uavcan.node.Health.1.0 length=1
value: saturated uint2 length=1
temperature_min_max: uavcan.si.unit.temperature.Scalar.1.0[2] length=8
uavcan.si.unit.temperature.Scalar.1.0 length=4
kelvin: saturated float32 length=4
available_charge: uavcan.si.unit.electric_charge.Scalar.1.0 length=4
uavcan.si.unit.electric_charge.Scalar.1.0 length=4
coulomb: saturated float32 length=4
error: reg.udral.service.battery.Error.0.1 length=1
reg.udral.service.battery.Error.0.1 length=1
value: saturated uint8 length=1
cell_voltages: saturated float16[<=255] length=2
The high value for total length (604 bytes) comes from the high maximum array size for cell_voltages. Supporting 255 cell batteries does seem like overkill to me, but maybe such sizes are coming.
Using coulombs for available charge seems like a poor choice when Ah or (better) Wh is much more commonly used. The conversion isn’t too hard though.
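For reference, the conversion is a division by 3600, since 1 Ah is 3600 C. A small sketch follows (function names are hypothetical); note that the Wh figure assumes a constant nominal voltage, so it is only an estimate.

```python
SECONDS_PER_HOUR = 3600.0


def coulomb_to_amp_hours(coulomb):
    """1 Ah = 1 A * 3600 s = 3600 C."""
    return coulomb / SECONDS_PER_HOUR


def coulomb_to_watt_hours(coulomb, nominal_voltage):
    """Rough energy estimate; assumes the voltage stays at its nominal value."""
    return coulomb * nominal_voltage / SECONDS_PER_HOUR


if __name__ == "__main__":
    # Example: 18000 C at a nominal 14.8 V.
    print(coulomb_to_amp_hours(18000.0))
    print(coulomb_to_watt_hours(18000.0, 14.8))
```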
Now the parameters message:
reg.udral.service.battery.Parameters.0.3 length=71
unique_id: truncated uint64 length=8
mass: uavcan.si.unit.mass.Scalar.1.0 length=4
uavcan.si.unit.mass.Scalar.1.0 length=4
kilogram: saturated float32 length=4
design_capacity: uavcan.si.unit.electric_charge.Scalar.1.0 length=4
uavcan.si.unit.electric_charge.Scalar.1.0 length=4
coulomb: saturated float32 length=4
design_cell_voltage_min_max: uavcan.si.unit.voltage.Scalar.1.0[2] length=8
uavcan.si.unit.voltage.Scalar.1.0 length=4
volt: saturated float32 length=4
discharge_current: uavcan.si.unit.electric_current.Scalar.1.0 length=4
uavcan.si.unit.electric_current.Scalar.1.0 length=4
ampere: saturated float32 length=4
discharge_current_burst: uavcan.si.unit.electric_current.Scalar.1.0 length=4
uavcan.si.unit.electric_current.Scalar.1.0 length=4
ampere: saturated float32 length=4
charge_current: uavcan.si.unit.electric_current.Scalar.1.0 length=4
uavcan.si.unit.electric_current.Scalar.1.0 length=4
ampere: saturated float32 length=4
charge_current_fast: uavcan.si.unit.electric_current.Scalar.1.0 length=4
uavcan.si.unit.electric_current.Scalar.1.0 length=4
ampere: saturated float32 length=4
charge_termination_threshold: uavcan.si.unit.electric_current.Scalar.1.0 length=4
uavcan.si.unit.electric_current.Scalar.1.0 length=4
ampere: saturated float32 length=4
charge_voltage: uavcan.si.unit.voltage.Scalar.1.0 length=4
uavcan.si.unit.voltage.Scalar.1.0 length=4
volt: saturated float32 length=4
cycle_count: saturated uint16 length=2
: void16 length=2
state_of_health_pct: saturated uint7 length=1
: void1 length=1
technology: reg.udral.service.battery.Technology.0.1 length=1
reg.udral.service.battery.Technology.0.1 length=1
value: saturated uint8 length=1
nominal_voltage: uavcan.si.unit.voltage.Scalar.1.0 length=4
uavcan.si.unit.voltage.Scalar.1.0 length=4
volt: saturated float32 length=4
A few oddities here. It has a design_cell_voltage_min_max, but not the number of cells. Are you supposed to wait for the status message and look at the array length? If that is the plan then it won’t work, as it is quite common for a smart battery to not be able to read cell voltages on all the cells. Common SMBus chips may (for example) support up to 8 cells but only be able to read cell voltages for 4 of the cells, but can then also give you a total voltage. In that case we can’t really fill things in correctly. We could calculate an average voltage for the remaining cells (ArduPilot can do this for mavlink), but it would be better to add a num_cells in Parameters.
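The averaging workaround mentioned above could look roughly like this. It is a sketch under stated assumptions, not ArduPilot's actual code: the function name is hypothetical, and it presumes the receiver already knows the cell count (for example from a num_cells field) and the total pack voltage.

```python
def fill_cell_voltages(total_voltage, measured, num_cells):
    """Return num_cells voltages: measured ones first, then an average for the rest.

    The unreadable cells are each assigned the remaining pack voltage divided
    evenly among them, which is only an approximation.
    """
    missing = num_cells - len(measured)
    if missing < 0:
        raise ValueError("more measured cells than num_cells")
    if missing == 0:
        return list(measured)
    average = (total_voltage - sum(measured)) / missing
    return list(measured) + [average] * missing


if __name__ == "__main__":
    # 8-cell pack where only the first 4 cells can be read individually.
    print(fill_cell_voltages(32.0, [4.0, 4.0, 4.0, 4.0], 8))
```

Without the cell count there is no way to compute `missing`, which is the gap the suggested num_cells field would close.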
The Parameters message lacks any model name string or manufacturer name. I’d very much like to log the manufacturer info and serial number. It does have a unique_id, but I think a manufacturer string is worthwhile. I suspect that most UAVCAN battery monitors will backend onto SMBus battery systems, which do offer a manufacturer name and also a manufacture date. As the age of a battery can be quite important (for maintenance schedules at least) we really should include the date as well.
It is also worth comparing this battery message set to the closest equivalent in MAVLink, which is BATTERY_STATUS and SMART_BATTERY_INFO.
I think the first thing you notice is it is much easier to read and understand the mavlink XML than the dsdl. The deep nesting in the dsdl really gets in the way of making this understandable. Once you unwind the dsdl you find that the mavlink and dsdl messages are similar. The dsdl supports more cells (mavlink tops out at 14 cells, but could be extended). MAVLink uses a current_consumed instead of energy_remaining, which better fits adapters that don’t know the full capacity of the battery.
Overall the proposed battery messages are not terrible. There are some issues, but not awful. The ESC and servos messages have much bigger issues.
Using coulombs for available charge seems like a poor choice when Ah or (better) Wh is much more commonly used. The conversion isn’t too hard though.
Just regarding this, I know it’s strange, and I thought so as well at first, but I think the minor inconvenience is worth the added consistency of keeping A * s as per SI. And using this doesn’t hurt the data (clipping or transport bandwidth costs) so it should be fine.
Other than that, I agree on your review of the battery services for the most part. I don’t feel I’m well versed enough in this topic to provide a thorough review though.
Also, just wanted to mention again that this tool exists (written by @bbworld1) : https://nunaweb.uavcan.org/api/storage/docs/docs/reg/index.html
Not quite the same as the tool you wrote, but probably beneficial nonetheless.
ESC Messages
Following on with my dive into the proposed battery, ESC and servo messages, here is what I got from looking at the ESC dsdl. The ESC messages are a much more complex design than the battery messages, and really show some of the design flaws in the type system for UAVCAN v1.
Before we get into the messages themselves we need to first off discuss ‘groups’ in the documentation for ESCs in UDRAL. The doc says:
# ESCs (drives) are segregated into groups. Each ESC in a group has an index that is unique within the group.
# Drives in a group are commanded synchronously by publishing a message containing an array of setpoints.
ok, so having groups could make sense. We don’t need it for ArduPilot (we have a flat namespace of actuator numbering), but I could see some systems using it. The problem is I cannot at all see how the group you are trying to control is communicated. There is no group ID field in any of the messages that I can see, so I wondered if it was a top level concept I’d missed in the UAVCAN_Specification_v1.0.pdf, so I checked there and while the word ‘group’ is used for quite a few different things (eg. a byte is a group of 8 bits), there is no use of group in there that makes sense with this ESC usage of group.
Can someone explain how groups work?
Now on to the messages to command the ESCs to spin. There are two sets of messages,
reg.udral.service.actuator.common.sp.* and
reg.udral.service.common.Readiness.
The wildcard reg.udral.service.actuator.common.sp.* is really odd. It comes from the way zero truncation in UAVCAN v1 is handled. In UAVCAN v0 it is this simple array in 1030.RawCommand:
int14[<=20] cmd
In the proposed UDRAL message the recipient needs to subscribe to Vector31.0.1.uavcan which is this:
float16[31] value
but the sender (typically the flight controller) has to choose which of the following messages to send based on the number of actuators:
Scalar.0.1.uavcan
Vector2.0.1.uavcan
Vector3.0.1.uavcan
Vector4.0.1.uavcan
Vector6.0.1.uavcan
Vector8.0.1.uavcan
Vector31.0.1.uavcan
(all of these are down inside the reg.udral.service.actuator.common.sp.* name space).
This then takes advantage of the zero truncation: when the ESC subscribes to Vector31 and the flight controller sends Vector6, the ESC gets zeros in the later elements.
This way of handling arrays of ESC commands is very awkward, and requires that the flight controller have code for sending a bunch of different messages for the same purpose, presumably using a set of if statements to work out the threshold.
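The threshold logic described above would amount to something like the following sketch. The function names are hypothetical; only the list of available sizes comes from the type names in the namespace.

```python
# Element counts of the setpoint types listed in reg.udral.service.actuator.common.sp.
SP_VECTOR_SIZES = (1, 2, 3, 4, 6, 8, 31)


def select_setpoint_type(num_actuators):
    """Pick the smallest setpoint type that can hold num_actuators values."""
    if num_actuators < 1:
        raise ValueError("need at least one actuator")
    for size in SP_VECTOR_SIZES:
        if num_actuators <= size:
            return "Scalar" if size == 1 else f"Vector{size}"
    raise ValueError("more actuators than the largest setpoint type supports")


def setpoint_payload_bytes(num_actuators):
    """Payload size of the chosen type: float16 elements, 2 bytes each."""
    name = select_setpoint_type(num_actuators)
    size = 1 if name == "Scalar" else int(name[len("Vector"):])
    return 2 * size


if __name__ == "__main__":
    for n in (1, 4, 5, 8, 9):
        print(n, select_setpoint_type(n), setpoint_payload_bytes(n))
```

A lookup table keeps this short, but the sender still needs a separate serializer for each of the seven types.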
It is also inefficient if the ID number of the ESCs you’re trying to address doesn’t start with zero. For example, you may want your ESCs on a quadplane to be ID numbers 5 to 8 (which is the default for ArduPilot quadplanes). With this structure the flight controller has to send a Vector8, with the first 4 elements as zero. It would be much better to have a bitmask at the start of message saying which ESCs you are wanting to address. That would scale nicely to a larger number of ESCs. As it stands, bandwidth usage will depend a lot on the IDs you choose.
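To illustrate the difference, the sketch below compares a dense float16 array against a hypothetical bitmask layout for ESCs at zero-based indices 4 to 7. The masked format (a 32-bit mask followed by one float16 per set bit) is an assumption made up for this comparison, not something defined in UDRAL or v0.

```python
import struct


def encode_dense(setpoints_by_index):
    """Dense layout: one float16 per index from 0 up to the highest one used."""
    highest = max(setpoints_by_index)
    values = [setpoints_by_index.get(i, 0.0) for i in range(highest + 1)]
    return struct.pack(f"<{len(values)}e", *values)


def encode_masked(setpoints_by_index):
    """Hypothetical layout: uint32 bitmask, then float16 values for set bits only."""
    mask = 0
    for index in setpoints_by_index:
        if not 0 <= index < 32:
            raise ValueError("index does not fit in a 32-bit mask")
        mask |= 1 << index
    values = [setpoints_by_index[i] for i in sorted(setpoints_by_index)]
    return struct.pack(f"<I{len(values)}e", mask, *values)


if __name__ == "__main__":
    quadplane = {4: 0.5, 5: 0.5, 6: 0.5, 7: 0.5}
    print("dense bytes:", len(encode_dense(quadplane)))
    print("masked bytes:", len(encode_masked(quadplane)))
```

With this particular mask width the saving only appears when the indices do not start at zero; for indices 0 to 3 the dense form is smaller, so the mask size would need to be chosen with care.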
Combined with these sp messages, we have reg.udral.service.common.Readiness, which puts the ESC into a ready state of SLEEP, STANDBY or ENGAGED. Having a way to change the readiness state could be useful, although I haven’t missed not having it in v0.
Next we look at the ESC feedback messages. There are 4 separate messages for ESC feedback, following along with the philosophy in UAVCAN v1 of breaking things down into small pieces. The 4 feedback messages are:
feedback reg.udral.service.actuator.common.Feedback
status reg.udral.service.actuator.common.Status
power reg.udral.physics.electricity.PowerTs
dynamics reg.udral.physics.dynamics.rotation.PlanarTs
Let’s look at these messages. First the feedback message:
reg.udral.service.actuator.common.Feedback.0.1 length=67
heartbeat: reg.udral.service.common.Heartbeat.0.1 length=2
reg.udral.service.common.Heartbeat.0.1 length=2
readiness: reg.udral.service.common.Readiness.0.1 length=1
reg.udral.service.common.Readiness.0.1 length=1
value: truncated uint2 length=1
health: uavcan.node.Health.1.0 length=1
uavcan.node.Health.1.0 length=1
value: saturated uint2 length=1
demand_factor_pct: saturated int8 length=1
now the status message:
reg.udral.service.actuator.common.Status.0.1 length=67
motor_temperature: uavcan.si.unit.temperature.Scalar.1.0 length=4
uavcan.si.unit.temperature.Scalar.1.0 length=4
kelvin: saturated float32 length=4
controller_temperature: uavcan.si.unit.temperature.Scalar.1.0 length=4
uavcan.si.unit.temperature.Scalar.1.0 length=4
kelvin: saturated float32 length=4
error_count: saturated uint32 length=4
fault_flags: reg.udral.service.actuator.common.FaultFlags.0.1 length=2
reg.udral.service.actuator.common.FaultFlags.0.1 length=2
overload: saturated bool length=1
voltage: saturated bool length=1
motor_temperature: saturated bool length=1
controller_temperature: saturated bool length=1
velocity: saturated bool length=1
mechanical: saturated bool length=1
vibration: saturated bool length=1
configuration: saturated bool length=1
control_mode: saturated bool length=1
: void6 length=1
other: saturated bool length=1
the power message:
reg.udral.physics.electricity.PowerTs.0.1 length=15
timestamp: uavcan.time.SynchronizedTimestamp.1.0 length=7
uavcan.time.SynchronizedTimestamp.1.0 length=7
microsecond: truncated uint56 length=7
value: reg.udral.physics.electricity.Power.0.1 length=8
reg.udral.physics.electricity.Power.0.1 length=8
current: uavcan.si.unit.electric_current.Scalar.1.0 length=4
uavcan.si.unit.electric_current.Scalar.1.0 length=4
ampere: saturated float32 length=4
voltage: uavcan.si.unit.voltage.Scalar.1.0 length=4
uavcan.si.unit.voltage.Scalar.1.0 length=4
volt: saturated float32 length=4
and finally the dynamics message:
reg.udral.physics.dynamics.rotation.PlanarTs.0.1 length=23
timestamp: uavcan.time.SynchronizedTimestamp.1.0 length=7
uavcan.time.SynchronizedTimestamp.1.0 length=7
microsecond: truncated uint56 length=7
value: reg.udral.physics.dynamics.rotation.Planar.0.1 length=16
reg.udral.physics.dynamics.rotation.Planar.0.1 length=16
kinematics: reg.udral.physics.kinematics.rotation.Planar.0.1 length=12
reg.udral.physics.kinematics.rotation.Planar.0.1 length=12
angular_position: uavcan.si.unit.angle.Scalar.1.0 length=4
uavcan.si.unit.angle.Scalar.1.0 length=4
radian: saturated float32 length=4
angular_velocity: uavcan.si.unit.angular_velocity.Scalar.1.0 length=4
uavcan.si.unit.angular_velocity.Scalar.1.0 length=4
radian_per_second: saturated float32 length=4
angular_acceleration: uavcan.si.unit.angular_acceleration.Scalar.1.0 length=4
uavcan.si.unit.angular_acceleration.Scalar.1.0 length=4
radian_per_second_per_second: saturated float32 length=4
torque: uavcan.si.unit.torque.Scalar.1.0 length=4
uavcan.si.unit.torque.Scalar.1.0 length=4
newton_meter: saturated float32 length=4
This “lots of small pieces” approach is really bad for several reasons:
- the flight controller will want to combine the data from these messages at a single point in time in order to make decisions about whether an ESC is in a truly healthy state, or if something may be going wrong. This is needed both for in-flight diagnostics and for accurate logging and analysis of a failure.
- it is much less efficient on the wire than just putting all the info in one message, and ESC messages are sent at a high rate. When complex vehicles are using CAN ESCs, bxCAN bandwidth is at a premium, so breaking things up like this is a poor use of resources.
- aligning the messages using the timestamp (for dynamics and power) makes it a lot more complex for the flight controller and for log analysis
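To illustrate the alignment burden in the last point, here is a minimal sketch of what a flight controller or log analysis tool would need to do to pair dynamics and power samples by timestamp. The function names, the (timestamp_us, value) tuple layout and the 5 ms tolerance are all my own assumptions for illustration and are not part of UDRAL.

```python
from bisect import bisect_left


def nearest_sample(timestamps_us, target_us, tolerance_us):
    """Return the index of the sample closest to target_us.

    timestamps_us must be sorted ascending. Returns None if the list is
    empty or the closest sample is further away than tolerance_us.
    """
    if not timestamps_us:
        return None
    pos = bisect_left(timestamps_us, target_us)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = [i for i in (pos - 1, pos) if 0 <= i < len(timestamps_us)]
    best = min(candidates, key=lambda i: abs(timestamps_us[i] - target_us))
    if abs(timestamps_us[best] - target_us) > tolerance_us:
        return None
    return best


def align(dynamics, power, tolerance_us=5_000):
    """Pair each dynamics sample with the nearest power sample in time.

    Both inputs are lists of (timestamp_us, value) tuples sorted by time.
    A dynamics sample with no power sample within tolerance gets None.
    """
    power_ts = [t for t, _ in power]
    pairs = []
    for t, dyn in dynamics:
        i = nearest_sample(power_ts, t, tolerance_us)
        pairs.append((t, dyn, power[i][1] if i is not None else None))
    return pairs
```

With a single combined message none of this matching (or the handling of the unmatched case) would be needed.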
Breaking things down into lots of messages is a really good example of ‘the tail wagging the dog’. UAVCAN v1 starts off by defining a huge pile of base SI unit types, and now wants some way to use them all to justify their existence. Pavel has repeatedly said this is good as it makes for more re-use, but it is all topsy-turvy. In UAVCAN v0 we have fewer than 50 messages in total that are actually used in ArduPilot (PX4 has considerably fewer). In UAVCAN v1 I count 207 messages, and we’ve only got battery, ESC and servo so far. So the ‘reuse’ approach has greatly increased the number of messages.
It would be vastly simpler and more efficient to use flat message definitions and not do this level of abstraction. Adding abstraction layers and nesting is not a good end goal in and of itself.
Ok, back to the message fields. Most ESCs won’t be able to fill in the angular_position and angular_acceleration fields, and could only give angular_velocity. What they can normally report is an eRPM, which can’t be converted to radian/second without knowing the number of poles. This message set would mean that you would have to set the number of poles on each ESC, rather than setting it centrally on the flight controller. Until you do that you won’t get RPM logging, as the ESC node would have no way to fill it in. I’d much prefer to report an eRPM and also report a number of poles, where zero is used if not known. That would allow ESCs to work out of the box, giving a much better experience for end users while still allowing an RPM to be displayed and (if someone really wanted it) a value in radian/second.
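As a rough sketch of the conversion involved, assuming the pole count is the number of magnetic poles (so pole pairs = poles / 2) and that zero means "not known" as suggested above; the function name is hypothetical:

```python
import math


def erpm_to_rad_per_s(erpm, pole_count):
    """Convert electrical RPM to mechanical angular velocity in rad/s.

    pole_count is the number of magnetic poles (not pole pairs).
    Returns None when pole_count is zero (unknown), since the
    conversion cannot be done without it.
    """
    if pole_count <= 0:
        return None
    mechanical_rpm = erpm / (pole_count / 2)
    return mechanical_rpm * 2 * math.pi / 60
```

The point is that this conversion can happen wherever the pole count is known. If the ESC reports eRPM plus a pole count, the flight controller can still log eRPM when the pole count is zero.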
Servos
The servo messages, like ESC messages, are in two parts. The first is the actuator command, for which there are two versions:
reg.udral.physics.dynamics.rotation.Planar
reg.udral.physics.dynamics.translation.Linear
Oddly, there are both timestamped and non-timestamped versions of each of these messages. It isn’t clear what the purpose is of sending a timestamp in an actuation command. Is the servo supposed to not act upon the command until the timestamp has passed?
The Planar message (which is likely to be the most common type of servo) looks like this:
reg.udral.physics.dynamics.rotation.Planar.0.1 length=16
kinematics: reg.udral.physics.kinematics.rotation.Planar.0.1 length=12
reg.udral.physics.kinematics.rotation.Planar.0.1 length=12
angular_position: uavcan.si.unit.angle.Scalar.1.0 length=4
uavcan.si.unit.angle.Scalar.1.0 length=4
radian: saturated float32 length=4
angular_velocity: uavcan.si.unit.angular_velocity.Scalar.1.0 length=4
uavcan.si.unit.angular_velocity.Scalar.1.0 length=4
radian_per_second: saturated float32 length=4
angular_acceleration: uavcan.si.unit.angular_acceleration.Scalar.1.0 length=4
uavcan.si.unit.angular_acceleration.Scalar.1.0 length=4
radian_per_second_per_second: saturated float32 length=4
torque: uavcan.si.unit.torque.Scalar.1.0 length=4
uavcan.si.unit.torque.Scalar.1.0 length=4
newton_meter: saturated float32 length=4
The Linear message is as follows:
reg.udral.physics.dynamics.translation.Linear.0.1 length=16
kinematics: reg.udral.physics.kinematics.translation.Linear.0.1 length=12
reg.udral.physics.kinematics.translation.Linear.0.1 length=12
position: uavcan.si.unit.length.Scalar.1.0 length=4
uavcan.si.unit.length.Scalar.1.0 length=4
meter: saturated float32 length=4
velocity: uavcan.si.unit.velocity.Scalar.1.0 length=4
uavcan.si.unit.velocity.Scalar.1.0 length=4
meter_per_second: saturated float32 length=4
acceleration: uavcan.si.unit.acceleration.Scalar.1.0 length=4
uavcan.si.unit.acceleration.Scalar.1.0 length=4
meter_per_second_per_second: saturated float32 length=4
force: uavcan.si.unit.force.Scalar.1.0 length=4
uavcan.si.unit.force.Scalar.1.0 length=4
newton: saturated float32 length=4
The way these work is that you send NaN for the fields you don’t want. The vast majority of servos will only want angular_position in radians. One unfortunate consequence of the use of SI units is that it won’t be possible to use this message without the CAN node knowing the angle the servo can go through. I think the only practical thing to do when a CAN node is being used to interface to a non-CAN (eg. PWM) servo is to default to an assumption of 90 degrees range, and allow for a parameter to set the actual range. Native CAN servos should know their true angular range.
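A minimal sketch of both halves of this follows: building a position-only Planar command with NaN in the unused fields, and mapping the commanded angle to a PWM pulse on the node. I am assuming the 16 byte payload is four little-endian float32 values in declaration order, which is consistent with the lengths listed above. The function names, the 1000 to 2000 microsecond PWM endpoints and the range parameter are illustrative assumptions, not something taken from the UDRAL definitions.

```python
import math
import struct

NAN = float("nan")


def planar_position_command(angle_rad):
    """Pack a Planar setpoint with only angular_position set.

    Assumed layout: angular_position, angular_velocity,
    angular_acceleration, torque as little-endian float32.
    """
    return struct.pack("<4f", angle_rad, NAN, NAN, NAN)


def angle_to_pwm_us(angle_rad, range_deg=90.0, min_us=1000.0, max_us=2000.0):
    """Map an angle centred on zero to a PWM pulse width in microseconds.

    range_deg is the total travel of the servo (default 90 degrees,
    overridable by a parameter). Output is clamped to the endpoints.
    """
    half_range = math.radians(range_deg) / 2
    fraction = (angle_rad + half_range) / (2 * half_range)
    fraction = min(1.0, max(0.0, fraction))
    return min_us + fraction * (max_us - min_us)
```

If the range parameter is wrong for the attached servo, the same radian command produces a different physical deflection, which is why the default matters for PWM adapters.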
It is unfortunate how bandwidth-inefficient these messages are. In UAVCAN v0 each Command was 3 bytes, which is reasonably efficient for something that is typically sent at 50 to 100 Hz. In the proposed UDRAL it is 16 bytes per servo. That is a huge increase in bandwidth requirement and is just not practical for something like a complex quadplane with UAVCAN for both ESCs and servos.
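To put rough numbers on that, here is a back-of-the-envelope sketch. The payload sizes (3 and 16 bytes) and the rates are the ones quoted above. The frame count is an assumption on my part: Classic CAN with 7 usable payload bytes per frame (one byte of each frame taken by the tail byte) and a 2 byte transfer CRC added to multi-frame transfers. It ignores CAN framing overhead and any packing of several commands into one transfer.

```python
import math


def classic_can_frames(payload_bytes):
    """Frames needed for one transfer on Classic CAN.

    Assumes 7 usable payload bytes per frame and a 2-byte CRC
    appended to transfers that do not fit in a single frame.
    """
    if payload_bytes <= 7:
        return 1
    return math.ceil((payload_bytes + 2) / 7)


def payload_bytes_per_second(payload_bytes, rate_hz, count=1):
    """Raw payload throughput for `count` actuators at `rate_hz`."""
    return payload_bytes * rate_hz * count


for name, size in (("v0 Command", 3), ("UDRAL Planar", 16)):
    frames = classic_can_frames(size)
    rate = payload_bytes_per_second(size, rate_hz=100)
    print(f"{name}: {frames} frame(s) per update, {rate} payload bytes/s at 100 Hz")
```

Under those assumptions the 16 byte command needs three frames per servo per update where the 3 byte one fits in a single frame.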
The servo message set also offers an alternative approach which is based on the reg.udral.service.actuator.common.sp.* messages, just like ESCs. Unfortunately that suffers from all the problems I mentioned above for ESCs.
Overall I think these ESC and servo messages are a backwards step from what we have now with UAVCAN v0, which is surprising, as I think the v0 messages are not great.
Conclusion
Which brings me to the summary from having looked through these 3 sample message sets (battery, ESC and servo). I think it shows that the lessons I tried to make clear in my DS-015 analysis have really not been learnt. There would be a lot of pain in moving from v0 to v1 for users, for developers and for vendors. To balance that pain there would have to be really big gains in what we get out of the protocol. So far all I’m seeing is pain (and loss of functionality) and no gains. So there is absolutely no motivation to adopt UAVCAN v1 with UDRAL.
For ArduPilot to adopt v1, Pavel will need to have a major change of heart, and give up some of the things he’s been adamant on. I’ve suggested many ways to do that over the past months. I’m sure there are other approaches, but to just keep insisting that all is well and that the detractors are just exaggerating the problems is not going to get us to a solution.
That is even without the really big issue of a clear migration path from v0 to v1. We would have to develop capabilities messages so we can automatically work out which of the new features we can use in a mixed network (eg. bit rate, message sets). Without that technical underpinning for the migration path, v1 is dead in the water.