Port type safety enforcement

Perhaps we should spend some time now analyzing the possible failure modes at configuration time (due to autoconfiguration, user error, hotplug failures, etc.) and addressing them in our design before we move forward. Maybe that will make others feel more comfortable?

This sounds like we’ve looped back to the start point.

I’m available essentially any time tomorrow night. I would prefer a slightly earlier time than last time though.

I don’t necessarily see this as a problem of showing a strong preference towards the UAV use case. It is true that there is now a rigid divide between fixed and non-fixed messages - or rather regulated and unregulated messages, in tridge’s new proposal. However, I don’t see anything specific about this proposal that would disadvantage using UAVCAN for other purposes.

I think an interesting point was made about port-IDs - that they are not necessarily 1:1 with subject+node ID. The primary weakness of tridge’s proposal in my opinion is that it ties a single subject - and further than that, a single type - directly to a single port ID; as Pavel mentioned, however, a port ID in UAVCANv1 could be considered to be more consistent with a topic name in another system. This is a drastically different concept from v0’s data type IDs. I think perhaps this is a mismatch in expectations that would be best resolved during the call.

I disagree; for one, we now understand each other a lot better, and there are more concrete examples on both sides. I know it appears like we aren’t getting anywhere, but I do feel that we were getting closer to resolving something last week - I’d like to have another call to see if we can get anywhere.

looks like you didn’t read my previous messages in this topic. I am in no way insisting that magnetic field readings only be used for navigation. Even in v0 the 1002.MagneticFieldStrength2 message is not in any way tied to being used for navigation. In fact, I’ve used it for non-navigation tasks with existing vehicles and ArduPilot.
Please understand that my proposals on both @semantic_id and @regulated_id are not at all about the purpose of the data, they are about the semantics. Semantics is quite different from purpose.
Once you understand that then you should see that using @semantic_id or @regulated_id in no way reduces the utility of the protocol for composability.

arrgh! no!!!
A rangefinder has no way of knowing the altitude. This is like your earlier push for an airspeed sensor in DS-015 to publish a calibrated airspeed. The rangefinder cannot know the altitude and a differential pressure sensor cannot know the airspeed.
In both v0 and v1 you would publish a “range” from the rangefinder. You can’t turn that into an altitude without having a lot more information. You need to know:

  • the angle of the sensor
  • the mount point on the vehicle
  • what type of altitude you want (AMSL, AGL?) by applying terrain data appropriately

Creating a message for a rangefinder that publishes an altitude is just silly and should never be done, no matter if it is v0, v1 or any other protocol you’d like to name.

no, you can’t even use it for AGL … there just isn’t enough information

no! That is not how you do it.
We have a per-rangefinder orientation in ArduPilot. You assign rangefinders to slots in our AP_RangeFinder library, and each is associated with its own orientation. Then you can do sane queries on the rangefinder library like “give me the range of the first healthy rangefinder that has an orientation of pitch 270 degrees”. That is exactly what our API does right now with v0.
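The slot-and-orientation query described above could be sketched roughly as follows. This is only an illustration in Python, not ArduPilot's actual AP_RangeFinder API: the `RangefinderSlot` class, its fields, and `first_healthy_range` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RangefinderSlot:
    # All field names are hypothetical; they only mirror the idea in the post.
    instance_id: int   # which physical sensor feeds this slot
    pitch_deg: int     # mounting orientation, configured in the autopilot
    healthy: bool      # result of the autopilot's own health checks
    range_m: float     # last raw range reported by the sensor


def first_healthy_range(slots: List[RangefinderSlot], pitch_deg: int) -> Optional[float]:
    """Return the range of the first healthy slot with the requested pitch."""
    for slot in slots:
        if slot.healthy and slot.pitch_deg == pitch_deg:
            return slot.range_m
    return None  # no healthy rangefinder with that orientation


slots = [
    RangefinderSlot(instance_id=0, pitch_deg=270, healthy=False, range_m=0.0),
    RangefinderSlot(instance_id=1, pitch_deg=270, healthy=True, range_m=3.2),
    RangefinderSlot(instance_id=2, pitch_deg=0, healthy=True, range_m=12.5),
]
```

With these example slots, a query for pitch 270 skips the unhealthy instance 0 and returns the range from instance 1. The orientation lives entirely in the autopilot's table; the sensor only reports a raw range.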

but we want the association of a particular rangefinder instance with an orientation to be in the autopilot logic, not in the network protocol routing. They both carry exactly the same data and achieve the same goal, but having an instance ID fits how we structure the code perfectly and fits with all our existing serial, i2c and UAVCAN v0 supported sensors.
I know you want to push all that logic into the network. That is not what ArduPilot or px4 want, and it is just counter productive.

no, it actually completely ruins the usefulness of the protocol. That part is what led to the horrors of DS-015. We want the individual elements to be tightly coupled into a single message so we know all the data comes from a single point in time. We don’t want the fuel flow of the EFI message to come separately from the throttle level, because we really do care that we get a single data point that has the throttle and fuel flow from the same sample of the EFI system.
You could do that with your proposal of lots of separate subjects only if you added a timestamp to every single component so you can line them up. That is enormously wasteful and just a major pain to re-assemble into a coherent set of one-point-in-time data on the receiving end.
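To make the reassembly cost concrete, here is a minimal sketch of what a receiver would have to do if throttle and fuel flow arrived on separate subjects, each carrying its own timestamp. The `Sample` type, the `align` function, and the tolerance value are assumptions made for this illustration; they are not part of any UAVCAN library.

```python
from typing import List, NamedTuple, Tuple


class Sample(NamedTuple):
    timestamp_us: int  # hypothetical per-message timestamp in microseconds
    value: float


def align(throttle: List[Sample], fuel_flow: List[Sample],
          tolerance_us: int) -> List[Tuple[int, float, float]]:
    """Pair each throttle sample with the closest fuel-flow sample in time.

    Samples with no partner within the tolerance are dropped.
    """
    pairs = []
    for t in throttle:
        best = min(fuel_flow,
                   key=lambda f: abs(f.timestamp_us - t.timestamp_us),
                   default=None)
        if best is not None and abs(best.timestamp_us - t.timestamp_us) <= tolerance_us:
            pairs.append((t.timestamp_us, t.value, best.value))
    return pairs


throttle = [Sample(1000, 0.40), Sample(2000, 0.55)]
fuel_flow = [Sample(1010, 1.8), Sample(2600, 2.4)]
pairs = align(throttle, fuel_flow, tolerance_us=100)
```

In this example only the first throttle sample finds a fuel-flow partner within 100 microseconds; the second is discarded. A single combined message avoids both the matching step and the lost samples.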

absolute rot. You can use a button in v0 just as generically as you can use a button in v1. You keep conflating semantics and purpose, they are not the same thing.

nope. We just will not use v1, or we’ll use a fork of it with sane design.
Your long message just convinces me that you really haven’t listened or thought about the issues I’ve been raising for weeks at all.

Thank you @pavel.kirienko for reiterating on the design-intent behind not having identifiers enclosed within the message. I feel that @tridge has two main pain points which are configuration at startup and network debugging tooling.

@pavel.kirienko What possible solutions can you think of to solve Andrew’s concerns?
@tridge What, fixed-IDs aside, would solve your pain points? Do you have other pain points (Probably to be discussed in a separate thread)?

It may appear this way but I have the distinct impression that progress is being made. In the Japanese martial arts progress is likened to ascending a spiral staircase. You are progressing from basic to ever more advanced techniques until you arrive at the basics again. But your understanding of basics is deeper this time so you are able to perceive and appreciate them with a new clarity which leads to additional understanding not previously possible.

The bottom line is: From the top it appears as if you are going in circles. Only when viewed from the (in)side can you see that despite seemingly going in circles you also move upwards :bowing_man: .

It appears to me that you (Andrew, as acting spokesperson for ArduPilot) do not want to end up with a UAVCAN that requires ArduPilot to adapt its existing implementation because it’s inconvenient. I get that. In fact you’ve got my sympathy. But I think you are seeing this too narrowly. UAVCAN V1 will take a while to happen, and the transition will probably take longer than anyone wishes to admit. Therefore there will be plenty of time to accommodate those changes.

I suggest letting the representative(s) of PX4 speak for themselves. Perhaps other developers at ArduPilot even share a slightly different view?

Finally, let’s keep in mind that all of us are human beings, with different technical and cultural backgrounds, some even using English as a secondary language. Misunderstandings are bound to happen (Trust me on this, I’m married to a foreigner :wink:). This can only be countered by adopting an open mindset, really considering different viewpoints, asking questions for clarification and giving everyone the benefit of the doubt.

Network debugging is trivial to solve. All we have to do is think about how we do protocol analysis a bit differently. As @pavel.kirienko stated, using wireshark to monitor a V1 network is like using it to decipher the contents of a ROS/DDS transfer. Wireshark is a transport layer analyzer, while message contents are an application layer problem. To view network traffic in something like yukon (WIP), you could configure it with a table associating each port ID with the data type on that port. With autoconfiguration, such a table could even be exported automatically by the allocation authority and imported directly into the tool. Note that it only has to be done once - if the system configuration does not change, the port IDs won’t either. This is, to an extent, how yakut works already - although a convenient table is not (yet) implemented.
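A port-ID table of the kind described above could look something like this. It is a minimal sketch only: the subject-IDs, topic names, and type names are invented for the example, and neither yukon nor yakut is claimed to use this structure.

```python
# Hypothetical table: subject-ID -> (topic name, data type name).
# In practice it might be exported by the allocation authority and
# imported by the analysis tool.
PORT_TABLE = {
    1200: ("rangefinder.down", "example.Range.1.0"),
    1201: ("efi.status", "example.EFIStatus.1.0"),
}


def describe(subject_id: int) -> str:
    """Render a human-readable label for a captured transfer's subject-ID."""
    entry = PORT_TABLE.get(subject_id)
    if entry is None:
        # The tool can still show the raw transfer, just without decoding it.
        return f"subject {subject_id}: unknown (not in table)"
    topic, dtype = entry
    return f"subject {subject_id}: {topic} [{dtype}]"
```

The lookup for an unlisted subject-ID is the failure case under discussion: the transfer is still visible at the transport layer, but the tool cannot decode its contents until the table is supplied.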

@tridge did have at least one other concern about the general robustness of not being able to know the packet semantics by just looking at each message payload. This is where I highly recommend we do an analysis of the real failure modes rather than resort to high level speculation like “this is not robust” or “this is too complex”.

I wanted to mention one more theme I’m seeing here:

The concern of “this is how it’s currently implemented in ardupilot/px4” seems to be coming up often, especially with ardupilot. Now, don’t get me wrong: I am not trying to undervalue/underappreciate the importance of ardupilot, px4, or the possible difficulty of implementing V1 services in either. However, keep in mind that ardupilot and PX4 are just single implementors of a larger protocol which is supposed to be standard across a much wider domain. Let us be cautious about what our design priorities are - the network should be designed to serve the network/standard architecture’s best interests, not the interests of a particular implementation.

The thing about the way Pavel wants to do things is that it is based on a false premise. It assumes that semantics == purpose, and then assumes that by adding all this extra complexity of going via remote registry lookups for regulated data types we gain some great new capability. The fact is we don’t. Composability and re-use are just as easy with the @regulated_id way as with the registry, but the regulated_id is much, much simpler.

please remember that the UI tool is not just for developers - it is a user tool for solving problems in the field. What you are describing is complex and has failure modes that just aren’t justifiable. We want this tool to be as simple and robust as we can make it.
Having the regulated_id in the packet is dirt simple, can’t go wrong, and gives us exactly the behavior we want for regulated message sets where we want to guarantee that all the vendors are using the same DSDL definitions.

It doesn’t matter how the data that the subscriber is receiving is used inside the subscriber.

This part of my example makes assumptions about service orientation; let’s omit it, as it is not relevant in this context. The next example under the one you referenced assumes that the sensor is to publish raw range. How do you solve that using your approach? I don’t think a sensible solution exists.

The way you treat data inside ArduPilot matters little because different implementations are expected to utilize the protocol differently. What you described is an okay approach except for the fact that you have to pollute your models with instance-ID and force the flight controller to process data it may not need (e.g., how do I connect a rangefinder whose readings are utilized by another node that is not the flight controller?). You can fix the way you handle sensor readings by merely replacing sensor-ID with subject-ID.

I don’t care how it fits your code. I care about sensible architecture that is not a tangled-up mess of different layers.

You can obtain the exact same results by merely swapping sensor-ID with port-ID in your code. Doing this will actually reduce the logical coupling between your application and the network layer.
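As a rough illustration of the swap suggested above, the autopilot-side table can be keyed by the subject-ID a reading arrived on instead of by an instance-ID carried in the payload. The subject-IDs and the `on_range_message` handler are hypothetical and do not correspond to any real ArduPilot or PyUAVCAN interface.

```python
from typing import Dict

# Hypothetical autopilot configuration: subject-ID -> mounting pitch in degrees.
ORIENTATION_BY_SUBJECT = {1200: 270, 1201: 0}


def on_range_message(subject_id: int, range_m: float,
                     latest: Dict[int, float]) -> Dict[int, float]:
    """Store the newest range per orientation, keyed by the arrival subject."""
    pitch = ORIENTATION_BY_SUBJECT.get(subject_id)
    if pitch is not None:
        latest[pitch] = range_m
    # Subjects this node was not configured for are simply not processed.
    return latest


latest = on_range_message(1200, 3.2, {})
latest = on_range_message(999, 1.0, latest)  # not in the table, ignored
```

The payload carries only the raw range. Which sensor it came from is implied by the subject the node subscribed to, so a node that does not need a given rangefinder never subscribes to it.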

Generally speaking, bandwidth concerns are grossly overstated. You end up over-optimizing messages like global position or EFI status which only take about 1% of bus load, resulting in bad architecture with no tangible benefits for the network utilization. Regardless, you only need timestamping for reassembly if the data validity period is less than the publication period, which is not the case with applications we are discussing at the moment.

How exactly? By adding a sensor-ID? If you actually read my posts about this you should see how this is a bad idea.

I don’t think you are really receptive to the values I am trying to communicate here. In my last post I asked you a very practical question about how you would address the specific use case presented in the PyUAVCAN demo, but you ignored it. Can you look into it now? Can you provide an equally hands-on illustration of how your proposal doesn’t break in the scenarios I have outlined above?

Hi all,

I’ve been following this topic for some time already. Lots of the arguments stated here do make sense; however, it seems that neither proposal can work with the other, so a compromise seems hard to reach right now.

So let’s start with the fixed vs non-fixed port-ID discussion here. When I started implementing UAVCANv1 on my systems I really disliked the idea of not having an easy plug-and-play experience anymore, and the requirement for configuration by the integrator, which would complicate a simple drone setup.

I’ve had some discussion with @pavel.kirienko about this and he was adamant that a fixed port-ID wouldn’t be possible and would break the very thing he believes UAVCANv1 improves over UAVCANv0, which is the focus on isolated and composable network services in a democratic distributed network.

The focus of UAVCANv1 gives a completely different take on distributed networking compared to UAVCANv0. Where in UAVCANv0 you had fixed port-IDs for messages, now an integrator has to set this up manually, which from an end-user perspective would currently be a huge downgrade compared to UAVCANv0.

I don’t think Pavel is ever going to advocate fixed port-IDs, thus using the uavcan.register/cookie approach seems to be a compromise to get an easy plug-and-play-like experience on UAVCANv1 setup.

Regarding robustness and safety this might not be the best solution, but then the idea of a system integrator who sets up the network would be a robust solution. However, tooling (e.g. Yukon) is going to be very important for a good end-user experience.

Hi all!
I’m a new poster to this thread, though I’ve interacted with some of you before. Trying to avoid what seems to me like a looming fork in the project, I’d like to bring up a compromise proposal. This proposal addresses a major disagreement point - subject-id assignment. Some additional points such as the desired granularity of messages and the means to increase type safety are left out for now and can be discussed additionally.

  • suggestion
    • Let’s introduce a standard file format to keep a mapping between subject-id and a message type (and instance-id). Let’s provisionally call it “subject-id allocation format”
    • The UDRAL working group would create and maintain an instance of this file. Let’s call this “UDRAL subject-id allocation”. Other working groups could choose to fork this file, coordinate compatible extension of it or choose a different allocation approach altogether.
    • This “UDRAL subject-id allocation” file would be used by both the diagnostic tools (log analyzers, field diagnostics) and “network managing” components of PX4, ArduPilot, and other central nodes. Such use can be either static (code generation) or dynamic.
    • During system powerup, the “network manager” would configure the subject-id for each peripheral instance according to this allocation. This should be a relatively straightforward extension of the current dynamic node id allocation.
    • As the subject-ids would be well defined within the boundaries of UDRAL compatible networks, dispatch and network monitoring tooling can depend on knowing them. Recent versions can even be pulled from a repository together with DSDLs.
    • The manual configuration remains an option. Even then, allocation file as a source of unambiguous documentation makes the process easier and the resulting system compatible with tooling.
  • tradeoffs
    • There will be at least one more register round-trip per message during initialization to check or configure the subject-id (as compared with v0)
    • Transferring peripherals between uncoordinated application domains would cause unintelligible messages on powerup and before initialization is complete. For example, removing a servo from a CNC machine and putting it on a UAV, assuming the UDRAL and (hypothetical) CNC working groups do not coordinate their allocations.
    • Multiple network managers can not share a network unless they use coordinated allocations (for example: unifying the two groups mentioned above to make a flying CNC machine would require two networks with a bridge node)
    • The presence of such allocation guidelines could encourage some extremely near-sighted peripheral vendors to hardcode the value instead of implementing a configuration interface.
  • summary
    • “subject-id allocation format” makes the subject-id allocation issue explicit and provides tools to express the design, manage coordination within and even between working groups (if deemed worthwhile). Having this information out in machine-readable format allows adequate tooling to be made.
    • the tradeoffs are probably mild, easily mitigated, and worth the additional expressiveness gained
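The proposal above can be sketched as follows: an allocation table maps (message type, instance-id) to a subject-id, and a network manager writes the allocated value into each peripheral's configuration. Everything here is an assumption made for illustration, including the `Allocation` and `Peripheral` classes, the type names, the subject-id values, and the `pub.subject_id` register key; no file format or register name has been agreed in this thread.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Allocation:
    subject_id: int
    message_type: str
    instance_id: int


# Hypothetical contents of a "UDRAL subject-id allocation" file.
UDRAL_ALLOCATION = [
    Allocation(1200, "example.Range.1.0", 0),
    Allocation(1201, "example.Range.1.0", 1),
    Allocation(1300, "example.EscStatus.1.0", 0),
]


@dataclass
class Peripheral:
    message_type: str
    instance_id: int
    registers: Dict[str, int] = field(default_factory=dict)  # stand-in for remote registers


def configure(peripherals: List[Peripheral],
              allocation: List[Allocation]) -> List[Peripheral]:
    """Write allocated subject-ids; return peripherals with no allocation."""
    index = {(a.message_type, a.instance_id): a.subject_id for a in allocation}
    unallocated = []
    for p in peripherals:
        subject_id = index.get((p.message_type, p.instance_id))
        if subject_id is None:
            unallocated.append(p)  # left for manual configuration
        else:
            p.registers["pub.subject_id"] = subject_id  # hypothetical register name
    return unallocated


sensors = [Peripheral("example.Range.1.0", 1), Peripheral("example.Button.1.0", 0)]
left_over = configure(sensors, UDRAL_ALLOCATION)
```

In this example the second rangefinder receives subject-id 1201, while the button, which the table does not mention, is returned for manual configuration. The same table could be handed to a diagnostic tool to label captured traffic.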

This works from my perspective. Other than the “network manager” I believe this is quite similar to a few ideas I’ve casually tried to propose at various times.

@VadimZ thanks for the idea. Generally speaking this works for me, although I wanted to clarify a couple things:

  • The port ID mapping should be maintained separately from the DSDLs (I think you mentioned this)
  • IMO the port IDs should be for specific topic names + instances, not message types, as this allows you to reuse a message type for multiple orthogonal publications.

One thing I don’t understand is that if the “network manager” is autoconfiguring subject IDs from a table, what benefits are achieved by creating such a table beforehand rather than having the network manager pull them out of a hat? Is it that the autopilot could still choose to verify that the configured port ID conformed to the standard table?

This allows network analyzers to work out of the box. By having access to this table they can now parse the logs completely.

This is indeed very similar to what @dagar proposed earlier; as I commented near the beginning of this thread, this is not incompatible with UAVCAN v1’s architecture so it is a perfectly viable path forward.

I think it’s worth mentioning that such an allocation table does not necessarily have to be attached to the UDRAL standard, although I am not strongly opposed to that. The allocation tables could be defined on a per-system basis (e.g., there could be a default one for ArduPilot, which could be modified by some major adopters with large fleets) with the same benefits available to the users: diagnostic tools would still be able to parse the data correctly if they are provided with the appropriate allocation file. In fact, I would even say that per-project allocation files are necessary because the standard cannot foresee all valid topic configurations yet they have to be provided to the diagnostic tools (e.g., what if I add a button whose data is consumed by my ESC? no such thing is mentioned in the standard, nor should it be).

Such per-system configurations run the risk of subject collisions that occur when a piece of equipment is transferred from one system to another — Vadim mentioned that in the context of cross-standard equipment migration. This risk is to be mitigated at the initial configuration/preflight check stage as discussed here earlier.

Should we call these files “port mapping files” or “launch files”?

This doesn’t need to happen at system powerup though. Subject configuration is not dynamic, it is to happen at the configuration stage. Once a vehicle is configured, there is no need to repeat the process at every bootup, it wouldn’t be possible anyway in the presence of manually configured subjects. I think you mean to say that “during system powerup” the network manager is to check the existing configuration but not to alter it (unless we are talking about the auto-configuration idea whose PoC I shared here earlier).
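The check-but-do-not-alter behavior at powerup could be sketched like this. The port names, the subject-id values, and the `check_configuration` function are hypothetical; how the reported values would actually be read from the nodes is left out.

```python
from typing import Dict, Optional, Tuple


def check_configuration(
    expected: Dict[str, int],
    reported: Dict[str, int],
) -> Dict[str, Tuple[int, Optional[int]]]:
    """Compare stored subject-ids against the expected ones without writing.

    Returns {port name: (expected id, reported id or None)} for each mismatch.
    """
    problems = {}
    for name, want in expected.items():
        got = reported.get(name)
        if got != want:
            problems[name] = (want, got)
    return problems


expected = {"rangefinder.down": 1200, "esc.status": 1300}
reported = {"rangefinder.down": 1200, "esc.status": 1301}
problems = check_configuration(expected, reported)
```

An empty result means the vehicle's existing configuration matches the expected one. A non-empty result, as for the ESC port in this example, would be surfaced as a preflight failure rather than silently corrected.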

If we are to go ahead and define such file format, it should be aligned with the standard register naming patterns. Perhaps it could be a very stripped-down subset of YAML in order to make it directly consumable by the existing orchestration MVP I described here: https://pyuavcan.readthedocs.io/en/stable/pages/demo.html#orchestration

As I see it, not attaching the allocation to UDRAL would counteract the entire point of offering predictable standardization across the entire industry domain, which is the main benefit of such “fixed” ID ranges in the first place.

I agree. In my proposal, the representation format and the tooling support are generic across all industry domains.
The allocation itself is of course domain specific.

Yes, “to check all, and to configure those modules that are set up for auto-configuration.”

There might be two allocation files in such a project, UDRAL.allocs for standard IDs, and a supplementary project.allocs for project-specific ones (which I think would not be needed in the majority of small projects).
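Combining a standard allocation with a project-specific supplement might look like the sketch below, which also catches the subject collisions discussed earlier in the thread. The dictionaries stand in for the parsed contents of the two files; the topic names and subject-id values are invented, and no `.allocs` file syntax is assumed.

```python
from typing import Dict


def merge_allocations(standard: Dict[int, str],
                      project: Dict[int, str]) -> Dict[int, str]:
    """Merge two subject-id -> topic maps, refusing conflicting claims."""
    merged = dict(standard)
    for subject_id, topic in project.items():
        if subject_id in merged and merged[subject_id] != topic:
            raise ValueError(
                f"subject-ID {subject_id} claimed by both "
                f"{merged[subject_id]!r} and {topic!r}"
            )
        merged[subject_id] = topic
    return merged


# Stand-ins for the parsed UDRAL.allocs and project.allocs files.
udral_allocs = {1200: "rangefinder.down", 1300: "esc.status"}
project_allocs = {2000: "button.to_esc"}
merged = merge_allocations(udral_allocs, project_allocs)
```

Rejecting a conflicting entry at merge time moves the collision from the bus to the configuration stage, where it can be reported before any node is powered.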

Yes, right. What I am saying is that any non-trivial project will be certain to use custom allocations, which means that freezing the allocation at the standard level is not expected to add much value beyond the most trivial applications.

Let us shelve this discussion though because I think it is now clear that one way or another we will find an agreeable solution to this problem by utilizing these Vadimfiles. We have been stalling the progress long enough so with the fragile near-consensus we have reached in this discussion we should move on to the other parts of the standard. Unless there are objections, I would suggest we move on to define the first UDRAL network service starting with something relatively basic, like, perhaps, ESC or servo. @coder_kalyan and @bbworld1 have already done some work on the design document so perhaps we could leverage that. Those who reside in compatible timezones are welcome to join the dev call tomorrow to discuss this; those who can’t join shouldn’t be worried because all of the relevant context will be also provided here on this forum.

I suggest we avoid discussing UDRAL network services in this thread, that would be off-topic. Let us start a new one.