Yukon design megathread

Hello. It’s very nice to see new people coming aboard. We evaluated React for the project and ended up using Vue instead, due to licensing and a slightly better developer experience.

Personally I have evaluated GraphQL as a solution for the project. The problems it’s designed to solve are:

  • Consuming data from multiple sources
  • Avoiding the need to reconstruct (map) objects in the frontend
    … through an extra layer of abstraction.

Any webapp of medium size is going to have some kind of relation projected over REST endpoints. That aside, you can tell just by looking at specific components that GraphQL is not going to provide a significant advantage (look at the Global Register View, for example – the mappings and computed properties are huge).

Same goes for the “simpler” components: the four that make up the home screen, for example. Yes, we could debate doing one call instead of four (one for each sub-component). Two of them are really just a couple of bytes, and merging the other two would not make much of a difference. It would also make it a bit harder to separate concerns between components, and would require logic for “splitting” the response inside the Vuex actions in order to update the correct module’s state.

I’m still open to using it if you find a proper use case, but for now I see no valid one.

I finally found some time to finish work on GRV.

Here are some notes from the devcall:

Soft deadline for async pyuavcan: end of June
Hard: end of July to mid-August

Next up to work on: Global Register View, file upload modal/popup thing

  • Merged Homescreen PR

@pavel.kirienko Sneak peek on how I am validating type edit form values:

I’m unsure of what float format you are using. Do you have a spec or a formula with which I can quickly calculate the min and max values for floats? (JS only has a ‘number’ data type, per IEEE 754.)

You seem to be missing a minus-one in the uint case. The upper boundary (assuming that it is inclusive) should be 2 ^ ret.bits - 1. In the int case, the lower boundary should be negated.
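A sketch of the corrected boundaries (in Python for brevity; `bits` stands in for `ret.bits`, and I’m assuming the usual two’s-complement ranges):

```python
def int_bounds(bits: int, signed: bool) -> tuple:
    """Inclusive (min, max) boundaries for an integer type of the given bit width."""
    if signed:
        # Two's complement: [-2**(bits-1), 2**(bits-1) - 1]
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    # Unsigned: [0, 2**bits - 1] -- note the minus one on the upper bound.
    return 0, 2 ** bits - 1
```

For example, `int_bounds(8, True)` gives (-128, 127) and `int_bounds(8, False)` gives (0, 255).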

For floats we use three formats defined in IEEE 754: binary16, binary32, and binary64. Their maximum values are defined as follows:

where frac constructs a rational number for exact computation; if you don’t require exactness, feel free to omit it.
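For reference, a sketch of that computation in Python, with `fractions.Fraction` standing in for `frac` and the standard IEEE 754-2008 parameters (mantissa bits, maximum exponent) for each format:

```python
from fractions import Fraction as frac

# (mantissa bits, maximum exponent) per IEEE 754-2008.
IEEE754_BINARY = {
    16: (10, 15),
    32: (23, 127),
    64: (52, 1023),
}

def float_max(bits: int) -> frac:
    """Largest finite value of the given IEEE 754 binary format, as an exact rational."""
    mantissa_bits, max_exponent = IEEE754_BINARY[bits]
    return (2 - frac(1, 2 ** mantissa_bits)) * 2 ** max_exponent
```

For example, `float_max(16)` evaluates to exactly 65504.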

The minimum values can be found by negating the maximums.

I am wondering though, would it not be easier to just obtain type information from PyDSDL instead of computing everything in JS? Just asking.


It’s just for computing some min/max stuff for form generation/input validation. We can swap it out if it turns out to be a problem.

GRV is progressing fast. I am adding a hover-over to view the full tree of each register.
Does a click-register-truncated-value-to-add-to-workset UX sound good?

Hey Luis, unless @Zarkopafilis has something in mind for you to help him with, may I suggest you contribute to PyUAVCAN instead? I am currently at the point where I need the media drivers for SLCAN, SocketCAN, and Python-CAN implemented. The new library has nothing in common with the old one (it actually has an architecture, for example), so the old drivers can’t be directly reused. If you have experience with asynchronous network programming, we could also use help with the socket driver for the UDP transport. I think the architecture of the transport layer is more or less solid at this point, so it should be safe to proceed with transport-specific implementations.

There is a plethora of things I could get some help with too. Here are some:

  • Implement the subscriber widget with logging
  • Integrate existing parts with server event sourcing
  • Implement plotter
  • Add more functionality in general

I would advise you to look through this thread from the start, so that you get some insight into the design choices and interactions. If you have questions, please post them here or on the corresponding GitHub PRs.

It would also be good to join the devcalls :slight_smile:


Here’s a preview of what’s up for GRV:



As you can see, I’ve finished the reconstruction of each type, the errors, and everything else we’ve ever chatted about.

@pavel.kirienko the actual updating of the registers remains stubbed, as I would like to confirm how we are going to approach it. Do you want the frontend to perform one request to the server with the register name and target node IDs, letting the backend iterate over each node and update it with the values, or the frontend to perform one request per targeted node?

I’d opt for the first choice, but we could design this better if retrying only the failed registers is required (I don’t think it matters much, though, because it wouldn’t take long to update the registers on all the selected nodes again, as a retry, with updated values). We could show a simple error message, or a list of the nodes that were not updated along with a failure reason (for the first failure, for all failures, or after, say, 5 consecutive register update failures – I don’t know what else).

What’s your opinion on that?


UI/UX, styling and layout will be improved as soon as the basic functionality is complete, of course.

Design considerations regarding tracking state updates. The state updates I am referring to are not at all realtime:

  • Node Software Update %
  • Node Restart
  • Register Update Progress
  • In general actions that are not immediately applied

Ideas so far:

  • On the specific NodeDetail widget, add a progress bar at the top signifying that node’s firmware update/restart status
  • On the HomeScreen, show a more compact version of said widget, one for each node undergoing procedures.
  • As I said in the previous post: in case of a register update failure, how is the error treated? Possible solutions would be to show “retry” buttons per failed node/register pair, retry everything, or remove successfully changed registers from the new temporary workset.

  • Do you want to automatically pre-fill the type value input fields with the provided value of each type? There could be optional “reset to defaults”, “clear”, “reset to original value(s)”, “set min”, and “set max” actions, but that would require special handling because different nodes may advertise different register parameterisation attribute values. It could prove handy for single-node configuration.
  • We need a way to show a truncated, cell-fitting value for each register value.
  • I need to come up with a usable and performant way to show expanded values on hover. Placing a hidden modal on each cell has a huge performance impact and is a ‘dumb’ solution.

Worked on the UI a lot and added some confirmations for file uploads and node restarts; registers can now be added, removed, and edited from the GRV.

My general opinion (feel free to challenge it) is that we should strive to map UAVCAN exchanges to REST exchanges as precisely as possible, avoiding unnecessary aggregations or other deviations unless there are strong reasons to. For example, in the case of bus monitoring and message subscription, we have to use a different exchange logic on the REST side due to performance issues, so that would be an exception. I don’t foresee any performance-related issues in the GRV, which means we should follow the exchange model of UAVCAN as closely as we can.

I think it does matter. The spec does not make assumptions about idempotency of register write operations, allowing applications much flexibility. If we follow the full-retry policy as you described, we would be implicitly assuming that register writes are idempotent. So I suggest we keep track of which ones have failed and then ask the user if we should retry. Would that be possible?
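A sketch of the bookkeeping this implies on the backend (hypothetical names; the real request/response shapes are not defined here):

```python
def write_registers(write_one, targets):
    """Attempt each (node_id, register, value) write and collect the failures,
    so the user can be asked before any retry (writes may not be idempotent)."""
    failed = []
    for node_id, register, value in targets:
        try:
            write_one(node_id, register, value)
        except Exception as error:
            failed.append({"node_id": node_id, "register": register, "reason": str(error)})
    return failed
```

The caller would present `failed` to the user and re-invoke with only those targets if a retry is confirmed.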

I would like to point out that the firmware update process is a bit hard to track because it inverts the control: once you requested a node to update its firmware, it takes over as the logical master of the following exchange:

  1. A unit U (say, Yukon, or an autopilot, companion computer, whatever) calls uavcan.node.ExecuteCommand on a node N requesting it to update its software. The request contains the name of the image file.
  2. The node N receives the request, sends a response, and (unless the request is for some reason denied) takes over from here. The unit U at this point is no longer in charge of the process.
  3. The node N repeatedly requests uavcan.file.Read on the unit U using the file name supplied in the first step until the file is transferred.
  4. The node N makes sure things are okay and starts executing the new software.

The inversion of control that occurs at the step 2 makes the progress tracking slightly non-trivial. You can easily detect only the beginning of the update process when the target node N responds with a confirmation saying that okay, it is going to update its firmware. Everything that follows can be observed only indirectly:

  • Monitor if the target node has restarted (but it’s not required to – theoretically, some nodes may implement some form of hot-swapping, updating themselves as they go).
  • Monitor the firmware file read requests. You can detect when the node whose ID matches N reads the specified firmware file. Knowing the size of the file you can determine how far along the update process the node is. However, it is not required to read the file sequentially, so this is a further complication.
  • Monitor diagnostic messages (uavcan.diagnostic.Record) from the node N. These are not formalized, so they are hard to track automatically (that is, unless we build some kind of weak AI capability into Yukon which would interpret the meaning of log messages :robot:)
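As a rough sketch of the second option (names are my assumptions, not pyuavcan API), progress can be estimated from the highest file offset the node has requested so far:

```python
def update_progress(observed_reads, file_size: int) -> float:
    """Estimate firmware update progress from observed uavcan.file.Read requests.

    observed_reads: iterable of (offset, length) pairs seen on the bus.
    Because the node is not required to read sequentially, this is only an estimate.
    """
    if file_size <= 0:
        return 0.0
    highest = max((offset + length for offset, length in observed_reads), default=0)
    return min(highest / file_size, 1.0)
```

For a 1024-byte image, two sequential 256-byte reads would report 0.5.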

Would it not be easier to leave this to the user, at least for now? The user (being capable of abstract thinking and understanding, seeing as it’s likely to be a human being) should be able to deduce what’s happening by looking at the available controls & indicators.

As for the other ongoing procedures, I think we could generalize this into a simple display, like a set of tiny indicators showing the currently pending service requests. Say, you send uavcan.node.ExecuteCommand to a node N, and its status display in Yukon lights up an indicator showing that the node is yet to respond to such and such request. When the node responds, the indicator disappears. If the request has timed out, the indicator turns red or something. This feature is by no means critical, and I weakly suggest deprioritizing it.

If unsure, let’s postpone this.


Just did that. Please review this very tentative draft here: https://docs.google.com/spreadsheets/d/1yP9zXChKTaIm92Bd60jgrOSwGIeXp-CO5tGyYcaBopg/edit?usp=sharing

The format seems like a no-brainer, but you never know. Basically, I am proposing to take Popcop (which is just HDLC without the unnecessary fields), reserve the frame format code 128 for our logging purposes, and encode transport frames into Popcop using the structure outlined in the linked table. The format should be equally suitable both for storing data on disk in log files and for transferring it over the wire. Since every frame is stored as a separate atomic data unit, the user will be able to split and concatenate log files trivially using cat and split directly. Thanks to the transparent channel offered by Popcop, frame boundaries can always be robustly located by looking for the byte 0x8E, since it is guaranteed never to occur inside a data record. The theoretical worst-case overhead would be just over 50% (that would be for a data frame consisting purely of 0x8E and/or 0x9E). Ordered log files can be queried for frames within a particular time interval in O(log n).
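The O(log n) interval query is just a binary search over the ordered timestamps; a sketch using Python’s `bisect`, assuming the frames have already been decoded into (timestamp, payload) pairs sorted by time:

```python
from bisect import bisect_left, bisect_right

def frames_in_interval(frames, t_start, t_end):
    """Return the frames whose timestamps fall within [t_start, t_end], inclusive.

    `frames` is a list of (timestamp, payload) tuples sorted by timestamp.
    """
    timestamps = [t for t, _ in frames]  # in practice, keep this index precomputed
    lo = bisect_left(timestamps, t_start)
    hi = bisect_right(timestamps, t_end)
    return frames[lo:hi]
```

Each query costs two binary searches plus the size of the result slice.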

(this is a 100-get)

Hi Pavel, I’m interested in helping with PyUAVCAN. I tried looking for a dev branch on PyUAVCAN’s GitHub repo but couldn’t find one. Could you point me to where this new version is being developed?

I’m not sure if it would be better. I’ve been trying to think of possible benefits of using GraphQL, for example using Nunavut to generate the schema (types, queries, and mutations) automatically from the DSDL. However, I’ve just started learning the UAVCAN spec, so I haven’t really thought through all the possible limitations/implications.

You are right: since we would need to declare all the fields in the query, this could be impractical for the GRV. However, queries could be generated automatically. I will think about this more to see if it is feasible.

Yes, it would be impractical to update the Vuex store from the GraphQL results. If we used GraphQL, it would be easier to use a client like Apollo instead of Vuex.

Ok, I will think more about those features and will reach out.

Ok, will do!

Totally possible.

That would do for now. I was thinking we could maybe check for the first non-file-request UAVCAN message with a source ID equal to that of the node that was previously updating, in order to determine its online status.

That would then require exporting everything again for Yukon usage. Exporting into a JS-friendly format would not make much sense either: you’d still have to parse it in order to generate forms. I know GraphQL supports ‘recursive’ calls in some sense, but I would not rely on that or give away flexibility. There is also a lot of hidden complexity with the versioned types and other things we have already discussed in this thread.

Vue supports recursive components and computed properties, so this is no problem. I encourage you to look into TypeInput and TypeEditForm to get some insight.

That would introduce the same complexity GraphQL would, without giving back many advantages. Vuex is also tightly integrated with Vue.

That’s also needed, but I was asking about the simpler format of copying a TypeValue. For now, the server-returned value is copied to the clipboard along with the _type_; it could be handy to also export it as YAML or CSV, or something else – everything is possible.

I took a look at the spreadsheet you posted; it looks OK. It seems that the gui_tool shows only CAN frames. I am thinking of a design that could show all the frame types without being visually noisy. Do you have any recommendations for that?

Just finished implementing the Server-Sent Events foundation. It turned out pretty clean.
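For the curious, the SSE wire format itself is trivial; here is a minimal event formatter (this follows the HTML spec’s wire format, not necessarily how the Yukon backend structures it):

```python
def format_sse(data: str, event=None) -> str:
    """Serialize one server-sent event per the HTML SSE wire format."""
    message = ""
    if event is not None:
        message += "event: " + event + "\n"
    # Multi-line payloads become multiple data: lines.
    for line in data.splitlines() or [""]:
        message += "data: " + line + "\n"
    return message + "\n"  # a blank line terminates the event
```

The server just keeps the HTTP response open and writes such chunks as state changes occur.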

The development is currently happening in my fork: https://github.com/pavel-kirienko/pyuavcan/tree/uavcan-v1.0. Seeing as you are interested, I should move it into the UAVCAN org. I will open tickets for coordination afterwards.

Online nodes can be detected by listening for uavcan.node.Heartbeat – every UAVCAN node is required to publish this message at least once per second. When the uptime value reported by this message leaps backward, we know that the node has restarted. This, however, is unlikely to help with the software update tracking since a node does not necessarily have to restart to apply the updates.
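A minimal sketch of the restart detector this describes, tracking the last seen uptime per node (class and method names are mine):

```python
from typing import Dict, Optional

class RestartDetector:
    """Tracks uavcan.node.Heartbeat uptimes per node; a backward leap means a restart."""

    def __init__(self) -> None:
        self._last_uptime: Dict[int, int] = {}

    def on_heartbeat(self, node_id: int, uptime: int) -> bool:
        """Feed each received heartbeat; returns True if this one indicates a restart."""
        previous: Optional[int] = self._last_uptime.get(node_id)
        self._last_uptime[node_id] = uptime
        return previous is not None and uptime < previous
```

As noted above, this catches restarts but not in-place updates, since those need no restart.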

You seem to be talking about a different feature. The format I described is supposed to store raw data frames without any higher-level data whatsoever; this is what I am suggesting for data log storage. If it ends up being accepted, we will eventually need to teach Yukon how to open such data logs and extract application-level information from them. For now this is not critical.

I do. Suppose there are the following columns in the log:

  1. monotonic timestamp
  2. UTC or local timestamp (see below)
  3. transport frame header
  4. priority level name per the spec (exceptional, immediate, fast, high, nominal, low, slow, optional)
  5. route specifier:
    • source node ID
    • destination node ID (empty for broadcast)
  6. data specifier:
    • if message: subject ID
    • if service call: service ID and request/response selector
    • data type name and version (the mapping from subject/service ID is to be defined by the user or auto-deduced by parsing the logs)
  7. transfer ID
  8. data in hex
  9. data in ASCII

If you are curious where “route specifier” and “data specifier” came from, read this: Alternative transport protocols – there is a diagram.

Obviously, UAVCAN may share the same medium/transport with other protocols, so it is expected that for some frames we will not be able to determine “route specifier” and “data specifier” because they have nothing to do with UAVCAN, so the respective columns will be left unpopulated.

Now, the column “transport frame header” is supposed to contain the low-level transport frame information in the raw form, completely unprocessed. Hence, its contents will be dependent on the kind of the transport protocol in use. Per the table I linked from my previous post, the values would be at least the following:

  • For CAN bus:
    • CAN ID
    • flags: extended or base frame, RTR frame, error frame, CAN FD frame
  • For UDP/IP (lots of data here):
    • source endpoint (IPv6/v4 address, MAC address, port)
    • destination endpoint
  • For IEEE 802.15.4:
    • source MAC address (64 bit)
    • destination MAC address
    • source short address (16 bit)
    • destination short address
    • RSSI indicator

For consistency, I suppose all that data should be squeezed into one column.

UTC/local timestamps are tricky. If you look closely at my proposed data log format, you will see that it only mentions a “monotonic timestamp”, which has no well-defined starting point; its only purpose is measuring time intervals between messages belonging to the same recording session (where recording sessions are separated by special records in which the transport ID is zero). It is important to stick to monotonic time because globally managed time such as UTC or GPS is not always available, yet we don’t want to pause logging if the time information turns out to be missing; additionally, such synchronized clocks may change rate or leap in order to stay synchronized, which is disastrous for logging.

So monotonic time is easy to record, but how do we convert it to a more usable global time? I suggest we extract the necessary information directly from the logged data. If your system synchronizes its time with UTC or GPS, this time information will eventually occur in the dump. We scan the dump looking for time synchronization messages. Having found one, we take the time and subtract the monotonic timestamp from it; this difference can then be applied across the whole log to determine the UTC/GPS time of any other logged frame.

Additionally, as a fallback option, the system that wrote the log file may inject additional UTC/GPS/whatever time records into the recorded log file using one of the reserved metadata entry formats, so that if the logged system turned out to exchange no time-related information, the time could still be extracted from these additional metadata entries. The worst special case is when the data log contains neither logged time information nor time metadata entries, in which case the corresponding time column should remain unpopulated unless the user enters the time difference manually.
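The offset recovery described above reduces to one subtraction and one addition; a sketch, assuming we have located a time-sync message carrying global time `t_utc_at_sync` at monotonic time `t_mono_at_sync`:

```python
def utc_offset(t_utc_at_sync: float, t_mono_at_sync: float) -> float:
    """Offset that converts any monotonic timestamp of the same session to UTC."""
    return t_utc_at_sync - t_mono_at_sync

def to_utc(monotonic_ts: float, offset: float) -> float:
    """Apply the recovered offset to any other frame's monotonic timestamp."""
    return monotonic_ts + offset
```

One recovered offset is valid for the whole recording session, since monotonic time does not leap within a session.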


Does that also mean that the node can keep sending all the messages it used to send while operating normally, even while its software is updating?

I will start to prototype on these suggestions and let you know.

The specification does not prohibit that so we should assume that it is possible. We can, however, detect when the update is over by looking at the field mode of the heartbeat message (which I forgot to mention earlier). While the update is in progress, mode is MODE_SOFTWARE_UPDATE; after completion it will be something else.
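A sketch of that check; the constant value 3 for MODE_SOFTWARE_UPDATE is my reading of uavcan.node.Heartbeat and should be verified against the DSDL definition:

```python
MODE_SOFTWARE_UPDATE = 3  # assumed value; confirm against uavcan.node.Heartbeat DSDL

def software_update_finished(previous_mode: int, current_mode: int) -> bool:
    """The update is over once the node's heartbeat mode leaves MODE_SOFTWARE_UPDATE."""
    return previous_mode == MODE_SOFTWARE_UPDATE and current_mode != MODE_SOFTWARE_UPDATE
```

Combined with the heartbeat tracking discussed earlier, this gives a clean end-of-update signal without assuming a restart.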


I guess we can cram things into the same column and introduce some colouring, or show more hover-over details in a separate window (bottom-left, or at the bottom, like Wireshark does)? Maybe colour the hex/ASCII data content-wise, but keep fixed colours for the standard features crammed into the same cell(s): e.g. source and destination MACs are always the same colour, the CAN ID is always a different colour, etc. What do you think about these usability/colouring ideas? Also, what kinds of command-line and Yukon UI filters do you think make the cut as common use cases?

More on the extra “window”: what kind of details would you want it to contain? I guess it should be a ‘plaintext’ component that changes with the selected row. It could also include doing some processing to find extra missing parts of the messages. An extra ‘header’ part in this component could also act as a placeholder for the hover-over information that multi-part column entries provide.

I think we should stick with this monotonic/relative time treatment and maybe introduce extra log metadata at the header of the file (I’m investigating to what extent this is possible // cancel that, plain text files cannot have header parts // I’m investigating this further) in order to keep the ability to process with standard tooling. That would require determining a merge policy for logs that have no matching metadata.