Proposed new UAVCAN repo and pypi distribution

scottdixon · February 25, 2019, 6:31am

While wiring up hpp generation in my branch of libuavcan I kept trying to keep the repo focused on the C++ source, but I ended up writing just as much build logic and configuration to manage the Python “dsdl_compiler”. Because of this I’m now convinced that the best design for libuavcan and for the UAVCAN project is to create a generic ~~dsdl-compiler~~ dsdl-code-generator as its own repository and to pull it back in as a PyPI dependency of libuavcan.

As such I propose “pydsdlgen” which I’ve roughed out here:

GitHub - thirtytwobits/nunavut: Generate Code from DSDL using pydsdl and jinja2

While pydsdlgen is somewhat language-agnostic, it will initially provide only C++ syntax helpers in the template DSL it defines. However, it should be easy to develop additional syntax helpers and escaping extensions, which would allow this script to generate code for additional languages.

Below is a quick dependency diagram showing how a program could use pydsdlgen by depending on it and providing it a set of dsdl types and Jinja2 templates.

In the case of libuavcan-based software the dependencies would look like this:

This modularity should fix the currently painful experience applications have when trying to use custom DSDL with libuavcan v1. It could even allow applications to develop their own header templates to use with libuavcan v2, maximizing their ability to generate highly optimized code for resource-constrained systems. Finally, this proposal relieves the C++ libuavcan repo from having to build, test, and distribute a Python application.
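As a rough illustration of the template-driven generation described above, here is a minimal sketch in Python. It is not the pydsdlgen implementation: the standard library's `string.Template` stands in for Jinja2 so the snippet has no external dependencies, and `EXAMPLE_TYPE`, `HEADER_TEMPLATE`, and `render_header` are hypothetical names invented for this example. In the proposed tool the type description would come from pydsdl rather than a hand-written dictionary.

```python
from string import Template

# Hypothetical stand-in for a parsed DSDL type. In the proposed tool this
# information would come from pydsdl, not from a hand-written dict.
EXAMPLE_TYPE = {
    "namespace": ["example", "demo"],
    "name": "Status",
    "fields": [("std::uint32_t", "uptime"), ("std::uint8_t", "health")],
}

# Assumption: string.Template is used here only as a dependency-free
# stand-in for a Jinja2 template supplied by the application.
HEADER_TEMPLATE = Template(
    "#pragma once\n"
    "#include <cstdint>\n"
    "\n"
    "namespace $namespace {\n"
    "struct $name {\n"
    "$fields"
    "};\n"
    "}  // namespace $namespace\n"
)


def render_header(dsdl_type, template=HEADER_TEMPLATE):
    """Render one C++ header from a type description and a template."""
    # One indented member declaration per field, in declaration order.
    fields = "".join(
        f"    {ctype} {fname};\n" for ctype, fname in dsdl_type["fields"]
    )
    return template.substitute(
        namespace="::".join(dsdl_type["namespace"]),
        name=dsdl_type["name"],
        fields=fields,
    )
```

Calling `render_header(EXAMPLE_TYPE)` returns the header text as a string. Because the template is just a parameter, an application could pass its own template without touching the generator, which is the separation this proposal is after.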

If this proposal is accepted I will create the pydsdlgen repo in the UAVCAN GitHub organization and hook up Travis, Coveralls, and PyPI for it.

pavel.kirienko · February 25, 2019, 8:54am

Are we certain this is not partly because of the decision to use Jinja2 as an external dependency? Without Jinja, could the transpiler not be free of external dependencies?

I am observing that you’ve chosen to refer to PyDSDL via PyPI rather than including it as a git submodule. Why? The submodule-based approach would allow you to freeze a particular version of the library and avoid many hassles with dependency management.

Also, how do we reconcile such separation with Libuavcan repo diet?

scottdixon · February 25, 2019, 4:38pm

No. This is something that bugged me about libuavcan from the start and that I found to be problematic when integrating dsdl-to-cpp into production systems that are designed to separate out interface generation/validation from applications.

I’m not completely against pulling it in as a submodule, but my thinking is “in for a penny, in for a pound”: if we’re going to use PyPI then we should use it. I certainly don’t want to pull in pytest, pytest-cov, coveralls, and mypy, let alone Jinja2 and markupsafe, all as submodules.

Same as above with “in for a penny…”: mono-repos are only simple if they are actually a single repo. If we’re playing the modularity game then we should play it well.

pavel.kirienko · February 25, 2019, 7:47pm

The most troubling thing here is that we might be making the build process unnecessarily complicated and more dependent on the user’s environment than necessary. There were complaints about the high entry barrier when dealing with libuavcan; it would be great to avoid raising it further.

Applying the same management approach to each dependency might be a mistake. If we were to stay with pyratemp, PyDSDL would be the only runtime dependency of the transpiler. The end user who only cares about having their headers generated should not care about pytest/mypy/etc. Hence, it is sensible to include PyDSDL as a submodule, thus completely relieving the end user from having to deal with Python dependency management, and leave all non-essential dependencies to pip, since developers won’t mind getting their hands dirty anyway. I am worried that the approach you are proposing might be too complicated for the end user; we don’t want them to fire up a virtualenv just to have their headers compiled.

Sorry, I don’t follow. Are you saying that there are no sensible options in the middle between the two extremes?

scottdixon · February 25, 2019, 9:25pm

I’m shooting for the opposite here. First, you won’t need a virtualenv to use PyDSDLgen. The virtualenv will be for our own verification builds, because virtualenv is the right way to pull in Python dependencies in a build. The user can choose to use this technique or can just pull PyDSDLgen into their global or user environment if they want (if they are using Vagrant or some sort of containerized build environment this is fine). The thing is, we’re going to require a working python3 environment if we’re going to use Python to generate code. I’m trying to make this as normal an environment as possible. PyPI is the most “normal” user-facing environment out there. Anything else we do will require more Python environment configuration by our build scripts, which are more likely to get it wrong than the user’s own pip.
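To make the verification-build case concrete, here is a minimal sketch of how a build script might create a throwaway virtualenv and install the generator into it, using only the standard library `venv` and `subprocess` modules. The function names are hypothetical, and the package name passed in (for example `pydsdlgen`) is only the name proposed in this thread, so treat it as an assumption rather than a published distribution.

```python
import subprocess
import sys
import venv
from pathlib import Path


def venv_python(env_dir: Path) -> Path:
    """Return the interpreter path inside a virtualenv at env_dir."""
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def pip_install_command(env_dir: Path, package: str) -> list:
    """Build the argv that installs `package` with the virtualenv's own pip."""
    return [str(venv_python(env_dir)), "-m", "pip", "install", package]


def build_env_and_install(env_dir: Path, package: str) -> None:
    """Create an isolated environment and install one package into it.

    Not called here: it touches the filesystem and the network, and the
    package name is an assumption taken from the proposal.
    """
    venv.create(env_dir, with_pip=True)
    subprocess.run(pip_install_command(env_dir, package), check=True)
```

A user who prefers a global or per-user install would skip all of this and run pip directly; the sketch only covers the isolated build case.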

From my view the “transpiler” (a term that is a bit glorified given our use of document templates to generate code) is a build-time dependency. A lot of the trouble I had/have with the dsdl_compiler in libuavcan v1 is because I’m using a build system that considers it “source” because it is part of the package where, if it were external, it would correctly be considered a “build tool.” That said, I am not using a wonderful build system.

As for Jinja2 versus pyratemp, to me this is a separate argument we can have next. Even without the Jinja2 dependency I would want to create this new package. Again, I’m looking to make the libuavcan package be just about C++. I’m asserting that libuavcan is not a codegen library. I want consumers to depend on libuavcan separately from the code generator, just like they do for the C++ compiler. One could also draw parallels to projects like Cap’n Proto in what I’m trying to accomplish. Namely: code generation uses an external facility with a well-defined DSL (and we argue next over that DSL, which is where the templating language comes into play).

I think there are reasonable approaches between the two extremes, but there’s a valley between everything else and “mono-repo” that is worse than either peak. I’m not proposing that we have to break everything out because you won’t let me have a mono-repo, but I do think that, if we are going to have one or more dependencies, then we can choose to add more where they provide utility.

pavel.kirienko · February 26, 2019, 9:11am

There are two disadvantages (although I realize that neither of them is of critical importance):

Global/user installation runs the risk of version conflicts between different dependees of PyDSDL. Virtualenv or containers solve this, but at the cost of additional complexity.
The user will be required to set up their environment beforehand. See entry barrier.

We are clearly not on the same page here. I think we can reasonably expect that any dev’s computer is already equipped with a working Python 3 installation (all modern GNU/Linux distros known to man ship with Python 3.5+), so the bare minimum environment is already there and requires no further steps on the part of the user. If the runtime dependencies (these are the only dependencies we care about here; it’s perfectly fine to require a libuavcan developer to jump through additional hoops, it’s their life anyway) are directly embedded into the libuavcan repository through submoduling or otherwise, then no environment configuration will be required whatsoever; likewise, the build system does not need to take any additional steps to make things work. You just clone the repository and run the script. This is the way libuavcan_dsdlc works and I want to keep it unchanged.
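For comparison, the “clone the repository and run the script” model can be sketched as follows: the script puts the vendored submodule directories on the import path before importing anything from them, so no pip step is needed. This is only an illustration, not how libuavcan_dsdlc is actually written; the `submodules/` directory layout and the function names are assumptions made for the example.

```python
import sys
from pathlib import Path


def vendored_paths(repo_root: Path, names):
    """Directories where vendored dependencies are assumed to live."""
    # Assumption: each dependency is a git submodule under <repo>/submodules/.
    return [repo_root / "submodules" / name for name in names]


def enable_vendored(repo_root: Path, names):
    """Prepend vendored dependency directories to sys.path.

    Returns the entries that were newly added, so calling it twice
    does not create duplicate entries.
    """
    added = []
    for path in vendored_paths(repo_root, names):
        entry = str(path)
        if entry not in sys.path:
            sys.path.insert(0, entry)
            added.append(entry)
    return added
```

A generator script would call something like `enable_vendored(Path(__file__).resolve().parent, ["pydsdl"])` at the top and then import the dependency as usual. The version is frozen by the submodule commit, at the cost of every consumer carrying its own copy.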

If your build system needs to take additional steps, then perhaps something is done wrong or I am still missing something? Can we review these steps and see what can be changed?

Okay, I see that. If we choose to go the way of modularity, it is still paramount to ensure that things just work out of the box regardless of the resulting repository structure.