Future of Kconfig

benpicco · 25 September 2023 09:06

I am wondering what are the future plans for the Kconfig migration. The process appears to be stalled for quite a while now and the current situation just causes frustration: We have to maintain two build systems, one of which is only run by CI for select tests because only a subset of modules is modeled in it. This causes a lot of friction and hard to debug dependency issues. I’m seriously wondering if this may drive contributors away, since getting stuff past CI almost always gets stuck at Kconfig for any significant new development.

What are the goals of the Kconfig migration? To me it seems like the system will be harder to configure as dependencies are not always added automatically - making it easier to create a broken configuration. The Kconfig build also appears to be slower than with the make based dependency resolution. And lastly, the incomplete migration state already led to many hacks being added to the Kconfig based build resolution that make me wonder if we won’t end up with a system at least as messy as the purely make based approach.

So what are the goals and future plans for the Kconfig migration? Is our current approach the right one or should we reconsider this experiment?

enoch247 · 25 September 2023 16:11

I would really hate to see us give up on Kconfig. I’ve rolled my own build configuration tools in the past. I much prefer Kconfig after learning to use it. It solves some interesting problems. I looked into helping out with the migration, but just haven’t had the time to do anything. I will make the time though if it saves us from throwing it out. Last I looked at how it was being used in RIOT, I thought it was a bit odd, but assumed that I didn’t have the big picture. Perhaps though, that could be a contributing factor in the pains? I don’t really want to draw conclusions though without a deep look.

I suspect one of the biggest issues is that the RIOT build system and Kconfig approach dependency resolution very differently. RIOT basically allows you to select any module, and all dependencies are pulled in. This is similar to an OS package manager. Kconfig seems to be best suited toward only showing you the things you can enable based on what you have already enabled. For example, you must enable networking before you can enable TCP/IP, whereas RIOT’s build system would enable networking if you enable TCP/IP.
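The difference between the two styles can be sketched with a toy model. This is only an illustration: the module names and the `DEPS` table are hypothetical, and neither function reflects how RIOT's make rules or Kconfig are actually implemented.

```python
# Toy dependency table (hypothetical module names, not RIOT's real ones).
DEPS = {
    "tcpip": ["networking"],
    "networking": [],
}


def resolve_pull(requested, deps):
    """Package-manager style: enable the requested modules plus everything they need."""
    enabled = set()
    stack = list(requested)
    while stack:
        mod = stack.pop()
        if mod not in enabled:
            enabled.add(mod)
            stack.extend(deps[mod])
    return enabled


def resolve_gated(requested, deps):
    """Gated style: a requested module is only enabled once all of its
    dependencies have been requested and enabled as well."""
    enabled = set()
    changed = True
    while changed:
        changed = False
        for mod in requested:
            if mod not in enabled and all(d in enabled for d in deps[mod]):
                enabled.add(mod)
                changed = True
    return enabled
```

In this sketch, asking only for `tcpip` yields both modules with `resolve_pull`, while `resolve_gated` enables nothing until `networking` is requested too.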

aabadie · 26 September 2023 12:16

For reference, there’s this tracking issue about Kconfig dependency modelling progress. To summarize, it seems to me that a lot has already been done and the big missing part is related to networking (GNRC, networking packages, some radio drivers).

MrKevinWeiss · 26 September 2023 16:03

I went off on a rant about this at the summit and it comes down to the top-down vs bottom-up approach.

I guess you can say we have a middle out approach (if anyone has watched the HBO show Silicon Valley) where we use depends on for the peripherals and module features and select for the “high level” modules.

The problem with the bottom-up (depends on) approach (which is what was initially used and what Zephyr uses) is that everything needs to be resolved somehow. In the Zephyr code base we see lots of the apps and tests sprinkled with cpu/board specific configuration files and the process of adding a new app would require defining many different configurations. This, however, makes a very clean dependency tree and makes modelling quite easy.

The top-down (select) approach is what we currently implement in make and allows very simple configuration. For example, if I have an app using some gnrc stack, I would just need to bring that in and RIOT will resolve it for you, for better or worse. This is really nice but makes things really complicated, especially with rules such as no circular dependencies and no selecting choice options in kconfig. It also makes the menuconfig look silly as almost all modules would be selectable.

The proper way would be a SAT solver. Whether it is kconfig or laze or anything else, we need something. Currently the recursive make works by hacks and tuning (which probably will have to exist anywhere, but it is really hard to tell why some modules are being brought in). I imagine if we are worried about CI time now… probably a SAT solver won’t help with that.
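To make the solver idea concrete, here is a minimal sketch that treats dependency resolution as a constraint problem and searches exhaustively for the smallest valid module set. It is not a real SAT solver, and the module names and constraints are made up for illustration.

```python
from itertools import combinations

# Hypothetical modules; "random" needs one of two backends.
MODULES = ["app", "random", "hwrng", "prng_sw"]


def valid(enabled, board_has_hwrng):
    """Constraints of the toy model, written as plain boolean checks."""
    e = set(enabled)
    if "app" in e and "random" not in e:
        return False  # app needs random
    if "random" in e and not ({"hwrng", "prng_sw"} & e):
        return False  # random needs at least one backend
    if "hwrng" in e and not board_has_hwrng:
        return False  # hwrng needs hardware support on the board
    return True


def solve(requested, board_has_hwrng):
    """Return a smallest valid module set containing `requested`, or None."""
    for size in range(len(MODULES) + 1):
        for combo in combinations(MODULES, size):
            if set(requested) <= set(combo) and valid(combo, board_has_hwrng):
                return set(combo)
    return None
```

The search visits up to 2^n subsets for n modules, which is why the CI time concern applies: a real solver prunes far better, but the problem stays hard in the worst case.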

Now that I have my little boilerplate intro done we can get on to constructive conversation.

I will try to structure my response:

The process appears to be stalled for quite a while now and the current situation just causes frustration

Yes it is, especially when having to deal with a different modelling style, for example the usb stdio stuff took me a month of playing around to try to have a solution that is what we want (not just about matching what make says).

Many other issues are just annoyances which are not desired.

This causes a lot of friction and hard to debug dependency issues.

I have found that debugging with kconfig is the easy thing since you can look through the whole tree but maybe that is a tool usage issue. There have been some issues from make that took a while to figure out, for example, bringing in periph modules that were simply not used. There have also been some challenging kconfig issues that occur more around the circular dependency issues, for example understanding native’s periph_rtc was not so easy to read. It is a tool that must be learned, that is for sure.

I’m seriously wondering if this may drive contributors away, since getting stuff past CI almost always gets stuck at Kconfig for any significant new development.

I would like to think that people help others through it, and usually it would only be a problem when changing already existing features: if there is no app.config.test, the test will probably not run the kconfig check.

What are the goals of the Kconfig migration?

The goal is to have a structured way of declaring the dependencies, to provide tools to better understand the dependencies (ie. menuconfig) and try to use more standard tooling.

To me it seems like the system will be harder to configure as dependencies are not always added automatically - making it easier to create a broken configuration.

If we go from the bottom up it is more work for sure. I don’t think I would agree with the broken configurations as kconfig can do a lot more validation. During the migration, though, I would agree, since things may not be complete, leading to broken configurations. This should be resolved if it is ever complete.

The Kconfig build also appears to be slower than with the make based dependency resolution.

Yup, currently by far, but things can be done to speed it up.

the incomplete migration state already led to many hacks being added to the Kconfig based build resolution that make me wonder if we won’t end up with a system at least as messy as the purely make based approach.

We do need to be careful. However, a lot of the hacks that are introduced to match the make system are clearly labeled and can just be removed after the migration. We should try to focus on what behaviour we want rather than what things happen to resolve to.

So what are the goals and future plans for the Kconfig migration?

Whenever I bring it up, the answer is get more man-power to push it. It was really nice with @aabadie going strong for those few weeks we have the tracking issue but usually the last little bit is the hardest and I don’t really believe all the issues would just be solved.

Is our current approach the right one or should we reconsider this experiment?

I think we should open it up to the community what we should do. If nobody wants to put in the work it is at least easy to switch back to make, but we will be losing some nice modelling and capabilities then (i.e., if I have a hardware enabled backend then use that feature).

Maybe it just needs some smart and dedicated person/persons to pick it up, maybe we switch to all select based, maybe we disable the circular dependency check, maybe we don’t compare modules or binaries and just use passing tests, maybe we find some easy to integrate sat solver and go to pure depends on, maybe some AI just solves everything for us …

All we can say for now is that nothing is moving and it is just costing us.

mcr · 26 September 2023 18:31

My understanding is that Kconfig can be made more of the “pull” mechanism (“all dependencies are pulled in”), but that the default Linux kernel process is bottom-up. It regularly pisses me off in places like openwrt.

My biggest concern is that I think that Kconfig and Cargo won’t get along very well.

chrysn · 29 September 2023 21:14

My impression is that a lot of the hassle comes from having to make it do just as Makefile based dependencies do – combined with those being the default in all situations one encounters when not actively enabling Kconfig. This leads to things staying harder up to that flag day when we ditch Makefile.dep. That day seems to be “just some hard pushes away” since I’ve joined the project.

I share the desire for a descriptive model of our dependencies (over the descriptive-if-you-know-the-conventions-but-effectively-prescriptive model in the Makefiles). But the downsides of the current state on development are tangible – I’m sure I’m not the only one with PRs that are stalled just because some Kconfig fixes are hard enough that they make the PR require more time than I have to put into it.

I don’t see how to get this unstuck; but maybe this could help: For the parts that are already working with Kconfig, can we migrate parts of the selection process to Kconfig-only already? That’d both remove complexity of make-it-look-like-make-so-tests-pass workarounds, and speed up the transition in people learning to use it.

Kaspar · 2 October 2023 09:25

Personally I still think it was a mistake to go with Kconfig, and I think at this point it’s mostly sunk costs that make us stick to it.

(Rehashing reasons:

Kconfig is not a build system, so once the dependencies and configuration are modelled, we’re still stuck with make-as-buildsystem, or looking at the next migration.
Also, Kconfig is single-invocation-single-configuration, so inherently slow for many configurations, and RIOT has a huge BOARDxAPPLICATION matrix. With a fast build system, Kconfig will dominate build times for more than single configuration builds due to this.
Kconfig cannot easily model multiple only slightly differing configurations, e.g., for a single application on multiple boards
Kconfig is not as simple and well understood as initially indicated )

Anyhow,

Alternatively, get Kconfig requirements out of the main tree. Currently, we’re requiring both Kconfig and make dependency modelling for new contributions and changes, which is a drag and slows mainline development substantially. Kconfig could be a long-lived feature branch, which finds methods to ensure that it’s building mostly the same as master, and regularly (e.g., for every release) merges in (syncs with) master. That branch would provide a (subset of) RIOT that’s using Kconfig. Once that branch is advanced enough (e.g., Kconfig can configure a critical mass), we merge that branch, give it a release cycle to iron out kinks, and go with it.

While the downside of that approach is obviously that contributors are not forced anymore to add Kconfig to new code (or update changed code), that’s also a huge benefit to RIOT development in general. Also, that Kconfig feature branch can use arbitrary breaking / differing ways to model stuff and wouldn’t be bound to binary equality anymore.

The Kconfig migration would go at its own pace, decoupled from mainline, not holding back all of RIOT, and not being held back by having to build exactly the same as the unstructured mess that make+Makefile.dep is.

MrKevinWeiss · 4 October 2023 07:46

For the parts that are already working with Kconfig, can we migrate parts of the selection process to Kconfig-only already?

The problem is that it could lead to a pretty big mismatch of implementations, let’s say an external person wants to create an app that uses some kconfig modelled feature and some make features or so (something we don’t test for), nothing would work in kconfig or make since the make feature decayed to the point of not being usable and the kconfig features are incomplete.

Kconfig is not a build system, so once the dependencies and configuration are modelled, we’re still stuck with make-as-buildsystem, or looking at the next migration.

Agreed and probably the biggest point we should think about, kconfig would essentially have an input of BOARD(s), and application setting and give the modules, packages, and configuration values out. Maybe there are build systems (or build system configurations) that are happy with that but if we want to move away from make, that is a whole different migration.

Also, Kconfig is single-invocation-single-configuration, so inherently slow for many configurations, and RIOT has a huge BOARDxAPPLICATION matrix. With a fast build system, Kconfig will dominate build times for more than single configuration builds due to this.

That can be solved in many ways though, Zephyr has these same constraints but handles it by allowing BOARD to be a wildcard and loading the tree in once. It probably will require reworking some of the work we have already done though (eg. handling the stm32 clock tree).
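The trade-off can be put into a back-of-the-envelope cost model. This is a sketch under the assumption that one configuration run splits into a one-off cost for loading the tree and a per-configuration resolution cost; the function names and that split are hypothetical, not measured RIOT or Zephyr behaviour.

```python
def cost_per_invocation(n_boards, n_apps, load_s, resolve_s):
    """Every BOARD x APPLICATION pair pays for loading the tree and resolving it."""
    return n_boards * n_apps * (load_s + resolve_s)


def cost_load_once(n_boards, n_apps, load_s, resolve_s):
    """The tree is loaded a single time; only resolution is paid per pair."""
    return load_s + n_boards * n_apps * resolve_s
```

If loading dominates, the second model approaches the pure resolution cost; if resolution dominates, loading once saves little.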

Kconfig cannot easily model multiple only slightly differing configurations, e.g., for a single application on multiple boards

I don’t know about this, we had some PoCs with different envs that could be loaded which would actually solve these “using symlinks to applications with different makefile configurations” (eg. periph_spi and periph_spi_dma). Maybe I don’t understand here.

Kconfig is not as simple and well understood as initially indicated )

I think this is more an issue of our requirements, the design patterns that are needed have evolved to something complex over time as we discovered each of the corner cases (ztimer backend selections, usb_stdio, *_default)

get Kconfig requirements out of the main tree.

I think I would rather just drop it then, creating a different branch would be the same as disabling tests, and I am pretty sure it would just decay. Currently the only work being done on Kconfig seems to be what is needed to get it to pass.

Kaspar · 4 October 2023 08:37

Why would that be “the same as disabling tests”? Tests are run for e.g., PR branches.

MrKevinWeiss · 4 October 2023 10:01

Why would that be “the same as disabling tests”? Tests are run for e.g., PR branches.

You’re right, not the same as disabling tests, although not doing the module/pkg/binary diffs is disabling significant tests. Just keep in mind that tests are run only on native and select boards. As many of the challenging things with our dependency modelling have to do with board specifics, we would miss a bunch of possible bugs. Not to mention we may be missing tests that would specifically expose a dependency modelling problem.

I feel like separating the current work to a feature branch will halt any progress and any effort would just be trying to keep it up to date while others can just keep using the old makefile way without much concern for the modelling.

At one point I was thinking of introducing a “binary/usemodule can differ” flag which we could use for, say, the random based tests (so kconfig could select the hwrng if available) but I think that would be abused. Maybe if we have that coupled with actual tests being run on all the devices it might be a solution that can allow us to move a bit faster without breaking so much.
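Such a flag could look roughly like the following sketch, which compares the module sets from both resolutions and ignores modules that are explicitly allowed to differ. The function name, the `may_differ` parameter and the module names are hypothetical and do not describe the existing CI check.

```python
def compare_modules(make_modules, kconfig_modules, may_differ=frozenset()):
    """Return the modules on which the two resolutions disagree,
    ignoring those explicitly allowed to differ."""
    diff = set(make_modules) ^ set(kconfig_modules)  # symmetric difference
    return diff - set(may_differ)
```

An empty result would count as a pass. The abuse risk mentioned above is visible here: any mismatch can be silenced by adding it to `may_differ`.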