Graphing build sizes

bergzand · 11 October 2017 14:59

Hello,

One of the issues from the CI discussion at the RIOT summit was the tracking and graphing of the nightly build sizes. After some instructions from Kaspar for getting the JSON files I got something working here:

https://riot-graphs.snt.utwente.nl/dashboard/db/demonstrator?orgId=1

For now I want to keep it up to date by running my script as a cron job every night, shortly after the nightly build. The source code of the script that parses the values and pushes them to the database can be found on my GitHub [1].
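As a rough illustration of the "parse and push" step, here is a minimal sketch that turns one build's size report into InfluxDB line-protocol records. The layout of the JSON input (a list of objects with `test`, `board`, `text`, `data` and `bss` keys) and the measurement name `build_size` are assumptions made for this example; they are not taken from the CI's actual sizes.json or from the RIOT-graphs script.

```python
# Sketch only: the JSON layout and the measurement name are assumptions,
# not the real schema used by the RIOT CI or the RIOT-graphs script.
import json


def escape_tag(value: str) -> str:
    """Escape the characters that are special inside line-protocol tag values."""
    return value.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def to_line_protocol(sizes_json: str, timestamp_s: int) -> list[str]:
    """Turn one build's size report into InfluxDB line-protocol records."""
    lines = []
    for entry in json.loads(sizes_json):
        tags = f"test={escape_tag(entry['test'])},board={escape_tag(entry['board'])}"
        # The trailing "i" marks each field value as an integer.
        fields = ",".join(f"{name}={entry[name]}i" for name in ("text", "data", "bss"))
        # The timestamp is in seconds, so the write request must use precision=s.
        lines.append(f"build_size,{tags} {fields} {timestamp_s}")
    return lines


example = (
    '[{"test": "examples/hello-world", "board": "samr21-xpro",'
    ' "text": 8124, "data": 112, "bss": 2456}]'
)
records = to_line_protocol(example, 1507680000)
print(records[0])
```

Storing test and board as tags is what would let a templated dashboard select them; the three section sizes become separate fields of the same point.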

The dashboard is now a simple Grafana templated dashboard where a test and a board can be selected. I'd like to expand this by creating a dashboard for every test or for every board. The most difficult thing for now is to present the huge amount of data in a clear and concise way. Input on this and the overview in general is most welcome.

Koen

[1]: https://github.com/bergzand/RIOT-graphs

miri64 · 11 October 2017 15:10

Hi Koen,

Wow, +1!

Cheers, Martine

Kaspar · 11 October 2017 15:36

Hi Koen,

tcschmidt · 11 October 2017 17:04

Hi Koen,

really great: please let us know once you are ready for a deployment.

Best, thomas

Kaspar · 12 October 2017 09:19

Hey Koen,

For now I want to keep it up to date by running my script as a cron every night approximately after the nightly build.

If we'd build the in-between HEAD commits, would your script pick them up?

The dashboard is now a simple Grafana templated dashboard where a test and a board can be selected. I'd like to expand this by creating a dashboard for every test or for every board. The most difficult thing for now is to present the huge amount of data in a clear and concise way. Input on this and the overview in general is most welcome.

Is it possible to put the commit hash into each step? If I move my mouse over a graph, each point shows a pop-up with the timestamp. It would be very useful to see which commit that belongs to. Maybe even with a link to the PR page on GitHub?

Kaspar

bergzand · 12 October 2017 13:02

Hey,

> Hey Koen,
>
> On 10/11/2017 04:59 PM, Koen Zandberg wrote:
>> For now I want to keep it up to date by running my script as a cron every night approximately after the nightly build.
>
> If we'd build the in-between HEAD commits, would your script pick them up?

If that means I can grab the JSON files based on the merge commit hashes, it shouldn't be that much work to integrate. At that point it might be easier to integrate this work into the CI to trigger it after such a build.

>> The dashboard is now a simple Grafana templated dashboard where a test and a board can be selected. I'd like to expand this by creating a dashboard for every test or for every board. The most difficult thing for now is to present the huge amount of data in a clear and concise way. Input on this and the overview in general is most welcome.
>
> Is it possible to put the commit hash into each step? If I move my mouse over a graph, each point shows a pop-up with the timestamp. It would be very useful to see which commit that belongs to. Maybe even with a link to the PR page on github?

The best way to add this to the dashboards might be to use annotations[1]. I would have to try if it is possible to embed a link in those. I'm not sure if annotations are the best way to show this information, but I don't have a better way at this moment on how to integrate this in Grafana.

Koen

[1] Annotate visualizations | Grafana documentation

Kaspar · 12 October 2017 13:32

Hi Koen,

(your mail's quoting arrived a little garbled, I'll try my best to fix)

At that point it might be easier to integrate this work into the CI to trigger it after such a build.

Probably! There's already infrastructure to run arbitrary scripts after each build, which is also used to actually parse out sizes.json.

See [1].

Probably a simple http(s) request using wget would do it?
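A minimal sketch of such a post-build notification, written in Python rather than with wget. The service URL, the JSON field names and the commit value are hypothetical placeholders; nothing in the thread specifies what the receiving side would expect.

```python
# Sketch only: the URLs and JSON fields below are hypothetical placeholders.
import json
import urllib.request


def build_notification(service_url: str, commit: str, sizes_url: str) -> urllib.request.Request:
    """Prepare a POST request telling the graphing service about a finished build."""
    body = json.dumps({"commit": commit, "sizes_url": sizes_url}).encode("utf-8")
    return urllib.request.Request(
        service_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_notification(
    "https://graphs.example.org/api/builds",
    "0123abc",
    "https://ci.example.org/results/0123abc/sizes.json",
)
# urllib.request.urlopen(request) would actually send it; that call is left
# out here so the sketch runs without network access.
print(request.get_method(), request.full_url)
```

Passing only the commit and a pointer to the results keeps the CI hook small; the service can then fetch and parse the sizes on its own schedule.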

[1] Annotate visualizations | Grafana documentation

[2] pretty much shows our use-case.

Kaspar

[1] Annotations: Click links and select text from annotation popover · Issue #1588 · grafana/grafana · GitHub

hauke · 13 October 2017 07:53

Hej,

very nice effort, I love it!

Thinking out loud: would it make sense to do some data aggregation for more generic views? On first thought I would imagine something like average ROM/RAM size over all application/examples over all platforms.

I also have an idea for code size diff visualization that I have had in mind for quite a while: how about we draw a (huge) table, using all available platforms as columns and all available applications as rows. Each cell would then be colored: different green saturation for code size decreases and different red saturation for code size increases, with something neutral for no change. Then we could draw this table once for ROM and once for RAM usage. This would allow us to see at first sight in which aspects a certain PR influences the overall code size and also if there are unexpected side effects in that regard.

What do you think about this idea, and how would you assess the doability?
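To make the coloring rule concrete, here is a minimal sketch of how one cell's color could be derived from the size before and after a change. The neutral grey value and the 5% threshold for full saturation are arbitrary assumptions chosen for illustration.

```python
def cell_color(old_size: int, new_size: int, full_scale: float = 0.05) -> tuple[int, int, int]:
    """Map a relative size change to an RGB color.

    Decreases fade from grey towards green, increases from grey towards red.
    A relative change of `full_scale` (5% by default) or more is fully saturated.
    """
    neutral = 200
    # No change, or no baseline to compare against: neutral grey.
    if old_size == 0 or new_size == old_size:
        return (neutral, neutral, neutral)
    change = (new_size - old_size) / old_size
    strength = min(abs(change) / full_scale, 1.0)
    strong = neutral + round((255 - neutral) * strength)
    faded = round(neutral * (1.0 - strength))
    if change > 0:
        return (strong, faded, faded)  # increase: red
    return (faded, strong, faded)      # decrease: green


print(cell_color(1000, 1000))
print(cell_color(1000, 1100))
print(cell_color(1000, 900))
```

Calling this once per (application, platform) pair, for ROM and again for RAM, would yield the two tables described above.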

Cheers, Hauke

Joakim_Nohlgard · 13 October 2017 11:08

Another way to do this could be using the Google Docs API and updating a huge spreadsheet.

Kaspar · 13 October 2017 11:19

Hi,

Thinking out loud: would it make sense to do some data aggregation for more generic views? On first thought I would imagine something like average ROM/RAM size over all application/examples over all platforms.

Yes! The data should already be there (in the parsed sizes.json). Maybe an "all" selector for both application and board would do it?
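A minimal sketch of what an "all" selector could compute, assuming the parsed data is available as (application, board, ROM, RAM) records. The record layout and the numbers are made up for illustration.

```python
from statistics import mean

# Hypothetical records: (application, board, rom_bytes, ram_bytes)
builds = [
    ("examples/hello-world", "samr21-xpro", 8000, 2500),
    ("examples/hello-world", "iotlab-m3", 7600, 2400),
    ("examples/default", "samr21-xpro", 30000, 9200),
]


def average_sizes(records, application="all", board="all"):
    """Average ROM/RAM over the selected builds; "all" matches every value."""
    selected = [
        r for r in records
        if application in ("all", r[0]) and board in ("all", r[1])
    ]
    if not selected:
        return None
    return {
        "rom": mean(r[2] for r in selected),
        "ram": mean(r[3] for r in selected),
    }


print(average_sizes(builds))
print(average_sizes(builds, board="samr21-xpro"))
```

The same filter covers the fully generic view (both selectors set to "all") and the per-board or per-application averages.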

I also have an idea for code size diff visualization, that I have in my mind for quite a while: how about we draw a (huge) table, using all available platforms as columns and all available applications as rows. Each cell would then be colored: [...]

What do you think about this idea, and how would you assess the doability?

Well, that table would be *huge* (~100 * 150 cells). We could make a bitmap with mouse-over.

Kaspar

hauke · 13 October 2017 11:28

Hej,

Hi,

Thinking out loud: would it make sense to do some data aggregation for more generic views? On first thought I would imagine something like average ROM/RAM size over all application/examples over all platforms.

Yes! The data should already be there (in the parsed sizes.json). Maybe an "all" selector for both application and board would do it?

I also have an idea for code size diff visualization, that I have in my mind for quite a while: how about we draw a (huge) table, using all available platforms as columns and all available applications as rows. Each cell would then be colored: [...]

What do you think about this idea, and how would you assess the doability?

Well, that table would be *huge* (~100 * 150 cells). We could make a bitmap with mouse-over.

This is what I had in mind, talking about something like 5x5 pixels per cell (or similar) - the term table was more coined as a HTML table would be the natural construct to build this...

Cheers, Hauke

bergzand · 13 October 2017 11:51

Hi,

Hi Koen,

(your mail's quoting arrived a little garbled, I'll try my best to fix)

Whoops

<snip>

See [1].

Probably a simple http(s) request using wget would do it?

I started working on rewriting the script to a microservice. I should have this working in a few days after enough refactoring.

[1] Annotate visualizations | Grafana documentation

[2] pretty much shows our use-case.

Got this one working with PR URLs. I want to modify them a bit and see if I can fit the PR title in the description, but I think the general idea works very nicely.
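One possible shape for such an annotation is sketched below: a payload for Grafana's annotations HTTP endpoint (POST /api/annotations) with the PR link and title in the text. The thread does not say how the annotations are actually stored, so this is only an assumption; the PR number, title, commit hash and tag name are hypothetical, and whether the link is clickable in the popover depends on the Grafana version.

```python
# Sketch only: PR number, title, commit and tag name are hypothetical.
import html
import json


def annotation_payload(merge_time_s: int, commit: str, pr_number: int, pr_title: str) -> dict:
    """Build the JSON body for one merged-PR annotation."""
    pr_url = f"https://github.com/RIOT-OS/RIOT/pull/{pr_number}"
    text = f'<a href="{pr_url}">#{pr_number}</a> {html.escape(pr_title)} ({commit[:8]})'
    return {
        "time": merge_time_s * 1000,  # Grafana expects epoch milliseconds
        "tags": ["merged-pr"],
        "text": text,
    }


payload = annotation_payload(1507680000, "0123456789abcdef", 1234, "fix <foo> handling")
print(json.dumps(payload, indent=2))
```

Escaping the title keeps characters such as `<` in a PR title from being interpreted as markup in the description.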

Koen

bergzand · 13 October 2017 12:05

Hey,

Hej,

Hi,

Thinking out loud: would it make sense to do some data aggregation for more generic views? On first thought I would imagine something like average ROM/RAM size over all application/examples over all platforms.

Yes! The data should already be there (in the parsed sizes.json). Maybe an "all" selector for both application and board would do it?

I also have an idea for code size diff visualization, that I have in my mind for quite a while: how about we draw a (huge) table, using all available platforms as columns and all available applications as rows. Each cell would then be colored: [...]

What do you think about this idea, and how would you assess the doability?

Well, that table would be *huge* (~100 * 150 cells). We could make a bitmap with mouse-over.

This is what I had in mind, talking about something like 5x5 pixels per cell (or similar) - the term table was more coined as a HTML table would be the natural construct to build this...

I'm going to see if this is already possible via a plugin or if I can build something. I was thinking of building a sizes diff between the start time and the stop time and then color code. This way it should be possible to diff between a single PR, but also over a whole release.
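A minimal sketch of such a start/stop diff, assuming each (application, board) pair has a time-sorted series of (timestamp, size) points. The data layout and the values are made up for illustration.

```python
from bisect import bisect_right


def value_at(series, when):
    """Latest size recorded at or before `when`; `series` is sorted (timestamp, size) pairs."""
    index = bisect_right(series, (when, float("inf")))
    return series[index - 1][1] if index else None


def diff_over_range(all_series, start, stop):
    """Size change per (application, board) between two points in time."""
    result = {}
    for key, series in all_series.items():
        before, after = value_at(series, start), value_at(series, stop)
        # Skip series that have no data yet at the start of the range.
        if before is not None and after is not None:
            result[key] = after - before
    return result


# Hypothetical data: timestamps and sizes are illustrative only.
all_series = {
    ("examples/hello-world", "samr21-xpro"): [(100, 8000), (200, 8124), (300, 8100)],
    ("examples/default", "iotlab-m3"): [(200, 500)],
}
print(diff_over_range(all_series, 150, 250))
print(diff_over_range(all_series, 100, 300))
```

Choosing start and stop just around one merge gives the diff for a single PR, while choosing two release dates gives the change over a whole release.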

One feature that approximates this a bit is a carpet map. These are already available for Grafana and plot data over time, so either all boards or all tests over time. I'll put up a few dashboards to get a feel for these.

Koen

bergzand · 14 October 2017 21:22

Hi Kaspar,

Back to the mailing list with this one. We could also move this discussion to a GitHub issue if we're generating too much mailing list traffic with this.

Hi Koen,

seems like I accidentally replied off list.

I might have underestimated the amount of data slightly when I started this. Currently the CI is building 13,900 tests. Triple that for the actual number of time series stored. I don't think that there's an application optimized for this many series.

Yup, that's a lot of data... Do you think influxdb/grafana can handle this for only the merged PRs?

Yes, the raw data is not a problem. The main problem here is visualisation. Grafana has no issues with displaying a large number of graphs, but at around 200 queries for a single page things start to slow down quite a bit for me. I currently only hit this when displaying a large number of different time series, but storing them in the database is not an issue. Influxdata considers more than 250,000 writes/s with a million unique series "high load" [1].

Although I like being able to see the differences in size that a set of merged PRs generates, I agree that visualizing changes from unmerged PRs is one of the more useful features we can get out of this. The data isn't hard to generate (diff current master against the PR build). I have to think a bit more on how visualization of this is best done while keeping it useful and not too processor intensive.

Probably a plain text list, sorted by change (in percent or absolute) would totally suit our needs, for any open PR.

Something like:

                                    text    data    bss     total
examples/hello-world samr21-xpro    +124b   +0b     +456b   +570b
examples/hello-world iotlab-m3      +120b   +0b     +456b   +566b

CI needs to warn on any increase (with an additional yellow build result in github that points to that list).
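A minimal sketch of such a list and of the "warn on any increase" check, assuming the per-build deltas are already available as (text, data, bss) tuples. The input layout, the column widths and the numbers are assumptions for illustration.

```python
def format_report(changes):
    """Plain-text list of size changes, largest total increase first.

    `changes` maps (application, board) to (text, data, bss) deltas in bytes.
    """
    rows = sorted(changes.items(), key=lambda item: sum(item[1]), reverse=True)
    lines = [f"{'application':<24}{'board':<14}{'text':>8}{'data':>8}{'bss':>8}{'total':>8}"]
    for (app, board), deltas in rows:
        cells = "".join(f"{f'{d:+d}b':>8}" for d in (*deltas, sum(deltas)))
        lines.append(f"{app:<24}{board:<14}{cells}")
    return "\n".join(lines)


def has_increase(changes):
    """True if any build grew in total, i.e. the CI should flag the PR."""
    return any(sum(deltas) > 0 for deltas in changes.values())


# Hypothetical deltas between current master and a PR build.
changes = {
    ("examples/hello-world", "iotlab-m3"): (120, 0, 8),
    ("examples/hello-world", "samr21-xpro"): (124, 0, 8),
    ("examples/default", "native"): (-16, 0, 0),
}
report = format_report(changes)
print(report)
print("warn" if has_increase(changes) else "ok")
```

Sorting by absolute total is used here; sorting by percentage would additionally need the baseline sizes.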

I'm probably just thinking too complicated (or fancy) here, this should do for starters.

Koen [1]: