Mailserver issues

oleg · 8 August 2017 14:53

Dear RIOTers,

due to an immense load of requests on the mailing lists, tens of thousands of mails got queued over the last few days. This led to mail delivery delays of up to several days. After some throttling and performance tuning on the mail server, everything should be back to normal, and most of the mails should have been delivered by now.

Sorry for the inconvenience and please let me know if you encounter further problems.

Cheers, Oleg

Thomas_Eichinger2 · 8 August 2017 17:00

Oleg,

Thank you for handling this!

Best, Thomas

Adam_Hunt · 11 August 2017 11:04

Just out of curiosity, was this due to a lack of machine resources? How is RIOT's infrastructure (e.g. web and mail hosting) handled; is the hosting donated by one or more of the supporters listed at the bottom of the project's homepage? Would additional hosting resources be helpful?

--adam

oleg · 11 August 2017 11:16

Hi Adam!

Just out of curiosity, was this due to a lack of machine resources?

Actually not. In fact, the maximum number of MTA and content filter processes was configured too low.

How is RIOT's infrastructure (e.g. web and mail hosting) handled; is the hosting donated by one or more of the supporters listed at the bottom of the project's homepage? Would additional hosting resources be helpful?

Currently, some of the services (e.g., web and mail) are hosted privately by community members; other services are sponsored by supporting institutions such as FU Berlin, HAW Hamburg, and Inria. In general, additional resources are always helpful, particularly for the CI system, I guess. Do you have something particular in mind?

Cheers, Oleg

Adam_Hunt · 11 August 2017 11:52

I was going to suggest that you might talk to the Oregon State University Open Source Lab (http://osuosl.org). OSUOSL offers free hosting, email, VMs, colocation, FTP, backup, etc. (both managed and unmanaged) to open source projects. I can't speak to what type of computing power they offer, but I know they've been working to stand up an OpenStack cluster, so they may be able to help out with CI or in other ways.

I'm not affiliated with OSUOSL, I just live in their area.

--adam

oleg · 11 August 2017 12:03

Hey Adam!

Thanks for the pointer. That's definitely something we should consider.

Cheers, Oleg

Stefan_Schmidt · 11 August 2017 13:49

Hello.

I was going to suggest that you might talk to the Oregon State University Open Source Lab (http://osuosl.org). OSUOSL offers free hosting, email, VMs, colocation, FTP, backup, etc. (both managed and unmanaged) to open source projects. I can't speak to what type of computing power they offer, but I know they've been working to stand up an OpenStack cluster, so they may be able to help out with CI or in other ways.

They also offer to host a server of your own. We have been doing that for the Enlightenment project for several years now. Their connectivity and general admin service have worked very well for us.

We have a beefy server there hosting tons of our own git repos, web services, downloads, CI, etc. I don't think there is any bandwidth limitation or such. All in all, a really nice service from them to the open source communities.

I have no idea whether this is of interest to RIOT, but if you are looking for something in that regard, it is surely worth talking to them.

regards Stefan Schmidt

bergzand · 11 August 2017 14:16

Hello,

To offer an alternative, I can offer the services of Studenten Net Twente[1]. Studenten Net Twente is a student association of the University of Twente aimed at providing IT services. We have a history of supporting open source projects by offering FTP space and/or colocation services for free.

As we use RIOT-os for an internal project, we'd love to return something to the project. If a host for CI or other services is needed, I can see if I can get a virtual machine or some real hardware available for RIOT-os infrastructure.

Regards, Koen Zandberg

1. http://www.snt.utwente.nl/en/index.php

Adam_Hunt · 11 August 2017 17:14

What sort of hardware would RIOT need for CI? Would a machine with, for example, a pair of E5-2670 Xeons (eight cores @ 2.60 GHz), between 64 and 128 GB of DDR3 ECC RAM, an SSD or two, and maybe some spinning storage suffice, or are we talking about something like a highly available cluster consisting of a half-dozen or more HP ProLiant DL380 Gen10 machines, each with a pair of Xeon 8100s (28 cores @ 3.60 GHz), a terabyte of DDR4 RAM, and a pile of blazing fast NVMe drives?

I'm just thinking that if a machine or two with specs closer to the first example would suffice, then I imagine the RIOT community might be able to find a way to make it happen.

A word of warning… While I'm absolutely serious about trying to help out in this area (and others), please know I'm currently trying to get the ball rolling and determine what type of resources the project could most use; it'll take a bit of time to turn these early thoughts into something real. Personally, I may not have piles of cash lying around, but I do have something else of value: time.

Adam

Michael_Andersen1 · 11 August 2017 18:26

Having just done something similar for something else, you should really look at doing CI in AWS lambda. It is remarkably cheap and (more importantly for my case) requires nearly zero devops once set up. If you suddenly have 10x the CI runs, they can all run in parallel for the same cost. No queues.

Kaspar · 14 August 2017 12:14

Hi Adam,

What sort of hardware would RIOT need for CI? Would a machine with, for example, a pair of E5-2670 Xeons (eight cores @ 2.60 GHz), between 64 and 128 GB of DDR3 ECC RAM, an SSD or two, and maybe some spinning storage suffice, or are we talking about something like a highly available cluster consisting of a half-dozen or more HP ProLiant DL380 Gen10 machines, each with a pair of Xeon 8100s (28 cores @ 3.60 GHz), a terabyte of DDR4 RAM, and a pile of blazing fast NVMe drives?

The CI can take advantage of anything with 4 GB of RAM for the ccache and another 0.5 GB of RAM per core. Long-running machines are preferable, as the ccache takes a couple of builds (or a manually triggered full build) to warm up.
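As a quick worked example of that rule of thumb (4 GB for the ccache plus 0.5 GB per core), here is a minimal Python sketch. The function name and the core counts in the loop are only illustrative; this is not part of the CI code.

```python
def ci_worker_ram_gb(cores: int, ccache_gb: float = 4.0, per_core_gb: float = 0.5) -> float:
    """Rule-of-thumb RAM for one CI worker: ccache share plus a per-core share."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return ccache_gb + per_core_gb * cores


# Illustrative core counts: a quad-core, a dual 8-core box, a 20-core box.
for cores in (4, 16, 20):
    print(f"{cores:2d} cores -> {ci_worker_ram_gb(cores):.1f} GB RAM")
```

By this estimate a dual 8-core Xeon box would want roughly 12 GB of RAM, far less than the 64 to 128 GB mentioned in the question.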

We currently have 2 16-core (Dual Xeon E5-2660/2670), 2 20-core (don't know which Xeons) and a couple of quad-cores. There are also about 20 dual-core VMs on the Inria CI cluster, but they're down at the moment. See e.g., the bottom of [1] for the relative performance of these boxes.

We decided (unofficially) on a build target time of <5min to be acceptable, which we're below if the Inria boxes are up. But with every test or every new supported board, the build matrix grows...

I doubt that setting up quad-core boxes is worth the administrative overhead (which is basically setting up a systemd service keeping a docker container running), but if you could spare dual 8-core Xeon boxes, that would be nice.

The boxes mostly idle, and as everything runs in RAM, even if a build starts, e.g. on my quad core workstation, I barely notice the builds.

Kaspar

[1] https://ci.riot-os.org/RIOT-OS/RIOT/7438/d24e072af26134996ab488d8bf072142fe4618de/output.html

Kaspar · 14 August 2017 12:26

Hi Michael,

Having just done something similar for something else, you should really look at doing CI in AWS lambda. It is remarkably cheap and (more importantly for my case) requires nearly zero devops once set up. If you suddenly have 10x the CI runs, they can all run in parallel for the same cost. No queues.

I'm not sure the current RIOT build system's requirements fit AWS Lambda well enough to make proper use of it.

For every build, the current CI sends "build application X for board Y" as a job to a CI queue. With git workdir caching and most of the build already in ccache, such a job takes between 0.5 and 2 seconds. Without those caches, it is more like 10 to 20 seconds, depending on the time needed to check out RIOT and to do a cold-ccache build. If I understand AWS Lambda correctly, it is not really suitable if that much context information is needed.
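To put those per-job times in perspective against the <5min target, here is a rough back-of-the-envelope sketch in Python. The job count is purely hypothetical (the real matrix size is not stated in this thread), and the slot count simply assumes one job per core on the two 16-core and two 20-core boxes mentioned earlier. It also assumes all jobs take equally long and are spread evenly, which real builds will not do.

```python
import math


def matrix_wall_time_s(jobs: int, seconds_per_job: float, parallel_slots: int) -> float:
    """Ideal wall-clock time when equal-length jobs are spread evenly over slots."""
    if jobs < 0 or parallel_slots < 1:
        raise ValueError("jobs must be >= 0 and parallel_slots >= 1")
    rounds = math.ceil(jobs / parallel_slots)
    return rounds * seconds_per_job


JOBS = 10_000        # hypothetical number of (application, board) pairs
SLOTS = 2 * 16 + 2 * 20  # assumption: one job per core on the 16- and 20-core boxes

# Per-job times taken from the figures quoted above.
for label, per_job in (("warm, best", 0.5), ("warm, worst", 2.0),
                       ("cold, best", 10.0), ("cold, worst", 20.0)):
    total = matrix_wall_time_s(JOBS, per_job, SLOTS)
    print(f"{label:12s}: {total:7.1f} s")
```

Whatever the real job count is, the cold-cache case scales the total by the same factor of 5 to 40 as the per-job time, which is why the caches matter so much here.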

I was looking into using AWS and the Google container engine, but without the caches (e.g., on freshly booted containers/VMs/instances), they're not very attractive performance-wise. Long-running (keeping the caches in memory, or even using the persistent storage options), they're not attractive price-wise.

I'd love to be proven wrong!

Kaspar

Michael_Andersen1 · 14 August 2017 16:36

Lambda is actually pretty flexible.

You can do anything that takes less than 300 seconds, so that is ok. The biggest thing that I like is that you pay only for the time spent servicing requests, but the lambda actually persists between requests (for free), so you can store data on the hard drive (up to 500MB) and you can keep data in memory. The only caveat is that AWS can remove that whenever they want, but in my experiments it lasts several hours. So for the CI use case I would basically configure the lambda image to contain a recent-ish workdir, then update it in the invocation (so if it is being reused, the update will be a noop).

So yeah, totally agree that you need caching, but actually it does that, they just don’t advertise it much.
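A minimal Python sketch of that reuse pattern, under several assumptions: the workdir path is hypothetical, a `git` binary is assumed to be available in the image, and instead of baking a workdir into the image this variant clones on a cold start. `handler(event, context)` follows the usual Python Lambda handler shape; the `job` key in the event is made up for illustration.

```python
import os
import subprocess

WORKDIR = "/tmp/riot-workdir"  # hypothetical path inside the container's scratch space
REPO_URL = "https://github.com/RIOT-OS/RIOT.git"  # assumed upstream repository


def ensure_workdir(path, repo_url, run=subprocess.check_call):
    """Return 'warm' if an existing checkout was reused, 'cold' if it was cloned."""
    if os.path.isdir(os.path.join(path, ".git")):
        # Container was reused: only fetch what changed (cheap if already current).
        run(["git", "-C", path, "fetch", "--quiet", "origin"])
        return "warm"
    # Fresh container: nothing cached yet, so pay for a full clone once.
    run(["git", "clone", "--quiet", repo_url, path])
    return "cold"


def handler(event, context):
    """Entry point sketch: refresh the cache, then report which path was taken."""
    state = ensure_workdir(WORKDIR, REPO_URL)
    # The actual "build application X for board Y" step would go here.
    return {"cache": state, "job": event.get("job")}
```

The `run` parameter exists only so the git calls can be swapped out for testing; how long a warm container actually survives is up to AWS, as noted above.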

Kaspar · 15 August 2017 09:41

Hi Michael,