Tweak CI settings
Our current CI configuration has two issues described below.
## ccache is used ineffectively
`gitlab-runner` cache is effectively a ZIP file passed between jobs. Creating a ZIP archive of a ccache directory, which can contain hundreds of thousands of files taking up several gigabytes of storage space, is a bad idea. Our current CI configuration also causes the ccache directory to be treated as a build artifact, which only makes things worse.
What we should do instead is stop using the `cache` directive altogether, create a ccache directory on each runner, and mount it inside containers. This will allow ccache data to persist between jobs without having to be packed into a ZIP file at the end of each job.
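A minimal sketch of what this could look like in each runner's `/etc/gitlab-runner/config.toml`, assuming the Docker executor; the host path `/var/cache/ccache` and the in-container mount point `/ccache` are placeholders to be adjusted per runner:

```toml
[[runners]]
  # Point ccache inside containers at the mounted directory
  # (placeholder value; must match the volume mount below).
  environment = ["CCACHE_DIR=/ccache"]
  [runners.docker]
    # Mount a persistent host directory into every container; ccache data
    # then survives container teardown without any ZIP packing/unpacking.
    volumes = ["/var/cache/ccache:/ccache:rw"]
```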
As a side note for posterity, it is critical for `/etc/gitlab-runner/config.toml` to contain `volumes = ["/cache"]` for the `cache` directive to actually work, at least with the Docker executor. Otherwise, `gitlab-runner` will just create a ZIP file when a job is finished, store it in `/cache/<namespace>/<project>`, and then tear the container down, obliterating the cache it just created. Adding the `volumes` line mentioned above causes `/cache` to be persisted.
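For reference, the relevant fragment (if the `cache` directive were kept in use) would be just:

```toml
[[runners]]
  [runners.docker]
    # Without this, /cache lives inside the container and is destroyed
    # together with it right after the ZIP archive is written there.
    volumes = ["/cache"]
```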
## `make` uses fixed concurrency settings
Our CI jobs currently always use `make -j6` for building and `make -j8` for running system tests. Meanwhile, concurrency settings need to be tweaked separately for each runner to ensure stability. We should modify our `.gitlab-ci.yml` to fetch the number of parallel `make` jobs to use from environment variables, which will subsequently be set by each runner through its configuration file.
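A sketch of the `.gitlab-ci.yml` side; the variable names `BUILD_JOBS` and `TEST_JOBS` are hypothetical, the shell fallbacks preserve today's fixed values, and the `check` target stands in for whatever actually runs the system tests:

```yaml
build:
  script:
    # Use runner-provided concurrency if set; fall back to the current -j6.
    - make -j"${BUILD_JOBS:-6}"

system-test:
  script:
    # Same idea for the test phase, falling back to the current -j8.
    - make -j"${TEST_JOBS:-8}" check
```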
There are three rules I can think of for tweaking concurrency-related settings:
- With ccache being used, building becomes more (though not fully, of course) I/O-bound than CPU-bound.
- As a rule of thumb, it should be assumed that each parallel system test being executed uses about 0.5 GB of RAM.
- Currently, total system test execution time plateaus around `make -j8` because some tests just take a long time to run.
Apart from the above, the number of concurrent jobs `make` is allowed to use while building and testing has to be considered together with the `concurrent` setting in `/etc/gitlab-runner/config.toml`, which limits the number of CI jobs allowed to be run concurrently at any time on a given runner.
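On the runner side, both knobs would then live in `/etc/gitlab-runner/config.toml`; the values below are purely illustrative, and `BUILD_JOBS`/`TEST_JOBS` are the hypothetical variable names assumed above:

```toml
# At most two CI jobs may run at the same time on this runner.
concurrent = 2

[[runners]]
  # Injected into every job's environment; picked up by .gitlab-ci.yml.
  environment = ["BUILD_JOBS=4", "TEST_JOBS=8"]
```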
Considering the above, an optimal CI machine would have:
- 8+ CPU cores: 8 cores allow concurrent compilation for two OS images using `make -j4` without putting too much strain on the host (ccache storage is not infinite, so we cannot rely on everything being cached beforehand), which sounds like a bare minimum; more cores would enable quicker pipeline completion in case multiple branches and/or more OS images are to be processed concurrently,
- fast storage (or lots of RAM, so that ramdisks can be used for ccache data and/or compilation), to let ccache shine,
- more than 8 GB of RAM: with tests for two OS images running concurrently, each image using `make -j8`, peak RAM utilization around 8 GB may occur (2 * 8 * 0.5 GB), which would lead to swapping; the more RAM the machine has, the more concurrent test-phase jobs (e.g. for different branches and/or different OS images) it will be able to handle (though more RAM by itself will not cause system tests to complete faster, because increasing the number of parallel jobs used while testing beyond 8 has virtually no effect).
In any case, there are a few trade-offs to consider here, so allowing runner-specific concurrency settings in our `.gitlab-ci.yml` sounds like a good idea no matter what machine(s) we will be using for CI in the end.