Rethinking the Vieter project

Part of the Rieter series

I’ve been meaning to recreate my Vieter project for a while. The codebase is full of technical debt, and I’ve grown dissatisfied with the language it was originally written in. That’s where the Rieter project comes in: a full reimagining and reimplementation of the core ideas of the project, in Rust. This time around, however, I’m taking a different approach.

My plan is to develop the project in two stages. The first stage involves creating a well-designed general-purpose repository server. This includes serving and storing packages, as well as providing a REST API and web UI to interact with the repository packages. In this stage I’ll also add mirroring functionality to allow a Rieter server to automatically maintain a local copy of another repository. This could be used to easily create another mirror for a distribution’s servers, or perhaps to create a local mirror for faster downloads.

Once the first stage is finished, we have a solid foundation on which we can build the second stage: the build system. This will involve redesigning the agent-server architecture that’s currently used in Vieter, with the goal of completely replacing Vieter in due time.

This post is the first in a hopefully plentiful series of devlogs for this project where I’ll document my progress along the way.

Current progress

The implementation of the repository server itself is almost done. A user can publish, request and remove packages for any number of repositories and architectures. Repositories are then further grouped into distributions, allowing a single server to be used for multiple distributions if need be (e.g. I would have arch and endeavouros as distributions on my personal server). A package’s information is added to the database, and this data is then exposed via a paginated REST API.

The only real hurdle left for a first release is concurrency, which brings with it a couple of problems. With the current implementation, it’s possible for concurrent uploads of packages to corrupt the repository. The generation of the package archives happens inside the request handler for each upload, meaning multiple requests basically do duplicate work and can cause CPU usage spikes. The parsing of packages is also done inside the request handler, which once again causes the server to spike in CPU usage if multiple packages are uploaded in parallel. These things combined make concurrent uploads of packages a rather painful problem to deal with.

My solution for these problems consists of two parts. First, I want to add a queueing system for new packages. Instead of parsing the packages directly in the request handler, they would get added to a queue, with the server then responding with a 202 Accepted. The actual parsing would be done asynchronously by a configurable number of worker threads.
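The queueing idea can be sketched with nothing but the standard library. This is a minimal sketch under assumptions, not Rieter's actual code: `UploadedPackage`, `ParsedPackage`, `ParseQueue` and `parse_package` are hypothetical names, and the "parsing" is a placeholder that only strips the file extension. The request handler would call `enqueue` and immediately answer with the returned 202, while a fixed pool of worker threads drains the queue.

```rust
use std::sync::mpsc::{self, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Hypothetical record for an uploaded file that still has to be parsed.
#[derive(Debug, Clone, PartialEq)]
pub struct UploadedPackage {
    pub repo: String,
    pub file_name: String,
}

/// Hypothetical result of parsing a package.
#[derive(Debug, Clone, PartialEq)]
pub struct ParsedPackage {
    pub repo: String,
    pub name: String,
}

/// Placeholder for the real (CPU-heavy) parsing: keeps the text before the first dot.
fn parse_package(pkg: &UploadedPackage) -> ParsedPackage {
    let name = pkg.file_name.split('.').next().unwrap_or("").to_string();
    ParsedPackage { repo: pkg.repo.clone(), name }
}

/// A queue of uploads drained by a configurable number of worker threads.
pub struct ParseQueue {
    sender: Option<Sender<UploadedPackage>>,
    workers: Vec<JoinHandle<()>>,
}

impl ParseQueue {
    /// Starts `worker_count` workers that send their results to `results`.
    pub fn new(worker_count: usize, results: Sender<ParsedPackage>) -> Self {
        let (sender, receiver) = mpsc::channel::<UploadedPackage>();
        // The receiving end is shared between workers behind a mutex.
        let receiver = Arc::new(Mutex::new(receiver));
        let workers = (0..worker_count)
            .map(|_| {
                let receiver = Arc::clone(&receiver);
                let results = results.clone();
                thread::spawn(move || loop {
                    // The lock is released at the end of this statement,
                    // so parsing itself runs in parallel across workers.
                    let job = receiver.lock().unwrap().recv();
                    match job {
                        Ok(pkg) => {
                            let _ = results.send(parse_package(&pkg));
                        }
                        // The queue was closed and is empty: stop the worker.
                        Err(_) => break,
                    }
                })
            })
            .collect();
        Self { sender: Some(sender), workers }
    }

    /// Called from the request handler; returns the HTTP status to respond with.
    pub fn enqueue(&self, pkg: UploadedPackage) -> u16 {
        self.sender
            .as_ref()
            .expect("queue is open")
            .send(pkg)
            .expect("workers are running");
        202 // Accepted: parsing happens later
    }

    /// Closes the queue and waits for the workers to finish pending jobs.
    pub fn shutdown(mut self) {
        drop(self.sender.take());
        for worker in self.workers.drain(..) {
            worker.join().unwrap();
        }
    }
}
```

Because the number of workers is fixed, a burst of uploads can never occupy more than `worker_count` threads with parsing, however many requests arrive at once.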

The second part involves serializing the generation of the package archives and deferring it until needed. Instead of generating the package archives for each uploaded package, we simply notify a central worker thread that the repository has been altered. This worker would then generate the package archives, after ensuring the queue is empty and no new packages have arrived in the last n seconds. This pattern accounts for groups of packages being uploaded at once without needlessly stressing the server.
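The "wait until things have been quiet for n seconds" part is essentially a debounce, which can be sketched as follows. Again, this is an illustration under assumptions rather than the real implementation: `spawn_archive_worker` and the `regenerate` callback are hypothetical names, and the additional check that the parse queue is empty is left out to keep the example short.

```rust
use std::sync::mpsc::{self, RecvTimeoutError, Sender};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Spawns a single hypothetical archive worker.
///
/// Every send on the returned channel means "the repository was altered".
/// `regenerate` runs once per burst of notifications, after no new
/// notification has arrived for the duration of `quiet`.
pub fn spawn_archive_worker<F>(quiet: Duration, mut regenerate: F) -> (Sender<()>, JoinHandle<()>)
where
    F: FnMut() + Send + 'static,
{
    let (notify, changes) = mpsc::channel::<()>();
    let handle = thread::spawn(move || {
        // Sleep until the repository is altered; stop once all senders are gone.
        while changes.recv().is_ok() {
            // Absorb follow-up notifications until the quiet period passes.
            loop {
                match changes.recv_timeout(quiet) {
                    Ok(()) => continue,
                    Err(RecvTimeoutError::Timeout) => break,
                    // No more senders: still regenerate for the pending change.
                    Err(RecvTimeoutError::Disconnected) => break,
                }
            }
            // Only this thread ever generates archives, so runs never overlap.
            regenerate();
        }
    });
    (notify, handle)
}
```

Since a single thread owns the archive generation, two generations can never run at the same time, and a batch of fifty uploads results in one regeneration instead of fifty.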

By implementing these features, the server should be able to handle a large number of package uploads without using excessive resources, ensuring Rieter can scale to proper sizes.

First release

Once this is implemented, the codebase should be ready for a 0.1.0 release! This version will already be usable as a fully-fledged repository server on which I can then build the other parts of the first stage.

For the 1.0 release, I’ll be adding a web UI, as this was something that I was sorely missing from Vieter. Perhaps most exciting of all, automatic mirroring will also be added, which I’m definitely looking forward to! I hope to publish another post here soon, but until then, thanks for reading.