Switching to a monorepo

Context

There comes a time in a programmer’s life when one has many intertwined projects, and in most cases those projects live in different repositories. This separation (which is necessary when publishing those projects) adds friction whenever a project that others depend on changes, as every reference to it must be updated too.

As an example, I wrote a small collection of useful Nix functions that are shared across several of my projects: https://git.hubrecht.ovh/hubrecht/nix-lib.

I import them using npins in my NixOS projects, so whenever nix-lib changes, I need to update the version referenced by npins. This two-step process (more than two steps in reality, since the new nix-lib version has to be tagged first) is quite cumbersome, given that the new version of nix-lib already lives on my computer, not far from the NixOS configuration using it.

One alternative is to store all of my projects in a single repository (hence monorepo) and reference them directly. Several organizations use this scheme, such as tvl (or Google and Facebook…), and it offers compelling simplifications.

Tooling

Problems arise, though, when one wants to share only part of this monorepo with the outside world while still allowing outside contributions. Tools exist for this purpose:

  • josh provides on-the-fly history filtering and partial repo fetches, but it requires everyone to use HTTPS to interact with the repositories.
  • mgt is a client-side tool that syncs remote repositories with the parts included in the monorepo. It is the one I chose to use.

Monorepo-git-tools

As implied by its name, this tool only works with git. It relies on several files (repo files), each describing the mapping between the monorepo and one remote repository. With mgt split-in, the source of a dependency can be added to a subfolder, which makes it possible to create a monorepo from multiple remote repositories while keeping their whole history. To export changes made in the monorepo to the remote repositories, the inverse command mgt split-out exists. Finally, mgt sync provides two-way synchronisation between the remote repositories and the monorepo.
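To make the three commands a bit more concrete, here is a minimal sketch of how they could be wrapped for a single repo file. The file name nix-lib.rf and the function names are only illustrative assumptions, and I pass no flags beyond the repo file itself; check mgt's own help output for the options your version supports.

```shell
#!/bin/sh
# Hypothetical repo file describing the monorepo <-> remote mapping.
REPO_FILE="nix-lib.rf"

# Bring the remote repository's history into the monorepo,
# rewritten to the paths mapped in the repo file.
import_from_remote() {
    mgt split-in "$REPO_FILE"
}

# Rewrite the monorepo history so that only the mapped paths remain,
# ready to be sent to the remote repository.
export_to_remote() {
    mgt split-out "$REPO_FILE"
}

# Two-way synchronisation between the monorepo and the remote.
sync_with_remote() {
    mgt sync "$REPO_FILE"
}
```

Calling import_from_remote, export_to_remote or sync_with_remote from the root of the monorepo then runs the matching mgt subcommand on that repo file.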

My experience

I started with multiple repositories that I wanted to fuse into a monorepo. For example, I decided to store nix-lib in the lib/ directory of the monorepo. The command to download it is simple:

mgt split-in-as --gen-repo-file --as lib/ https://git.hubrecht.ovh/hubrecht/nix-lib

This also creates the corresponding repo file, nix-lib.rf:

[repo]
remote = "https://git.hubrecht.ovh/hubrecht/nix-lib"
branch = "main"

[include_as]
"lib/" = " "

This command leaves you on an orphan git branch named nix-lib, corresponding to the remote repository. You then have to return to the main branch, rebase it on top of nix-lib, and delete the temporary branch:

git switch main
git rebase nix-lib

git branch -D nix-lib

The same procedure applies to every repository you want to incorporate into your monorepo.
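If you want to see what the rebase step does before touching a real monorepo, the following sketch replays it with plain git in a throwaway repository. It does not call mgt at all: the orphan branch is created by hand to stand in for the one left by the import, and the file names and commit messages are made up for the demonstration.

```shell
#!/bin/sh
set -eu

# Throwaway repository standing in for the monorepo.
demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/main   # start on main whatever init.defaultBranch says
git config user.name "Demo"
git config user.email "demo@example.invalid"

echo "monorepo" > README
git add README
git commit -q -m "monorepo: initial commit"

# Stand-in for the imported history: an orphan branch with no common ancestor.
git switch -q --orphan nix-lib
mkdir lib
echo "{ }" > lib/default.nix
git add lib
git commit -q -m "nix-lib: initial commit"

# Same steps as in the post: back to main, rebase, drop the temporary branch.
git switch -q main
git rebase -q nix-lib
git branch -q -D nix-lib

# main now contains both histories in a single line of commits.
git log --format=%s
```

After the rebase, main holds both README and lib/default.nix, and its own commits sit on top of the imported ones.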

Beware of the git setting pull.rebase.

If it is set to true in your monorepo, you will run into the error “Updating an unborn branch with changes added to the index.”, which does not yield many results on the web…

The solution is simply to run:

git config pull.rebase false
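Since pull.rebase is often enabled globally, it can help to check where the value comes from and to confirm that the override only affects the current repository. The sketch below does this in a throwaway repository; enabling the setting first is only there to simulate the problematic configuration.

```shell
#!/bin/sh
set -eu

# Throwaway repository for the demonstration.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Simulate the problematic configuration.
git config pull.rebase true

# Show the effective value together with the file that defines it.
git config --show-origin --get pull.rebase

# Override it for this repository only; a global value stays untouched.
git config --local pull.rebase false

# Print the value git will now use in this repository.
git config --get pull.rebase
```

In a real monorepo only the two inspection commands and the --local override are needed.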