Monorepository Example (A GitHub Migration)

by Michael Szul on Sat Apr 25 2020 09:58:08

Debating monorepositories vs. multiple repositories is great, but it's mostly abstract philosophy and posturing. Each use case is different, and the right choice depends on your stack, your company processes, and how your team operates, so let's walk through an example and look at the reasoning behind it.

The company I work for has mostly been a .NET shop, but with a recent project, there was a time-crunch where we needed to start up a pilot in a limited amount of time. This pilot was mostly read-only data, and the stored procedures were already written. I decided to take those SQL stored procedures (they were written as FOR XML procedures), and convert them to FOR JSON SQL statements, and use JavaScript (TypeScript, actually) to power the application. This meant building an Express application in TypeScript.

The pilot was successful, and there were two other developers on the support team (a different team from mine) whose backgrounds in PHP and ColdFusion lent themselves better to scripting languages than to object-oriented, compiled languages. Both had some experience with JavaScript and TypeScript from front-end work.

I made the executive decision to use the Node.js stack for two of our upcoming projects because of resource constraints, leaning on the skills of these two support team members. They were talented programmers anyway, so it would have been poor stewardship of resources to let that go to waste.

As one project became two and two projects became three, there was a lot of shared code being passed around, so we did what any good team would do, and we refactored the shared code into a library. This left us with three application repositories and one package repository. The latter was published via Team Foundation Server's (Azure DevOps) private packages.

There was a problem though: Any time a bug was found in the common package, someone had to switch repositories, make the change in the package, check it in, wait for the CI/CD pipeline to run and publish the new version, then switch back to their own application, and reinstall the package. That's a lot of movement. On top of that, the other two applications would then be out-of-date. This wasn't ideal. At the same time, we were in the process of moving from an on-premises Team Foundation Server instance to GitHub for our source control repository and continuous integration and deployment (CI/CD) pipeline.

I'm a member of the Bot Builder Community, and when we designed the structure of its GitHub organization and repositories, Gary Pretty wanted to follow the same structure as the Bot Framework repositories since we were building out extensions and support for that framework. This meant building a monorepository (one monorepository for each supported language, actually). Mick Vleeshouwer then decided to port over the Lerna implementation from the Bot Framework Node.js repository. This gave me enough familiarity with monorepository structures that I decided to create an implementation for our own monorepository.

Here's a look at our lerna.json file:

{
    "npmClient": "npm",
    "packages": [
        "packages/**",
        "apps/**"
    ],
    "version": "independent",
    "command": {
        "publish": {
            "ignoreChanges": [
                "*.md",
                "example/**/*.*",
                ".npmignore",
                "LICENSE",
                "tsconfig.json"
            ],
            "message": "chore(release): publish"
        }
    }
}

Two things to pay attention to here:

The version is set to independent. This means that each individual package is versioned separately. Not all packages will be incremented at the same time.
We have two folders we're watching as "packages" with the first being an actual packages folder (this is where the common package is kept) and the other being an apps folder where the Express web applications will be kept.

Meanwhile, our root level package.json has the following scripts:

"scripts": {
    "postinstall": "lerna bootstrap --no-ci && jake default",
    "build": "lerna run build",
    ...
},

When npm install is run at the root, the postinstall script then runs lerna bootstrap (we'll get to jake in a moment). When we run npm run build, npm runs lerna run build. The second operation runs npm run build in each package that defines a build script, but more importantly, the first operation runs npm install in each of the package directories and then links any cross-dependencies. This means that any of the applications that use the common package will use the copy within the monorepository and not some external package.
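To make the linking step concrete, here is a minimal TypeScript sketch of the decision it comes down to: a dependency that matches a package inside the monorepository is linked locally, and everything else comes from the registry. This is a toy model, not Lerna's implementation (Lerna also checks that the local version satisfies the requested range), and the package name @example/common and the version ranges are hypothetical.

```typescript
// Toy model of the cross-dependency linking described above.
// Not Lerna's code: version-range checks are deliberately left out.
interface WorkspacePackage {
  name: string;
  dependencies: Record<string, string>;
}

type Resolution = "linked-from-monorepo" | "installed-from-registry";

// For each dependency of `pkg`, decide whether it would come from a sibling
// package in the monorepository or from the npm registry.
function resolveDependencies(
  pkg: WorkspacePackage,
  workspace: WorkspacePackage[]
): Record<string, Resolution> {
  const result: Record<string, Resolution> = {};
  for (const dep of Object.keys(pkg.dependencies)) {
    const isLocal = workspace.some((candidate) => candidate.name === dep);
    result[dep] = isLocal ? "linked-from-monorepo" : "installed-from-registry";
  }
  return result;
}

// Hypothetical names and versions, for illustration only.
const commonPackage: WorkspacePackage = {
  name: "@example/common",
  dependencies: {},
};
const lmsApp: WorkspacePackage = {
  name: "lms",
  dependencies: { "@example/common": "^1.0.0", express: "^4.17.1" },
};

const resolved = resolveDependencies(lmsApp, [commonPackage, lmsApp]);
console.log(resolved);
```

The practical effect is that a fix made in the common package's folder is what the application picks up, with no publish and reinstall cycle in between.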

What's the --no-ci switch? In a CI/CD environment, lerna bootstrap will call npm ci instead of npm install by default. The --no-ci switch forces npm install instead. I do this for testing/debugging purposes to make sure the same commands are run in each environment in the event the build is broken.
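The rule is small enough to write down as a function. The sketch below is only a model of the behavior described above, not Lerna's implementation, and the function and parameter names are made up for illustration.

```typescript
// Toy model of how the install command is chosen during bootstrap.
// The names here are hypothetical; this is not Lerna's code.
type InstallCommand = "npm ci" | "npm install";

function chooseInstallCommand(
  isCiEnvironment: boolean,
  noCiFlag: boolean
): InstallCommand {
  // --no-ci always wins: use npm install everywhere.
  if (noCiFlag) {
    return "npm install";
  }
  // Otherwise a CI environment gets npm ci, and a developer machine gets npm install.
  return isCiEnvironment ? "npm ci" : "npm install";
}

const onBuildServer = chooseInstallCommand(true, false);
const onBuildServerWithNoCi = chooseInstallCommand(true, true);
const onDeveloperMachine = chooseInstallCommand(false, false);
console.log(onBuildServer, onBuildServerWithNoCi, onDeveloperMachine);
```

With the switch set, the build server and a developer machine run the same command, which is the point when trying to reproduce a broken build locally.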

With this monorepository setup, we're able to run each of the applications while utilizing the common package, and we can make updates to the common package when necessary without switching repositories or reinstalling dependencies. In practice, this saves developer time. When programmers are using internal company code, they should not have to worry about dependency management between applications, and they should be able to switch applications with ease.

Another reason for the monorepository switch is that we use Visual Studio Code. With Visual Studio Code, we set up our launch.json file to point to the appropriate application directory and environment file:

{
    "version": "0.2.0",
    "configurations": [
        {
            "type": "node",
            "request": "launch",
            "name": "Launch LMS Program",
            "program": "${workspaceFolder}/apps/lms/dist/index.js",
            "preLaunchTask": "build-lms",
            "outFiles": [
                "${workspaceFolder}/apps/lms/dist/**/*.js"
            ],
            "envFile": "${workspaceFolder}/apps/lms/.env"
        },
        {
            "type": "node",
            "request": "launch",
            "name": "Launch OTS Program",
            "program": "${workspaceFolder}/apps/ots/dist/index.js",
            "preLaunchTask": "build-ots",
            "outFiles": [
                "${workspaceFolder}/apps/ots/dist/**/*.js"
            ],
            "envFile": "${workspaceFolder}/apps/ots/.env"
        },
        ...
        {
            "type": "node",
            "request": "launch",
            "name": "Mocha Tests",
            "program": "${workspaceFolder}/node_modules/mocha/bin/mocha",
            "args": [
                "--inspect-brk",
                "${workspaceFolder}/test/**/*.js"
            ],
            "port": 9229,
            "internalConsoleOptions": "openOnSessionStart"
        }
    ]
}

Doing so means that developers can build (thanks to the preLaunchTask property) and launch any of the applications in the monorepository with a simple change to a dropdown menu item and a click of a button.
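Each application entry in that file follows the same pattern, which a short TypeScript sketch can spell out. The helper below is hypothetical (it is not part of our tooling) and assumes every application lives under apps/<folder>, builds to dist/index.js, and has a matching build-<folder> task label defined in tasks.json, which isn't shown in this post.

```typescript
// Shape of one application entry in launch.json, as used above.
interface LaunchConfiguration {
  type: "node";
  request: "launch";
  name: string;
  program: string;
  preLaunchTask: string;
  outFiles: string[];
  envFile: string;
}

// Hypothetical helper: builds the launch entry for an app under apps/<folder>.
function launchConfigFor(
  folder: string,
  displayName: string
): LaunchConfiguration {
  // A plain string (not a template literal) keeps ${workspaceFolder} as
  // literal text for Visual Studio Code to substitute later.
  const appRoot = "${workspaceFolder}/apps/" + folder;
  return {
    type: "node",
    request: "launch",
    name: "Launch " + displayName + " Program",
    program: appRoot + "/dist/index.js",
    preLaunchTask: "build-" + folder,
    outFiles: [appRoot + "/dist/**/*.js"],
    envFile: appRoot + "/.env",
  };
}

const lmsLaunch = launchConfigFor("lms", "LMS");
console.log(JSON.stringify(lmsLaunch, null, 4));
```

Adding another application to the dropdown is then a matter of one more entry with a different folder name, provided its build task exists.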

Again, this is just one example of where a monorepository makes sense, but there are some caveats. For example, what was up with that jake command in the scripts property? In the next DevOps post, we'll take a look at these caveats, which stem from Lerna as a tool, CI/CD as a process, and the need to deploy individual applications to a server.
