hckr.fyi // thoughts

My Monorepo Broke My CI/CD (GitHub Actions with a Monorepository)

by Michael Szul on

We continue our philosophical journey with monorepositories in code architecture. In theory, a good concept with debatable merit in some areas: What about in practice? Let's keep looking.

The biggest problem with a monorepository is that it's good for development but bad for deployment. Why do I say this? In a traditional CI/CD setup, every time you check in code, that code gets a series of tasks run against it, and then is deployed to a server (if those tasks/checks are successful). But if you have a monorepository, which projects in that repository get deployed? You certainly don't want everything deployed each time.

Let's look at some scenarios with the following tech stack:

If I have a basic GitHub Action that checks out my code, runs some NPM tasks, and then deploys it, it's going to be running on a branch and an event. This means that every push to that branch triggers the action, no matter which project in the repository changed. That's not really what I want. I want changes to an individual application to build and release only that application.

We can start by using the paths property for GitHub Actions. This will mean potentially creating one GitHub workflow for each application and package in your monorepository, but let's face it, you'll likely need some of that granular control anyway. With the paths property, the action will only trigger when a change is made to a file under that path.

    name: LMS DEV CI
    on:
      push:
        branches: [dev]
        paths: 
          - 'apps/lms/**'
    

This solves the triggering of a build and release, but here's the thing. Since it's a monorepository, what do we do about actually releasing it? No problem, we'll just cd into that directory using the working-directory property and package that up so we're not packaging up the entire site.

    ...
    jobs:
      build:
        runs-on: [windows-latest]
        steps:
          ...
          - run: npm ci
            working-directory: 'apps/lms'
          - run: npm run build
            working-directory: 'apps/lms'
          - run: npm run test
            working-directory: 'apps/lms'
    

To eliminate redundancy:

    ...
    jobs:
      build:
        runs-on: [windows-latest]
        steps:
          ...
          - name: Build applications
            run: |
              npm ci
              npm run build
              npm run test
            working-directory: 'apps/lms'
    

But now we have another problem. We're using Lerna, remember? With Lerna, we made cross-dependencies easier for developers, but if you look at the package.json file in your application's directory, you'll see that any cross-dependencies actually point to the file path of that dependency. This likely means that the dependency lives outside of the directory you're currently in.

"@uvasomit/common": "file:../../packages/common",
    

Why is this a problem? Because if you look in the node_modules folder for your application, that cross-dependency is missing. Lerna "installs" the dependency as a file reference, and Node.js searches for dependencies by working its way up the directory chain. If we isolate our application by taking just what's in the application's directory and packaging that up for deployment, we're going to have missing files.
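To make that lookup concrete, here's a rough sketch of the directories Node.js checks for a bare import. The nodeModulesLookupPaths helper and the /repo path are made up for illustration, and the sketch leaves out the global folders Node.js can also consult:

```javascript
const path = require('path');

// Hypothetical helper: list the node_modules directories Node.js would
// check, in order, for a bare import made from a file inside `startDir`.
function nodeModulesLookupPaths(startDir) {
    const candidates = [];
    let current = path.resolve(startDir);
    while (true) {
        // Node.js does not look for node_modules inside a node_modules folder.
        if (path.basename(current) !== 'node_modules') {
            candidates.push(path.join(current, 'node_modules'));
        }
        const parent = path.dirname(current);
        if (parent === current) break; // reached the filesystem root
        current = parent;
    }
    return candidates;
}

// Assumed layout: the monorepo is checked out at /repo.
console.log(nodeModulesLookupPaths('/repo/apps/lms'));
```

Only the first directory in that list sits inside apps/lms. Anything that resolves from a directory further up the chain is left behind when we package the application's folder by itself.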

This is where you have to be careful with Lerna cross-dependencies. Lerna works great as a monorepo tool for development, and it does have the ability to publish. But cross-dependencies in sub-folders result in file references without node_modules entries, and hoisted references end up in the root of the monorepo. The monorepo itself will work fine, but once the application's individual directory is deployed, the references and their files won't be available in the way you would normally expect.

This points to one of the cons of a monorepo. Although more efficient for development, it requires some customization in your DevOps process.

For deployment, what I need is for all of the dependencies for an individual application to be included in that application's node_modules folder. This means that hoisting is okay for typings files or development utilities, but not for utilities used within the application (such as Jake, which we'll get to shortly). It also means that cross-dependencies can't be hoisted, and the dependencies as a whole need to be in the node_modules folder of the individual application.
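One way to catch this before a release is a small pre-deployment check. This isn't something Lerna provides. It's a minimal sketch, and findUnbundledDependencies is a hypothetical helper that lists the dependencies from an application's package.json that are either missing from its node_modules or only linked to a location outside the application's directory:

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical check: return the names of dependencies declared in
// <appDir>/package.json that would not travel with the app directory.
function findUnbundledDependencies(appDir) {
    const appRoot = fs.realpathSync(appDir);
    const manifest = JSON.parse(
        fs.readFileSync(path.join(appRoot, 'package.json'), 'utf8')
    );
    return Object.keys(manifest.dependencies || {}).filter((name) => {
        // Scoped names such as @scope/pkg map to nested folders.
        const installedAt = path.join(appRoot, 'node_modules', ...name.split('/'));
        if (!fs.existsSync(installedAt)) {
            return true; // not installed in this app's node_modules at all
        }
        // A link that resolves outside the app directory will be broken
        // once the app directory is packaged on its own.
        const realLocation = fs.realpathSync(installedAt);
        return !realLocation.startsWith(appRoot + path.sep);
    });
}
```

A build step could call findUnbundledDependencies('apps/lms') and fail the job if the returned array isn't empty.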

As with the example above, in one of my applications at work, the following dependency is linked:

"@uvasomit/common": "file:../../packages/common",
    

But after the global build of our monorepo is complete in GitHub Actions, I need to very specifically deploy the individual application I'm targeting.

Why don't we just build the application by itself? Why do a build of the entire monorepo? Isn't that inefficient? Yes, but remember that we've linked dependencies by file path, and we need the latest packages/common in our application's project. This means we need to be sure to build packages/common as well as our application. We could decide to build just those two (or more if there are more cross-dependencies), but since we're linking cross-dependencies, any update to packages/common could work for our application yet break another application that depends on it. We want to force our developers to fix these errors, so we need to run the build checks on all applications and reject anything that fails, even if we're only working on one of the applications in the monorepo.
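To see how far a change to a shared package reaches, you can walk the file: references. Here's a minimal sketch under some assumptions: findDependents is a hypothetical helper, the manifests are passed in as plain objects rather than read from disk, and apps/other-app and apps/standalone are invented for the example. It only follows direct references, not dependencies of dependencies.

```javascript
const path = require('path');

// Hypothetical helper: given each project's directory and package.json
// contents, return the directories of the projects that link to
// `changedDir` through a file: dependency.
function findDependents(projects, changedDir) {
    const target = path.normalize(changedDir);
    return Object.entries(projects)
        .filter(([dir, manifest]) =>
            Object.values(manifest.dependencies || {}).some((spec) =>
                spec.startsWith('file:') &&
                path.normalize(path.join(dir, spec.slice('file:'.length))) === target
            )
        )
        .map(([dir]) => dir);
}

// Invented manifests for illustration only.
const projects = {
    'apps/lms': { dependencies: { '@uvasomit/common': 'file:../../packages/common' } },
    'apps/other-app': { dependencies: { '@uvasomit/common': 'file:../../packages/common' } },
    'apps/standalone': { dependencies: {} },
};

console.log(findDependents(projects, 'packages/common'));
```

Every directory this returns is an application whose build checks should run when packages/common changes, which is why a paths filter that only lists the application's own folder isn't enough by itself.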

Essentially, we need to isolate our application, converting the monorepo on build to a single project with all its dependencies. This means some custom tasks. You can do this with NPM scripts. You can do this with shell scripts. You can do this with Grunt.

I prefer Jake.

Why? Jake follows in the tradition of Make and every other [SOME LETTER]ake tool. It's very simple. It's very fast.

With Jake, we can set up a simple build task for deployments that copies over any necessary files. For example:

    const { desc, task } = require('jake');
    const fs = require('fs-extra');
    const path = require('path');

    desc('Copy task to copy @uvasomit/common to base packages.');
    task('copy', async () => {
        const source = path.join(__dirname, 'packages', 'common');
        const target = path.join(__dirname, 'apps', 'my-app', 'node_modules', '@uvasomit', 'common');
        console.log(`Copying to ${ target }`);
        try {
            // Await the copy so the task isn't treated as finished early.
            await fs.copy(source, target);
            console.log(`Finished copying to ${ target }`);
        } catch (err) {
            console.error(`Failed to copy to ${ target }: ${ err }`);
            throw err; // rethrow so the CI step fails instead of passing silently
        }
    });
    

This is an over-simplification, but it basically copies some necessary files from a cross-dependency directly into the node_modules of the current application. You can use Jake and something like fs-extra to ensure that all necessary files are packaged together with your deployment. Then you can use the working-directory property in GitHub Actions, and a simple script in your package.json to ensure your package has correctly bundled the assets.
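If you have more than one cross-dependency, you could drive the copy from package.json instead of hard-coding each path. The following is only a sketch of that idea: bundleLocalDependencies is a hypothetical helper, and it uses Node's built-in fs.cpSync (which assumes Node.js 16.7 or later) in place of fs-extra so that it has no dependencies of its own.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: copy every file: dependency of the app in `appDir`
// into that app's own node_modules, and return the names that were copied.
function bundleLocalDependencies(appDir) {
    const manifest = JSON.parse(
        fs.readFileSync(path.join(appDir, 'package.json'), 'utf8')
    );
    const copied = [];
    for (const [name, spec] of Object.entries(manifest.dependencies || {})) {
        if (!spec.startsWith('file:')) continue; // registry dependency, leave it alone
        const source = path.resolve(appDir, spec.slice('file:'.length));
        const target = path.join(appDir, 'node_modules', ...name.split('/'));
        // Remove any existing entry first (for example a link back to the source).
        fs.rmSync(target, { recursive: true, force: true });
        fs.mkdirSync(path.dirname(target), { recursive: true });
        fs.cpSync(source, target, { recursive: true });
        copied.push(name);
    }
    return copied;
}
```

Inside a Jake task, that would be a single call such as bundleLocalDependencies(path.join(__dirname, 'apps', 'my-app')), with any thrown error failing the task.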

We'll talk about bundling assets in a future post, but the point of this post is that monorepositories work great for developing, but when we move from development to a CI/CD environment, there is some configuration and custom build-processing necessary to appropriately package your application for delivery. Although this is a drawback of monorepositories, the truth is, a well-planned CI/CD process is likely going to require some customization anyway.