Friday, 20 January 2017

Rebuilding Nexus - what we did wrong and how we fixed it

Nexus is just a searchable cache for all the various jars you either downloaded from the internet or generated from your code-base.

There are two main reasons to use a repository manager like Nexus or Artifactory:

* save on network requests, and make your builds faster
* predictability of builds, to serve as the 'source of truth' for all your release artifacts

We recently had a problem with Nexus. It got totally f*cked. I'm not privy to all the gory details, but we seem to have lost a few files.

This shouldn't have been that big a deal, because we could just reconstruct Nexus from the files we already had in our Maven ~/.m2 directories: take a new Nexus instance and populate it with those files.

However, during the process we made a couple of errors, and it took us a few days as we discovered the problems day by day, depending on what projects we were rebuilding and what errors we were getting.

Nexus seems to have changed the way it works, and is no longer a system that can just be pointed at a file system to do a mass import. It looked like there wasn't any way to bulk-import an existing file system repository, so we had to import the files one by one.

Most of the contents of Nexus were jars that came from either the main Maven repository, or other repos like Apache, Spring or Atlassian, so getting those wasn't an issue.

The main things we had to repopulate were the internal jars that we'd released and uploaded into Nexus. One of the sysadmins wrote a script to recurse through the internal jars in the ~/.m2/repository/ directory on our build server and upload them into Nexus. However, the script had some bugs that left people with broken repositories.
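To give a feel for the traversal step, here is a minimal sketch in Python. It is not the sysadmin's actual script (which isn't shown here), and the `com/example` group prefix is a made-up placeholder for wherever your internal artifacts live.

```python
from pathlib import Path


def find_internal_jars(repo_root, group_path):
    """Return every .jar file under repo_root/group_path, sorted.

    repo_root  -- a Maven local repository, e.g. ~/.m2/repository
    group_path -- hypothetical directory prefix for internal artifacts,
                  e.g. "com/example" for groupIds starting with com.example
    """
    base = Path(repo_root) / group_path
    # A local repository is laid out as group/artifact/version/files,
    # so a recursive glob reaches every artifact file below the prefix.
    return sorted(p for p in base.rglob("*.jar") if p.is_file())


if __name__ == "__main__":
    repo = Path.home() / ".m2" / "repository"
    for jar in find_internal_jars(repo, "com/example"):
        print(jar)
```

Note that a traversal like this picks up every file ending in .jar, which is exactly the kind of assumption that caused trouble later.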

The things we did wrong were:

1. Uploading jars into Nexus without specifying the pom file. Nexus generated a pom file by default, but it only contained basic information like groupId, artifactId and version, with no dependencies.

Result: Transitive dependencies couldn't get resolved during builds in Maven, so compilation broke.

How We Found The Issue: Looking at the artifact directory in our local repo and reading the POM file, which listed no dependencies.

Fix: Update the deploy command to specify both the jar and the pom file.
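As a rough illustration of that fix, the sketch below builds a `mvn deploy:deploy-file` command that passes the jar's own pom via `-DpomFile`. It assumes the upload goes through the Maven deploy plugin and that the pom sits next to the jar with the same base name, as it does in a local repository; the `deploy_command` helper and the repository id and URL you pass in are placeholders, not our real setup.

```python
from pathlib import Path


def deploy_command(jar, repository_id, url):
    """Build a `mvn deploy:deploy-file` command that uploads a jar with its own pom.

    repository_id and url are placeholders for your Nexus settings.
    """
    jar = Path(jar)
    pom = jar.with_suffix(".pom")  # lib-1.0.jar -> lib-1.0.pom
    if not pom.is_file():
        # Without the real pom a generated one would be used instead,
        # and the dependency information would be lost.
        raise FileNotFoundError(f"no pom next to {jar}")
    return [
        "mvn", "deploy:deploy-file",
        f"-Dfile={jar}",
        f"-DpomFile={pom}",
        f"-DrepositoryId={repository_id}",
        f"-Durl={url}",
    ]
```

The returned list can be handed to `subprocess.run(cmd, check=True)`. Refusing to build a command when the pom is missing is a deliberate choice here: a loud failure is easier to deal with than a silently generated pom.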

2. Assuming that every .jar file was a Java binary jar. This was incorrect, because Maven builds can also produce jars containing sources or javadoc. This resulted in source or javadoc jars being mistakenly uploaded.

Result: Builds broke because class files couldn't be found.

How We Found The Issue: Looking at the .jar file in our local repo, opening it, and inspecting the contents. No .class files meant the wrong jar had been uploaded.

Fix: Update the command to exclude *-sources.jar and *-javadoc.jar from deploys.
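Both the manual check and the name-based exclusion are easy to sketch in Python. The function names below are made up for illustration. The suffixes follow Maven's usual classifier naming, where source jars end in `-sources.jar` and javadoc jars in `-javadoc.jar`.

```python
import zipfile
from pathlib import Path

# Classifier suffixes Maven conventionally uses for non-binary jars.
NON_BINARY_SUFFIXES = ("-sources.jar", "-javadoc.jar")


def is_binary_jar_name(jar):
    """True unless the file name marks the jar as sources or javadoc."""
    return not Path(jar).name.endswith(NON_BINARY_SUFFIXES)


def contains_classes(jar):
    """True if the jar (a zip archive) holds at least one .class file."""
    with zipfile.ZipFile(jar) as zf:
        return any(name.endswith(".class") for name in zf.namelist())
```

The name check is what you would use as the filter before deploying. The content check is better kept as a diagnostic, since a legitimate jar that only holds resources would also have no .class files.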
