Let's Package jQuery: A Javascript Packaging Dystopian Novella

By Christine Lemmer-Webber on Fri 01 May 2015

The state of packaging of libre web applications is, let's face it, a sad one. I say this as one of the lead authors of one such libre web application myself. It's just one component of why deploying libre web applications is also such a sad state of affairs (hence userops). It doesn't help that, for a long time, the status quo in all free software web applications (and indeed all web applications) was to check javascript and similar served-to-client web assets straight into your repository. This is as bad as it sounds, and leads to an even further disconnect (one of many) between the packages that a truly free distro might include (and have to manually link in after the fact) and those of your own package. Your package is likely to become stuck on a totally old version of things, and that's no good.

So, in an effort to improve things, MediaGoblin and many other projects have kicked the bad habit of including such assets directly in our repository. Unfortunately, the route we are taking to do this in the next release is to make use of npm and bower. I really did not want to do this... our docs already include instructions to use Python's packaging ecosystem and virtualenv, which is fine for development, but since we don't have proper system packaging, this means that this is the route users go for deployment as well. Which I guess would be fine, except that my experience is that language package managers break all the time, and when they break, they generally require an expert in that language to get you out of whatever mess you're in. So we added more language package management features... not so great. Now users are even more likely to hit language package management problems, and these now include ones our community has less expertise in helping debug.

But what can we do? I originally thought of home-rolling our own solution, but as others rightly pointed out, this would be inventing our own package manager. So, we're sucking it up and going the npm/bower route.

But wait! There may be a way out... recently I've been playing with Guix quite a bit, and I came to realize that, at least for myself in development, it could be nice to have all the advantages of transactional rollbacks and so on. There is a really nice feature in Guix called guix environment which is akin to a "universal virtualenv" (also similar to JHBuild in GNOME land, but not tied to GNOME specifically)... it can give you an isolated environment for hacking, except not just restricted to Python or Javascript or Ruby or C... great! (Nix has something similar called nix-shell.) I know that I can't expect that Guix is usable for everyone right now, but for many, maybe this could be a nice replacement for Virtualenv + Bower, something I wrote to the mailing list about.

(As an aside, the challenge wasn't the "virtualenv" type side of things (pulling in all the server-side dependencies)... that's easy. The challenge is replacing the Bower part: how to link in all the statically-served assets from the Guix store right into the package? It's kind of a dynamic linking problem, but for various reasons, linking things into the package you're working on is not really easy to do in a functional packaging environment. But thanks to Ludo's advice and thanks to g-expressions, things are working!)

I'm happy to say that today, thanks to the help from the list, I came up with such a Virtualenv + Bower replacement prototype using "guix environment". And of course I wanted to test this on MediaGoblin. So here I thought, well, how about just for tonight I test on something simple. How about jQuery? How hard could that be? I mean, it just compiles down to one file, jquery.js. (Well, two... there's also jquery.min.js...)

Luckily, Guix has Node, so it has npm. Okay, the docs say to do the following:

# Enter the jquery directory and run the build script:
cd jquery && npm run build

Okay, it takes a while... but it worked! That seemed surprisingly easy. Hm, maybe too easy. Remember that I'm building a package for a purely functional distribution: we can't have any side effects like fetching packages from the web, every package used has to be an input and also packaged for Guix. We need dependencies all the way up the tree. So let's see, are there any dependencies? There seems to be a node_modules directory... let's check that:

cwebber@earlgrey:~/programs/jquery$ ls node_modules/
commitplease          grunt-contrib-uglify  grunt-npmcopy        npm                   sinon
grunt                 grunt-contrib-watch   gzip-js              promises-aplus-tests  sizzle
grunt-cli             grunt-git-authors     jsdom                q                     testswarm
grunt-compare-size    grunt-jscs-checker    load-grunt-tasks     qunitjs               win-spawn
grunt-contrib-jshint  grunt-jsonlint        native-promise-only  requirejs

Yikes. Okay, that's 24 dependencies... that'll be a long night, but we can do it.

Except, wait... I mean, there's nothing so crazy here as in dependencies having dependencies, is there? Let's check:

cwebber@earlgrey:~/programs/jquery$ ls node_modules/grunt/node_modules/
async          eventemitter2  glob               iconv-lite  nopt
coffee-script  exit           grunt-legacy-log   js-yaml     rimraf
colors         findup-sync    grunt-legacy-util  lodash      underscore.string
dateformat     getobject      hooker             minimatch   which

Oh hell no. Okay, jeez, just how many of these node_modules directories are there? Luckily, it's not so hard to check (apologies for the hacky bash pipes which are to follow):

cwebber@earlgrey:~/programs/jquery$ find node_modules -name "node_modules" | wc -l
158

Okay, yikes. There are 158 dependency directories that were pulled down recursively. Wha?? To look at the list is to look at madness. Okay, how many unique packages are in there? Let's see:

cwebber@earlgrey:~/programs/jquery$ find node_modules -name "node_modules" -exec ls -l {} \; | grep -v total | awk '{print $9}' | sort | uniq | wc -l
265

No. Way. 265 unique packages (the list in its full glory), all to build jquery! But wait... there were 158 node_modules directories... each one of these could have its own repeat of say, the minimatch package. How many non-unique copies are there? Again, easy to check:

cwebber@earlgrey:~/programs/jquery$ find node_modules -name "node_modules" -exec ls -l {} \; | grep -v total | awk '{print $9}' | wc -l
493

So on average each package shows up nearly twice (493 copies of 265 unique packages). Hrm... (Update: I have been told that there is an npm dedupe feature. I don't think this reduces the onerousness of packaging outside of npm, but I'm glad to hear it has this feature!)
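
For anyone wanting to repeat the counting on their own checkout: the pipes above parse the output of ls -l, which is a bit fragile. Here is a minimal sketch of the same three counts as small shell functions. The function names (list_packages, count_node_modules_dirs, count_total, count_unique) are made up for illustration and are not part of npm or any other tool. The sketch assumes package directory names contain no newlines, and it counts only directories, skipping hidden entries such as .bin, so its numbers may differ slightly from the ones above.

```shell
# Print the name of every package directory sitting directly inside any
# node_modules directory under the given root, one per line, duplicates
# included. Hidden entries (e.g. .bin) are skipped because "*" does not
# match names starting with a dot.
list_packages() {
    find "$1" -type d -name node_modules -exec sh -c '
        for entry in "$1"/*/; do
            if [ -d "$entry" ]; then
                basename "$entry"
            fi
        done
    ' sh {} \;
}

# Number of node_modules directories, including the root itself if it
# is named node_modules.
count_node_modules_dirs() {
    find "$1" -type d -name node_modules | wc -l | tr -d ' '
}

# Number of package copies, counting every duplicate.
count_total() {
    list_packages "$1" | wc -l | tr -d ' '
}

# Number of distinct package names.
count_unique() {
    list_packages "$1" | sort -u | wc -l | tr -d ' '
}
```

With these defined, running count_node_modules_dirs node_modules, count_unique node_modules and count_total node_modules from inside the jquery checkout should give figures comparable to the three counts above.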

Well, there is no way I am compiling jQuery and all its dependencies in this state any time soon. Which makes me wonder, how does Debian do it? The answer seems to be, currently just ship a really old version from back in the day before npm, when you could just use a simple Makefile.

Well for that matter then, how does Nix do it? They're also a functional package management system, and perhaps Guix can take inspiration there as Guix has in so many other places. Unfortunately, Nix just downloads the prebuilt binary and installs that, which in the world of functional package management is kind of like saying "fuck it, I'm out."

And let's face it, "fuck it, I'm out" seems to be the mantra of web application packaging these days. Our deployment and build setups have gotten so complicated that I doubt anyone really has a decent understanding of what is going on, really. Who is to blame? Is it conventional distributions, for being so behind the times and for not providing nice per-user packaging environments for development? Is it web developers, for going their own direction, for not learning from the past, for just checking things in and getting going because the boss is leaning over your shoulder and oh god virtualenv is breaking again on the server since we did an upgrade and I just have to make this work this time? Whose fault is it? Maybe pinning blame is not really the best anyway, but I feel that these are conversations that we should have been having, for distributions and web applications to work together, at least a decade ago. And it's not just Javascript; we're hardly better in the Python world. But it's no wonder that the most popular direction of deployment is the equivalent of rolling a whole distro up into a static binary, and I don't have to tell you what a sad state that is.

For me, at the moment, I'd like to be more conscious of what it takes to build software, not less. Reproducibility is key to long-term software freedom, else how can we be sure that the software we're running is really the software we say it is? But given all the above, it's hard to not have empathy for those who instead decide to toss in that towel and take a "fuck it, I'm out" approach to deployment.

But I hope we can do better. In the meanwhile, ensuring that users can actually build and package from top to bottom the software I'm encouraging them to use is becoming more of a priority for me, not less.

And I guess that may mean, if it isn't really feasible to reproduce your software, I can't depend on it in my own.