Archives

Tags

Posts with tag "javascript"

Let's Package jQuery: A Javascript Packaging Dystopian Novella

By Christopher Lemmer Webber on Fri 01 May 2015

The state of packaging of libre web applications is, let's face it, a sad one. I say this as one of the lead authors of one of such a libre web application myself. It's just one component of why deploying libre web applications is also such a sad state of affairs (hence userops). It doesn't help that, for a long time, the status quo in all free software web applications (and indeed all web applications) was to check javascript and similar served-to-client web assets straight into your repository. This is as bad as it sounds, and leads to an even further disconnect (one of many) between the packages that a truly free distro might include (and have to manually link in after the fact) and those of your own package. Your package is likely to become stuck on a totally old version of things, and that's no good.

So, in an effort to improve things, MediaGoblin and many other projects have kicked the bad habit of including such assets directly in our repository. Unfortunately, the route we are taking to do this in the next release is to make use of npm and bower. I really did not want to do this... our docs already include instructions to use Python's packaging ecosystem and virtualenv, which is fine for development, but since we don't have proper system packaging, this means that this is the route users go for deployment as well. Which I guess would be fine, except that my experience is that language package managers break all the time, and when they break, they generally require an expert in that language to get you out of whatever mess you're in. So we added more language package management features... not so great. Now users are even more likely to hit language package management problems, now also ones that our community are less of experts in helping debug.

But what can we do? I originally thought of home-rolling our own solution, but as others rightly pointed out, this would be inventing our own package manager. So, we're sucking it up and going the npm/bower route.

But wait! There may be a way out... recently I've been playing with Guix quite a bit, and I came to realize that, at least for myself in development, it could be nice to have all the advantages of transactional rollbacks and etc. There is a really nice feature in Guix called guix environment which is akin to a "universal virtualenv" (also similar to JHBuild in Gnome land, but not tied to Gnome specifically)... it can give you an isolated environment for hacking, except not just restricted to Python or Javascript or Ruby or C... great! (Nix has something similar called nix-shell.) I know that I can't expect that Guix is usable for everyone right now, but for many, maybe this could be a nice replacement for Virtualenv + Bower, something I wrote to the mailing list about.

(As an aside, the challenge wasn't the "virtualenv" type side of things (pulling in all the server-side dependencies)... that's easy. The challenge is replacing the Bower part: how to link in all the statically-served assets from the Guix store right into the package? It's kind of a dynamic linking problem, but for various reasons, linking things into the package you're working on is not really easy to do in a functional packaging environment. But thanks to Ludo's advice and thanks to g-expressions, things are working!)

I'm happy to say that today, thanks to the help from the list, I came up with such a Virtualenv + Bower replacement prototype using "guix environment". And of course I wanted to test this on MediaGoblin. So here I thought, well, how about just for tonight I test on something simple. How about jQuery? How hard could that be? I mean, it just compiles down to one file, jquery.js. (Well, two... there's also jquery.js.min...)

Luckily, Guix has Node, so it has npm. Okay, the docs say to do the following:

# Enter the jquery directory and run the build script:
cd jquery && npm run build

Okay, it takes a while... but it worked! That seemed surprisingly easy. Hm, maybe too easy. Remember that I'm building a package for a purely functional distribution: we can't have any side effects like fetching packages from the web, every package used has to be an input and also packaged for Guix. We need dependencies all the way up the tree. So let's see, are there any dependencies? There seems to be a node_modules directory... let's check that:

cwebber@earlgrey:~/programs/jquery$ ls node_modules/
commitplease          grunt-contrib-uglify  grunt-npmcopy        npm                   sinon
grunt                 grunt-contrib-watch   gzip-js              promises-aplus-tests  sizzle
grunt-cli             grunt-git-authors     jsdom                q                     testswarm
grunt-compare-size    grunt-jscs-checker    load-grunt-tasks     qunitjs               win-spawn
grunt-contrib-jshint  grunt-jsonlint        native-promise-only  requirejs

Yikes. Okay, that's 24 dependencies... that'll be a long night, but we can do it.

Except, wait... I mean, there's nothing so crazy here as in dependencies having dependencies, is there? Let's check:

cwebber@earlgrey:~/programs/jquery$ ls node_modules/grunt/node_modules/
async          eventemitter2  glob               iconv-lite  nopt
coffee-script  exit           grunt-legacy-log   js-yaml     rimraf
colors         findup-sync    grunt-legacy-util  lodash      underscore.string
dateformat     getobject      hooker             minimatch   which

Oh hell no. Okay, jeez, just how many of these node_modules directories are there? Luckily, it's not so hard to check (apologies for the hacky bash pipes which are to follow):

cwebber@earlgrey:~/programs/jquery$ find node_modules -name "node_modules" | wc -l
158

Okay, yikes. There are 158 dependency directories that were pulled down recursively. Wha?? To look at the list is to look at madness. Okay, how many unique packages are in there? Let's see:

cwebber@earlgrey:~/programs/jquery$ find node_modules -name "node_modules" -exec ls -l {} \; | grep -v total | awk '{print $9}' | sort | uniq | wc -l
265

No. Way. 265 unique packages (the list in its full glory), all to build jquery! But wait... there were 158 node_modules directories... each one of these could have its own repeat of say, the minimatch package. How many non-unique copies are there? Again, easy to check:

cwebber@earlgrey:~/programs/jquery$ find node_modules -name "node_modules" -exec ls -l {} \; | grep -v total | awk '{print $9}' | wc -l
493

So, there's about double-duplication of all these packages here. Hrm... (Update: I have been told that there is an npm dedupe feature. I don't think this reduces the onerousness of packaging outside of npm, but I'm glad to hear it has this feature!)

Well, there is no way I am compiling jQuery and all its dependencies in this state any time soon. Which makes me wonder, how does Debian do it? The answer seems to be, currently just ship a really old version from back in the day before npm, when you could just use a simple Makefile.

Well for that matter then, how does Nix do it? They're also a functional package management system, and perhaps Guix can take inspiration there as Guix has in so many other places. Unfortunately, Nix just downloads the prebuilt binary and installs that, which in the world of functional package management is kind of like saying "fuck it, I'm out."

And let's face it, "fuck it, I'm out" seems to be the mantra of web application packaging these days. Our deployment and build setups have gotten so complicated that I doubt anyone really has a decent understanding of what is going on, really. Who is to blame? Is it conventional distributions, for being so behind the times and for not providing nice per-user packaging environments for development? Is it web developers, for going their own direction, for not learning from the past, for just checking things in and getting going because the boss is leaning over your shoulder and oh god virtualenv is breaking again on the server since we did an upgrade and I just have to make this work this time? Whose fault is it? Maybe pinning blame is not really the best anyway, but I feel that these are conversations that we should have been having, for distributions and web applications to work together, at least a decade ago. And it's not just Javascript; we're hardly better in the Python world. But it's no wonder that the most popular direction of deployment is the equivalent of rolling a whole distro up into a static binary, and I don't have to tell you what a sad state that is.

For me, at the moment, I'd like to be more conscious of what it takes to build software, not less. Reproducibility is key to long-term software freedom, else how can we be sure that the software we're running is really the software we say it is? But given all the above, it's hard to not have empathy for those who instead decide to toss in that towel and take a "fuck it, I'm out" approach to deployment.

But I hope we can do better. In the meanwhile, ensuring that users can actually build and package from top to bottom the software I'm encouraging them to use is becoming more of a priority for me, not less.

And I guess that may mean, if it isn't really feasible to reproduce your software, I can't depend on it in my own.

Javascript beyond Javascript (with Guile?)

By Christopher Lemmer Webber on Fri 15 August 2014

I'm learning more in my spare time / off normal hours about compilers and graph theory and reading various books on lisp. I have an agenda here, no idea if it'll happen; at the very worst I put a lot of tools in my toolkit that I should have had. But there is another reason... Well, this one's easiest to lay out point for point, so here goes:

  • Python is still my favorite language to write in day to day, but I guess that I keep feeling that any language that doesn't have a way to transcompile nicely to Javascript (or isn't Javascript itself) is unideal for writing really great web applications, for a simple reason: modern web applications are highly interactive, and I feel that a really stellar web application needs to share some code between the backend and the frontend. This becomes more obvious when writing dynamic templates or forms.

    While Python probably has hands-down the nicest asynchronous library in current development with asyncio, the above problem makes it feel like you can only do so much in the web world with it. (The Javascript Problem is a good page on the Haskell wiki which puts this all pretty nicely.)

    There's a lot lost by not being able to interweave applications between the frontend and the backend. Sadly, the best option in Python right now is to just give up and write entirely separate codebases in javascript that interact with your main codebase. But clients on the web are becoming thicker and thicker these days (even more so with technology like websockets and webrtc becoming prevalent), so this feels both weak and redundant.

  • You could say "just write javascript!" then, but let's face it, I don't really like Javascript. The language is getting better by leaps and bounds as time goes on, but a lot of the enthusiasm for the language still feels like Stockholm syndrome to me. But we're still left with Javascript. So what, then?

  • Transcompiling actually is becoming an increasingly appealing target. With asm.js this might not only be possible but actually very optimal. One only need look at the Unreal demo to see just how feasible transcompilation has become.

  • So okay, let's say we're doing some kind of transcompiling. We have two options, "transcompile wholesale", or "transcompile a subset that's acceptable for javascript and local evaluation". Obviously the former is desirable if possible, though the latter is a lot easier.

  • Indeed, transcompiling a subset has already been done, and well, by Clojurescript. I'm not really excited by anything too tied to the JVM, but still, pretty cool. Maybe good enough to get me into a hybrid Clojure/Clojurescript solution. But still... there's no asm.js target (maybe in the future?), and it feels like you're tossing out so much with Clojurescript, it really isn't the same language too much, just a very similarly overlapping language. I haven't tried it though... this is just from reading docs and watching talks :)

  • We could say maybe we could transcompile something like Python, but assuming we really want to have something like a template engine, we probably want to be writing in a full fledged version of the language, not a restricted subset. Python feels way too huge to expect for people to load in their browsers. So, what's more lightweight?

  • Let's continue in the lisp direction. Is there a way we can get a complete implementation in the browser, but of a language that's a bit more lightweight? How about Scheme? Scheme is notoriously simple to implement, maybe even too simple... and Javascript shares a lot of overlap with Scheme in design...

  • But which Scheme? How about Guile? Yes, Guile, GNU's extension language based on scheme.

Okay, wait, digression before we continue... I want to talk a bit more about Guile, because up until a few months ago my impression of it was still the somewhat-dismissive impression I had a few years ago. But I've come back to looking at Guile again. Largely this is thanks to Dave Thompson's Sly project. There is some seriously cool stuff going on with Sly, and that made me want to look into what's happening with Guile.

What I found is that Guile isn't the same thing it was a few years ago. For evidence of all the cool things that are happening under the hood, I point you to wingolog (warning: danger of being a real time sink; linking to Andy Wingo's blog has been described as "wingorolling" by a friend of mine). Guile isn't just an interpreter anymore, it has a sophisticated "compiler tower" since Guile 2.0. With the levels of abstraction that exist in Guile, could compiling to javascript/asm.js be a target? (What makes this feel extra appealing/feasible: Andy Wingo also works at his day job on improving javascript virtual machines, so maybe...?)

Well, better to ask Guile hackers themselves if it's possible... and I've pestered a number of them. The responses seem to be (and I may be representing incorrectly):

  • Getting the core language of Guile (which is of course scheme) ported over is not the hard part.
  • Making a subset of the language that works in Javascript is not too hard, but isn't too interesting. It would be much better to have the whole language work.
  • Overall, the most tedious part of porting guile to target any non-C target will be all of the library procedures currently implemented in C... however, gradually the Guile team is reducing the amount of C code in Guile and replacing it with Scheme code.
  • The problem is the runtime; you probably would want to re-use some higher level things like the allocator and garbage collector. Emscripten might do garbage collection correctly (guile relies on Boehm GC to do garbage collection); emscripten seems to provide a similar garbage collector? Tail calls are also a problem until at least es6 is common, and even then until they are optimized.
  • Maybe someone could try transcompiling Guile with emscripten as a test, but that probably wouldn't provide the right integration, and regardless, this would probably be a bit uncomfortable because emscripten relies on llvm, not gcc.
  • Targeting asm.js brings things more low-level, but the asm.js typedarray heap may not be appropriate; the guile-on-js heap and the javascript heap should really be the same.

Okay, so, that's a good number of problems that need to be overcome. Certainly it makes it feel like things are a ways off, but then again, it seems like there's strong interest. And it feels feasible... as an outsider to development. :)

But even assuming all the above are resolved, there are some other problems that I think scheme will always face adoption-wise:

  • I like lisp, and I like parentheses. I have emacs set up with rainbow-delimiters, smartparens, and highlight-parentheses (set up just to highlight the background, not the foreground, so it doesn't clobber rainbow-delimiters). Lisp's greatest enemy has never been its functional capability, but a general parenthesitis in the general population. This the big one, so let's come back to it in a moment.
  • Lots of scheme (lisp generally, but especially scheme) feels like it just is too in love with ideas that make it hard to approach. "car" and "cdr" are not good names but form the very basis of the language for historical reasons where even prominent lispers have had to backtrack to remember why. Linked lists as the core of lisp is great, but overuse of them is encouraged when other data types are much more appropriate: see teaching associative lists before hashmaps. While iteration tools are provided, there's simply too much emphasis on recursion. Recursion is powerful and awesome and necessary to solving many important computer science problems, but rarely needed by say, a web developer, and significantly harder to read and wrap one's mind around than iteration.
  • That said, these could be overcome with better introductory material. There seems to be interest in the guile world for this, thankfully!
  • Module names often look like robot serial numbers ("srfi-13") than anything human-readable and feel like they really need aliases.
  • The Guile community is very small, and does not seem to be very diverse. Outreach is greatly needed.

Returning to "parenthesitis", while I love lisp and all its parentheses, admittedly it's not the most readable language ever. I can wake up groggily at 5AM, be tossed a block of Python code, and even through blurry eyes, I can get a feel for what's happening in the code just by its structure. I can't say the same of any lisp.

But there's a way out of that situation! Guile's compiler stack is nicely set up so that another syntax can be laid on top of that. Guile's VM has semi-complete language implementations of ecmascript and some other languages on top of it. Recently Arne Babenhauserheide has written a whitespace to lisp preprocessor named wisp; I don't feel it really solves the problem by making things so much more readable, but it's a nice demonstration.

The most promising resolution I think would be to implement a syntax akin to Julia on top of Guile. If you haven't looked at Julia, it's a cool project: it has a python-like syntax with lisp-like power, and many other interesting features, including macros(!) in a language that is certified safe for those with parenthesitis. In fact, the language is largely built on top of scheme! So it might be nice to have a "sugarcoat" language that sits on top of Guile.

If all the above were achieved and a well developed web framework were written on top of Guile, it could be well positioned for writing web applications that are a joy to write, both on the backend and frontend.

One more thing: the complaint about "don't use emscripten" because it's built on top of llvm is an indication of how sorely needed a working javascript/asm.js compiler target is in GCC. ARM, X86, SPARC... these are all important compiler targets to have, but to advance user freedom where the user is today, the browser is the most important target of all.