Last month I made a blogpost titled Guile Steel: A Proposal for a Systems Lisp. It got more attention than I anticipated, which is both a blessing and and curse. I mean, mostly the former, the curse isn't so serious, it's mostly that the post was aimed at a specific community and got more coverage than that, and funny things happen when things leave their intended context.
The blessing is that real, actual progress has happened, in terms of organization, actual development (thanks to others mostly!), and a compilation of potential directions. In many ways "Guile Steel" was meant to be a meta project, somewhat biased around Guile but more so a clever name to start brewing some ideas (and gathering intelligence) around, a call-to-arms for those who are likeminded, a test even to see if there are enough likeminded people out there. The answer to that one is: yes, and there's actually a lot that's happening or has happened historically. I actually think Lisp is going through a quiet renaissance and is on the verge or a major revival, but that's a topic for another post. The goal of this post is to give a lay of the landscape, as I've seen it since then. There's a lot out there.
If you enjoy this post by the way, there's an IRC channel:
irc.libera.chat. It's surprisingly well populated
given that people have only shown up through word of mouth.
First, an aside on language (again)
Also by-the-way, it's debatable what "systems language" even means, and the previous post spent some time debating that. Language is mostly fuzzy, and subject to the constant battle between fuzzy and crisp systems, and "systems language" is something people evoke to make themselves sound like very crispy people, even though the term could hardly be fuzzier.
We're embracing the hand-waviness here; I've previously mentioned that "Blockchain" is to "Bitcoin" what "Roguelike" is to "Rogue". Similarly, "Systems Language" is to "C/Rust" what "Roguelike" is to "Rogue".
My friend Technomancy put things about as well or as succinctly as you could: "low-level enough to bootstrap a runtime". We'll extend "runtime" to not only mean "programming language runtime" but also "operating system runtime", and that's the scope of this post.
With that, let's start diving into directions.
Carp and Scopes
I was unaware at the time of writing the previous post of two remarkable "systems lisps" that already existed, are here, now, today, and you can and maybe should use them: Carp and Scopes. Both of them are statically typed, and both perform automatic memory management without the overhead of a garbage collector or reference counting in a style familiar to Rust.
They are also both kind of similar yet different. Carp is written on top of Haskell, looks a lot like Clojure in style. Scopes is written in C++, looks a lot like Scheme, and has an optional, non-parenthetical whitespace syntax which reminds me a lot of Wisp, so is maybe more amenable to the type of people who fear parentheses or must work with people who do.
I can't make a judgement about either; I would like to find some time to try each of them. Scopes looks more up my alley of the two. If someone packaged either of these languages for Guix I would try it in a heartbeat.
There's a lot to say on this one, despite its obscurity, enough that I'm going to give it several sub-headings. I'll get the big one up front: Andrew Whatson is doing an incredible job porting Pre-Scheme to Guile. But I'll get more to that below.
What the heck is Pre-Scheme anyway?
PreScheme (or is it Pre-Scheme or prescheme or what? nobody is consistent, and I won't be either) is a "systems lisp" that is used to bootstrap the incredible but "sleeper hit" (or shall we say "cult classic"?) of programming language runtimes, Scheme48. PreScheme compiles to C, is statically typed with type inference based on a modified version of Hindley-Milner, and uses manual memory management for heap-allocated resources (much like C) rather than garbage collection. (C is just the current main target, compiling directly to native architectures or WebAssembly is also possible.)
The wild things about PreScheme are that unlike C or Rust, you can hack on it live at the REPL just like Scheme, and you can even incorporate a certain amount of Scheme, and it mostly looks like Scheme. But it still compiles down efficiently to low-level code.
It's used to implement Scheme48's virtual machine and garbage collector, and is bootstrappable from a working Scheme48, but there's also apparently a version sitting around somewhere on top of some metacircular Scheme which Jonathan Rees wrote on top of Common Lisp, giving it a good bootstrapping story. While used for Scheme48, and usable from it today, there's no reason you can't use it for other things, and a few smaller projects have.
What's more wild about PreScheme is how incredibly good of an idea it is, how long it's sat around (since the 80s, with a lot of work happening in the 90s!), and how little attention it's gotten. PreScheme's thoughtful design actually follows from Richard Kelsey's amazing PhD dissertation, Compilation By Program Transformation, which really feels like the kind of obscure CS thing that, if you've made it this far in this writeup, you probably would love reading. (Thank you to Olin Shivers for reviving this dissertation in LaTeX, which otherwise would have been lost to history.)
Now I did mention prescheme and how I thought that was a fairly interesting starting point on the last Guile Steel blogpost, and I actually got several people reaching out to me saying they wanted to take up this initiative, and a few of them suggested maybe they should start porting PreScheme to Guile, and I said "yes you should!" to all of them, but one person took up the initiative quickly and has been doing a straight and faithful port to Guile named guile-prescheme.
The emulator (which isn't too much code really) has already worked for a couple of weeks (which means you can already hack PreScheme at Guile's REPL, and Andrew Whatson says that the "compile to C" compiler is already well on its way, and will likely be there in about a month.
The main challenge apparently is the conversion of macros,
which are stuck in the
r4rs era of Scheme.
Andrew has been slowly converting everything to
syntax-case is encoded in r6rs
and even more appealingly r7rs-small,
which begs the question: how general of a port is this?
Does it really have to just be to Guile?
And that brings us to our next subsection...
The Secret Society of PreScheme Revivalists
Okay there's not really a secret society, we just have an email thread going and I organized a video call recently, and we're likely to do another one (I hope). This one was really good, very productive. (We didn't record it, sadly. Maybe we should have.)
On said video call we got Andrew Whatson of course, who's doing the current porting effort, but also Richard Kelsey (the original brain behind PreScheme, co-author of much of Scheme48), Michael Sperber (current maintainer of Scheme48 and also someone who has used PreScheme previously commercially, to do some monte carlo simulation things for some financial firm or something curious like that), and Jonathan Rees (co-author of Scheme48, and one of my friends who I like to call up and talk about all sorts of curious topics). There were a few others, all cool people, and also me, hand-waving excitedly as usual.
As an aside, my wife Morgan says my superpower is that I'm good at "showing up in a room and being excited and getting everyone else excited", and she's right, I think. And... well there's just a lot of interesting stuff in computer science history, amazing people whose work has just been mostly ignored, stuff left on the shelf. It's not what the Tech Influencers (TM) are currently touting, it's not what a FAANG company is going to currently hire you to do, but if you're trying to solve the hard problems people don't even realize they have, your best bet is to scour history.
I don't know if it's true or not but this felt like one of those times where the folks who have worked on PreScheme historically seemed kind of surprised that here we had a gathering of people who are extremely interested in the stuff they've done, but also happy about it. Anyway, that seemed like my reading, I like to think so anyway. Andrew (I think?) said some nice things about how it was just exciting to be able to talk to the people who have done these things, and I agree. It is cool stuff. We are grateful to be able to talk about it.
The conversation was really nice, we got some interesting historical information (some of that which I've conveyed here), and Richard Kelsey indicated that he's been doing work on microcontrollers and wishes he could be using PreScheme, but the things clients/employers get nervous about is "will be able to actually hire anyone to work on this stuff who isn't just you?" I'd like to think that we're building up enough enthusiasm where we can demonstrate in the affirmative, but that's going to take some time.
Anyway, I hinted in the last part that some of the more interesting
conversation came to, just how portable is this port?
Andrew indicated that he thought that the port to Guile as he was doing
it was already helping to make things more portable.
Andrew is just focusing on Guile first, but is avoiding the
ice-9 namespace of Guile modules
(which in this case, from a standardization perspective, becomes
a little bit too appropriate)
and is using as much generic Scheme and
SRFI extensions as possible.
Once the Guile version gets working, the goal is then to try porting
to a more standardized form of Scheme
and then that would mean that any Scheme following that standard could
use the same version of PreScheme.
Michael Sperber seemed to indicate that maybe Scheme48 could use this
This would actually be pretty incredible because it would mean that any version of Scheme following the Scheme standard would suddenly have access to PreScheme, and any of those could also be used to bootstrap a PreScheme based Scheme.
A PreScheme JIT?
I thought Andrew Whatson (
flatwhatson here) said this well enough
himself so I'm just going to quote it verbatim:
<flatwhatson> re: pre-scheme interest for bootstrapping, i think it's more interesting than just "compiling to C" <flatwhatson> Michael Sperber's rejected paper "A Tractable Native-Code Scheme System" describes repurposing the pre-scheme compiler (more accurately called the transformational compiler) as a jit byte-code optimizer and native-code emitter <flatwhatson> the prescheme compiler basically lowers prescheme code to a virtual machine-code and then emits that as C <flatwhatson> i think it would be feasible to directly emit native code at that point instead <flatwhatson> https://www.deinprogramm.de/sperber/papers/tractable-native-code-scheme-system.pdf <flatwhatson> Kelsey's dissertation describes transforming a high-level language to a low-level language, not specifically scheme to C. <flatwhatson> > The machine language is an assembly language written in the syntax of the intermediate language and has a much simpler semantics. The machine is assumed to be a Von Neumann machine with a store and register-to-register instructions. Identifiers represent the machine’s registers and primitive procedures are the machine’s instructions. <flatwhatson> Also, we have code for the unreleased byte-code jit-compiling native-emitting version of Scheme 48: https://www.s48.org/cgi-bin/hgwebdir.cgi/s48-compiler/
(How the hell that paper was rejected btw, I have no idea. It's great.)
Future directions for PreScheme
One obvious improvement to PreScheme is: compile to WebAssembly (aka WASM)! This would be pretty neat and maybe, maybe, maybe could mean a good path to getting more Schemes in the browser without using Emscripten (which is a bit heavy-handed of an approach). Andrew and I both think this is a fun idea, worth exploring. I think once the "compile to C" part of the port to Guile is done, it's worth beginning to look at in earnest.
Relatedly, it would also, I think, be pretty neat if guile-prescheme was compelling enough for more of Guile to be rewritten in it. This would improve Guile's already-better-than-most bootstrapping story and also make hacking on certain parts of Guile's internals more accessible and pleasant to a larger part of Guile's existing userbase.
The other obvious improvement to PreScheme is exploring (handwave handwave handwave) the kinds of automated memory management which have become popular with Rust's borrow checker and also appear in Carp and Scopes, as discussed above.
3L: The Computing System of the Future (???)
I mentioned that an appealing use of PreScheme might be to write not just a language runtime, but also an operating system. A very interesting project called 3L exists and is real and does just that. In fact, it's also a capability-secure operating system, and it cites all the right stuff and has all the right ideas going for it. And it's using PreScheme!
Now the problem is, seemingly nobody I know who would be interested in
exactly a project like this even had heard of it before (except for
the incredible and incredibly nice hacker
pukkamustard is the one who made me even aware
of it by mentioning it in the
#guile-steel chatroom), and I couldn't
even find the actual code on the main webpage.
But there actually is source code,
not a lot of it, but it's there, and in a way "not a lot of it" is
not a bad thing here, because
looks stunningly similar to a very familiar
which begs the question, is that really enough, though?
(As a complete aside: I'd be much more likely to play with Scheme48 if someone Geiser support for it... that's something I've poked at doing every now and then but I haven't had enough of a dedicated block of time. If you, dear reader, feel inspired enough to add such support, or actually if you give 3L a try, let me know).
Anyway, cool stuff, I've been meaning to reach out to the author, maybe I will after I post this. I wonder what's come of it. (It's also missing a license file or any indicators, but maybe we could get that fixed easily enough?)
I had a call with someone recently who said WebAssembly was really only useful for C/Rust users, and I thought this was fairly surprising/confusing, but maybe that's because I think WebAssembly is pretty cool and have hand-coded a small amount of it for fun. Its text-based syntax is S-Expression based which makes it appealing for lispy type folks, and just easy to parse and work with in general.
It's stretching it a bit to call WebAssembly a Lisp, it's really just something that's designed to be an intermediate language (eg in GCC), a place where compiler authors often deign it okay/acceptable to use s-expressions because they don't fear that they'll scare off non-lispers or PLT people, because hey most users aren't going to touch this stuff anyway, right?
I dunno, I consider it a win at least that s-expressions have survived here. I showed up to an in-person WebAssembly meeting once and talked to one of the developers about it, praised them for this choice, and they said "Oh... yeah, well, we initially did it because it was the easiest thing to start with, and then eventually we came to like it, which I guess is the usual thing that happens with S-Expressions." (Literally true, look up the history of M-Expressions vs S-Expressions.)
At any rate, most people aren't coding WebAssembly by hand. However, you could, and if you're going to, a Lisp based environment is actually a really good choice. wasm-adventure is a really cool little demo game (try it!), all hand-written in WebAssembly kinda sorta. The README gives its motivations as "Is it possible (and enjoyable) to write a game directly in web assembly's text format? Eventually, would it be cool to generate wat from Scheme code using the Racket lang system?", and the answer becomes an emphatic "yes". What's interesting is that Lisp's venerable quasiquote does much of the heavy lifting to make, without too much work, a little DSL for authoring WebAssembly which results in some surprisingly easy to read code compared to generic WebAssembly. (The author, Zoé Martin, is another one of those quiet geniuses you run into on the internet; she has a lovely homebrew computer design too.)
So what I'd really like is to see more languages compiling directly to WebAssembly without emscripten as an intermediate hack. Guile especially, of course. Andy Wingo gave an amazing little talk on this where he does a little (quasi, pre-recorded) live coding demo of compiling to WebAssembly and I thought "YES!!! Compiling to WASM is probably right around the corner" and it turns out that's probably not the case because Wingo would like to see some important extensions to WASM land, and I guess, yes that probably makes sense, and also he's working on a new garbage collector which seems damn cool and like it'll be really good for Guile and maybe even could help the compiling to WASM story even before the much desired WASM-GC extension we all want lands, but who knows. I mean it would also be nice to have like, the tail call elimination extension, etc etc etc. But see also Wingo's writeup about targeting the web from last year, etc etc. (And on that note, I mean, is Webassembly the new Kubernetes?)
As another aside, there are two interesting Schemes which are actually written in WebAssembly, or rather, one written directly in hand-coded WASM named scheme.wasm, and one which compiles itself to Webassembly called Schism (which has a cool paper, but sadly hasn't been updated in a couple of years).
As another another aside, I was on a video call with Douglas Crockford at one point and mentioned WebAssembly and how cool I thought it was, and Crock kinda went "meh" to it, and I was like what? I mean it has ocap people you and I have both collaborated on it with, overall its design seems pretty good, better than most of the things of its ilk that have been tried before, why are you meh'ing WebAssembly? And Crock said that well, it's not that WebAssembly is bad, it's just that it felt like an opportunity to do something impactful, and it's "just another von neumann architecture", like, boring, can't we do better? But when I asked for specific alternatives, Crock didn't have a specific design in mind, just thought that maybe we could do better, maybe it could even incorporate actors at a more fundamental level.
Well... it turns out we both know someone who did just that, and (so I hear) both recently got nerdsniped by that very same person who had just such an architecture...
Mycelia and uFork
So I had a call with sorta-colleague, sorta-friend I guess? I'm talking about Dale Schumacher, and I don't know him super well, we don't get to talk that much, but I've enjoyed the amount we have. Dale has been trying to get me to have a video call for a while, we finally did, and I was expecting us to talk about our respective actor'y system projects, and we did... but the big surprise was hearing about Mycelia, Dale's ocap-secure hybrid-actor-model-lisp-machine-lambda-calculus operating system, and its equally astounding, actually maybe more astounding, virtual machine and maybe potentially CPU architecture design, uFork. We're going to take a major digression but I promise that it ties back in.
This isn't the first time Dale's reached out and it's resulted in me being surprised and nerdsniped. A few years ago Dale reached out to me to talk about this programming language he wrote called Humus. What's astounding personally about Humus is that it has an eerie amount of similarity to Spritely Goblins, the ocap distributed object architecture I've been working on the last few years, despite that we fully designed our systems independently. Dale beat me to it, but it was an independent reinvention in the sense that I simply wasn't aware of Humus until Dale started emailing me.
The eerie similarity is because I think Dale and I's systems are the most seriously true-to-form implementations of the "Classic Actor Model" that have been implemented in recent times (more true than say, Erlang, which does some other things, and "Classic" thrown on there because Carl Hewitt has some new ideas that he feels strongly should now be associated with "Actor Model" that can be layered on Dale and I's systems, but are not there at the base layer). (Actually, Goblins supports one other thing that makes it more the vat model of computation, but that isn't important for this post.) The Classic Actor Model says (hand-waving past pre-determinism in the general case, at least from the perspective of a single actor, due to ordering concerns... but those too can be layered on) that you can do pretty much all computation in terms of just actors, which are these funky distributed objects which handle messages one at a time, and while handling them are only able to do some combination of three things: (1) send messages to actors they know about, (2) create new actors (and get their address in the process, which they can share with other actors should they choose... argument passing, basically), and (3) designate their behavior for the next time they are handling a message. It's pretty common to use "become" for last that operation, but the curious thing that both Dale and I did was use lambdas as the thing you become. (By the way, those familiar with Scheme history should notice something interesting and familiar here, and for that same reason Dale and I are also in the shared company of being scolded by Carl Hewitt for saying our actors are made out of lambdas, despite him liking our systems otherwise, I think...)
I remarked off-hand that "well I guess one of the main differences between our systems, and maybe a thing you might not like, is that mine is lispy / based on Scheme, and..."
Dale waved his hand. "That's mostly surface..."
"Surface syntax, yeah I know. So I guess it doesn't..."
"No wait it does matter. What I'm trying to show you is that I actually do like that kind of stuff. In fact I have some projects which use it at a fundamental level. Here... let me show you..." And that's when we started talking about uFork, the instruction architecture he was working on, which I later found was actually part of a larger thing called Mycelia.
Well I'm glad I got the lecture directly from Dale because, let's see, how does the Mycelia project brand itself (at the time of writing)? "A bare-metal actor operating system for Raspberry Pi." Well, maybe this underwhelming self-description is why seemingly nobody I know (yes, like 3L above) has seemingly heard about it, despite it being extremely up the alley of the kind of programming people I tend to hang out with.
Mycelia is not just some throwaway raspberry pi project (which is certainly the impression I would have gotten from lazily scanning that page). Most of those are like, some cute repackaging of some existing FOSS POSIX'y thing. But Mycelia is an actually-working, you-can-actually-boot-it-on-real-hardware open source operating system (under Apache v2) with a ton of novel ideas which happens to be targeting the Raspberry Pi, but it could be ported to run on anything.
Anyway, there are a lot of interesting stuff in there, but here's a bulleted list summary. For Mycelia:
- It's is an object-capability-secure operating system
- It has a Lisp-like language for coding, pretty much Scheme-like, to hack around on
- The Kernel language / Vau calculus show up, which is... wild
- It encodes the Actor model and the Lambda calculus in a way that is sensible and coherent afaict
- It is a "Lisp Machine" in many senses of the term.
But the uFork virtual machine / abstract idea for a CPU also are curious on their own. I dunno, I spent the other night reading it kind of wildly after our call. It also encodes the lambda calculus / actor model in fundamental instructions.
Dale was telling me he'd like to build an actual, physical CPU, but of course that takes a lot of resources, so he might settle for an FPGA for now. The architecture, should it be built, also somehow encodes a hardware garbage collector, which I haven't heard of anything doing that since the actual physical Lisp Machines died out.
At any rate, Dale was really excited to tell me about why his system encoded instructions operating on memory split in quads. He asked me why I thought that would be; I'm not honestly sharp enough in this kind of area to know, sadly, though I said "I hope it's not because you're planning on encoding RDF at the CPU layer". Thankfully it's not that, but then he started mentioning how his system encodes a stream of continuations...
Wait, that sounds familiar. "Have you ever heard of something called sectorlisp?" I asked, with a raised eyebrow.
"Scroll to the bottom of the document," Dale said, grinning.
Oh, there it was. Of course.
The most technically impressive thing I think I've ever seen is John McCarthy's "Lisp implemented in Lisp", also known as a "metacircular evaluator". If you aren't familiar with it, it's been summarized well in the talk The Most Beautiful Program Ever Written by William Byrd. I think the best way to understand it really, and (I'm biased) the easiest to read version of things is in the Scheme in Scheme section of A Scheme Primer (though I wrote that for my work, and as said, I'm biased... I don't think I did anything new there, just explained ideas as simply as I could).
The second most technically impressive thing I've ever seen is sectorlisp, and the two are directly related. According to its README, "sectorlisp is a 512-byte implementation of LISP that's able to bootstrap John McCarthy's meta-circular evaluator on bare metal." Where traditional metacircular evaluator examples can be misconstrued as being the stuff of pure abstractlandia, sectorlisp gets brutally direct about things. In one sector (half a kilobyte!!!), sectorlisp manages to encode a whole-ass lisp system that actually runs. And yet, the nature of the metacircular evaluator persists. (Take that, person on Hacker News who called metacircular evaluators "cheating"! Even if you think mine was, I don't think you can accuse sectorlisp's of that.)
If you do nothing else, watch the sectorlisp blinkenlights demo, even just as an astounding visual demo alone. (Blinkenlights is another project of Justine's, and also wildly impressive.) I highly recommend the following blogposts of Justine's: SectorLISP Now Fits in One Sector, Lisp with GC in 436 bytes, and especially Lambda Calculus in 383 Bytes. Hikaru Ikuta (woodrush) has also written some amazing blogposts, including Extending SectorLISP to Implement BASIC REPLs and Games and Building a Neural Network in Pure Lisp Without Built-In Numbers Using Only Atoms and Lists (and related, but not part of sectorlisp: A Lisp Interpreter Implemented in Conway's Game of Life, which gave off strong Wireworld Computer vibes for me). If you are only going to lazily scan through one of those blogposts, I recommend it be Lambda Calculus in 383 Bytes, which has some wild representations of the ideas (including visually), a bit too advanced for me at present admittedly, though I stare at them in wonder.
I had a bunch more stuff here, partly because the author is someone I find both impressive technically but who has also said some fairly controversial things... to say the least. But I think it was too much of a digression for this article. The short version is that Justine's stuff is probably the smartest, most mind-blowing tech I've ever seen, kinda scarily and intimidatingly smart, and it's hard to mentally reconcile that with some of those statements. I don't know, maybe she wants to move past that phase, I'd like to think so. I think she hasn't said anything like that in a long time, and it feels out of phase with the rest of this post but... it feels like something that needs to be acknowledged.
GOOL, GOAL, and OpenGOAL
Every now and then when people say Lisp couldn't possibly be performant, Lisp people like to bring up that Naughty Dog famously had its own Lisp implementations for most of its earlier games. Andy Gavin has written about GOOL, which was a mix of lisp and assembly and of course lisp generating assembly, and I don't think much was written about its followup GOAL until OpenGOAL came up, which... I haven't looked at it too much tbh. I guess it's getting interesting for some people for the ability to play Jak and Daxter on modern hardware (which I've never played but looked fun), but I'm more curious if someone's gonna poke at it to do something completely different.
But I do know some of the vague legends. I don't remember if this is true or where I read it but one of them that Sony accused Naughty Dog of accessing some devkit or APIs they weren't supposed to have access to because they were pulling off a bunch of features performantly in Crash Bandicoot that were assumed otherwise not possible. But nope, just lisp nerds making DSLs that pull off some crazy shit I guess.
BONUS: Shoutout to Kandria
Oh, speaking of games written in Lisp, I guess Kandria's engine is gonna be FOSS, and that game looks wicked cool, maybe support their Kickstarter while you still can. It's not really in the theme of this post from the definition of "systems lisp" I gave earlier, but this blogpost about its lispy internals is really neat.
Okay here's everything else we're done
This was a long-ass post. There's other things maybe that could be discussed, so I'll just dump briefly:
- Chicken Scheme compiles to C but doesn't strike me as qualifying for this post's vague definition because it still has the usual memory/dynamic typing overheads of Scheme. But go ahead and read literally everything on more-magic.net anyway because that stuff is great.
- SBCL: The Ultimate Assembly Code Breadboard, kinda wild
- Kinda feels worth mentioning the Lambda Paper about the Scheme chip: Design of LISP-based Processors, or SCHEME: A Dielectric LISP, or Finite Memories Considered Harmful, or LAMBDA: The Ultimate Opcode
- Mes, I'm gonna mention Mes again, but I'm tired of writing now and leave you to research further why it's one of the most important projects to have happened in the software world in the past decade. There's videos and stuff about it from FOSDEM so watch those I guess.
- EDIT: And as everyone keeps reminding me, this blogpost really ought to mention Loko Scheme, which is directly and intentionally aiming to be a boots-from-bare-metal Scheme.
Okay, that's it. Hopefully you found something interesting out of this post. Meanwhile, I was just gonna spend an hour on this. I spent my whole day! Eeek!