Gush: A stack based language eventually for genetic programming

By Christine Lemmer-Webber on Thu 06 April 2017

(This blogpost was going to be just about a project I'm working on, Gush, but instead it's turned into a whole lot of backstory, and then a short tutorial about Gush. If you're just interested in the tutorial, skip to the bottom I guess.)

I recently wrote about possible routes for anti-abuse systems. One of the goofier routes I wrote about there involved genetic programming. I get the sense that few people believe I could be serious... in some ways, I'm not sure if I myself am serious. But the idea is so alluring! (And, let's be honest, entertaining!) Imagine if you had anti-abuse programs on your computer, and they're growing and evolving based on user feedback (hand-waving aside exactly what that feedback is, which might be the hardest problem), adapting to new threats somewhat invisibly to the user benefiting from them. That user has a set of friends who have similar needs and concerns, and so their programs propagate and reproduce with programs in their trust network (along with their datasets, which may be taught to child programs also via a genetic program). Compelling! Would it work? I dunno.

A different, fun use case I can't get to leave my mind is genetic programs as enemy AI in roguelikes. After all, cellular automata are a class of programs frequently used to study genetic programs. And roguelikes aren't tooooooo far away from cellular automata where you beat things up. What if the roguelike adapted to you? Heck, maybe you could even collect and pit your genetic program roguelike monsters against your friends'. Roguelike Pokémon! (Except unlike in Pokémon, "evolving" actually really means "evolving".)

Speculating on the future with Lee Spector

Anyway, how did I get on this crazy kick? On the way back from LibrePlanet (which went quite well, and deserves its own blogpost), I had the good fortune to be able to meet up with Lee Spector. I had heard of Lee because my friend Bassam Kurdali works in the same building as him at Hampshire College, and Bassam had told me a few years ago about Lee's work on genetic algorithms. The system Lee Spector works on is called PushGP (Push is the stack language, and PushGP is Push used for genetic programming). Well, of course once I found out that Push was a lisp-based language, I was intrigued (hosted on lisps traditionally, but not always, and Push itself is kind of like Lisp meets Forth), and so by the time we met up I was relatively familiar with Lee's work.

Lee and I met up at the Haymarket Cafe, which is a friendly coffee shop in Northampton. I mentioned that I had just come from LibrePlanet where I had given a talk on The Lisp Machine and GNU. I was entertained that almost immediately after these words left my mouth, Lee dove into his personal experiences with lisp machines, and his longing for the kind of development experiences lisp machines gave you, which he hasn't been able to find since. That's kind of an aside from this blogpost I suppose, but it was nice that we had something immediately to connect on, including on a topic I had recently been exploring and talking about myself. Anyway, the conversation was pretty wild and wide-ranging.

I had also had the good fortune to speak to Gerald Sussman again at LibrePlanet this year (he also showed up to my talk and answered some questions for the audience). One thing I observed in talking with both Sussman and Spector is they're both very interested in thinking about where to bring computing in terms of examining biological systems, but there was a big difference in terms of their ideas; Sussman is very interested in holding machines "accountable", which seems to frequently also mean being able to examine how they came to conclusions. (You can see more of Sussman's thinking about this in the writeup I did of the first time I ever got to speak with Sussman when I was at FSF's 30th anniversary party... maybe I should try to capture some information about the most recent chat too, before it gets lost to the sands of time...) Spector, on the other hand, seems convinced that to make it to the next level of computing (and maybe even the next level of humanity), we have to be willing to give up on demanding that we can truly understand a system, and that we have to allow processes to run wild and develop into their own things. There's a tinge of Vernor Vinge's original vision of the Singularity, which is that there's a more advanced level of intelligence than humans are able to currently comprehend, and to understand it you have to have crossed that boundary yourself, like the impossibility of seeing past the event horizon of a black hole. (Note that this is a pretty different definition of singularity from some of the current definitions of Singularity that have come since, which are more about possible outcomes of such a change rather than about the concept of an intellectual/technological event horizon itself.) That's a possible vision for the future of humanity, but it's also a vision of what maybe the right direction is for our programming too.

Of course, this vision that code may be generative in a way where the source is mostly unintelligible to us feels possibly at odds with our current understanding of software freedom, a movement which spends a lot of time talking about inspecting source code. The implications of that are a topic of interest of mine (I wrote about it in the FSF bulletin not too long ago) and I did needle Spector about it... what does copyright mean in a world where humans aren't writing software? Spector seems to acknowledge that it's a concern (he agrees that examples of seed-DRM and genetic patents in the case of Monsanto are troubling) but I got the sense that it's not his biggest interest... Spector thinks that the future of that side of software freedom and copyright might be that humanity realizes how absurd software copyright and patents and other intellectual restriction regimes are once things generate far enough. But I get the sense that more than talking about the legal/licensing aspects of the auto-generative future, Spector would rather be building it. Fair enough!

Anyway, somewhere along these lines I mentioned my interest in distributed anti-abuse systems. We talked about how more basic approaches such as Bayesian filtering might not be good enough to combat modern abuse, especially beyond just spam, because attack patterns change so frequently. Suddenly it hit me: I wonder whether or not genetic programs would work pretty well in a distributed system... after all, you could use your web of trust to breed the appropriate filtering programs with your friends' programs... would it work?

Anyway, on the ride back I began playing with some of Push's ideas, and (with a lot of helpful feedback from Lee Spector... thank you, Lee!) I started to put together a toy design for a language inspired by PushGP but with some properties that I think might be more applicable to an anti-abuse system that needs to keep around "memories" between generations. (Whether it's better or not, I don't really know yet.) So...

A little Gush tutorial

At this point, Gush exists. It has the stack based language down, but none of the genetic programming. Nonetheless, it's fun to hack around in, looks an awful lot like Push but also is far enough along to demonstrate its differences, and if you've never played with a stack based language before, it might be a good place to start.

Let's do some fun things. First of all, what does a Push program look like?

(1 2 + dup *)

Ok, that looks an awful lot like a lisp program, and yet not at the same time! If you've installed Gush, you can run this example:

> (use-modules (gush))
> (run '(1 2 + dup *))
$1 = (9)

Whoo, our program ran! But what happened? Gush programs operate on two primary stacks... there's an "exec" stack, which contains the program being evaluated in progress, and a "values" stack, with all the values currently built up by the program. Evaluation happens like so:

exec> '(1 2 + dup *)  ; initial exec stack
vals> '()             ; initial empty values stack
exec> '(2 + dup *)    ; [=> 1] popped off from exec stack
vals> '(1)            ; push 1 (a literal) onto values stack
exec> '(+ dup *)      ; [=> 2] popped off from exec stack
vals> '(2 1)          ; push 2 (a literal) onto values stack
exec> '(dup *)        ; [=> +] popped off of exec stack
vals> '(3)            ; apply `+', which is bound to an operation which takes
                      ;   top two numbers on the values stack and adds them,
                      ;   pushing result onto values stack... 2 + 1 = 3,
                      ;   so 2 and 1 are removed and 3 is added
exec> '(*)            ; [=> dup] popped off of exec stack
vals> '(3 3)          ; `dup' takes top item on stack and duplicates it
exec> '()             ; [=> *] popped off of exec stack
vals> '(9)            ; apply `*', which is bound to an operation which takes
                      ;   top two numbers on values stack and multiplies them,
                      ;   pushing result onto values stack... 3 * 3 = 9,
                      ;   so 3 and 3 are removed and 9 is added
                      ; Nothing left to do on exec, so we're done!
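
The trace above can be sketched in a few lines of Python. This is not Gush's actual code (Gush runs on Guile); the function name `run` is borrowed from the examples, and everything else, including the use of plain Python lists with the top of each stack at index 0, is an assumption made for illustration. It only handles numbers, `+`, `*` and `dup`, and it no-ops when there aren't enough values.

```python
def run(program):
    """Evaluate a tiny Gush-like program; returns the values stack, top first."""
    exec_stack = list(program)  # index 0 is the top of the exec stack
    vals = []                   # index 0 is the top of the values stack
    binary_ops = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}
    while exec_stack:
        item = exec_stack.pop(0)          # pop the next item off exec
        if item in binary_ops:
            if len(vals) >= 2:            # otherwise: no-op
                a, b = vals.pop(0), vals.pop(0)
                vals.insert(0, binary_ops[item](a, b))
        elif item == "dup":
            if vals:                      # otherwise: no-op
                vals.insert(0, vals[0])
        else:
            vals.insert(0, item)          # literals go onto the values stack
    return vals


print(run([1, 2, "+", "dup", "*"]))  # [9]
```

The loop ends exactly when the exec stack is empty, which matches the last line of the trace.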

Okay, great! What else can we run?

;; Complicated arithmetic runs
> (run '(3 2 / 4 +))
$2 = (14/3)

;; We can assign variables to values and then reference them
> (run '(88 'foo define foo 22 +))
$3 = (110)

;; However, base operations "know" what types to apply for, and search
;; the stack... the string will be "skipped over" in search for
;; a number here.  This means that we can randomly generate code
;; and we won't run into type errors.
> (run '(1 "two" 3 +))
$4 = (4 "two")   

;; Nested parentheses will be "unnested" and applied inline
> (run '(98 ("balloons" "red") 1 +))
$5 = (99 "red" "balloons")
;; so that's the same as
> (run '(98 "balloons" "red" 1 +))
$6 = (99 "red" "balloons")

;; Variables are actually stacks!  Which means we can build up
;; complicated operations on them...
> (run '('+ 'foo define      ; set foo to '(+)
         88 'foo var-push    ; push 88, so foo is now '(88 +)
         2 foo))             ; apply variable stack foo
$7 = (90)

;; Conditionals, etc also work.
> (run '(1 1 + 'b define  ; assign b to the value of 1 + 1
         2 b = if         ; check if b is 2
           "two b"        ; if-then clause
           "not two b"))  ; if-else clause
$8 = ("two b")

There's more to it than that, but that should get you started.

How is Gush different from Push?

Gush takes almost all its good ideas from Push, but there are two big differences.

Both Gush and Push try to avoid type errors. You can do all sorts of code mutation, and whether or not things will actually do anything useful is up for grabs, but it shouldn't crash to a halt. The way Push does it is via different stacks for each value type. This is really clever: it means that each operation applies to very specific types, and if you always know your input types carefully, you can always be safe on a type level and shouldn't have programs that unexpectedly crash (if there aren't enough values on the appropriate type stack, Push just no-ops).
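
To make the "one stack per type" idea concrete, here is a rough Python sketch. It is only an illustration of the approach described above, not Push's implementation: the function `run_typed`, the dictionary layout, and the decision to process the program left to right without a real exec stack are all assumptions. The instruction names follow the `TYPE.op` style used in the Push example in this post.

```python
def run_typed(program):
    """Route each literal to the stack for its type; typed ops only see their own stack."""
    stacks = {"integer": [], "float": [], "string": []}  # top of each stack is the list end

    def binary(stack_name, fn):
        stack = stacks[stack_name]
        if len(stack) >= 2:               # too few values of this type: no-op
            top, second = stack.pop(), stack.pop()
            stack.append(fn(second, top))

    instructions = {
        "INTEGER.+": lambda: binary("integer", lambda a, b: a + b),
        "FLOAT.+": lambda: binary("float", lambda a, b: a + b),
    }

    for item in program:
        if isinstance(item, str) and item in instructions:
            instructions[item]()
        elif isinstance(item, bool):
            continue                      # booleans are left out of this sketch
        elif isinstance(item, int):
            stacks["integer"].append(item)
        elif isinstance(item, float):
            stacks["float"].append(item)
        elif isinstance(item, str):
            stacks["string"].append(item)
    return stacks


print(run_typed([1, "two", 3, "INTEGER.+"])["integer"])  # [4]
```

Because `INTEGER.+` never looks at the string or float stacks, a randomly generated program can't hand it the wrong type; the worst that happens is nothing.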

But what if you want to run operations that might apply to more than one type? For example, in Gush you might do:

> (run '(1 2 / 4.5 +))
$9 = (6.5)

In Push, you'd probably do something like this:

> (run '(1 2 INTEGER./ FLOAT.FROMINTEGER 4.5 FLOAT.+))
$9 = (6.5)

(And of course, if you wanted rational numbers rather than just floats, you'd have to add another type stack to that...)

I really wanted generic methods that were able to determine what types they were able to apply to. For one thing, imagine you have a program that's doing a lot of complicated algebra... it should be able to operate on a succession of numbers without having to do type coercion and hit/miss on whether it chose the right one of several typed operators, when it could just pick one operator that can apply to several items.

I also wanted to be able to add new types without much difficulty. As it is, I don't have to rewire anything to throw hash-maps into Gush:

> (run '(42 "meaning of life" make-hash hash-set))
$10 = (<hash-table>)

This would just work, no need to wire anything new up!

The way Gush does it is it uses generic operators which know how to check the predicates for each type, and which "search the stack" for values they know they can apply to. (They also no-op if they can't find anything.) If bells and alarms are going off, you're not wrong! In Gush's current implementation, this does have the consequence that any given operation might be worst case O(n) in the size of the values stack! Owch! However, I'm not too worried. Gush checks how many operations every program takes (and has the option to bail out if a program is taking too many steps) and searching the stack after failing to match initially counts against a program. I figured that if programs are auto-generated, one fitness check can be how many steps it takes for the program to finish its computation, and so programs would be incentivized to keep the appropriate types near where they would be useful. I'm happy to say that it turns out I'm not the only one to think this; unknown to me when I started down this path, there's another Push derivative named Push-Forth which has only one stack altogether (not even separate stacks for exec and values!) and it does some similar-ish (but not quite the same) searching (or converging on a fixed point) by currying operations until the appropriate types are available. (Pretty cool stuff, but to be honest I have a hard time following the Push-Forth examples I've seen.) It comes to the same conclusion that by checking the number of steps a program takes to execute as part of its fitness, programs will be encouraged to keep types in good places anyway. However! There are more reasons not to despair; I'm relatively sure that there are some clever things that can be done with Gush's value stack so that predicate information is cached and looking for the right type can be made O(1). That has yet to be proven though. :)
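
Here is a rough Python sketch of that "search the stack, and charge for the searching" idea. It is not Gush's code: the names `Sym`, `apply_binary` and `is_number`, the `max_steps` parameter, and the exact cost rule (one step per instruction plus one per value skipped over) are assumptions chosen to illustrate the paragraph above.

```python
class Sym(str):
    """Names an operation; plain str values are treated as string literals."""


def is_number(value):
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def apply_binary(vals, predicate, fn):
    """Apply fn to the two topmost values matching predicate.

    Returns (new_vals, skipped), where skipped counts the non-matching
    values that had to be stepped over during the search."""
    found, skipped = [], 0
    for index, value in enumerate(vals):  # index 0 is the top of the stack
        if predicate(value):
            found.append(index)
            if len(found) == 2:
                break
        else:
            skipped += 1
    if len(found) < 2:
        return vals, skipped              # nothing suitable: no-op
    result = fn(vals[found[0]], vals[found[1]])
    rest = [v for i, v in enumerate(vals) if i not in found]
    return [result] + rest, skipped


OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}


def run(program, max_steps=1000):
    """Returns (values stack, steps used); stops early once max_steps is reached."""
    exec_stack, vals, steps = list(program), [], 0
    while exec_stack and steps < max_steps:
        item = exec_stack.pop(0)
        steps += 1
        if isinstance(item, Sym) and item in OPS:
            vals, skipped = apply_binary(vals, is_number, OPS[item])
            steps += skipped              # searching counts against the program
        else:
            vals.insert(0, item)
    return vals, steps


print(run([1, "two", 3, Sym("+")]))   # ([4, 'two'], 5)
print(run(["two", 1, 3, Sym("+")]))   # ([4, 'two'], 4)
```

Both programs compute the same thing, but the second keeps its numbers together and finishes in fewer steps, which is the kind of pressure a step-counting fitness check would apply.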

The other way Gush differentiates itself from Push is that Gush variables are stacks rather than single values. This ties in nicely with the classic Push approach that lists are unwrapped and applied to the exec stack at the time they are to be evaluated anyway, so it makes no behavioral change in the case that you just use "define" (which will always clobber the state of the stack, whether or not it exists, to replace it with a stack with a single element of the new value). But it also allows you to build up collections of information over time... or even collections of code. An individual variable can be appended to and modified as the program runs, so you could write subroutines to variables, or even modify them. (Code that writes code! Very lispy, but also a bit crazier because it might happen at runtime.) Push also has this feature, but it has one specific, restricted stack for it, appropriately named the CODE stack. Why have one of these stacks, when you could have an unlimited number of them?
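
A small Python sketch of variables-as-stacks, again as an illustration rather than Gush's implementation. The `Sym` and `Quote` classes are stand-ins for symbols and quoted symbols, and the operand order (variable name on top of the values stack, value beneath it) is an assumption read off the examples earlier in this post. Unlike Gush, this sketch does no type searching.

```python
class Sym(str):
    """A symbol: either a built-in operation or a variable name."""


class Quote:
    """A quoted symbol such as 'foo, pushed as a value instead of being run."""
    def __init__(self, name):
        self.sym = Sym(name)


def run(program):
    exec_stack, vals, variables = list(program), [], {}
    while exec_stack:
        item = exec_stack.pop(0)
        if isinstance(item, Quote):
            vals.insert(0, item.sym)
        elif not isinstance(item, Sym):
            vals.insert(0, item)                      # plain literal
        elif item == "define" and len(vals) >= 2:
            name, value = vals.pop(0), vals.pop(0)
            variables[name] = [value]                 # clobber: one-element stack
        elif item == "var-push" and len(vals) >= 2:
            name, value = vals.pop(0), vals.pop(0)
            variables.setdefault(name, []).insert(0, value)
        elif item == "+" and len(vals) >= 2:
            a, b = vals.pop(0), vals.pop(0)
            vals.insert(0, a + b)
        elif item in variables:
            # Referencing a variable unwraps its whole stack onto exec
            exec_stack = list(variables[item]) + exec_stack
        # anything else is a no-op
    return vals


program = [Quote("+"), Quote("foo"), Sym("define"),   # foo is now (+)
           88, Quote("foo"), Sym("var-push"),         # foo is now (88 +)
           2, Sym("foo")]                             # runs 2 88 +
print(run(program))  # [90]
```

Since a one-element variable unwraps to just its value, `define` behaves like an ordinary single-value variable, while `var-push` lets the same variable act as a little code stack.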

That wasn't the original intent for having variables as stacks though; I only realized that you could make each variable into a kind of CODE stack later. My original intent was driven by a concern/need to be able to carry information from parent process to child process. I added a structure to Gush programs named "memories", and I figured that parent programs could "teach" their memories to child programs. So this was really just a hash table of symbols to stacks that persisted after the program ended (and since run-application gives you the whole state of the program as the same immutable structure that is folded over during execution, you have that information attached to the application anyway). The idea of "memories" was that parent programs could have another program that, after spawning a child program, could "teach" the things they knew to their children (possibly either by simply copying, or more likely through a separate genetic program applied to that same data). That way a database of accrued information could be passed around from generation to generation... a type of genetic programming educational system (or folklore). So that was there, but then when I began adding variables around the same time I realized that a variable that contained a single value and which was pushed onto the exec stack was, due to the way Push "unwraps" lists, exactly the same as if there was just that value alone pushed onto the stack. Plus, it seemed to open up more paths by having the cool effect of having any variable be able to take on the power of Push's CODE stack. (Not to mention, removing the need for a redundant CODE stack!)

Are these really improvements? I don't know, it's hard to say without actually testing with some genetic programming examples. That part doesn't exist yet in Gush... probably I'll follow the current lead of the Push community and do mutation on the linearized Plush representation of Push code.

Anyway, I also want to give a huge thank you to Lee Spector. Lee has been really patient in answering a lot of questions, and even in the case that Gush does have some improvements, they're minor tweaks compared to the years of work and experimentation that have gone into the Push/PushGP designs.

And hey, it was a lot of fun! Not to mention, a great way to procrastinate on the things I should be working on...

Possible routes for distributed anti-abuse systems

By Christine Lemmer-Webber on Tue 04 April 2017

I work on federated standards and systems, particularly ActivityPub. Of course, if you work on this stuff, every now and then the question of "how do you deal with abuse?" very rightly comes up. Most recently Mastodon has gotten some attention, which is great! But of course, people are raising the question, can federation systems really protect people from abuse? (It's not the first time it has come up either; at LibrePlanet in 2015 a number of us held a "social justice for federated free software systems" dinner and were discussing things then.) It's an important question to ask, and I'm afraid the answer is, "not reliably yet". But in this blogpost I hope to show that there may be some hope for the future.

A few things I think you want out of such a system:

  • It should actually be decentralized. It's possible to run a mega-node that everyone screens their content against, but then what's the point?
  • The most important thing is for the system to prevent attackers from being able to deliver hateful content. An attack in a social system means getting your message across, so that's what we don't want to happen.
  • But who are we protecting, and against what? It's difficult to know, because even very progressive groups often don't anticipate who they need to protect; "social justice" groups of the past are often exclusionary against other groups until they find out they need to be otherwise (eg in each of these important social movements, some prominent members have had problems including other social justice groups: racist suffragists, civil rights activists exclusionary against gay and lesbian groups, gay and lesbian groups exclusionary against transgender individuals...). The point is: if we haven't gotten it all right in the past, we might not get it all right in the present, so the most important thing is to allow communities to protect themselves from hate.

Of course, keep in mind that no technology system is going to be perfect; these are all imperfect tools for mitigation. But the technical decisions you make also affect who is empowered in a system, so it's still important to work on these, though none of them are panaceas.

With those core bits down, what strategies are available? There are a few I've been paying close attention to (keep in mind that I am an expert in zero of these routes at present):

  • Federated Blocklists: The easiest "starter" route. And good news! If you're using the ActivityPub standard, there's already a Block activity, and you could build up group-moderated collections of people to block. A decent first step, but I don't think it gets you very far; for one thing, being the maintainer of a public blocklist is a risky activity; trolls might use that information to attack you. That and merging/squashing blocklists might be awkward in this system.
  • Federated reputation systems: You could also take it a step further by using something like the Stellar consensus protocol (more info in paper form or even a graphic novel). Stellar is a cryptographically signed ledger. Okay, yes, that makes it a kind of blockchain (which will make some people's eyes glaze over, but technically a signed git repository is also a blockchain), but it's not necessarily restricted to use of cryptocurrencies... you can track any kinds of transactions with it. Which means we could also track blocklists, or even less binary reputation systems! But what's most interesting about Stellar is that it's also federated... and in this case, federation means you can choose what groups you trust... but due to math'y concepts that I occasionally totally get upon being explained to me and then forget the moment someone asks me to explain to someone else, consensus is still enforced within the "slices" of groups you are following. You can imagine maybe the needs of an LGBT community and a Furry community might overlap, but they might not be the same, and maybe you'd be subscribed to just one or both, or neither. Or pick your other social groups, go wild. That said, I'm not sure how to make these "transactions" not public in this system, so it's very out there in the open, but since there's a voting system built-in maybe particular individuals won't be as liable to be attacked as individuals maintaining a blocklist are. Introducing a sliding-scale "social reputation system" may also introduce other dangerous problems, though I think Stellar's design is probably the least dangerous of all of these since it probably will still keep abusers out of a particular targeted group, but will still allow marginalized-but-not-recognized-by-larger groups avenues to set up their own slices as well.
  • "Charging" for distributing messages: Hoo boy, this one's going to be controversial! This was suggested to me by someone smart in the whole distributed technology space. It's not necessarily what we would normally consider real money that would be charged to distribute things... it could be a kind of "whuffie" cryptocurrency that you have to pay. Well the upside to this is it would keep low-funded abusers out of a system... the downside is that you've now basically powered your decentralized social network through pay-to-play capitalism. Unfortunately, even if the cryptocurrency is just some "social media fun money", imaginary currencies have a way of turning into real currencies; see paying for in-game currency in any massively multiplayer game ever. I don't think this gives us the power dynamics we want in our system, but it's worth noting that "it's one way to do it"... with serious side effects.
  • Web of trust / Friend of a Friend networks: Well researched in crypto systems, though nobody's built really good UIs for them. Still, a lot of potential if the system was somehow made friendly and didn't require showing up to a nerd-heavy "key-signing party"... if the system could have marking who you trust and who you don't (and not just in terms of verifying keys) built as an elegant part of the UI, then yes I think this could be a good component for recognizing who you might allow to send you messages. There are also risks in having these associations be completely public, though I think web of trust systems don't necessarily have to be public... you can recurse outward from the individuals you do already know. (Edit: My friend ArneBab suggests that looking at how Freenet handles its web of trust would be a good starting point for someone wishing to research this. I have 0 experience with Freenet, but here are some resources.)
  • Distributed recommendation systems: Think of recommender systems in (sorry for the centralized system references) Amazon, Netflix, or any of the major social networks (Twitter, Facebook, etc). Is there a way to tell if someone or some message may be relevant to you, depending on who else you follow? Almost nobody seems to be doing research here, but not quite nobody; here's one paper: Collaborative Filtering with Privacy. Would it work? I have no idea, but the paper's title sure sounds compelling. (Edit: ArneBab also points out that credence-p2p might also be useful to look at. Relevant papers here.)
  • Good ol' Bayesian filtering: Unfortunately, I think that there are too many alternate routes of attack for just processing a message's statistical contents to be good enough, though I think it's probably a good component of an anti-abuse system. In fact, maybe we should be talking about solutions that can use multiple components, and be very adaptive...
  • Distributed machine learning sets: Probably way too computationally expensive to run in a decentralized network, but maybe I'm wrong. Maybe this can be done in the right way, but I get the impression that without the training dataset it's probably not useful? Prove me wrong! But I also just don't know enough about machine learning. Has the right property of being adaptive, though.
  • Genetic programs: Okay, I hear you saying, "what?? genetic programming?? as in programs that evolve?" It's a field of study that has quite a bit of research behind it, but very little application in the real world... but it might be a good basis for filtering systems in a federated network (I'm beginning to explore this but I have no idea if it will bear fruit). Programs might evolve on your machine and mine which adapt to the changing nature of social attacks. And best of all, in a distributed network, we might be able to send our genetic anti-abuse programs to each other... and they could breed and make new anti-abuse baby programs! However, for this to work the programs would have to carry part of the information of their "experiences" from parent to child. After all, a program isn't very likely to randomly bump into finding out that a hateful group has started using "cuck" as a slur. But programs keep information around while they run, and it's possible that parent programs could teach wordlists and other information to their children, or to other programs. And if you already have a trust network, your programs could propagate their techniques and information with each other. (There's a risk of a side channel attack though: you might be able to find some of the content of information sent/received by checking the wordlists etc being passed around by these programs.) (You'd definitely want your programs sandboxed if you took this route, and I think it would be good for filtering only... if you expose output methods, your programs might start talking on the network, and who knows what would happen!) One big upside to this is that if it worked, it should work in a distributed system... you're effectively bringing the anti-abuse hamster cages together now and then. However, you do get into an ontology problem... if these programs are making up wordlists and binding them to generated symbols, you're effectively generating a new language. That's not too far from human-generated language, and so at that point you're talking about a computer-generated natural language... but I think there may be evolutionary incentive to agree upon terms. Setting up the "fitness" of the program (same with the machine learning route) would also have to involve determining what filtering is useful / isn't useful to the user of the program, and that's a whole challenging problem domain of its own (though you could start with just manually marking correct/incorrect the way people train their spam filters with spam/ham). But... okay by now this sounds pretty far-fetched, I know, but I think it has some promise... I'm beginning to explore it with a derivative of some of the ideas from PushGP. I'm not sure if any of these ideas will work but I think this is both the most entertainingly exciting and crazy at the same time. (On another side, I also think there's an untapped potential for roguelike AI that's driven by genetic algorithms...) There's definitely one huge downside to this though, even if it was effective (the same problem machine learning groups have)... the programs would be nearly unreadable to humans! Would this really be the only source of information you'd want to trust?
  • Expert / constraint based systems: Everyone's super into "machine learning" based systems right now, but it's hard to tell what on earth those systems are doing, even when their results are impressive (not far off from genetic algorithms, as above! but genetic algorithms may not require the same crazy large centralized datasets that machine learning systems tend to). Luckily there's a whole other branch of AI involving "expert systems" and "symbolic reasoning" and etc. The most promising of these I think is the propagator model by Sussman / Radul / and many others (if you've seen the constraint system in SICP, this is a grandchild of that design). One interesting thing about the propagator model is that it can come to conclusions from exploring many different sources, and it can tell you how it came to those conclusions. These systems are incredible and under-explored, though there's a catch: usually they're hand-wired, or the rules are added manually (which is partly how you can tell where the conclusions came from, since the symbols for those sources may be labeled by a human... but who knows, maybe there's a way to map a machine's concept of some term to a human's anyway). I think this probably won't be adaptive enough for the fast-changing world of different attack structures... but! but! we've explored a lot of other ideas above, and maybe you have some combination of a reputation system, and a genetic programming system, and etc, and this branch of study could be a great route to glue those very differing systems together and get a sense of what may be safe / unsafe from different sources... and at least understand how each source, on its macro level, contributed to a conclusion about whether or not to trust a message or individual.
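
As one tiny illustration of the web of trust route above, "recursing outward from the individuals you do already know" can be sketched as a breadth-first walk over a private trust graph with a hop limit. This is a hypothetical Python sketch, not taken from any existing system: the function name `trusted_within`, the dictionary-of-lists graph, and the `max_hops` cutoff are all assumptions.

```python
from collections import deque


def trusted_within(trust, me, max_hops):
    """Return {person: hops} for everyone reachable from `me` in at most max_hops.

    `trust` maps each person to the people they have marked as trusted."""
    hops = {me: 0}
    queue = deque([me])
    while queue:
        person = queue.popleft()
        if hops[person] == max_hops:
            continue                      # don't recurse past the hop limit
        for friend in trust.get(person, ()):
            if friend not in hops:        # first visit is the shortest path
                hops[friend] = hops[person] + 1
                queue.append(friend)
    return hops


trust = {"alice": ["bob"], "bob": ["carol"], "carol": ["dave"]}
print(trusted_within(trust, "alice", 2))  # {'alice': 0, 'bob': 1, 'carol': 2}
```

A filter could then, say, let through messages from anyone in that result and hold everything else for review. Nothing here needs the graph to be published; each node only has to learn what its own contacts are willing to share.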

Okay, well that's it I think! Those are all the routes I've been thinking about. None of these routes are proven, but I hope that gives some evidence that there are avenues worth exploring... and that there is likely hope for the federated web to protect people... and maybe we could even do it better than the silos. After all, if we could do filtering as well as the big orgs, even if it were just at or nearly at the same level (which isn't as good as I'd like), that's already a win: it would mean we could protect people, and also preserve the autonomy of marginalized groups... who aren't very likely to be well protected by centralized regimes if push really does come to shove.

I hope that inspires some people! If you have other routes that should be added to this list or you're exploring or would like to explore one of these directions, please contact me. Once the W3C Social Working Group wraps up, I'm to be co-chair of the following Social Community Group, and this is something we want to explore there.

Update: I'm happy to see that the Matrix folks also see this as "the single biggest existential threat" and "a problem that the whole decentralised web community has in common"... apparently they already have been looking at the Stellar approach. More from their FOSDEM talk slides. I agree that this is a problem facing the whole decentralized web, and I'm glad / hopeful that there's interest in working together. Now's a good time to be implementing and experimenting!

Wireworld in Emacs

By Christine Lemmer-Webber on Fri 10 March 2017

It is a truth universally acknowledged, that a hacker under the pressure of a deadline must be in want of a distraction. So it has been with me; I've a TODO list a mountain high, and I've been especially cracking under the stress of trying to get things moving along with ActivityPub. I have a test suite to write, and it's turned out to be very hard, and this after several other deadlines in a row. I've also meant to blog about several things; say the talks I gave at FOSDEM or at ChicagoLUG. I've got a leak in my inbox that's been running for so long that the basement of my email has developed an undertow. So today, instead of getting what I knew I should be doing done, I instead went off and did something much more interesting, which is to say, I implemented Wireworld in emacs.

Wireworld in emacs screenshot

What is Wireworld? It's a cellular automaton, not unlike Conway's Game of Life. Except with Wireworld, the "cells" in play are a bit more constrained... you have a set of wires, and electrons run along them, multiply, and die out, but the paths stay the same. The rules are very simple to implement (Wikipedia says all there is to know). But you can build incredible things with it... even a fully working computer!
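
To give a sense of how small those rules are, here is a minimal Python sketch of a single Wireworld generation. It is not the Emacs implementation from this post: the `step` function and the cell characters (`@` for an electron head, `~` for a tail, `#` for wire) are assumptions made for illustration. The rules themselves are the standard ones: a head becomes a tail, a tail becomes wire, and wire becomes a head when exactly one or two of its eight neighbors are heads.

```python
# Cell characters are an assumption for this sketch; the Emacs mode
# described in the post may well use different ones.
EMPTY, HEAD, TAIL, WIRE = " ", "@", "~", "#"


def step(grid):
    """Advance a Wireworld grid (a list of equal-length strings) one generation."""
    rows, cols = len(grid), len(grid[0])

    def head_neighbors(r, c):
        # Count electron heads among the eight surrounding cells.
        count = 0
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if (dr, dc) == (0, 0):
                    continue
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and grid[rr][cc] == HEAD:
                    count += 1
        return count

    new_rows = []
    for r in range(rows):
        row = []
        for c in range(cols):
            cell = grid[r][c]
            if cell == HEAD:
                row.append(TAIL)   # heads always decay into tails
            elif cell == TAIL:
                row.append(WIRE)   # tails always decay back into wire
            elif cell == WIRE and head_neighbors(r, c) in (1, 2):
                row.append(HEAD)   # wire fires with exactly 1 or 2 head neighbors
            else:
                row.append(cell)   # empty cells and quiet wire stay as they are
        new_rows.append("".join(row))
    return new_rows
```

Calling `step` repeatedly on a grid such as `["~@###"]` moves the electron one cell to the right per generation, while the wire layout itself never changes.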

Anyway, like many hacks, this one appeared out of boredom/distraction. I had long wanted to play with Wireworld, and I was reminded of it by seeing this cool hack with a digital clock implemented in Conway's Game of Life. It reminded me just how much I wanted to try implementing that computer, or even much simpler circuitry, but I had never been able to get started, because I couldn't find a working implementation that was easy for me to package. (I started packaging Golly for Guix but got stuck for reasons I can't remember.) I started thinking about how much I liked typing out ASCII art in Emacs, and how cool would it be to just "draw out" circuits in a buffer? I started experimenting... and within two hours, I had a working implementation! Two more hours later, I had a major mode with syntax highlighting and a handy C-c C-c keybinding for "advancing" the buffer. Live hacking in Emacs is amazing!

More could be done. It would be nice to have a shortcut, say C-c C-s, that starts up a simulation in a new buffer and runs through the simulation automatically without clobbering your main buffer. (It could work the way M-x life does.) Anyway, the code is here should you want to play around.

Happy (circuit) hacking!

Gems from really old lisp mailing lists

By Christine Lemmer-Webber on Thu 09 March 2017

... which are archived here. I'm especially finding the CADR lisp machine mailing list to be interesting.

The lispnews list is a bit hard to read, but unveils some key lisp ideas one after another in their earliest state; fascinating stuff. First reference to unwind-protect, and the details of backquote/quasiquote are being worked out here. (EDIT: more on backquote's history.)

Here's some interesting bits: David Moon (who worked on Common Lisp, helped develop Emacs, and was one of the original developers of the lisp machine) mentioning Common Lisp and the CADR switching to it; rms (who was a maintainer of lisp software at the time) not being so pleased about it, or the way it was announced, and Guy L. Steele (who was editing the Common Lisp standard) replying. Later RMS seems to be investigating how to make it work together.

Sadly it seems that debate was discouraged on that list, and I don't see the BUG-LISPM list around anywhere.

You probably noticed that I was cherry-picking reading emails by RMS. It's no coincidence... I knew this was coming up, and here it is:

Here also is where Symbolics started to move out of the AI lab and where they announced that MIT may use their software, but may not distribute it outside the lab... which is, according to my understanding, one of the major factors frustrating rms and leading to the founding of GNU. A quote from that email:

This software is provided to MIT under the terms of the Lisp System License Agreement between Symbolics and MIT. MIT's license to use this software is non-transferable. This means that the world loads, microloads, sources, etc. provided by Symbolics may not be distributed outside of MIT without the written permission of Symbolics.

There it is, folks! And here's another user, Martin Connor, raising concerns about what the Symbolics agreement will mean. That person seems to be taking it well. But guess who isn't? Okay, you already guessed RMS, and were right. Presumably a lot of argument about this was happening on the BUG-LISPM list. I guess it's not important, but here is an amusing back and forth. I wonder if anyone has access to the BUG-LISPM or BUG-LISPM-MIT lists still?

Notably RMS wants to clarify that his work doesn't go to Lisp Machines Incorporated specifically, either, even though he was more okay with them.

I'm giving a talk at LibrePlanet 2017 on the Lisp Machine and GNU, which explains why I'm reading all this! Okay, well maybe I would have read it anyway.

Phyllis Fox, documenting Lisp History

By Christine Lemmer-Webber on Wed 08 March 2017

In honor of International Women's Day, let's celebrate Phyllis Fox, who may have saved Lisp from the dustbin of history... by documenting it. From her oral history:

HAIGH: So you say that you wrote the first LISP manual?

FOX: Now, this was not because I was a great LISP programmer, but they never documented or wrote down anything, especially McCarthy. Nobody in that group ever wrote down anything. McCarthy was furious that they didn’t document the code, but he wouldn’t do it, either. So I learned enough LISP that I could write it and ask them questions and write some more. One of the people in the group was a student named Jim Slagel, who was blind. He learned LISP sort of from me, because I would read him what I had written and he would tell me about LISP and I would write some more. His mind was incredible. He could give lectures. Have you ever seen a blind person lecture?

HAIGH: No.

FOX: They write on a black (or white) board, and then they put a finger on the board at the point they have stopped to keep the place. Then they talk some more and then they go on writing. His mind was remarkable. He was very helpful to me. But I wrote those manuals. I would ask questions from Minsky or McCarthy, and I got it done. I think it was helpful for people to have it. I guess, essentially I’m a documenter. If you’re looking for it, that’s what I am.

Phyllis Fox did a lot more than that, but as a Lisp enthusiast, thank you to Dr. Fox for preserving our programming knowledge!

CRISPR drive as a Thompson hack?

By Christine Lemmer-Webber on Sat 04 March 2017

I listened to this episode of Radiolab on CRISPR last night (great episode, everyone should listen), and I couldn't stop thinking about this part discussed at the end of the episode about a CRISPR gene drive... the idea is, you might want to replace some gene in a population, so you might use CRISPR to edit the gene of a single parent. Then that parent might reproduce, and there's a chance that its child might carry the edited gene into the population. Natural selection decides whether it stays or not... it could, very well, peter out of a population.

But then they talked about this idea, which apparently worked on yeast "on the first try", which was to have the parent modify the DNA of the child during reproduction. The parent includes the instructions so that in reproduction, it goes through and edits its new child's DNA, and inserts the instructions on how to have that editor in the child's DNA too.

Holy crap, am I wrong or is that a Thompson hack in DNA form?

An even more distributed ActivityPub

By Christine Lemmer-Webber on Thu 06 October 2016

So ActivityPub is nearing Candidate Recommendation status. If you want to hear a lot more about that whole process of getting there, and my recent trip to TPAC, and more, I wrote a post on the MediaGoblin blog about it.

Last night my brother Stephen came over and he was talking about how he wished ActivityPub was more of a "transactional" system. I've been thinking about this myself. ActivityPub as it is designed is made for the social network of 2014 more or less: trying to reproduce what the silos do, which is mutate a big database for specific objects, but reproduce that in a distributed way. Well, mutating distributed systems is a bit risky. Can we do better, without throwing out the majority of the system? I think it's possible, with a couple of tweaks.

  • The summary is to move to objects and pointers to objects. There's no mutation, only "changing" pointers (and even this is done via appending to a log, mostly).

    If you're familiar with git, you could think of the objects as, well, objects, and the pointers as branches.

    Except... the log isn't in the objects pointing at their previous revisions really, the logging is on the pointers:

    [pointer id] => [note content id]
  • There's (activitystreams) objects (which may be content addressed, to be more robust), and then "pointers" to those, via signed pointer-logs.

  • The only mutation in the system is that the "pointers", which are signed logs (substitute "logs" for "ledger" and I guess that makes it a "blockchain" loosely), are append-only structures that say where the new content is. If something changes a lot, it can have "checkpoints". So, you can ignore old stuff eventually.

  • Updating content means making a new object, and updating the pointer-log to point to it.

  • This of course leads to a problem: what identifier should objects use to point at each other? The "content" id, or the "pointer-log" id? One route is that when one object links to another object, it could link to both the pointer-log id and the object id, but that hardly seems desirable...

  • Maybe the best route is to have all content ids point back at their official log id... this isn't as crazy as it sounds! Have a three step process for creating a brand new object:

    • Open a new pointer-log, which is empty, and get the identifier

    • Create the new object with all its content, and also add a link back to the pointer-log in the content's body

    • Add the new object as the first item in the pointer-log

  • At this point, I think we can get rid of all side effects in ActivityPub! The only mutation is appending to that pointer-log. As for everything else:

    • Create just means "This is the first time you've seen this object." And in fact, we could probably drop Create in a system like this, because we don't need it.

    • Update is really just informing that there's a new entry on the pointer-log.

    • Delete... well, you can delete your own copy. You're mostly informing other servers to delete their copy, but they have a choice if they really will... though that's always been true! You now can also switch to the nice property that removing old content is now really garbage collection :)

  • Addressing and distribution still happen in the same, previous ways they did, I assume? So, you still won't get access to an object unless you have permissions? Though that gets more confusing if you use the (optional) content addressed storage here.

  • You now get a whole lot of things for free:

    • You have a built in history log of everything

    • Even if someone else's node goes down, you can keep a copy of all their content, and keep around the signatures to show that yeah, that really was the content they put there!

    • You could theoretically distribute storage pretty nicely

    • Updates/deletes are less dangerous
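
To make the shape of this concrete, here is a minimal in-memory sketch in Python of the objects-plus-pointer-logs idea. Everything in it is hypothetical: the `Store` class, its method names, and the `log-N` identifiers are made up for illustration and are not part of ActivityPub. Signing of log entries is left out entirely, and SHA-256 over canonical JSON is just one assumed way of content addressing.

```python
import hashlib
import json


class Store:
    """Toy store: immutable content-addressed objects plus append-only pointer-logs."""

    def __init__(self):
        self.objects = {}    # content id -> object; entries are never mutated
        self.logs = {}       # pointer-log id -> list of content ids, append-only
        self._next_log = 0

    def open_log(self):
        # Step 1: open a new, empty pointer-log and get its identifier.
        log_id = f"log-{self._next_log}"
        self._next_log += 1
        self.logs[log_id] = []
        return log_id

    def put_object(self, log_id, content):
        # Step 2: build the object with a link back to its pointer-log,
        # and derive its id from its bytes (content addressing).
        obj = {"content": content, "log": log_id}
        data = json.dumps(obj, sort_keys=True).encode("utf-8")
        content_id = hashlib.sha256(data).hexdigest()
        self.objects[content_id] = obj
        # Step 3: append to the log. This is the only "mutation" in the system.
        self.logs[log_id].append(content_id)
        return content_id

    def create(self, content):
        log_id = self.open_log()
        self.put_object(log_id, content)
        return log_id

    def update(self, log_id, content):
        # An update is a brand new object plus a new entry on the pointer-log.
        return self.put_object(log_id, content)

    def current(self, log_id):
        return self.objects[self.logs[log_id][-1]]

    def history(self, log_id):
        return [self.objects[cid] for cid in self.logs[log_id]]
```

With this, `note = store.create("hello")` followed by `store.update(note, "hello, world")` leaves both versions in the store: `store.current(note)` returns the latest object, and `store.history(note)` is the built-in history log mentioned above. A real system would also need signatures on the log entries and some answer to the access-control question.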

(Thanks to Steve for encouraging me to think this through more clearly, and lending your own thoughts, a lot of which are represented here! Thanks also to Manu Sporny who was the first to get me thinking along these lines with some comments at TPAC. Though, any mistakes in the design are mine...)

Of course, you can hit even more distributed-system-nerd points by tossing in the possibility of encrypting everything in the system, but let's leave that as an exercise for the reader. (It's not too much extra work if you already have public keys on profiles.)

Anyway, is this likely to happen? Well, time is running out in the group, so I'm unlikely to push for it in this iteration. But the good news, as I said, is that I think it can be built on top without too much extra work... The systems might even be straight-up compatible, and eventually the old mutation-heavy-system could be considered the "crufty" way of doing things.

Architectural astronaut'ing? Maybe! Fun to think about! Hopefully fun to explore. Gotta get the 2014-made-distributed version of the social web out first though. :)

Will your tooling let me go offline?

By Christine Lemmer-Webber on Fri 15 July 2016

I have been a happy man ever since January 1, 1990, when I no longer had an email address. I'd used email since about 1975, and it seems to me that 15 years of email is plenty for one lifetime.

Email is a wonderful thing for people whose role in life is to be on top of things. But not for me; my role is to be on the bottom of things. What I do takes long hours of studying and uninterruptible concentration. I try to learn certain areas of computer science exhaustively; then I try to digest that knowledge into a form that is accessible to people who don't have time for such study.

-- Donald Knuth on not reading email

Finally working again on tasks where I can "go offline" for periods of time. For a while I've been working on things where all the documentation I needed was "live" on the web, and it was too difficult to know what to pull down in advance. Now I'm going offline for periods to work on the thing I'm doing, and remembering just how much that helps. Sometimes I just can't focus with eternal streams of... everything.

I've found over time that I'm massively more productive working with software that has texinfo manuals or man pages, because I can "go offline" for a while and think through problems without the eternal distractnet affecting my ability to concentrate. (I know info manuals aren't great for non-emacs users. But for me, it really helps me focus. Plus, there's nothing like navigating through info manuals in emacs if you are an emacs user.)

I'm not claiming this is a full on accessibility issue, but given my really strong ADD, whether or not you provide good offline manuals affects how productive I am with your tooling.

This post was originally posted to the pumpiverse.

Memories of a march against DRM

By Christine Lemmer-Webber on Wed 23 March 2016

Protesting EME before the Microsoft building. Above image CC BY 3.0, originally here, and here's a whole gallery of images.

I participated in a rally against the W3C endorsing DRM last Sunday. I know it was recorded, but I haven't seen any audio or video recordings up yet, and some friends have asked what really happened there. I thought I'd write up what I remembered.

First, some context: the rally (and subsequent roundtable discussion) wasn't officially part of LibrePlanet, but it did happen right after it. This was one of the busiest free software focused weeks of my life, and just earlier in the week I had been participating in the Social Web Working Group at the W3C, trying to hammer out our work on federation and other related standards. I'm so excited about this work that it stands in interesting contrast to my feelings on a different "standards in the W3C" issue: the real danger that the W3C will endorse DRM by recommending the Encrypted Media Extensions specification.

Before I get to the rally itself, I want to dispel what I think has been some intentional muddying of the issue by advocates of the specification. Let's turn to the words of the specification itself:

This specification does not define a content protection or Digital Rights Management system. Rather, it defines a common API that may be used to discover, select and interact with such systems as well as with simpler content encryption systems. Implementation of Digital Rights Management is not required for compliance with this specification: only the Clear Key system is required to be implemented as a common baseline.

Whew! That doesn't sound so bad does it? Except, oh wait, reading this you might think that this isn't about DRM at all, and that's an intentional piece of trickery by the authors of this specification. As Danny O'Brien later said at the panel (I'm paraphrasing here): "While it's true that the specification doesn't lay out a method for implementing DRM, it instead lays out all the things that surround this hole. The thing is, it's a DRM shaped hole, and indeed DRM is the only thing that fits in that hole."

So once you look at it that way, yes, it's a DRM-enabling specification. We have other programs and standards for handling "encryption". Encryption is good, because it empowers users. The goal of this specification is to make space for something to fit onto your computer to strip you of your computing power and freedom.

With that said, onto the memories of the evening.

The march started outside MIT’s Ray and Maria Stata Center, where the W3C offices are. There were a good number of people there, though I didn't count them. I'm guessing it was 50 or so people, which is not a bad turnout at all for a post-busy-conference everyone-is-probably-exhausted march. Despite anticipating being totally exhausted, I was surprised to find that I wasn't, and neither was anyone around me. Everyone seemed super fired up.

There were some speeches from Harry Halpin and Zak Rogoff and myself to kick things off. I don't remember Harry's or Zak's speeches at this stage, though I remember thinking they were pretty good. (Harry made clear that he was a W3C staff member but was acting in his own capacity.)

As for what I said, here's my rough memory:

I started MediaGoblin from the goal and vision of preserving the decentralized nature of the World Wide Web in the growing area of media publishing, through audio, video, images, and so on. Thus I was proud to join the W3C in the standards work on our work formalizing federation through ActivityPub and by participating in the Social Web Working Group. But if the W3C enables EME, it enables DRM, and this threatens to undermine all that. If this were to apply to video only, this would be threat enough to oppose it. But once that floodgate opens, DRM will quickly apply to all types of documents distributed through the web, including HTML and JavaScript. The W3C's lasting legacy has been to build a decentralized document distribution network which enables user freedom. We must not allow the W3C to become an enemy of itself. Don't let the W3C lower its standards, oppose DRM infecting the web!

Anyway, something like that!

A lot of things happened, so let me move on to my memories of what happened from there, in bulleted list form:

  • We marched from MIT to Microsoft. There were quite a few chants, and "rm DRM" was the most fun to chant, but notably probably the least clear to any outsiders.
  • Danny O'Brien from the EFF gave a speech in front of the Microsoft building giving a history of DRM and why we must oppose it. He noted that one of the most dangerous parts of DRM in the United States is that the DMCA makes explaining how DRM works a crime, thus talking about the issue can become very difficult.
  • After the march we went to the roundtable discussion / panel, hosted at the MIT Media Lab. It was a full room, with even more people than the march (maybe 80-100 people attending, but I'm bad at counting these things). Everyone ate some pizza, which was great times. Then Richard Stallman, Danny O'Brien, Joi Ito, and Harry Halpin all took turns speaking.
  • Richard Stallman started with an introduction to free software generally. He then went through a detailed explanation about how DRM makes user freedom impossible. He then said something funny like "I was allowed 30 minutes, but I notice I only used 15; I will use the other 15 minutes to follow up to others if necessary." (He used a good portion of them to correct people on their terminology.)
  • Danny O'Brien gave a detailed explanation of the history of the fight against DRM. He also gave his "EME is a standard with a DRM shaped hole" comment. He then gave a history of a fight he considered similar, the fight against software patents, and how the W3C had decided to fight patents by including rules that W3C members could not sue using patents for concepts covered by these specifications.
  • This led into what was probably the biggest controversy among the panel members: a proposal by the EFF of a "covenant" surrounding DRM. The proposal was something like, "if the W3C must adopt EME, it should provide rules protecting the public by making members promise that they will never sue free software developers and security researchers for violating DRM." Richard Stallman had a strong response to this, saying that this is not really a compromise (Danny clarified that this proposal did not mean giving up fighting DRM) and while it could make things a little bit less dangerous, it would still be very dangerous. It could be easily circumvented as the party suing might not be a W3C member (and indeed, you could imagine many record companies or Hollywood studios who are not W3C members suing in such a scenario).
  • A W3C staff employee at one point said that if the general public were to comment on EME, it should be disagreeing on technical points, and be careful not to raise confused technical issues, as that will lead to comments being dismissed. Danny gave a nice response, saying that while he agreed that technical issues should be correctly engaged, technical decisions are made within a policy context, so we should be careful to not tell people to limit themselves to technical-issue-only comments.
  • Joi Ito gave various anecdotes about his understanding of what led DRM to its current rise in prominence today. He also used "intellectual property" several times, leading predictably to a terminology-correcting response from RMS.
  • One audience member suggested that if the W3C adopts EME, it shows that it cannot be trusted with the responsibility of managing the web's standards. Interestingly, this seemed to be met with a good deal of approval from the crowd. It also was an interesting counter-point to the "well if the W3C doesn't do it, someone will just set up another standards body to support DRM" argument. The W3C may face just as much "risk", or more, of other standards bodies emerging to replace it if it moves forward with adopting EME (but in this case, from individuals motivated by preserving the decentralized integrity of the web).
  • Harry Halpin ended the panel with a bang... first, he reiterated that in participating in this panel, he was acting independently and not as a W3C employee. (And again, to paraphrase:) "However, I will say that there are some lines that must be drawn. Permitting DRM to enter into the web is a line that must not be crossed. And if the W3C moves to recommend EME, I will resign."

Bam!

And so, that was my Sunday evening. If you had told me that I would end the last evening of the last day of the week even more energized than when I began it (especially after a week as busy as that!), I would not have believed you. But there it is! I'm glad I got to participate.

For more coverage, read up at Defective By Design, Motherboard, and BoingBoing. Oh yeah, and sign the anti-DRM petition while you're at it!

Goodbye 2015, Hello 2016

By Christine Lemmer-Webber on Mon 04 January 2016

I'm sitting on a train traveling from Illinois to California, the long stretch of a journey from Madison to San Francisco. Morgan sits next to me. We are staring out the windows of the observation deck of this train as we watch the snow covered mountains pass by. I am feeling more relaxed and at peace than I have in years.

2016 is opening in a big way for me. As you may have heard (I mentioned it in the last State of the Goblin post) MediaGoblin was accepted into the Stripe Open Source Retreat program. Basically, Stripe gives us no-strings-attached funding for me to advance our work on MediaGoblin, but they wanted me to work from their office during that time. Seems like quite a deal to me! Unfortunately it does mean leaving Morgan behind in Madison for that time period. But that's why we splurged on a fancy train car and why she's joining me in San Francisco for the first week, so we can spend some quality time together. (Plus, Morgan has a conference that first week in San Francisco anyway; double plus, Amtrak has an extremely generous baggage policy so I'm able to get all of the belongings I need for that period shipped along with me fairly easily.) Morgan and I have been talking about but not really taking a vacation for a while, so we decided the moving-scenery approach would be a nice way to do things. It's great... we're mostly reading and drinking tea and staring out the window at the beautiful passings-by. I could hardly imagine a nicer send-off. (So yeah, if you're considering taking such a journey with your loved ones, I recommend it.)

The passage of scenery leads to reflection on the passage of time. Now seems a good time to write a bit about 2015 and what it meant. It was a very eventful year for me. I have come recently to explain to people that "I live a magical and high-stress life"; 2015 evoked that well. From a personal standpoint, Morgan's and my relationship runs strong, maybe stronger than ever, and I am thankful for that. From the broader family standpoint, the graph advances steady at times with strong peaks and valleys, perhaps more pronounced than usual. Love, gain, success, loss... it feels that everything has happened this year. Our lives have also been rearranged dramatically in an attempt to help a family member in a time of need, and that has its own set of peaks and valleys, as is to be expected. But that is the stuff of life, and you do what you can when you can, and you try your best, and you hope that others will try their best; what happens from there happens, and you use it to plan the next round of doing the best you can.

That's all very vague I suppose, but many things feel too private to discuss so publicly. Nonetheless, I wanted to record the texture of the year.

So what in the way of, you know, that thing we call a "career"? Well, it has continued to be magical, in the way that I have had a lot of freedom to explore things and address issues I really care about. Receiving an award (particularly since I did not know I had even been a candidate ahead of being notified that I received it) has also been gratifying and reassuring in some ways; I regularly fear that I am not doing well enough at advancing the issues I care about, but clearly some people think I am, and that's nice. It has also continued to be high stress, in that the things I worry about feel very high stakes on a global level, and that the difficulty of accomplishing them also feels very strong, and of course many are not there yet. Nonetheless, there has been a lot of progress this year, though it has come with a worrying increase of scope in the number of things I am attempting to accomplish.

We're much nearer to 1.0 on MediaGoblin, which is a huge relief. Of course, this is mostly due to Jessica Tallon's hard work on getting federation in MediaGoblin working, and other MediaGoblin community members doing many other interesting things. Embarrassingly, I have done a lot less on MediaGoblin than in the last few years. In a sense, this is okay, because the money from the campaign has been going to pay Jessica Tallon, and not myself. I still feel bad about it though. The good news is that the focus time from the Stripe retreat should allow me the space and focus to hopefully get 1.0 actually out the door. So that leads to strong optimism.

The reduced time spent coding on MediaGoblin proper has been deceptive, since most of the projects I've worked on have spun out of work I believe is essential for MediaGoblin's long-term success. I took a sabbatical from MediaGoblin proper mid-year to focus on two goals: advancing federation standards (and my own understanding of them), and advancing the state of free software deployment. (I'm aware of a whiff of yak fumes here, though for each I can't see how MediaGoblin can succeed in their present state.) I believe I have made a lot of progress in both areas. As for federation, I've worked hard in participating in the W3C Social Working Group, I have done some test implementations, and recently I became co-editor on ActivityPump. On deployment, much work has been done on the UserOps side, both in speaking and in actual work. After initially starting to try to use Salt/Ansible as a base and hitting limitations, then trying to build my own Salt/Ansible'esque system in Hy and then Guile and hitting limitations there too, I eventually came to look into (after much prodding) Guix. At the moment, I think it's the only foundation solid enough on which to build the tooling to get us out of this mess. I've made some contributions, albeit mostly minor, have begun promoting the project more heavily, and am trying to work towards getting more deployment tooling done for it (so little time though!). I'm also now dual booting between GuixSD and Debian, and that's nice.

(Speaking of, towards the end of the year I switched to a Minifree x200 on which I'm dual booting Debian and Guix. I believe this puts me much deeper into the "free software vegan" territory.)

I also believe that over the last year I have changed dramatically as a programmer. For nearly ten years I identified as a "python web developer", but I believe that identity no longer feels like an ideal description. One thing I have always been self conscious of is how little I've known about deeper computer science fundamentals. This has changed a lot, and I believe much of it has been spending so much time in the Guile and Scheme communities, and reading the copious interesting literature that is available there. My brother Steve and I also now often meet together and watch various programming lectures and discuss them, which has been both illuminating and also a great way to understand a side of my brother I never knew. It's a nice mix; I'm a very get-things-done person, he's a very theoretical person, and we're meeting partway in the middle and I think both of us are stretching our brains in ways we hadn't before. I feel like a different programmer than I was. A year and a half ago, I remember being on a bike ride with Steve and I remember complaining to him that I didn't understand why functional programmers are so obsessed with immutability... mutation is so useful, I exclaimed! Steve paused and said very carefully, "Well... mutation brings a lot of problems..." but I just didn't understand what he was getting at. Now I look back on that bike ride and wonder at the former-me taking that position.

(All that said though, I'm glad that I've had the background I have of being a "python web developer" first, for a matter of perspective...)

I do feel that much has changed in my life in this last year. There were hard things, but overall, life has been good to me, and I still am doing what I believe in and care about. Not everyone has that opportunity. And this train ride already points the way to a year that should be productive, and will certainly be eventful.

Anyway, that's enough navel-gazing-reflection, I suppose. One more navel-gaze: here's to the changed person on the other end of 2016. I hope I can do them justice. And I hope you can do yourself justice in 2016 too.