What should fit in a FOSS license?

By Christopher Lemmer Webber on Mon 09 March 2020

Originally sent in an email to the OSI license-discuss mailing list.

What terms belong in a free and open source software license? There has been a lot of debate about this lately, especially as many of us are interested in expanding the role we see ourselves playing in terms of user freedom issues. I am amongst those people who believe that the importance of FOSS as a movement is best understood not on its own, but through the effects that it (or the lack of it) has on society. A couple of years ago, a friend and I recorded an episode about viewing software freedom within the realm of human rights; I still believe that, and strongly.

I also believe there are other critical issues that FOSS has a role to play in: diversity issues (both within our own movement and empowering people in their everyday lives) are one, environmental issues (the intersection of our movement with the right-to-repair movement is a good example) are another. I also agree that the trend towards "cloud computing" companies which can more or less entrap users in their services is a major concern, as are privacy concerns.

Given all the above, what should we do? What kinds of terms belong in FOSS licenses, especially given all our goals above?

First, I would like to say that I think that many people in the FOSS world, for good reason, spend a lot of time thinking about licenses. This is good, and impressive; few other communities have as much legal literacy distributed even amongst their non-lawyer population as ours. And there's no doubt that FOSS licenses play a critical role... let's acknowledge from the outset that a conventionally proprietary license has a damning effect on the agency of users.

However, I also believe that user freedom can only be achieved via a multi-layered approach. We cannot provide privacy by merely adding privacy-requirements terms to a license, for instance; encryption is key to our success. I am also a supporter of codes of conduct and believe they are important and effective (I know not everyone does; I don't care for this to be a CoC debate, thanks), and they've been very effective and successful checked in as CODE-OF-CONDUCT.txt alongside the traditional COPYING.txt/LICENSE.txt. This is a good example of a multi-layered approach working, in my view.

So acknowledging that, which problems should we try to solve at which layers? Or, more importantly, which problems should we try to solve in FOSS licenses?

Here is my answer: the role of FOSS licenses is to undo the damage that copyright, patents, and related intellectual-restriction laws have done when applied to software. That is what should be in the scope of our licenses. There are other problems we need to solve too if we truly care about user freedom and human rights, but for those we will need to take a multi-layered approach.

To understand why this is, let's rewind time. What is the "original sin" that led to the rise of proprietary software, and thus the need to distinguish FOSS as a separate concept and entity? In my view, it's the decision to make software copyrightable... and then, adding similar "state-enforced intellectual restrictions" categories, such as patents or anti-jailbreaking or anti-reverse-engineering laws.

It has been traditional FOSS philosophy to emphasize these as entirely different systems, though I think Van Lindberg put it well:

Even from these brief descriptions, it should be obvious that the term "intellectual property" encompasses a number of divergent and even contradictory bodies of law. [...] intellectual property isn't really analogous to just one program. Rather, it is more like four (or more) programs all possibly acting concurrently on the same source materials. The various IP "programs" all work differently and lead to different conclusions. It is more accurate, in fact, to speak of "copyright law" or "patent law" rather than a single overarching "IP law." It is only slightly tongue in cheek to say that there is an intellectual property "office suite" running on the "operating system" of US law. -- Van Lindberg, Intellectual Property and Open Source (p.5)

So then, as unfortunate as the term "intellectual property" may be, we do have a suite of state-enforced intellectual restriction tools. They now apply to software... but as a thought experiment, if we could rewind time and choose between a timeline where such laws did not apply to software vs a timeline where they did, which would have a better effect on user freedom? Which one would most advance FOSS goals?

To ask the question is to know the answer. But of course, we cannot reverse time, so the purpose of this thought experiment is to indicate the role of FOSS licenses: to use our own powers granted under the scope of those licenses to undo their damage.

Perhaps you'll already agree with this, but you might say, "Well, but we have all these other problems we need to solve too though... since software is so important in our society today, trying to solve these other problems inside of our licenses, even if they aren't about reversing the power of the intellectual-restriction-office-suite, may be effective!"

The first objection to that would be, "well, but it does appear that it makes us addicted in a way to that very suite of laws we are trying to undo the damage of." But maybe you could shrug that off... these issues are too important! And I agree the issues are important, but again, I am arguing a multi-layered approach.

To better illustrate, let me propose a license. I actually considered drafting this into real license text and trying to push it all the way through the license-review process. I thought that doing so would be an interesting exercise for everyone. Maybe I still should. But for now, let me give you the scope of the idea. Ready?

"The Disposable Plastic Prevention Public License". This is a real issue I care about, a lot! I am very afraid that there is a dramatic chance that life on earth will be choked out within the next number of decades by just how much non-degradeable disposable plastic we are churning out. Thus it seems entirely appropriate to put it in a license, correct? Here are some ideas for terms:

  • You cannot use this license if you are responsible for a significant production of disposable plastics.

  • You must make a commitment to reduction in your use of disposable plastics. This includes a commitment to reductions set out by (a UN committee? Haven't checked, I bet someone has done the research and set target goals).

  • If you, or a partner organization, are found to be lobbying against laws to eliminate disposable plastics, your grant of this license is terminated.

What do you think? Should I submit it to license-review? Maybe I should. Or, if someone else wants to submit it, I'll enthusiastically help you draft the text... I do think the discussion would be illuminating!

Personally though, I'll admit that something seems wrong about this, and it isn't the issue... the issue is one I actually care about a lot, one that keeps me up at night. Does it belong in a license? I don't think that it does. This tries to fix problems via the very structures that we are trying to undo problems with, and it introduces license compatibility headaches. It's trying to fight an important issue on the wrong layer.

It is a FOSS issue though, in an intersectional sense! And there are major things we can do about it. We can support the fight of the right-to-repair movements (which, as it turns out, is a movement also hampered by these intellectual restriction laws). We can try to design our software in such a way that it can run on older hardware and keep it useful. We can support projects like the MNT Reform, which aims to build a completely user-repairable laptop, and thus push back against planned obsolescence. There are things we can, and must, do that are not in the license itself.

I am not saying that the only kind of thing that can happen in a FOSS license is to simply waive all rights. Indeed I see copyleft as a valid way to turn the weapons of the system against itself in many cases (and there are a lot of cases, especially when I am trying to push standards and concepts, where I believe a more lax/permissive approach is better). Of course, it is possible to get addicted to those things too: if we could go back in our time machine and prevent these intellectual restriction laws from taking place, source requirements in copyleft licenses wouldn't be enforceable. While I see source requirements as a valid way to turn the teeth of the system against itself, in that hypothetical future, would I be so addicted to them that I'd prefer that software copyright continue just so I could keep them? No, that seems silly. But we also aren't in that universe, and are unlikely to enter that universe anytime soon, so I think this is an acceptable reversal of the mechanisms of the destructive state-run intellectual restriction machine against itself for now. But it also indicates maybe a kind of maximum.

But it's easy to get fixated on those kinds of things. How clever can we be in our licenses? And I'd argue: minimally clever. Because we have a lot of other fights to make.

In my view, I see a lot of needs in this world, and the FOSS world has a lot of work to do... and not just in licensing, but on many layers. Encryption for privacy, diversity initiatives like Outreachy, codes of conduct, software that runs over peer-to-peer networks rather than in the traditional client-server model, repairable and maintainable hardware, thought about the environmental impact of our work... all of these things are critical in my view.

But FOSS licenses need not, and should not, try to take on all of them. FOSS licenses should do the thing they are appropriate to do: to pave a path for collaboration and to undo the damage of the "intellectual restriction office suite". As for the other things, we must do them too... our work will not be done, meaningful, or sufficient if we do not take them on. But we should do them hand-in-hand, as a multi-layered approach.

Terminal Phase v1.1 and Spritely Goblins v0.6 releases!

By Christopher Lemmer Webber on Thu 05 March 2020

Hello all! I just did a brand new release of both:

  • Terminal Phase v1.1

  • Spritely Goblins v0.6

So some highlights from each.

Terminal Phase

Okay, this is flashier, even if less important than Goblins. But the main thing is that I added the time travel debugging feature, which is so flashy I feel the need to show that gif again here:

Time travel in Spritely Goblins shown through Terminal Phase

Aside from time travel, there aren't many new features, though I plan on adding some in the next week (probably powerups or a boss fight), so another release should be not far away.

And oh yeah, since it's a new release, now is a good time to thank the current supporters:

Terminal Phase Credits

But yeah, the main thing that was done here is that Terminal Phase was updated for the new release of Goblins, so let's talk about that!

Goblins

For those who aren't aware, Spritely Goblins is a transactional actor model library for Racket.

v0.6 has resulted in a number of changes in semantics.

But the big deal is that Goblins finally has decent documentation, including a fairly in-depth tutorial and documentation about the API. I've even documented how you, in your own programs, can play with Goblins' time travel features.

So, does this mean you should start using it? Well, it's still in alpha, and the most exciting feature (networked, distributed programming) is still on its way. But I think it's quite nice to use already (and I'm using it for Terminal Phase).

Anyway, that's about it... I plan on having a new video out in the next few days explaining more about how Goblins works, so I'll announce that when it happens.

If you are finding this work interesting, a reminder that this work is powered by people like you.

In the meanwhile, hope you enjoy the new releases!

Content Addressed Vocabulary

By Christopher Lemmer Webber on Wed 26 February 2020

How can systems communicate and share meaning? Communication within systems is preceded by a form of meta-communication; we must have a sense that we mean the same things by the terms we use before we can even use them.

This is challenging enough for humans who must share meaning, but we can resolve ambiguities with context clues from a surrounding narrative. Machines, in general, need a context more explicitly laid out for them, with as little ambiguity as possible.

Standards authors of open-world systems have long struggled with this problem and have come up with some reasonable solutions; unfortunately these also suffer from several pitfalls. I propose a change in how we manage ontologies, one requiring minimal (or sometimes no) adjustment to our tooling.

How we deal with ambiguous terms today

Consider Note, a seemingly simple term in ActivityStreams, the vocabulary used by ActivityPub. The meaning of Note, as described by the ActivityStreams vocabulary, seems simple enough: Represents a short written work typically less than a single paragraph in length.

Here is how an ActivityStreams usage of Note might look (a bit simplified from what it would probably look like in practice):

  {"@context": "https://www.w3.org/ns/activitystreams",
   "@type": "Note",
   "content": "Would you read me a bedtime story about the great ontology wars?"}

What's that @context thing? This is some JSON-LD thing, which tries to be "more exact" about what Note we must be talking about. It does so by mapping Note to https://www.w3.org/ns/activitystreams#Note by something like the following:

  {"as": "https://www.w3.org/ns/activitystreams#",
   "Note": "as:Note",
   "content": "as:content",
   ...}

The choice to use JSON-LD has been semi-controversial in ActivityPub land; historically there was some debate about whether or not we needed to be "more exact" at all as to what terms mean. This post really isn't about JSON-LD as much as it is about the more general topic of vocabularies and vocabulary mapping systems. There are other concerns people raise about JSON-LD, usually around the tooling... that's not the scope of this post. This blogpost could as easily apply to XML or Turtle or whatever; the protocol I've worked on just happens to use JSON-LD to do that, so I've used it as my illustration.

That said, the ActivityPub spec tries to make things as simple as possible for the default case of ActivityPub usage by saying that the ActivityStreams context is implied, so that if you're not doing anything complicated:

  {"@type": "Note",
   "content": "Would you read me a bedtime story about the great ontology wars?"}

... is really the same as the first example.

So okay, probably everyone can guess what Note means, but what about sensitive? What the heck is that? It doesn't appear in the ActivityStreams vocabulary; it implies something along the lines of content-warning type behavior, as in "this content may be considered sensitive by some users"... but how would you guess that just from the term? This is an extension, and it lives at http://joinmastodon.org/ns#sensitive.

So maybe if we were going to use it (and if we inline our context) it might look like:

  {"@context": {"as": "https://www.w3.org/ns/activitystreams#",
                "toot": "http://joinmastodon.org/ns#",
                "Note": "as:Note",
                "content": "as:content",
                "sensitive": "toot:sensitive"},
   "@type": "Note",
   "content": "Would you read me a bedtime story about the great ontology wars?",
   "sensitive": true}

(I mean, the Great Ontology Wars are a sensitive topic for some.)

The choice of JSON-LD in ActivityPub is controversial for various reasons. But it turns out what isn't really controversial anymore is whether we need some way of being more exact about the way we speak about terms... those who used to complain about that mostly now agree (disagreements now surround what tooling needs to be used to do so (not in scope of this post), and namespace governance (in scope of this post)).

Maybe you feel like, having heard what sensitive and Note mean, these are the obvious definitions. But consider that Note itself could have meant something very different. Are we talking about a short mostly-textual post (probably on a microblog), as ActivityStreams does? Are we talking about a musical note? Are we instructing someone to take note of something, as an action (or yes, activity)?

So terms really are ambiguous, and in a decentralized but extensible system with open world assumptions, we are eventually going to run into conflicts. The choice to map our vocabulary to URIs is actually a very reasonable way to reduce ambiguity. Unfortunately, the choice to map them to namespaces and to live URIs (a-la http(s): URIs) is a mistake that will eventually bite us (and doubly so for JSON-LD contexts).

Problems appear

The first problem with choosing to put our terminology URIs at HTTP(S) URIs is that it assumes that those vocabularies will remain alive. Perhaps popular ones shall, but really, the modern web rots all the time. Soon enough, many ontologies will be replaced by Viagra ads.

The problem is dramatically worse for json-ld contexts (and similar documents such as XML DTDs): these are the very documents by which we map terms to their fully defined meanings. Servers get hammered by people looking up contextual mappings. This is no good already. It gets even worse when such documents add (or otherwise amend) their terminology mappings; old documents may suddenly mean different things!

(I'd be remiss to not note here that vocabulary namespaces and json-ld contexts are frequently the same URIs and yet frequently not the same thing. Still, they share a lot of the same problems and solutions in terms of liveness.)

Furthermore, both the choice to put terms in namespaces and the choice to have common contextual URIs that can change create governance problems.

I know this from personal experience (and by that I mean many painful hours of my life wasted that I can never get back). Consider sensitive above. The Mastodon folks created their own namespace, as previously mentioned, but they didn't really want to. The good news was that the Social Web Community Group was given permission to both extend the ActivityStreams vocabulary and the official ActivityStreams context.

Despite the entire group agreeing that it made sense to make sensitive official in some way (which does not mean everyone agreed that it was a good term, just that it was in enough usage that we should make it more easily widely available), the SocialCG got tied up for months and months in meetings, unable to make progress on how to do so:

  • Should we add sensitive to the ActivityStreams namespace, or leave it in the old namespace but "officially sanction" it?
  • What is the migration path for software using the previous term URI?
  • How often should we do this? What is the governance process for incubating a new term? Should it happen in a separate namespace first and then get "pulled in" later?
  • What would happen if we didn't for terms like these, and the sites went down?
  • If we also update the json-ld context, what happens for documents that already had sensitive in them meaning either the old URI or a new one? This can have significant impact on normalization for signature verification.

The group met for months about all the topics above and came to no conclusions. Eventually we decided that no consensus could be reached, so instead no action was taken at all. What a disappointment.

In general, this seems to be common. Ironically, it leads to otherwise nice decentralized designs for vocabularies eventually ending up centralized in something like schema.org anyway.

Content addressed vocabularies (and contexts) are the answer

My friend Sandro Hawke offered a solution, which I initially rejected as terrible, decided upon further consideration was brilliant, and fully embraced. Then Sandro explained to me that I had totally misunderstood him, and that he meant something different. It turns out that I actually think my initial misunderstanding was the right answer.

Here's what I understood Sandro to say:

The name we choose for a term doesn't matter that much. What really matters is the paragraph or so of specification language that describes the term. If two implementations refer to the same specification text, they mean the same thing. So just use that as the description.

Once I (incorrectly) came to realize that this could mean naming via content addressing, I latched onto the idea. Of course! We had merely selected the wrong edge of Zooko's triangle. But we know how to fix that sort of thing.

Here's how it works. Let's remember the specification text for Note above: Represents a short written work typically less than a single paragraph in length. Let's hash that (along with a "recommendation" prefix indicating that a user might choose to bind this to the term Note, though this is just a recommendation):

$ echo "Note: Represents a short written work typically less than a single paragraph in length." | sha256sum
3e1de3b56d2dc1bee7313963462691f9a8f46b068557b75e0e0d14c0994eddc6

So if we were defining Note via content-addressing, we instead would have defined it as urn:sha256:3e1de3b56d2dc1bee7313963462691f9a8f46b068557b75e0e0d14c0994eddc6. This is unambiguous enough to avoid collisions with other uses of the word "Note". But note that it doesn't require any servers staying up. It also doesn't have any namespace governance quagmire, because there is no namespace. Updates can be handled the usual way, via errata (translations can be handled similarly), and standards organizations can still publish such things... but it is important that the original term remain content-addressed and immutable. (Hash migration is left as an exercise for the user, with a hint that the solution is similar to that with errata.)
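
To make the scheme concrete, here is a minimal sketch of deriving such a term URN (the term_urn helper is purely illustrative, not any official tooling; it just reproduces the shell pipeline above, trailing newline included):

  import hashlib

  def term_urn(name, definition):
      # Derive a content-addressed URN for a vocabulary term.
      # Mirrors the `echo ... | sha256sum` example above: the hashed
      # text is "Name: definition" plus the trailing newline echo adds.
      text = "%s: %s\n" % (name, definition)
      return "urn:sha256:" + hashlib.sha256(text.encode("utf-8")).hexdigest()

  note_urn = term_urn(
      "Note",
      "Represents a short written work typically less than a single "
      "paragraph in length.")
  print(note_urn)
  # => urn:sha256:3e1de3b56d2dc1bee7313963462691f9a8f46b068557b75e0e0d14c0994eddc6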

Anyway, our post might end up looking in the end like this instead:

  {"@context": {"Note": "urn:sha256:3e1de3b56d2dc1bee7313963462691f9a8f46b068557b75e0e0d14c0994eddc6",
                "content": "urn:sha256:57dc44a1cdcbb7aa976a65a858b4d349ad6110d58d9d546650ce2b0e2b1048e4",
                "sensitive": "urn:sha256:81d98cf83fcf733400ad5d2a25495feeea47f287193a53a9722f4cb025da88f1"},
   "@type": "Note",
   "content": "Would you read me a bedtime story about the great ontology wars?",
   "sensitive": true}

I'll note very briefly that content-addressing is also the answer for JSON-LD contexts. If something like Datashards or IPFS were used to host json-ld contexts, each post could link to the exact immutable content-addressed context it was intended to be used with. Servers that use such contexts can "pin" them to keep them available, avoiding a single point of failure (or bandwidth bottleneck).

  {"@context": "idsc:p0.JLnUcJN4R1KNvSXm9Ut3Tmg7WfXAKEOx47p01Pk_Htw.2_rCdtnEha1RpD_qyzxhFIjUvLj7crIbzpmzWei5xRk",
   "@type": "Note",
   "content": "Would you read me a bedtime story about the great ontology wars?",
   "sensitive": true}

As one other side-note, I'll also observe that even though the fully expanded version of the above message is:

  {"@type": "urn:sha256:3e1de3b56d2dc1bee7313963462691f9a8f46b068557b75e0e0d14c0994eddc6",
   "urn:sha256:57dc44a1cdcbb7aa976a65a858b4d349ad6110d58d9d546650ce2b0e2b1048e4": "Would you read me a bedtime story about the great ontology wars?",
   "urn:sha256:81d98cf83fcf733400ad5d2a25495feeea47f287193a53a9722f4cb025da88f1": true}

... we never needed to look at it that way because json-ld contexts (and systems like them) are actually petname systems.
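
To illustrate that petname point: the context is just a local layer of human-friendly names over globally unambiguous ones. A minimal hand-rolled sketch of the expansion (real JSON-LD processors do considerably more; this only handles flat documents):

  def expand(document, context):
      # Replace local petnames with their globally unambiguous URNs.
      expanded = {}
      for key, value in document.items():
          if key == "@type":
              expanded["@type"] = context.get(value, value)
          else:
              expanded[context.get(key, key)] = value
      return expanded

  context = {
      "Note": "urn:sha256:3e1de3b56d2dc1bee7313963462691f9a8f46b068557b75e0e0d14c0994eddc6",
      "content": "urn:sha256:57dc44a1cdcbb7aa976a65a858b4d349ad6110d58d9d546650ce2b0e2b1048e4",
      "sensitive": "urn:sha256:81d98cf83fcf733400ad5d2a25495feeea47f287193a53a9722f4cb025da88f1"}

  post = {"@type": "Note",
          "content": "Would you read me a bedtime story about the great ontology wars?",
          "sensitive": True}

  print(expand(post, context))  # yields the fully expanded form shown above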

Conclusions (and non-conclusions)

Let me clarify a claim I'm not making: we don't need to throw away the old terms for systems like ActivityStreams that are already well understood. However, going forward I do think that using content-addressing of new terms is a good idea. And in the long run, I think content-addressing of json-ld contexts and any documents like them is an absolute must (when they aren't inlined, anyway... but inlining is expensive).

If we adopted Content Addressed Vocabularies, working on vocabulary extensions to ActivityPub could be a different story. Imagine a git repository that communities can fork to work on new terms. We could have a drafts directory where people hammer out common extension terms, and when they're ready, we simply move them to the extensions directory. Since the names are merely hashes of the contents of that directory, statically generating a webpage that lists all current known and recommended extensions would be trivial (see the sketch below). Everything could be handled in issues and PRs, and even if terms aren't merged into the main repo, that's merely a matter of lower term discoverability rather than a hindrance to application itself.
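
A minimal sketch of that static generation step (the extensions/ layout and one-definition-per-file convention are hypothetical):

  import hashlib
  import os

  def list_extensions(directory):
      # Map each term file to its content-addressed URN, assuming one
      # file per term, named after the term, containing its definition.
      terms = {}
      for filename in sorted(os.listdir(directory)):
          with open(os.path.join(directory, filename), "rb") as f:
              digest = hashlib.sha256(f.read()).hexdigest()
          terms[filename] = "urn:sha256:" + digest
      return terms

  for name, urn in list_extensions("extensions").items():
      print("%s\t%s" % (name, urn))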

If we moved to content addressed vocabulary, we'd be freer from the perils of downtime and general web bitrot, freer from gatekeeping and governance challenges, but just as free (I'd argue even freer) to collaborate. Moving forward, I intend to take content addressed approaches to terms I define in my systems, and I encourage you to do the same.

Vats and Propagators: towards a global brain

By Christopher Lemmer Webber on Sun 16 February 2020

(This is a writeup for future exploration; I will be exploring a small amount of this soon as a side effect of some UI building I am doing, but not a full system. A full system will come later, maybe even years from now. Consider this a desiderata document. Also, a forewarning: this document was originally written for an ocap-oriented audience, and some terms are left unexpanded; for instance, "vat" really just means a one-turn-at-a-time single-threaded event loop that a bunch of actors live in.)

We have been living the last couple of decades with networks that are capable of communicating ideas. However, by and large it is left to the humans to reason about these ideas that are propagated. Most machines that operate on the network merely execute the will of humans that have carefully constructed them. Recently neural network based machine learning has gotten much better, but merely resembles intuition, not reasoning. (The human brain succeeds by combining both, and a successful system likely will too.) Could we ever achieve a network that itself reasons? And can it be secure enough not to tear itself apart?

Near-term background

In working towards building out a demonstration of petname systems in action in a social network, I ran into the issue of changes to a petname database automatically being reflected through the UI. This led me back down a rabbit hole of exploring reactive UI patterns, and also led me back to exploring that section, and the following propagator section, of SICP again. This also led me to rewatch one of my favorite talks: We Don't Really Know How to Compute! by Gerald Sussman.

At 24:54 Sussman sets up an example problem: specifically, an expert in electrical systems having a sense of how to handle and solve an electrical wiring diagram. (The kind of steps explained are not dissimilar to the kind of steps that programmers go through while reasoning about debugging a coding problem.) Sussman then launches into an exploration of propagators, and how they can solve the problem. Sussman's explanation is better than mine would be, so I'll leave you to watch the video to see how it's used to solve various problems.

Okay, a short explanation of propagators

Well, I guess I'll give a little introduction to propagators and why I think they're interesting.

Propagators have gone through some revisions since the SICP days; relevant reading are the Revised Report on the Propagator Model, The Art of the Propagator, and to really get into depth with the ideas, Propagation networks: a flexible and expressive substrate for computation (Radul's PhD thesis).

In summary, a propagator model has the following properties:

  • There are cells which accumulate information about a value. Note! This is a big change from previous propagator versions! In the modern version of a propagator model, a cell doesn't hold a value, it accrues information about a value which must be non-contradictory.
  • Such cell information may be complete (the number 42 is all there is to know), whereas some other information may be a range of possibilities (hm, could be anywhere between -5 to 45...). As more information is made available, we can "narrow down" what we know.
  • Cells are connected together with propagators.
  • Information is (usually) bidirectional. For example, with the slope formula of y = (m * x) + b, we don't need to just solve for y... we could solve for m, x, or b given the other information. Similarly, partial information can propagate.
  • Contradictions are not allowed. Attempting to introduce contradictory information into the network will throw an exception.
  • We can "play with" different ideas via a Truth Maintenance System. What do we believe? Changes in our beliefs can result in changes to the generated topology of the network.
  • Debugging is quite possible. One of the goals of propagator networks is that you should be able to investigate and determine blame for a result. Relationships are clear and well defined. As Sussman says (roughly paraphrased), "if an autonomous car drives off the side of the road, I could sue the car manufacturer, but I'd rather sue the car... I want to hold it accountable for its decision making". The ability to hold accountability and determine blame stands in contrast to squishier systems like neural nets, genetic programs, etc (which are still useful, but not as easy to interrogate).

There are a lot of things that can be built with propagators as the general case of constraint solving and reasoning; functional reactive UIs, type checkers, etc etc.
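
To make the properties above a bit more concrete, here is a toy sketch (in Python rather than the reference implementation's Scheme, and far simpler: these cells merge exact values only, with no intervals or truth maintenance system):

  class Contradiction(Exception):
      # Raised when new information conflicts with what a cell knows.
      pass

  NOTHING = object()  # sentinel for "no information yet"

  class Cell:
      def __init__(self):
          self.content = NOTHING
          self.propagators = []  # thunks to re-run when we learn something

      def add_content(self, value):
          if self.content is NOTHING:
              self.content = value
              for propagator in self.propagators:
                  propagator()
          elif self.content != value:
              raise Contradiction((self.content, value))
          # else: old news, nothing to do

  def adder(a, b, total):
      # Wire up a + b = total, propagating in every direction.
      def propagate():
          if a.content is not NOTHING and b.content is not NOTHING:
              total.add_content(a.content + b.content)
          if total.content is not NOTHING and a.content is not NOTHING:
              b.add_content(total.content - a.content)
          if total.content is not NOTHING and b.content is not NOTHING:
              a.add_content(total.content - b.content)
      for cell in (a, b, total):
          cell.propagators.append(propagate)

  # y = mx + b, "solved" backwards for b just by adding what we know:
  mx, b, y = Cell(), Cell(), Cell()
  adder(mx, b, y)
  y.add_content(10)
  mx.add_content(6)
  print(b.content)  # => 4, with no explicit solving step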

Bridging vats and propagators

The prototype implementations are written in Scheme. The good news is, this means we could implement propagators on top of something like Spritely Goblins.

However (and, granted, I haven't completed it) I think there is one thing that is inaccurately described in Radul's thesis and Sussman's explanations, but which actually is no problem at all if we apply the vat model of computation (as in E, Agoric, Goblins): how distributed can these cells and propagators be? Section 2.1 of Radul's thesis explains propagators as asynchronous and completely autonomous, as if cells and their propagators could live anywhere on the computer network with no change in effectiveness. I think this is only partially true. The reference implementation does not fully explore this because it uses a single-threaded event loop that processes events until there are no more to process, during which it may encounter a contradiction and raise it. However, I believe that the ability to "stop the presses", as it were, is one of the nicest features of propagators and should not be lost... if we introduced asynchronous events coming in, multiple events might arrive at the same time and try to make changes to the propagator network in parallel.

Thankfully a nice answer comes in the form of the vat model: it should be possible to have a propagator network within a single vat. Spritely Goblins' implementation of the vat model is transactional, so if we try to introduce a contradiction, we can roll back immediately. This is the right behavior. As it turns out, this is very close to the way the propagator system is implemented in the reference implementation... I think the reference implementation did something more or less right while trying to do the simplest thing. Combined with a proper ocap vat model this should work great.
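
A minimal sketch of that transactional idea (my own illustration, not how Goblins actually implements its transactionality):

  import copy

  class Contradiction(Exception):
      pass

  class Cell:
      # Trivial stand-in for a propagator cell: just holds content.
      def __init__(self):
          self.content = None

  def transactional_turn(cells, thunk):
      # Run `thunk` as a single vat turn; if it raises a Contradiction,
      # restore every cell to its pre-turn state, the way a
      # transactional vat would roll back the whole turn.
      snapshot = [copy.copy(cell.content) for cell in cells]
      try:
          thunk()
      except Contradiction:
          for cell, old in zip(cells, snapshot):
              cell.content = old
          raise

  # A turn that learns something, then contradicts itself:
  a = Cell()
  def bad_turn():
      a.content = 42
      raise Contradiction("42 conflicts with other information")

  try:
      transactional_turn([a], bad_turn)
  except Contradiction:
      pass
  print(a.content)  # => None: the half-applied turn was rolled back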

Thus, I believe that a propagator system (here I mean a propagator network, meaning a network of propagator-connected cells) should actually be vat-local. But wait, we talked about network (as in internet) based reasoning, and here I am advocating locality! What gives?

The right answer seems to me that propagator networks should be able to be hooked together, but a change to a vat-contained propagator system can trigger message passing to another vat-contained propagator system, which can even happen over a computer network such as the internet. We will have to treat propagator systems and changes to them as vat-local, but they can still communicate with other propagator systems. (This is a good idea anyway; if you communicate an idea with me and it's inconsistent with my worldview, it should be important for me to be able to realize that and use that as an opportunity to correct our misunderstandings between each other.)

However, cells are still objects with classic object references. This means it is possible to hold onto one and use it as either a local or networked capability. Attenuation also composes nicely; it should be possible to produce a facet of a cell that only allows read access or only allows adding information. It's clear and easily demonstrated that ocaps can be the right security model for the propagator model simply by observing that the propagator prototype system is written in Scheme, and so is Jonathan Rees' W7 security kernel.
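
A minimal sketch of such attenuation (illustrative only; in Goblins these facets would themselves be actors):

  class Cell:
      # Toy cell; see the propagator sketch earlier for a fuller one.
      def __init__(self):
          self.content = None
      def add_content(self, value):
          self.content = value  # a real cell merges & checks contradictions

  def read_only_facet(cell):
      # Attenuated capability: can read the cell, but never write it.
      return lambda: cell.content

  def add_only_facet(cell):
      # Attenuated capability: can contribute information, never read it.
      return cell.add_content

  cell = Cell()
  reader = read_only_facet(cell)
  writer = add_only_facet(cell)
  writer(42)
  print(reader())  # => 42; neither facet grants the full cell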

This is all to say, if we built the propagator model on top of an ocap-powered vat model, we'd already have a good network communication model, a good security model, and a transactional model. Sounds great to me.

Best of all, a propagator system can live alongside normal actors. We don't have to choose one or the other... a multi-paradigm approach can work great.

Speaking the same language

One of the most important things in a system that communicates is that ideas should be able to be expressed and considered in such a way that both parties understand. Of course, humans do this, and we call it "language".

Certain primitives exist in our system already; for optimization reasons, we are unlikely to want to build numbers out of mere tallying (such as in Peano arithmetic); we instead build in primitives for integers and a means of combination for them. So we will of course want to have several primitive data types.

But at some point we will want to talk about concepts that are not encoded in the system. If I would like to tell you about a beautiful red bird I saw, where would I even begin? Well obviously at minimum, we will have to have ways of communicating ideas such as "red" and "bird". We will have to build a vocabulary together.

Natural language vocabulary has a way of becoming ambiguous fast. A "note" passed in class versus a "note" in a musical score versus that I would like to "note" a topic of interest to you are all different things.

Linked data (formerly "semantic web") folks have tried to use full URIs as a way to get around this problem. For instance, two ActivityPub servers which are communicating are very likely speaking about the same thing if they both use "https://www.w3.org/ns/activitystreams#Note", which is to say they are talking about some written note-like message (probably a (micro)blog post). This is not a guarantee; vocabulary drift is still possible, but it is much less likely.

Unfortunately, http(s) based URIs are a poor choice for hosting vocabulary. Domains expire, websites go down, and choosing whether to extend a vocabulary in some namespace is (in the author's experience) a governance nightmare. A better option is "content-addressed vocabulary"; instead of "https://www.w3.org/ns/activitystreams#Note" we could instead simply take the text from the standard:

"Represents a short written work typically less than a single paragraph in length."

Hash that and you get "urn:sha256:54c14cbd844dc9ae3fa5f5f7b8c1255ee32f55b8afaba88ce983a489155ac398". No governance or liveness issues required. (Hashing mechanism upgrades, however, do pose some challenge; mapping old hashes to new ones for equivalence can be a partial solution.)

This seems sufficient to me; groups can collaborate somewhere to hammer out the definition of some term, simply hash the definition of it, and use that as the terminology URI. This also avoids hazards from choosing a different edge of Zooko's Triangle for vocabulary.

Now that we have this, we can express advanced new ideas across the network and experiment with new terms. Better yet, we might even be able to use our propagator networks to associate ideas with them. I think in many systems, content-addressed vocabulary could be a good way to describe beliefs that could be considered, accepted, or rejected in truth maintenance systems.

Cerealize me, cap'n!

One observation from Agoric is that it is possible to treat systems that do not resemble traditional live actor'y vats as vats (and "machines") nonetheless, and to develop semantics for message passing between them (and performing promise resolution), for instance with blockchains.

Similarly, above we have observed that propagator systems can be built on top of actors; I believe it is also possible to describe propagator networks in terms of pure data. It should be possible to describe changes to a propagator network as a standard serialized ledger that can be transferred from place to place or reproduced.

However, the fact that interoperability with actors is possible is good, desirable, and thankfully a nice transitional place for experimentation (porting propagator model semantics to Spritely Goblins should not be hard).

Where to from here?

That's a lot of ideas above, but how likely is any of this stuff to be usable soon? I'm not anticipating dropping any current work to try to make this happen, but I probably will be experimenting in my upcoming UI work with having the UI powered by a propagator system (possibly even a stripped down version) so that the experimental seeds are in place to see if such a system can be grown. I am not anticipating that we'll see anything like a fully distributed propagator system doing something interesting on my own network soon... but sometimes I end up surprised.

Closing the loop

I mentioned before that human brains are a combination of faster intuitive methods (resembling current work on neural nets) and slower, more calculating reasoning systems (resembling propagators or some logic programming languages). That's also to say nothing about the giant emotional soup that a mind/body tends to live in.

Realistically the emergence of a fully sapient system won't involve any of these systems independently, but rather a networked interconnection of many of them. I think the vat model of execution is a nice glue system for it; pulling propagators into the system could bring us one step closer, maybe.

Or maybe it's all just fantastical dreaming! Who knows. But it could be interesting to play and find out at some point... perhaps some day we can indeed get a proper brain into a vat.

State of Spritely for February 2020

By Christopher Lemmer Webber on Mon 10 February 2020

We are now approximately 50% of the way through the Samsung Stack Zero grant for Spritely, and only a few months more since I announced the Spritely project at all. I thought this would be a good opportunity to review what has happened so far and what's on the way.

In my view, quite a lot has happened over the course of the last year:

  • Datashards grew out of two Spritely projects, Magenc and Crystal. This provides the "secure storage layer" for the system, and by moving into Datashards has even become its own project (now mostly under the maintainership of Serge Wroclawski, who as it turns out is also co-host with me of Libre Lounge). There's external interest in this from the rest of the federated social web, and it was a topic of discussion in the last meeting of the SocialCG. While not as publicly visible recently, the project is indeed active; I am currently helping advise and assist Serge with some of the ongoing work on optimizations for smaller files, fixing the manifest format to permit larger files, and a more robust HTTP API for stores/registries. (Thank you Serge also for taking on a large portion of this work and responsibility!)

  • Spritely Goblins, the actor model layer of Spritely, continues its development. We are now up to release v0.5. I don't consider the API to be stable, but it is stabilizing. In particular, the object/update model, the synchronous communication layer, and the transactional update support are all very close to stable. Asynchronous programming mostly works but has a few bugs I need to work out, and the distributed programming environment design is coming together enough where I expect to be able to demo it soon.

  • In addition, I have finally started to write docs for Spritely Goblins. I think the tutorial above is fairly nice; I've had a good amount of review from various parties, and those who have tried it seem to agree. (Please be advised that it requires working with the dev branch of Goblins at the time of writing.) v0.6 should be the first release to have documentation after the major overhaul I did last summer (effectively an entire rewrite of the system, including many changes to the design after doing research into ocap practices). I cannot recommend that anyone else write production-level code using the system yet, but I hope that by the summer things will have congealed enough that this will change.

  • I have made a couple of publicly visible demos of Goblins' design. Weirdly enough all of these have involved ascii art.

    • The proto-version was the Let's Just Be Weird Together demo. Actually it's a bit strange to say this because the LJBWT demo didn't use Goblins, it used a library called DOS/HURD. However, writing this library (and adapting it from DOS/Win) directly informed the rewrite of Goblins, Goblinoid which eventually became Goblins itself, replacing all the old code. This is why I advocate demo-driven-development: the right design of an architecture flows out of a demo of it. (Oh yeah, and uh, it also allowed me to make a present for my 10th wedding anniversary, too.)

    • Continuing in a similar vein, I made the "Season's Greetings" postcard, which Software Freedom Conservancy actually used in their funding campaign this year. This snowy scene used the new rewrite of Goblins and allowed me to try to push the new "become" feature of Goblins to its limit (the third principle of actor model semantics, taken very literally). It wasn't really obvious to anyone else that this was using Goblins in any interesting way, but I'll say that writing this really allowed me to congeal many things about the update layer, and it also led to uncovering a performance problem, leading to a 10x speedup. Having written this demo, I was starting to get the hang of things in the Goblins synchronous layer.

    • Finally there was the Terminal Phase demo. (See the prototype announcement blogpost and the 1.0 announcement.) This was originally designed as a reward for donors for hitting $500/mo on my Patreon account (you can still show up in the credits by donating!), though once 1.0 made it out the door it seems like it raised considerable excitement on the r/linux subreddit and on Hacker News, which was nice to see. Terminal Phase helped me finish testing and gaining confidence in the transactional object-update and synchronous call semantics of Spritely Goblins, and I now have no doubt that this layer has a good design. But I think Terminal Phase was the first time that other people could see why Spritely Goblins was exciting, especially once I showed off the time travel debugging in Terminal Phase demo. That last post led people to finally start pinging me asking "when can I use Spritely Goblins?" That's good... I'm glad it's obvious now that Goblins is doing something interesting (though the most interesting things are yet to be demo'ed).

  • I participated in, keynoted, and drummed up enthusiasm for ActivityPub Conference 2019. (I didn't organize though, that was Morgan Lemmer-Webber's doing, alongside Sebastian Lasse and with DeeAnn Little organizing the video recording.) We had a great speaker list and even got Mark S. Miller to keynote. Videos of the event are also available. While that event was obviously much bigger than Spritely, the engagement of the ActivityPub community is obviously important for its success.

  • Relatedly, I continue to co-chair the SocialCG but Nightpool has joined as co-chair which should relieve some pressure there, as I was a bit too overloaded to be able to handle this all on my own. The addition of the SocialHub community forum has also allowed the ActivityPub community to be able to coordinate in a way that does not rely on me being a blocker. Again, not Spritely related directly, but the health of the ActivityPub community is important to Spritely's success.

  • At Rebooting Web of Trust I coordinated with a number of contributors (including Mark Miller) on sketching out plans for secure UI designs. Sadly the paper is incomplete but has given me the framework for understanding the necessary UI components for when we get to the social network layer of Spritely.

  • Further along the lines of sketching out the desiderata of federated social networks, I have written a nearly-complete OcapPub: towards networks of consent. However, there are still some details to be figured out; I have been hammering them out on the cap-talk mailing list (see this post laying out a very ocappub-like design with some known problems, and then this analysis). The ocap community has thankfully been very willing to participate in working with me to hammer out the right security foundations, and I think we're close to the right design details. Of course, the proof of the pudding is in the demo, which has yet to be written.

Okay, so I hope I've convinced you that a lot has happened, and hopefully you feel that I am using my time reasonably well. But there is much, much, much ahead for Spritely to succeed in its goals. So, what's next?

  • I need to finish cleaning up the Goblins documentation and do a v0.6 release with it included. At that point I can start recommending some brave souls to use it for some simple applications.

  • A demo of Spritely Goblins working in a primarily asynchronous environment. This might simply be a port of mudsync as a first step. (Recorded demo of mudsync from a few years ago.) I'm not actually sure. The goal of this isn't to be the "right" social network design (not full OcapPub), just to test the async behaviors of Spritely Goblins. Like the synchronous demos that have already been done, the purpose of this is to congeal and ensure the quality of the async primitives. I expect this and the previous bullet point to be done within the next couple of months, so hopefully by the end of April.

  • Distributed networked programming in Goblins, and associated demo. May expand on the previous demo. Probably will come out about two months later, so end of June.

  • Prototype of the secure UI concepts from the aforementioned secure UIs paper. I expect/hope this to be usable by the end of the third quarter of 2020.

  • Somewhere in-between all this, I'd like to add a demo of being able to securely run untrusted code from third parties, maybe in the MUD demo. Not sure when yet.

  • All along, I continue to expect to push out new updates to Terminal Phase with more fun enemies and powerups to continue to reward donors to the Patreon campaign.

This will probably take most of this year. What you will notice is that this does not explicitly state a tie-in with the ActivityPub network. This is intentional, because the main goal of all the above demos is to prove more foundational concepts before they are all fully integrated. I think we'll see the full integration, and it coming together with the existing fediverse, beginning in early 2021.

Anyway, that's a lot of stuff ahead. I haven't even mentioned my involvement in Libre Lounge, which I've been on hiatus from due to a health issue that has made recording difficult, and from being busy trying to deliver on these foundations, but I expect to be coming back to LL shortly.

I hope I have instilled some confidence that I am moving steadily along the abstract Spritely roadmap. (Gosh, I ought to finally put together a website for Spritely, huh?) Things are happening, and interesting ones I think.

But how do you think things are going? Maybe you would like to leave me feedback. If so, feel free to reach out.

Until next time...