Departing Libre Lounge

By Christopher Lemmer Webber on Wed 13 May 2020

Over the last year and a half I've had a good time presenting on Libre Lounge with my co-host Serge Wroclawski. I'm very proud of the topics we've decided to cover, of which there are quite a few good ones in the archive, and the audience the show has had is just the best.

However, I've decided to depart the show... Serge and I continue to be friends (and are still working on a number of projects together, such as Datashards and the recently announced grant), but in terms of the podcast I think we'd like to take things in different creative directions.

This is probably not the end of me doing podcasting, but if I start something up again it'll be a bit different in its structure... and you can be sure you'll hear about it here and on my fediverse account and over at the birdsite.

In the meanwhile, I look forward to continuing to tune in to Libre Lounge, but as a listener.

Thanks for all the support, Libre Loungers!

Spritely's NLNet grant: Interface Discovery for Distributed Systems

By Christopher Lemmer Webber on Wed 13 May 2020

I've been putting off making this blogpost for a while because I kept thinking, "I should wait to do it until I finish making some sort of website for Spritely and make a blogpost there!" Which, in a sense is a completely reasonable thought because right now Spritely's only "website" is a loose collection of repositories, but I'd like something that provides a greater narrative for what Spritely is trying to accomplish. But that also kind of feels like a distraction (or maybe I should just make a very minimal website) when there's something important to announce... so I'm just doing it here (where I've been making all the other Spritely posts so far anyway).

Spritely is an NLnet (in conjunction with the European Commission / Next Generation Internet initiative) grant recipient! Specifically, we have received a grant for "Interface Discovery for Distributed Systems"! I'll be implementing the work alongside Serge Wroclawski.

There are two interesting sub-phrases there: "Interface Discovery" and "Distributed Systems". Regarding "distributed systems", we should really say "mutually suspicious open-world distributed systems". Those extra words change some of the requirements; we have to assume we'll be told about things we don't understand, and we have to assume that many objects we interact with may be opaque to us... they might lie about what kind of thing they are.

Choosing how to name interfaces then directly ties into something I wrote about here more recently, namely content addressed vocabulary.

I wrote up more ideas and details about the interfaces work in an email to cap-talk, so you can read more there if you like... but I think more detail than that can wait until we publish a report about it (and publishing a report is baked into the grant).

The other interesting bit though is the "distributed" aspect; in order to handle distributed computation and object interaction, we need to correctly design our protocols. Thankfully there is a lot of good prior art to work from, usually some variant of "CapTP" (Capability Transport Protocol), as implemented in its original form by E, taking on a bit of a different form in the Waterken project, adapted in Cap'n Proto, as well as with the new work happening over at Agoric. Each of these variants of the core CapTP ideas has tried to tackle some different use cases, and Goblins has its own needs to be covered. Is there a possibility of convergence? Possibly... I am trying to understand the work of and communicate with the folks over at Agoric but I think it's a bit too early to be conclusive about anything. Regardless, it'll be a major milestone once Spritely Goblins is able to actually live up to its promise of distributed computation, and work on this is basically the next step to proceed on.

When I first announced Spritely about a year and a half ago I included a section that said "Who's going to pay for all this?" to which I then said, "I don't really have a funding plan, so I guess this is kind of a non-answer. However, I do have a Patreon account you could donate to." To be honest, I was fairly nervous about it... so I want to express my sincere and direct appreciation to NLnet alongside the European Commission / Next Generation Internet Initiative, along with Samsung Stack Zero, and all the folks donating on Patreon and Liberapay. With all the above, and especially the new grant from NLnet, I should have enough funding to continue working on Spritely through a large portion of 2021. I am determined to make good on the support I've received, and am looking forward to putting out more interesting demonstrations of this technology over the next few months.

What should fit in a FOSS license?

By Christopher Lemmer Webber on Mon 09 March 2020

Originally sent in an email to the OSI license-discuss mailing list.

What terms belong in a free and open source software license? There has been a lot of debate about this lately, especially as many of us are interested in expanding the role we see that we play in terms of user freedom issues. I am amongst those people that believe that FOSS is a movement whose importance is best understood not on its own, but on the effects that it (or the lack of it) has on society. A couple of years ago, a friend and I recorded an episode about viewing software freedom within the realm of human rights; I still believe that, and strongly.

I also believe there are other critical issues that FOSS has a role to play in: diversity issues (both within our own movement and empowering people in their everyday lives) are one, environmental issues (the intersection of our movement with the right-to-repair movement is a good example) are another. I also agree that the trend towards "cloud computing" companies which can more or less entrap users in their services is a major concern, as are privacy concerns.

Given all the above, what should we do? What kinds of terms belong in FOSS licenses, especially given all our goals above?

First, I would like to say that I think that many people in the FOSS world, for good reason, spend a lot of time thinking about licenses. This is good, and impressive; few other communities have as much legal literacy distributed even amongst their non-lawyer population as ours. And there's no doubt that FOSS licenses play a critical role... let's acknowledge from the outset that a conventionally proprietary license has a damning effect on the agency of users.

However, I also believe that user freedom can only be achieved via a multi-layered approach. We cannot provide privacy by merely adding privacy-requirements terms to a license, for instance; encryption is key to our success. I am also a supporter of codes of conduct and believe they are important/effective (I know not everyone does; I don't care for this to be a CoC debate, thanks), but I believe that they've also been very effective and successful checked in as CODE-OF-CONDUCT.txt alongside the traditional COPYING.txt/LICENSE.txt. This is a good example of a multi-layered approach working, in my view.

So acknowledging that, which problems should we try to solve at which layers? Or, more importantly, which problems should we try to solve in FOSS licenses?

Here is my answer: the role of FOSS licenses is to undo the damage that copyright, patents, and related intellectual-restriction laws have done when applied to software. That is what should be in the scope of our licenses. There are other problems we need to solve too if we truly care about user freedom and human rights, but for those we will need to take a multi-layered approach.

To understand why this is, let's rewind time. What is the "original sin" that led to the rise of proprietary software, and thus the need to distinguish FOSS as a separate concept and entity? In my view, it's the decision to make software copyrightable... and then, adding similar "state-enforced intellectual restrictions" categories, such as patents or anti-jailbreaking or anti-reverse-engineering laws.

It has been traditional FOSS philosophy to emphasize these as entirely different systems, though I think Van Lindberg put it well:

Even from these brief descriptions, it should be obvious that the term "intellectual property" encompasses a number of divergent and even contradictory bodies of law. [...] intellectual property isn't really analogous to just one program. Rather, it is more like four (or more) programs all possibly acting concurrently on the same source materials. The various IP "programs" all work differently and lead to different conclusions. It is more accurate, in fact, to speak of "copyright law" or "patent law" rather than a single overarching "IP law." It is only slightly tongue in cheek to say that there is an intellectual property "office suite" running on the "operating system" of US law. -- Van Lindberg, Intellectual Property and Open Source (p.5)

So then, as unfortunate as the term "intellectual property" may be, we do have a suite of state-enforced intellectual restriction tools. They now apply to software... but as a thought experiment, if we could rewind time and choose between a timeline where such laws did not apply to software vs a time where they did, which would have a better effect on user freedom? Which one would most advance FOSS goals?

To ask the question is to know the answer. But of course, we cannot reverse time, so the purpose of this thought experiment is to indicate the role of FOSS licenses: to use our own powers granted under the scope of those licenses to undo their damage.

Perhaps you'll already agree with this, but you might say, "Well, but we have all these other problems we need to solve too though... since software is so important in our society today, trying to solve these other problems inside of our licenses, even if they aren't about reversing the power of the intellectual-restriction-office-suite, may be effective!"

The first objection to that would be, "well, but it does appear that it makes us addicted in a way to that very suite of laws we are trying to undo the damage of." But maybe you could shrug that off... these issues are too important! And I agree the issues are important, but again, I am arguing a multi-layered approach.

To better illustrate, let me propose a license. I actually considered drafting this into real license text and trying to push it all the way through the license-review process. I thought that doing so would be an interesting exercise for everyone. Maybe I still should. But for now, let me give you the scope of the idea. Ready?

"The Disposable Plastic Prevention Public License". This is a real issue I care about, a lot! I am very afraid that there is a dramatic chance that life on earth will be choked out within the next number of decades by just how much non-degradable disposable plastic we are churning out. Thus it seems entirely appropriate to put it in a license, correct? Here are some ideas for terms:

  • You cannot use this license if you are responsible for a significant production of disposable plastics.

  • You must make a commitment to reduction in your use of disposable plastics. This includes a commitment to reductions set out by (a UN committee? Haven't checked, I bet someone has done the research and set target goals).

  • If you, or a partner organization, are found to be lobbying against laws to eliminate disposable plastics, your grant of this license is terminated.

What do you think? Should I submit it to license-review? Maybe I should. Or, if someone else wants to submit it, I'll enthusiastically help you draft the text... I do think the discussion would be illuminating!

Personally though, I'll admit that something seems wrong about this, and it isn't the issue... the issue is one I actually care about a lot, one that keeps me up at night. Does it belong in a license? I don't think that it does. This both tries to fix problems via the same structures whose damage we are trying to undo and introduces license compatibility headaches. It's trying to fight an important issue on the wrong layer.

It is a FOSS issue though, in an intersectional sense! And there are major things we can do about it. We can support the fight of the right-to-repair movements (which, as it turns out, is a movement also hampered by these intellectual restriction laws). We can try to design our software in such a way that it can run on older hardware and keep it useful. We can support projects like the MNT Reform, which aims to build a completely user-repairable laptop, and thus push back against planned obsolescence. There are things we can, and must, do that are not in the license itself.

I am not saying that the only kind of thing that can happen in a FOSS license is to simply waive all rights. Indeed I see copyleft as a valid way to turn the weapons of the system against itself in many cases (and there are a lot of cases, especially when I am trying to push standards and concepts, where I believe a more lax/permissive approach is better). Of course, it is possible to get addicted to those things too: if we could go back in our time machine and prevent these intellectual restriction laws from taking place, source requirements in copyleft licenses wouldn't be enforceable. While I see source requirements as a valid way to turn the teeth of the system against itself, in that hypothetical future, would I be so addicted to them that I'd prefer that software copyright continue just so I could keep them? No, that seems silly. But we also aren't in that universe, and are unlikely to enter that universe anytime soon, so I think this is an acceptable reversal of the mechanisms of the destructive state-run intellectual restriction machine against itself for now. But it also indicates maybe a kind of maximum.

But it's easy to get fixated on those kinds of things. How clever can we be in our licenses? And I'd argue: minimally clever. Because we have a lot of other fights to make.

In my view, I see a lot of needs in this world, and the FOSS world has a lot of work to do... and not just in licensing, on many layers. Encryption for privacy, diversity initiatives like Outreachy, codes of conduct, software that runs over peer to peer networks rather than in the traditional client-server model, repairable and maintainable hardware, thought in terms of the environmental impact of our work... all of these things are critical things in my view.

But FOSS licenses need not, and should not, try to take on all of them. FOSS licenses should do the thing they are appropriate to do: to pave a path for collaboration and to undo the damage of the "intellectual restriction office suite". As for the other things, we must do them too... our work will not be done, meaningful, or sufficient if we do not take them on. But we should do them hand-in-hand, as a multi-layered approach.

Terminal Phase v1.1 and Spritely Goblins v0.6 releases!

By Christopher Lemmer Webber on Thu 05 March 2020

Hello all! I just did a brand new release of both Terminal Phase (v1.1) and Spritely Goblins (v0.6)!

So some highlights from each.

Terminal Phase

Okay, this is flashier, even if less important than Goblins. But the main thing is that I added the time travel debugging feature, which is so flashy I feel the need to show that gif again here:

[Image: Time travel in Spritely Goblins shown through Terminal Phase]

Aside from time travel, there aren't many new features, though I plan on adding some in the next week (probably powerups or a boss fight), so another release should be not far away.

And oh yeah, since it's a new release, now is a good time to thank the current supporters:

[Image: Terminal Phase Credits]

But yeah, the main thing that was done here is that Terminal Phase was updated for the new release of Goblins, so let's talk about that!

Goblins

For those who aren't aware, Spritely Goblins is a transactional actor model library for Racket.

v0.6 has resulted in a number of changes in semantics.

But the big deal is that Goblins finally has decent documentation, including a fairly in-depth tutorial and documentation about the API. I've even documented how you, in your own programs, can play with Goblins' time travel features.

So, does this mean you should start using it? Well, it's still in alpha, and the most exciting feature (networked, distributed programming) is still on its way. But I think it's quite nice to use already (and I'm using it for Terminal Phase).

Anyway, that's about it... I plan on having a new video explaining more about how Goblins works out in the next few days, so I'll announce that when it happens.

If you are finding this work interesting, a reminder that this work is powered by people like you.

In the meanwhile, hope you enjoy the new releases!

Content Addressed Vocabulary

By Christopher Lemmer Webber on Wed 26 February 2020

How can systems communicate and share meaning? Communication within systems is preceded by a form of meta-communication; we must have a sense that we mean the same things by the terms we use before we can even use them.

This is challenging enough for humans who must share meaning, but we can resolve ambiguities with context clues from a surrounding narrative. Machines, in general, need a context more explicitly laid out for them, with as little ambiguity as possible.

Standards authors of open-world systems have long struggled with this problem and have come up with some reasonable approaches; unfortunately these also suffer from several pitfalls. With minimal (or sometimes no) adjustment to our tooling, I propose a change in how we manage ontologies.

How we deal with ambiguous terms today

Consider Note, a seemingly simple term in ActivityStreams, the vocabulary used by ActivityPub. The meaning of Note, as described by the ActivityStreams vocabulary, seems simple enough: Represents a short written work typically less than a single paragraph in length.

Here is how an ActivityStreams usage of Note might look (a bit simplified from what it would probably look like in practice):

  {"@context": "https://www.w3.org/ns/activitystreams",
   "@type": "Note",
   "content": "Would you read me a bedtime story about the great ontology wars?"}

What's that @context thing? This is some JSON-LD thing, which tries to be "more exact" about what Note we must be talking about. It does so by mapping Note to https://www.w3.org/ns/activitystreams#Note by something like the following:

  {"as": "https://www.w3.org/ns/activitystreams#",
   "Note": "as:Note",
   "content": "as:content",
   ...}
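To make that mapping concrete, here is a minimal sketch in Python of how a short term can be resolved to a full URI. It is not a JSON-LD processor: the expand_term helper is hypothetical and handles only the prefix-and-alias case shown above, which is an assumption made to keep the example small.

```python
# Hypothetical, simplified context: only prefix definitions and term aliases.
# Real JSON-LD processing has many more rules than this.
CONTEXT = {
    "as": "https://www.w3.org/ns/activitystreams#",
    "Note": "as:Note",
    "content": "as:content",
}


def expand_term(term, context):
    """Resolve a short term to a full URI using prefix:suffix lookups."""
    value = context.get(term, term)
    prefix, sep, suffix = value.partition(":")
    # Only expand when the part before the colon is itself a known prefix.
    if sep and prefix in context and prefix != term:
        return context[prefix] + suffix
    return value


print(expand_term("Note", CONTEXT))
print(expand_term("content", CONTEXT))
```

Terms that the context does not know about are returned unchanged, which is a choice of this sketch rather than a rule taken from any spec.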

The choice to use JSON-LD has been semi-controversial in ActivityPub land; historically there was some debate about whether or not we needed to be "more exact" at all as to what terms mean. This post really isn't about JSON-LD as much as it is the more general topic of vocabularies and vocabulary mapping systems. There are other concerns people raise about JSON-LD, usually around the tooling... that's not the scope of this post. This blogpost could as easily apply to XML or Turtle or whatever; the protocol I've worked on just happens to use JSON-LD to do that, so I've used it as my illustration.

That said, the ActivityPub spec tries to make things as simple as possible for the default case of ActivityPub usage by saying that the ActivityStreams context is implied. So if you're not doing anything complicated, then:

  {"@type": "Note",
   "content": "Would you read me a bedtime story about the great ontology wars?"}

... is really the same as the first example.

So okay, probably everyone can guess what Note means, but what about sensitive? What the heck is that? It doesn't appear in the ActivityStreams vocabulary; it kind of implies something along the lines of content-warning type behavior, like "this content may be considered sensitive" by some users, but how would you guess that just by the term? This is an extension, and it lives at http://joinmastodon.org/ns#sensitive.

So maybe if we were going to use it (and if we inline our context) it might look like:

  {"@context": {"as": "https://www.w3.org/ns/activitystreams#",
                "toot": "http://joinmastodon.org/ns#",
                "Note": "as:Note",
                "content": "as:content",
                "sensitive": "toot:sensitive"},
   "@type": "Note",
   "content": "Would you read me a bedtime story about the great ontology wars?",
   "sensitive": true}

(I mean, the Great Ontology Wars are a sensitive topic for some.)

The choice of JSON-LD in ActivityPub is controversial for various reasons. But it turns out what isn't really controversial anymore is whether we need some way of being more exact about the way we speak about terms... those who used to complain about that mostly now agree (disagreements then surround what tooling needs to be used to do so (not in scope of this post), and namespace governance (in scope of this post)).

Maybe you feel like, having heard what sensitive and Note mean, these are the obvious definitions. But consider that Note itself could have meant something very different. Are we talking about a short mostly-textual post (probably on a microblog), as ActivityStreams does? Are we talking about a musical note? Are we instructing someone to take note of something, as an action (or yes, activity)?

So terms really are ambiguous, and in a decentralized but extensible system with open world assumptions, we are eventually going to run into conflicts. The choice to map our vocabulary to URIs is actually a very reasonable way to reduce ambiguity. Unfortunately, the choice to map them to namespaces and to live URIs (a-la http(s): URIs), is a mistake that will eventually bite us (and doubly so for JSON-LD contexts).

Problems appear

The first problem with choosing to put our terminology URIs at HTTP(S) URIs is that it assumes that those vocabularies will remain alive. Perhaps popular ones shall, but really the modern web rots all the time. Soon enough, many ontologies will be replaced by Viagra ads.

The problem is dramatically worse for json-ld contexts (and similar documents such as XML DTDs): these are the very documents by which we map terms to their fully defined meanings. Servers get hammered by people looking up contextual mappings. This is no good already. It gets even worse when such documents add (or otherwise amend) their terminology mappings; old documents may suddenly mean different things!

(I'd be remiss to not note here that vocabulary namespaces and json-ld contexts are frequently the same URIs and yet frequently not the same thing. Still, they share a lot of the same problems and solutions in terms of liveness.)

Furthermore, both the choice to put terms in namespaces and the choice to have common contextual URIs that can change create governance problems.

I know this from personal experience (and by that I mean many painful hours of my life wasted that I can never get back). Consider sensitive above. The Mastodon folks created their own namespace, as previously mentioned, but they didn't really want to. The good news was that the Social Web Community Group was given permission to both extend the ActivityStreams vocabulary and the official ActivityStreams context.

Despite the entire group agreeing that it made sense to make sensitive official in some way (which does not mean everyone agreed that it was a good term, just that it was in enough usage that we should make it more easily widely available), the SocialCG got tied up for months and months in meetings being unable to make progress about how to do so:

  • Should we add sensitive to the ActivityStreams namespace, or leave it in the old namespace but "officially sanction" it?
  • What is the migration path for software using the previous term URI?
  • How often should we do this? What is the governance process for incubating a new term? Should it happen in a separate namespace first and then get "pulled in" later?
  • What would happen if we didn't for terms like these, and the sites went down?
  • If we also update the json-ld context, what happens for documents that already had sensitive in them meaning either the old URI or a new one? This can have significant impact on normalization for signature verification.

The group met for months about all the topics above and came to no conclusions. Eventually we decided that no consensus could be reached, so instead no action was taken at all. What a disappointment.

In general, this seems to be common. Ironically, it leads to otherwise nice decentralized designs for vocabularies eventually ending up centralized in something like schema.org anyway.

Content addressed vocabularies (and contexts) are the answer

My friend Sandro Hawke offered a solution, which I initially rejected as terrible, decided upon further consideration was brilliant, and fully embraced. Then Sandro explained to me that I had totally misunderstood him, and that he meant something different. It turns out that I actually think my initial misunderstanding was the right answer.

Here's what I understood Sandro to say:

The name we choose for a term doesn't matter that much. What really matters is the paragraph or so of specification language that describes the term. If two implementations refer to the same specification text, they mean the same thing. So just use that as the description.

Once I (incorrectly) came to realize that this could mean naming via content addressing, I latched onto the idea. Of course! We had merely selected the wrong edge of Zooko's triangle. But we know how to fix that sort of thing.

Here's how it works. Let's remember the specification text for Note above: Represents a short written work typically less than a single paragraph in length. Let's hash that (along with a "recommendation" prefix that a user might choose to bind this to the term Note, though this is just a recommendation):

$ echo "Note: Represents a short written work typically less than a single paragraph in length." | sha256sum
3e1de3b56d2dc1bee7313963462691f9a8f46b068557b75e0e0d14c0994eddc6

So if we were defining Note via content-addressing, we instead would have defined it as urn:sha256:3e1de3b56d2dc1bee7313963462691f9a8f46b068557b75e0e0d14c0994eddc6. This is unambiguous enough to avoid collisions with other uses of the word "Note". But note that it doesn't require any servers staying up. It also doesn't have any namespace governance quagmire, because there is no namespace. Updates can be handled the usual way, via errata (translations can be handled similarly), and standards organizations can still publish such things... but it is important that the original term remain content-addressed and immutable. (Hash migration is left as an exercise for the user, with a hint that the solution is similar to that with errata.)
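The same naming step can be sketched in Python. This assumes the convention from the shell example (the recommended label, a colon, the definition text, and the trailing newline that echo adds); the term_urn helper and the "musical note" definition are hypothetical, made up purely for contrast.

```python
import hashlib


def term_urn(name, definition):
    """Build a urn:sha256 name from a recommended label plus its definition text."""
    # Mirrors the shell example: "Name: definition" plus the newline echo appends.
    text = f"{name}: {definition}\n"
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return "urn:sha256:" + digest


note = term_urn(
    "Note",
    "Represents a short written work typically less than a single paragraph in length.",
)
# Hypothetical second definition that shares the same recommended label.
musical = term_urn("Note", "A single musical tone of a given pitch and duration.")

print(note)
print(note != musical)  # same label, different definitions, different names
```

Since the name is derived from the bytes of the text, even a whitespace change produces a different URN. That is why updates are handled through errata rather than by editing the definition in place.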

Anyway, our post might end up looking in the end like this instead:

  {"@context": {"Note": "urn:sha256:3e1de3b56d2dc1bee7313963462691f9a8f46b068557b75e0e0d14c0994eddc6",
                "content": "urn:sha256:57dc44a1cdcbb7aa976a65a858b4d349ad6110d58d9d546650ce2b0e2b1048e4",
                "sensitive": "urn:sha256:81d98cf83fcf733400ad5d2a25495feeea47f287193a53a9722f4cb025da88f1"},
   "@type": "Note",
   "content": "Would you read me a bedtime story about the great ontology wars?",
   "sensitive": true}

I'll note very briefly that content-addressing is also the answer for JSON-LD contexts. If something like Datashards or IPFS were used to host json-ld contexts, each post could link to the exact immutable content-addressed context it was intended to be used with. Servers that use such contexts can "pin" them to keep them available, avoiding a single point of failure (or bandwidth bottleneck).

  {"@context": "idsc:p0.JLnUcJN4R1KNvSXm9Ut3Tmg7WfXAKEOx47p01Pk_Htw.2_rCdtnEha1RpD_qyzxhFIjUvLj7crIbzpmzWei5xRk",
   "@type": "Note",
   "content": "Would you read me a bedtime story about the great ontology wars?",
   "sensitive": true}

As one other side-note, I'll also observe that even though the fully expanded version of the above message is:

  {"@type": "urn:sha256:3e1de3b56d2dc1bee7313963462691f9a8f46b068557b75e0e0d14c0994eddc6",
   "urn:sha256:57dc44a1cdcbb7aa976a65a858b4d349ad6110d58d9d546650ce2b0e2b1048e4": "Would you read me a bedtime story about the great ontology wars?",
   "urn:sha256:81d98cf83fcf733400ad5d2a25495feeea47f287193a53a9722f4cb025da88f1": true}

... we never needed to look at it that way because json-ld contexts (and systems like them) are actually petname systems.
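To illustrate the petname point, here is a small Python sketch in which two documents use different local names bound to the same underlying terms. The expand helper is hypothetical and only handles flat documents with an inlined context, and the urn:example: names are short stand-ins for the long hashes above.

```python
def expand(doc):
    """Replace petnames in a flat document with the names its context binds them to."""
    context = doc.get("@context", {})
    expanded = {}
    for key, value in doc.items():
        if key == "@context":
            continue
        # Property names are looked up in the context; @type's value is a term too.
        full_key = context.get(key, key)
        if key == "@type":
            value = context.get(value, value)
        expanded[full_key] = value
    return expanded


english = {
    "@context": {"Note": "urn:example:note", "content": "urn:example:content"},
    "@type": "Note",
    "content": "Would you read me a bedtime story?",
}
german = {
    "@context": {"Notiz": "urn:example:note", "inhalt": "urn:example:content"},
    "@type": "Notiz",
    "inhalt": "Would you read me a bedtime story?",
}

print(expand(english) == expand(german))
```

Both documents expand to the same thing, so the short names are just a local convenience for whoever wrote the context.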

Conclusions (and non-conclusions)

Let me clarify a claim I'm not making: we don't need to throw away the old terms for systems like ActivityStreams that are already well understood. However, going forward I do think that using content-addressing of new terms is a good idea. And in the long run, I think content-addressing of json-ld contexts and any documents like them is an absolute must (when they aren't inlined, anyway... but inlining is expensive).
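As a rough illustration of what pinning a content-addressed context could look like, here is a Python sketch of a toy in-memory store. It uses a plain sha256 URN instead of Datashards or IPFS addressing, and the PinnedContexts class and its deterministic JSON serialization are assumptions of this example, not part of any of those systems.

```python
import hashlib
import json


def address_of(data: bytes) -> str:
    """Name a blob by the sha256 of its bytes."""
    return "urn:sha256:" + hashlib.sha256(data).hexdigest()


class PinnedContexts:
    """A toy local store: contexts are kept under the hash of their bytes."""

    def __init__(self):
        self._blobs = {}

    def pin(self, context: dict) -> str:
        # Serialize deterministically so the same context always hashes the same.
        data = json.dumps(context, sort_keys=True, separators=(",", ":")).encode("utf-8")
        address = address_of(data)
        self._blobs[address] = data
        return address

    def load(self, address: str) -> dict:
        data = self._blobs[address]
        # Re-check the hash: content that does not match its name is rejected.
        if address_of(data) != address:
            raise ValueError("context does not match its address")
        return json.loads(data)


store = PinnedContexts()
addr = store.pin({"Note": "as:Note", "content": "as:content"})
post = {"@context": addr, "@type": "Note", "content": "Hello"}
print(store.load(post["@context"]))
```

Because the address is computed from the content, any server holding a copy can serve it and the reader can still verify that it is the exact context the post was written against.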

If we adopted Content Addressed Vocabularies, working on vocabulary extensions to ActivityPub could be a different story. Imagine a git repository that communities can fork to work on new terms. We could have a drafts directory where people hammer out common extension terms, and when they're ready, we simply move them to the extensions directory. Since the names are merely hashes of the contents of that directory, statically generating a webpage that lists all current known and recommended extensions would be trivial. Everything could be handled in issues and PRs, and even if terms aren't merged into the main repo, that's merely a matter of lower term discoverability rather than a hindrance of application itself.
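Here is a sketch of how such an index could be generated, in Python. The layout is an assumption: one .txt file per term inside an extensions directory, each holding the term's definition text. The build_index helper and the sample definition are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def build_index(extensions_dir):
    """Map each term file's name to the urn:sha256 of its contents."""
    index = {}
    for path in sorted(Path(extensions_dir).glob("*.txt")):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        index[path.stem] = "urn:sha256:" + digest
    return index


# Build a throwaway "repository" so the example is self-contained.
with tempfile.TemporaryDirectory() as repo:
    extensions = Path(repo) / "extensions"
    extensions.mkdir()
    (extensions / "sensitive.txt").write_text(
        "sensitive: Hypothetical definition text for the example.\n", encoding="utf-8"
    )
    index = build_index(extensions)

for name, urn in index.items():
    print(f"{name}\t{urn}")
```

Moving a file from drafts to extensions would not change its hash, so a term keeps the same name throughout incubation as long as its text stays the same.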

If we moved to content addressed vocabulary, we'd be more free from the perils of downtime and general web bitrot, freer from gatekeeping and governance challenges, but just as free (I'd argue even freer) to collaborate. Moving forward, I intend to take content addressed approaches to terms I define in my systems, and I encourage you to do the same.