This article is long, and it's taken me a long time to get to.  I
procrastinated enough that the only way I could do it was to make it
as [social network] [threads] first, and this is a compilation and
expansion of those ideas.

But still, this post is also too long, so here's a tl;dr (too long;
didn't read) summary:

• Spritely Goblins has made significant headway with CapTP, a design
  referred to as the "Capability Transport Protocol"

• CapTP is not original to Spritely, but Spritely is making some
  improvements; various implementations have been around for over two
  decades

• CapTP enables secure, distributed object capability programming in
  mutually suspicious networks

• CapTP reduces the work of writing "protocols" to merely thinking
  about writing ordinary programs

• CapTP has cool features like "distributed acyclic garbage
  collection" and "promise pipelining"

• We plan a general version of CapTP which can be used across many
  different languages (and we're already talking with the Agoric folks
  about interoperability with their Javascript ecosystem); a ways out,
  but on the horizon

• Nearly everything cool that comes out of Spritely in the future will
  use CapTP as part of its foundation

Okay, now if you want the verbose version, read on.

# WHITEBOARD IMAGE HERE

For a while I've had the following whiteboard hovering around my desk
in my office.  It has unintentionally served as a fun conversation
starter in some video calls, but its real purpose was to help me think
through some difficult problems in implementation.

If you've been following me on the fediverse or on Twitter you
probably have a guess as to what this is, because I've been saying it
over and over again: this diagram represents [CapTP].  I've been
making major strides lately in the CapTP implementation in [Spritely
Goblins] and posting about it as I go.

I've even made wild claims like "I think CapTP is the most important
work I've done yet in my life" (yes, in the long run, I think it will
be a bigger deal than [ActivityPub]'s standardization work, which I am
very proud of, but the two are not at odds; the plan for Spritely is
that the two [live side by side]).  But while I've made it very clear
that /I am/ excited, I haven't done a very good job of explaining why
/others/ should be excited.

So the real question is: what does CapTP do?  Or more importantly,
what does CapTP enable us to do?  I don't think I've done a good job
of explaining this so far, so this blogpost is an attempt to do
better.  Admittedly, even the best attempt here might not succeed
until people get to use it; I used to have /the hardest/ time
explaining to anyone what ActivityPub was, even though I used mostly
the same language I do now, and suddenly when it started gaining major
adoption it was as if everyone I was talking to got it and my life
became much easier.

I suspect a similar thing will happen with CapTP… at the moment, a lot
of what I am going to say will sound abstract and maybe even barely
believable.  As it turns out, simply [talking at people] is a rather
horrible way to get people to understand concepts, but [showing
people] is another matter.  A [small demo] does exist, but the big
CapTP demo is to come.  Nonetheless, this blogpost will attempt to
explain in plain (or plain-ish) language what CapTP can enable us to
do.

Let me speak at a high level first.  Spritely makes the claim that we
are aiming for /the ability to/ build such rich things as distributed
virtual worlds.  That requirement is not there just because
distributed virtual worlds are fun things to work on or use (though
they are), but because right now the social and computing systems we
have tend to be /insufficiently/ capable of performing the kind of
rich and secure interactions that this goalpost requires.  CapTP
becomes the foundation for making this possible, here and in many of
the other spaces Spritely aims to work in.

But I'm already multiple paragraphs in with no explanation yet, so
here we go.  Are you ready?

      CapTP is a protocol to allow "distributed object
      programming over mutually suspicious networks, meant to be
      combined with an object capability style of programming".

Wow, clear as mud, right?  There's a lot to unpack there!  So let's
break it apart piece by piece…

First of all, "distributed object programming": Yes, CapTP allows you
to program across networks with the same level of convenience as if
you were programming against /local objects/.  (Throw out [your
assumptions of "objects" as OOP], this can be functional; Spritely
Goblins more or less is.)  This is done by creating local proxies
representing remote objects that the programmer can operate against.
This has been done wrong many times in the past (e.g. the NeXT model);
doing this right is the result of [significant research].  But the
result is significant programmer ergonomics in building distributed
systems.
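
To make the proxy idea concrete, here is a minimal Python sketch.  It
is not Goblins code and not the CapTP wire format: `Greeter`,
`FakeNetwork`, and `RemoteProxy` are hypothetical names, the "network"
is an in-process object that only passes JSON strings around, and the
call is synchronous, whereas a real CapTP system is asynchronous and
promise-based.

```python
import json


class Greeter:
    """An ordinary object living on the 'far' side."""

    def __init__(self, greeting):
        self.greeting = greeting

    def greet(self, name):
        return f"{self.greeting}, {name}!"


class FakeNetwork:
    """Stand-in for a connection: only JSON strings cross it."""

    def __init__(self):
        self.exports = {}    # integer id -> real object on the far side
        self.wire_log = []   # every serialized request that crossed

    def export(self, obj):
        obj_id = len(self.exports)
        self.exports[obj_id] = obj
        return obj_id

    def deliver(self, message):
        self.wire_log.append(message)
        request = json.loads(message)
        target = self.exports[request["target"]]
        result = getattr(target, request["method"])(*request["args"])
        return json.dumps({"result": result})


class RemoteProxy:
    """Local stand-in for a far object; method calls become messages."""

    def __init__(self, network, obj_id):
        self._network = network
        self._obj_id = obj_id

    def __getattr__(self, method):
        # Only reached for names not found on the proxy itself.
        def send(*args):
            message = json.dumps(
                {"target": self._obj_id, "method": method, "args": list(args)})
            reply = self._network.deliver(message)
            return json.loads(reply)["result"]
        return send


network = FakeNetwork()
greeter_id = network.export(Greeter("Hello"))
far_greeter = RemoteProxy(network, greeter_id)

# Reads like a local method call, but travels as a serialized message.
message_text = far_greeter.greet("Alice")
```

The calling code never builds a message by hand; it just invokes a
method on `far_greeter`, and the proxy does the wiring.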

Next, "mutually suspicious networks": there's no assumption that trust
exists on server-boundaries… CapTP is built to allow collaboration
/without/ full trust.  Curiously, this approach allows for /increased/
collaboration and building of more trust; collaboration is more
consensual.

This is no small matter.  To draw parallels to non-computing life, I
feel safer knowing I do not need to trust all people equally and with
the same things in my own life… it is important to permit building the
/appropriate/ level of trust, rather than an /absolute/ level of
trust, in all parties.  We do this in our daily lives, but our
computing systems are generally not privy to all of our thoughts (that
too might result in trust violations) and yet must act on our behalf.
The ability to scope the amount of trust permitted means living a life
of greater collaboration, less paranoia, and less distrust.

The decision to not assume the need for trust on machine/server
boundaries may also seem surprising, but is important.  If you've ever
tried to configure CORS, you'll be aware of how hard and error-prone
this is.  Even the most advanced security architects find themselves
frequently making mistakes in this area.

But making decisions based on node boundaries seems like a strange
system if we think too long about it anyway.  In general (though it
often requires much social un-conditioning), I try not to draw trust
boundaries that treat all members of one nation-state the same.
Similarly, there are many households whose members I trust to
varying degrees and with different things.

So the machine boundary seems like a poor indicator of trust.  It is
even poorer when we examine the needs of fully peer to peer systems.
Server-boundary-oriented systems with only a small number of trusted
servers barely scale in the post-web-2.0 era of increasing
consolidation of the web to just a few service providers.  They cannot
stand up when making new nodes is trivial.

So, nodes are mutually suspicious and do not hand out access to each
other simply because they happen to be on trusted lists of server
identities.  So how is authority handed out?

This is where "combined with object capability style of programming"
comes in: this combination is where the power really comes out.
*Safe, cooperative interaction* is very /easy/ in the ocap style: it
turns out capability flows can be encoded as normal programming:
argument passing and scope!  This is the fundamental observation of [A
Security Kernel Based on the Lambda Calculus]; if we take our models
of programming seriously, within them is the best security model we
have, and the easiest for programmers to reason about.  CapTP takes
that observation and applies it on the network level.

A nicely implemented CapTP system will abstract this for programmers
so they can focus on the programming part.  It wasn't handed to you?
Then it's not in your scope and you can't access it.
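
As a rough illustration of "argument passing and scope" as the
security model, consider the Python sketch below.  The names
(`make_counter`, `worker`, `auditor`) are made up for this example,
and Python itself is not an ocap-safe language (introspection can
reach into closures), so treat this as a picture of the discipline
rather than an enforcement mechanism.

```python
def make_counter():
    """Create a counter and return two separate capabilities to it."""
    count = 0

    def increment():
        nonlocal count
        count += 1
        return count

    def read():
        return count

    return increment, read


def worker(increment):
    # Was handed only `increment`, so it can bump the counter.
    increment()
    increment()


def auditor(read):
    # Was handed only `read`; there is no `increment` in its scope,
    # so it has no way to change the count.
    return f"count is {read()}"


increment, read = make_counter()
worker(increment)
report = auditor(read)
```

Who can do what is decided entirely by which function was passed to
whom; there is no access control list anywhere in the program.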

This simplifies program construction dramatically.

Recently [I did a 250 line client/server p2p chat "protocol"] (well,
250 lines for the protocol, a mere 300 lines more for the GUI), but I
didn't really have to think about the protocol at all.  In fact I
designed it locally first, in one process; it "automatically" worked
over the network, but that's because CapTP took care of the network
considerations for me.

By contrast, in most programming systems, an /enormous/ amount of time
is spent on protocol design and APIs which tend to be bespoke and
disconnected mostly from the actual implementation.  They also tend to
be made of many moving parts which are hard to reason about.  We know
that building good abstractions can lead to significant gains in
programmer productivity; TCP and TLS are clear examples of this.
CapTP, when combined with object capability security systems, brings a
similar type of abstraction gain; the more tedious parts of protocols
are handled in a general way, and we can focus on the specific ways
our programs work and need to communicate.  In general this will often
correspond to what we would have put at the API perimeter anyway, but
now we need less confusing wiring to do it.

Okay, if you've really read this far already, time for an
intermission.  Spritely has not invented CapTP (though it is helping
in some of the innovations happening which have been planned for this
generation).  The idea dates back around two and a half decades
and was part of the [E programming language] (which I often
jokingly call "the most interesting programming language you've never
heard of").  E actually came out of another distributed virtual worlds
system of the late 90s, [Electric Communities Habitat].  Even though
EC Habitat did not make it out of the dot-com crash, E did, and lived
on as an open source project.  (It's no exaggeration to say that the
vast majority of the exploration space Spritely is exploring comes out
of work that was trailblazed most especially by E; I can't recommend
[Mark Miller's dissertation] enough.)  In-between then and now, CapTP
has seen several variants (maybe the most famous of which is [Cap'N
Proto], which in some ways I think of as a mostly-CapTP for people who
don't know [they're using a CapTP], and whose [rpc.capnp] was of
enormous help to me learning how CapTP works).

Another funny thing happened in-between E's CapTP and now: most of the
E and ocap folks joined Javascript's standardization efforts and over
the course of the last decade and a half or so have helped beat it
into a suitable shape to finally also achieve the distributed object
dream.  Most of those folks have gone on to start an organization
named [Agoric] (whose namesake goes back to [The Agoric Papers] which
laid out the vision for all this work all the way back in 1988 (holy
cow!))  which is just now bringing that dream of distributed ocap
networks to Javascript land and beyond (with a bit more focus on
economic systems in contrast to Spritely's focus on social networks).

Which may lead you to ask: shouldn't Agoric and Spritely's CapTPs
interoperate?  I'm happy to say: [we're already talking, and that's
the plan].  In fact, as part of my ([documented, but long]) process
of learning CapTP I submitted a [PR adding some comments to Agoric's
implementation] since I was reading and trying to make sense of it
anyway (happily, it was merged).  It's a long-term high priority (but
not an urgent short-term priority) for both sides to implement the
same protocol.  When this happens, it won't matter whether you're
using Lisp'y Spritely code or Javascript'y Agoric code; you should be
able to do distributed object programming in each and both should
interoperate happily.

Okay, back to the cool features that CapTP provides.  CapTP is very
/efficient:/ you may have used ocap systems that have huge
certificates or long URIs.  In CapTP a shared capability is merely a
/bidirectional integer assignment/ between the machine importing and
the machine exporting!  (But users need not be aware of this, since
again, a well designed CapTP interface encapsulates the underlying
semantics in the same way that good TLS and TCP libraries encapsulate
the layers of encrypted and ordered network connections.)
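
Here is one way to picture that integer assignment, as a hedged
Python sketch.  The `Session` class and its method names are
hypothetical and are not the actual Goblins or Agoric data
structures; the point is only that what crosses the wire is a small
integer whose meaning is private to one pair of machines.

```python
class Session:
    """One machine's view of a single pairwise connection."""

    def __init__(self):
        self.exports = {}      # integer -> local object we have shared
        self.export_ids = {}   # id(local object) -> integer
        self.imports = {}      # integer -> local placeholder for a far object
        self._next = 0

    def export(self, obj):
        """Assign (or reuse) the integer that stands for obj on the wire."""
        key = id(obj)
        if key not in self.export_ids:
            self.export_ids[key] = self._next
            self.exports[self._next] = obj   # keeps obj alive while shared
            self._next += 1
        return self.export_ids[key]

    def import_(self, position):
        """Look up (or create) the local stand-in for a far object."""
        if position not in self.imports:
            self.imports[position] = f"<far-object {position}>"
        return self.imports[position]


alice_side = Session()   # Alice's end of the Alice<->Bob connection
bob_side = Session()     # Bob's end of the same connection

chatroom = object()
profile = object()

wire_a = alice_side.export(chatroom)
wire_b = alice_side.export(profile)
wire_again = alice_side.export(chatroom)   # same object, same integer

bob_ref = bob_side.import_(wire_a)
```

The integers only mean something inside this one session; a different
pair of machines would have its own, unrelated numbering.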

CapTP also has distributed acyclic garbage collection.  That means
that two servers can collaborate to say "oh yeah, thanks for giving me
that object, but you don't need to hold onto it any more on my
behalf."  Wow!  (The original Electric Communities proto-CapTP even
could handle [collecting cycles that span machines]; this seems to
require more significant support from the underlying language runtime
than most provide.  This turns out to be rarely needed anyway and is
definitely a deeper rabbit hole than the already deep rabbit holes
this blogpost has gone down so we will save that for another time.)

Why should you care about distributed acyclic GC though?  Let's put it
another way: imagine you were building some sort of distributed role
playing game.  Your players are regularly fighting bats, which are
cheap enemies that generally don't stick around long.  It doesn't take
long for your system to be bogged down by bat corpses!  CapTP helps
solve this problem by allowing servers to cooperatively know when they
no longer need object references held on their behalf.  (Before you
start asking about non-cooperative scenarios, there are abstraction
layers for that too, but we won't worry about those in this particular
post.)  This is a big win that few other protocols provide, but which
is cheap and efficient under CapTP.
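
The bat scenario can be sketched as cooperative reference counting.
This is a simplified Python illustration under my own assumptions:
`GameServer`, `PlayerClient`, and the `on_drop` message are
hypothetical names, the "message" is a direct method call, and a real
implementation has to handle races between re-sending a reference and
dropping it.

```python
class GameServer:
    """Exporting side: holds objects on behalf of remote importers."""

    def __init__(self):
        self.export_table = {}   # integer -> [object, references handed out]
        self._next_position = 0

    def share(self, obj):
        position = self._next_position
        self._next_position += 1
        self.export_table[position] = [obj, 1]
        return position

    def on_drop(self, position, count=1):
        """The importer says it no longer needs `count` references."""
        entry = self.export_table[position]
        entry[1] -= count
        if entry[1] <= 0:
            # Nobody remote needs it; stop holding it on their behalf.
            del self.export_table[position]


class PlayerClient:
    """Importing side: tells the exporter when it is done with things."""

    def __init__(self, server):
        self.server = server
        self.imports = set()

    def receive(self, position):
        self.imports.add(position)

    def done_with(self, position):
        self.imports.discard(position)
        self.server.on_drop(position)   # a network message in a real system


server = GameServer()
player = PlayerClient(server)

for n in range(100):
    player.receive(server.share(f"bat #{n}"))
held_during_fight = len(server.export_table)

for position in list(player.imports):
    player.done_with(position)
held_after_fight = len(server.export_table)
```

Plain counting like this cannot reclaim a cycle of references that
spans machines, which is why the feature is described as /acyclic/.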

CapTP also has ["promise pipelining"], which reduces round trips.  I
can send a message to a remote car factory and ask it to drive the car
once it makes it, even before I've been told the car is made!
(Spritely, Agoric, and even E all have tooling that makes this look
like a "natural" code flow as well.)  To quote [Mark Miller's
dissertation]:

      Machines grow faster and memories grow larger. But the
      speed of light is constant and New York is not getting any
      closer to Tokyo.  As hardware continues to improve, the
      latency barrier between distant machines will increasingly
      dominate the performance of distributed computation.
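
The car factory example can be sketched by counting round trips.
This Python model is an assumption-laden simplification, not how
Goblins, Agoric, or E implement pipelining: `FarSide`, `Connection`,
and the tuple message shape are invented here, and "sending without
waiting" is modeled as putting several messages into one
`send_and_wait` call.

```python
class Car:
    def __init__(self, color):
        self.color = color

    def drive(self):
        return f"Vroom! The {self.color} car drives off."


class CarFactory:
    def make_car(self, color):
        return Car(color)


class FarSide:
    """The remote machine: handles messages in order of arrival."""

    def __init__(self, root):
        self.answers = {"factory": root}   # answer name -> resolved value

    def handle(self, messages):
        for answer_id, target_id, method, args in messages:
            # The target may be the answer to an earlier message.
            target = self.answers[target_id]
            self.answers[answer_id] = getattr(target, method)(*args)
        return self.answers[messages[-1][0]]


class Connection:
    def __init__(self, far_side):
        self.far_side = far_side
        self.round_trips = 0

    def send_and_wait(self, messages):
        self.round_trips += 1   # one trip out, one reply back
        return self.far_side.handle(messages)


# Without pipelining: wait to hear the car exists, then ask to drive it.
slow = Connection(FarSide(CarFactory()))
slow.send_and_wait([("car", "factory", "make_car", ["red"])])
slow_result = slow.send_and_wait([("drive-result", "car", "drive", [])])

# With pipelining: address the second message to the promised "car"
# before any reply has come back.
fast = Connection(FarSide(CarFactory()))
fast_result = fast.send_and_wait([
    ("car", "factory", "make_car", ["red"]),
    ("drive-result", "car", "drive", []),
])
```

Both versions get the same answer, but the pipelined one pays the
New York to Tokyo latency once instead of twice.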

All in all, this reduces the amount of work for rich, networked
collaborations with safety properties we can reason about from
something which only protocol hyper-experts are deemed worthy to
consider, to something that us mere mortals can think about.  Focus on
writing your code and think about where access is being passed around
on that layer.

Now let's discuss the current state of things.  Some of the biggest
pieces of the CapTP puzzle have recently landed.  One of those,
"handoffs", is what was being puzzled on the whiteboard previously
shown; since much of CapTP's efficiency and operation comes from the
local pairwise meaning of integers between two machines, transferring
a capability that machine A holds on machine C over to machine B is a
tricky process.  We now have a certificate-oriented solution that has gone
through some community review (though we would like more, and more
will come at protocol codification time for certain) and is also in
alignment with the plans expressed by the folks over at Agoric, so
this is good news: getting the "key features" of CapTP in is (mostly)
no longer a blocker for fleshing out other layers of Spritely.

At this current moment, Spritely has actually gotten a bit ahead of
Agoric in terms of /current implementations/ of CapTP (this won't be
the case for long though, it's just a matter of time until we're at
feature parity); we already have distributed acyclic garbage
collection, handoffs, and shortening/unwrapping of object references
that return home.  This is easily explained in that the Agoric folks
have actually implemented such features in past systems so truly
are the main innovators in this space (though some of these features
have been broadly planned to be done differently in this generation of
CapTP, and Spritely's implementation has proceeded by implementing
those new approaches… this is probably Spritely's main source of
innovation CapTP-wise), but Agoric is also working on helping
Javascript standardize the appropriate tooling to make all these
pieces possible in such runtimes whereas Spritely's choice of lisp'y
languages, while more obscure, puts it in a space of languages built
for language design enthusiasts, so all of those pieces were already
there.

We should be glad that Agoric is doing the hard work to bring these
tools to a wide audience through Javascript standardization; this is
hard work (trust me, I'm not a stranger to standards work, but
language standardization requires /extra/ care).  Additionally, the
focuses of Spritely and Agoric mean we're both working on similar
things with different priorities: Agoric is more focused on things
that are economic-oriented and so has made enormous strides in the
features that are needed for this area, whereas Spritely has made
innovations in the areas necessary for distributed virtual worlds and
social networks.  In the future we should see feature parity on the
CapTP layer; this is also a big win because it means that users of
either system can leverage the features of the other side without
having to disagree over /which/ language is the right foundation.
This also means that we have a way to collaborate even with us
focusing on different pieces of the end-user puzzle.  (Ie, Spritely
need not be focusing on the economic layer right now…  once we have
CapTP interoperability, we'll already have a bridge into that world
thanks to the hard work of the Agoric folks!  And the Agoric folks can
also benefit from our work on the social side of things.)

But Spritely's CapTP implementation still needs some cleanup and to be
documented.  (I did my best to [document my process of learning and
implementing CapTP as I went], though forewarning, the linked mailing
list thread is full of many twists and turns.  In a cleaned up primer,
introducing the core concepts should be much simpler.)  This work will
begin soon-ish; again, Spritely and Agoric are in close conversation,
but both also have pressing matters to attend to which precede this
work.  Likely you'll hear more about this soon.

The more exciting and immediate thing to do is to start building
demos.  Longform textual explanations are good and well, but "seeing
is believing" (and shinier demos are better; the [Terminal Phase time
travel demo] showed off features that had existed for some time in
Goblins, but was the first time I saw people raising their heads and
saying "gosh, wow, what's happening /over here/!")  CapTP is only
exciting because it's a powerful /foundation/ for what's to come.  I
fear this blogpost, long and rambly as it is, still will not capture
minds in the appropriate way.  Hopefully by building demos people can
get a sense and feeling that indeed, something truly interesting is
going on here, something they really want to use.

I believe Spritely's future is bright, but part of this is because of
the long and hard work on its architectural foundations.  Those pieces
are coming together, and CapTP is probably the most shining pillar of
all of those.  At the beginning of this process, CapTP was a strange
and mysterious thing, yet with seemingly alluring powers.  At present,
those alluring powers have shown themselves true and are increasingly
available to the Spritely system.  In time as CapTP is codified, we
aim to chip away at the strange and mysterious component, distributing
its power to all.

[social network] <https://octodon.social/@cwebber/105651918309744783>

[threads] <https://twitter.com/dustyweb/status/1355957088690778122>

[CapTP] <http://erights.org/elib/distrib/captp/index.html>

[Spritely Goblins] <https://docs.racket-lang.org/goblins/index.html>

[ActivityPub] <https://www.w3.org/TR/activitypub/>

[live side by side] <https://spritelyproject.org/#mandy>

[talking at people]
<http://habitatchronicles.com/2004/04/you-cant-tell-people-anything/>

[showing people]
<https://dustycloud.org/blog/if-you-cant-tell-people-anything/>

[small demo]
<https://dustycloud.org/blog/spritely-goblins-v0.7-released/>

[your assumptions of "objects" as OOP]
<http://mumble.net/~jar/articles/oo.html>

[significant research] <http://www.erights.org/talks/thesis/>

[A Security Kernel Based on the Lambda Calculus]
<http://mumble.net/~jar/pubs/secureos/secureos.html>

[I did a 250 line client/server p2p chat "protocol"]
<https://dustycloud.org/blog/spritely-goblins-v0.7-released/>

[E programming language] <http://erights.org/>

[Electric Communities Habitat]
<https://www.youtube.com/watch?v=KNiePoNiyvE>

[Mark Miller's dissertation] <http://www.erights.org/talks/thesis/>

[Cap'N Proto] <https://capnproto.org/>

[they're using a CapTP] <https://capnproto.org/rpc.html>

[rpc.capnp]
<https://github.com/capnproto/capnproto/blob/master/c++/src/capnp/rpc.capnp>

[Agoric] <https://agoric.com/>

[The Agoric Papers] <https://agoric.com/papers/>

[we're already talking, and that's the plan]
<https://github.com/Agoric/agoric-sdk/issues/1827>

[documented, but long]
<https://groups.google.com/g/cap-talk/c/xWv2-J62g-I>

[PR adding some comments to Agoric's implementation]
<https://github.com/Agoric/agoric-sdk/pull/1139>

[collecting cycles that span machines]
<http://erights.org/history/original-e/dgc/>

["promise pipelining"]
<http://www.erights.org/elib/distrib/pipeline.html>

[document my process of learning and implementing CapTP as I went]
<https://groups.google.com/g/cap-talk/c/xWv2-J62g-I>

[Terminal Phase time travel demo]
<https://dustycloud.org/blog/goblins-time-travel-micropreview/>