Posts with tag "federation"

I've been awarded the Samsung Stack Zero Grant

By Christopher Lemmer Webber on Thu 31 January 2019

Good news everyone! I've been awarded the Samsung Stack Zero Grant. But why not quote them?

Christopher Lemmer Webber is the co-editor and co-author of the now-ubiquitous ActivityPub protocol. While it provides a great framework for creating, updating, and deleting content across applications, it doesn’t provide any standardised mechanism for secure authorisation. With Spritely, Webber will work on extending the protocol in a backward-compatible manner, while at the same time building tools and applications that showcase its use. This will enable developers to build applications that enable richer interactions through a federated standard.

This should fund the next couple of years of my full-time work on advancing the fediverse.

You may remember that I've talked about Spritely before. In fact I am finally in launch-mode... I am currently sitting in a wizard's tower at a hackathon, getting out the first release of Golem, a Spritely artifact.

Anyway, I'll be at FOSDEM 2019 giving a talk and on a panel. And after that I'll be speaking at CopyleftConf. Maybe I'll see you?

More news soon...

Spritely: towards secure social spaces as virtual worlds

By Christopher Lemmer Webber on Sun 14 October 2018

If you follow me on the fediverse, maybe you already know. I've sent an announcement to my work that I am switching to doing a project named Spritely on my own full time. (Actually I'm still going to be doing some contracting with my old job, so I'll still have some income, but I'll be putting a full 40 hours a week into Spritely.)

tl;dr: I'm working on building the next generation of the fediverse as a distributed game. You can support this work if you so wish.

What on earth is Spritely?

"Well, vaporware currently", has been my joke since announcing it, but the plans, and even some core components, are starting to congeal, and I have decided it's time to throw myself fully into it.

But I still haven't answered the question, so I'll try to do so in bullet points. Spritely:

  • Aims to bring stronger user security, better anti-abuse tooling, stronger resistance against censorship, and more interesting interactions to users of the fediverse.
  • Is based on the massively popular ActivityPub standard (which I co-authored, so I do know a thing or two about this).
  • Aims to transform distributed social networks into distributed social games / virtual worlds. The dreams of the 90s are alive in Spritely.
  • Recognizes that ActivityPub is based on the actor model, and a pure version of the actor model is itself already a secure object capability system, so we don't have to break the spec to gain those powers... just change the discipline of how we use it.
  • Will be written in Racket.
  • Is an umbrella project for a number of modular tools necessary to get to this goal. The first, an object capability actor model system for Racket named Goblins, should see its first public release in the next week or two.
  • And of course it will be 100% free/libre/open source software.

That's a lot to unpack, and it also may sound overly ambitious. The game part in particular may sound strange, but I'll defend it on three fronts. First, not too many people run federated social web servers, but a lot of people run Minecraft servers... lots of teenagers run Minecraft servers... and it's not because Minecraft has the best graphics or the best fighting (it certainly doesn't), it's because Minecraft allows you to build a world together with your friends. Second, players of old MUDs, MOOs, MUSHes, etc. from the 90s may recognize that modern social networks are structurally degenerate forms of the kinds of environments that existed then: contemporary social networks lack a sense of place and interaction. Third, many interesting projects (Python's Twisted library, Flickr, many object capability security patterns) have come out of trying to build such massively multiplayer world systems. Because of this last one in particular, I think that shooting for the stars means that if we don't make it we're likely to at least make the moon, so failure is okay if it means other things come out of it. (Also, four: it's a fun and motivating use case for me which I have explored before.)

To keep Spritely from being total vaporware, the way I will approach the project is by regularly releasing a series of "demos", some of which may be disjoint, but will hopefully increasingly converge on the vision. Consider Spritely a skunkworks-in-the-public-interest for the federated social web.

But why?

Standardizing ActivityPub was a much more difficult effort than anticipated, but it also turned out at least as successful as I expected, if not more so (partly due to Mastodon's adoption launching it past the sound barrier). In that sense this is great news. We now have dozens of projects adopting it, and the network has (last I looked) over 1.5 million registered users (which isn't the same as active users).

So, mission accomplished, right? Well, there are a few things that bother me.

  • The kinds of rich interactions one can have are limited by a lack of authorization policy. Again, I believe object capabilities provide this, but it's not well explained to the public how to use them. (By contrast, Access Control Lists and friends are absolutely the wrong approach.)
  • Users are currently insufficiently protected from spam, abuse, and harassment while at the same time administrators are overwhelmed. This is leading a number of servers to move to a whitelisting of servers, which both re-centralizes the system and prioritizes big instances over smaller instances (it shouldn't matter what instance size you're on; arguably we should be encouraging smaller ones even). There are some paths forward, and I will hint at just one: what would happen if instead of one inbox, we had multiple inboxes? If I don't know you, you can access me via my public inbox, but maybe that's heavily moderated or you have to pay "postage". If I do know you, you might have an address with more direct access to me.
  • Relatedly, contemporary fediverse interfaces borrow from surveillance-capitalism based popular social networks by focusing on breadth of relationships rather than depth. Ever notice how the first thing Twitter shows you when you hover over a person's face is how many followers they have? I don't know about you, but I immediately compare that to my own follower count, and I don't even want to. This encourages high school popularity contest type bullshit, and it's by design. What if instead of focusing on how many people we can connect to we instead focused on the depth of our relationships? Much of the fediverse has imported "what works" directly from Facebook and Twitter, but I'd argue there's a lot we can do if we drop the assumption that this is the ideal starting base.
  • The contemporary view in the fediverse is that social scoping is like Python scoping: locals (instance) and globals (federation). Instance administrators are even encouraged to run communities based on a specific niche, which is a nice way to motivate administrators, but it causes problems: even small differences between servers' expected policies often result in servers banning each other entirely. (Sometimes this is warranted, and I'm not opposed to moderation but rather looking for more effective forms of it.) Yet most of us are one person but part of many different communities with different needs. For instance, Alice may be a computer programmer, a tabletop game enthusiast, a fanfiction author, and a member of her family. In each of those settings she may present herself differently and also have different expectations of what is acceptable behavior. Alice should not need multiple accounts for this on different servers, so it would seem the right answer for community gathering is closer to something like mailing lists. What is acceptable at the gaming table may not be acceptable at work, and what happens in the fanfiction community perhaps does not need to be shared with one's family, and each community should be empowered to moderate appropriately.
  • I'd like to bridge the gap between peer to peer and federated systems. One hint as to how to do this: what happens when you run ActivityPub servers over Tor onion services or I2P? What if instead of our messages living at http addresses that could go down, they could be securely addressed by their encrypted contents?
  • Finally, I will admit the most urgent reason for these concerns... I'm very concerned politically about the state of the world and what I see as increasing authoritarianism and flagrant violations of human rights. I have a lot of worry that if we don't normalize use of decentralized and secure private systems, we will lose the ability to host them, though we've never needed them more urgently.
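The multiple-inbox hint from the list above can be made a little more concrete. What follows is only a rough sketch under stated assumptions: the `Actor` class, the inbox tokens, and the numeric "postage" are all hypothetical names made up for illustration, and nothing here is specified by ActivityPub or Spritely. The idea it shows is that a private inbox address acts like a capability: holding it is what grants the more direct access, and it can be revoked without touching anyone else.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class Actor:
    """Hypothetical actor with one public inbox and per-contact private inboxes."""
    name: str
    postage_required: int = 1                              # illustrative cost for strangers
    moderation_queue: list = field(default_factory=list)   # public inbox, held for review
    delivered: list = field(default_factory=list)          # messages that reached the user
    private_inboxes: dict = field(default_factory=dict)    # token -> contact name

    def grant_inbox(self, contact):
        """Hand a known contact an unguessable inbox address."""
        token = secrets.token_urlsafe(16)
        self.private_inboxes[token] = contact
        return token

    def revoke_inbox(self, token):
        """Withdraw direct access from one contact only."""
        self.private_inboxes.pop(token, None)

    def deliver(self, message, token=None, postage=0):
        if token is not None:
            # Private route: possession of a live token is the authorization.
            if token in self.private_inboxes:
                self.delivered.append((self.private_inboxes[token], message))
                return "delivered"
            return "rejected"
        # Public route: strangers pay "postage" and land in a moderated queue.
        if postage >= self.postage_required:
            self.moderation_queue.append(message)
            return "queued"
        return "rejected"
```

A stranger without postage is turned away, a stranger with postage is queued for moderation, and a contact holding a token is delivered directly until that token is revoked.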

There are a lot of opportunities, and a lot of things I am excited about, but I am also afraid of inaction and how many regrets I will have if I don't try. I have the knowledge, the privilege, and the experience to at least attempt to make a dent in some of these things. I might not succeed. But I should try.

Who's going to pay for all this?

I don't really have a funding plan, so I guess this is kind of a non-answer. However, I do have a Patreon account you could donate to.

But should you donate? Well, I dunno, I feel like that's your call. Certainly many people are in worse positions than I am; I have a buffer and I still am doing some contracting to keep myself going for a while. Maybe you know people who need the money more than I do, or maybe you need it yourself. If this is the case, don't hesitate: take care of yourself and your loved ones first.

That said, FOSS in general has the property of being a public good but tends to have a free rider problem. While we did some fundraising for some of this stuff a few years ago, I gave the majority of the money to other people. Since then I've been mostly funding work on the federated social web myself in one way or another, usually by contracting on unrelated or quasi-related things to keep myself above the burn rate. I have the privilege and ability to do it, and I believe it's critical work. But I'd love to be able to work on this with focus, and maybe get things to the point to pull in and pay other people to help again. Perhaps if we reach that point I'll look at putting this work under a nonprofit. I do know I'm unwilling to break my FOSS principles to make it happen.

Anyway... you may even still be skeptical after reading all this about whether or not I can do it. I don't blame you... even I'm skeptical. But I'll try to convince you the way I'm going to convince myself: by pushing out demos until we reach something real.

Onwards and upwards!

Possible routes for distributed anti-abuse systems

By Christopher Lemmer Webber on Tue 04 April 2017

I work on federated standards and systems, particularly ActivityPub. Of course, if you work on this stuff, every now and then the question of "how do you deal with abuse?" very rightly comes up. Most recently Mastodon has gotten some attention, which is great! But of course, people are raising the question, can federated systems really protect people from abuse? (It's not the first time this has come up either; at LibrePlanet in 2015 a number of us held a "social justice for federated free software systems" dinner and were discussing things then.) It's an important question to ask, and I'm afraid the answer is, "not reliably yet". But in this blogpost I hope to show that there may be some hope for the future.

A few things I think you want out of such a system:

  • It should actually be decentralized. It's possible to run a mega-node that everyone screens their content against, but then what's the point?
  • The most important thing is for the system to prevent attackers from being able to deliver hateful content. An attack in a social system means getting your message across, so that's what we don't want to happen.
  • But who are we protecting, and against what? It's difficult to know, because even very progressive groups often don't anticipate who they need to protect; "social justice" groups of the past were often exclusionary against other groups until they found out they needed to be otherwise (e.g., in each of these important social movements, some prominent members have had problems including other social justice groups: racist suffragists, civil rights activists exclusionary against gay and lesbian groups, gay and lesbian groups exclusionary against transgender individuals...). The point is: if we haven't gotten it all right in the past, we might not get it all right in the present, so the most important thing is to allow communities to protect themselves from hate.

Of course, keep in mind that no technology system is going to be perfect; these are all imperfect tools for mitigation. But what technical decisions you make do also affect who is empowered in a system, so it's also still important to work on these, though none of them are panaceas.

With those core bits down, what strategies are available? There are a few I've been paying close attention to (keep in mind that I am an expert in zero of these routes at present):

  • Federated Blocklists: The easiest "starter" route. And good news! If you're using the ActivityPub standard, there's already a Block activity, and you could build up group-moderated collections of people to block. A decent first step, but I don't think it gets you very far; for one thing, being the maintainer of a public blocklist is a risky activity; trolls might use that information to attack you. That and merging/squashing blocklists might be awkward in this system.
  • Federated reputation systems: You could also take it a step further by using something like the Stellar consensus protocol (more info in paper form or even a graphic novel). Stellar is a cryptographically signed ledger. Okay, yes, that makes it a kind of blockchain (which will make some people's eyes glaze over, but technically a signed git repository is also a blockchain), but it's not necessarily restricted to use of cryptocurrencies... you can track any kind of transaction with it. Which means we could also track blocklists, or even less binary reputation systems! But what's most interesting about Stellar is that it's also federated... and in this case, federation means you can choose what groups you trust... but due to math'y concepts that I occasionally totally get upon being explained to me and then forget the moment someone asks me to explain to someone else, consensus is still enforced within the "slices" of groups you are following. You can imagine maybe the needs of an LGBT community and a Furry community might overlap, but they might not be the same, and maybe you'd be subscribed to just one or both, or neither. Or pick your other social groups, go wild. That said, I'm not sure how to make these "transactions" not public in this system, so it's very out there in the open, but since there's a voting system built-in maybe particular individuals won't be as liable for being attacked as individuals maintaining a blocklist are. Introducing a sliding-scale "social reputation system" may also introduce other dangerous problems, though I think Stellar's design is probably the least dangerous of all of these since it probably will still keep abusers out of a particular targeted group, but will still allow marginalized-but-not-recognized-by-larger groups avenues to set up their own slices as well.
  • "Charging" for distributing messages: Hoo boy, this one's going to be controversial! This was suggested to me by someone smart in the whole distributed technology space. It's not necessarily what we would normally consider real money that would be charged to distribute things... it could be a kind of "whuffie" cryptocurrency that you have to pay. Well the upside to this is it would keep low-funded abusers out of a system... the downside is that you've now basically powered your decentralized social network through pay-to-play capitalism. Unfortunately, even if the cryptocurrency is just some "social media fun money", imaginary currencies have a way of turning into real currencies; see paying for in-game currency in any massively multiplayer game ever. I don't think this gives us the power dynamics we want in our system, but it's worth noting that "it's one way to do it"... with serious side effects.
  • Web of trust / Friend of a Friend networks: Well researched in crypto systems, though nobody's built really good UIs for them. Still, a lot of potential if the system was somehow made friendly and didn't require showing up to a nerd-heavy "key-signing party"... if the system could have marking who you trust and who you don't (and not just as in terms of verifying keys) built as an elegant part of the UI, then yes I think this could be a good component for recognizing who you might allow to send you messages. There are also risks in having these associations be completely public, though I think web of trust systems don't necessarily have to be public... you can recurse outward from the individuals you do already know. (Edit: My friend ArneBab suggests that looking at how Freenet handles its web of trust would be a good starting point for someone wishing to research this. I have 0 experience with Freenet, but here are some resources.)
  • Distributed recommendation systems: Think of recommender systems in (sorry for the centralized system references) Amazon, Netflix, or any of the major social networks (Twitter, Facebook, etc). Is there a way to tell if someone or some message may be relevant to you, depending on who else you follow? Almost nobody seems to be doing research here, but not quite nobody; here's one paper: Collaborative Filtering with Privacy. Would it work? I have no idea, but the paper's title sure sounds compelling. (Edit: ArneBab also points out that credence-p2p might also be useful to look at. Relevant papers here.)
  • Good ol' Bayesian filtering: Unfortunately, I think that there are too many alternate routes of attack for just processing a message's statistical contents to be good enough, though I think it's probably a good component of an anti-abuse system. In fact, maybe we should be talking about solutions that can use multiple components, and be very adaptive...
  • Distributed machine learning sets: Probably way too computationally expensive to run in a decentralized network, but maybe I'm wrong. Maybe this can be done in the right way, but I get the impression that without the training dataset it's probably not useful? Prove me wrong! But I also just don't know enough about machine learning. Has the right property of being adaptive, though.
  • Genetic programs: Okay, I hear you saying, "what?? genetic programming?? as in programs that evolve?" It's a field of study that has quite a bit of research behind it, but very little application in the real world... but it might be a good basis for filtering systems in a federated network (I'm beginning to explore this but I have no idea if it will bear fruit). Programs might evolve on your machine and mine which adapt to the changing nature of social attacks. And best of all, in a distributed network, we might be able to send our genetic anti-abuse programs to each other... and they could breed and make new anti-abuse baby programs! However, for this to work the programs would have to carry part of the information of their "experiences" from parent to child. After all, a program isn't very likely to randomly stumble onto the fact that a hateful group has started using "cuck" as a slur. But programs keep information around while they run, and it's possible that parent programs could teach wordlists and other information to their children, or to other programs. And if you already have a trust network, your programs could propagate their techniques and information with each other. (There's a risk of a side channel attack though: you might be able to find some of the content of information sent/received by checking the wordlists, etc. being passed around by these programs.) (You'd definitely want your programs sandboxed if you took this route, and I think it would be good for filtering only... if you expose output methods, your programs might start talking on the network, and who knows what would happen!) One big upside to this is that if it worked, it should work in a distributed system... you're effectively bringing the anti-abuse hamster cages together now and then. However, you do get into an ontology problem... if these programs are making up wordlists and binding them to generated symbols, you're effectively generating a new language. That's not too far from human-generated language, and so at that point you're talking about a computer-generated natural language... but I think there may be evolutionary incentive to agree upon terms. Setting up the "fitness" of the program (same with the machine learning route) would also have to involve determining what filtering is useful / isn't useful to the user of the program, and that's a whole challenging problem domain of its own (though you could start with just manually marking correct/incorrect the way people train their spam filters with spam/ham). But... okay by now this sounds pretty far-fetched, I know, but I think it has some promise... I'm beginning to explore it with a derivative of some of the ideas from PushGP. I'm not sure if any of these ideas will work but I think this is both the most entertainingly exciting and crazy at the same time. (On another side, I also think there's an untapped potential for roguelike AI that's driven by genetic algorithms...) There's definitely one huge downside to this though, even if it was effective (the same problem machine learning groups have)... the programs would be nearly unreadable to humans! Would this really be the only source of information you'd want to trust?
  • Expert / constraint based systems: Everyone's super into "machine learning" based systems right now, but it's hard to tell what on earth those systems are doing, even when their results are impressive (not far off from genetic algorithms, as above! but genetic algorithms may not require the same crazy large centralized datasets that machine learning systems tend to). Luckily there's a whole other branch of AI involving "expert systems", "symbolic reasoning", etc. The most promising of these I think is the propagator model by Sussman, Radul, and many others (if you've seen the constraint system in SICP, this is a grandchild of that design). One interesting thing about the propagator model is that it can come to conclusions from exploring many different sources, and it can tell you how it came to those conclusions. These systems are incredible and under-explored, though there's a catch: usually they're hand-wired, or the rules are added manually (which is partly how you can tell where the conclusions came from, since the symbols for those sources may be labeled by a human... but who knows, maybe there's a way to map a machine's concept of some term to a human's anyway). I think this probably won't be adaptive enough for the fast-changing world of different attack structures... but! but! we've explored a lot of other ideas above, and maybe you have some combination of a reputation system, a genetic programming system, etc., and this branch of study could be a great route to glue those very differing systems together and get a sense of what may be safe / unsafe from different sources... and at least understand how each source, on its macro level, contributed to a conclusion about whether or not to trust a message or individual.
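To ground the simplest route on this list, the federated blocklist, here is a minimal sketch. The `Block` activity type and the context URL come from the ActivityStreams vocabulary that ActivityPub uses; everything else (the helper names `make_block` and `merge_blocklists`, the example.org ids, and the "at least N moderators must agree" rule) is an assumption made up for illustration, not something the standard defines.

```python
from collections import Counter

AS2_CONTEXT = "https://www.w3.org/ns/activitystreams"


def make_block(actor_id, target_id):
    """Build a plain-JSON ActivityStreams Block activity."""
    return {
        "@context": AS2_CONTEXT,
        "type": "Block",
        "actor": actor_id,    # who is doing the blocking
        "object": target_id,  # who is being blocked
    }


def merge_blocklists(activities, min_moderators=2):
    """Return the ids blocked by at least `min_moderators` distinct actors.

    This is one hypothetical merge policy for a group-moderated list.
    """
    votes = set()
    for activity in activities:
        if activity.get("type") == "Block":
            # A set means a moderator repeating a block still counts once.
            votes.add((activity["actor"], activity["object"]))
    counts = Counter(target for _actor, target in votes)
    return {target for target, n in counts.items() if n >= min_moderators}


# Hypothetical ids, for illustration only.
mods = ["https://a.example.org/users/mod1", "https://b.example.org/users/mod2"]
troll = "https://c.example.org/users/troll"
bystander = "https://c.example.org/users/bystander"
shared = [make_block(mods[0], troll), make_block(mods[1], troll),
          make_block(mods[0], bystander), make_block(mods[0], bystander)]
blocked = merge_blocklists(shared)
```

Requiring agreement between moderators is only one way to squash several lists together, and it does nothing about the risk to the maintainers themselves that the blocklist item raises.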

Okay, well that's it I think! Those are all the routes I've been thinking about. None of these routes are proven, but I hope that gives some evidence that there are avenues worth exploring... and that there is likely hope for the federated web to protect people... and maybe we could even do it better than the silos. After all, if we could do filtering as well as the big orgs, even if it were just at or nearly at the same level (which isn't as good as I'd like), that's already a win: it would mean we could protect people, and also preserve the autonomy of marginalized groups... who aren't very likely to be well protected by centralized regimes if push really does come to shove.
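As one small illustration of the web of trust route, "recursing outward from the individuals you do already know" could look something like the sketch below. It is a toy under stated assumptions: the function name `trust_scores`, the hop limit, and the halving of trust per hop are all invented for the example and are not taken from Freenet or any existing system.

```python
from collections import deque


def trust_scores(graph, me, max_hops=2, decay=0.5):
    """Walk outward from `me` and assign each reachable person a trust score.

    `graph` maps a person to the people they say they trust. Trust shrinks
    by `decay` per hop, and nobody beyond `max_hops` gets a score at all.
    """
    scores = {me: 1.0}
    queue = deque([(me, 0)])
    while queue:
        person, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not look past the horizon
        for friend in graph.get(person, ()):
            if friend not in scores:  # breadth-first: shortest path wins
                scores[friend] = scores[person] * decay
                queue.append((friend, hops + 1))
    return scores


# Hypothetical trust graph: me -> alice -> bob -> carol.
graph = {"me": ["alice"], "alice": ["bob"], "bob": ["carol"]}
scores = trust_scores(graph, "me")
```

With these numbers alice scores 0.5, bob 0.25, and carol is simply unknown. A filter could then let messages through from anyone above some threshold and route everyone else to a moderated path. Note that each participant only needs their own local view of the graph, so the associations do not have to be published.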

I hope that inspires some people! If you have other routes that should be added to this list or you're exploring or would like to explore one of these directions, please contact me. Once the W3C Social Working Group wraps up, I'm to be co-chair of the following Social Community Group, and this is something we want to explore there.

Update: I'm happy to see that the Matrix folks also see this as "the single biggest existential threat" and "a problem that the whole decentralised web community has in common"... apparently they already have been looking at the Stellar approach. More from their FOSDEM talk slides. I agree that this is a problem facing the whole decentralized web, and I'm glad / hopeful that there's interest in working together. Now's a good time to be implementing and experimenting!

An even more distributed ActivityPub

By Christopher Lemmer Webber on Thu 06 October 2016

So ActivityPub is nearing Candidate Recommendation status. If you want to hear a lot more about that whole process of getting there, and my recent trip to TPAC, and more, I wrote a post on the MediaGoblin blog about it.

Last night my brother Stephen came over and he was talking about how he wished ActivityPub was more of a "transactional" system. I've been thinking about this myself. ActivityPub as it is designed is made for the social network of 2014 more or less: trying to reproduce what the silos do, which is mutate a big database for specific objects, but reproduce that in a distributed way. Well, mutating distributed systems is a bit risky. Can we do better, without throwing out the majority of the system? I think it's possible, with a couple of tweaks.

  • The summary is to move to objects and pointers to objects. There's no mutation, only "changing" pointers (and even this is done via appending to a log, mostly).

    If you're familiar with git, you could think of the objects as, well, objects, and the pointers as branches.

    Except... the log isn't in the objects pointing at their previous revisions really, the logging is on the pointers:

    [pointer id] => [note content id]
    
  • There's (activitystreams) objects (which may be content addressed, to be more robust), and then "pointers" to those, via signed pointer-logs.

  • The only mutation in the system is that the "pointers", which are signed logs (substitute "ledger" for "logs" and I guess that makes it a "blockchain" loosely), are append-only structures that say where the new content is. If something changes a lot, it can have "checkpoints". So, you can ignore old stuff eventually.

  • Updating content means making a new object, and updating the pointer-log to point to it.

  • This of course leads to a problem: what identifier should objects use to point at each other? The "content" id, or the "pointer-log" id? One route is that when one object links to another object, it could link to both the pointer-log id and the object id, but that hardly seems desirable...

  • Maybe the best route is to have all content ids point back at their official log id... this isn't as crazy as it sounds! Have a three step process for creating a brand new object:

    • Open a new pointer-log, which is empty, and get the identifier
    • Create the new object with all its content, and also add a link back to the pointer-log in the content's body
    • Add the new object as the first item in the pointer-log
  • At this point, I think we can get rid of all side effects in ActivityPub! The only mutation is appending to that pointer-log. As for everything else:

    • Create just means "This is the first time you've seen this object." And in fact, we could probably drop Create in a system like this, because we don't need it.
    • Update is really just informing that there's a new entry on the pointer-log.
    • Delete... well, you can delete your own copy. You're mostly informing other servers to delete their copy, but they have a choice if they really will... though that's always been true! You now can also switch to the nice property that removing old content is now really garbage collection :)
  • Addressing and distribution still happen in the same ways they did before, I assume? So, you still won't get access to an object unless you have permissions? Though that gets more confusing if you use the (optional) content addressed storage here.

  • You now get a whole lot of things for free:

    • You have a built in history log of everything
    • Even if someone else's node goes down, you can keep a copy of all their content, and keep around the signatures to show that yeah, that really was the content they put there!
    • You could theoretically distribute storage pretty nicely
    • Updates/deletes are less dangerous
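The object-plus-pointer-log design sketched in the list above is small enough to try out in code. This is a toy sketch, not a proposal for the spec: the `Store` class and its method names are hypothetical, the `pointerLog` field name is made up, and an HMAC with a shared key stands in for the real public-key signatures a federated system would need.

```python
import hashlib
import hmac
import json
import uuid


class Store:
    """Toy store: immutable content-addressed objects plus append-only pointer-logs."""

    def __init__(self, signing_key):
        self._key = signing_key  # stand-in for a real private key
        self.objects = {}        # content id -> object (never mutated)
        self.logs = {}           # pointer-log id -> list of signed entries

    def _sign(self, text):
        return hmac.new(self._key, text.encode(), hashlib.sha256).hexdigest()

    def open_log(self):
        """Step 1: open an empty pointer-log and get its identifier."""
        log_id = "log:" + uuid.uuid4().hex
        self.logs[log_id] = []
        return log_id

    def put(self, log_id, content):
        """Steps 2 and 3: store a new object that links back to its log,
        then append it to the log. Updating content is the same operation."""
        obj = dict(content, pointerLog=log_id)
        body = json.dumps(obj, sort_keys=True)
        content_id = "sha256:" + hashlib.sha256(body.encode()).hexdigest()
        self.objects[content_id] = obj
        entry = {"object": content_id, "sig": self._sign(log_id + content_id)}
        self.logs[log_id].append(entry)  # the only mutation in the system
        return content_id

    def current(self, log_id):
        """Follow the pointer: the newest log entry names the current object."""
        return self.objects[self.logs[log_id][-1]["object"]]

    def verify(self, log_id):
        """Check every entry's signature, e.g. on a copy kept by another node."""
        return all(
            hmac.compare_digest(e["sig"], self._sign(log_id + e["object"]))
            for e in self.logs[log_id]
        )


store = Store(b"demo-key")
log_id = store.open_log()
first = store.put(log_id, {"type": "Note", "content": "hello"})
second = store.put(log_id, {"type": "Note", "content": "hello, world"})
```

After the second `put`, `store.current(log_id)` returns the revised note while the original is still available under `first`, which is the built-in history described above. Dropping old objects that no log entry needs any more would then be plain garbage collection.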

(Thanks to Steve for encouraging me to think this through more clearly, and lending your own thoughts, a lot of which is represented here! Thanks also to Manu Sporny who was the first to get me thinking along these lines with some comments at TPAC. Though, any mistakes in the design are mine...)

Of course, you can hit even more distributed-system-nerd points by tossing in the possibility of encrypting everything in the system, but let's leave that as an exercise for the reader. (It's not too much extra work if you already have public keys on profiles.)

Anyway, is this likely to happen? Well, time is running out in the group, so I'm unlikely to push for it in this iteration. But the good news, as I said, is that I think it can be built on top without too much extra work... The systems might even be straight-up compatible, and eventually the old mutation-heavy-system could be considered the "crufty" way of doing things.

Architectural astronaut'ing? Maybe! Fun to think about! Hopefully fun to explore. Gotta get the 2014-made-distributed version of the social web out first though. :)

Activipy v0.1 released!

By Christopher Lemmer Webber on Tue 03 November 2015

Hello all! I'm excited to announce v0.1 of Activipy. This is a new library targeting ActivityStreams 2.0.

If you're interested in building and expressing the information of a web application which contains social networking features, Activipy may be a great place to start.

Some things I think are interesting about Activipy:

  • It wraps ActivityStreams documents in pythonic style objects.
  • It has a nice and extensible method dispatch system that even works well with ActivityStreams/json-ld's composite types.
  • It has an "Environment" feature: different applications might need to represent different vocabularies or extensions, and also might need to hook up entirely different sets of objects.
  • It hits a good middle ground in keeping things simple, until you need complexity. Everything's "just json", until you need to get into extension-land, in which case json-ld features are introduced. (Under the hood, that's always been there, but users don't necessarily need to understand json-ld to work with it.)
  • Good docs! I think! Or I worked really hard on them, at least!

As you may have guessed, this has a lot to do with our work on federation and the Social Working Group. I intend to build some interesting things on top of this myself.

In the meantime, I spent a lot of time on the docs, so I hope you find reading them to be enjoyable, and maybe you can build something neat with it? If you do, I'd love to hear about it!