Archives

Tags

DRM will unravel the Web

By Christopher Allan Webber on Mon 18 September 2017

I'm a web standards author and I participate in the W3C. I am co-editor of the ActivityPub protocol, participate in a few other community groups and working groups, and I consider it an honor to have been able to participate in the W3C process. What I am going to write here though represents me and my feelings alone. In a sense though, that makes this even more painful. This is a blogpost I don't have time to write, but here I am writing it; I am emotionally forced to push forward on this topic. The W3C has allowed DRM to move forward on the web through the EME specification (which is, to paraphrase Danny O'Brien from the EFF, a "DRM shaped hole where nothing else but DRM fits"). This threatens to unravel the web as we know it. How could this happen? How did we get here?

Like many of my generation, I grew up on the web, both as a citizen of this world and as a developer. "Web development", in one way or another, has principally been my work for my adult life, and how I have learned to be a programmer. The web is an enormous, astounding effort of many, many participants. Of course, Tim Berners-Lee is credited for much of it, and deserves much of this credit. I've had the pleasure of meeting Tim on a couple of occasions; when you meet Tim it's clear how deeply he cares about the web. Tim speaks quickly, as though he can't wait to get out the ideas that are so important to him, to try to help you understand how wonderful and exciting this system it is that we can build together. Then, as soon as he's done talking, he returns to his computer and gets to hacking on whatever software he's building to advance the web. You don't see this dedication to "keep your hands dirty" in the gears of the system very often, and it's a trait I admire. So it's very hard to reconcile that vision of Tim with someone who would intentionally unravel their own work... yet by allowing the W3C to approve DRM/EME, I believe that's what has happened.

I had an opportunity to tell Tim what I think about DRM and EME on the web, and unfortunately I blew it. At TPAC (W3C's big conference/gathering of the standards minds) last year, there was a protest against DRM outside. I was too busy to take part, but I did talk to a friend who is close to Tim and was frustrated about the protests happening outside. After I expressed that I sympathized with the protestors (and that I had even indeed protested myself in Boston), I explained my position to my friend. Apparently I was convincing enough where they encouraged me to talk to Tim and offer my perspective; they offered to flag them down for a chat. In fact Tim and I did speak over lunch, but -- although we had met in person before -- it was my first time talking to Tim one-on-one, and I was embarassed for that first interaction would me to be talking about DRM and what I was afraid was a sore subject for him. Instead we had a very pleasant conversation about the work I was doing on ActivityPub and some related friends' work on other standards (such as Linked Data Notifications, etc). It was a good conversation, but when it was over I had an enormous feeling of regret that has been on the back of my mind since.

Here then, is what I wish I had said.

Tim, I have read your article on why the W3C is supporting EME, and that I know you have thought about it a great deal. I think you believe what you are doing what is right for the web, but I believe you are making an enormous miscalculation. You have fought long and hard to build the web into the system it is... unfortunately, I think DRM threatens to undo all that work so thoroughly that allowing the W3C to effectively green-light DRM for the web will be, looking back on your life, your greatest regret.

You and I both know the dangers of DRM: it creates content that is illegal to operate on using any of the tooling you or I will ever be able to write. The power of DRM is not in its technology but in the surrounding laws; in the United States through the DMCA it is a criminal offense to inspect how DRM systems work or to talk about these vulnerabilities. DRM is also something that clearly cannot itself be implemented as a standard; it relies on proprietary secrecy in order to be able to function. Instead, EME defines a DRM-shaped hole, but we all know what goes into that hole... unfortunately, there's no way for you or I to build an open and interoperable system that can fit in that EME hole, because DRM is antithetical to an interoperable, open web.

I think, from reading your article, that you believe that DRM will be safely contained to just "premium movies", and so on. Perhaps if this were true, DRM would still be serious but not as enormous of a threat as I believe it is. In fact, we already know that DRM is being used by companies like John Deere to say that you don't even own your own tractor, car, etc. If DRM can apply to tractors, surely it will apply to more than just movies.

Indeed, there's good reason to believe that some companies will want to apply DRM to every layer of the web. Since the web has become a full-on "application delivery system", of course the same companies that apply DRM to software will want to apply DRM to their web software. The web has traditionally been a book which encourages being opened; I learned much of how to program on the web through that venerable "view source" right-click menu item of web browsers. However I fully expect with EME that we will see application authors begin to lock down HTML, CSS, Javascript, and every other bit of their web applications down with DRM. (I suppose in a sense this is already happening with javascript obfuscation and etc, but the web itself was at least a system of open standards where anyone could build an implementation and anyone could copy around files... with EME, this is no longer the case.) Look at the prevelance of DRM in proprietary applications elsewhere... once the option of a W3C-endorsed DRM-route exists, do you think these same application developers will not reach for it? But I think if you develop the web with the vision of it being humanity's greatest and most empowering knowledge system, you must be against this, because if enough of the web moves over to this model the assumptions and properties of the web as we've known it, as an open graph to free the world, cannot be upheld. I also know the true direction you'd like the web to go, one of linked data systems (of which ActivityPub is somewhat quietly one). Do you think such a world will be possible to build with DRM? I for one do not see how it is possible, but I'm afraid that's the path down which we are headed.

I'm sure you've thought of these things too, so what could be your reason for deciding to go ahead with supporting DRM anyway? My suspicion is it's two things contributing to this:

  1. Fear that the big players will pick up their ball and leave. I suspect there's fear of another WHATWG, that the big players will simply pick up their ball and leave.
  2. Most especially, and related to the above, I suspect the funding and membership structure of the W3C is having a large impact on this. Funding structures tend to have a large impact on decision making, as a kind of Conway's Law effect. W3C is reliant on its "thin gruel" of funding from member organizations (which means that large players tend to have a larger say in how the web is built today).

I suspect this is most of all what's driving the support for DRM within the W3C. However, I know a few W3C staff members who are clearly not excited about DRM, and two who have quit the organization over it, so it's not that EME is internally a technology that brings excitement to the organziation.

I suppose at this point, this is where I diverge with the things I could have said in the past and did not say as an appeal to not allow the W3C to endorse EME. Unfortunately, today EME made it to Recommendation. At the very least, I think the W3C could have gone forward with the Contributor Covenant proposed by the EFF, but did not. This is an enormous disappointment.

What do we do now? I think the best we can do at this point, as individual developers and users, is speak out against DRM and refuse to participate in it.

And Tim, if you're listening, perhaps there's no chance now to stop EME from becoming a Recommendation. But your voice can still carry weight. I encourage you to join in speaking out against the threat DRM brings to unravel the web.

Perhaps if we speak loud enough, and push hard enough, we can still save the web we love. But today is a sad say, and from here I'm afraid it is going to be an uphill battle.

EDIT: If you haven't yet read Cory Doctorow / the EFF's open letter to the W3C, you should.

An Icon I Need Tomorrow

By Christopher Allan Webber on Wed 09 August 2017

So this post is, like the story below, kind of a dumb distraction while I should be doing (or even blogging) more important things, but maybe it's entertaining? I may have borrowed a joke from The Thrilling Adventure Hour. Anyway I don't really know what's the point in posting this aside from demonstrating that I am down to narratively improvise, any time, any place (given the appropriate story cues at least). This starts off with a conversation with my friend Sumana on IRC.


<sumanah> hi paroneayea -- hope you are doing well
<sumanah> paroneayea: you may find this amusing -- I ran across this Noun
	  Project "collection" and distracted Jason for a few minutes as we
	  tried to suss out the theme
          https://thenounproject.com/petervandriel/collection/an-icon-i-need-tomorrow/

Some guy really needs this icons

<paroneayea> sumanah: hi!
<paroneayea> sumanah: wow
<sumanah> paroneayea: YES
<sumanah> what is happening tomorrow?
<sumanah> the world must know
<paroneayea> sumanah: I can only imagine a sexy murder cover-up with an office
	     intrigue
<sumanah> paroneayea: the insect? the egg?
<sumanah> I want a grand unified theory here
<sumanah> maybe it's like Upstream Color
<paroneayea> Two coworkers, who are secretly lovers, meet and exchange
	     "documents".  (It is a plan, a plan for murder.)
<paroneayea> An elderly relative dies.  Their walker is found alone.
<paroneayea> "The act is done", he whispered, passing by her desk.
<paroneayea> Drive to dump the body. The body is dumped.
<paroneayea> Sexy "ha ha it's over" scene.
<paroneayea> "Can we keep our relationship a secret?"
<paroneayea> Beware, the watercooler talk!
<paroneayea> Three construction workers return to their job and discuss, why
	     did our construction equipment move
* cdonnelly is on the edge of her seat
<paroneayea> [Combining two:] With any luck, the scheme won't break.
<paroneayea> Back, meeting at the lovers' home, dinner is cooking on the stove
	     as the couple talks about their life together.
<paroneayea> Suddenly, a call from the royal guard; "Madam, did you know that
	     your Relative(TM)/Former Lover has died????"
<paroneayea> The woman, mid-lipstick application, pauses.
<paroneayea> She drops the egg, which she was about to crack into the pot.  It
	     breaks, symbolically revealing the break in their plan.
<paroneayea> She accidentally brushes against the pound key.  "Madam are you
	     there?  Madam?"
<paroneayea> "Yes sorry, I'm writing down the address.  Yes, I'll be there in
	     half an hour."
<paroneayea> The phone is hung up, she is afraid.  "Calm down, let's finish
	     our dinner first."  They eat.
<paroneayea> Then they get ready to leave.  She loads her lipstick, which also
	     happened to be a bullet, into her gun.
<paroneayea> (gosh this is getting hard)
* sumanah was wondering whether paroneayea was going to notice that the icon
  looks like a lipstick but is labelled as a bullet
<paroneayea> (I already noticed, I tried to do "her lipstick, which is also a
	     bullet" but got lazy then)
* sumanah wonders: a safe deposit box? a self-storage locker?
<paroneayea> He locks the household, and on the way out, makes sure that the
	     key to the safe is still there.
<paroneayea> Now they are back at the construction site.
<paroneayea> The police officer is taking down the record.  Uhuh, yes.  Uhuh,
	     that sounds about right.
<paroneayea> As the record goes on, their story fills the page but also begins
	     to unravel.
* cdonnelly gasps
<paroneayea> The man nervously thumbs through his keys, and then drops them.
	     The officer says, "Hold on, you dropped your keys sir, let me
	     pick those up for you..." when he picks them up, he sees one key
	     in particular... encrusted in blood.
<paroneayea> "Wait a minute... 5608... this is.. exactly the key to the safe
	     of the man who died recently!"  "That can't be it, hold on a
	     minute..." "I'm calling this in.. sir and madam you're under
	     arrest..."
<paroneayea> The woman pulls the gun out of her makeup bag.  "Don't make me
	     shoot!  I've loaded this with lipstick and I'm dangerous!"
<paroneayea> * (But BAM BAM, it went off!  A bullet case / lipstick container,
	     labeled Checkov's Cosmetics, falls ominously to the floor.)
<paroneayea> They run!  Into the sewers.  A cockroach scatters out of the way.
<paroneayea> Squeak!  A Rat darts right out of the way, just in time.  "I
	     can't take any more of this.  I won't soil my suit running into
	     this sewer any more!"  "Damnit Jim, if you run out you'll give us
	     all away!"  "I don't care Janice!"  (I guess they have names
	     now.)
* cdonnelly giggles
* sumanah suspects a particular thing is going to happen next
* jasonaowen too
<paroneayea> "I won't let you do this!  I've worked too hard for this damnit!"
	     A gun is pulled!   A shot is fired!  The man falls to the floor,
	     gasping for breath!  "If I'm going down, you're going down with
	     me!"  The woman cries out in pain, and falls to the floor dying
	     as well... but what could have possibly killed her?
<paroneayea> The scene cuts to outside.  The police have pulled the bodies out
	     of the sewer.  "Gosh, this whole thing got complicated really
	     fast.  It sure did end quickly though.  What could have possibly
	     cut this whole series of narrative devices down all at once?"
	     "Take a look at this Harry."  What's that sticking out of her
	     chest?  Oh of course... the murder weapon..." <cotd>
<paroneayea> The guard seals the razor, which had cut it all down at once,
	     into a plastic bag.  "Occam's Razor company... it always seems to
	     bring a quick end to stories like these..."
<paroneayea> FIN
* sumanah applauds
* cdonnelly applauds
<paroneayea> brought to you by I have played way too many fucking RPGs
<jasonaowen> paroneayea: https://i.imgur.com/e9U73gy.gifv
<paroneayea> If nobody minds I'm going to post this on my blog or something,
	     might as well get a blogpost out of that distraction ;)
<jasonaowen> paroneayea: please do :)
<cdonnelly> paroneayea: +1 to blog post
<sumanah> paroneayea: I was like "I think today I have materially delayed the
	  development of ActivityPub" so I would encourage you to post it
<paroneayea> sumanah: haha :)
* kfogel reads backscroll, vows never to leave desk again

Iced tea and related cold beverages, at home and cheap

By Christopher Allan Webber on Mon 01 May 2017

In spring through fall, we always have a couple of pitchers (one decaf and one caffeinated) of "iced tea" in the fridge (though it's not necessarily the tea plant, if you care about that term... personally I think people are a bit snobby about having only one plant be The Acceptable Drink Herb). It's easy enough... just get a pitcher, add some tea bags to it (or loose tea in some sort of reusable loose-tea-holder or something, but honestly I'm too lazy for that) and some sweetener if that's your bag (we've been using stevia + erythritol sweetener stuff, sold as Truvia or whatever) and fill it with water, stirring once.

Now put it in the fridge.

You're done! You now have a delicious cold beverage that you can drink whenever and which costs about as little as a flavored beverage can.

"Tea" I like to drink:

  • Celestial Seasonings makes a lot of "fruity" herbal tea things that are pretty good in combination with black tea. I put in half black tea bags and half peach/raspberry/mint/whatever.
  • Chai, caffeinated or decaf. If you make this with Stevia or whatever, then pour it into a glass with a tablespoon of creamer, that's 1/5 or less the calories of the "iced chai" sold at a common coffee shop.
  • Some combination of: cucumber slices, lime wedges, lemon wedges, mint leaves.

Takes up some space in the fridge, but IMO it's worth it. And hey, better than filling it up with soda probably...

Gush: A stack based language eventually for genetic programming

By Christopher Allan Webber on Thu 06 April 2017

(This blogpost was going to be just about a project I'm working on Gush, but instead it's turned into a whole lot of backstory, and then a short tutorial about Gush. If you're just interested in the tutorial, skip to the bottom I guess.)

I recently wrote about possible routes for anti-abuse systems. One of the goofier routes I wrote about on there discussed genetic programming. I get the sense that few people believe I could be serious... in some ways, I'm not sure if I myself am serious. But the idea is so alluring! (And, let's be honest, entertaining!) Imagine if you had anti-abuse programs on your computer, and they're growing and evolving based on user feedback (hand-waving aside exactly what that feedback is, which might be the hardest problem), adapting to new threats somewhat invisibly from the user benefiting from them. They have a set of friends who have similar needs and concerns, and so their programs propagate and reproduce with programs in their trust network (along with their datasets, which may be taught to child programs also via a genetic program). Compelling! Would it work? I dunno.

A different, fun use case I can't get to leave my mind is genetic programs as enemy AI in roguelikes. After all, cellular automata are a class of programs frequently used to study genetic programs. And roguelikes aren't tooooooo far away from cellular automata where you beat things up. What if the roguelike adapted to you? Heck, maybe you could even collect and pit your genetic program roguelike monsters against your friends'. Roguelike Pokémon! (Except unlike in Pokémon, "evolving" actually really means "evolving".)

Speculating on the future with Lee Spector

Anyway, how did I get on this crazy kick? On the way back from LibrePlanet (which went quite well, and deserves its own blogpost), I had the good fortune to be able to meet up with Lee Spector. I had heard of Lee because my friend Bassam Kurdali works in the same building as him in Hampshire College, and Bassam had told me a few years ago about Lee's work on genetic algorithms. The system Lee Spector works on is called PushGP (Push is the stack language, and PushGP is Push used for genetic programming). Well, of course once I found out that Push was a lisp-based language, I was intrigued (hosted on lisps traditionally, but not always, and Push itself is kind of like Lisp meets Forth), and so by the time we met up I was relatively familiar with Lee's work.

Lee and I met up at the Haymarket Cafe, which is a friendly coffee shop in Northampton. I mentioned that I had just come from LibrePlanet where I had given a talk on The Lisp Machine and GNU. I was entertained that almost immediately after these words left my mouth, Lee dove into his personal experiences with lisp machines, and his longing for the kind of development experiences lisp machines gave you, which he hasn't been able to find since. That's kind of an aside from this blogpost I suppose, but it was nice that we had something immediately to connect on, including on a topic I had recently been exploring and talking about myself. Anyway, the conversation was pretty wild and wide-ranging.

I had also had the good fortune to speak to Gerald Sussman again at LibrePlanet this year (he also showed up to my talk and answered some questions for the audience). One thing I observed in talking with both Sussman and Spector is they're both very interested in thinking about where to bring computing in terms of examining biological systems, but there was a big difference in terms of their ideas; Sussman is very interested in holding machines "accountable", which seems to frequently also mean being able to examine how they came to conclusions. (You can see more of Sussman's thinking about this in the writeup I did of the first time I ever got to speak with Sussman when I was at FSF's 30th anniversary party... maybe I should try to capture some information about the most recent chat too, before it gets lost to the sands of time...) Spector, on the other hand, seems convinced that to make it to the next level of computing (and maybe even the next level of humanity), we have to be willing to give up on demanding that we can truly understand a system, and that we have to allow processes to run wild and develop into their own things. There's a tinge of Vernor Vinge's original vision of the Singularity, which is that there's a more advanced level of intelligence than humans are able to currently comprehend, and to understand it there you have to have crossed that boundary yourself, like the impossibility of seeing past the event horizon of a black hole. (Note that this is a pretty different definition of singularity from some of the current definitions of Singularity that have come since, which are more about possible outcomes of such a change rather than about the concept of an intellectual/technological event horizon itself.) That's a possible vision for the future of humanity, but it's also a vision of what maybe the right direction is for our programming too.

Of course, this vision that code may be generative in a way where the source is mostly unintelligible to us feels possibly at odds to our current understanding of software freedom, a movement which spends a lot of time talking about inspecting source code. The implications of that is a topic of interest of mine (I wrote about it in the FSF bulletin not too long ago) and I did needle Spector about it... what does copyright mean in a world where humans aren't writing software? Spector seems to acknowledge that it's a concern (he agrees that examples of seed-DRM and genetic patents in the case of Monsanto are troubling) but I got the sense that it's not his biggest interest... Spector thinks that the future of that side of software freedom and copyright might be that humanity realizes how absurd software copyright and patents and other intellectual restriction regimes are once things generate far enough. But I get the sense that more than talking about the legal/licensing aspects of the auto-generative future, Spector would rather be building it. Fair enough!

Anyway, somewhere along these lines I mentioned my interest in distributed anti-abuse systems. We talked about how more basic approaches such as Bayesian filtering might not be good enough to combat modern abuse beyond just spam especially, because the attack patterns taken change so frequently. Suddenly it hit me: I wonder whether or not genetic programs would work pretty well in a distributed system... after all, you could use your web of trust to breed the appropriate filtering programs with your friends' programs... would it work?

Anyway, on the ride back I began playing with some of Push's ideas, and (with a lot of helpful feedback from Lee Spector... thank you, Lee!) I started to put together a toy design for a language inspired by PushGP but with some properties that I think might be more applicable to an anti-abuse system that needs to keep around "memories" between generations. (Whether it's better or not, I don't really know yet.) So...

A little Gush tutorial

At this point, Gush exists. It has the stack based language down, but none of the genetic programming. Nonetheless, it's fun to hack around in, looks an awful lot like Push but also is far enough along to demonstrate its differences, and if you've never played with a stack based language before, it might be a good place to start.

Let's do some fun things. First of all, what does a Push program look like?

(1 2 + dup *)

Ok, that looks an awful lot like a lisp program, and yet not at the same time! If you've installed Gush, you can run this example:

> (use-modules (gush))
> (run '(1 2 + dup *))
$1 = (9)

Whoo, our program ran! But what happened? Gush programs operate on two primary stacks... there's an "exec" stack, which contains the program being evaluated in progress, and a "values" stack, with all the values currently built up by the program. Evaluation happens like so:

exec> '(1 2 + dup *)  ; initial exec stack
vals> '()             ; initial empty values stack
exec> '(2 + dup *)    ; [=> 1] popped off from exec stack
vals> '(1)            ; push 1 (a literal) onto values stack
exec> '(+ dup *)      ; [=> 2] popped off from exec stack
vals> '(2 1)          ; push 2 (a literal) onto values stack
exec> '(dup *)        ; [=> +] popped off of exec stack
vals> '(3)            ; apply `+', which is bound to an operation which takes
                      ;   top two numbers on the values stack and adds them,
                      ;   pushing result onto values stack... 2 + 1 = 3,
                      ;   so 2 and 1 are removed and 3 is added
exec> '(*)            ; [=> dup] popped off of exec stack
vals> '(3 3)          ; `dup' takes top item on stack and duplicates it
exec> '()             ; [=> *] popped off of exec stack
vals> '(9)            ; apply `*', which is bound to an operation which takes
                      ;   top two numbers on values stack and multiplies them,
                      ;   pushing result onto values stack... 3 * 3 = 9,
                      ;   so 3 and 3 are removed and 9 is added
                      ; Nothing left to do on exec, so we're done!

Okay, great! What else can we run?

;; Complicated arithmetic runs
> (run '(3 2 / 4 +))
$2 = (14/3)

;; We can assign variables to values and then reference them
> (run '(88 'foo define foo 22 +))
$3 = (110)

;; However, base operations "know" what types to apply for, and search
;; the stack... the string will be "skipped over" in search for
;; a number here.  This means that we can randomly generate code
;; and we won't run into type errors.
> (run '(1 "two" 3 +))
$4 = (4 "two")

;; Nested parentheses will be "unnested" and applied inline
> (run '(98 ("balloons" "red") 1 +))
$5 = (99 "red" "balloons")
;; so that's the same as
> (run '(98 "balloons" "red" 1 +))
$6 = (99 "red" "balloons")

;; Variables are actually stacks!  Which means we can build up
;; complicated operations on them...
> (run '('+ 'foo define      ; set foo to '(+)
         88 'foo var-push    ; append 88, so foo is now '(88 +)
         2 foo))             ; apply variable stack foo
$7 = (90)

;; Conditionals, etc also work.
> (run '(1 1 + 'b define  ; assign b to the value of 1 + 1
         2 b = if         ; check if b is 2
           "two b"        ; if-then clause
           "not two b"))  ; if-else clause
$8 = ("two b")

There's more to it than that, but that should get you started.

How is Gush different from Push?

Gush takes almost all its good ideas from Push, but there are two big differences.

Both Gush and Push try to avoid type errors. You can do all sorts of code mutation, and whether or not things will actually do anything useful is up for grabs, but it shouldn't crash to a halt. The way Push does it is via different stacks for each value type. This is really clever: it means that each operation applies to very specific types, and if you always know your input types carefully, you can always be safe on a type level and shouldn't have programs that unexpectedly crash (if there aren't enough values on the appropriate type stack, Push just no-ops).

But what if you want to run operations that might apply to more than one type? For example, in Gush you might do:

> (run '(1 2 / 4.5 +))
$9 = (6.5)

In Push, you'd probably do something like this:

> (run '(1 2 INTEGER./ FLOAT.FROMINTEGER 4.5 FLOAT.+))
$9 = (6.5)

(And of course, if you wanted rational numbers rather than just floats, you'd have to add another type stack to that...)

I really wanted generic methods that were able to determine what types they were able to apply to. For one thing, imagine you have a program that's doing a lot of complicated algebra... it should be able to operate on a succession of numbers without having to do type coercion and hit/miss on whether it chose the right of several typed operators, when it could just pick one operator that can apply to several items.

I also wanted to be able to add new types without much difficulty. As it is, I don't have to rewire anything to throw hash-maps into Gush:

> (run '(42 "meaning of life" make-hash hash-set))
$10 = (<hash-table>)

This would just work, no need to wire anything new up!

The way Gush does it is it uses generic operators which know how to check the predicates for each type, and which "search the stack" for values it knows it can apply. (It also no-ops if it can't find anything.) If bells and alarms are going off, you're not wrong! In Gush's current implementation, this does have the consequence that any given operation might be worst case O(n) of the size of the values stack! Owch! However, I'm not too worried. Gush checks how many operations every program takes (and has the option to bail out if a program is taking too many steps) and searching the stack after failing to match initially counts against a program. I figured that if programs are auto-generated, one fitness check can be how many steps it takes for the program to finish its computation, and so programs would be incentivized to keep the appropriate types near where they would be useful. I'm happy to say that it turns out I'm not the only one to think this; unknown to me when I started down this path, there's another Push derivative named Push-Forth which has only one stack altogether (not even separate stacks for exec and values!) and it does some similar-ish (but not quite the same) searching (or converging on a fixed point) by currying operations until the appropriate types are available. (Pretty cool stuff, but to be honest I have a hard time following the Push-Forth examples I've seen.) It comes to the same conclusion that by checking the number of steps a program takes to execute as part of its fitness, programs will be encouraged to keep types in good places anyway. However! There's more reasons to not despair; I'm relatively sure that there are some clever things that can be done with Gush's value stack so that predicate information is cached and looking for the right type can be made O(1). That has yet to be proven though. :)

The other feature Gush differentiates itself from Push is that Gush variables are stacks rather than single values. This ties in nicely with the classic Push approach that lists are unwrapped and applied to the exec stack at the time they are to be evaluated anyway, so it makes no behavioral change in the case that you just use "define" (which will always clobber the state of the stack, whether or not it exists, to replace it with a stack with a single element of the new value). But it also allows you to build up collections of information over time... or even collections of code. An individual variable can be appended to and modified as the program runs, so you could write or even modify subroutines to variables. (Code that writes code! Very lispy, but also a bit crazier because it might happen at runtime.) Push also has this feature, but it has one specific, restricted stack for it, named the CODE stack appropriately. Why have one of these stacks, when you could have an unlimited number of them?

That wasn't the original intent for having variables as values though; I only realized that you could make each variable into a kind of CODE stack later. My original intent was driven by a concern/need to be able to carry information from parent process to child process. I added a structure to Gush programs named "memories", and I figured that parent programs could "teach" their memories to child programs. So this was really just a hash table of symbols to stacks that persisted after the program ended (which, since if you use run-application you get the whole state of the program as the same immutable structure that is folded over during execution, you have that information attached to the application anyway). The idea of "memories" was that parent programs could have another program that, after spawning a child program, could "teach" the things they knew to their children (possibly either by simply copying, or more likely through a separate genetic program applied to that same data). That way a database of accrued information could be passed around from generation to generation... a type of genetic programming educational system (or folklore). So that was there, but then when I began adding variables around the same time I realized that a variable that contained a single value and which was pushed onto to the exec stack was, due to the way Push "unwraps" lists, exactly the same as if there was just that variable alone pushed onto the stack. Plus, it seemed to open up more paths by having the cool effect of having any variable be able to take on the power of Push's CODE stack. (Not to mention, removing the need for a redundant CODE stack!)

Are these really improvements? I don't know, it's hard to say without actually testing with some genetic programming examples. That part doesn't exist yet in Gush... probably I'll follow the current lead of the Push community and do mutation on the linearized Plush representation of Push code.

Anyway, I also want to give a huge thank you to Lee Spector. Lee has been really patient in answering a lot of questions, and even in the case that Gush does have some improvements, they're minor tweaks compared to the years of work and experimentation that has gone into the Push/PushGP designs.

And hey, it was a lot of fun! Not to mention, a great way to procrastinate on the things I should be working on...

Possible routes for distributed anti-abuse systems

By Christopher Allan Webber on Tue 04 April 2017

I work on federated standards and systems, particularly ActivityPub. Of course, if you work on this stuff, every now and then the question of "how do you deal with abuse?" very rightly comes up. Most recently Mastodon has gotten some attention, which is great! But of course, people are raising the question, can federation systems really protect people from abuse? (It's not the first time to come up either; at LibrePlanet in 2015 a number of us held a "social justice for federated free software systems" dinner and were discussing things then.) It's an important question to ask, and I'm afraid the answer is, "not reliably yet". But in this blogpost I hope to show that there may be some hope for the future.

A few things I think you want out of such a system:

  • It should actually be decentralized. It's possible to run a mega-node that everyone screens their content against, but then what's the point?
  • The most important thing is for the system to prevent attackers from being able to deliver hateful content. An attack in a social system means getting your message across, so that's what we don't want to happen.
  • But who are we protecting, and against what? It's difficult to know, because even very progressive groups often don't anticipate who they need to protect; "social justice" groups of the past are often exclusionary against other groups until they find out they need to be otherwise (eg in each of these important social movements, some prominent members have had problems including other social justice groups: racist suffragists, civil rights activists exclusionary against gay and lesbian groups, gay and lesbian groups exclusionary against transgender individuals...). The point is: if we haven't gotten it all right in the past, we might not get it all right in the present, so the most important thing is to allow communities to protect themselves from hate.

Of course, keep in mind that no technology system is going to be perfect; these are all imperfect tools for mitigation. But what technical decisions you make do also affect who is empowered in a system, so it's also still important to work on these, though none of them are panaceas.

With those core bits down, what strategies are available? There are a few I've been paying close attention to (keep in mind that I am an expert in zero of these routes at present):

  • Federated Blocklists: The easiest "starter" route. And good news! If you're using the ActivityPub standard, there's already a Block activity, and you could build up group-moderated collections of people to block. A decent first step, but I don't think it gets you very far; for one thing, being the maintainer of a public blocklist is a risky activity; trolls might use that information to attack you. That and merging/squashing blocklists might be awkward in this system.
  • Federated reputation systems: You could also take it a step further by using something like the Stellar consensus protocol (more info in paper form or even a graphic novel). Stellar is a cryptographically signed ledger. Okay, yes, that makes it a kind of blockchain (which will make some peoples' eyes glaze over, but technically a signed git repository is also a blockchain), but it's not necessarily restricted to use of cryptocurrencies... you can track any kinds of transactions with it. Which means we could also track blocklists, or even less binary reputation systems! But what's most interesting about Stellar is that it's also federated... and in this case, federation means you can choose what groups you trust... but due to math'y concepts that I occasionally totally get upon being explained to me and then forget the moment someone asks me to explain to someone else, consensus is still enforced within the "slices" of groups you are following. You can imagine maybe the needs of an LGBT community and a Furry community might overlap, but they might not be the same, and maybe you'd be subscribed to just one or both, or neither. Or pick your other social groups, go wild. That said, I'm not sure how to make these "transactions" not public in this system, so it's very out there in the open, but since there's a voting system built-in maybe particular individuals won't be as liable for being attacked as individuals maintaining a blocklist are. Introducing a sliding-scale "social reputation system" may also introduce other dangerous problems, though I think Stellar's design is probably the least dangerous of all of these since it probably will still keep abusers out of a particular targeted group, but will allow marginalized-but-not-recognized-by-larger groups still avenues to set up their own slices as well.
  • "Charging" for distributing messages: Hoo boy, this one's going to be controversial! This was suggested to me by someone smart in the whole distributed technology space. It's not necessarily what we would normally consider real money that would be charged to distribute things... it could be a kind of "whuffie" cryptocurrency that you have to pay. Well the upside to this is it would keep low-funded abusers out of a system... the downside is that you've now basically powered your decentralized social network through pay-to-play capitalism. Unfortunately, even if the cryptocurrency is just some "social media fun money", imaginary currencies have a way of turning into real currencies; see paying for in-game currency in any massively multiplayer game ever. I don't think this gives us the power dynamics we want in our system, but it's worth noting that "it's one way to do it"... with serious side effects.
  • Web of trust / Friend of a Friend networks: Well researched in crypto systems, though nobody's built really good UIs for them. Still, a lot of potential if the system was somehow made friendly and didn't require showing up to a nerd-heavy "key-signing party"... if the system could have marking who you trust and who you don't (and not just as in terms of verifying keys) built as an elegant part of the UI, then yes I think this could be a good component for recognizing who you might allow to send you messages. There are also risks in having these associations be completely public, though I think web of trust systems don't necessarily have to be public... you can recurse outward from the individuals you do already know. (Edit: My friend ArneBab suggests that looking at how Freenet handles its web of trust would be a good starting point for someone wishing to research this. I have 0 experience with Freenet, but here are some resources.)
  • Distributed recommendation systems: Think of recommender systems in (sorry for the centralized system references) Amazon, Netflix, or any of the major social networks (Twitter, Facebook, etc). Is there a way to tell if someone or some message may be relevant to you, depending on who else you follow? Almost nobody seems to be doing research here, but not quite nobody; here's one paper: Collaborative Filtering with Privacy. Would it work? I have no idea, but the paper's title sure sounds compelling. (Edit: ArneBab also points out that credence-p2p might also be useful to look at. Relevant papers here.)
  • Good ol' Bayesian filtering: Unfortunately, I think that there's too many alternate routes of attacks for just processing a message's statistical contents to be good enough, though I think it's probably a good component of an anti-abuse system. In fact, maybe we should be talking about solutions that can use multiple components, and be very adaptive...
  • Distributed machine learning sets: Probably way too computationally expensive to run in a decentralized network, but maybe I'm wrong. Maybe this can be done in a the right way, but I get the impression that without the training dataset it's probably not useful? Prove me wrong! But I also just don't know enough about machine learning. Has the right property of being adaptive, though.
  • Genetic programs: Okay, I hear you saying, "what?? genetic programming?? as in programs that evolve?" It's a field of study that has quite a bit of research behind it, but very little application in the real world... but it might be a good basis for filtering systems in a federated network (I'm beginning to explore this but I have no idea if it will bear fruit). Programs might evolve on your machine and mine which adapt to the changing nature of social attacks. And best of all, in a distributed network, we might be able to send our genetic anti-abuse programs to each other... and they could breed and make new anti-abuse baby programs! However, for this to work the programs would have to carry part of the information of their "experiences" from parent to child. After all, a program isn't going to very likely randomly bump into finding out that a hateful group has started using "cuck" as a slur. But programs keep information around while they run, and it's possible that parent programs could teach wordlists and other information to their children, or to other programs. And if you already have a trust network, your programs could propagate their techniques and information with each other. (There's a risk of a side channel attack though: you might be able to find some of the content of information sent/received by checking the wordlists or etc being passed around by these programs.) (You'd definitely want your programs sandboxed if you took this route, and I think it would be good for filtering only... if you expose output methods, your programs might start talking on the network, and who knows what would happen!) One big upside to this is that if it worked, it should work in a distributed system... you're effectively occasionally bringing the anti-abuse hamster cages together now and then. However, you do get into an ontology problem... if these programs are making up wordlists and binding them to generated symbols, you're effectively generating a new language. That's not too far from human-generated language, and so at that point you're talking about a computer-generated natural language... but I think there may be evolutionary incentive to agree upon terms. Setting up the "fitness" of the program (same with the machine learning route) would also have to involve determining what filtering is useful / isn't useful to the user of the program, and that's a whole challenging problem domain of its own (though you could start with just manually marking correct/incorrect the way people train their spam filters with spam/ham). But... okay by now this sounds pretty far-fetched, I know, but I think it has some promise... I'm beginning to explore it with a derivative of some of the ideas from PushGP. I'm not sure if any of these ideas will work but I think this is both the most entertainingly exciting and crazy at the same time. (On another side, I also think there's an untapped potential for roguelike AI that's driven by genetic algorithms...) There's definitely one huge downside to this though, even if it was effective (the same problem machine learning groups have)... the programs would be nearly unreadable to humans! Would this really be the only source of information you'd want to trust?
  • Expert / constraint based systems: Everyone's super into "machine learning" based systems right now, but it's hard to tell what on earth those systems are doing, even when their results are impressive (not far off from genetic algorithms, as above! but genetic algorithms may not require the same crazy large centralized datasets that machine learning systems tend to). Luckily there's a whole other branch of AI involving "expert systems" and "symbolic reasoning" and etc. The most promising of these I think is the propagator model by Sussman / Radul / and many others (if you've seen the constraint system in SICP, this is a grandchild of that design). One interesting thing about the propagator model is that it can come to conclusions from exploring many different sources, and it can tell you how it came to those conclusions. These systems are incredible and under-explored, though there's a catch: usually they're hand-wired, or the rules are added manually (which is partly how you can tell where the conclusions came from, since the symbols for those sources may be labeled by a human... but who knows, maybe there's a way to map a machines concept of some term to a human's anyway). I think this won't probably be adaptive enough for the fast-changing world of different attack structures... but! but! we've explored a lot of other ideas above, and maybe you have some combination of a reputation system, and a genetic programming system, and etc, and this branch of study could be a great route to glue those very differing systems together and get a sense of what may be safe / unsafe from different sources... and at least understand how each source, on its macro level, contributed to a conclusion about whether or not to trust a message or individual.

Okay, well that's it I think! Those are all the routes I've been thinking about. None of these routes are proven, but I hope that gives some evidence that there are avenues worth exploring... and that there is likely hope for the federated web to protect people... and maybe we could even do it better for the silos. After all, if we could do filtering as well as the big orgs, even if it were just at or nearly at the same level (which isn't as good as I'd like), that's already a win: it would mean we could protect people, and also preserve the autonomy of marginalized groups... who aren't very likely to be well protected by centralized regimes if push really does come to shove.

I hope that inspires some people! If you have other routes that should be added to this list or you're exploring or would like to explore one of these directions, please contact me. Once the W3C Social Working Group wraps up, I'm to be co-chair of the following Social Community Group, and this is something we want to explore there.

Update: I'm happy to see that the Matrix folks also see this as "the single biggest existential threat" and "a problem that the whole decentralised web community has in common"... apparently they already have been looking at the Stellar approach. More from their FOSDEM talk slides. I agree that this is a problem facing the whole decentralized web, and I'm glad / hopeful that there's interest in working together. Now's a good time to be implementing and experimenting!