Archives

Tags

Goodbye 2015, Hello 2016

By Christopher Allan Webber on Mon 04 January 2016

I'm sitting on a train traveling from Illinois to California, the long stretch of a journey from Madison to San Francisco. Morgan sits next to me. We are staring out the windows of the observation deck of this train as we watch the snow covered mountains pass by. I am feeling more relaxed and at peace than I have in years.

2016 is opening in a big way for me. As you may have heard (I mentioned it in the last State of the Goblin post) MediaGoblin was accepted into the Stripe Open Source Retreat program. Basically, Stripe gives us no-strings-attached funding for me to advance our work on MediaGoblin, but they wanted me to work from their office during that time. Seems like quite a deal to me! Unfortunately it does mean leaving Morgan behind in Madison for that time period. But that's why we splurged on a fancy train car and why she's joining me in San Francisco for the first week, so we can spend some quality time together. (Plus, Morgan has a conference that first week in San Francisco anyway; double plus, Amtrak has an extremely generous baggage policy so I'm able to get all of the belongings I need for that period shipped along with me fairly easily.) Morgan and I have been talking about but not really taking a vacation for a while, so we decided the moving-scenery approach would be a nice way to do things. It's great... we're mostly reading and drinking tea and staring out the window at the beautiful passings-by. I could hardly imagine a nicer send-off. (So yeah, if you're considering taking such a journey with your loved ones, I recommend it.)

The passage of scenery leads to reflection on the passage of time. Now seems a good time to write a bit about 2015 and what it meant. It was a very eventful year for me. I have come recently to explain to people that "I live a magical and high-stress life"; 2015 evoked that well. From a personal standpoint, Morgan and I's relationship runs strong, maybe stronger than ever, and I am thankful for that. From the broader family standpoint, the graph advances steady at times with strong peaks and valleys, perhaps more pronounced than usual. Love, gain, success, loss... it feels that everything has happened this year. Our lives have also been rearranged dramatically in an attempt to help a family member in a time of need, and that has its own set of peaks and valleys, as is to be expected. But that is the stuff of life, and you do what you can when you can, and you try your best, and you hope that others will try their best, what happens from there happens, and you use it to plan the next round of doing the best you can.

That's all very vague I suppose, but many things feel too private to discuss so publicly. Nonetheless, I wanted to record the texture of the year.

So what in the way of, you know, that thing we call a "career"? Well, it has continued to be magical, in the way that I have had a lot of freedom to explore things and address issues I really care about. Receiving an award (particularly since I did not know I had even been a candidate ahead of being notified that I received it) has also been gratifying and reassuring in some ways; I regularly fear that I am not doing well enough at advancing the issues I care about, but clearly some people do, and that's nice. It has also continued to be high stress, in that the things I worry about feel very high stakes on a global level, and that the difficulty of accomplishing them also feels very strong, and of course many are not there yet. Nonetheless, there has been a lot of progress this year, though it has come with a worrying increase of scope in the number of things I am attempting to accomplish.

We're much nearer to 1.0 on MediaGoblin, which is a huge relief. Of course, this is mostly due to Jessica Tallon's hard work on getting federation in MediaGoblin working, and other MediaGoblin community memebers doing many other interesting things. Embarassingly, I have done a lot less on MediaGoblin than in the last few years. In a sense, this is okay, because the money from the campaign has been going to pay Jessica Tallon, and not myself. I still feel bad about it though. The good news is that the focus time from the Stripe retreat should allow me the space and focus to hopefully get 1.0 actually out the door. So that leads to strong optimism.

The reduced time spent coding on MediaGoblin proper has been deceptive, since most of the projects I've worked on have spun out of work I believe is essential for MediaGoblin's long-term success. I took a sabbatical from MediaGoblin proper mid-year to focus on two goals: advancing federation standards (and my own understanding of them), and advancing the state of free software deployment. (I'm aware of a whiff of yak fumes here, though for each I can't see how MediaGoblin can succeed in their present state.) I believe I have made a lot of progress in both areas. As for federation, I've worked hard in participating in the W3C Social Working Group, I have done some test implementations, and recently I became co-editor on ActivityPump. On deployment, much work has been done on the UserOps side, both in speaking and in actual work. After initially starting to try to use Salt/Ansible as a base and hitting limitations, then trying to build my own Salt/Ansible'esque system in Hy and then Guile and hitting limitations there too, I eventually came to look into (after much prodding) Guix. At the moment, I think it's the only foundation solid enough on which to build the tooling to get us out of this mess. I've made some contributions, albeit mostly minor, have begun promoting the project more heavily, and am trying to work towards getting more deployment tooling done for it (so little time though!). I'm also now dual booting between GuixSD and Debian, and that's nice.

(Speaking of, towards the end of the year I switched to a Minifree x200 on which I'm dual booting Debian and Guix. I believe this puts me much deeper into the "free software vegan" territory.)

I also believe that over the last year I have changed dramatically as a programmer. For nearly ten years I identified as a "python web developer", but I believe that identity no longer feels like an ideal description. One thing I have always been self conscious of is how little I've known about deeper computer science fundamentals. This has changed a lot, and I believe much of it has been spending so much time in the Guile and Scheme communities, and reading the copious interesting literature that is available there. My brother Steve and I also now often meet together and watch various programming lectures and discuss them, which has been both illuminating and also a great way to understand a side of my brother I never knew. It's a nice mix; I'm a very get-things-done person, he's a very theoretical person, and we're meeting partway in the middle and I think both of us are stretching our brains in ways we hadn't before. I feel like a different programmer than I was. A year and a half ago, I remember being on a bike ride with Steve and I remember complaining to him that I didn't understand why functional programmers are so obsessed with immutability... mutation is so useful, I exclaimed! Steve paused and said very carefully, "Well... mutation brings a lot of problems..." but I just didn't understand what he was getting at. Now I look back on that bike ride and wonder at the former-me taking that position.

(All that said though, I'm glad that I've had the background I have of being a "python web developer" first, for a matter of perspective...)

I do feel that much has changed in my life in this last year. There were hard things, but overall, life has been good to me, and I still am doing what I believe in and care about. Not everyone has that opportunity. And this train ride already points the way to a year that should be productive, and will certainly be eventful.

Anyway, that's enough navel-gazing-reflection, I suppose. One more navel-gaze: here's to the changed person on the other end of 2016. I hope I can do them justice. And I hope you can do yourself justice in 2016 too.

VCS friendly, patchable, document line wrapping

By Christopher Allan Webber on Thu 17 December 2015

If you do enough work in any sort of free software environment, you get used to doing lots of writing of documentation or all sorts of other things in some plaintext system which exports to some non-plaintext system. One way or another you have to decide: are you going to wrap your lines with newlines? And of course the answer should be "yes" because lines that trail all the way off the edge of your terminal is a sin against the plaintext gods, who are deceptively mighty, and whose wrath is to be feared (and blessings to be embraced). So okay, of course one line per paragraph is off the table. So what do you do?

For years I've taken the lazy way out. I'm an emacs user, and emacs comes with the `fill-paragraph' command, so conveniently mapped to M-q. So day in and day out I'm either whacking M-q now and then, or I'm being lazy and letting something like `auto-fill-mode' do the job. Overall this results in something rather pleasing to the plaintext-loving eye. If we take our first paragraph as an example, it would look like this:

If you do enough work in any sort of free software environment, you get used to
doing lots of writing of documentation or all sorts of other things in some
plaintext system which exports to some non-plaintext system.  One way or
another you have to decide: are you going to wrap your lines with newlines?
And of course the answer should be "yes" because lines that trail all the way
off the edge of your terminal is a sin against the plaintext gods, who are
deceptively mighty, and whose wrath is to be feared (and blessings to be
embraced).  So okay, of course one line per paragraph is off the table.  So
what do you do?

But my friends, you know as well as I do: this isn't actually good. And we know it's not good because one of the primary benefits of plaintext is that we have nice tools to diff it and patch it and check it into version control systems and so on. And the sad reality is, if you make a change at the start of a paragraph and then you re-fill (or re-wrap for you non-emacs folks) it, you are going to have a bad time! Why? Because imagine you and your friends are working on this document together, and you're working in some branch of your document, and then your friend Sarah or whoever sends you a patch and you're so excited to merge it, and she does a nice job and edits a bunch of paragraphs and re-wraps it or re-fills them because why wouldn't she do that, it's the best convention you have, so you happily merge it in and say thanks, you look forward to future edits, and then you go to merge in your own branch you've been working on privately, but oh god oh no you were working on your own overhaul which re-wrapped many of the same paragraphs and now there are merge conflicts everywhere.

That's not an imaginary possibility; if you've worked on a documentation project big enough, I suspect you've hit it. And hey, look, maybe you haven't hit it, because maybe most of your writing projects aren't so fast paced. But have you ever looked at your version control log? Ever done a `git/svn/foo blame', `git/svn/foo praise', or whatever convention? Eventually you can't figure out what commit anything came from, and my friends, that is a bad time.

In trying to please the plaintext gods, we have defiled their temple. Can we do better?

One interesting suggestion I've heard, but just can't get on board with, is to keep each sentence on its own line. It's a nice idea, and I want to like it, because the core idea is good: each sentence doesn't interfere with the one before or after it, so if you change a sentence, it's easy for both you and the computer to tell which one. This means you can check things in and out of version control, send and receive patches, and from that whole angle, things are great.

But it's a sin to the eye to have stuff scrolling off the edge of your terminal like that, and each sentence on its own line, well... it just confuses me. Let's re-look at that first paragraph again in this style:

If you do enough work in any sort of free software environment, you get used to doing lots of writing of documentation or all sorts of other things in some plaintext system which exports to some non-plaintext system.
One way or another you have to decide: are you going to wrap your lines with newlines?
And of course the answer should be "yes" because lines that trail all the way off the edge of your terminal is a sin against the plaintext gods, who are deceptively mighty, and whose wrath is to be feared (and blessings to be embraced).
So okay, of course one line per paragraph is off the table.
So what do you do?

Ugh, it's hard to put into words why this is so offensive to me. I guess it's because each sentence can get so long that it looks like the separation between sentence is a bigger break than the separation between paragraphs. And I just hate things scrolling off to the right like that. I don't want to be halfway through reading a word on my terminal and then have to jump back so I can keep reading it.

So no, this is not good either. But it is on the right track. Is there a way to get the best of both worlds?

Recently, when talking about this problem with my good friend David Thompson, I came to realize that there is a potentially great solution that makes a hybrid of the technical merits of the one-sentence-per-line approach and the visually pleasing merits of the wrap/fill-your-paragraph approach. And the answer is: put each sentence on its own line, and wrap each sentence!

This is best seen to be believed, so let's take a look at that first paragraph again... this time, as I typed it into my blogging system:

If you do enough work in any sort of free software environment, you get used
  to doing lots of writing of documentation or all sorts of other things in
  some plaintext system which exports to some non-plaintext system.
One way or another you have to decide: are you going to wrap your lines with
  newlines?
And of course the answer should be "yes" because lines that trail all the way
  off the edge of your terminal is a sin against the plaintext gods, who are
  deceptively mighty, and whose wrath is to be feared (and blessings to be
  embraced).
So okay, of course one line per paragraph is off the table.
So what do you do?

Yes, yes, yes! This is what we want! Now it looks good, and it merges good. And we still can preserve the multi-line separation between paragraphs. Also, you might notice that I continue each sentence by giving two spaces before its wrapped continuation, and I think that's an extra nice touch (but you don't have to do it).

This is how I'm writing all my documentation, and the style in which I will request all documentation for projects I start be written in, from now on. Now if you're writing an email, or something else that's meant to be read in plaintext as-is (you do read/write your email in plaintext, right?), then maybe you should just do the traditional fill paragraph approach. After all, you want that to look nice, and in many of those cases, the text doesn't change too much. But if you're writing something where the plaintext version is just intermediate, and you have some other export which is what people mostly will read, I think this is a rather dandy approach.

I hope you find it useful as well! Happy documentation hacking!

Parallels Between Code of Conducts and Copyleft (and their objectors)

By Christopher Allan Webber on Wed 09 December 2015

Here's a strawman I'm sure you've seen before:

Why all this extra process? Shouldn't we trust people to do the right thing? Wouldn't you rather people do the right thing because it's the right thing to do? People are mostly doing the right thing anyway! Trust in the hacker spirit to triumph!

The question is, is this an objection to copyleft, or is it an objection to code of conducts? I've seen objections raised to both that go along these lines. I think there's little coincidence, since both of them are objections to added process which define (and provide enforcement mechanisms for) doing the right thing.

Note that I don't think copyleft and code of conducts are exactly the same thing. They aren't, and the things they try to prevent are probably not proportional.

But I do think that there's an argument that achieving real world social justice involves a certain amount of process, laying the ground for what's permitted and isn't, and (if you have to, but hopefully you don't) a specified direction for requiring compliance with that correct behavior.

Curiously we also have people who are pro copyleft and strongly anti code of conduct, and the reverse. Maybe examining the parallels between objections to both might help identify that a supporter of one might consider that the other makes sense, too.

Update: This was originally posted to the pumpiverse and since cross-posting to my blog, at least one interesting response to it has been made there.

Another update: Sumana Harihareswara pointed out that not only had we probably talked about this when we hung out in May, but that she had previously written on the subject even before then! I had forgotten about reading this (I forget about a lot of things), though honestly, my post is probably a reflection of Sumana's original thoughts. She goes into more detail (and probably much better) than I did here... you should read her writing on the subject! It's good stuff!

Hash tables are easy (in Guile)

By Christopher Allan Webber on Mon 09 November 2015

As a programmer, I use hash tables of varying kinds pretty much all day, every day. But one of the odd and embarrassing parts of being a community-trained programmer is that I've never actually implemented one. Eek! Well, today I pulled an algorithms book off the shelf and decided to see how long it would take me to implement their simplest example in Guile. It turns out that it takes less than 25 lines of code to implement a basic hash table with O(1) best time, O(1) average time, and O(n) worst case time. The worst case won't be too common depending on how we size things so this isn't so bad, but we'll get into that as we go along.

Here's the code:

;;; Simple hash table implementation -- (C) 2015 Christopher Allan Webber
;;; Released under the "Any Free License 2015-11-05", whose terms are the following:
;;;   This code is released under any of the free software licenses listed on
;;;     https://www.gnu.org/licenses/license-list.html
;;;   which for archival purposes is
;;;     https://web.archive.org/web/20151105070140/http://www.gnu.org/licenses/license-list.html

(use-modules (srfi srfi-1))

(define (make-dumbhash size)
  "Make a dumb hash table: an array of buckets"
  (make-array '() size))

(define* (dumbhash-ref dumbhash key #:optional (default #f))
  "Pull a value out of a dumbhash"
  (let* ((hashed-key (hash key (array-length dumbhash)))
         (bucket (array-ref dumbhash hashed-key)))
    (or (find (lambda (x) (equal? (car x) key))
              bucket)
        default)))

(define (dumbhash-set! dumbhash key val)
  "Set a value in a dumbhash"
  (let* ((hashed-key (hash key (array-length dumbhash)))
         (bucket (array-ref dumbhash hashed-key)))
    ;; Only act if it's not already a member
    (if (not (find (lambda (x) (equal? (car x) key))
                   bucket))
        (array-set! dumbhash
                    ;; extend the bucket with the key-val pair
                    (cons (cons key val) bucket)
                    hashed-key))))

You might even notice that some of these lines are shared between dumbhash-ref and dumbhash-set!, so this could be even shorter. As-is, sans comments and docstrings, it's a mere 17 lines. That's nothing.

We also cheated a little: we're using hash and equal? to generate a hash and to test for equality, which are arguably the hard parts of the job. But these are provided by Guile, and it's one less thing to worry about. Here's a brief demonstration though:

(equal? 'a 'a)               ;; => #t, or true
(equal? 'a 'b)               ;; => #f, or false
(equal? "same" "same")       ;; => #t
(equal? "same" "different")  ;; => #f
(hash "foo" 10)              ;; => 6
(hash 'bar 10)               ;; => 5

equal? is self-explanatory. The important thing to know about hash is that it'll pick a hash value for a key (the first parameter) for a hash table of some size (the second parameter).

So let's jump into an example. make-dumbhash is pretty simple. It just creates an array of whatever size we pass into it. Let's make a simple hash now:

scheme@(guile-user)> (define our-hash (make-dumbhash 8))
scheme@(guile-user)> our-hash
$39 = #(() () () () () () () ())

This literally made an array of 8 items which easy start out with the empty list as its value (that's nil for you common lispers). (You can ignore the $39 part, which may be different when you try this; Guile's REPL lets you refer to previous results at your prompt by number for fast & experimental hacking.)

So our implementation of hash tables is of fixed size, which doesn't limit the number of items we put into it, since buckets can contain multiple values in case of collision (and collisions tend to happen a lot in hash tables, and we come prepared for that), but this does mean we have an existing guess of about how many buckets we need for efficiency. (Resizing hash tables is left as an exercise for the reader.) Our hash table also uses simple linked lists for its buckets, which isn't too uncommon as it turns out.

Let's put something in the hash table. Animal noises are fun, so:

scheme@(guile-user)> (dumbhash-set! our-hash 'monkey 'ooh-ooh)
scheme@(guile-user)> our-hash
$40 = #(() () () ((monkey . ooh-ooh)) () () () ())

The monkey was appended to the third bucket. This makes sense, because the hash of monkey for size 8 is 3:

scheme@(guile-user)> (hash 'monkey 8)
$41 = 3

We can get back the monkey:

scheme@(guile-user)> (dumbhash-ref our-hash 'monkey)
$42 = (monkey . ooh-ooh)

We've set this up so that it returns a pair when we get a result, but if we try to access something that's not there, we get #f instead of a pair, unless we set a default value:

scheme@(guile-user)> (dumbhash-ref our-hash 'chameleon)
$43 = #f
scheme@(guile-user)> (dumbhash-ref our-hash 'chameleon 'not-here-yo)
$44 = not-here-yo

So let's try adding some more things to our-hash:

scheme@(guile-user)> (dumbhash-set! our-hash 'cat 'meow)
scheme@(guile-user)> (dumbhash-set! our-hash 'dog 'woof)
scheme@(guile-user)> (dumbhash-set! our-hash 'rat 'squeak)
scheme@(guile-user)> (dumbhash-set! our-hash 'horse 'neigh)
scheme@(guile-user)> ,pp our-hash
$45 = #(()
        ((horse . neigh))
        ()
        ((rat . squeak) (monkey . ooh-ooh))
        ((cat . meow))
        ()
        ((dog . woof))
        ())

(,pp is a shortcut to pretty-print something at the REPL, and I've taken the liberty of doing some extra alignment of its output for clarity.)

So we can see we have a collision in here, but it's no problem. Both rat and monkey are in the same bucket, but when we do a lookup of a hashtable in our implementation, we get a list back, and we search to see if that's in there.

We can figure out why this is O(1) average / best time, but O(n) worst time. Assume we made a hash table of the same size as the number of items we put in... assuming our hash procedure gives pretty good distribution, most of these things will end up in an empty bucket, and if they end up colliding with another item (as the rat and monkey did), no big deal, they're in a list. Even though linked lists are of O(n) complexity to traverse, assuming a properly sized hash table, most buckets don't contain any or many items. There's no guarantee of this though... it's entirely possible that we could have a table where all the entries end up in the same bucket. Luckily, given a reasonably sized hash table, this is unlikely. Of course, if we ended up making a hash table that started out with 8 buckets, and then we added 88 entries... collisions are guaranteed in that case. But I already said resizing hash tables is an exercise for the reader. :)

If you're familiar enough with any Scheme (or probably any other Lisp), reading dumbhash-ref and dumbhash-set! should be pretty self-explanatory. If not, go read an introductory Scheme tutorial, and come back! (Relatedly, I think there aren't many good introductory Guile tutorials... I have some ideas though!

What lessons are there to be learned from this post? One might be that Guile is a pretty dang nice hacking environment, which is true! Another might be that it's amazing how far I've gotten in my career without ever writing a hash table, which is also true! But the lesson I'd actually like to convey is: most of these topics are not as unapproachable as they seem. I had a long-time fear that I would never understand such code until I took the time to actually sit down and attempt to write it.

As an additional exercise for the reader, here's a puzzle: is the Any Free License this code released under actually a free license? And what commentary, if any, might the author be making? :)

Activipy v0.1 released!

By Christopher Allan Webber on Tue 03 November 2015

Hello all! I'm excited to announce v0.1 of Activipy. This is a new library targeting ActivityStreams 2.0.

If you're interested in building and expressing the information of a web application which contains social networking features, Activipy may be a great place to start.

Some things I think are interesting about Activipy:
  • It wraps ActivityStreams documents in pythonic style objects
  • Has a nice and extensible method dispatch system that even works well with ActivityStreams/json-ld's composite types.
  • It has an "Environment" feature: different applications might need to represent different vocabularies or extensions, and also might need to hook up entirely different sets of objects.
  • It hits a good middle ground in keeping things simple, until you need complexity. Everything's "just json", until you need to get into extension-land, in which case json-ld features are introduced. (Under the hood, that's always been there, but users don't necessarily need to understand json-ld to work with it.)
  • Good docs! I think! Or I worked really hard on them, at least!

As you may have guessed, this has a lot to do with our work on federation and the Social Working Group. I intend to build some interesting things on top of this myself.

In the meanwhile, I spent a lot of time on the docs, so I hope you find reading them to be enjoyable, and maybe you can build something neat with it? If you do, I'd love to hear about it!