Block o' conferencing reflections

By Christine Lemmer-Webber on Tue 29 October 2013

So this last month and a half I've done much more conferencing than I normally do; first was GNU 30th where we ran a MediaGoblin hackathon, the Google Summer of Code Mentor Summit 2013 where I ran a federation session and helped Karen Sandler run the Outreach Program for Women session, and the Blender Conference where I gave a talk (which was recorded and uploaded to YouTube, which I suppose I should mirror to MediaGoblin when I get time, or something).

There were a lot of good things that came out of these conferences for me (though I am always worrying while at conferences whether or not I am making the best use of time as it is near impossible to do "normal" tasks there). Spreading news about MediaGoblin was good, and I think I was successful there. More importantly though, I queried a lot of people for advice. I pestered a lot of people to try to get a sense of what's ahead, and I'm grateful to everyone, but especially to Deb Nicholson (as always, as MediaGoblin co-conspirator), Aeva Palecek (who puts up with hearing me think through nearly everything), Karen Sandler, Mike Linksvayer, Bradley Kuhn, John Sullivan, Leslie Hawthorne, Asheesh Laroia, and Ton Roosendaal, who all sat down with me at some point and gave me useful perspectives on how to take various strategies forward for MediaGoblin and related things. Given I'm now nearing the end of MediaGoblin's year of paid work from the MediaGoblin campaign, this was really important for me to figure out how to move ahead with the next year. I feel like I have a good sense of direction now and a set of loose plans that will (hopefully!) work out, and that's really important. Thank you, all, and thanks also to the many people I didn't even list because it would be too long.

I already called them out, but it was really great on this trip especially to talk to and observe Karen Sandler and Ton Roosendaal on the way they organize and plan their respective organizations. Karen gave me a lot of personal advice that I will not repeat here but which gave me some confidence and sense of capability (thanks Karen). I really admire both what Karen has done with the GNOME Foundation (especially in how it's been branching out to other things with Outreach Program for Women) and how large a universe of things Ton Roosendaal has helped cultivate with the Blender Foundation and Blender Institute. By the way (maybe it helps to know a bit about the Blender community?) I highly recommend watching Ton Roosendaal's "foundation feedback" talk. Maybe one of the most impressive things is Ton's approach to the Blender Foundation (which really is very minimal and just handles being a steward for the code side of things, etc.) and the Blender Institute (which is much more ambitious, has a large studio, employs many developers, funds open movie and game projects, and does large and bold things). One thing he talks about is that as soon as the community takes over an activity, such as doing training, the Blender Institute hands it off to the community and stops doing it so it can focus on other things that are not being worked on as strongly. A bold move; many organizations seem to have a super difficult time letting go of something once it's in their domain. But it works well; the Blender Institute focuses on growing the community into new areas, but when those areas are well enough established to be sustainable externally, it just lets them be! It's a surprising and refreshing approach in a world where even nonprofits seem to want to establish empires. Additionally, other interesting things happened in that talk and Ton's keynote; UI conversations have been strong and, while constructive, somewhat divisive in the community. Ton wrote an article (also discussed in the talk) called (Re)defining Blender and I think it does a good job of reframing the issue in a way that's constructive for everyone.

Related to that, there was quite a bit of racial diversity, but sadly not much gender diversity, at the Blender conference. Sadly I think Blender falls in the intersection of free software communities and 3d graphics, both of which really struggle with gender diversity. Nonetheless, the women who were there all were doing amazing, powerful, and well-respected things, from coding of important Blender tooling, to the authorship of amazing short films, to the use of Blender for fine art, to 3d printing, to anthropological reconstruction, to I am sure some more things from people I did not talk to. But one form of diversity that Aeva and I discussed that we were both impressed by was the large diversity of types of things people are doing; from heart surgery training to animation to games to programming to fine art, people were really all over the place in the things they were accomplishing. That felt really great to see.

Speaking of Blender, it was pretty incredible to hang out in the same room as some of my "childhood heroes"... and by that I mean, I really didn't have many of those since I didn't watch a lot of non-animated television and that seems to be where people pick up the majority of their celebrity crushes, and so the age range at which some of my largest "heroes" developed was in late high school and early college, when I became obsessed with free software and especially the Blender community. I had befriended Bassam Kurdali some time ago, but it was also really great to hang out in the same room as people such as Andy Goralczyk and Pablo Vazquez, both to talk to them and for them to seem to take me seriously. I told them that if it weren't for being inspired by their creaturey artwork maybe I wouldn't have been so encouraged to continue on with my own creaturey stuff... maybe MediaGoblin wouldn't have a goblin and would be named something else! They seemed appreciative when I showed them Liberated Pixel Cup, the style guide, and the thing I was trying to prove (partly in response to the Open Movie Projects that they had both participated in): that distributed free culture projects are indeed possible if you do enough of the work up front (relevant maybe now due to the nature of the newly announced Project Gooseberry, which will be a lot more distributed than previous open movie projects). They both seemed excited and interested (we even talked about how a 3d version of the base could be done, even in the same style), and I felt really happy about that.

There are of course many more things that happened and plenty of other things I can go on about too; maybe one of the more memorable things about this stretch of time was the GNU 30th Hackathon. We had a good turnout:

GNU 30th hackathon turnout

And we were even lucky enough to have all Outreach Program for Women students attend!

OPW students!

I was really happy with this hackathon, and especially the opportunity to have everyone together. As we said goodbye to each person, I felt a little more sad, to the point where when Jessica Tallon and I said goodbye and split off at the ticketing area at the airport, I was nearly in a funk.

And this leads into something else. At all of these conferences I was asked: will you be coming back? Will we see you next year? Will you come to this other conference our project is running? And there's a temptation to do so... the curse of the traveler raises its head, and I feel that I want to see these people again and that I will miss them when I'm gone (and I will).

But how much conferencing should I do, really? I think the time invested in these conferences was worth it, but I will be glad to be done conferencing for some time. Previously in the year I had been invited to a bunch of conferences and for half a year I avoided all of them because I wanted to get real work done. Let's face it: there's certain kinds of useful work that's done in making connections and establishing/revitalizing community that can be done by conference-going, but it's also quite difficult to do Real Work (TM). (A certain exception: many projects have a role where someone really should be attending more conferences etc. for the purpose of making connections; it should be worked into the expectations of the position that this kills one's capacity to do all sorts of other kinds of work though.)

One thing Ton said in his talk (I'm paraphrasing) was "Every other day I get invited to another conference, and I turn down every one of them. Every conference you go to takes away a week of work, and there's too much work to be done." Too true! While I think I did the right thing by this set of conference attendance, it's time to hole back up in my apartment and get stuff done.

Time to get back to work... and there's plenty to do!

emacslisten (an idea)

By Christine Lemmer-Webber on Fri 11 October 2013

An idea I've wanted to pursue for some time now but never really have had time to work on is some kind of voice-activated emacs interface. (I'm proposing the name emacslisten here partly as a tribute to the super amazing emacspeak, which is kind of the reverse of this accessibility project.) Unfortunately, several attempts at this have been made, but as far as I know they all rely on Dragon Naturally Speaking. Given that this is nonfree, it's a non-starter for me (not to mention the fact that I want to use neither Windows nor Wine). What to do?

Here's a brief, and I mean really brief, sketch of how I think things maybe could work.

  • Write a python daemon using the gstreamer bindings for pocketsphinx and exposing a d-bus interface. (This tutorial worked for me by the way, though I did have to change gconfaudiosrc to pulsesrc... then it worked.) This will be where commands are actually "listened" for. It might, optionally, have an --interface mode with some kind of gtk dialog.
  • Write an emacs minor-mode to listen to those d-bus calls.
  • Probably, as for how it would work, it would be a bit more vi-style modal, but also contextually modal depending on what major-mode you're in in emacs (yes I know, confusing). So, you could jump in and out of write mode vs different kinds of command mode. The major mode you're in might affect the kind of commands you're restricted to; this might improve accuracy, since you could set pocketsphinx to a more limited subset of commands. (Presumably you could set up emacs to be able to speak to this process and switch out the command set also.)
  • Just as emacs binds every keybinding to a lisp function, every vocal command is bound to a function.
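To make the last two bullets a bit more concrete, here is a minimal sketch of what the phrase-to-function binding might look like on the python side. Everything in it is hypothetical: the VoiceCommandTable class, the "global" pseudo-mode, and the returned strings (standing in for whatever would actually get sent to emacs over d-bus) are assumptions for illustration, not part of any existing code. It leaves out the speech recognition and d-bus parts entirely.

```python
# Hypothetical sketch only: none of these names come from an existing project.
from typing import Callable, Dict, List

Command = Callable[[], str]


class VoiceCommandTable:
    """Maps spoken phrases to functions, grouped by editor mode."""

    def __init__(self) -> None:
        # mode name -> {phrase -> function}; "global" applies in every mode
        self._tables: Dict[str, Dict[str, Command]] = {}

    def bind(self, mode: str, phrase: str, func: Command) -> None:
        self._tables.setdefault(mode, {})[phrase.lower()] = func

    def vocabulary(self, mode: str) -> List[str]:
        """Phrases valid in this mode (the limited set a recognizer could be told about)."""
        phrases = set(self._tables.get("global", {}))
        phrases.update(self._tables.get(mode, {}))
        return sorted(phrases)

    def dispatch(self, mode: str, phrase: str) -> str:
        """Run the function bound to the phrase, preferring mode-specific bindings."""
        phrase = phrase.lower().strip()
        for table in (self._tables.get(mode, {}), self._tables.get("global", {})):
            if phrase in table:
                return table[phrase]()
        return "unrecognized: " + phrase


table = VoiceCommandTable()
# The returned strings stand in for the emacs command that would be invoked.
table.bind("global", "save file", lambda: "save-buffer")
table.bind("python-mode", "new function", lambda: "insert-def")  # made-up command name
table.bind("org-mode", "new heading", lambda: "org-insert-heading")

print(table.vocabulary("org-mode"))
print(table.dispatch("python-mode", "New Function"))
```

The vocabulary() method is the piece that matters for accuracy: when emacs reports a major-mode change, the daemon could hand only that mode's phrases to the recognizer.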

Crazy? Probably. Crazy enough to work? Maybe.

I wish I had time to run this project. And admittedly, there's a common, unfortunate pattern amongst hackers that when they're having wrist problems, they're desperate to figure out some kind of voice activated editing software. But when their wrists are okay enough, they're too busy to actually care to invest that time in it.

I can't run this project myself, but I could help with it, if someone else would be willing to take the lead on it. Anyone interested?

EDIT: In case you're wondering, Tavis Rudd's "Using Python to Code By Voice" is definitely an inspiration. As far as I know he hasn't made a release of the software though (he did kindly offer to send me the source at one point, but I didn't want to get Dragon Naturally Speaking, so I never went through with it). It might be a great base though, and anyway, it's definitely a source of inspiration. I'd really love to see a public release of the code!

EDIT / UPDATE 2: I started working on this. Not much to see yet, but you can speak and words appear in the minibuffer. Get it here and help improve it!

Free software password manager roundup

By Christine Lemmer-Webber on Sun 06 October 2013

So, I've had a goofy system that I homerolled for storing randomly generated passwords that I keep encrypted. Let's just say that it's... not ideal and doesn't scale. Really I should be using something that other people have written. So I decided to look around at my options. Here seems to be the best survey of things I could find:

  • lastpass: Irony of putting this first is also the reason in some ways it's first: this is NOT an option, because it's proprietary. That's a non-starter for me already, but it should be an extra non-starter post PRISM. I don't care if the LastPass people say they haven't been contacted/forced to hand over stuff; there's good proof that they could be forced into it. See LavaBit. And if they were forced to do so, you're essentially handing over all your stuff. Also, there's every potential of user stuff all being leaked at once. That's not security.

    Still, people seem to really like the feature set, so this gets a double mention here: it's something that I think is unacceptable/worthless to use, but maybe could be a source of inspiration to free software packages. I've been shown the way the program looks, and it does look and seem to function nicely. That's something free software packages should try to live up to: browser integration with auto form-filling, and a nice, friendly looking UI.

  • FireFox Sync: FireFox Sync is a really cool project; a "least authority" approach to storing passwords, and the fact that you can set up server-side storage of your passwords and have all your machines sync together seems pretty neat, especially because Mozilla can't even read your data. That's pretty exciting.

    However, what advantage would I get of this over setting up my own password sync with something like git-annex? And does it really do much useful for non-browser-things? It's hard for me to tell.

    Still, even though I think it's not for me, I'm glad the project exists. I'm glad that Mozilla took the right way of doing the "even we can't see your data" thing, and I hope that post-PRISM they see the value of this work and keep it as-is!

  • spd: Let's face it, as a plaintext junkie, spd is more or less what I want in a sense... a simple single-file gpg-encrypted password manager. Seems perfect! I could just sync it across machines with git-annex. (And syncing with git-annex, actually, is probably how I want to sync across everything.) It's minimal, and what's extra cool is that it's a good fit for an organization that needs to share passwords; maybe the sysadmins can access X and Y, and the PR people can access Y and Z. spd handles that, and with a simple file format... I think that's pretty awesome. And check out the screenshot on the site. It's so cool and terminal'y, and I like that you can copy-pasta from a terminal, someone can be sitting behind you, and not see what your password is... all while still being terminal based! Nice.

    There's just one problem: there's no browser integration. I've been coming to think that browser integration is probably pretty necessary these days to keep up with the massive number of passwords we have to have without reusing the same shitty ones over and over. So, there's that.

    Maybe a browser extension could be added; I don't have the time to write it sadly. Still, the format seems very simple, and probably this is the closest to the kind of system I want on a technical level.

  • pass: Okay so pass is similar to spd, simple, and probably a good solution if you're a command line nerd. I'm still sayin' it could use browser integration.

    It's also apparently written in bash, and just mostly wraps gpg, which I suppose makes it the "git porcelain" of password managers.

    And speaking of git, it does have nice integration with git, though committing passwords (even encrypted) seems a bit weird to me (git-annex seems to make more sense though I have a hard time explaining why I want to drop my history). Maybe more troublesome is if someone gets access to your repo, they can see where all your passwords are, since the usernames / places are just directories. But maybe you don't care about that part being leaked?

  • keepass / keepass2: keepass is free software, and it's had quite a bit of adoption. It seems well used, tested, and liked, and best of all, there's a few browser extensions available... keefox looking the nicest of all of them. Also, it has a single-file db extension system, so that makes it fairly appealing.

    So what's the downsides? It's written in C# for one. Okay, it's still free software, and Mono does work, shut up Chris Webber, don't be ridiculous. But it really feels very windows-y and out of place, not least of all because one of the major UI pieces says "Windows" on it and all of the UI components kind of look like they don't fit totally on my GNOME desktop. The UI also just feels very cumbersome/kludgey so far (it feels a bit like a GNOME 1 or Windows 95 "power user" UI application, if you get my drift), though admittedly I haven't given it much time. Still, of all of these, it probably has the closest to all of the features I've said I want / asked for.

  • KeePassX: KeePassX seems to be the crowd favorite amongst GNU/Linux users. It's much like KeePass but written in Qt and C++. So I guess that reduces my anti-C# bigotry. However, there's no browser extensions. Why not use spd at that point?

    The UI does feel much nicer in GNOME though (and certainly it would be in KDE too). Apparently there's an "autotype" feature, but it's based on the window's title... that seems like a hack... but better than nothing?

  • KeePassC / kppy: Okay, looks pretty cool, a curses based tool using the KeePass 1.X database scheme, Python 3 based, even has a server? No browser integration, but looks promising, as it does have a server... maybe one could be implemented from there.

    However there's some weird code smells in KeePassC, like changing directory to /var/empty/ even if it doesn't exist. There's also a kppy which KeePassC uses, which is a general purpose python library to edit such things.

    Maybe a decent base to build things from?

  • GNOME SeaHorse: So, GNOME provides integrated encryption support via a program called SeaHorse. I like GNOME integration, thus I think I'd like this. However, there's also no browser extensions here, and I have a hard time figuring out whether or not I could nicely sync things across machines via git-annex and friends, so... hm.

  • Encrypted plaintext files: Okay, plaintext files plus GPG. It works, right? Except, also no browser integration, and also anyone sitting behind you can read your passwords. Let's stop pretending this is an option.

  • Encrypted org-mode files: Several ways to do it and actually it is probably a little bit less terrible than a plaintext + gpg file: the expansion of sections means you can navigate a bit better, maybe not expose all at once.... hm, you could maybe even hide the passwords with some custom elisp + font locking!

    Okay, except wait, still no browser integration, and I need to stop building systems that work just for me and nobody else in the universe in Emacs + OrgMode. Heh.

There's other options too, but they all seem to have the same problems as the above, or worse.

It really looks like keepass2 + keefox is the best solution that exists yet, but let's be honest... it's not a good solution! It speaks totally to the traditional complaint of encryption tools in free software: they work, we know how to use them in theory, and yet when you try to bring them to the end user, they aren't a very pleasant UI experience.

That said, I'd be willing to take a pleasant experience that wasn't really good for everyone, a-la spd, if I could get browser integration... but that's probably admitting I'm not part of the general solution!

EDITS: Added KeePassC and pass. Toned down the KeePassC exuberance after I actually tried it.

EDIT AGAIN: After trying a bunch of things, I'm currently happy with something completely not on this list at all: assword. The name makes it hard to take seriously, but it's great and elegant. Bind "assword gui" to shift-ctrl-p, and it's the simplest system possible: give it a string, and it either makes a new password, which it pastes, or it pastes whatever string you had associated with that string. So. Great. And the technology couldn't be simpler.

Base64 UUIDs in Python

By Christine Lemmer-Webber on Tue 30 July 2013

Hardly even worth writing about, but maybe it's useful to someone. Ever want a base 64 encoded UUID4 in python? I ported the uuid.uuid4() code over for base64 encoding, with a slight cleanup function to make it URL safe.

UPDATE: Making this the most useless blogpost I've ever written, there's already a urlsafe_b64encode method (also, I thus removed the rest of the post above):

>>> import base64, uuid
>>> base64.urlsafe_b64encode(uuid.uuid4().bytes).rstrip(b"=").decode("ascii")
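If you also need to get the UUID back out of the shortened string, here's a small sketch of the round trip. The function names uuid_to_b64 and b64_to_uuid are made up for this example; the only real trick is putting the stripped "=" padding back before decoding.

```python
import base64
import uuid


def uuid_to_b64(u):
    """Encode a UUID as URL-safe base64 text with the '=' padding stripped."""
    return base64.urlsafe_b64encode(u.bytes).rstrip(b"=").decode("ascii")


def b64_to_uuid(text):
    """Reverse uuid_to_b64: restore the padding, then decode the 16 raw bytes."""
    padded = text + "=" * (-len(text) % 4)
    return uuid.UUID(bytes=base64.urlsafe_b64decode(padded))


original = uuid.uuid4()
token = uuid_to_b64(original)
print(token, b64_to_uuid(token) == original)
```

A UUID is 16 bytes, so the token always comes out 22 characters long instead of the usual 36 for the hyphenated hex form.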

Life Update: June 2013

By Christine Lemmer-Webber on Wed 26 June 2013

So I haven't done one of these in a while... the last one I did was right when I was leaving CC to work on MediaGoblin. I think they're pretty good to get out of my system. I tend to have a lot of things accrue that I'd like to talk about, and I just don't get to them. It's nice to kind of reflect all at once.

MediaGoblin

The most publicly visible thing that's changed in my life over the last many months is my shifting to being fulltime on MediaGoblin. I've written about this a bit, but honestly not enough. How is it going? I think given the circumstances, I could hardly ask for more: we have an active community that is a joy to work with each and every day, I love working on the codebase, and I feel like I'm doing something important.

And things have certainly been busy. We've put out three releases since the last "life update" post I did. We've got six summer interns participating in Google Summer of Code and GNOME Outreach Program for Women. And 4 out of 6 of these participants are women... that's affected by outreach, but it's also by merit. We got a lot of super strong applicants, and I feel that we did a good job picking the best proposal for each task. Diversity is something I really believe matters, and I feel like we're doing well here. It's good to see that that message has caught on and is understood by our community too (see this post by OPW participant Emily O'Leary). Things keep churning forward, and in a good way, in a community that's strong and functional in ways I feel proud of. (We just reached 65 people in the AUTHORS file.... how cool is that?)

Most of the time it's a lot of fun. It's also generally fairly tiring. Not that it isn't worth it, it's totally worth it! But there's always more to do, and I constantly feel bad about that. But I think often it's best to not be feeling bad so much and just keep working forward generally.

One thing I also feel bad about is I don't take the time to write enough about things. I don't give weekly or biweekly updates partly because I don't have a good place to put them, and blogging used to be a painful setup (although I've improved that a bit)... but maybe I should make a sub-blog for it. Or I could just spam this blog a lot more. We do have a list managed by the FSF from the MediaGoblin campaign where I sometimes put out notices and of course there's the MediaGoblin blog. Anyway, it's a situation I'm not super happy with. Joey Hess does an awesome job of blogging on a daily basis about his git-annex work. I talked to him about it, he said he often copies stuff from git logs into there. That wouldn't be hard for me to do. I've also been writing out invoices to the FSF since I am getting paid for MediaGoblin work as a contractor, and I detail most (but not by any means all) work there.

Speaking of Joey Hess, I talked with him when I was at LibrePlanet. (I'm a huge Joey Hess fan, by the way!) It was nice to compare our two projects given we're both people who were paid to work on free software for a year from crowdfunding. One thing I've thought about is the distribution of time... I think Joey spends more time directly on coding than I do, and for his project, I think that makes sense. But git-annex is, I think, mostly Joey's work (which is not to say there aren't other contributors). MediaGoblin is structured differently, and thus my allocation of time is different. Neither of these approaches is better, I think, and the workflows we have really are probably best for our separate projects.

I write more MediaGoblin code than anyone, but the majority of MediaGoblin code is not written by me. Mostly what I do is write the "core infrastructure" of the project, then people build on top of that. I help coordinate with people on the right direction of things, help build the core bits people need, help people find what they need, do code review on what they've written, and... well, I do a lot of guidance day to day. But this makes sense I think... MediaGoblin really is a community project. I provide a lot of the vision of the project, and I do tons of work on it, but it's not all my decision making. Outside of the general vision, a lot of what happens comes from negotiation on IRC.

How much do I spend on what tasks? I have a rough "guideline schedule":

  • Monday: administrative work (I do this at the start of the week to get it out of the way)
  • Tuesday through Thursday: programming
  • Friday through Saturday: code review / community management
  • Sunday: personal/retooling day

I probably work a little over 8 hours every day (but not too much more so I don't hit total burnout or aggravate my RSI). But how much do I map to the above schedule, really? The truth of the matter is that usually there's some kind of immediate task that takes precedence over this "general schedule", but it's useful for when there isn't. But I think proportionally this comes close to the distribution of workload I have, except probably I spend a bit more time on community management / code review stuff (especially because when you add community management to this, it becomes a much broader category of things I do) than on coding directly. And that's just fine, actually... having a lot of code to review and an active community is one of those things you can't complain about.

I continue to believe that the work we're doing with MediaGoblin is important. If there's some frustration involved it's that it's a long-journey process. But I guess most things worthwhile are. People have asked me whether or not I'll continue to work on MediaGoblin after this year, and the answer is that I intend to if I can (yes, that does require figuring out funding; yes, I am thinking about it; yes I am open to suggestions). One way or another, I have felt a heightened sense of purpose recently. PRISM, Google announcing closing Google Reader and it looking like they'll stop supporting federated IM via XMPP, Google Glass coming out soon and set up by default to stream your life through Google's datacenters... these things continue to reinforce my feelings that working on issues of user freedom in networked applications is an area that critically needs work.

Visiting family and re-contextualizing my work

So things are busy with MediaGoblin, but very recently I took a break to visit some of my dad's side of the family in New Mexico. It was good to see everyone. I had some conversations with one of my aunts, a couple of my uncles, my father, one of my cousins, and one of my younger brothers about all sorts of things ranging from philosophy to religion to social justice issues... it felt like the better side of academic debates, which I miss a little.

Sitting in these conversations and talking to my family members gave me an opportunity to appreciate my family for who they are in a way I haven't thought about as much as I should. My uncle John is a great thinker, and has a clearer and sharper vision on the construction of society than anyone I've ever met. He and my aunt Barbara both worked on projects to aid those in poverty, and at one point they both lived in Madison (where we are now, and that's part of the reason my parents moved to the upper midwest and why I grew up there) on a cooperative. My aunt at one point ran a rape crisis and counseling center, and now she is working on issues of helping bring health care to the disenfranchised. My uncle Bill is working on starting a coffee shop and is trying to figure out how to use it to support local artists and promote ethical trade. My father taught and studied theology from the standpoint of greater inter-cultural understanding and finding common ground and peace between religions. My cousin Wendy is an atheist with a degree in theology and it's interesting to see just how closely she and my father think. Just now she's leaving on a humanitarian mission to bring sustainable water solutions to areas that need it most (part of this is grounded in a purpose to show that actually atheists are moral people too). I'm proud to come from a family that is grounded in both thinking and acting upon issues of social justice.

And this led to something interesting... the news of PRISM broke while I was on this trip and some of that conversation shifted toward the work I was doing, but it also contextualized some things because it was just one part of many conversations. We talked about issues of user freedom, both of PRISM and of the work I am trying to do, interwoven in many issues of social justice and human rights. And this felt significant to me, both in that I'm following in a family tradition of working on social justice issues, but even more importantly, that issues of user freedom are issues of social justice (and furthermore, that this was just very well understood by people who have worked on issues that are much more clearly seen as such).

But there is a flip side to this: if this is true, why have those of us who work on user freedom issues so generally failed to contextualize the issues we are working on in those terms? Why have we failed to describe issues of software freedom (or user freedom concerns more generally) as issues of social justice or contextualize them within human rights? A lot of thinking around this really congealed for me while on this trip, and I think I need to write about it more clearly while it is still fresh in my mind.

I left for this trip feeling like I was taking a semi-vacation from work, but also feeling bad about doing so. I left the trip with a renewed sense of purpose. It was certainly worth it for that.

Talks, conferences, and other projects

One odd thing about this year is that you'd think I'd be giving more talks than ever being fulltime on MediaGoblin, but I have actually avoided going to conferences for the most part. I have gone to two conferences in 2013: FOSDEM and LibrePlanet. I spoke at both, and I am convinced it was worth it. But every time I travel to speak it throws me off from the work I need to be doing and it feels hard to give MediaGoblin's community its full attention. Right now I think we need more work done than speaking and publicity, so after LibrePlanet, I decided to avoid conference-going for some time so I can focus. (Sadly this meant even missing PyCon... the first time I have missed PyCon since I started going in 2008!) I will probably resume some conference-going soon, as I have a collection of things I want to talk about.

But about FOSDEM and LibrePlanet: they were both great conferences, maybe some of the best conferences I have ever attended. I wish I had more time to write about them (or rather, I really should have written more about them closer to when they happened since this post is long enough already) but I will say that they were both amazing experiences. (One thing about both is that they both involved being on panels with people I really admire and have looked up to... it's strange to be taken somewhere near the same level of seriousness as them. Guess I'm doing a good job of tricking people into thinking I'm relevant!)

Oh yeah, I did give a talk about something unrelated recently, and it's the one project I've been helping with a bit: Hy, a pythonic lisp! You should check it out, it's pretty cool. I've been helping with the docs and I made the logo and I gave a talk at ChiPy and some other things. Mostly I just pester Paul Tagliamonte though. :)

Miscellaneous

Well this post is more than long enough, so here's an attempt to wrap it up.

Life is good. I like Madison. I'm slowly meeting friends here. Slowly, but it's happening. I love where we live. Generally, I love life. I'm not the absolute greatest at any of the fields that I'm in, but I seem to be doing well, I think I'm working on the right things, I have the fortune of working on them with amazing people, I'm trying, and I think we have a good chance of making it. At the moment, I'm giving myself permission to feel good about that.

Could we please end the SQLite ALTER TABLE pain?

By Christine Lemmer-Webber on Sat 15 June 2013

I think there's nothing else in the world of programming that's given me more headaches than the lack of proper alter table support in SQLite. I'm not alone, because almost every developer I've worked with has had similar pains and complaints. How many hours of developer time have been wasted on the lack of proper SQLite alter table support? It must be in the thousands of hours. Surely this is fixable, and I'm willing to put my money where my mouth is: if someone is willing to develop SQLite alter support, I'm pre-pledging $200.00 towards fixing the problem. And I bet others would be willing to donate towards such a solution as well.

First of all, yes, there's a venting of frustration above, but it is not a venting of frustration in lack of appreciation of SQLite. SQLite is wonderful software; you could say that it was one of the biggest reasons (maybe the biggest, but of several) for my own project MediaGoblin switching from MongoDB to SQL (with SQLAlchemy, which is also wonderful software). Yes, we want people to be able to run medium to large installations of MediaGoblin (and there's PostgreSQL for that), but we also want people to be able to run smallish installations for themselves or their friends and family as well. SQLite is great for this, and it's also great for making developing super simple.

But one of the original reasons for going with Mongo was remembering how frustrating migration failures in SQL could be. When deciding to switch to SQL, I realized that we'd be moving back into this pain territory (we did do migrations with Mongo; anyone who suggests you don't need migrations with a document store database doesn't know what they're talking about and is aiming a shotgun squarely at their foot... even so, migrations were easier with Mongo). But of ALTER TABLE commands, SQLite only supports RENAME TABLE and ADD COLUMN. I also remembered how, because of this, sqlite could require annoying workarounds to make migrations happen. I didn't realize though that at times doing migrations would become nearly impossible.

Since sqlite lacks most ALTER TABLE commands, most migration frameworks like South and sqlalchemy-migrate do crazy workarounds for the missing commands that usually involve renaming the table, creating an entirely new table under the original name with the new schema in place, copying all the data back, and killing the old table.
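To make that workaround concrete, here is a minimal sketch of the rename/create/copy/drop dance using Python's standard sqlite3 module. It is not what South or sqlalchemy-migrate actually run; the helper name `drop_column_by_rebuild`, the `media` table, and its columns are all made up for illustration.

```python
import sqlite3


def drop_column_by_rebuild(conn, table, new_schema_sql, kept_columns):
    """Emulate dropping a column by rebuilding the whole table.

    Hypothetical helper: new_schema_sql is the full CREATE TABLE statement
    for the new shape of `table`, and kept_columns lists the columns whose
    data should survive the rebuild.
    """
    old = table + "_old"
    cols = ", ".join(kept_columns)
    # 1. Move the existing table out of the way.
    conn.execute(f"ALTER TABLE {table} RENAME TO {old}")
    # 2. Create a fresh table under the original name with the new schema.
    conn.execute(new_schema_sql)
    # 3. Copy the surviving columns back.
    conn.execute(f"INSERT INTO {table} ({cols}) SELECT {cols} FROM {old}")
    # 4. Kill the old table.
    conn.execute(f"DROP TABLE {old}")
    conn.commit()


# Illustrative usage with an in-memory database and made-up columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE media (id INTEGER PRIMARY KEY, title TEXT, legacy_flag INTEGER)"
)
conn.execute("INSERT INTO media (title, legacy_flag) VALUES ('goblin.png', 1)")

drop_column_by_rebuild(
    conn,
    "media",
    "CREATE TABLE media (id INTEGER PRIMARY KEY, title TEXT)",
    ["id", "title"],
)

columns = [row[1] for row in conn.execute("PRAGMA table_info(media)")]
rows = conn.execute("SELECT id, title FROM media").fetchall()
print(columns, rows)
```

Note that the caller has to hand over the complete new schema, including every index and constraint that should survive. Anything the tool fails to carry over (or carries over wrongly) is silently lost or breaks the rebuild, which is where much of the pain comes from.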

If that sounds like a mess, that's because it is. In fact, the sqlalchemy-migrate project homepage suggests that new projects use a successor called Alembic founded by the same core author as sqlalchemy-migrate (edit: I've been corrected on this on HackerNews: "Alembic is not founded by the same core author as sqlalchemy-migrate. Alembic is founded by Mike Bayer who is the core author of SQLAlchemy itself."). But we didn't use Alembic because at the time there wasn't much support for sqlite (I think a bit of support has been added since then, though I don't know how much) and we knew we really wanted it. In fact, on the Alembic homepage, this is listed as a goal:

Don't break our necks over SQLite's inability to ALTER things. SQLite has almost no support for table or column alteration, and this is likely intentional. Alembic's design is kept simple by not contorting its core API around these limitations, understanding that SQLite is simply not intended to support schema changes. While Alembic's architecture can support SQLite's workarounds, and we will support these features provided someone takes the initiative to implement and test, until the SQLite developers decide to provide a fully working version of ALTER, it's still vastly preferable to use Alembic, or any migrations tool, with databases that are designed to work under the assumption of in-place schema migrations taking place.

And indeed, there's a lot of neck-breaking involved in trying to use migrations with SQLite...

At one point I tried dropping a boolean field but discovered this was impossible because SQLAlchemy doesn't have a good sense of the constraints on an sqlite table, so sqlalchemy-migrate tries to reproduce the table without the boolean field, but since the boolean check is implemented as a constraint, the new table still has a constraint on a non-existent field and sqlalchemy-migrate doesn't notice. When the statement to create the new table is executed, sqlite explodes wondering what this boolean check is doing on this field that doesn't exist. More recently, one of our Summer of Code students tried writing some migrations, discovered that one broke in sqlite and another that didn't break deleted the unique constraint. We have no idea how to move forward on some of these issues, and that's a frustrating situation to be in.
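The boolean failure can be reproduced in miniature. This is a rough sketch assuming the boolean is enforced by a table-level CHECK constraint, as described above; the table and column names are hypothetical, and the statements are written by hand rather than being the ones sqlalchemy-migrate generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Original table: the boolean column plus the table-level CHECK that
# enforces it (hypothetical names, for illustration only).
conn.execute(
    "CREATE TABLE media ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT,"
    " is_featured BOOLEAN,"
    " CHECK (is_featured IN (0, 1)))"
)

# A rebuild that drops the column but forgets to drop its CHECK.
broken_rebuild = (
    "CREATE TABLE media_new ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT,"
    " CHECK (is_featured IN (0, 1)))"
)

try:
    conn.execute(broken_rebuild)
    rebuild_error = None
except sqlite3.Error as exc:
    # SQLite rejects the statement because the CHECK names a missing column.
    rebuild_error = exc

print("rebuild failed:", rebuild_error)
```

The fix is obvious once you can see both statements side by side, but a migration tool only knows about the constraints it can introspect, so it has no way of noticing the leftover CHECK.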

Given all the above pain, why doesn't SQLite implement ALTER TABLE? I actually don't really know the details, but one of our contributors knows a bit about the sqlite structure and tells me that he thinks it might be because the data format makes some actions like appending rows fairly easy, but other actions like deleting a field would mean rewriting the entire table line by line.

But to that I think: migration frameworks are already rewriting tables entirely! So as far as I can tell, in the worst case scenario, sqlite implementing these other alter table methods means that it will be doing the same thing that migration frameworks have to do already, but in an official way, with a better sense of the structure of the existing tables, and probably even a bit faster than some other program likely operating through a different language doing the same. Sure, this may not be ideal, but it would be much better than the present situation. The documentation could even say this: "be aware that due to the nature of the sqlite file structure, this is a very slow operation that requires rewriting your entire table." But at least it would be an operation that rewrites the table natively, and would not explode in such strange and unpredictable ways!

I would be interested in helping myself, but I don't know SQLite's codebase, database structures is a domain I don't presently know, and I do not have time to learn it. But I'm more than happy to donate money (and I'm running off a crowdfunding campaign salary... I don't normally donate this amount of money to things, but surely fixing the most frustrating recurring bug in my programming career is worth putting $200 down). And I bet I'm not alone. If someone experienced with developing sqlite was willing to make an upstream-aimed contribution to kill this pain point, I bet it'd be a very fundable project.

Update: This post has gotten a fair amount of discussion on HackerNews, which is good! I'm surprised though at the number of people who are taking the stance of "that feature doesn't exist, so why would you want that feature?" I thought this reply gave a good response to that.

Another update: One comment on HackerNews suggests that SQLite needs to stay around 250kb to stay "light". But on my Debian install, the sqlite binary is 680kb. Now granted, Debian probably has everything optional compiled in. But there's your answer if you're afraid about the binary getting too large: make it a compile-time option!

Now on Pelican

By Christine Lemmer-Webber on Fri 14 June 2013

I guess I've switched my blog around plenty in the last number of years. Not long ago I was running Zine, then PyBlosxom with some Jinja2 templating hacks I wrote (which had all sorts of unicode problems), and now, Pelican.

I'm happy with the switch: Pelican seems sane, very cleanly built, and things are working. I took a bit to clean up some other things about the site. And ah yeah, there's a new self portrait on the homepage.

I guess that's not terribly exciting, but I've cleaned it up and made pushing changes as easy as a "make" command.

You can also check the site's contents out via git if you like. Not sure anyone would care, but hey! There it is.

Flash Fiction Dystopias Volume 1

By Christine Lemmer-Webber on Mon 03 June 2013

I thought I'd have some fun and try my hand at some flash fiction dystopian futures. This will be an exercise in brevity writing about topics I think about without taking them too seriously. Without further ado:

HTML5 DRM goes through; Netflix sweeps into the web and over the next few years takes over the majority of video and audio distribution on the web, becoming the web's first super-monopoly on media. The DRM standard is easily worked around, but that was just a front for a legal excuse to sue anyone who does. Netflix eventually starts sending out lawsuits to anyone who doesn't have a subscription; I mean come on, do they really believe you haven't been watching any of these popular shows? Seems unlikely.

Monsanto does military contracting and creates a disease (or non-dispersible airborne chemical, or a plain old buildup in toxins from embedding pesticides inside of food, take your pick) that wipes out half the population; children are particularly susceptible. In further military contracting, they had already built several lines of genetically engineered embryos that are immune. After the war, future-parents are offered the purchase of said embryos, but you basically have a choice between one of 5 different sets of DNA. Also Monsanto has control on the patents so attempts to genetically engineer your own immune children are seen as piracy; said children are confiscated and disposed of.

Genetic patents on cancer and cancer cures means you get charged both for the cure to the disease and for infringing on the company's patents for having the disease in the first place.

Google Glass comes out, free software alternatives struggle to develop for its infrastructure, especially on the backend. The pressure to own a set and stream your life through Google's datacenters is far greater than the pressure to own a cell phone ever was (and so is the social ostracization). EEG keyboards also come out; computation has moved to the point of being an additional processor for your brain. Free alternatives start to get good, but aren't quite there, and the authors never bother to successfully advocate to non-developers; occasionally people try to move over but discover that the only people they can communicate with in the free alternative are free software developers, and get tired of it. The singularity happens, and is wholly owned and operated by Google, Inc.

Automated cars come out with no free alternatives, are a remarkable improvement in automotive safety, but mean total surveillance on movement and become an effective tool (along with Google Glass) in an emerging police state. Bicyclists suddenly become the only group that have true autonomy, and begin to realize it. Unfortunately that leads 5% of the biking population to become really smug about things and the popular image of bicyclists on the road becomes so low that pushing for legislation to improve conditions for bicyclists to make it a more feasible daily urban mode of transport never takes off.

Full computing, data, and software user freedom is eventually achieved, but it turns out it doesn't matter because it happens at approximately the same time that modern civilization becomes unsustainable due to resource depletion and catastrophic climate change.

FOSDEM 2013

By Christine Lemmer-Webber on Thu 31 January 2013

I'll be at FOSDEM 2013 this year speaking on the AGPL Panel. It's not "a MediaGoblin talk" but the reasons I'm there are entirely to do with MediaGoblin being under the AGPL.

There will be several other MediaGoblin community members there. I'm really interested in Deb Nicholson's Messaging for Free Software talk. We might do a hackathon.

Will you be there? Maybe we should meet up! Or at least drop by and say hello. I'm arriving tomorrow and leaving early Monday morning. If you're interested, shoot me an email at cwebber AT dustycloud DOT org. Or just come up and say hello!

My taste in music

By Christine Lemmer-Webber on Tue 29 January 2013

True stories in the life of me, as told on the #mediagoblin IRC channel:

<paroneayea> a song comes on next on rhythmbox and I'm like, Oh, I love this
             song's intro, haven't heard it in a while
<paroneayea> turns out it was a car outside with a belt problem
<paroneayea> how you know you have terrible taste in music, part 1

Morgan, who has previously compared the music I listen to to dial up modems and car horns, would probably agree with this self-assessment.