Archives

Tags

Posts with tag "deployment"

Chicago GNU/Linux talk on Guix retrospective

By Christopher Lemmer Webber on Thu 01 October 2015

GuixSD logo

Friends... friends! I gave a talk on Guix last night in Chicago, and it went amazingly well. That feels like an undersell actually; it went remarkably well. There were 25 people, and apparently there was quite the waitlist, but I was really happy with the set of people who were in the room. I haven't talked about Guix in front of an audience before and I was afraid it would be a dud, but it's hard to explain the reaction I got. It felt like there was a general consensus in the room: Guix is taking the right approach to things.

I didn't expect this! I know some extremely solid people who were in the audience, and some of them are working with other deployment technologies, so I expected at some point to be told "you are wrong", but that moment didn't come. Instead, I met a large amount of enthusiasm for the subject, a lot of curious questions, and well... there was some criticism of the talk, though it mostly came to presentation style and approach. Also, I had promised to give two talks, both about federation and about Guix, but instead I just gave the latter and splatted over the latter's time. Though people seemed to enjoy themselves enough that I was asked to come back again and give the federation talk as well.

Before coming to this talk, I had wondered whether I had gone totally off the yaks in heading in this direction, but giving this talk was worth it in at least that the community reaction has been a huge confidence booster. It's worth persuing!

So, here are some things that came out of the talk for me:

  • The talk was clear, and generally people said that though I went into the subject to quite some depth, things were well understood and unambiguous to them.
  • This is important, because it means that once people understood the details of what I was saying, it gave them a better opportunity to evaluate for whether it was true or not... and so the general sense of the room that this is the right approach was reassuring.
  • A two tier strategy for pushing early adoption with "practical developers" probably makes sense:
    • Developers seem really excited about the "universal virtualenv" aspect of Guix (using "guix environment") and this is probably a good feature to start gaining adoption.
    • Getting GuixOps working and solidly demonstrable
  • The talk was too long. I think everything I said was useful, but I literally filled two talk slots. There are some obvious things that can be cut or reduced from the talk.
  • In a certain sense, this is also because the talk was not one, but multiple talks. Each of these items could be cut to a brief slide or two and then expanded into its own talk:
    • An intro to functional programming. I'm glad to see this this intro was very clear, and though concise, could be reduced within the scope of this talk to two quick slides rather than four with code examples.
    • An "Intro to Guile"
    • Lisp history, community, and its need for outreach and diversity
    • "Getting over your parenthesis phobia"
  • I simply unfolded an orgmode tree while presenting the talk, and while this made things easy on me, it's not very interesting for most audience members (though my friend Aeva clearly enjoyed it)

Additionally, upon hearing my talk, my friend Karl Fogel seemed quite convinced about Guix's direction (and there's few people whose analysis I'd rate higher). He observed that Guix's fundamentals seem solid, but that what it probably needs is institutional adoption at this point to bring it to the next level, and he's probably right. He also pointed out that it's not too much for an organization to invest themselves in Guix at this point, considering that developers are using way less stable software than Guix to do deployments. He suggested I try to give this talk at various companies, which could be interesting... well, maybe you'll hear more about this in the future. Maybe as a trial run I should submit some podcast episodes to Hacker Public Radio or something!

Anyway, starting next week I'm putting my words to action and working on doing actual deployments using Guix. Now that'll be interesting to write about! So stay tuned, I guess!

PS: You can download the orgmode file of the talk or peruse the html rendered version or even better check out my talks repo!

Why is it hard to move from one machine to another? An analysis. [x-post from Userops]

By Christopher Lemmer Webber on Wed 08 April 2015

NOTE: This is a shameless cross-post of something I originally sent to the userops list, where we discuss deployment things.


Hello all,

For a while I've been considering, why is it so harder for me to migrate from server to server than it is for me to migrate from desktop to desktop? For years, ever since I discovered rsync, migrating between machines has not been hard. I simply rsync my home directory over to the new machine (or maybe even just keep the old /home/ directory's partition where it is!) and bam, I am done. Backing this up is easy; it's just another rsync away. (I use dirvish as a simple wrapper around rsync so it can manage incremental backups.)

If I set up a new machine, it is no worry. Even if my current machine dies, it is mostly no worry. Rsync back my home directory, and done. I will spend a week or so discovering that certain programs I rely on are not there, and I'll install them one by one. In a way it's refreshing: I can install the programs I need, and the old cruft is gone!

This is not true for servers. At the back of my mind I realized this, but until the end of Stefano Zacchiroli's excellent LibrePlanet talk when I posed a question surrounding this situation, I hadn't totally congealed in my head: why is it so much harder for me to move from server to server? Assume I even have the old server around and I want to move. It isn't easy!

So here are some thoughts that come out of this:

  • For my user on my workstation, configuration and data are in the same place: /home/ (including lots of little dotfiles for the configs, and the rest is mostly data). Sure, there's some configuration stuff in /etc/ and data in /var/ but it mostly doesn't really matter, and copying that between machines is not hard.
  • Similarly, for my user on workstation experience, it is very little stress if I set up a machine and am missing some common packages. I can just install them again as I find them missing.
  • Neither of these are true for my server! In addition to caring about /home/, and even more importantly, I have to worry about configuration in /etc/ and data in /var/. These are both pains in the butt for me for different reasons.
  • Lots of stuff in /etc/ is configuration that interacts with the rest of the system in specific ways. I could rsync it to a new machine, but I feel like that's just blindly copying over stuff where I really need to know how it works and how it was set up with the rest of the machine in the first place.
  • This is compounded by the fact that people rarely set up one machine these days; usually they have to set up several machines for several users. Remembering how all that stuff worked is hard. The only solution seems to be to have some sort of reproducible configuration system. Hence the rise of salt, ansible, etc. But these aren't really "userops" systems, they're "devops"... developer focused. Not only do you need to know how they work, you need to know how the rest of the system works. And it's not easy to share that knowledge.
  • /var/ is another matter. Theoretically, most of my program data goes there (unless, of course, it went to /srv/, god help us). But I can't just rsync it! There are some processes that are very persnickety about the stuff there. I have to dump my databases and etc before I can move them or back up. Nothing sets up an automatic cronjob for me on these, I have to know to dump postgres. Hopefully I set up a cronjob!
  • While I as a workstation user don't stress too much if I'm missing some packages (just install them as I go), that is NOT true of my servers. If my mail servers aren't running, if jabber isn't on, (if SSH isn't running!!!), there are other servers expecting to communicate with my machine, and if I don't set them up, I miss out.
  • Not only this, assuming I have moved between servers correctly, even once I have set up my machine and it has become a perfectly okay running special snowflake, there are certain routine tasks that require a lot of manual intervention, and I have only picked up the right steps by knowing the right friends, having run across the right tutorials which hopefully have shown me the right setup, etc. SSL configuration, I'm looking at you; the only savior that I have is that I have written myself my own little orgmode notes on what to do the next time my certs expire.
  • My servers do become special snowflakes, and that is very stressful to me. I will, in the future, need to set up one more server, and remembering what I did in the past will be very hard.
  • Assuming I use all the mainstream tools, not talking about "upcoming" ones, a better configuration management solution is probably the answer, right? That's a lot to ask users though: it's not a solution to existing deployment, because it doesn't remove the need to know about all the layers underneath, it just adds a new layer to understand.

Those are all headaches, and they are not the only headaches. But here are some thoughts on things that can help:

  • If I recognize which parts of my system are "immutable" and which parts are "mutable", it's easier to frame how my system works.

    • /var/ is mutable, it's data. There's no making this "reproducible" really: it needs to be backed up and moved around.
    • My packages and system are immutable, or mostly should be. Even if not using a perfectly immutable system like guix/nix, it's helpful to act like this part of the system is pseudo-immutable, and simply derived from some listing of things I said I wanted installed. (Nix/Guix probably do this the most nicely though.)
    • /etc/ is similarly "immutable but derived" in the best case. I should be able to give the same system configuration inputs and always get the same system of packages and configuration files.
  • I like Guix/Nix, but my usage of Debian and Fedora and friends is not going away anytime soon. Nonetheless, configuration management systems like puppet/ansible/salt help give the illusion of an immutable system derived from a set of inputs, even though they are working within a mutable one.

  • Language packaging for deployment needs to die. Yes, I say this as a project that advocates that very route. We're doing it wrong, and I want to change it. (Language packaging for development though: that's great!)

  • Asking people to use systems like ansible/salt/puppet is asking users too much. You're just asking them to learn one more layer on top of knowing how the whole system works. Sharing common code is mostly copy and paste. There are some layers built on top of here to mitigate this but afaict they aren't really good, not good enough. (I am working on something to solve this...)

  • Pre-built containers are not the solution. Sorry container people! Containers can be really useful but only if they are built in some reproducible way. But very few people using Docker and etc seem to be doing this. But here's another thing: Docker and friends contain their own deployment domain specific languages, which is dumb. If a reproducible configuration system is good enough, it should be good enough for a VM or a container or a vanilla server or a desktop. So maybe we can use containers as lightweight and even sandboxed VMs, but we shouldn't be installing prebuilt containers on our servers alone as a system.

    Otherwise else you're running 80 heavy and expensive Docker images that slowly go out of date... now you're not maintaining 1 distribution install, you're maintaining 81 of them. Yikes! Good luck with the next Shellshock!

  • Before Asheesh jumps in here: yes I will say that Sandstorm is taking maybe the best route as in terms of a system that uses containers heavily (and unlike Docker, they seem actually sandboxed) in that it seems to have a separation between mutable parts and immutable parts: the container is more or less an immutable machine from what I can tell that has /var/ mounted into it, which is a pretty good route.

    In this sense I think Sandstorm has a good picture of things. There are other things that I am still very unsure about, and Asheesh knows because I have expressed them to him (I sure hope that iframe thing goes away, and that daemons like Celery can run, and etc!) but at least in this sense, Sandstorm's container story is more sane.

So there are some reflections in case you are planning on debugging why these things are hard.

-- Chris

PS: If you haven't gotten the sense, the direction I'm thinking of is more along the lines of Guix becoming our Glorious Future (TM) assuming something like GuixOps can happen (go Dave Thompson, go Guix crew!) and a web UI can be built on top of it with some sort of common recipe system.

But I don't think our imperative systems like Debian are going away anytime soon; I certainly don't intend to move all my stuff over to Guix at this time. For that reason, I think there needs to be another program to fit the middle ground: something like salt/ansible/puppet, but with less insane one-off domain specific languages, with a sharable recipe system, and scalable both from developer-oriented scripts but also having a user-friendly web interface. I've begun working on this tool, and it's called Opstimal. Expect to hear more about it soon.

Empathy for PHP + Shared Hosting (Which is Living in the Past, Dude)

By Christopher Lemmer Webber on Sun 30 March 2014

After I wrote my blogpost yesterday about deployment it generated quite a bit of discussion on the pumpiverse. Mike Linksvayer pointed out (and correctly) that "anti-PHP hate" is a poor excuse for why the rest of us are doing so bad, so I edited that bit out of my text.

After this though, maiki made a great series of posts, first asking "Should a homeless person be able to 'host' MediaGoblin?" and then talking about their own experiences. Go read it and then come back. It's well written and there's lots to think about. (Read the whole thread, in fact!) The sum of it though is that there's a large amount of tech privilege involved in installing a lot of modern web applications, but maiki posts their own experiences about why having access to free software with a lower barrier to entry was key to them making changes in their life, and ends with the phrase "aim lower". (By the way, maiki is actually a MediaGoblin community member and for a long time ran an instance.)

So, let's start out with the following set of assertions, of which I think maiki and I both agree:

  • Tech privilege is a big issue, and that lowering the barrier to entry is critical.
  • PHP + shared hosting is probably the lowest barrier to entry we have, assuming your application falls within certain constraints. This is something PHP does right! (Hence the "empathy for PHP" above.)
  • "Modern" web applications written in Python, Ruby, Node, etc, all require a too much tech privilege to run and maintain, and this is a problem.

So given all that, and given that I "fixed up" my previous post by removing the anti-PHP language, the title I chose for this blogpost probably seems pretty strange, or like it's undoing all that work. And it probably seems strange that given the above, I'll still argue that the choices around MediaGoblin were actively chosen to tackle tech privilege, and that tackling these issues head-on is critical, or free software network services will actually be in a worse place, especially in a tech privilege sense.

That's a lot to unpack, so let's step back.

I think there's an element of my discussion about web technology and even PHP that hasn't been well articulated, and that fault is my own... but it's hard to explain without going into detail. So first of all, apologies; I have been antagonistic towards PHP, and that's unfair to the language that currently powers some of the most important software on earth. That's lame of me, and I apologize.

So that's the empathy part of this title. Then, why would I include that line from my slides, that "PHP is Living in the past, Dude", in this blogpost? It seems to undo everything I'm writing. Well, I want to explain what I meant about the above language. It's not about "PHP sucks". And it does relate to free software's future, and also 5factors into conversations about tech privilege. (It also misleading in that I do not mean that modern web applications can't be written in PHP, or that their communities will be bad for such a choice, but that PHP + shared hosting as a deployment solution assumes constraints insufficient for the network freedom future I think we want.)

Consider the move to GNOME 3, the subject of Bradley's "living in the past" blogpost: during the move to GNOME 3, there were really two tech privilege issues at stake. One is that actually you're requiring newer technology with OpenGL support, and that's a tech privilege issue for people who can't afford that newer technology. (If you volunteered at a FreeGeek center, you'd probably hear this complaint, for example.) But the other one is that GNOME 3 was also trying to make the desktop easier for people, and in a direction of usability that people expect these days. That's also a tech privilege issue, and actually closer to the one we're discussing now: if the barrier to entry is that things are too technical and too foreign to what users know and expect, you're still building a privilege divide. I think GNOME made the right decision on addressing privilege, and I think it was a forward-facing one.

Thus, let me come back around to why, knowing that Python and friends are much harder, I decided to write MediaGoblin in Python anyway.

The first one is functionality. MediaGoblin probably could never be a good video hosting platform on shared hosting + PHP only; the celery component, though it makes it harder to deploy, is the whole reason MediaGoblin can process media in the background without timing out. So in MediaGoblin's case (where media types like video were always viewed as a critical part of the project), Celery does matter. More and more modern web applications are being written in ways that PHP + Shared Hosting just can't provide: they need websockets, they need external daemons which process things, and so on.

And let's not forget that web applications are not the only thing. PHP + shared hosting does not solve the email configuration problem, for example. More and more people are moving to GMail and friends; this is a huge problem for user freedom on the net. And as someone who maintains their own email server, I don't blame them. Configuring and running this stuff is just too hard. And it's not like it's a new technology... email is the oldest stable federated technology we have.

Not to mention that I've argued previously that shared hosting is not user freedom friendly. That's almost a separate conversation, though.

I also disagree that things like encryption certificates, which are also hard, don't matter. I think peoples' privacy does matter immensely, and I think we've only seen more and more reason to believe that this is an area we must work on over the last few years. (You might say that "SSL is doing it wrong" anyway, and I agree, though that's a separate conversation. Proably something that does things right will be just as hard to set up signing-wise if it's actually secure, though.)

Let's also come back to me being a Python programmer. Even given all the above, there are a lot of people out there like me who are just not interested in programming in PHP. This doesn't mean there aren't good PHP communities, clearly there are. But I do think more and more web applications are being written in non-PHP languages, and there's good reason for that. But yes, that means that these web applications are hard to deploy.

What's the answer to that? Assuming that lots of people want to write things in non-PHP languages, and that PHP + shared hosting is insufficient for a growing number of needs anyway, what do we do?

For the most of the non-PHP network services world, it has felt like the answer is to not worry about the end user side of things. Why bother, when you aren't releasing your end web application anyway? And so we've seen the rise of devops coincide with the rise of "release everything but your secret sauce" (and, whether you like it or not, with the decline of PHP + shared hosting).

I was fully aware of all of this when I decided MediaGoblin would be written in Python. Part of it is because I like Python, and well, I'm the one starting the project! But part of it is because the patterns I described above are not going away. In order for us to engage the future of the web, I think we need to tackle this direction head-on.

In the meanwhile, it's hard. It's hard in the way that installing and maintaining a free software desktop was super hard for me back in 2001, when I became involved in free software for the first time. But installers have gotten better, and the desktop has gotten better. The need for the installfest has gone away. I think that we are in a similar state with free network services, but I believe things can be improved. And that's why I wrote that piece yesterday about deployment, because I am trying to think about how to make things better. And I believe we need to, to build web applications that meet the needs of what people expect, to make free network services comparable to the devops-backed modern architected proprietary network services of today.

So, despite what it might appear at the moment, tech privilege has always been on my mind, but it's something that's forward-looking. That's hard to explain though when you're stuck in the present. I hope this blogpost helps.

Configuration Management for the People

By Christopher Lemmer Webber on Sat 29 March 2014

One of the things I've talked about in the talks I've been giving about MediaGoblin lately is the issue of deployment. Basically it boils down to the following points:

  • MediaGoblin is harder than we'd like to deploy, but it's probably easier than most other Python web applications. All modern web applications are currently hard to maintain and deploy.
  • We've worked to try to make it so you don't have to know how to be a $LANGUAGE developer (in this case, Python) to deploy MediaGoblin, but once things break you do have to be a $LANGUAGE developer, or know one. (We spend a lot of time answering these things in #mediagoblin. This can be improved if we get proper system packaging though (which is currently a work in progress, at least for Debian. I hope to see it soon...))
  • Even if you get system packaging, that's not enough. You need to do a mess of other things to set up web applications: configure the web server (Apache, Nginx), configure sending mail, configure a bunch of things. And let's not even talk about getting SSL set up right. Oy.
  • The reason people don't see that modern web applications are hard to deploy is because even though they are, there's a team of devops behind the scenes handling it for them.

We can do as good as we can to try to make MediaGoblin's docs easy to understand, but I'm convinced the solution needs to be a layer higher than MediaGoblin. That's probably one of two things:

  • A Platform as a Service solution like OpenShift. (Note, there are other proprietary ones out there, but if it's not free software, what's the point?) These solutions are kind of heavy, but they seem like a step in the right direction.
  • A configuration and deployment abstraction system, like Puppet/Chef/SaltStack/Ansible, etc.

The latter seems like the right solution to me, but it's not enough as-is. The configuration and deployment systems we presently have are too devops-focused. We can't free society by expecting everyone to join the world of devops... for one thing, these tools are way too hard to learn at present, and for second, a world where only the technically elite are free is a pretty nonfree world, really.

(I don't think things like Docker or virtual machine images are the answer. Aside from being pretty heavy, they don't solve the configuration issue.)

But these systems are pretty close. Close enough even! I think we can pull this off. Let's see what we need and what's missing:

  • Needs to have share'able recipes. (Yes yes, people currently do sad, sad things like dumping some recipes to code hosting platforms. That's not the same thing. I want something like an apt repository, but for recipes. (Note, juju might have this... I don't know much about it))
  • Recipes should be pretty simple... based on variables set by the user, or variables interpolated by user-specific settings (all these config management systems handle this, I think)... mostly I like what I've seen of how Salt handles this.
  • There should be an expectation that you should be able to mix and match recipes somewhat. This means some agreements higher up the chain on how we expect a mail server configuration is going to be described, etc. I'm not sure how the standards/governance around this could be best handled.
  • I think a layer on top of all this that's really needed is some kind of web UI for application install and configuration. If I install the MediaGoblin recipe, it may be that a lot of defaults can be set and guessed. But what if I want to turn on the video plugin, and I want to change my authentication system to using LDAP because that's what my company already uses, etc? I think this can be pretty minimal, we can have a specification that both describes config options as well as how to represent them in the web UI.
  • It shouldn't be tied to any one specific platform. Not to a wall wart, not to a VPS, not to a Raspberry Pi. It should be generic enough to work on all of these. Again, I see no reason this can't be pretty minimal.

I've been fielding this by people for a while, trying to quiz all the smarter-than-me-people I know about what they think, but I keep coming back to this. At a dinner at LibrePlanet, all the usual suspects were raised as possible solutions, and none of them seemed to fit. I was happy to hear Joey Hess say, "I think that Chris is right, we need the 'layer above apt' configuration solution." (If that wasn't some reassurance that I'm on the right track, I don't know what would be... if anyone would know it would be Joey...)

(Note: GNUToo suggested at FOSDEM that I should look at LuCI... LuCI always felt a bit clumsier than I'd like when I used it but maybe it does do these things. I don't know if it handling recipes like I said, but maybe? Worth looking into.)

Here's a rough plan on how I'd go about doing this, if I had time to do this (which I don't think I do, though I can help someone kick off the project, and I can make some contributions):

  • Start by investigating Salt (or Ansible?). It's a little bit heavy but not too heavy (the current python package appears to be 2.8mb...). Plan on using the master-less setup. Salt provides a lot of abstractions around installihng packages and a lot of these other things, so that may be a good start.
  • It might be that things need to be paired down to something even simpler, or that Salt is just too hard to build on top of. If so, start simple on recipes. I'd save a dependency graph system as something optional as a "milestone 2" type thing, personally.
  • You need some way to bundle a description of the variables to be provided but also how to represent them configuration-wise to the user. I think a Hy DSL might be the fastest way to start writing up variable descriptions including web represenation, but also embed code to retreive results in case a default is not provided.
  • At this point you should have a command-line tool that you can run, reads your currently installed recipes and your current settings, and executes them.
  • Build a web UI.
  • Figure out how to publish up a public repository of recipes. (May need to be distro specific.)

That's it I think. I think if we had something like this, it would simplify deployment greatly.

Are you interested in taking this on? We should talk! ;)

PS: We'd like to do further research and work into making MediaGoblin easier to deploy over the next year. Our capacity to do that largely depends on how much we're able to raise... consider donating to our campaign!

Edit: Mike Linksvayer makes strong points about previous language around PHP; removed. So, reverting language that makes PHP sounds like a problem, but I'll still argue that it's not actually a full solution (there are configuration issues not resolved by language choice)

Edit 2: I guess I didn't say this, so it's worth saying... a lot of difficulties in modern deployment are because people aren't using system packaging (this includes MediaGoblin's docs, which suggest the breaks-all-the-time-even-though-I-understand-why-but-our-users-don't world of Python packaging... we're waiting on Debian packaging. Real Soon Now, I hope!). Using system packaging certainly solves a lot of these headaches, but it doesn't solve nearly all of them. There's still the issues of configuring things, and there's really a lot... too much!... to configure: mail transfer agents, hooking your application up to the web server, SSL, so on and so on. That's something that hasn't been solved, especially not for non-devops people.

I've started cobbling together something that might solve things for non-devops people, and even devops people both! Maybe it'll see the light of day sometime. In the meanwhile, I'm really interested in other peoples' solutions to the problems described above.