Posts with tag "userops"

Chicago GNU/Linux talk on Guix retrospective

By Chris Lemmer-Webber on Thu 01 October 2015

Friends... friends! I gave a talk on Guix last night in Chicago, and it went amazingly well. That feels like an undersell actually; it went remarkably well. There were 25 people, and apparently there was quite the waitlist, but I was really happy with the set of people who were in the room. I haven't talked about Guix in front of an audience before and I was afraid it would be a dud, but it's hard to explain the reaction I got. It felt like there was a general consensus in the room: Guix is taking the right approach to things.

I didn't expect this! I know some extremely solid people who were in the audience, and some of them are working with other deployment technologies, so I expected at some point to be told "you are wrong", but that moment didn't come. Instead, I met a large amount of enthusiasm for the subject, a lot of curious questions, and well... there was some criticism of the talk, though it mostly came down to presentation style and approach. Also, I had promised to give two talks, one about federation and one about Guix, but instead I just gave the latter and splatted over the former's time. Though people seemed to enjoy themselves enough that I was asked to come back again and give the federation talk as well.

Before coming to this talk, I had wondered whether I had gone totally off the yaks in heading in this direction, but giving this talk was worth it in at least that the community reaction has been a huge confidence booster. It's worth pursuing!

So, here are some things that came out of the talk for me:

  • The talk was clear, and generally people said that though I went into the subject to quite some depth, things were well understood and unambiguous to them.
  • This is important, because it means that once people understood the details of what I was saying, it gave them a better opportunity to evaluate whether it was true or not... and so the general sense of the room that this is the right approach was reassuring.
  • A two tier strategy for pushing early adoption with "practical developers" probably makes sense:
    • Developers seem really excited about the "universal virtualenv" aspect of Guix (using "guix environment") and this is probably a good feature to start gaining adoption.
    • Getting GuixOps working and solidly demonstrable
  • The talk was too long. I think everything I said was useful, but I literally filled two talk slots. There are some obvious things that can be cut or reduced from the talk.
  • In a certain sense, this is also because the talk was not one, but multiple talks. Each of these items could be cut to a brief slide or two and then expanded into its own talk:
    • An intro to functional programming. I'm glad to see that this intro was very clear, and though concise, it could be reduced within the scope of this talk to two quick slides rather than four with code examples.
    • An "Intro to Guile"
    • Lisp history, community, and its need for outreach and diversity
    • "Getting over your parenthesis phobia"
  • I simply unfolded an orgmode tree while presenting the talk, and while this made things easy on me, it's not very interesting for most audience members (though my friend Aeva clearly enjoyed it)

Additionally, upon hearing my talk, my friend Karl Fogel seemed quite convinced about Guix's direction (and there's few people whose analysis I'd rate higher). He observed that Guix's fundamentals seem solid, but that what it probably needs is institutional adoption at this point to bring it to the next level, and he's probably right. He also pointed out that it's not too much for an organization to invest themselves in Guix at this point, considering that developers are using way less stable software than Guix to do deployments. He suggested I try to give this talk at various companies, which could be interesting... well, maybe you'll hear more about this in the future. Maybe as a trial run I should submit some podcast episodes to Hacker Public Radio or something!

Anyway, starting next week I'm putting my words to action and working on doing actual deployments using Guix. Now that'll be interesting to write about! So stay tuned, I guess!

PS: You can download the orgmode file of the talk or peruse the html rendered version or even better check out my talks repo!

Userops Acid Test v0.1

By Chris Lemmer-Webber on Sun 27 September 2015

Hello all!

So a while ago we started talking about this userops thing. Basically, the idea is "deployment for the people", focusing on user computing / networking freedom (in contrast to "devops", benefits to large institutions are sure to come as a side effect, but are not the primary focus). There's kind of a loose community surrounding the term now, and a number of us are working towards solutions. But I think something that has been missing for me at least is something to test against. Everyone in the group wants to make deployment easier. But what does that mean?

This is an attempt to sketch out requirements. Keep in mind that I'm writing out this draft on my own, so it might be that I'm missing some things. And of course, some of this can be interpreted in multiple ways. But it seems to me that if we want to make running servers something for mere mortals to do for themselves, their friends, and their families, these are some of the things that are needed:

  1. Free as in Freedom:

    I think this one's a given. If your solution isn't free and open source software, there's no way it can deliver proper network freedoms. I feel like this goes without saying, but it's not considered a requirement in the "devops" world... but the focus is different there. We're aiming to liberate users, so your software solution should of course itself start with a foundation of freedom.

  2. Reproducible:

    It's important to users that they be able to have the same system produced over and over again. This is important for experimenting with a setup before deployment, for ensuring that issues are reproducible, and for friends and communities helping each other debug problems when they run into them. It's also important for security; you should be able to be sure that the system you have described is the system you are really running, and that if someone has compromised your system, that you are able to rebuild it. And you shouldn't be relying on someone to build a binary version of your system for you, unless there's a way to rebuild that binary version yourself and you have a way to be sure that this binary version corresponds to the system's description and source. (Use the source, Luke!)

    Nonetheless, I've noticed that when people talk about reproducibility, they sometimes are talking about two distinct but highly related things.

    1. Reproducible packages:

      The ability to compile from source any given package in a distribution, and to have clear methods and procedures to do so. While this has been a given in the free software world for a long time, there's been a trend in the devops-type world towards a determination that packaging and deployment in modern languages has gotten too complex, so simply rely on some binary deployment. For reasons described above and more, you should be able to rebuild your packages... *and* all of your packages' dependencies... and their dependencies too. If you can't do this, it's not really reproducible.

      An even better goal is to guarantee not only that packages can be built, but that they are byte-for-byte identical to each other when built upon all their previous dependencies on the same architecture. The Debian Reproducibility Project is a clear example of this principle in action.

    2. Reproducible systems:

      Take the package reproducibility description above, and apply it to a whole system. It should be possible to, one way or another, either describe or keep record of (or even better, both) the system that is to be built, and rebuild it again. Given selected packages, configuration files, and anything else that is not "user data" (which is addressed in the next section), it should be possible to have the very same system that existed previously.

      As with many things on this list, this is somewhat of a gradient. But one extrapolation, if taken far enough, I believe is a useful one (and ties in with the "recoverable system" part): systems should not be necessarily dependent upon the date and time they are deployed. That is to say, if I deployed a system yesterday, I should be able to redeploy that same system today on an entirely new system using all the packages that were installed yesterday, even if my distribution now has newer, updated packages. It should be possible for a system to be reproducible towards any state, no matter what fixed point in time we were originally referring to.

  3. Recoverable:

    Few things are more stressful than having a computer that works, is doing something important for you, and then something happens... and suddenly it doesn't, and you can't get back to the point where your computer was working anymore. Maybe you even lost important data!

    If something goes wrong, it should be possible to set things right again. A good userops system should do this. There are two domains to which this applies:

    1. Recoverable data:

      In other words, backups. Anything that's special, mutable data that the user wants to keep fits in this territory. As much as possible, a userops system should seek to make running backups easy. Identifying based on system configuration which files to copy and helping to provide this information to a backup system, or simply only leaving all mutable user data in an easy-to-back-up location would save users from having to determine what to back up on their own, which can be easily overwhelming and error-prone for an individual.

      Some data (such as data in many SQL databases) is a bit more complex than just copying over files. For something like this, it would be best if a system could help with setting up this data to be moved to a more appropriate backup serialization.

    2. Recoverable system:

      Linking somewhat to the "reproducible system" concept, a user should be able to upgrade without fear of becoming stuck. Upgrade paralysis is something I know I and many others have experienced. Sometimes it even appears that an upgrade will go totally fine, and you may have tested carefully to make sure it will, but you do an upgrade, and suddenly things are broken. The state of the system has moved to a point where you can't get back! This is a problem.

      If a user experiences a problem in upgrading their system software and configuration, they should have a good means of rolling back. I believe this will remove much of the anxiety out of server administration especially for smaller scale deployments... I know it would for me.

  4. Friendly GUI

    It should be possible to install the system via a friendly GUI. This probably should be optional; there may be lower level interfaces to the deployment system that some users would prefer to use. But many things probably can be done from a GUI, and thus should be able to be.

    Many aspects of configuring a system require filling in shared data between components; a system should generally follow a Don't Repeat Yourself type philosophy. A web application may require the configuration details of a mail transfer agent, and the web application may also need to provide its own details to a web server such as Nginx or Apache. Users should have to fill in these details in one place each, and they should propagate configuration to the other components of the system.

  5. Scriptable

    Not everyone should have to work with this layer directly, but everyone benefits from scriptability. Having your system be scriptable means that users can properly build interfaces on top of your system and additional components that extend it in directions you may not be able to pursue yourself. For example, you might not have to build a web interface yourself; if your system exposes its internals in a language capable enough of building web applications, someone else can do that for you. Similarly with provisioning, etc.

    Working with the previous section, bonus points if the GUI can "guide users" into learning how to work with more lower level components; the Blender UI is a good example of this, with most users being artists who are not programmers, but hovering over user interface elements exposes their Python equivalents, and so many artists do not start out as developers, but become so in working to extend the program for their needs bit by bit. (Emacs has similar behavior, but is already built for developers, so is not as good of an example.) "Self Extensibility" is another way to look at this.

  6. Collaboration friendly:

    Though many individuals will be deploying on their own, many server deployments are set up to serve a community. It should be possible for users to help each other collaborate on deployment. This may mean a variety of things, from being able to collaborate on configuration, to having an easy means to reproduce a system locally.

    Additionally, many deployments share steps. Users should be able to help each other out and share "recipes" of deployment steps. The most minimalist (and least useful) version of this is something akin to snippet sharing on a wiki. Most likely, wikis already exist, so more interestingly, it should be possible to share deployment strategies via code that is proceduralized in some form. As such, in an ideal form, deployment recipes should be made available similar to how packages are in present distributions, with the appropriate slots left open for customization for a particular deployment.

  7. Fleet manageable:

    Many users have not one but many computers to take care of these days. Keeping so many systems up to date can be very hard; being able to do so for many systems at once (especially if your system allows them to share configuration components) can help a user actually keep on track of things and lead to less neglected systems.

    There may be different sets, or "fleets", of computers to take care of... you may find that a user discovers that she needs to both take care of a set of computers for her (and maybe her loved ones') personal use, but she also has servers to take care of for a hobby project, and another set of servers for work.

    Not all users require this, and perhaps this can be provided on another layer via some other scripting. But enough users are in "maintenance overload" of keeping track of too many computers that this should probably be provided.

  8. Secure

    One of the most important and yet open ended requirements, proper security is critical. Security decisions usually involve tradeoffs, so what security decisions are made is left somewhat open ended, but there should be a focus on security within your system. Most importantly, good security hygiene should be made easy for your users, ideally as easy or easier than not following good hygiene.

    Particular areas of interest include: encrypted communication, preferring or enforcing key based authentication over passwords, isolating and sandboxing applications.
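
As a rough illustration of the "byte-for-byte identical" goal from the reproducible packages item above, here is a minimal Python sketch that compares two build output trees by hashing them. The function names and the idea of pointing it at two build directories are made up for illustration and are not taken from Guix, Debian, or any other tool. It only looks at relative file paths and file contents, so it ignores permissions, timestamps, and other metadata that a real reproducibility check would also care about.

```python
import hashlib
from pathlib import Path


def tree_digest(root: Path) -> str:
    """Return one SHA-256 digest covering every regular file under root.

    Files are visited in sorted order so the result does not depend on
    the order in which the filesystem happens to list them.
    """
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        relative_name = path.relative_to(root).as_posix().encode()
        content_hash = hashlib.sha256(path.read_bytes()).digest()
        # The name, a NUL separator, then a fixed-size content hash keeps
        # the encoding of each entry unambiguous.
        digest.update(relative_name + b"\0" + content_hash)
    return digest.hexdigest()


def builds_match(first_build: Path, second_build: Path) -> bool:
    """True if both build trees contain the same paths with the same bytes."""
    return tree_digest(first_build) == tree_digest(second_build)
```

If two independent builds of the same package from the same sources give `builds_match(...) == False`, something in the build (a timestamp, a hostname, file ordering) is leaking into the output.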

To my knowledge, at this time no system provides all the features above in a way that is usable for many everyday users. (I've also left some ambiguity in how to achieve these properties above, so in a sense, this is not a pass/fail type test, but rather a set of properties to measure a system against.) In an ideal future, more Userops type systems will provide the above properties, and ideally not all users will have to think too much about their benefits. (Though it's always great to give the opportunity to users who are interested in thinking about these things!) In the meanwhile, I hope this document will provide a useful basis for implementers to think about and to map their implementations against!

Why is it hard to move from one machine to another? An analysis. [x-post from Userops]

By Chris Lemmer-Webber on Wed 08 April 2015

NOTE: This is a shameless cross-post of something I originally sent to the userops list, where we discuss deployment things.

Hello all,

For a while I've been considering, why is it so much harder for me to migrate from server to server than it is for me to migrate from desktop to desktop? For years, ever since I discovered rsync, migrating between machines has not been hard. I simply rsync my home directory over to the new machine (or maybe even just keep the old /home/ directory's partition where it is!) and bam, I am done. Backing this up is easy; it's just another rsync away. (I use dirvish as a simple wrapper around rsync so it can manage incremental backups.)

If I set up a new machine, it is no worry. Even if my current machine dies, it is mostly no worry. Rsync back my home directory, and done. I will spend a week or so discovering that certain programs I rely on are not there, and I'll install them one by one. In a way it's refreshing: I can install the programs I need, and the old cruft is gone!

This is not true for servers. At the back of my mind I realized this, but until the end of Stefano Zacchiroli's excellent LibrePlanet talk when I posed a question surrounding this situation, it hadn't totally congealed in my head: why is it so much harder for me to move from server to server? Assume I even have the old server around and I want to move. It isn't easy!

So here are some thoughts that come out of this:

  • For my user on my workstation, configuration and data are in the same place: /home/ (including lots of little dotfiles for the configs, and the rest is mostly data). Sure, there's some configuration stuff in /etc/ and data in /var/ but it mostly doesn't really matter, and copying that between machines is not hard.
  • Similarly, for my user on workstation experience, it is very little stress if I set up a machine and am missing some common packages. I can just install them again as I find them missing.
  • Neither of these are true for my server! In addition to caring about /home/, and even more importantly, I have to worry about configuration in /etc/ and data in /var/. These are both pains in the butt for me for different reasons.
  • Lots of stuff in /etc/ is configuration that interacts with the rest of the system in specific ways. I could rsync it to a new machine, but I feel like that's just blindly copying over stuff where I really need to know how it works and how it was set up with the rest of the machine in the first place.
  • This is compounded by the fact that people rarely set up one machine these days; usually they have to set up several machines for several users. Remembering how all that stuff worked is hard. The only solution seems to be to have some sort of reproducible configuration system. Hence the rise of salt, ansible, etc. But these aren't really "userops" systems, they're "devops"... developer focused. Not only do you need to know how they work, you need to know how the rest of the system works. And it's not easy to share that knowledge.
  • /var/ is another matter. Theoretically, most of my program data goes there (unless, of course, it went to /srv/, god help us). But I can't just rsync it! There are some processes that are very persnickety about the stuff there. I have to dump my databases and etc before I can move them or back up. Nothing sets up an automatic cronjob for me on these, I have to know to dump postgres. Hopefully I set up a cronjob!
  • While I as a workstation user don't stress too much if I'm missing some packages (just install them as I go), that is NOT true of my servers. If my mail servers aren't running, if jabber isn't on, (if SSH isn't running!!!), there are other servers expecting to communicate with my machine, and if I don't set them up, I miss out.
  • Not only this, assuming I have moved between servers correctly, even once I have set up my machine and it has become a perfectly okay running special snowflake, there are certain routine tasks that require a lot of manual intervention, and I have only picked up the right steps by knowing the right friends, having run across the right tutorials which hopefully have shown me the right setup, etc. SSL configuration, I'm looking at you; the only savior that I have is that I have written myself my own little orgmode notes on what to do the next time my certs expire.
  • My servers do become special snowflakes, and that is very stressful to me. I will, in the future, need to set up one more server, and remembering what I did in the past will be very hard.
  • Assuming I use all the mainstream tools, not talking about "upcoming" ones, a better configuration management solution is probably the answer, right? That's a lot to ask users though: it's not a solution to existing deployment, because it doesn't remove the need to know about all the layers underneath, it just adds a new layer to understand.
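
To make the /var/ headache above a bit more concrete, here is a minimal Python sketch of the "dump first, then copy" routine that nothing sets up for you automatically. It assumes `pg_dump` and `rsync` are installed; the database names, dump directory, and destination are placeholders you would supply yourself, and the helper names are invented for this example. It is a sketch of the idea, not a complete backup tool.

```python
import subprocess
from pathlib import Path


def dump_command(database: str, dump_dir: Path) -> list[str]:
    """Build the pg_dump call that writes one database to a plain SQL file."""
    return ["pg_dump", f"--file={dump_dir / (database + '.sql')}", database]


def sync_command(source: Path, destination: str) -> list[str]:
    """Build the rsync call; the trailing slash copies the directory contents."""
    return ["rsync", "-a", f"{source}/", destination]


def backup(databases, dump_dir: Path, destination: str, run=subprocess.run):
    """Dump each database, then rsync the dumps to the destination.

    `run` is injectable so the sequence can be tried out without touching
    a real database or network.
    """
    dump_dir.mkdir(parents=True, exist_ok=True)
    for database in databases:
        # check=True stops the backup if a dump fails, rather than
        # quietly syncing a stale or partial file.
        run(dump_command(database, dump_dir), check=True)
    run(sync_command(dump_dir, destination), check=True)
```

Something like this would still need to be scheduled (a cronjob, for instance), which is exactly the kind of step that is easy to forget.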

Those are all headaches, and they are not the only headaches. But here are some thoughts on things that can help:

  • If I recognize which parts of my system are "immutable" and which parts are "mutable", it's easier to frame how my system works.

    • /var/ is mutable, it's data. There's no making this "reproducible" really: it needs to be backed up and moved around.
    • My packages and system are immutable, or mostly should be. Even if not using a perfectly immutable system like guix/nix, it's helpful to act like this part of the system is pseudo-immutable, and simply derived from some listing of things I said I wanted installed. (Nix/Guix probably do this the most nicely though.)
    • /etc/ is similarly "immutable but derived" in the best case. I should be able to give the same system configuration inputs and always get the same system of packages and configuration files.
  • I like Guix/Nix, but my usage of Debian and Fedora and friends is not going away anytime soon. Nonetheless, configuration management systems like puppet/ansible/salt help give the illusion of an immutable system derived from a set of inputs, even though they are working within a mutable one.

  • Language packaging for deployment needs to die. Yes, I say this as someone on a project that advocates that very route. We're doing it wrong, and I want to change it. (Language packaging for development though: that's great!)

  • Asking people to use systems like ansible/salt/puppet is asking users too much. You're just asking them to learn one more layer on top of knowing how the whole system works. Sharing common code is mostly copy and paste. There are some layers built on top of here to mitigate this but afaict they aren't really good, not good enough. (I am working on something to solve this...)

  • Pre-built containers are not the solution. Sorry container people! Containers can be really useful but only if they are built in some reproducible way. But very few people using Docker and etc seem to be doing this. But here's another thing: Docker and friends contain their own deployment domain specific languages, which is dumb. If a reproducible configuration system is good enough, it should be good enough for a VM or a container or a vanilla server or a desktop. So maybe we can use containers as lightweight and even sandboxed VMs, but we shouldn't be installing prebuilt containers on our servers alone as a system.

    Otherwise you're running 80 heavy and expensive Docker images that slowly go out of date... now you're not maintaining 1 distribution install, you're maintaining 81 of them. Yikes! Good luck with the next Shellshock!

  • Before Asheesh jumps in here: yes I will say that Sandstorm is taking maybe the best route in terms of a system that uses containers heavily (and unlike Docker, they seem actually sandboxed) in that it seems to have a separation between mutable parts and immutable parts: the container is more or less an immutable machine from what I can tell that has /var/ mounted into it, which is a pretty good route.

    In this sense I think Sandstorm has a good picture of things. There are other things that I am still very unsure about, and Asheesh knows because I have expressed them to him (I sure hope that iframe thing goes away, and that daemons like Celery can run, and etc!) but at least in this sense, Sandstorm's container story is more sane.
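
The "immutable but derived" idea for /etc/ from the list above can be sketched in a few lines of Python: one small set of inputs, and a pure function that renders every config file from it. The input keys, file paths, and file contents here are all invented for illustration and do not correspond to how Guix, Nix, or any configuration management tool actually does it.

```python
def render_configs(inputs: dict) -> dict[str, str]:
    """Derive all config files from one set of inputs.

    This is a pure function: the same inputs always give the same files,
    and each detail (domain, port) is written down in exactly one place.
    """
    domain = inputs["domain"]
    port = inputs["app_port"]
    return {
        # Hypothetical web server config that proxies to the application.
        "etc/nginx/sites-enabled/app.conf": (
            "server {\n"
            f"    server_name {domain};\n"
            f"    location / {{ proxy_pass http://127.0.0.1:{port}; }}\n"
            "}\n"
        ),
        # Hypothetical application config that reuses the same values.
        "etc/app/config.ini": (
            "[app]\n"
            f"port = {port}\n"
            f"base_url = https://{domain}/\n"
        ),
    }


configs = render_configs({"domain": "example.org", "app_port": 6543})
```

Moving to a new machine then means carrying the inputs along and re-rendering, rather than blindly rsyncing /etc/ and hoping the pieces still fit together.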

So there are some reflections in case you are planning on debugging why these things are hard.

-- Chris

PS: If you haven't gotten the sense, the direction I'm thinking of is more along the lines of Guix becoming our Glorious Future (TM) assuming something like GuixOps can happen (go Dave Thompson, go Guix crew!) and a web UI can be built on top of it with some sort of common recipe system.

But I don't think our imperative systems like Debian are going away anytime soon; I certainly don't intend to move all my stuff over to Guix at this time. For that reason, I think there needs to be another program to fit the middle ground: something like salt/ansible/puppet, but with less insane one-off domain specific languages, with a sharable recipe system, and scalable both from developer-oriented scripts but also having a user-friendly web interface. I've begun working on this tool, and it's called Opstimal. Expect to hear more about it soon.