Posts with tag "python"

Racket is an acceptable Python

By Christopher Lemmer Webber on Tue 09 July 2019

A little over a decade ago, there were some popular blogposts about whether Ruby was an acceptable Lisp or whether even Lisp was an acceptable Lisp. Peter Norvig was also writing at the time introducing Python to Lisp programmers. Lisp, those in the know knew, was the right thing to strive for, and yet seemed unattainable for anything aimed at production since the AI Winter shattered Lisp's popularity in the 80s/early 90s. If you can't get Lisp, what's the closest thing you can get?

This was around the time I was starting to program; I had spent some time configuring my editor with Emacs Lisp and loved every moment I got to do it; I read some Lisp books and longed for more. And yet when I tried to "get things done" in the language, I just couldn't make as much headway as I could with my preferred language for practical projects at the time: Python.

Python was great... mostly. It was easy to read, it was easy to write, it was easy-ish to teach to newcomers. (Python's intro material is better than most, but my spouse has talked before about some major pitfalls that the Python documentation has which make getting started unnecessarily hard. You can hear her talk about that in this talk we co-presented at last year's RacketCon.) I ran a large free software project on a Python codebase, and it was easy to get new contributors; the barrier to entry to becoming a programmer with Python was low. I consider that to be a feature, and it certainly helped me bootstrap my career.

Most importantly of all though, Python was easy to pick up and run with because no matter what you wanted to do, either the tools came built in or the Python ecosystem had enough of the pieces nearby that building what you wanted was usually fairly trivial.

But Python has its limitations, and I always longed for a lisp. For a brief time, I thought I could get there by contributing to the Hy project, which was a lisp that transformed itself into the Python AST. "Why write Python in a syntax that's easy to read when you could add a bunch of parentheses to it instead?" I would joke when I talked about it. Believe it or not though, I do consider lisps easier to read, once you are comfortable with their syntax. I certainly find them easier to write and modify. And I longed for the metaprogramming aspects of Lisp.

Alas, Hy didn't really reach my dream. That macro expansion made debugging a nightmare as Hy would lose track of where the line numbers are; it wasn't until then that I really realized that without line numbers, you're just lost in terms of debugging in Python-land. That and Python didn't really have the right primitives; immutable datastructures for whatever reason never became first class, meaning that functional programming was hard, "cons" didn't really exist (actually this doesn't matter as much as people might think), recursive programming isn't really as possible without tail call elimination, etc etc etc.

But I missed parentheses. I longed for parentheses. I dreamed in parentheses. I'm not kidding, the only dreams I've ever had in code were in lisp, and it's happened multiple times, programs unfolding before me. The structure of lisp makes the flow of code so clear, and there's simply nothing like the comfort of developing in front of a lisp REPL.

Yet to choose to use a lisp seemed to mean opening myself up to eternal yak-shaving of developing packages that were already available on the Python Package Index or limiting my development community to an elite group of Emacs users. When I was in Python, I longed for the beauty of a Lisp; when I was in a Lisp, I longed for the ease of Python.

All this changed when I discovered Racket:

  • Racket comes with a full-featured editor named DrRacket built-in that's damn nice to use. It has all the features that make lisp hacking comfortable, which were previously available mostly only to Emacs users: parenthesis balancing, comfortable REPL integration, etc etc. But if you want to use Emacs, you can use racket-mode. Win-win.
  • Racket has intentionally been built as an educational language, not unlike Python. One of the core audiences of Racket is middle schoolers, and it even comes with a built-in game engine for kids. (The How to Design Programs prologue might give you an introductory taste, and Realm of Racket is a good book all about learning to program by building Racket games.)
  • My spouse and I even taught classes about how to learn to program for humanities academics using Racket. We found the age-old belief that "lisp syntax is just too hard" is simply false; the main thing that most people lack is decent lisp-friendly tooling with a low barrier to entry, and DrRacket provides that. The only people who were afraid of the parentheses turned out to be people who already knew how to program. Those who didn't even praised the syntax for its clarity and the way the editor could help show you when you made a syntax error (DrRacket is very good at that). "Lisp is too hard to learn" is a lie; if middle schoolers can learn it, so can more seasoned programmers.
  • Racket might even be more batteries included than Python. At least all the batteries that come included are generally nicer; Racket's GUI library is the only time I've ever had fun in my life writing GUI programs (and they're cross platform too). Constructing pictures with its pict library is a delight. Plotting graphs with plot is an incredible experience. Writing documentation with Scribble is the best non-org-mode experience I've ever had, but has the advantage over org-mode in that your document is just inverted code. I could go on. And these are just some packages bundled with Racket; the Package repository contains much more.
  • Racket's documentation is, in my experience, unparalleled. The Racket Guide walks you through all the key concepts, and the Racket Reference has everything else you need.
  • The tutorials are also wonderful; the introductory tutorial gets your feet wet not through composing numbers or strings but by building up pictures. Want to learn more? The next two tutorials show you how to build web applications and then build your own web server.
  • Like Python, even though Racket has its roots in education, it is more than ready for serious practical use. These days, when I want to build something and get it done quickly and efficiently, I reach for Racket first.

Racket is a great Lisp, but it's also an acceptable Python. Sometimes you really can have it all.

Why XUDD is stuck (or: why Python needs better immutable structures)

By Christopher Lemmer Webber on Mon 23 February 2015

Update: Well, you post a thing, and sometimes that's enough for people to come and help you realize how wrong you are. Which is good! There are a number of ways forward (some obvious in retrospect). For one, pyrsistent does exist and looks nice and... well it's even actively developed. But even aside from that, there are several clean solutions: wrapper objects which "lock" the child object with getters but no setters, or even just using alist style tuples of tuples for a fake hashmap. Options do indeed abound.
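
As a rough sketch of two of those options, using only the standard library: an alist-style tuple of tuples standing in for a hashmap, and a read-only wrapper built on `types.MappingProxyType`. The helper names `alist_get`, `alist_set`, and `lock` are made up for illustration and are not part of XUDD.

```python
from types import MappingProxyType


def alist_get(alist, key, default=None):
    """Look up key in a tuple of (key, value) pairs."""
    for k, v in alist:
        if k == key:
            return v
    return default


def alist_set(alist, key, value):
    """Return a NEW alist with key set; the original is left untouched."""
    return tuple((k, v) for k, v in alist if k != key) + ((key, value),)


def lock(mapping):
    """Read-only view over a private copy of a dict (getters, no setters)."""
    return MappingProxyType(dict(mapping))


message = (("command", "bake_cookie"), ("count", 3))
updated = alist_set(message, "count", 4)     # message itself is unchanged
locked = lock({"command": "bake_cookie"})    # locked["command"] = ... raises TypeError
```

Both approaches only protect the top level: a list stored as a value inside either structure is still mutable, so nested data would need the same treatment all the way down.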

And the exception thing? Well, that wasn't listed as a permanent problem below, but the solution is even easier! It would be simple to have a MessageError("some_identifier") which has a minimalist identifier which can be passed across the wire, and the directive of this error can be a special case.
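
One way such an error might look, as a minimal sketch: the exception carries nothing but a short identifier, so it can be rebuilt on the other side without importing arbitrary exception classes. The JSON wire format and the `error_to_wire` / `error_from_wire` helpers are assumptions for illustration, not XUDD's actual design.

```python
import json


class MessageError(Exception):
    """Error carrying only a short identifier that is safe to serialize."""

    def __init__(self, identifier):
        super().__init__(identifier)
        self.identifier = identifier


def error_to_wire(exc):
    # Hypothetical wire format: a JSON object holding just the identifier.
    return json.dumps({"error": exc.identifier})


def error_from_wire(payload):
    # Rebuild an equivalent error on the receiving side.
    return MessageError(json.loads(payload)["error"])


wire = error_to_wire(MessageError("some_identifier"))
restored = error_from_wire(wire)
```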

Anyway, you can read the original post below. But it's good to be wrong. XUDD no longer has reason to be dead. Long live XUDD!

tl;dr: Kind of a long post, but basically the lack of good functional data structures in Python has kinda killed the project.

One of my favorite projects ever has been to work on XUDD, an asynchronous actor model system for Python. It was originally born out of a quest to build a MUD (hence the name), but eventually the actor model itself became the focus, and it was a really interesting exploration for me.

There's a lot of things I like about the actor model... for one thing, functional programming is all the rage, right? But not all systems are easy to express in a purely functional style, and an actor model can be fairly object oriented'y, but done right, you can have your mutable cake and eat it too, safely! Your actor can mutate some of its own variables, but when it communicates across the wire, since it's a "shared nothing" environment you can get even better scale-to-the-moon type functionality than in many functional language systems: it's trivial to write code where actors communicate across multiple processes or machines in just the same way as if they were all on the same machine in the same process and thread.

I put a lot of thought into XUDD, and I've looked into some of the alternatives like Pykka, and I still think XUDD has some ideas that kick butt on other systems. I still think the use of coroutines feels very clean and easy to read, the "hive" model is pretty nice, and the way it's built on top of the awesome asyncio system for Python are all things I'm happy with.

So, every once in a while I get an email from someone who reads the XUDD documentation and also gets excited and asks me what's going on.

The sad reality is: I'm stuck. I'm stuck on two fronts, and one I can figure my way out of, but the other one doesn't seem easy to deal with in Python as-is.

The first issue is of error propagation. This is solvable, but when an exception is raised, it would be nice to propagate this back to the original actor. There are some side issues I'm not sure about: in an inter-hive-communication (read: multiple machines or processes) type scenario, should we use standard exceptions and try to import and reproduce the same exception that was raised elsewhere? That seems like it could be... gnarly to do. Raising the error inside the original routine is also a bit tricky, but not too hard; Python's coroutines can support it and I just need to think about it. So exceptions are annoying but solvable.
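
For the "raising the error inside the original routine" part, generator-based coroutines have a `throw()` method that raises an exception at the point where the coroutine is suspended. Here is a small sketch of the idea; `actor_handler` is a hypothetical stand-in, not XUDD code.

```python
def actor_handler():
    """Generator-based coroutine that waits for a reply to a message."""
    try:
        reply = yield "request sent"
        yield "got reply: %s" % reply
    except RuntimeError as exc:
        # The failure from the other actor surfaces at the suspended yield.
        yield "handled remote failure: %s" % exc


handler = actor_handler()
first = next(handler)    # run up to the first yield, then suspend
# Instead of sending a reply, inject the other actor's failure:
outcome = handler.throw(RuntimeError("database actor crashed"))
```

The routine gets to handle the failure with an ordinary try/except, as if the call had been synchronous.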

But the other issue... I'm not sure what to do about it. Basically, it's an issue of a lack of immutable types, or we might even say "purely functional datastructures" that are robust enough to continue with. Why does that matter? Messages sent between actors shouldn't have any mutable data. It's fine and well and even a nice feature for actors to be able to have mutable data within themselves, and it actually even provides a nice way to pull off things that are just damned hard in purely functional systems, but between actors, mutable data is a no-no.

It's easy to see why: say we have a function that has a list in it, say of a number of children in my classroom, and I send this list over to an actor that controls some sort of database, right? I'm doing things in a nice, fancy coroutine'ish type way, which means my function can just suspend mid-execution while it waits for that database actor to generate some sort of reply and send it back to me. What happens if that other function pops one of the items off the list, or appends to it, or in some way actually mutates the list? Now, when my function continues, it'll be operating on a differently formed list than the one it thinks it has. I might have a reference to the third item in the list, but it turns out that there isn't a third item in the list anymore, because the other function popped it off. This can introduce all sorts of subtle bugs, and it's bumming me out that I don't have a good solution to them.
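
The scenario above can be reproduced in a few lines. This is only an illustration with made-up names (`classroom`, `database_actor`), using a plain generator to stand in for the suspended routine:

```python
def database_actor(children):
    # A badly behaved receiver: it mutates the list it was handed.
    children.pop()
    return len(children)


def classroom():
    children = ["Ada", "Grace", "Alan"]
    third = 2                      # index we expect to stay valid
    yield children                 # "send" the list and suspend
    # Resumed later: the list may no longer be what we think it is.
    yield children[third] if third < len(children) else None


routine = classroom()
sent = next(routine)       # the message is the same list object, not a copy
database_actor(sent)       # the receiver runs while we are suspended
after = next(routine)      # the third child is gone
```

Had `children` been a tuple, the `pop()` call would have failed with an AttributeError in the receiver instead of silently corrupting the sender's state.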

Now, there's a way around this: you can serialize each and every message completely before sending it to another actor. And if actors are on different threads, processes, or even entirely different machines, of course we'd do this. But XUDD has the concept of actors being on the same hive, and there are a number of reasons for this, but one of them is that for local message passing, packing and unpacking data in some sort of serialized format for every call slows things down by a lot. When I originally began designing XUDD, the plan was for games that might need to shard out to a number of different servers but have players that can traverse different parts of the system and communicate with other shards (without the players, or for the most part even the code, knowing that it's communicating with other actors that are technically remote). I want to be able to pass many messages at once to actors that are on the same hive, while still having a totally safe time of doing so. But there's no way to do so without a nice set of immutable / "purely functional" types, and Python just doesn't have this right now. None of the third party libraries I've found seem well maintained (am I missing something?), and the standard library is fairly deficient here. Why? I'm not really sure. I guess Python's history is just synchronously imperative enough that it just hasn't mattered.
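
To make the tradeoff concrete, here is a minimal sketch comparing the two ways of delivering a local message. The function names are hypothetical and JSON is just one possible serialization format:

```python
import json


def send_by_reference(message):
    # Fast, but sender and receiver end up sharing the same objects.
    return message


def send_serialized(message):
    # Safe: the receiver gets an independent copy, paid for on every send.
    return json.loads(json.dumps(message))


original = {"children": ["Ada", "Grace", "Alan"]}
shared = send_by_reference(original)
shared["children"].pop()
after_shared = list(original["children"])    # the sender sees the mutation

original = {"children": ["Ada", "Grace", "Alan"]}
copied = send_serialized(original)
copied["children"].pop()
after_copied = list(original["children"])    # the sender's data is intact
```

With immutable structures, the by-reference path would be just as safe as the serialized one without the encode and decode work on each message.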

I'd like to continue research into the actor model... I have some projects I'd like to work on where the actor model seems perfectly tuned to those tasks. What to do?

Well, I'm not really sure... I guess I could just serialize everything all the time, but it's kind of a bummer to me that so many cycles would be wasted for local computation. Maybe it's a dumb reason to feel exhausted with things, but that's the state of it. I'm not enough of a datastructure wizard to implement these things myself, but they exist. I've thought about giving up on XUDD being a Python project and moving over to something else... Guile has a cooperative REPL which would be great for debugging, and I really like the community there, so maybe that would be a nice place to go. Not really sure there's anything else I'm interested enough in at the moment. I think I'd miss Python. Or maybe I'm over-thinking everything in the first place? (Wouldn't be the first time.)

Maybe there's another way out. If you have any ideas, contact me.

How Hy backported "yield from" to Python 2

By Christopher Lemmer Webber on Thu 20 November 2014

Hello everyone! Time for a bit of a diversion towards one of my favorite projects, Hy! (I'm an occasional committer, but the main mastermind behind the project is my good friend Paul Tagliamonte.) For those of you who don't know, Hy is a Lisp that transforms into the Python AST. Even more fun: you can import .hy files in .py files and .py files in .hy files! Crazy!

Now, when many people hear that, they say, "Huh what, why on earth would you do such a thing?" The usual response is something like, "Because it's fun!" But today, dear readers, I am going to show you a real... dare I say practical reason for using Hy. Because a cool feature just landed in Hy: a backport of "yield from" to Python 2.

Let's back up a bit. First, you might not know what "yield from" is or why it's cool. Well, Python has this thing called coroutines which allow you to do cool things, including suspending and resuming functions, which it turns out is really great for writing asynchronous code (sure, just wake me back up when we get the next network stanza, eh buddy?). Once you provide the ability to nest together coroutines by "delegating to subgenerators" (what "yield from" does that "yield" does not), this stuff starts to get really powerful. This feature is so useful that it's the basis of maybe the world's coolest asynchronous programming environment, asyncio. Only one problem: "yield from" didn't exist until Python 3.3... which means you can't use it with Python 2. Bummer!
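
If you haven't seen it in action, here is a tiny Python 3.3+ sketch of what "delegating to a subgenerator" means. The function names are invented for the example:

```python
def get_next_message():
    """Subgenerator: yields while 'waiting', then returns a result."""
    yield "waiting for network"
    return "bake_cookie"


def bot():
    # "yield from" forwards the subgenerator's yields to our caller and
    # evaluates to the subgenerator's return value.
    command = yield from get_next_message()
    yield "received " + command


steps = list(bot())
```

A plain "yield" can hand a value out to the caller, but it can't run a whole nested generator and collect its return value like this.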

Or can you? Time for our second bit of context. Ever hear a Lisp programmer talk about something called a "macro"? No? Okay, think harder. Maybe it was in one of those conversations where you were talking about your favorite new your-pet-language feature, and the Lisp hacker was like, "Oh that's cute... yeah Lisp had that decades ago." And then you got really mad and brought up a bunch more features, and the Lisp hacker kept saying that Lisp had them before you were born, and "Oh yeah, and whatever features Lisp doesn't have, you can add really fast because Lisp has macros. You can basically program any feature with macros." Maybe you asked them, what the heck is a macro, and they said something inane like "it's a feature where you can program other features, or write code that writes code", but you barely remember, because at that point you just wanted to punch them in their smug little face. (And besides, you wondered, if you have higher order functions, isn't that enough?)

Well friends, today it is my face that you will want to do the punching to, because I'm about to show you how cool macros are and why having them makes Hy so awesome. But, after the face punching thing, you'll also thank me. (Also, please don't punch me in the face, this blogpost is not consent for face-punching.)

Enough with the talk. Time for examples! Let's look at some code. Say you have this Python 3.3+ code:

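# awesomelib is a made-up library. Note that `async` became a reserved
# keyword in Python 3.7, so this example only parses on Python 3.3 through 3.6.
from awesomelib import IrcBot, bake_cookie, async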

def irc_to_cookies(**connection_stuff):
    our_bot = IrcBot(**connection_stuff)
    yield from our_bot.open_connection()

    while True:
        message = yield from our_bot.get_next_message()

        if message.command == "bake_cookie":
            yield from async(bake_cookie())

Groovy. In our example above, we built a cookie baker that can be plugged into our awesomelib asynchronous network library and cookie baking pipeline system. (And of course, we wrote it in Hy, because we love Hy, and you can still run Hy code in vanilla Python.) We're feeling pretty good about this. We kind of wish we could run it in Python 2.X still, but Python 3.3+ is the future anyway, and no use worrying about the past really... right?

Our Hy example looks pretty similar:

(defn irc-to-cookies [&kwargs connection-stuff]
  (setv our-bot (apply IrcBot [] connection-stuff))
  (yield-from (.open-connection our-bot))

  (while True
    (setv message (yield-from (.get-next-message our-bot)))

    ;; If an irc user gives the "bake_cookie" command,
    ;; put a cookie in our ez_bake oven
    (if (= message.command "bake_cookie")
      (yield-from (async (bake-cookie))))))

But there's something magical... this code makes use of yield-from, which on Python 3.3+ Hy simply compiles to the real built-in "yield from" (or more accurately, ast.YieldFrom). But what about Python 2? Mere higher-order function magic can't save us here. We need a way to implement a new feature.

Except oh right, this is a lisp, and we have macros! So why not write a macro for yield-from?

And it turns out that's exactly what paultag did:

(if-python2
  (defmacro/g! yield-from [expr]
    `(do (import types)
         (setv ~g!iter (iter ~expr))
         (setv ~g!return nil)
         (setv ~g!message nil)
         (while true
           (try (if (isinstance ~g!iter types.GeneratorType)
                  (setv ~g!message (yield (.send ~g!iter ~g!message)))
                  (setv ~g!message (yield (next ~g!iter))))
           (catch [~g!e StopIteration]
             (do (setv ~g!return (if (hasattr ~g!e "value")
                                     (. ~g!e value)
                                     nil))
               (break)))))
           ~g!return))
  nil)

This simple macro above is an implementation of yield-from which works in Python 2. The macro is more or less a function that writes new code to be expanded in place... allowing us to use basic building blocks of the language to build more complex features. Since in Lisp, code is a very simple, manipulatable data structure (lists!), we can literally write out code that writes code without too much trouble. (There's some magic going on with the ` character above, called backquoting... but it's best to read a tutorial on macros if it's not clear to you how the backquote is building the list of code there.) Hey look... we just brought a feature back to the future... as long as we're writing code in Hy, we can do subgenerator delegation with yield-from. Cool! That sure makes coroutines a lot more useful to those of us living in the past.
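
For readers more comfortable in Python, here is a rough hand-written transliteration of what the macro expands to at each call site. It is a sketch, not Hy's actual output: the `_iter`, `_return`, and `_message` names stand in for the gensyms, and `inner` and `outer` are invented for the demonstration.

```python
import types


def inner():
    yield 1
    yield 2
    return "cookie"    # Python 3 only; the value rides on StopIteration.value


def outer():
    # Hand-written equivalent of: result = yield from inner()
    _iter = iter(inner())
    _return = None
    _message = None
    while True:
        try:
            if isinstance(_iter, types.GeneratorType):
                # Forward whatever our caller sent us into the subgenerator.
                _message = yield _iter.send(_message)
            else:
                _message = yield next(_iter)
        except StopIteration as e:
            # Python 2's StopIteration has no "value", hence the fallback.
            _return = getattr(e, "value", None)
            break
    result = _return
    yield ("done", result)


steps = list(outer())
```

Writing this loop out by hand at every delegation point is exactly the kind of boilerplate a macro can generate for you.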

So wait, does this mean you can now use asyncio with Python 2? Well, not quite... asyncio is written in normal Python syntax, which means that it's using "yield from", not our more versatile "yield-from", and the library itself isn't written to support Python 2.7. So, no. (But, if asyncio was written in Hy, we could, even though Python 2.7 doesn't have "yield from"!)

The real goal of this article isn't to convince you to start backporting features to Python 2.X via Hy, though. Really, Python 3 is the future, write Python 3 code! But the point here is to get you thinking about how having macros allows you to implement new features now! Why wait for the features of Python 4.X? In Hy you can have them now! (Or even start prototyping them today!)

And that's worth getting excited about. And once you realize that, it's a bit easier to understand where lispers are coming from when they nerd out about how cool macros are. (Even if you still want to punch us in the face.)

(PS: think this is pretty cool? Hy is a really welcoming community, and there's a lot of fun stuff to do! Learn about language implementation and learn about lisp in a paradoxically fun and pythonic environment! We'd love to have you join in hacking with us!)

Empathy for PHP + Shared Hosting (Which is Living in the Past, Dude)

By Christopher Lemmer Webber on Sun 30 March 2014

The blogpost I wrote yesterday about deployment generated quite a bit of discussion on the pumpiverse. Mike Linksvayer pointed out (and correctly) that "anti-PHP hate" is a poor excuse for why the rest of us are doing so badly, so I edited that bit out of my text.

After this though, maiki made a great series of posts, first asking "Should a homeless person be able to 'host' MediaGoblin?" and then talking about their own experiences. Go read it and then come back. It's well written and there's lots to think about. (Read the whole thread, in fact!) The sum of it though is that there's a large amount of tech privilege involved in installing a lot of modern web applications, but maiki posts their own experiences about why having access to free software with a lower barrier to entry was key to them making changes in their life, and ends with the phrase "aim lower". (By the way, maiki is actually a MediaGoblin community member and for a long time ran an instance.)

So, let's start out with the following set of assertions, of which I think maiki and I both agree:

  • Tech privilege is a big issue, and lowering the barrier to entry is critical.
  • PHP + shared hosting is probably the lowest barrier to entry we have, assuming your application falls within certain constraints. This is something PHP does right! (Hence the "empathy for PHP" above.)
  • "Modern" web applications written in Python, Ruby, Node, etc, all require too much tech privilege to run and maintain, and this is a problem.

So given all that, and given that I "fixed up" my previous post by removing the anti-PHP language, the title I chose for this blogpost probably seems pretty strange, or like it's undoing all that work. And it probably seems strange that given the above, I'll still argue that the choices around MediaGoblin were actively made to tackle tech privilege, and that tackling these issues head-on is critical, or free software network services will actually be in a worse place, especially in a tech privilege sense.

That's a lot to unpack, so let's step back.

I think there's an element of my discussion about web technology and even PHP that hasn't been well articulated, and that fault is my own... but it's hard to explain without going into detail. So first of all, apologies; I have been antagonistic towards PHP, and that's unfair to the language that currently powers some of the most important software on earth. That's lame of me, and I apologize.

So that's the empathy part of this title. Then, why would I include that line from my slides, that "PHP is Living in the past, Dude", in this blogpost? It seems to undo everything I'm writing. Well, I want to explain what I meant by the above language. It's not about "PHP sucks". And it does relate to free software's future, and also factors into conversations about tech privilege. (It's also misleading in that I do not mean that modern web applications can't be written in PHP, or that their communities will be bad for such a choice, but that PHP + shared hosting as a deployment solution assumes constraints insufficient for the network freedom future I think we want.)

Consider the move to GNOME 3, the subject of Bradley's "living in the past" blogpost: during the move to GNOME 3, there were really two tech privilege issues at stake. One is that actually you're requiring newer technology with OpenGL support, and that's a tech privilege issue for people who can't afford that newer technology. (If you volunteered at a FreeGeek center, you'd probably hear this complaint, for example.) But the other one is that GNOME 3 was also trying to make the desktop easier for people, and in a direction of usability that people expect these days. That's also a tech privilege issue, and actually closer to the one we're discussing now: if the barrier to entry is that things are too technical and too foreign to what users know and expect, you're still building a privilege divide. I think GNOME made the right decision on addressing privilege, and I think it was a forward-facing one.

Thus, let me come back around to why, knowing that Python and friends are much harder, I decided to write MediaGoblin in Python anyway.

The first one is functionality. MediaGoblin probably could never be a good video hosting platform on shared hosting + PHP only; the Celery component, though it makes it harder to deploy, is the whole reason MediaGoblin can process media in the background without timing out. So in MediaGoblin's case (where media types like video were always viewed as a critical part of the project), Celery does matter. More and more modern web applications are being written in ways that PHP + Shared Hosting just can't provide: they need websockets, they need external daemons which process things, and so on.

And let's not forget that web applications are not the only thing. PHP + shared hosting does not solve the email configuration problem, for example. More and more people are moving to GMail and friends; this is a huge problem for user freedom on the net. And as someone who maintains their own email server, I don't blame them. Configuring and running this stuff is just too hard. And it's not like it's a new technology... email is the oldest stable federated technology we have.

Not to mention that I've argued previously that shared hosting is not user freedom friendly. That's almost a separate conversation, though.

I also disagree that things like encryption certificates, which are also hard, don't matter. I think people's privacy does matter immensely, and I think we've only seen more and more reason to believe that this is an area we must work on over the last few years. (You might say that "SSL is doing it wrong" anyway, and I agree, though that's a separate conversation. Probably something that does things right will be just as hard to set up signing-wise if it's actually secure, though.)

Let's also come back to me being a Python programmer. Even given all the above, there are a lot of people out there like me who are just not interested in programming in PHP. This doesn't mean there aren't good PHP communities, clearly there are. But I do think more and more web applications are being written in non-PHP languages, and there's good reason for that. But yes, that means that these web applications are hard to deploy.

What's the answer to that? Assuming that lots of people want to write things in non-PHP languages, and that PHP + shared hosting is insufficient for a growing number of needs anyway, what do we do?

For most of the non-PHP network services world, it has felt like the answer is to not worry about the end user side of things. Why bother, when you aren't releasing your end web application anyway? And so we've seen the rise of devops coincide with the rise of "release everything but your secret sauce" (and, whether you like it or not, with the decline of PHP + shared hosting).

I was fully aware of all of this when I decided MediaGoblin would be written in Python. Part of it is because I like Python, and well, I'm the one starting the project! But part of it is because the patterns I described above are not going away. In order for us to engage the future of the web, I think we need to tackle this direction head-on.

In the meanwhile, it's hard. It's hard in the way that installing and maintaining a free software desktop was super hard for me back in 2001, when I became involved in free software for the first time. But installers have gotten better, and the desktop has gotten better. The need for the installfest has gone away. I think that we are in a similar state with free network services, but I believe things can be improved. And that's why I wrote that piece yesterday about deployment, because I am trying to think about how to make things better. And I believe we need to, to build web applications that meet the needs of what people expect, to make free network services comparable to the devops-backed modern architected proprietary network services of today.

So, despite what it might appear at the moment, tech privilege has always been on my mind, but it's something that's forward-looking. That's hard to explain though when you're stuck in the present. I hope this blogpost helps.

Base64 UUIDs in Python

By Christopher Lemmer Webber on Tue 30 July 2013

Hardly even worth writing about, but maybe it's useful to someone. Ever want a base 64 encoded UUID4 in python? I ported the uuid.uuid4() code over for base64 encoding, with a slight cleanup function to make it URL safe.

UPDATE: Making this the most useless blogpost I've ever written, there's already a urlsafe_b64encode method (also, I thus removed the rest of the post above):
>>> base64.urlsafe_b64encode(uuid.uuid4().bytes).strip("=")
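
The one-liner above is Python 2, where the encoded result is a str. On Python 3, urlsafe_b64encode returns bytes, so stripping with a str argument raises a TypeError. A minimal Python 3 sketch might look like this; the helper name `b64_uuid4` is made up for the example.

```python
import base64
import uuid


def b64_uuid4():
    """URL-safe base64 text for a random UUID4, without '=' padding."""
    raw = uuid.uuid4().bytes    # the UUID as 16 raw bytes
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


token = b64_uuid4()
```

Sixteen bytes encode to 24 base64 characters including two "=" of padding, so the stripped token is 22 characters long. To decode it again, add the "==" back before calling urlsafe_b64decode.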