Posts with tag "hacking"

VCS friendly, patchable, document line wrapping

By Christopher Allan Webber on Thu 17 December 2015

If you do enough work in any sort of free software environment, you get used to doing lots of writing of documentation or all sorts of other things in some plaintext system which exports to some non-plaintext system. One way or another you have to decide: are you going to wrap your lines with newlines? And of course the answer should be "yes" because lines that trail all the way off the edge of your terminal is a sin against the plaintext gods, who are deceptively mighty, and whose wrath is to be feared (and blessings to be embraced). So okay, of course one line per paragraph is off the table. So what do you do?

For years I've taken the lazy way out. I'm an emacs user, and emacs comes with the `fill-paragraph' command, so conveniently mapped to M-q. So day in and day out I'm either whacking M-q now and then, or I'm being lazy and letting something like `auto-fill-mode' do the job. Overall this results in something rather pleasing to the plaintext-loving eye. If we take our first paragraph as an example, it would look like this:

If you do enough work in any sort of free software environment, you get used to
doing lots of writing of documentation or all sorts of other things in some
plaintext system which exports to some non-plaintext system.  One way or
another you have to decide: are you going to wrap your lines with newlines?
And of course the answer should be "yes" because lines that trail all the way
off the edge of your terminal is a sin against the plaintext gods, who are
deceptively mighty, and whose wrath is to be feared (and blessings to be
embraced).  So okay, of course one line per paragraph is off the table.  So
what do you do?

But my friends, you know as well as I do: this isn't actually good. And we know it's not good because one of the primary benefits of plaintext is that we have nice tools to diff it and patch it and check it into version control systems and so on. And the sad reality is, if you make a change at the start of a paragraph and then you re-fill (or re-wrap for you non-emacs folks) it, you are going to have a bad time! Why? Because imagine you and your friends are working on this document together, and you're working in some branch of your document, and then your friend Sarah or whoever sends you a patch and you're so excited to merge it, and she does a nice job and edits a bunch of paragraphs and re-wraps it or re-fills them because why wouldn't she do that, it's the best convention you have, so you happily merge it in and say thanks, you look forward to future edits, and then you go to merge in your own branch you've been working on privately, but oh god oh no you were working on your own overhaul which re-wrapped many of the same paragraphs and now there are merge conflicts everywhere.

That's not an imaginary possibility; if you've worked on a documentation project big enough, I suspect you've hit it. And hey, look, maybe you haven't hit it, because maybe most of your writing projects aren't so fast paced. But have you ever looked at your version control log? Ever done a `git/svn/foo blame', `git/svn/foo praise', or whatever convention? Eventually you can't figure out what commit anything came from, and my friends, that is a bad time.

In trying to please the plaintext gods, we have defiled their temple. Can we do better?

One interesting suggestion I've heard, but just can't get on board with, is to keep each sentence on its own line. It's a nice idea, and I want to like it, because the core idea is good: each sentence doesn't interfere with the one before or after it, so if you change a sentence, it's easy for both you and the computer to tell which one. This means you can check things in and out of version control, send and receive patches, and from that whole angle, things are great.

But it's a sin to the eye to have stuff scrolling off the edge of your terminal like that, and each sentence on its own line, well... it just confuses me. Let's re-look at that first paragraph again in this style:

If you do enough work in any sort of free software environment, you get used to doing lots of writing of documentation or all sorts of other things in some plaintext system which exports to some non-plaintext system.
One way or another you have to decide: are you going to wrap your lines with newlines?
And of course the answer should be "yes" because lines that trail all the way off the edge of your terminal is a sin against the plaintext gods, who are deceptively mighty, and whose wrath is to be feared (and blessings to be embraced).
So okay, of course one line per paragraph is off the table.
So what do you do?

Ugh, it's hard to put into words why this is so offensive to me. I guess it's because each sentence can get so long that it looks like the separation between sentences is a bigger break than the separation between paragraphs. And I just hate things scrolling off to the right like that. I don't want to be halfway through reading a word on my terminal and then have to jump back so I can keep reading it.

So no, this is not good either. But it is on the right track. Is there a way to get the best of both worlds?

Recently, when talking about this problem with my good friend David Thompson, I came to realize that there is a potentially great solution that makes a hybrid of the technical merits of the one-sentence-per-line approach and the visually pleasing merits of the wrap/fill-your-paragraph approach. And the answer is: put each sentence on its own line, and wrap each sentence!

This is best seen to be believed, so let's take a look at that first paragraph again... this time, as I typed it into my blogging system:

If you do enough work in any sort of free software environment, you get used
  to doing lots of writing of documentation or all sorts of other things in
  some plaintext system which exports to some non-plaintext system.
One way or another you have to decide: are you going to wrap your lines with
  newlines?
And of course the answer should be "yes" because lines that trail all the way
  off the edge of your terminal is a sin against the plaintext gods, who are
  deceptively mighty, and whose wrath is to be feared (and blessings to be
  embraced).
So okay, of course one line per paragraph is off the table.
So what do you do?

Yes, yes, yes! This is what we want! Now it looks good, and it merges good. And we still can preserve the multi-line separation between paragraphs. Also, you might notice that I continue each sentence by giving two spaces before its wrapped continuation, and I think that's an extra nice touch (but you don't have to do it).
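If you'd rather have a program do the wrapping, here is a minimal sketch in Guile of the style described above. It is only an illustration, not the tooling used for this post: the procedure names (sentence-end?, split-sentences, wrap-sentence, wrap-paragraph) are made up for the example, and the sentence detection is a naive heuristic that treats any word ending in ".", "?" or "!" as the end of a sentence, so abbreviations like "e.g." will fool it.

```scheme
(use-modules (srfi srfi-1))   ; for append-map

;; Naive heuristic (an assumption of this sketch): a word closes a sentence
;; if it ends in . ? or !, optionally followed by a closing paren or quote.
(define (sentence-end? word)
  (let ((trimmed (string-trim-right word (char-set #\) #\"))))
    (and (not (string-null? trimmed))
         (memv (string-ref trimmed (- (string-length trimmed) 1))
               '(#\. #\? #\!))
         #t)))

;; Split a paragraph into sentences, each one a list of words.
(define (split-sentences paragraph)
  (let loop ((words (string-tokenize paragraph))
             (current '())
             (sentences '()))
    (cond
     ((null? words)
      (reverse (if (null? current)
                   sentences
                   (cons (reverse current) sentences))))
     ((sentence-end? (car words))
      (loop (cdr words)
            '()
            (cons (reverse (cons (car words) current)) sentences)))
     (else
      (loop (cdr words) (cons (car words) current) sentences)))))

;; Greedily fill one sentence to WIDTH columns; every continuation line
;; starts with two spaces.
(define (wrap-sentence words width)
  (let loop ((words (cdr words))
             (line (car words))
             (lines '()))
    (cond
     ((null? words)
      (reverse (cons line lines)))
     ((<= (+ (string-length line) 1 (string-length (car words))) width)
      (loop (cdr words)
            (string-append line " " (car words))
            lines))
     (else
      (loop (cdr words)
            (string-append "  " (car words))
            (cons line lines))))))

;; Every sentence starts on its own line, then gets wrapped.
(define (wrap-paragraph paragraph width)
  (string-join
   (append-map (lambda (sentence) (wrap-sentence sentence width))
               (split-sentences paragraph))
   "\n"))
```

Calling (display (wrap-paragraph some-text 78)) on a one-line paragraph should give output in the shape shown above. Because a change inside one sentence can only re-wrap that sentence's lines, the diff stays local.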

This is how I'm writing all my documentation, and the style in which I will request that all documentation for projects I start be written, from now on. Now if you're writing an email, or something else that's meant to be read in plaintext as-is (you do read/write your email in plaintext, right?), then maybe you should just do the traditional fill paragraph approach. After all, you want that to look nice, and in many of those cases, the text doesn't change too much. But if you're writing something where the plaintext version is just intermediate, and you have some other export which is what people mostly will read, I think this is a rather dandy approach.

I hope you find it useful as well! Happy documentation hacking!

Minimalist bundled and distributed bugtracker w/ orgmode

By Christopher Allan Webber on Sun 11 October 2015

Thinking out loud here... this isn't a new idea but maybe here's a solid workflow...

"Distributed" as in the project's existing DVCS.

  • Check a TODO.org orgmode file right into your project's git repo
  • Accept additions/adjustments to TODO.org via patches on your mailing list
  • As soon as a bug is "accepted", it's committed to the project.
  • When a bug is finished, it's closed and archived.
  • Contributors are encouraged to submit closing tasks in the orgmode tree as part of their patch.
  • Bug commentary happens on-list, but if users have useful information to contribute to someone working on a bug, they can submit that as a patch.

I think this would be a reasonably complete but very emacs user oriented bugtracker solution, so maybe in addition:

  • A script can be provided which renders a static html copy for browsing open/closed bugs.
  • A "form" can be provided on that page to email the list about new discovered bugs, and formats the submission as an orgmode TODO subsection. This way maintainers can easily file the bug into the tracker file if they deem appropriate.

I think this would work. Lately I've been hacking on a project that's mostly just me so far, so I just have an orgmode file bundled with the repo, but I must say that it's rather nice to just hack an orgmode file and have your mini-bugtracker distributed with your project. I've done this a few times but as soon as the project grows to multiple contributors, I move everything over to some web based bugtracker UI. But why not distribute all bugs with the project itself? My main thinking is that there's a tool-oriented barrier to entry, but maybe the web page render can help with that.

I've been spending more time working on more oldschool projects that just take bugs submitted on mailing lists as a contribution project. They seem to do just fine. So I guess it entirely depends on the type of project, but this may work well for some.

And yes, there are a lot of obvious downsides to this too; paultag points out a few :)

Wisp: Lisp, minus the parentheses

By Christopher Allan Webber on Wed 23 September 2015

Arne Babenhauserheide has built a really cool syntax alternative for Scheme, Wisp (not to be confused with a different lisp-related-wisp), or in standards version, SRFI 119. It looks pretty nice:

;; hello world example
display                             ;    (display
  string-append "Hello " "World!"   ;      (string-append "Hello " "World!"))
display "Hello Again!"              ;    (display "Hello Again!")

;; hello world function
define : hello who                  ;    (define (hello who)
  display                           ;      (display 
    string-append "Hello " who "!"  ;        (string-append "Hello " who "!")))

Actually, let's see that in emacs, just to be sure.

[Screenshot: Wisp and hello world]

How about something slightly more substantial? How about a real life Guix package for GNU Grep:

[Screenshot: Wisp, Emacs, Guix and Grep]

Wow, not bad... not bad at all! I'd say that's quite readable! (Too bad the lines don't line up exactly in that screenshot; that's not the code but rather my emacs theme bolding the wisp code.)

What's nice is that unlike most s-expression alternatives, it doesn't lack any of the power of Lisp; it's "just lisp" with the parentheses hidden by vaguely pythonesque indentation, which means even macros work.

Now me personally? I've learned to love the parens, and there's nothing that beats an editor that knows how to do cool structural s-expression editing and navigation. But I admit that learning to read through all the parentheses was a tough thing for me initially, and certainly for many others. Maybe this can help boil the lisp frog for some.

Now what would really be hylarious would be to port this to Hy...

More careful exceptions in Guile

By Christopher Allan Webber on Sat 05 September 2015

So as I've probably said before, I've been spending a lot more time hacking in Guile lately. I like it a lot!

However, there is one thing that really irks me: error handling. Though a programmer in Guile has a lot of flexibility to define their own error handling mechanisms, really I think a language should be providing good builtin ways of doing so. Guile does provide some builtin methods, but I have problems with both of them.

The first is the more egregious of the two, and is a procedure known simply as error, which in its simplest use takes one argument: a string describing what went wrong. Usage looks like so:

(if (something-bad? thing)
  (error "You shouldn't have done that!"))

This is fast to toss through your code without thinking, but at serious cost. The problem is that this follows the "diaper pattern" (or "diaper antipattern?"). Guile provides a catch procedure, but if you try catching these errors, they are all thrown with the "misc-error" symbol, and there is no way to catch the right errors.

(catch 'misc-error
  ;; the code we're running
  (lambda ()
    (let ((http-response (get-some-url)))
      (if http-response
          ;; all went well, continue with our webby things
          (do-web-things http-response)
          ;; Uhoh!
          (error "the internet's tubes are filled"))))
  ;; The code to catch things
  (lambda _ (display "sorry, someone broke the internet\n")))

But wait... what if the user gave a keyboard interrupt and your database execution code caught it instead? If you can't catch errors precisely, things might bubble to the wrong place.
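To make the imprecision concrete, here is a small sketch. The procedures fetch-page, save-record and with-network-fallback are invented for this example. Both failures are raised with plain error, so a handler that was only meant for network trouble ends up swallowing the database failure as well:

```scheme
;; Two unrelated failures, both signalled with plain `error'.
;; (Hypothetical procedures, just for illustration.)
(define (fetch-page)
  (error "the internet's tubes are filled"))

(define (save-record)
  (error "database connection lost"))

;; A handler that was only ever meant to cover network trouble.
(define (with-network-fallback thunk)
  (catch 'misc-error
    thunk
    (lambda (key . args)
      'used-network-fallback)))

(with-network-fallback fetch-page)   ; the case we meant to handle
(with-network-fallback save-record)  ; a database failure lands here too
```

Both calls return 'used-network-fallback, because all the handler sees is the shared 'misc-error key.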

This is not an abstract problem; this happened to me in an extremely well written Guile program, Guix: I was working on adding a new package and had screwed up the definition, so somewhere up the chain Guix threw an error about my malformed package, but I didn't know... instead, when I was attempting to run the "guix package" command to test out my package, suddenly the "guix package" command disappeared entirely. Whaaaaat? I did some debugging and found a (catch 'misc-error) in the command line arguments handling code. Whew! Well, that usage of "(error)" got replaced with some more careful code, but what if I couldn't find it, or was a more green developer?

So, luckily, Guile does provide a better exception handling system, mostly. There's throw, which looks a bit like this in your code:

(catch 'http-tubes-error
  ;; the code we're running
  (lambda ()
    (let ((http-response (get-some-url)))
      (if http-response
          ;; all went well, continue with our webby things
          (do-web-things http-response)
          ;; Uhoh!
          (throw 'http-tubes-error "the internet's tubes are filled"))))
  ;; The code to catch things
  (lambda _ (display "sorry, someone broke the internet\n")))

Okay, great! This is much more specific, yay!

Except... it still kind of bothers me. Maybe I'm being overly pedantic here, but what if you and I both had 'json-error exceptions in our own separate libraries? The problem is (unlike in common lisp) there aren't module-specific symbols in Guile! This means we could catch someone else's 'json-error when we really wanted to catch our own.

Okay, maybe this is rare, but I really don't like running into these kinds of problems. I want my exception symbols to be unique per package, damnit!

So in the interest of doing so, let me present you with a terrible hack of scheme code (which like all other code content in this blogpost, I both waive under CC0 1.0 Universal (and also do waive any potential patent "rights") and also release under LGPLv3 or later, your choice):

(define-syntax-rule (define-error-symbol error-symbol)
  (define error-symbol
    (gensym
     ;; gensym can take a prefix
     (symbol->string (quote error-symbol)))))

Okay, it's kind of hacky, but what this does is give you a nice convenient way to define unique symbols. (Edit: turns out gensym can take a prefix, so the above code is even easier and less hacky now! Thanks for the tip, taylanub!) You can use it like so:

(define-error-symbol http-tubes-error)

(catch http-tubes-error
  ;; the code we're running
  (lambda ()
    (let ((http-response (get-some-url)))
      (if http-response
          ;; all went well, continue with our webby things
          (do-web-things http-response)
          ;; Uhoh!
          (throw http-tubes-error "the internet's tubes are filled"))))
  ;; The code to catch things
  (lambda _ (display "sorry, someone broke the internet\n")))

See? All you have to do is do a simple definition above and you have a unique-per-your-program error symbol (thanks to the gensym). Now if users want to catch your errors, but only your errors, they can import the error symbol directly from your package.
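Here is a quick sketch of the collision being avoided, assuming two libraries that both chose the name json-error. The define-error-symbol macro is repeated from above so the snippet stands alone; try-catching and collision-demo are helper names made up for this example.

```scheme
(define-syntax-rule (define-error-symbol error-symbol)
  (define error-symbol
    (gensym (symbol->string (quote error-symbol)))))

;; "Our" library's key: a fresh symbol that merely starts with "json-error".
(define-error-symbol json-error)

;; Hypothetical helper: run THUNK and report whether KEY was caught.
(define (try-catching key thunk)
  (catch key
    thunk
    (lambda (k . args) 'caught-ours)))

;; Someone else's library throws the plain symbol 'json-error.  Our inner
;; handler ignores it, so only the outer handler for the plain symbol runs.
(define (collision-demo)
  (catch 'json-error
    (lambda ()
      (try-catching json-error
                    (lambda () (throw 'json-error "their bad json"))))
    (lambda (k . args) 'caught-theirs)))

(try-catching json-error
              (lambda () (throw json-error "our bad json")))  ; our error, our handler
(collision-demo)                                              ; their error skips our handler
```

The first call returns 'caught-ours and the second returns 'caught-theirs, so the two errors no longer get mixed up even though both libraries picked the same name.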

So the lesson from this post is: if you're going to use exceptions in your code, please be careful... and specific!

Update: Apparently I can't be the only one who finds the need for this; turns out that prompts (which have a similar "unwinding" property to exceptions) also take symbols, but usefully there's (make-prompt-tag) which does pretty much exactly the same thing as define-error-symbol above. So I must not be totally crazy!
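For comparison, here is a small sketch of the same idea with prompt tags, using Guile's make-prompt-tag, call-with-prompt and abort-to-prompt. The tag names and run-with-tags are made up for the example. Two tags created with the same stem are still distinct objects, so an abort aimed at one is not intercepted by a prompt installed for the other:

```scheme
;; Each call to make-prompt-tag yields a fresh tag, even with the same stem.
(define our-tag (make-prompt-tag "json"))
(define their-tag (make-prompt-tag "json"))

(define (run-with-tags)
  (call-with-prompt their-tag
    (lambda ()
      (call-with-prompt our-tag
        (lambda ()
          ;; Abort to THEIR tag; the inner prompt for OUR tag is skipped.
          (abort-to-prompt their-tag 'bailed-out)
          'finished-normally)
        (lambda (k . args) 'inner-handler-ran)))
    (lambda (k . args) 'outer-handler-ran)))

(run-with-tags)
```

The call returns 'outer-handler-ran, which is the same "only catch what is really yours" behavior that define-error-symbol is after.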

Fauxnads

By Christopher Allan Webber on Sat 14 March 2015

So off and on, I've been trying to understand monads. It turns out I have a use case: making web applications in the style I'd like but have them be asynchronous leads to trouble because you need a non-global-variable way of passing along context. I've tried thinking of some solutions, but a friend of mine convinced me that monads, if I could wrap my head around them, would solve the problem for me.

With that in mind, I did some reading... there are plenty of resources out there, some of them with nice pictures, and at one point I tried to understand them by just reading about them and not writing any code relevant to them. This led to me guessing how they might work by trying to contextualize them to the problems I wanted to solve. So along that path, I had a misunderstanding, a mistaken vision of how monads might work, but while I was wrong, it turned out that this vision of ~monads is kind of fun and had some interesting properties, so I decided to code it up into a paradigm I'm jokingly calling "fauxnads".

Brace yourself, we're about to get into code, pseudo- and real, and it's going to be in Guile. (All code in this blogpost, GPLv3+ or LGPLv3+, as published by the FSF!)

So here's the rundown of fauxnads:

  • They still pass along a context, and under the hood, they do pass it in still as the first argument to a function!
  • The context gets passed up (and optionally, down... more on that in a few) in some kind of associative array... but we don't want to accidentally change the context that we passed to other functions already, so we'll use something immutable for that.
  • The user doesn't really access the context directly. They specify what variables they want out of it, and the fauxnad macro extracts it for them.
  • Fauxnads can add properties to the context that they'll call subroutines with so that subsequent fauxnad calls can have access to those.
  • Calling child fauxnads happens via invoking a function (=>) exposed to the rest of the fauxnad via some lexical scope hacks.

So when sketching this out, I tried to boil down the idea to a quick demo:

;; vhash-cons and vhash-assoc live in the (ice-9 vlist) module
(use-modules (ice-9 vlist))

;; fleshed out version of what a fauxnad should approx expand to
(define (context-test1 context arg1 arg2)
  (letrec ((new-context context)
           ;; Define function for manipulating the context
           (context-assoc
            (lambda (key value)
              (set! new-context
                    (vhash-cons key value new-context))))
           ;; a kind of apply function
           (=>
            (lambda (func . args)
              (apply func (cons new-context args))))) ;; should be gensym'ed

    ;; This part would be filled in by the macro.
    ;; The user would set which variables they want from the context
    ;; as well as possibly the default values
    (let ((a (or (vhash-assoc 'a context) "default-value for a"))
          (b (or (vhash-assoc 'b context) "default-value for b"))
          (c (or (vhash-assoc 'c context) "default-value for c")))
      (values
       (begin
         (context-assoc 'lol "cats")
         (=> context-test2 "sup cat")
         (context-assoc 'a "new a")
         (=> context-test2 "sup cat")
         (format #t "a is ~s, b is ~s, and c is ~s\n"
                 a b c)
         (string-append arg1 " " arg2))
       new-context))))

;; intentionally simpler, not a "real fauxnad", to demo
;; the fauxnad concept at its most minimal
(define (context-test2 context arg1)
  (begin
    (format #t "Got ~s from the context, and ~s from args, mwahaha!\n"
            (vhash-assoc 'a context)
            arg1))
  (values
   "yeahhh"
   context))

Then calling in the console:

scheme@(guile-user)> (context-test1 (alist->vhash '((a . 1) (b . 2))) "passed arg1" "passed arg2")
Got (a . 1) from the context, and "sup cat" from args, mwahaha!
Got (a . "new a") from the context, and "sup cat" from args, mwahaha!
a is (a . 1), b is (b . 2), and c is "default-value for c"
$79 = "passed arg1 passed arg2"
$80 = #<vhash 2205920 4 pairs>

Okay, whaaa? Let's look at the requirement again. We'll be passing in a context as the first argument to the function, followed by some other args. We'll then pass that context along to subsequent functions. So more or less, we know that looks like this. (I know, not the most useful or pretty or functional code, but it's just a demo of the problem!)

(define (main-request-handler context request)
  ;; print out hello world in the right language
  (display (translate-this (context-get context 'lang) "Hello, world"))
  (newline)

  ;; now call another function
  (next-function context (smorgify request)))

(define (next-function context what-to-smorgify)
  (write-to-db
   ;; Lots of functions need access to the
   ;; current database connection, so we keep it in the context...
   (context-get context 'db-conn)
   (smorgify-this what-to-smorgify)))

But wouldn't it be cool if we didn't have to pass around the context? And what if we just said, "we want this and this from the context", then forgot about the rest of the context? We'd never need to call context-get again! It would also be cool to have a way to set things in the context for subsequent calls. Ooh, and if we could avoid having to type "context" over and over again when passing it into functions, that would also be awesome.

So how about a syntax like this:

(define-fauxnad (our-special-function arg1)
  ((lang "en"))  ;; we want the language variable, but if not set,
                 ;; default to "en" or english
  ;; (body is below:)
  ;; print out hello world in the right language
  (display (translate-this lang "Hello, world"))
  (newline)
  ;; now call another function
  (=> next-function (smorgify arg1)))

We also know we want to use some sort of immutable hashmap. Guile provides vhashes which provide "typically constant-time" data access, and while there are some caveats (a new vhash returned by appending a key/value pair where that key already existed in the vhash will just keep the old pair around... but on our pseudo-stack that shouldn't happen very often, so vhashes should be fine), they work for our purposes.
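A small sketch of the vhash behavior being relied on here, using procedures from (ice-9 vlist). The keys and values are made up for the example. Note that vhash-assoc returns the whole key/value pair rather than just the value, which is why the console output in this post prints things like (a . 1).

```scheme
(use-modules (ice-9 vlist))

;; Start with a small context...
(define base (alist->vhash '((lang . "en"))))

;; ...and "add" to it.  vhash-cons returns a new vhash; base is untouched.
(define extended (vhash-cons 'db-conn 'fake-connection base))

(vhash-assoc 'db-conn base)       ; #f: the original never saw the new key
(vhash-assoc 'db-conn extended)   ; the pair (db-conn . fake-connection)

;; Re-adding an existing key shadows the old pair instead of replacing it.
(define shadowed (vhash-cons 'lang "fr" extended))

(cdr (vhash-assoc 'lang shadowed))  ; "fr": the newest pair wins
(cdr (vhash-assoc 'lang extended))  ; "en": older contexts are unaffected
(vlist-length shadowed)             ; 3: the old lang pair is still in there
```

That last line is the caveat mentioned above: the shadowed pair sticks around, so a vhash that keeps getting the same key re-added will keep growing.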

Okay, cool. So what would that look like, expanded? Something along the lines of:

(define (our-special-function context arg1)
  (letrec ((new-context context)
           ;; Define function for manipulating the context
           (context-assoc
            (lambda (key value)
              (set! new-context
                    (vhash-cons key value new-context))))
           ;; a kind of apply function
           (=>
            (lambda (func . args)
              (apply func (cons new-context args))))) ;; should be gensym'ed

    ;; This part would be filled in by the macro.
    ;; The user would set which variables they want from the context
    ;; as well as possibly the default values
    (let ((lang (or (vhash-assoc 'lang context) "en")))
      (values
       (begin
         ;; print out hello world in the right language
         (display (translate-this lang "Hello, world"))
         (newline)

         ;; now call another function
         (=> next-function (smorgify arg1)))
       new-context))))

As a bonus, we've taken advantage of Guile's multi-value return support, so any parent function which cares can get back the new context we defined for subsequent calls, in case we want to merge contexts or something. But functions not aware of this will simply ignore the second returned parameter. (I'm not sure this is a useful feature or not, but it's nice that Guile makes it easy to implement!)
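Here is a tiny sketch of that multiple-value trick on its own, with a plain alist standing in for the context to keep it short. The greet procedure and its greeted key are made up for the example.

```scheme
;; Return the "real" result first and the updated context second.
(define (greet context name)
  (values (string-append "Hello, " name)
          (acons 'greeted name context)))

;; A caller that cares about the context asks for both values.
(define (greet-and-keep-context)
  (call-with-values
      (lambda () (greet '() "world"))
    (lambda (greeting new-context)
      (list greeting new-context))))

;; A caller that doesn't care just uses the result like any other value;
;; Guile drops the extra value here.
(define (greeting-length)
  (string-length (greet '() "world")))

(greet-and-keep-context)  ; ("Hello, world" ((greeted . "world")))
(greeting-length)         ; 12
```

Dropping extra values like this is Guile's behavior rather than something every Scheme promises, so code meant to be portable should not lean on it.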

That's clearly quite a complicated thing to implement manually though... so it's time to write some code to write code. That's right, it's macro time! Guile has some pretty cool hygienic macro support that uses "syntax transformation"... a bit nicer than common lisp's defmacro, but also less low-level. Anyway, if you're not familiar with that syntax, trust me that this does the right thing I guess:

(define-syntax define-fauxnad
  (lambda (x)
    (syntax-case x ()
      ((_ (func-name . args)
          ((context-key context-default) ...)
          body ...)
       (with-syntax ((=> (datum->syntax x '=>))
                     (context-assoc (datum->syntax x 'context-assoc)))
         #'(define (func-name context . args)
             (letrec ((new-context context)
                      ;; Define function for manipulating the context
                      (context-assoc
                       (lambda (key value)
                         (set! new-context
                               (vhash-cons key value new-context))))
                      ;; a kind of apply function
                      (=>
                       (lambda (func . func-args)
                         (apply func (cons new-context func-args))))) ;; should be gensym'ed

               ;; This part would be filled in by the macro.
               ;; The user would set which variables they want from the context
               ;; as well as possibly the default values
               (let ((context-key (or (vhash-assoc (quote context-key) context)
                                      context-default))
                     ...)
                 (values
                  (begin
                    body ...)
                  new-context)))))))))

Nice, now writing our fauxnads is dead-simple:

(define-fauxnad (our-special-function arg1)
  ((lang "en"))  ;; we want the language variable, but if not set,
  ;; default to "en" or english
  ;; (body is below:)
  ;; print out hello world in the right language
  (display (translate-this lang "Hello, world"))
  (newline)
  ;; now call another function
  (=> next-function (smorgify arg1)))

(define-fauxnad (next-function what-to-smorgify)
  ((db-conn #nil))
  (write-to-db
   ;; Lots of functions need access to the
   ;; current database connection, so we pull it out of the context...
   db-conn
   (smorgify-this what-to-smorgify)))

Okay, again, my demos don't make this look very appealing I suppose. We can now transform the original demos I sketched up into fauxnads though:

(define-fauxnad (context-test1 arg1 arg2)
  ((a "default value for a")
   (b "default value for b")
   (c "default value for c"))
  (context-assoc 'lol "cats")
  (=> context-test2 "sup cat")
  (context-assoc 'a "new a")
  (=> context-test2 "sup cat")
  (format #t "a is ~s, b is ~s, and c is ~s\n"
          a b c)
  (string-append arg1 " " arg2))

(define-fauxnad (context-test2 arg1)
  ((a #nil))
  (format #t "Got ~s from the context, and ~s from args, mwahaha!\n"
          a arg1)
  "yeahhh")

And calling it:

scheme@(guile-user)> (context-test1 (alist->vhash '((a . 1) (b . 2))) "passed arg1" "passed arg2")
Got (a . 1) from the context, and "sup cat" from args, mwahaha!
Got (a . "new a") from the context, and "sup cat" from args, mwahaha!
a is (a . 1), b is (b . 2), and c is "default value for c"
$81 = "passed arg1 passed arg2"
$82 = #<vhash 2090ae0 4 pairs>

Okay, so what's the point? I doubt this blogpost really would sell anyone on fauxnads, and why would you use fauxnads when you can use real monads? But here are some interesting properties:

  • fauxnads are still super simple functions that you can call manually: just pass in the context (a vhash) as the first parameter.
  • the "binding" function for calling sub-fauxnads is sugar, but hidden (and otherwise inaccessible, because macro hygiene keeps you from accessing "new-context") from the user.
  • I still like that you can get back the new context via multiple value return, but totally ignore it if you don't care about it.
  • I understand how they work.

And on that last note, I still don't understand monads, but I feel like I'm getting closer to it. It was fun to document, and put to code, a misunderstanding though!