emacslisten (an idea)

By Christopher Allan Webber on Fri 11 October 2013

An idea I've wanted to pursue for some time now, but have never really had time to work on, is some kind of voice-activated emacs interface. (I'm proposing the name emacslisten here partly as a tribute to the super amazing emacspeak, which is kind of the reverse of this accessibility project.) Unfortunately, although there have been several attempts at this, as far as I know they all rely on Dragon NaturallySpeaking. Given that it is nonfree, it's a non-starter for me (not to mention that I want to use neither Windows nor Wine). What to do?

Here's a brief, and I mean really brief, sketch of how I think things maybe could work.

  • Write a python daemon using the gstreamer bindings for pocketsphinx and exposing a d-bus interface. (This tutorial worked for me by the way, though I did have to change gconfaudiosrc to pulsesrc... then it worked.) This is where commands would actually be "listened" for. It might, optionally, have an --interface mode with some kind of gtk dialog.
  • Write an emacs minor-mode to listen to those d-bus calls.
  • As for how it would work, it would probably be a bit more vi-style modal, but also contextually modal depending on what major-mode you're in within emacs (yes I know, confusing). So, you could jump in and out of write mode vs different kinds of command mode. Which major mode you're in might affect the kinds of commands you're restricted to; this might improve accuracy, since you could set pocketsphinx to recognize a more limited subset of commands. (Presumably you could also set up emacs to speak to this process and switch out the command set.)
  • Just as emacs binds every keybinding to a lisp function, every vocal command would be bound to a function.
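
To make the last two bullets a little more concrete, here is a rough Python sketch of the dispatch side only. It assumes the speech recognizer has already turned audio into a text phrase, and it leaves out gstreamer and d-bus entirely. Every name in it (VoiceCommandTable, Listener, the spoken phrases, the "write mode" / "command mode" switch words) is made up for illustration and is not taken from any existing emacslisten code. The handlers simply return the name of an emacs command that a minor-mode could call once it receives the message.

```python
from typing import Callable, Dict, List, Optional, Tuple

Handler = Callable[[], str]


class VoiceCommandTable:
    """Maps spoken phrases to handlers, scoped by major mode (hypothetical design)."""

    def __init__(self) -> None:
        # mode name -> {phrase -> handler}; "global" applies in every mode.
        self._tables: Dict[str, Dict[str, Handler]] = {"global": {}}

    def bind(self, phrase: str, handler: Handler, mode: str = "global") -> None:
        """Bind a phrase to a handler, much like binding a key to a function."""
        self._tables.setdefault(mode, {})[phrase.lower()] = handler

    def vocabulary(self, mode: str) -> List[str]:
        """Phrases valid in `mode`; a recognizer could be limited to this list."""
        phrases = set(self._tables["global"]) | set(self._tables.get(mode, {}))
        return sorted(phrases)

    def dispatch(self, phrase: str, mode: str) -> Optional[str]:
        """Run the handler for `phrase`, preferring mode-specific bindings."""
        key = phrase.lower().strip()
        handler = self._tables.get(mode, {}).get(key) or self._tables["global"].get(key)
        return handler() if handler else None


class Listener:
    """vi-style modal front end: command mode vs. write (dictation) mode."""

    def __init__(self, table: VoiceCommandTable, major_mode: str) -> None:
        self.table = table
        self.major_mode = major_mode
        self.writing = False

    def hear(self, phrase: str) -> Tuple[str, str]:
        text = phrase.strip()
        if self.writing:
            # In write mode everything is inserted as text, except the exit phrase.
            if text.lower() == "command mode":
                self.writing = False
                return ("mode", "command")
            return ("insert", text)
        if text.lower() == "write mode":
            self.writing = True
            return ("mode", "write")
        action = self.table.dispatch(text, self.major_mode)
        return ("call", action) if action is not None else ("ignored", text)


# Example bindings; the phrases are arbitrary choices for this sketch.
table = VoiceCommandTable()
table.bind("save file", lambda: "save-buffer")
table.bind("next function", lambda: "python-nav-forward-defun", mode="python-mode")
```

In a real daemon, the tuple returned by hear() would be what gets sent over d-bus, and vocabulary() would be what gets handed to the recognizer whenever emacs reports a change of major mode. Keeping the table separate from the listener means the same bindings could be reused no matter how the audio side ends up being implemented.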

Crazy? Probably. Crazy enough to work? Maybe.

I wish I had time to run this project. And admittedly, there's a common, unfortunate pattern amongst hackers: when they're having wrist problems, they're desperate to figure out some kind of voice-activated editing software, but when their wrists are okay enough, they're too busy to actually invest the time in it.

I can't run this project myself, but I could help with it, if someone else would be willing to take the lead on it. Anyone interested?

EDIT: In case you're wondering, Tavis Rudd's "Using Python to Code By Voice" is definitely an inspiration. As far as I know he hasn't made a release of the software though (he did kindly offer to send me the source at one point, but I didn't want to get Dragon NaturallySpeaking, so I never went through with it). It might be a great base though. I'd really love to see a public release of the code!

EDIT / UPDATE 2: I started working on this. Not much to see yet, but you can speak and words appear in the minibuffer. Get it here and help improve it!