If the last few posts have seemed a little different to you, it may be because I did not type them in. I picked up a copy of Dragon NaturallySpeaking 10.0 over the weekend—I think they’re about to come out with a new version and so are dropping the price on the old one—and I’ve been using it, well, mostly as a toy.
I type close to 90-100 words a minute so there is a real limit to how much dictation software can help in terms of speeding up my writing. I mean, I’m not sure that I even speak 100 words a minute so most of the time I’m sitting around waiting for my mouth to catch up with my fingers. No, that’s not right. My fingers can more easily keep up with my brain than my mouth can. Wait. That doesn’t sound right either.
Well, I guess it’s better than my brain lagging behind my mouth.
I started typing before I was 10, back when that was unusual. What that means, in practical terms, is that for most of my life I’ve been using a brain-to-finger connection, as you might call it. I’m very used to thinking going right to my fingers. It’s actually extremely challenging for me to compose by talking.
Of course people have been doing this for a long time using non-computer-based technology. Dictating into a tape recorder for example, or just dictating directly to a human. So, in that sense, it’s not all that high-tech. Just new to me.
I’ve been fascinated by the concept of computer dictation going back to the original television series “Battlestar Galactica” where Commander Adama used to dictate his logs and we watched the words appear on his oh-so-high-tech green phosphor monitor. Strictly speaking, that’s not really possible. The computer can’t know which word you mean without context. (There’s no there there. They’re their there.)
I tried the earliest computer dictation software back in the mid-90s, using OS/2. It took some huge amount of memory, like 16 or even 32 MB of RAM, and a blazing 100 MHz of Pentium power, and when all was said and done, it didn’t work very well.
I tried again seven or eight years ago, with somewhat better results, but with enough corrections needed to make the whole exercise more trouble than it was worth.
Well, now Dragon is at version 10, and it’s almost useful. The accuracy is really quite good, but it does underscore certain things about the way I write, to say nothing of the way I speak. I like to use punctuation, for example, which, when you are dictating to a computer, requires you to say “comma”, “quote”, and my personal favorite, “em-dash”. This is far from an organic experience.
Also, it’s necessarily complex. Yesterday, it was smart enough to put a space after every period and then capitalize the next word. Today, it wants to make me do it. It is spelling out the word “space”, but not always! The correction feature requires you to re-dictate or pick one from a number of options by saying “choose one”. But for some reason Dragon is actually typing out the phrase “choose one” rather than making the selection.
Still, it has a lot fewer annoying habits than previous versions (and I include in this voice-recognition software from all the other vendors I’ve tried, not just Dragon), like wanting to type a word because you exhaled forcefully or scratched your nose.
Here is Dragon listening to me recite Poe:
once upon a midnight dreary,
while I pondered weak and weary,
over many a curious and cleaned the volume of forgotten lore,
while I nodded nearly napping,
suddenly there came a tapping,
as of someone gently rapping,
rapping at my chamber door,
“to some visitor,” I muttered, “rapping at my chamber door,”
“only this, and nothing more.”
pot, distinctly I remember,
it was in the bleak December,
and each separate dying ember,
Roth its ghost upon the floor,
eagerly I saw tomorrow,
vainly I had sought to borrow,
from my books surcease Asaro,
sorrow for the lost Lenore,
lost to me forever more
You can see where it has difficulty, and it’s not hard to see why it would. Unusual word choices and grammatical structures are naturally the most likely to cause problems. It doesn’t do interjections very well as far as I can tell. Things like “Ah” or “Oh” or “Wow”... well, I had to type all those in by hand.
The other thing it doesn’t do well (and again it’s not hard to see why) is vocal modulation. When I’m speaking aloud in this kind of context my voice modulates and sometimes stretches out a length of words: a sort of prosaic form of melisma which is doubtless exacerbated by decades of reading stories to children. At the same time, I don’t find myself speaking in clipped robotic monotones in order to make the software understand me, which I have done with previous incarnations.
Anyway, it’s an interesting purchase for $30 (the markdown price if you can find it). If you’ve ever had an interest in this sort of thing it’s worth checking out.