Pearls

Speech recognition programs

Faster, more accurate, easier to use.


 

Speech recognition technology—once too slow and inaccurate for clinical practice—is increasingly helping psychiatrists record patient notes, dictate letters and lengthy reports, and operate their computers.

Is speech recognition right for your practice? This article will help you decide by reviewing available programs and offering insights on choosing one for your practice.

Speaking of progress

Clinicians in radiology and pathology were among the first to use limited programs that employed voice commands and short phrases. Shortcuts that told the computer to type boilerplate passages have been available for more than 25 years. Early speech recognition programs required users to speak in a stilted voice, pausing between words.

By 1998, several “continuous speech” systems were available, but most were not suitable for a solo or small group psychiatric practice. Some included diagnosis-specific report templates that could be customized for initial assessment and progress notes.1 These programs were useful for dictating short memos or e-mails but not for composing longer documents because of awkward editing and difficulties with punctuation, formatting, and on-screen navigation.2

Today’s programs recognize natural speech without pauses between words. As the clinician speaks into a microphone at a normal pace, text is typed at speeds approaching 160 words per minute with 90% to 98% accuracy. New versions of some programs save the dictation in audio files, allowing an assistant to listen to the dictation later and edit the transcription. These programs also eliminate superfluous utterances and perform macro commands, including handling e-mail.
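To put those figures in practical terms, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names and the 500-word note length are assumptions made for the example, and the rate and accuracy values are the ones quoted above, not measurements from any particular product.

```python
def expected_errors(word_count: int, accuracy_percent: float) -> int:
    """Estimate how many words in a dictation will need correction.

    accuracy_percent is the recognition accuracy as a percentage (0 to 100).
    """
    if not 0 <= accuracy_percent <= 100:
        raise ValueError("accuracy_percent must be between 0 and 100")
    return round(word_count * (100 - accuracy_percent) / 100)


def dictation_minutes(word_count: int, words_per_minute: float = 160.0) -> float:
    """Estimate dictation time at a given transcription rate."""
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    return word_count / words_per_minute


# Hypothetical 500-word progress note (an assumed length, not from the article).
note_words = 500
worst_case = expected_errors(note_words, 90)  # low end of the quoted accuracy range
best_case = expected_errors(note_words, 98)   # high end of the quoted accuracy range
minutes = dictation_minutes(note_words)
```

Under these assumptions, a 500-word note would take a little over 3 minutes to dictate and leave roughly 10 to 50 words to correct, which is why training and a good microphone can matter as much as raw speed.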

Newer speech recognition products are much easier to use than older programs. They can be mastered within days and are cost-effective for solo or small group practices. Single-user programs range in cost from free (speech recognition is included with Windows XP and newer Macintosh operating systems) to approximately $175 for ViaVoice Professional and about $800 for Dragon Naturally Speaking Version 8.0, which is highly recommended.3

How to get started

Speech recognition programs require a powerful computer with processor (microchip) speed >1 gigahertz, random access memory (RAM) ≥ 512 megabytes, and a platform no older than Windows 2000 or Mac OS X Version 10.1.
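As a rough illustration, the processor and memory thresholds above can be written as a simple check. This is a sketch, not a vendor tool: the `Computer` class and `unmet_requirements` function are hypothetical names, and the operating system requirement is left out because comparing platform versions is harder to express in a few lines.

```python
from dataclasses import dataclass
from typing import List

MIN_CPU_GHZ = 1.0  # processor speed must exceed 1 gigahertz
MIN_RAM_MB = 512   # RAM must be at least 512 megabytes


@dataclass
class Computer:
    """Hypothetical description of a practice computer."""
    cpu_ghz: float
    ram_mb: int


def unmet_requirements(pc: Computer) -> List[str]:
    """Return a list of hardware requirements the computer does not meet."""
    problems = []
    if pc.cpu_ghz <= MIN_CPU_GHZ:
        problems.append(f"processor is {pc.cpu_ghz} GHz; more than {MIN_CPU_GHZ} GHz is needed")
    if pc.ram_mb < MIN_RAM_MB:
        problems.append(f"RAM is {pc.ram_mb} MB; at least {MIN_RAM_MB} MB is needed")
    return problems


# Example: an older office PC (assumed figures, for illustration only).
office_pc = Computer(cpu_ghz=0.8, ram_mb=256)
issues = unmet_requirements(office_pc)
```

An empty list means the hardware clears both thresholds; the platform version still needs to be confirmed separately.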

After the software is installed, some clicking and keystroking may still be necessary. Learning when to talk or type can help users increase efficiency and prevent repetitive strain injury.

Critical factors for successful use include user motivation and training (or consultation with a reseller), a specialized vocabulary and language model, and a high-quality sound card and microphone. The most sophisticated hardware available is recommended and usually must be purchased separately.

Most speech recognition programs allow different users to train on the same computer. Users can dictate directly into word-processing applications, and some products allow dictation into other office programs.

What’s available

Dragon Naturally Speaking Professional Medical Solutions Version 8.0 (www.dragontalk.com/DNS_MED_PRO.htm) is widely recognized for its performance, high accuracy, and easy-to-use interface. Users can dictate directly into a PC for immediate transcription or into selected digital recorders or personal digital assistants for transcription later. A file of your recorded speech can be saved along with the computer-transcribed document for future proofreading.

The program can process previously completed reports to customize word-use patterns and build a personalized vocabulary. The optional but useful medical vocabulary includes many terms unique to psychiatry and psychology.

Dragon responds to voice commands and macros and features online training and user guides. It works in most Windows-based applications but is not available for Macintosh.4

IBM’s ViaVoice Release 10 (uk.scansoft.com/viavoice) is available in six languages and multiple levels and comes in versions for Windows and Macintosh platforms (also optimized for G4). Its manufacturer offers support for selected digital handheld recorders.

ViaVoice can analyze previous documents and save recorded dictation, is strong on voice navigation, and recognizes file names and tool bar buttons. Users can open a file by speaking its name or activate a command by saying it.

There are some drawbacks, however. Specialized medical vocabularies must be obtained from outside the company, creating additional technical obstacles or requiring developer assistance. Also, several reviewers do not consider ViaVoice as robust, accurate, or fast as Dragon for lengthy or medical dictation.5

Philips SpeechMagic (www.speech.philips.com). Philips has developed sophisticated tools for document creation, transcription, and commands that integrate with larger information systems. The products are network-based and scalable, essentially designed for large groups or medical centers. The programs cannot be purchased from Philips but are installed by its distributors and software vendors.

Microsoft Office XP (http://office.microsoft.com) includes a built-in speech recognition feature as an alternative input method, with dictation and voice command modes. It works with any office program and offers a “taste” of speech recognition, but with extremely limited function. It requires awkward switching between dictation and commands, does not filter out extraneous noises, and has no specialized medical vocabulary.
