SpeechActs has been designed for traveling professionals who require access to on-line information while they are away from their computer. While a portable computer can empower the nomad, the logistics of using a modem can often be impractical or impossible. Telephone-based interaction can be a simpler, lighter-weight means of staying in touch. Conversational speech offers an attractive alternative to keypad input for telephone-based interaction. The telephone is ubiquitous, and conversational speech is familiar, leaves hands and eyes free, and opens up the possibility of a virtually unlimited set of commands.
Implementing a usable conversational interface with error-prone speech recognizers, however, is a challenging task. Despite the difficulties, current speech technologies are now good enough to make experimentation worthwhile. The SpeechActs system is the result of this experimentation.
The video shows SpeechActs being used in the home, in the airport, and at a hotel. While the video focuses primarily on the end-user aspects of the system, it also touches briefly on some of the natural language underpinnings.
To date, SpeechActs includes speech-only interfaces to a suite of integrated applications. Two of these are speech interfaces to Sun's graphical deskset tools, Mail Tool and Calendar Manager. Another two applications provide speech access to dynamic data feeds for weather forecasts and stock quotations. A fifth application, called Notify, allows the user to set up reminder telephone calls.
Following is an example of a typical interaction with the mail application:
SpeechActs: Switching to mail. You have 10 old messages and 63 new messages, of which 14 are priority. Your first three messages are from people in the Labs.
User: Let me hear them.
SpeechActs: Message 1 from Eric Baatz, subject: "Draft Report."
User: Read it to me.
SpeechActs: "The first draft is ready for your comments. Eric."
User: Reply and include the current message.
SpeechActs: Preparing message to Eric Baatz. Begin recording after the tone. When finished, pause for several seconds.
User: Eric, I'll get those to you this afternoon.
SpeechActs: Say cancel, send, or review.
User: Send.
SpeechActs: Message sent. What now?
User: Next message.
SpeechActs: Message 2 from Stuart Adams, subject: "Can we meet at 3:00 today?"
User: Switch to calendar...
Example calendar queries: What do I have tomorrow? What about Bob? What did he have last Wednesday? And next Thursday? What was Paul doing three days after Labor Day?
Based on user studies, the speech version of the calendar was designed to support extensive use of pronominal references, which are common in conversational speech, as well as use of relative dates, which are essential if no graphical calendar is available.
Example weather queries: What's the weather in Seattle? How about Texas? I'd like the extended forecast for Boston.
Example stock quotation queries: What's the price of Sun? What was the volume? Tell me about IBM.
To simplify the software developer's task of writing an application that can be used with speech recognizers from different vendors, the SpeechActs framework includes a Unified Grammar language. This language allows developers to write speech recognition grammars in a recognizer-independent manner. In addition, the Unified Grammars can be augmented with tests and actions for our Swiftus natural language processor. Swiftus is responsible for translating the conversational speech input into sets of feature/value pairs that are easier for a backend application to parse than English-language sentences.
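To make the idea of feature/value pairs concrete, the following is a minimal sketch in Python. It is not Unified Grammar syntax and not the Swiftus implementation; the regular-expression patterns, the feature names (`application`, `action`, `place`, `company`, `number`), and the function `to_features` are all hypothetical, chosen only to illustrate how a spoken request might be reduced to a small dictionary that a backend can consume without parsing English.

```python
import re

# Hypothetical patterns for illustration only. They are NOT Unified Grammar
# rules, and the feature names below are assumptions, not SpeechActs names.
PATTERNS = [
    (re.compile(r"^what'?s the weather in (?P<place>[a-z ]+)$"),
     {"application": "weather", "action": "forecast"}),
    (re.compile(r"^what'?s the price of (?P<company>[a-z ]+)$"),
     {"application": "stocks", "action": "price"}),
    (re.compile(r"^read (?:it|message (?P<number>\d+))(?: to me)?$"),
     {"application": "mail", "action": "read"}),
]


def to_features(utterance):
    """Map a recognized utterance to a dictionary of feature/value pairs."""
    # Normalize case and drop trailing sentence punctuation.
    text = utterance.lower().strip().rstrip("?.!")
    for pattern, fixed in PATTERNS:
        match = pattern.match(text)
        if match:
            features = dict(fixed)
            # Keep only the named groups that actually matched something.
            features.update(
                {name: value
                 for name, value in match.groupdict().items()
                 if value is not None}
            )
            return features
    # Nothing matched: report that explicitly rather than guessing.
    return {"action": "unknown"}


if __name__ == "__main__":
    for spoken in ("What's the weather in Seattle?",
                   "What's the price of Sun?",
                   "Read it to me."):
        print(spoken, "->", to_features(spoken))
```

In this sketch the backend only ever sees the dictionary, so the same application code could in principle sit behind different recognizers, which is the separation the paragraph above describes.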
At run time, a Discourse Manager keeps track of state and other information necessary for successful communication with the user. The Discourse Manager interprets the meaning of pronouns, translates relative dates into specific ones, disambiguates ambiguous user names, and stores common information so that it can be shared by the applications.
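Two of these responsibilities, resolving pronouns and translating relative dates, can be sketched as follows. This is an illustration under stated assumptions, not the SpeechActs Discourse Manager: the class `DiscourseState` and its methods are hypothetical, pronoun resolution simply reuses the most recently named person, and "next Thursday" is assumed to mean the nearest Thursday strictly after today (and "last Wednesday" the nearest Wednesday strictly before today).

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


class DiscourseState:
    """Hypothetical per-conversation state shared across applications."""

    def __init__(self, user, today):
        self.user = user          # the person who is speaking
        self.today = today        # reference date for relative expressions
        self.last_person = None   # most recently named person, if any

    def resolve_person(self, word):
        """Turn a name or pronoun into a specific person."""
        lowered = word.lower()
        if lowered in ("i", "me", "my"):
            return self.user
        if lowered in ("he", "she", "him", "her"):
            if self.last_person is None:
                raise ValueError("no earlier person to refer back to")
            return self.last_person
        # Anything else is treated as a name and remembered for later.
        self.last_person = word
        return word

    def resolve_date(self, phrase):
        """Translate a relative date phrase into a specific date."""
        lowered = phrase.lower()
        if lowered == "today":
            return self.today
        if lowered == "tomorrow":
            return self.today + timedelta(days=1)
        if lowered == "yesterday":
            return self.today - timedelta(days=1)
        direction, _, day = lowered.partition(" ")
        if direction in ("last", "next") and day in WEEKDAYS:
            target = WEEKDAYS.index(day)
            current = self.today.weekday()
            if direction == "next":
                # Between 1 and 7 days ahead, never today itself.
                return self.today + timedelta(days=(target - current - 1) % 7 + 1)
            # Between 1 and 7 days back, never today itself.
            return self.today - timedelta(days=(current - target - 1) % 7 + 1)
        raise ValueError("unsupported date phrase: " + phrase)


if __name__ == "__main__":
    state = DiscourseState(user="Nicole", today=date(1995, 5, 8))
    print(state.resolve_person("Bob"), state.resolve_date("tomorrow"))
    print(state.resolve_person("he"), state.resolve_date("last Wednesday"))
```

Keeping this state in one place, rather than inside each application, mirrors the point above that common information is stored so it can be shared by the applications.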