SIRIDUS Annual Report 2001
Project Aims
Spoken language dialogue systems, such as automated telephone enquiry systems
and hands-free in-car device control, are rapidly becoming a commercial
reality. SIRIDUS aims to improve the understanding of what is required
to provide reusable, robust and user-friendly spoken dialogue systems.
The project demonstrators will include an automated telephone operator,
and an integrated toolset for dialogue researchers.
Particular concerns in SIRIDUS are:
- achieving robustness when user utterances are unpredictable and speech
  recognition is noisy
- showing that generic strategies for dialogue management can be applied
  to a wide range of dialogues, including "command" dialogues and negotiative
  dialogues, not just information-seeking dialogues
- providing architectures which allow appropriate sharing of information
  between modules, for example enabling dialogue systems to generate appropriately
  stressed output, e.g. "Did you mean the KITCHEN light or the HALL light?" vs.
  "Did you mean the kitchen LIGHT or the kitchen FAN?"
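The stressed-output example above can be illustrated with a minimal sketch. It assumes the generator knows the two competing descriptions and that they align word for word; the function names mark_contrast and clarification_question are hypothetical and are not part of any Siridus module.

```python
from typing import Tuple


def mark_contrast(first: str, second: str) -> Tuple[str, str]:
    """Upper-case the words that differ between two aligned phrases."""
    a, b = first.split(), second.split()
    if len(a) != len(b):
        # Assumption: this sketch only handles alternatives of equal length.
        return first, second
    out_a = [x.upper() if x != y else x for x, y in zip(a, b)]
    out_b = [y.upper() if x != y else y for x, y in zip(a, b)]
    return " ".join(out_a), " ".join(out_b)


def clarification_question(first: str, second: str) -> str:
    """Build a clarification question with stress on the contrasting words."""
    a, b = mark_contrast(first, second)
    return f"Did you mean the {a} or the {b}?"


print(clarification_question("kitchen light", "hall light"))
print(clarification_question("kitchen light", "kitchen fan"))
```

In a real system the upper-case marking would be replaced by prosodic markup passed to the synthesiser; the point here is only that the stress placement depends on information held outside the generator.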
Summary of 2001 Activities
The second year highlights include:
- an initial version of the telephone operator demonstrator
- more rounded demonstrators built from the dialogue system toolset
- incorporation of Siridus technology in the demonstrators for the EU Project
  D'Homme
User requirements and Market Prospects
The market for dialogue systems has developed rapidly. The VoiceXML
standard is increasingly prominent, and incorporates some
flexibility, enabling users to answer more than one question at
once. There seems little doubt that it would be useful for users to
also be able to contradict existing information, or ask a question
themselves. By pushing the limits of the kinds of flexibility which
can be achieved in practical systems, the Siridus project should
provide a good basis for future development of the standards in
this area, and for building more user-friendly dialogue systems. By
emphasising reconfigurability and robustness, we also hope to meet the
challenge of providing greater user-friendliness without sacrificing
reliability or raising deployment costs. The market for
telephone systems which allow dialling by name has been demonstrated by
voice-operated personal assistants such as Wildfire. By allowing more
natural voice-based exchanges, the Siridus telephone demonstrator will
provide a similar service to untrained users in a corporate
environment. An analysis of user requirements for the telephone
operator system is provided in Deliverable D3-1, and an analysis of
requirements for architectures which support advanced dialogue systems
in Deliverable D6-1.
Technology outlook and innovative features
The Siridus project aims to provide innovative research which is applicable
to real systems in the near term. The first two years have produced
innovative research on how to build systems for natural command
languages and negotiative dialogue. Natural command language dialogues
immediately take us outside simple slot filling based on filling in multiple
parameters for a single task, since multiple tasks are often specified
in the same utterance e.g. "Call Heather and transfer incoming calls to
Peter". The work on negotiative dialogue challenges conventional
views of how dialogue is structured. Examples from the corpus of travel
agency dialogues show that questions can remain unanswered, and that users
negotiate with the agent as to which parameters (such as the destination
city, or the arrival time) they want to fix. This requires new ways for
a system to structure a dialogue. A candidate theoretical framework for
this is specified in D1.2.
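To illustrate why natural command language goes beyond single-task slot filling, the following minimal sketch turns one utterance into several task frames. The patterns, frame fields and the function parse_commands are hypothetical; the report does not show the demonstrator's actual grammar, and splitting on "and" is a deliberate simplification that would fail on names containing that word.

```python
import re
from typing import Dict, List

# Hypothetical task patterns, invented for this illustration only.
TASK_PATTERNS = [
    ("call", re.compile(r"^(?:call|phone) (?P<callee>.+)$", re.IGNORECASE)),
    (
        "transfer",
        re.compile(
            r"^transfer (?:my |incoming )?calls to (?P<target>.+)$",
            re.IGNORECASE,
        ),
    ),
]


def parse_commands(utterance: str) -> List[Dict[str, str]]:
    """Return one task frame per recognised clause in the utterance."""
    frames = []
    # Simplifying assumption: clauses are separated by the word "and".
    for clause in re.split(r"\s+and\s+", utterance.strip()):
        for task, pattern in TASK_PATTERNS:
            match = pattern.match(clause)
            if match:
                frames.append({"task": task, **match.groupdict()})
                break
    return frames


print(parse_commands("Call Heather and transfer incoming calls to Peter"))
```

A single-task slot filler would have one frame with a fixed set of parameters; here the result is a list of frames, each with its own task type and parameters.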
A key to the work on robust interpretation is that it provides a uniform
way to express rules based on keyword/key phrase spotting or more detailed
linguistic descriptions. This makes it simple to ensure that the system
always performs at least as well as a keyword/key phrase spotting system,
since extra linguistic information is only used if it is both available
and likely to be helpful. Innovative work in Siridus, described in D4.1,
aims to combine this approach with repair strategies for robustness,
and to show that it can accommodate rather more detailed linguistic
descriptions, thereby encompassing approaches which aim at a full linguistic
analysis of user utterances.
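The idea of a uniform rule format with a keyword-spotting floor can be sketched as follows. This is an illustration under stated assumptions, not the formalism of D4.1: the Rule class, the RULES list, the "polarity" feature and the interpret function are all hypothetical. Each rule lists required keywords and, optionally, required features from a deeper linguistic analysis; the most specific applicable rule wins.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Rule:
    meaning: str
    keywords: Set[str]
    # Optional constraints on a deeper analysis (hypothetical feature names).
    features: Dict[str, str] = field(default_factory=dict)

    def applies(self, tokens: Set[str], analysis: Dict[str, str]) -> bool:
        """True if all keywords occur and all required features match."""
        return self.keywords <= tokens and all(
            analysis.get(name) == value for name, value in self.features.items()
        )

    def specificity(self) -> int:
        """More conditions means a more specific rule."""
        return len(self.keywords) + len(self.features)


RULES: List[Rule] = [
    Rule("transfer_calls", {"transfer"}),  # pure keyword rule
    Rule("cancel_transfer", {"transfer"}, {"polarity": "negative"}),
]


def interpret(utterance: str, analysis: Optional[Dict[str, str]] = None) -> Optional[str]:
    """Pick the most specific rule that applies; None if nothing matches."""
    tokens = set(utterance.lower().split())
    candidates = [r for r in RULES if r.applies(tokens, analysis or {})]
    if not candidates:
        return None
    return max(candidates, key=Rule.specificity).meaning


print(interpret("do not transfer my calls", {"polarity": "negative"}))
print(interpret("do not transfer my calls"))
```

When the deeper analysis is missing, for instance because parsing failed on noisy recogniser output, only the keyword rule applies, so the result is the same as that of a plain keyword spotter. The second call shows the cost of that fallback: without the polarity feature the negation is missed.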
The Demonstrator
The Siridus project is building two main demonstrators. The first is the
telephone operator dialogue system mentioned above. This demonstrator is
in Spanish, and allows a user to conduct a dialogue such as the following (translated
from Spanish):
U: Hello I would like to place a collect call
S: Please specify a destination for the collect call
U: To the number 123456789
S: Placing the collect call. Would you like to continue?
U: Yes please
S: Please specify a function
U: I wish to send a message to Juan Perez
The demonstrator allows a user to call people by name (e.g. "Phone Fred
Smith"), to transfer calls (e.g. "Transfer my calls to Fred Smith") and
to arrange conference calls. This saves the effort of first looking up a
number, e.g. in the corporate directory on the web, before making a call. The dialogue
history is also used to enable functions which are not primitive operations
of the PABX, e.g. "retry last call".
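A function such as "retry last call" can be derived from the dialogue history rather than from the exchange itself. The sketch below assumes a simple list of past actions; the OperatorSession class and its methods are hypothetical and do not reflect the demonstrator's real interface to the PABX.

```python
from typing import Dict, List


class OperatorSession:
    """Keeps a history of actions so that composite functions can use it."""

    def __init__(self) -> None:
        self.history: List[Dict[str, str]] = []

    def place_call(self, number: str) -> str:
        # A primitive operation: record it, then hand it to the exchange.
        self.history.append({"action": "call", "number": number})
        return f"Calling {number}"

    def transfer_calls(self, target: str) -> str:
        self.history.append({"action": "transfer", "target": target})
        return f"Transferring calls to {target}"

    def retry_last_call(self) -> str:
        # Not a primitive: look back through the history for the latest call.
        for event in reversed(self.history):
            if event["action"] == "call":
                return self.place_call(event["number"])
        return "There is no previous call to retry"


session = OperatorSession()
session.place_call("123456789")
session.transfer_calls("Fred Smith")
print(session.retry_last_call())
```

The retry skips the intervening transfer and repeats the most recent call, which the exchange alone could not do without access to the history.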
The second demonstrator is a toolkit for dialogue researchers. The aim
is to provide both a library of modules, and a toolkit in which dialogue
researchers can plug in their particular module to test its effect on a
whole system. The current toolkit is a derivative of the Trindikit, which
is based on the Information State update view of dialogue. This is particularly
convenient when experimenting with new modules e.g. interpreters or generation
components which access dialogue state information (e.g. the last move).
The Trindikit now allows asynchronous communication between modules. Current
developments are designed to make it easier to plug and play
components written in different programming languages, and to provide handles for
integrating a variety of recognisers and synthesisers. Recognisers used
at the Siridus partner sites include Nuance, Dragon and IBM ViaVoice, and
there are plans to also use the HTK Toolkit; synthesisers include IBM, Nuance and
Festival.
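The Information State update view can be sketched as a shared state plus update rules, each with a precondition and an effect. The code below is an illustration only and is not the Trindikit's own rule language or API; the field names (latest_move, agenda, shared_facts), the rule names and the move format "answer:..." are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class InfoState:
    latest_move: Optional[str] = None  # e.g. "answer:destination=123456789"
    agenda: List[str] = field(default_factory=list)
    shared_facts: List[str] = field(default_factory=list)


@dataclass
class UpdateRule:
    name: str
    precondition: Callable[[InfoState], bool]
    effect: Callable[[InfoState], None]


def integrate_answer(state: InfoState) -> None:
    """Move the content of an answer into the shared facts."""
    state.shared_facts.append(state.latest_move.split(":", 1)[1])
    state.latest_move = None


RULES = [
    UpdateRule(
        "integrate_answer",
        lambda s: s.latest_move is not None and s.latest_move.startswith("answer:"),
        integrate_answer,
    ),
    UpdateRule(
        "ask_next_function",
        lambda s: s.latest_move is None and not s.agenda,
        lambda s: s.agenda.append("ask(function)"),
    ),
]


def apply_rules(state: InfoState, rules: List[UpdateRule]) -> List[str]:
    """Fire, in order, every rule whose precondition holds; return their names."""
    fired = []
    for rule in rules:
        if rule.precondition(state):
            rule.effect(state)
            fired.append(rule.name)
    return fired


state = InfoState(latest_move="answer:destination=123456789")
print(apply_rules(state, RULES), state)
```

Because every module reads and writes the same state, a new interpreter or generation component only needs to know the state fields it uses, such as the last move, which is what makes plugging in modules convenient.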
Currently the Trindikit is supported under Linux and Solaris. On top
of the Trindikit there are already several example dialogue systems, including
a telephone operator system in Spanish, and a travel booking system in
Swedish and English. A new system incorporating the Siridus approach to robust interpretation has been
made available on the website of the EU project D'Homme.
The current Trindikit can be downloaded from http://www.ling.gu.se/research/projects/trindi/trindikit/
User Group, Promotion and Awareness
The project has created an International Consultation and User Group
(ICUG). All members will be invited to a Siridus workshop being organised
for April 2002. Dissemination of project results has so far been
primarily to the computational linguistics and dialogue community
through conference papers and proceedings, including ACL 2000, Gotalog,
Bi-Dialog, NAACL 2001 and SEPLN 2001. Siridus results have also been
presented to the speech community at the WISP conference organised by
the British Institute of Acoustics. A particularly good showcase for
the project was the advanced course on the Information State Approach to
Dialogue Management: Theory and Implementation at the European Summer
School in Logic, Language and Information, where work in both
Siridus and its predecessor project Trindi was presented to a wide
audience.
Future Work
The Siridus Project will run until the end of 2002. In 2002 we will provide:
- The final telephone operator demonstrator
- Evaluation of the information state based view of dialogue
- Integrated demonstrator of robust interpretation and repair mechanisms
- Demonstration of prosodically varied speech output using information from the information state
- Revised toolkit for dialogue designers
- Prototype dialogue systems illustrating flexible dialogue
Further Information
Further information about the project, including publications and deliverables,
can be obtained from the Siridus website http://www.ling.gu.se/projekt/siridus/
or from the project administrative coordinator.
Administrative Coordinator
Marieke Schmitt, ms@eurice.de
Technical Coordinator
Robin Cooper, cooper@ling.gu.se