The results for DSTC 3 have been released! There were 7 teams submitting a total of 23 entries. The results are available for download on the right, and the Featured Metrics are summarised at DSTC 3 Featured Metrics.
DSTC 2 has concluded, with 9 teams participating and 31 entries in total. The results may be downloaded in the section to the right. The Featured Metrics are presented in a table: DSTC 2 Featured Metrics, and the results are also summarised in a paper at SIGdial.
The Dialog State Tracking Challenge (DSTC) is a research challenge focused on improving the state of the art in tracking the state of spoken dialog systems. State tracking, sometimes called belief tracking, refers to accurately estimating the user's goal as a dialog progresses. Accurate state tracking is desirable because it provides robustness to errors in speech recognition, and helps reduce ambiguity inherent in language within a temporal process like dialog.
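To make the idea of estimating the user's goal from noisy input concrete, here is a minimal sketch in Python. It is not the challenge's baseline or any participant's tracker: the update rule (treating each recognised value as independent evidence), the function names and the example slots, values and confidence scores are all assumptions chosen for illustration.

```python
from collections import defaultdict


def update_belief(belief, slu_hyps):
    """Fold one turn of understanding hypotheses into the running belief.

    belief:   defaultdict(dict) mapping slot -> {value: probability}
    slu_hyps: list of (slot, value, confidence) tuples for the current turn
    """
    for slot, value, conf in slu_hyps:
        prior = belief[slot].get(value, 0.0)
        # Illustrative rule: treat each observation as independent evidence.
        belief[slot][value] = 1.0 - (1.0 - prior) * (1.0 - conf)

    # Renormalise any slot whose values now sum to more than 1.
    for values in belief.values():
        total = sum(values.values())
        if total > 1.0:
            for value in values:
                values[value] /= total
    return belief


def track(dialog):
    """Run the tracker over a whole dialog (a list of turns)."""
    belief = defaultdict(dict)
    for turn in dialog:
        update_belief(belief, turn)
    return {slot: dict(values) for slot, values in belief.items()}


def top_hypothesis(belief):
    """Return the most probable value for each slot."""
    return {slot: max(values, key=values.get)
            for slot, values in belief.items() if values}


# Hypothetical two-turn dialog: the recogniser is unsure about the food
# type in turn 1, and the user's repetition in turn 2 reinforces "italian".
dialog = [
    [("food", "italian", 0.4), ("food", "indian", 0.3)],
    [("food", "italian", 0.5), ("area", "north", 0.8)],
]
belief = track(dialog)
top = top_hypothesis(belief)
print(top)
```

No single turn gives "italian" a confidence above 0.5, yet accumulating evidence across turns raises its belief above the competing value. This is the kind of robustness to recognition errors that state tracking is meant to provide.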
The first DSTC was a success, with 9 teams participating and a total of 27 entries. The data, publications and other materials are available on the DSTC 1 website. DSTC 2 then introduced more complicated and dynamic dialog states, which may change through the dialog, in a new domain (restaurant information). DSTC 3 presents the challenge of adapting to a new domain with a small amount of seed data and a large amount of data in a similar but smaller domain.
Until recently, many state tracking models and approaches had been shared, but direct comparisons were impossible. A shared research task like this facilitates direct comparisons among state tracking models, helping to advance the state of the art.
In this challenge, participants are given labelled corpora of dialogs to develop state tracking algorithms. The trackers will then be evaluated on a common set of held-out dialogs which are released, unlabelled, during a one-week period.
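As a rough illustration of what scoring a tracker on held-out dialogs can look like, the sketch below computes a simple per-turn accuracy in Python. The function name and data layout are hypothetical, and this is only one plausible measure; the metrics actually used in the challenge are those defined in the handbook.

```python
def goal_accuracy(predictions, labels):
    """Fraction of turns where the tracker's top goal equals the label.

    predictions, labels: equal-length lists of dicts mapping slot -> value,
    one entry per turn in the held-out set.
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    correct = sum(1 for pred, gold in zip(predictions, labels) if pred == gold)
    return correct / len(labels)


# Hypothetical tracker output and gold labels for three turns.
predictions = [
    {"food": "italian"},
    {"food": "italian", "area": "north"},
    {"food": "indian", "area": "north"},
]
labels = [
    {"food": "italian"},
    {"food": "italian", "area": "north"},
    {"food": "italian", "area": "north"},
]
accuracy = goal_accuracy(predictions, labels)
print(f"{accuracy:.3f}")
```

Because every tracker is scored against the same labels, figures computed this way are directly comparable across entries.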
The corpus was collected using Amazon Mechanical Turk, and consists of dialogs in two domains: restaurant information, and tourist information. Tourist information subsumes restaurant information, and includes bars, cafés etc. as well as multiple new slots. There will be two rounds of evaluation using this data:
Dialogs used for training are fully labelled: user transcriptions, user dialog-act semantics and dialog state are all annotated. (This corpus is therefore also suitable for studies in Spoken Language Understanding.)
After each round of evaluation, the labelled test sets will be released, along with the output of the trackers entered into the challenge.
For more detailed information, please see the handbook.
To join the mailing list, send an email to listserv@
Post to the list using the address: email@example.com.