- Title
A rich task-oriented dialogue corpus in Vietnamese.
- Authors
Luong, Tho Chi; Le-Hong, Phuong; Tran, Oanh Thi
- Abstract
This paper introduces a new Vietnamese multi-domain task-oriented dialogue corpus which is fully labeled with rich information on dialogue structure and contextual information. The corpus contains 1,910 dialogues, with a total of more than 18,000 turns in four domains (i.e., ProductInfo, OrderInfo, Shipping and Chatchit). To the best of our knowledge, this is the first dialogue corpus aimed at building automated conversations in e-commerce. We describe the rigorous annotation process of labeling rich information about dialogue segmentation, dialogue acts (DAs, a.k.a. communicative functions), dependency relations, rhetorical relations and slot-values on both user and system sides. This corpus will alleviate the shortage of dialogue datasets in low-resource languages, namely Vietnamese. It can be exploited in diverse contexts to facilitate research toward building complete dialogue systems. The large size and rich annotation of the corpus make it suitable for investigating a variety of tasks in conversational systems. In this paper, we perform extensive experiments and report preliminary results for future studies in this interesting yet unexplored field. Specifically, we illustrate the usage of the corpus in developing key modules such as natural language understanding, belief tracking, dialogue policy management and natural language generation.
- Subjects
VIETNAMESE language; NATURAL languages; CORPORA; LABELING theory; FREEDOM of information
- Publication
Language Resources & Evaluation, 2023, Vol. 57, Issue 4, p. 1767
- ISSN
1574-020X
- Publication type
Article
- DOI
10.1007/s10579-022-09618-z