Item type |
SIG Technical Reports(1) |
Publication date |
2017-07-12 |
Title |
|
|
Title |
Test Collections and Measures for Evaluating Customer-Helpdesk Dialogues |
Language |
|
|
Language |
eng |
Keywords |
|
|
Subject scheme |
Other |
|
Subject |
Dialogue and generation |
Resource type |
|
|
Resource type identifier |
http://purl.org/coar/resource_type/c_18gh |
|
Resource type |
technical report |
Author affiliation |
|
|
|
Waseda University |
Author affiliation |
|
|
|
Tsinghua University |
Author affiliation |
|
|
|
Huawei Noah's Ark Lab |
Author affiliation |
|
|
|
Huawei Noah's Ark Lab |
Author affiliation |
|
|
|
Waseda University |
Authors |
Zhaohao Zeng
Cheng Luo
Lifeng Shang
Hang Li
Tetsuya Sakai
|
Abstract |
|
|
Description type |
Other |
|
Description |
We address the problem of evaluating textual, task-oriented dialogues between a customer and a helpdesk, such as those that take the form of online chats. As an initial step towards evaluating automatic helpdesk dialogue systems, we have constructed a test collection comprising 3,700 real Customer-Helpdesk multi-turn dialogues by mining Weibo, a major Chinese social media platform. We have annotated each dialogue with multiple subjective quality annotations and nugget annotations. In addition, 10% of the dialogues have been manually translated into English. Our test collection, DCH-1, will be made publicly available for research purposes. We also propose a simple nugget-based evaluation measure for task-oriented dialogue evaluation, which we call UCH, and explore its usefulness and limitations. |
Bibliographic record ID |
|
|
Source identifier type |
NCID |
|
Source identifier |
AN10115061 |
Bibliographic information |
SIG Technical Reports: Natural Language Processing (NL) [研究報告自然言語処理]
Vol. 2017-NL-232,
No. 11,
pp. 1-7,
Issue date 2017-07-12
|
ISSN |
|
|
Source identifier type |
ISSN |
|
Source identifier |
2188-8779 |
Notice |
|
|
|
SIG Technical Reports are non-refereed, and hence their contents may later appear in journals, conferences, symposia, etc. |
Publisher |
|
|
Language |
ja |
|
Publisher |
Information Processing Society of Japan (情報処理学会) |