Item type |
SIG Technical Reports(1) |
Publication date |
2024-06-21 |
Title |
Towards a Benchmark Dataset for Stress-testing Fallacy Detection Models |
Title language |
en |
Language |
eng |
Keywords |
Subject scheme |
Other |
Subject |
Language Resources (2) |
Resource type |
Resource type identifier |
http://purl.org/coar/resource_type/c_18gh |
Resource type |
technical report |
Author affiliation |
Japan Advanced Institute of Science and Technology |
Author affiliation |
Beyond Reason |
Author affiliation |
Japan Advanced Institute of Science and Technology / RIKEN |
Author affiliation |
Japan Advanced Institute of Science and Technology |
Author affiliation |
INSA Lyon |
Author affiliation |
Tohoku University |
Author affiliation |
Ricoh Company, Ltd. |
Author affiliation |
RIKEN |
Author affiliation |
Mohamed bin Zayed University of Artificial Intelligence / RIKEN / Tohoku University |
Author names |
Surawat, Pothong
Paul, Reisert
Naoya, Inoue
Irfan, Robbani
Camélia, Guerraoui
Wenzhi, Wang
Shoichi, Naito
Jungmin, Choi
Kentaro, Inui
|
Abstract |
Description type |
Other |
Description |
Arguments with logical fallacies are common in daily discourse. While several benchmark datasets for automated fallacy detection exist [1], none are specifically designed for stress-testing models' ability to assess the validity of underlying logic in the arguments. To address this issue, inspired by the Winograd Schema Challenge, we introduce a pilot benchmark dataset for fallacy detection. Our dataset consists of minimally different pairs of arguments that differ only in their degree of fallaciousness. We report the results of a pilot annotation study and preliminary experiments with large language models. Finally, we discuss potential future directions. |
Bibliographic record ID |
Source identifier type |
NCID |
Source identifier |
AN10115061 |
Bibliographic information |
研究報告自然言語処理(NL) (SIG Technical Reports, Natural Language Processing)
Vol. 2024-NL-260,
No. 18,
pp. 1-6,
Issue date 2024-06-21
|
ISSN |
Source identifier type |
ISSN |
Source identifier |
2188-8779 |
Notice |
SIG Technical Reports are non-refereed and hence may later appear in journals, conferences, symposia, etc. |
Publisher |
Language |
ja |
Publisher |
Information Processing Society of Japan (情報処理学会) |