Towards Reliable LLM-Controlled Systems: Developing Evaluation Standards and Testing Frameworks
https://ipsj.ixsq.nii.ac.jp/records/241877
| Name / File | License | Action |
|---|---|---|
| Available for download from December 27, 2026. | Copyright (c) 2024 by the Information Processing Society of Japan | |
| Non-members: ¥0, IPSJ members: ¥0, EMB members: ¥0, DLIB members: ¥0 | | |

| Item type | Symposium(1) |
|---|---|
| Publication date | 2024-12-27 |
| Title | Towards Reliable LLM-Controlled Systems: Developing Evaluation Standards and Testing Frameworks |
| Title language | en |
| Language | eng |
| Resource type identifier | http://purl.org/coar/resource_type/c_5794 |
| Resource type | conference paper |
| Author affiliation | Shibaura Institute of Technology |
| Author affiliation | Shibaura Institute of Technology |
| Author names | Masaharu, Takahashi; Kenji, Hisazumi |
| Abstract | This study proposes evaluation criteria to promote the development of large language model (LLM) systems that users can trust. In addition, we study the creation of tests based on these criteria. First, the reliability aspects of an LLM system are extracted, and the specifications and conditions for satisfying them are presented. Experiments are then conducted to demonstrate that these specifications and conditions apply to actual LLM systems. The ultimate goal is to improve the quality of LLM systems by supporting developers during development, and to inform users' decisions about whether to use such a system. |
| Description type | Other |
| Bibliographic information | Proceedings of Asia Pacific Conference on Robot IoT System Development and Platform, vol. 2024, pp. 65-66, issued 2024-12-27 |
| Publisher | Information Processing Society of Japan (情報処理学会) |
| Publisher language | ja |