Item type
SIG Technical Reports(1)

Publication date
2023-11-25
Title
A Comparative Analysis of Large Language Models for Contextually Relevant Question Generation in Education

Language
en
Keywords

Subject Scheme
Other

Subject
CLE General Session (2)
Resource type

Resource type identifier
http://purl.org/coar/resource_type/c_18gh

Resource type
technical report
Author affiliation
Faculty of Information Science and Electrical Engineering, Kyushu University

Author affiliation
Data-Driven Innovation Initiative, Kyushu University

Author affiliation
Faculty of Information Science and Electrical Engineering, Kyushu University
Author names
Lodovico, Molina Ivo
Tsubasa, Minematsu
Atsushi, Shimada
Abstract

Description type
Other

Description
This paper explores the potential of large language models (LLMs) for Automatic Question Generation in educational contexts. We compare three models (GPT-3.5 Turbo, Flan T5 XXL, and Llama 2-Chat 13B) on their ability to generate relevant questions from university slide text without fine-tuning. Questions were generated in a two-step pipeline: first, answer phrases were extracted from slides using Llama 2-Chat 13B; then, each of the three models generated a question for each answer. To evaluate question quality, a survey was conducted in which students rated 144 questions across five metrics: clarity, relevance, difficulty, slide relation, and answer correctness. Results showed that GPT-3.5 Turbo and Llama 2-Chat 13B outperformed Flan T5 XXL overall, with Flan T5 XXL scoring lower on clarity and answer-question alignment. GPT-3.5 Turbo excelled at tailoring questions to match the input answers. While questions viewed in isolation seemed coherent for all models, Llama 2-Chat 13B and Flan T5 XXL showed weaker alignment between generated questions and answers than GPT-3.5 Turbo. This research analyzes the capacity of LLMs, particularly GPT-3.5 Turbo and Llama 2-Chat 13B, for Automatic Question Generation to enhance education without any fine-tuning. Further work is needed to optimize models and methodologies to continually improve question relevance and quality.
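A minimal sketch of the two-step pipeline described in the abstract, assuming the Hugging Face transformers text-generation pipeline as the interface to Llama 2-Chat 13B; the prompts, helper names, and generation parameters are illustrative assumptions, not the authors' actual implementation.

# Hypothetical sketch: step 1 extracts answer phrases from slide text with
# Llama 2-Chat 13B; step 2 asks a candidate model to generate one question
# per extracted answer. Prompts and parameters are assumptions for illustration.
from transformers import pipeline

# Step-1 model: Llama 2-Chat 13B as the answer-phrase extractor.
answer_extractor = pipeline("text-generation",
                            model="meta-llama/Llama-2-13b-chat-hf")

def extract_answers(slide_text: str) -> list[str]:
    """Step 1: prompt Llama 2-Chat 13B for key answer phrases, one per line."""
    prompt = ("Extract the key answer phrases from this lecture slide, "
              "one per line:\n" + slide_text + "\nAnswer phrases:\n")
    completion = answer_extractor(prompt, max_new_tokens=128,
                                  return_full_text=False)[0]["generated_text"]
    return [line.strip("- ").strip()
            for line in completion.splitlines() if line.strip()]

def generate_question(qg_model, slide_text: str, answer: str) -> str:
    """Step 2: ask a candidate model for a question answered by `answer`."""
    prompt = ("Slide:\n" + slide_text + "\nWrite one clear question whose "
              "answer is: " + answer + "\nQuestion:")
    return qg_model(prompt, max_new_tokens=64,
                    return_full_text=False)[0]["generated_text"].strip()

In this sketch the same pipeline interface stands in for all three candidate models; in practice Flan T5 XXL is a seq2seq model (served by the text2text-generation pipeline, which returns only the generated text) and GPT-3.5 Turbo is reached through the OpenAI chat API, so each backend needs its own small wrapper before the 144 questions are collected and rated on the five survey metrics.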
Bibliographic record ID

Source identifier type
NCID

Source identifier
AN10096193
Bibliographic information
IPSJ SIG Technical Report: Computer and Education (CE)
Vol. 2023-CE-172,
No. 25,
pp. 1-8,
Issue date: 2023-11-25
ISSN

Source identifier type
ISSN

Source identifier
2188-8930
Notice

SIG Technical Reports are non-refereed and hence may later appear in journals, conference proceedings, symposia, etc.
Publisher

Language
ja

Publisher
情報処理学会 (Information Processing Society of Japan)