An Agent Playing a Bullet Hell Game Using Reinforcement Learning

Fujimoto, Shuji (藤本, 修嗣)

https://ipsj.ixsq.nii.ac.jp/records/240726

Name / File	License
IPSJ-GPWS2024008.pdf (1.8 MB)	Copyright (c) 2024 by the Information Processing Society of Japan
Open access

Item type

Symposium(1)

Publication date

2024-11-15

Title

強化学習を用いた弾幕シューティングゲームを攻略するエージェントの作成

Title (English)

An Agent Playing a Bullet Hell Game Using Reinforcement Learning

Language

jpn

Keywords

Deep reinforcement learning (深層強化学習); bullet hell shooting game (弾幕シューティングゲーム); Rainbow

Subject scheme

Other

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_5794

Resource type

conference paper

Author affiliation

Graduate School of Human and Environmental Studies, Kyoto University (京都大学大学院人間・環境学研究科)

Author

Fujimoto, Shuji (藤本, 修嗣)

Abstract

Description type

Other

Description

Deep Q-Network (DQN), which brings deep neural networks into reinforcement learning, has surpassed human scores on games without being taught their detailed rules. Its successor, Rainbow, combines techniques that proved effective after DQN (Double DQN, Prioritized experience replay, Dueling network, Multi-step learning, Categorical DQN, and NoisyNet) and has achieved strong results on Atari games. However, there are relatively few reports of agents that learn a bullet hell game as well as humans when they observe only the pixels on the screen and not the game's internal state. Bullet hell games are shooting games that focus on dodging. They demand precise evasive movement against enemy bullets and some degree of long-term planning to avoid getting trapped in an inescapable situation; in addition, the environment is stochastic, and without internal information it is hard to tell when a situation has become inescapable. Rainbow combines several learning techniques, but for some games removing a component is known to improve learning. In this study, we compare how several learning methods and neural network input designs change performance and behavior in a bullet hell game. The results suggest that, among Rainbow's components, Categorical DQN helps the agent learn global avoidance behavior, which leads to more stable score improvement.
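
As a rough illustration of the Categorical DQN component named in the abstract, the sketch below shows the core idea only: each action's return is represented as a probability distribution over a fixed set of values ("atoms"), and the Q-value used for action selection is the mean of that distribution. This is not the paper's implementation. The function names, the three-atom support, and the example probabilities are all hypothetical and chosen for readability; a real agent would produce the probabilities with a neural network.

```python
# Minimal sketch of the value representation in Categorical DQN.
# All names and numbers here are illustrative assumptions, not from the paper.

def make_support(v_min, v_max, n_atoms):
    """Evenly spaced return values (atoms) between v_min and v_max."""
    step = (v_max - v_min) / (n_atoms - 1)
    return [v_min + i * step for i in range(n_atoms)]

def expected_values(probs_per_action, support):
    """Q-value of each action = mean of its return distribution."""
    return [sum(p * z for p, z in zip(probs, support))
            for probs in probs_per_action]

def greedy_action(probs_per_action, support):
    """Index of the action with the largest expected return."""
    q = expected_values(probs_per_action, support)
    return max(range(len(q)), key=q.__getitem__)

support = make_support(-1.0, 1.0, 3)   # atoms: -1.0, 0.0, 1.0
probs = [
    [0.5, 0.0, 0.5],   # hypothetical action 0: either very bad or very good
    [0.0, 0.9, 0.1],   # hypothetical action 1: almost always neutral
]
q_values = expected_values(probs, support)
best = greedy_action(probs, support)
```

In this toy case the two actions have similar means but very different spreads. A scalar Q-value alone cannot distinguish them, while the learned distribution keeps that information.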

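Bibliographic information

Proceedings of the Game Programming Workshop 2024 (ゲームプログラミングワークショップ2024論文集)

Vol. 2024, pp. 51-57, issued 2024-11-15

Publisher

Information Processing Society of Japan (情報処理学会)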

Versions

Ver.1

2025-01-19 07:53:14.253080

Cite as

Fujimoto, Shuji, 2024: Information Processing Society of Japan, pp. 51–57.
