Mutation-driven Follow the Regularized Leader for Last-iterate Convergence in Zero-sum Games

Kenshi Abe (阿部 拳之); Kentaro Toyoshima (豊島 健太郎); Mitsuki Sakamoto (坂本 充生); Atsushi Iwasaki (岩崎 敦)

https://doi.org/10.20729/00234158

File: IPSJ-JNL6505010.pdf (1.2 MB), available for download from May 15, 2026
License: Copyright (c) 2024 by the Information Processing Society of Japan
Price: non-members ¥660; IPSJ members ¥330; Journal (論文誌) members ¥0; DLIB members ¥0

Item type

Journal(1)

Publication date

2024-05-15

Title

二人零和ゲームにおける突然変異駆動型正則化先導者追従法の終極反復収束

Title (English)

Mutation-driven Follow the Regularized Leader for Last-iterate Convergence in Zero-sum Games

Language

jpn

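Keywords

Subject scheme

Other

Subject

[Regular paper (recommended paper)] last-iterate convergence, Follow the Regularized Leader, zero-sum games, replicator dynamics with mutation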

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_6501

Resource type

journal article

ID registration

10.20729/00234158

ID registration type

JaLC

Author affiliations

CyberAgent, Inc. / The University of Electro-Communications
The University of Electro-Communications
The University of Electro-Communications
The University of Electro-Communications

Authors

Kenshi Abe (阿部 拳之)
Kentaro Toyoshima (豊島 健太郎)
Mitsuki Sakamoto (坂本 充生)
Atsushi Iwasaki (岩崎 敦)

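Abstract

Description type

Other

Description

This study examines the consequences of a learning algorithm that introduces mutation into Follow the Regularized Leader (FTRL) in two-player zero-sum games. FTRL is a class of algorithms for which the time average of the strategies is guaranteed to converge to a Nash equilibrium. However, many FTRL algorithms are known to fall into cycling trajectories and fail to converge directly to an equilibrium. We therefore propose Mutant FTRL (M-FTRL), which uses mutation in a way that is equivalent to the replicator dynamics with mutation commonly used in evolutionary game theory. We then analyze the continuous-time dynamics of M-FTRL and guarantee strong convergence toward a stationary point that approximates a Nash equilibrium. Furthermore, by suitably updating the reference strategy contained in the mutation term of M-FTRL, we guarantee direct convergence (last-iterate convergence) to an exact, rather than approximate, Nash equilibrium.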
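For orientation, one commonly used form of the replicator dynamics with mutation can be written as follows. The abstract does not state the equation, so every symbol here is an assumption: \pi_i(a) is the probability that player i plays action a, q_i^{\pi}(a) is the expected payoff of that action under the current strategy profile, \mu is the mutation rate, and c_i is the reference strategy.

```latex
\frac{d}{dt}\pi_i(a)
  = \pi_i(a)\left[ q_i^{\pi}(a) - v_i^{\pi} \right]
  + \mu\left( c_i(a) - \pi_i(a) \right),
\qquad
v_i^{\pi} = \sum_{a'} \pi_i(a')\, q_i^{\pi}(a')
```

The first term is the standard replicator term, which increases the probability of actions that perform better than average. The second term pulls the strategy toward the reference strategy c_i at rate \mu, which is what keeps the trajectory away from the boundary of the simplex.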

Abstract (English)

Description type

Other

Description

This study considers a variant of the Follow the Regularized Leader (FTRL) dynamics in two-player zero-sum games. FTRL guarantees that the time-averaged strategies converge to a Nash equilibrium. However, many FTRL variants exhibit limit cycles, i.e., they lack a last-iterate convergence guarantee. To address this, we propose the mutant FTRL (M-FTRL) algorithm, which introduces mutation to perturb action probabilities. We then investigate the continuous-time dynamics of M-FTRL and provide strong convergence guarantees toward stationary points that approximate a Nash equilibrium. Furthermore, by updating the reference strategy of the mutation term in M-FTRL, we ensure last-iterate convergence to an exact Nash equilibrium.
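The behavior described in the abstract can be illustrated with a minimal discrete-time sketch. It is not the authors' implementation: the rock-paper-scissors payoff matrix, the entropy regularizer (which turns FTRL into a softmax of cumulative scores), the perturbation term `mu * (c(a) - pi(a)) / pi(a)`, the step size `eta`, and the schedule that copies the current strategy into the reference every `period` steps are all assumptions chosen for illustration.

```python
import math

# Rock-paper-scissors payoffs for player 1; player 2 receives the negative.
A = [[0.0, -1.0, 1.0],
     [1.0, 0.0, -1.0],
     [-1.0, 1.0, 0.0]]
N = 3
UNIFORM = [1.0 / N] * N  # the unique Nash equilibrium of this game


def softmax(z):
    """FTRL with an entropy regularizer maps cumulative scores to a softmax."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def action_values(x, y):
    """Expected payoff per action: q1 = A y for player 1, q2 = -A^T x for player 2."""
    q1 = [sum(A[i][j] * y[j] for j in range(N)) for i in range(N)]
    q2 = [-sum(A[i][j] * x[i] for i in range(N)) for j in range(N)]
    return q1, q2


def mutation(mu, ref, pi):
    """Assumed perturbation mu * (c(a) - pi(a)) / pi(a); zero when mu == 0."""
    if mu == 0.0:
        return [0.0] * N
    return [mu * (c - p) / p for c, p in zip(ref, pi)]


def run(mu, ref, rounds=20, period=2000, eta=0.05, update_ref=False):
    """Simultaneous updates for both players; returns the last iterate."""
    z1, z2 = [1.0, 0.0, 0.0], [0.0, 0.5, 0.0]  # arbitrary asymmetric start
    c1, c2 = list(ref), list(ref)
    for _ in range(rounds):
        for _ in range(period):
            x, y = softmax(z1), softmax(z2)
            q1, q2 = action_values(x, y)
            m1, m2 = mutation(mu, c1, x), mutation(mu, c2, y)
            z1 = [z + eta * (q + m) for z, q, m in zip(z1, q1, m1)]
            z2 = [z + eta * (q + m) for z, q, m in zip(z2, q2, m2)]
        if update_ref:
            # Hypothetical schedule: the reference becomes the current strategy.
            c1, c2 = softmax(z1), softmax(z2)
    return softmax(z1), softmax(z2)


def max_dev(x, y):
    """Largest gap between any action probability and the uniform equilibrium."""
    return max(abs(p - u) for p, u in zip(x + y, UNIFORM * 2))
```

Comparing `max_dev(*run(0.0, UNIFORM))`, `max_dev(*run(0.1, [0.6, 0.2, 0.2]))`, and `max_dev(*run(0.1, [0.6, 0.2, 0.2], update_ref=True))` gives a feel for the three regimes: without mutation the last iterate should stay away from the equilibrium, with a fixed biased reference it should settle near but not at the equilibrium, and with reference updates it should approach the equilibrium itself.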

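Bibliographic record ID

Source identifier type

NCID

Source identifier

AN00116647

Bibliographic information

情報処理学会論文誌 (IPSJ Journal)

Vol. 65, No. 5, pp. 968-979, issued 2024-05-15

ISSN

Source identifier type

ISSN

Source identifier

1882-7764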

Publisher

Information Processing Society of Japan (情報処理学会)


Versions

Ver.1

2025-01-19 09:49:49.952896
