MS-RL, a Multi-Strategy Learning Method: Realizing a Robust Learning Agent under Environmental Change

Okamoto, Mitsuyoshi (岡本 充義); Yamaguchi, Tomohiro (山口 智浩); Yachida, Masahiko (谷内田 正彦)

https://ipsj.ixsq.nii.ac.jp/records/50734

Name / File	License	Action
IPSJ-ICS98115012.pdf (856.8 kB)	Copyright (c) 1999 by the Information Processing Society of Japan
Open access

Item type

SIG Technical Reports(1)

Publication date

1999-01-11

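Title

MS-RL, a Multi-Strategy Learning Method: Realizing a Robust Learning Agent under Environmental Change (original Japanese title: 多戦略学習手法MS-RL：環境変動下におけるロバストな学習エージェントの実現)

Title (English, as given by the authors)

MS-RL: Multi-Strategy Reinforcement Learning method for a learning agent under a variant environment

Language

jpn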

Resource type

technical report

Resource type identifier

http://purl.org/coar/resource_type/c_18gh

Author affiliation (all three authors)

Graduate School of Engineering Science, Osaka University

Authors

Okamoto, Mitsuyoshi (岡本 充義); Yamaguchi, Tomohiro (山口 智浩); Yachida, Masahiko (谷内田 正彦)

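Abstract (translated from the Japanese)

Description type

Other

Description

The goal of this research is to realize a robust and flexible learning agent in a dynamic environment whose learning conditions change. For an agent to act robustly in an unknown environment, what is required is not optimality, which takes a large amount of time to converge, but learning efficiency within the limited time actually available to the agent, together with the ability to adapt to environmental changes and to relearn. However, when a conventional reinforcement learning algorithm that assumes a single, static, unchanging condition is used in such a changing environment, it cannot cope, and learning performance degrades substantially. This research therefore proposes a multi-strategy parallel reinforcement learning method that runs several different reinforcement learning algorithms in parallel.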

Abstract (English, as given by the authors)

Description type

Other

Description

The objective of this research is to realize a robust and flexible learning agent in a changing environment whose learning conditions change intermittently. Reinforcement learning is one candidate behavior-learning method for an agent that must behave robustly in an unknown environment. Most previous reinforcement learning research assumes restricted conditions, such as an MDP environment, to guarantee the rationality of learning, and tends to pursue convergence to the optimal result given infinite learning time. This paper presents the Multi-Strategy Parallel Reinforcement Learning method (MSP-RL for short), which runs several different reinforcement learning algorithms in parallel.
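The record does not say which algorithms MSP-RL runs or how it arbitrates between them, so the following is only a minimal sketch of the general idea of running several learners in parallel. Everything in it is an assumption made for illustration: the two strategies (tabular Q-learning and SARSA), the toy corridor whose goal jumps to the other end partway through, the reward-based selector, and all class, function, and parameter names.

```python
import random


class SwitchingCorridor:
    """Toy 1-D corridor (hypothetical): the goal jumps from the right end
    to the left end after `switch_at` steps."""

    def __init__(self, length=5, switch_at=2000):
        self.length, self.switch_at = length, switch_at
        self.t = 0
        self.state = length // 2

    def step(self, action):  # action: 0 = left, 1 = right
        goal = self.length - 1 if self.t < self.switch_at else 0
        self.t += 1
        move = 1 if action == 1 else -1
        self.state = min(self.length - 1, max(0, self.state + move))
        if self.state == goal:
            self.state = self.length // 2  # restart from the middle
            return self.state, 1.0
        return self.state, 0.0


class Learner:
    """Tabular TD learner: SARSA if `on_policy`, otherwise Q-learning."""

    def __init__(self, n_states, n_actions, alpha, gamma, on_policy):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha, self.gamma, self.on_policy = alpha, gamma, on_policy
        self.score = 0.0  # running average of reward earned while in control

    def greedy(self, state, rng):
        best = max(self.q[state])
        return rng.choice([a for a, v in enumerate(self.q[state]) if v == best])

    def update(self, s, a, r, s2, a2):
        nxt = self.q[s2][a2] if self.on_policy else max(self.q[s2])
        self.q[s][a] += self.alpha * (r + self.gamma * nxt - self.q[s][a])


def run(n_steps=4000, switch_at=2000, block=25, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    env = SwitchingCorridor(length=5, switch_at=switch_at)
    learners = [
        Learner(5, 2, alpha=0.5, gamma=0.9, on_policy=False),  # Q-learning
        Learner(5, 2, alpha=0.1, gamma=0.9, on_policy=True),   # SARSA
    ]
    controller = learners[0]  # the strategy currently choosing actions

    def act(state):
        if rng.random() < epsilon:  # occasional exploratory action
            return rng.randrange(2)
        return controller.greedy(state, rng)

    rewards = []
    state = env.state
    action = act(state)
    for t in range(n_steps):
        next_state, reward = env.step(action)
        controller.score += 0.05 * (reward - controller.score)
        if (t + 1) % block == 0:  # periodically re-pick the controlling strategy
            if rng.random() < epsilon:
                controller = rng.choice(learners)
            else:
                controller = max(learners, key=lambda ln: ln.score)
        next_action = act(next_state)
        for ln in learners:  # every strategy learns from the shared experience
            ln.update(state, action, reward, next_state, next_action)
        state, action = next_state, next_action
        rewards.append(reward)
    return rewards, learners


if __name__ == "__main__":
    rewards, _ = run()
    print("reward before switch:", sum(rewards[:2000]))
    print("reward after switch: ", sum(rewards[2000:]))
```

In this sketch the learners share one stream of experience and only the selector decides whose greedy action is executed, which keeps all strategies trained while letting control shift when the environment changes. The paper itself may combine its strategies quite differently.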

Bibliographic record ID (NCID)

AA11135936

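Bibliographic information

IPSJ SIG Technical Reports: Intelligence and Complex Systems (ICS) (情報処理学会研究報告知能と複雑系（ICS）)

Vol. 1999, No. 1 (1998-ICS-115), pp. 77-84, issued 1999-01-11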

Notice

SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc.

Publisher

Information Processing Society of Japan
