Proposal of a Model for Efficient Team Formation Using Reinforcement Learning and Its Evaluation under Environmental Change

佐藤, 大樹; 菅原, 俊治 (Satoh, Daiki; Sugawara, Toshiharu)

https://ipsj.ixsq.nii.ac.jp/records/81487

Name / File	License	Action
IPSJ-TOM0501006.pdf (2.4 MB)	Copyright (c) 2012 by the Information Processing Society of Japan
Open access

Item type

Trans(1)

Publication date

2012-03-05

Title (Japanese)

強化学習を用いたチーム編成の効率化モデルの提案と環境変化に対する評価

Title (English)

Efficient Team Formation Based on Learning and Reorganization and Influence of Change of Tasks

Language

jpn

Keywords

Subject scheme

Other

Subject

Original paper

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_6501

Resource type

journal article

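Author affiliation

Department of Computer Science and Engineering, Graduate School of Fundamental Science and Engineering, Waseda University

Author affiliation

Department of Computer Science and Engineering, Graduate School of Fundamental Science and Engineering, Waseda University

Author affiliation (English)

Department of Computer Science and Engineering, Waseda University

Author affiliation (English)

Department of Computer Science and Engineering, Waseda University

Author names

佐藤, 大樹; 菅原, 俊治

Author names (English)

Satoh, Daiki; Sugawara, Toshiharu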

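Abstract

Description type

Other

Description

A task corresponding to a service on the Internet is accomplished by processing the multiple subtasks that compose it. For efficient task processing, each subtask must be assigned appropriately to an agent that has the corresponding capabilities and resources. We have previously proposed a method that improves the efficiency of team formation and of the network structure at the same time, through reinforcement learning and a reorganization of the network structure based on that learning. We further showed that it forms more efficient teams than existing methods even in environments with communication delay. However, the machine learning used there assumed that the internal states of neighboring agents were known, which does not necessarily match real systems. The agent deployment assumed in the experiments was also fixed. In this paper, we therefore first remodel the proposed method so that it performs Q-learning on a reward computed not from other agents' internal states but from messages received from neighbors and a decay rate that accounts for delay. Next, we deploy the agents randomly and show that, regardless of the varied initial deployments, combining learning with changes in organizational structure achieves greater efficiency than existing methods. Finally, we evaluate by experiment that efficient team formation remains possible under environmental changes such as changes in the quantity and variety of tasks.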

Abstract (English)

Description type

Other

Description

A task in a distributed environment is usually achieved by processing a number of subtasks that require different functions and resources. These subtasks have to be processed cooperatively by an appropriate team of agents that have the required functions and sufficient resources, but it is difficult to anticipate, at the design stage of the system, what kinds of tasks will be requested in a dynamic and open environment. We have already shown that our previously proposed method combines learning for team formation with reorganization in a way that adapts to the environment, and that it improves overall performance and increases task success under communication delays that may change dynamically. However, the previous method assumes that agents know the internal states of neighboring agents in order to learn appropriate actions; this information is not always available in real systems. In this paper, we propose a method of distributed team formation that uses a modified Q-learning in which the reward combines success messages from downstream agents with the time elapsed since the task request. We also perform a number of experiments with more general deployments of agents. We show that the method improves overall performance and adapts to environments in which the range and quantity of tasks may change.
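The learning rule named in the abstract can be illustrated with a minimal sketch. The abstract does not give the update formula, so everything here is an assumption made for illustration: the class name `DelayAwareQLearner`, the stateless Q-value per neighbor, the exponential attenuation `decay ** elapsed`, the epsilon-greedy choice, and all parameter values. It shows only the general idea of learning from a neighbor's success message and the elapsed time rather than from the neighbor's internal state.

```python
import random


class DelayAwareQLearner:
    """Hypothetical agent that learns which neighbor to hand a subtask to.

    Assumption: one Q-value per neighbor (stateless Q-learning), updated
    from a success message and the time elapsed since the task request.
    """

    def __init__(self, neighbors, alpha=0.1, decay=0.9, epsilon=0.1, seed=None):
        self.neighbors = list(neighbors)
        self.alpha = alpha      # learning rate (assumed value)
        self.decay = decay      # per-time-step reward attenuation (assumed)
        self.epsilon = epsilon  # exploration probability (assumed)
        self.q = {n: 0.0 for n in self.neighbors}
        self.rng = random.Random(seed)

    def choose(self):
        """Pick a neighbor epsilon-greedily from the current Q-values."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.neighbors)
        return max(self.neighbors, key=lambda n: self.q[n])

    def update(self, neighbor, success, elapsed):
        """Update Q from a message: success flag plus elapsed time steps."""
        # A success that arrives late is worth less; a failure is worth 0.
        reward = self.decay ** elapsed if success else 0.0
        self.q[neighbor] += self.alpha * (reward - self.q[neighbor])
        return reward


# Usage: neighbor "b" reports success sooner than "a", so it is preferred.
agent = DelayAwareQLearner(["a", "b"], epsilon=0.0, seed=0)
agent.update("a", success=True, elapsed=5)
agent.update("b", success=True, elapsed=1)
print(agent.choose())  # prints "b"
```

With exploration switched off, the agent prefers the neighbor whose success messages arrive with less delay, which is the qualitative behavior the abstract describes. The reorganization of the network structure is not modeled in this sketch.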

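Bibliographic record ID

Source identifier type

NCID

Source identifier

AA11464803

Bibliographic information

IPSJ Transactions on Mathematical Modeling and Its Applications (TOM)

Vol. 5, No. 1, pp. 40-49, issued 2012-03-05

ISSN

Source identifier type

ISSN

Source identifier

1882-7780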

Publisher

Information Processing Society of Japan
