Learning Weights of Training Data by Game Results (対局に基づいた教師データの重要度の学習)

Yoshikuni Sato (佐藤 佳州); Daisuke Takahashi (高橋 大介)

https://ipsj.ixsq.nii.ac.jp/records/106984

Name / File	License	Action
IPSJ-JNL5511012.pdf (772.8 kB)	Copyright (c) 2014 by the Information Processing Society of Japan
Open access

Item type

Journal(1)

Publication date

2014-11-15

Title

対局に基づいた教師データの重要度の学習

Title (English)

Learning Weights of Training Data by Game Results

Language

jpn

Keywords

Subject scheme

Other

Subject

[Special Issue: Game Programming] games, artificial intelligence, shogi

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_6501

Resource type

journal article

Author affiliations

Graduate School of Systems and Information Engineering, University of Tsukuba / Presently with Advanced Technology Research Laboratories, Panasonic Corporation

Faculty of Engineering, Information and Systems, University of Tsukuba

Authors

Yoshikuni Sato (佐藤 佳州)
Daisuke Takahashi (高橋 大介)

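Abstract (translated from the Japanese)

Description type

Other

Description

In recent years, machine learning has attracted considerable attention in game programming and has succeeded in learning many kinds of parameters, including evaluation functions, search depth, and playout policies for Monte Carlo tree search. Current machine learning in game programming uses game records of human experts as training data and tunes parameters so that the program's moves approach the experts' moves. In games such as shogi, however, computers are already approaching the strength of top human players, so simply reproducing human moves does not necessarily produce a strong player. To address this problem, this paper proposes a learning method that assigns importance weights to the training data. The proposed method combines two steps: learning the importance weights by evolutionary computation with win rate as the fitness, and learning the parameters according to those weights. Applied to learning shogi evaluation functions, realization probabilities, and playout policies, the proposed method won significantly more games than the conventional method in match experiments, which demonstrates its effectiveness. The experiments also showed that the importance of training data varies with factors such as the progress of the game and the tactics used, indicating that effective use of training data makes it possible to acquire knowledge that yields a stronger program.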

Abstract (English)

Description type

Other

Description

Recently, machine learning has attracted much attention in the field of game programming and has succeeded in tuning evaluation functions, search depth, playout policies in Monte-Carlo Tree Search, and other parameters. Existing machine learning methods in game programming tune parameters by using game records of human expert players. However, in some games such as shogi, computer programs are already almost as strong as human professional players. Thus, learning by simply imitating human records does not necessarily produce strong computer players. In this paper, we propose a new learning method that estimates the importance of each training record by playing many games and tunes parameters according to that importance. The experimental results show the effectiveness of our method for learning evaluation functions, realization probability search, and playout policies. Moreover, the results show that features of training data, such as the progress of the game or the tactics used, affect their importance.
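
The two-stage idea described in the abstracts (importance estimated from game results, then parameters tuned according to that importance) can be illustrated with a toy sketch. Nothing below is the authors' implementation. The linear evaluation, the perceptron-style update in `learn_parameters`, the (1+1) evolution strategy in `learn_importance`, the simulated "match" in `win_rate`, and all constants are assumptions chosen only to keep the example small and runnable; a real system would measure win rate by playing actual games.

```python
import random

DIM = 4  # number of evaluation features (toy value, an assumption)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def pick(theta, moves):
    """Index of the move that the linear evaluation `theta` scores highest."""
    return max(range(len(moves)), key=lambda i: dot(theta, moves[i]))

def random_position(rng, n_moves=5):
    """A toy 'position': the feature vectors of its candidate moves."""
    return [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(n_moves)]

def learn_parameters(records, importance):
    """Importance-weighted parameter learning (a single perceptron-style pass).

    Each record pushes theta toward the teacher's move and away from the
    mean of the candidate moves, scaled by that record's importance.
    """
    theta = [0.0] * DIM
    for (moves, chosen), w in zip(records, importance):
        for k in range(DIM):
            mean_k = sum(m[k] for m in moves) / len(moves)
            theta[k] += w * (moves[chosen][k] - mean_k)
    return theta

def win_rate(theta, baseline, true_theta, test_positions):
    """Toy stand-in for match play: on each position both players pick a move,
    and the move with the higher hidden true value wins (a tie scores 0.5)."""
    score = 0.0
    for moves in test_positions:
        a = dot(true_theta, moves[pick(theta, moves)])
        b = dot(true_theta, moves[pick(baseline, moves)])
        score += 1.0 if a > b else 0.5 if a == b else 0.0
    return score / len(test_positions)

def learn_importance(records, true_theta, test_positions, rng, generations=200):
    """(1+1) evolution strategy over per-record importance; fitness = win rate
    against the player trained with uniform importance."""
    uniform = [1.0] * len(records)
    baseline = learn_parameters(records, uniform)
    best, best_rate = uniform, 0.5  # identical players tie on every position
    for _ in range(generations):
        # Mutate every importance value, clamping at zero.
        child = [max(0.0, w + rng.gauss(0, 0.3)) for w in best]
        rate = win_rate(learn_parameters(records, child),
                        baseline, true_theta, test_positions)
        if rate >= best_rate:  # keep the child only if it is not worse
            best, best_rate = child, rate
    return best, best_rate

def demo(seed=0):
    rng = random.Random(seed)
    true_theta = [1.0, 0.5, -0.5, 0.2]   # hidden evaluation that decides games
    style_theta = [-0.2, 1.0, 0.8, 0.0]  # a teacher habit that does not help winning
    records = []
    for i in range(40):
        moves = random_position(rng)
        teacher = true_theta if i % 2 == 0 else style_theta
        records.append((moves, pick(teacher, moves)))
    test_positions = [random_position(rng) for _ in range(200)]
    return learn_importance(records, true_theta, test_positions, rng)

if __name__ == "__main__":
    importance, rate = demo()
    print(f"win rate against the uniformly weighted baseline: {rate:.3f}")
```

Because a mutated importance vector is accepted only when its win rate is at least as high as the current best, the returned win rate can never fall below the 0.5 scored by the uniformly weighted player. In this toy setup the fitness is evaluated on a fixed set of positions, so it is deterministic; with real games the win rate is noisy and would need enough games per candidate to compare reliably.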

Bibliographic record ID

Source identifier type

NCID

Source identifier

AN00116647

Bibliographic information

情報処理学会論文誌 (IPSJ Journal)

Vol. 55, No. 11, pp. 2399-2409, issued 2014-11-15

ISSN

Source identifier type

ISSN

Source identifier

1882-7764
