Title: The Improvement of Learning Accuracy and Efficiency in Element Value Determination for Analog Circuit Design Using Q-Learning (Q-Learning を用いたアナログ回路の素子値決定における学習精度と効率改善)
Authors: 山崎 優一, 佐藤 隆英, 小川 覚美 (Faculty of Engineering, University of Yamanashi / 山梨大学工学部)
Type: Technical Report
Permalink: http://id.nii.ac.jp/1001/00207815/
Full text: https://ipsj.ixsq.nii.ac.jp/ej/?action=repository_action_common_download&item_id=207917&item_no=1&attribute_id=1&file_no=1
Copyright (c) 2020 by the Institute of Electronics, Information and Communication Engineers. This SIG report is only available to those in membership of the SIG.

Abstract: We determine the element values of an analog circuit using Q-learning and describe the problems that arise. When the reward for an action is defined as the ratio of the current value of the evaluation metric to its target value, the design result depends on the initial values chosen at design time, and the optimal design is not obtained even after a sufficient number of learning iterations; moreover, even a simple circuit requires a very long learning time. We therefore propose a method that keeps the provisional best value of the evaluation metric during learning and gives a large reward whenever the current value exceeds that provisional best. With this method, the peak Q-value of actions leading to a local optimum stays at or below the Q-value of actions leading to the global optimum, which reduces the dependence on the initial design values. In addition, because actions leading to the optimal value are selected more often, learning completes with fewer simulations. We have confirmed that, with the proposed method, the optimal design values are obtained in 99.7% of design runs.

Published in: IPSJ SIG Technical Report, Algorithms (研究報告アルゴリズム(AL)), Vol. 2020-AL-180, No. 15, pp. 1-6, 2020-11-18 (report dated 2020-11-17). ISSN 2188-8566. NCID: AN1009593X.
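The reward scheme the abstract describes can be sketched as follows. This is a minimal illustrative sketch only: the class name, the bonus magnitude, and the use of a plain Python callable are assumptions for exposition, not the authors' implementation, and the circuit simulator that would produce the evaluation values is omitted.

```python
class ProvisionalBestReward:
    """Reward function sketch for the proposed method.

    The baseline reward is the ratio of the current evaluation value to the
    target (the scheme the abstract identifies as problematic on its own).
    On top of that, a large bonus is added whenever the current value exceeds
    the provisional best value kept during learning, so that actions leading
    toward the global optimum accumulate higher Q-values than actions leading
    to local optima.
    """

    def __init__(self, target, bonus=10.0):
        self.target = target          # target value of the evaluation metric
        self.bonus = bonus            # assumed magnitude of the "large reward"
        self.best = float("-inf")     # provisional best, updated during learning

    def __call__(self, current):
        reward = current / self.target      # baseline: current-to-target ratio
        if current > self.best:             # current value exceeds provisional best
            self.best = current             # update the provisional best
            reward += self.bonus            # give the large extra reward
        return reward
```

A Q-learning agent would call this once per simulation step, e.g. `reward_fn(simulated_gain)`; the first observation always counts as a new provisional best, so the bonus fires immediately and then only on genuine improvements.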