<?xml version='1.0' encoding='UTF-8'?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2026-04-20T22:55:33Z</responseDate>
  <request verb="GetRecord" metadataPrefix="oai_dc" identifier="oai:ipsj.ixsq.nii.ac.jp:00209715">https://ipsj.ixsq.nii.ac.jp/oai</request>
  <GetRecord>
    <record>
      <header>
        <identifier>oai:ipsj.ixsq.nii.ac.jp:00209715</identifier>
        <datestamp>2025-01-19T18:25:39Z</datestamp>
        <setSpec>1164:2735:10526:10527</setSpec>
      </header>
      <metadata>
        <oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
          <dc:title>Generating Intrinsic Rewards by Random Recurrent Network Distillation</dc:title>
          <dc:creator>Zefeng, Xu</dc:creator>
          <dc:creator>Koichi, Moriyama</dc:creator>
          <dc:creator>Tohgoroh, Matsui</dc:creator>
          <dc:creator>Atsuko, Mutoh</dc:creator>
          <dc:creator>Nobuhiro, Inuzuka</dc:creator>
          <dc:description>Exploration in sparse reward environments poses significant challenges for many reinforcement learning algorithms. Rather than relying solely on extrinsic rewards provided by environments, many state-of-the-art methods generate intrinsic rewards to encourage the agent to explore the environments. However, we found that existing models fall short in environments where the agent must visit the same state more than once. Thus, we improve an existing model to propose a novel type of intrinsic exploration bonus that rewards the agent when a new sequence is discovered. The intrinsic reward is the error of a recurrent neural network predicting features of the sequences given by a fixed, randomly initialized recurrent neural network. Our approach performs well in some Atari games where conditions must be fulfilled to develop stories.</dc:description>
          <dc:description>technical report</dc:description>
          <dc:publisher>情報処理学会</dc:publisher>
          <dc:date>2021-02-22</dc:date>
          <dc:format>application/pdf</dc:format>
          <dc:identifier>研究報告数理モデル化と問題解決（MPS）</dc:identifier>
          <dc:identifier>15</dc:identifier>
          <dc:identifier>2021-MPS-132</dc:identifier>
          <dc:identifier>1</dc:identifier>
          <dc:identifier>6</dc:identifier>
          <dc:identifier>2188-8833</dc:identifier>
          <dc:identifier>AN10505667</dc:identifier>
          <dc:identifier>https://ipsj.ixsq.nii.ac.jp/record/209715/files/IPSJ-MPS21132015.pdf</dc:identifier>
          <dc:language>eng</dc:language>
        </oai_dc:dc>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>
