<?xml version='1.0' encoding='UTF-8'?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2026-03-06T00:14:21Z</responseDate>
  <request metadataPrefix="jpcoar_1.0" verb="GetRecord" identifier="oai:ipsj.ixsq.nii.ac.jp:00057154">https://ipsj.ixsq.nii.ac.jp/oai</request>
  <GetRecord>
    <record>
      <header>
        <identifier>oai:ipsj.ixsq.nii.ac.jp:00057154</identifier>
        <datestamp>2025-01-22T04:36:35Z</datestamp>
        <setSpec>1164:5159:5192:5193</setSpec>
      </header>
      <metadata>
        <jpcoar:jpcoar xmlns:datacite="https://schema.datacite.org/meta/kernel-4/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcndl="http://ndl.go.jp/dcndl/terms/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:jpcoar="https://github.com/JPCOAR/schema/blob/master/1.0/" xmlns:oaire="http://namespace.openaire.eu/schema/oaire/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rioxxterms="http://www.rioxx.net/schema/v2.0/rioxxterms/" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="https://github.com/JPCOAR/schema/blob/master/1.0/" xsi:schemaLocation="https://github.com/JPCOAR/schema/blob/master/1.0/jpcoar_scm.xsd">
          <dc:title>周波数帯域ごとの重みつき尤度を用いた雑音に頑健な音声認識</dc:title>
          <dc:title xml:lang="en">Noise-robust speech recognition using band-dependent weighted likelihood</dc:title>
          <jpcoar:creator>
            <jpcoar:creatorName>西村, 義隆</jpcoar:creatorName>
            <jpcoar:creatorName>篠崎, 隆宏</jpcoar:creatorName>
            <jpcoar:creatorName>岩野, 公司</jpcoar:creatorName>
            <jpcoar:creatorName>古井, 貞煕</jpcoar:creatorName>
          </jpcoar:creator>
          <jpcoar:creator>
            <jpcoar:creatorName xml:lang="en">Nishimura, Yoshitaka</jpcoar:creatorName>
            <jpcoar:creatorName xml:lang="en">Shinozaki, Takahiro</jpcoar:creatorName>
            <jpcoar:creatorName xml:lang="en">Iwano, Koji</jpcoar:creatorName>
            <jpcoar:creatorName xml:lang="en">Furui, Sadaoki</jpcoar:creatorName>
          </jpcoar:creator>
          <datacite:description descriptionType="Other">音声認識では，認識のための音声特徴量としてケプストラム領域の特徴量であるMFCC(Mel Frequency Cepstrum coefficient)を用いることが一般的である．ケプストラム領域は，対数スペクトルをフーリエ変換した領域であるため，スペクトルの領域においてある箇所のみに重畳していた雑音であっても，ケプストラム領域ではその雑音が広がってしまい，ケプストラムの全ての項に対して雑音の影響を与えてしまう欠点がある．このため，加法性雑音に対する領域性を考えたとき，スペクトル領域の特徴量を用いることができれば，雑音の分離がしやすく有利である．スペクトル特徴量を用いた音声認識はこれまでも試みられているが，狭帯域雑音などの特徴の条件下でしか有効性が示されていない．そこで本稿では，従来用いられていたスペクトル特徴量とMFCC特徴量の比較を行い．MFCCと同程度の認識ができる対数スペクトル特徴量を提案する．実験の結果，スペクトルピークへの重みづけを加えることにより，広帯域雑音環境下においてMFCCよりも高い認識率を確認した．</datacite:description>
          <datacite:description descriptionType="Other">In most state-of-the-art automatic speech recognition (ASR) systems, speech is converted into a time sequence of MFCC (Mel Frequency Cepstrum Coefficient) vectors. However, MFCC has the drawback that noise effects spread over all the coefficients even when the noise is confined to a narrow frequency range. If a spectral feature is used directly, this problem can be avoided, and robustness against noise can be expected to increase. Although various studies have used spectral-domain features, improved recognition performance has been reported only under limited noise conditions. This paper proposes a novel multi-band ASR method using a new log-spectral domain feature. Experimental results on speech with added bubble noise show that the proposed method improves recognition performance compared with the MFCC-based method. Performance is further improved by a spectral-peak weighting technique.</datacite:description>
          <dc:publisher xml:lang="ja">情報処理学会</dc:publisher>
          <datacite:date dateType="Issued">2003-12-18</datacite:date>
          <dc:language>jpn</dc:language>
          <dc:type rdf:resource="http://purl.org/coar/resource_type/c_18gh">technical report</dc:type>
          <jpcoar:identifier identifierType="URI">https://ipsj.ixsq.nii.ac.jp/records/57154</jpcoar:identifier>
          <jpcoar:sourceIdentifier identifierType="NCID">AN10442647</jpcoar:sourceIdentifier>
          <jpcoar:sourceTitle>情報処理学会研究報告音声言語情報処理（SLP）</jpcoar:sourceTitle>
          <jpcoar:volume>2003</jpcoar:volume>
          <jpcoar:issue>124(2003-SLP-049)</jpcoar:issue>
          <jpcoar:pageStart>19</jpcoar:pageStart>
          <jpcoar:pageEnd>24</jpcoar:pageEnd>
          <jpcoar:file>
            <jpcoar:URI>https://ipsj.ixsq.nii.ac.jp/record/57154/files/IPSJ-SLP03049004.pdf</jpcoar:URI>
            <jpcoar:mimeType>application/pdf</jpcoar:mimeType>
            <jpcoar:extent>550.0 kB</jpcoar:extent>
            <datacite:date dateType="Available">2005-12-18</datacite:date>
          </jpcoar:file>
        </jpcoar:jpcoar>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>
