@techreport{oai:ipsj.ixsq.nii.ac.jp:00209775, author = {林, 恒太朗 and 古明地, 秀治 and 三橋, 匠 and 飯村, 康司 and 鈴木, 皓晴 and 菅野, 秀宣 and 篠田, 浩一 and 田中, 聡久 and Hayashi, Kotaro and Komeiji, Shyuji and Mitsuhashi, Takumi and Iimura, Yasushi and Suzuki, Hiroharu and Sugano, Hidenori and Shinoda, Koichi and Tanaka, Toshihisa}, issue = {37}, month = {Feb}, note = {Recent advances in signal processing and machine learning have made it possible to estimate and reconstruct speech or text during speaking and listening from invasive electrocorticogram (ECoG) recordings. In contrast, estimating imagined speech has had little success, partly because it is difficult to synchronize the ECoG with the target labels. In this paper, we hypothesize that if imagined speech and ECoG are adequately synchronized, speech can be decoded from ECoG just as it can during speaking and listening. We designed an experiment in which participants were shown short sentences on a screen, and the characters were highlighted in color one at a time so that the timing and speed of imagination could be controlled. ECoG was recorded during three tasks: imagining speech, listening to speech, and speaking. We also recorded the speech presented in the listening task and the participants' speech in the speaking task. On the measured ECoG, we trained an encoder-decoder model using the Mel-cepstral coefficients of the speech from the speaking or listening task, and used it to infer the imagined speech.
The best character error rate for sentences decoded from ECoG recorded during imagined speech was about 17%.}, title = {頭蓋内脳波からのエンコーダ・デコーダモデルによる想像音声推定}, year = {2021} }