Optimization of Large Scale Neural Networks for Speech Recognition
https://ipsj.ixsq.nii.ac.jp/records/107354
Name / File | License | Action |
---|---|---|
Copyright (c) 2014 by the Information Processing Society of Japan
|
|
Open access |
Item type | SIG Technical Reports(1) | |||||||
---|---|---|---|---|---|---|---|---|
Publication date | 2014-12-08 | |||||||
Title | ||||||||
Title | Optimization of Large Scale Neural Networks for Speech Recognition | |||||||
Language | en | |||||||
Language | ||||||||
Language | eng | |||||||
Keyword | ||||||||
Subject scheme | Other | |||||||
Subject | Invited talk | |||||||
Resource type | ||||||||
Resource type identifier | http://purl.org/coar/resource_type/c_18gh | |||||||
Resource type | technical report | |||||||
Author affiliation | ||||||||
Google Inc. | ||||||||
Author affiliation (English) | ||||||||
en | ||||||||
Google Inc. | ||||||||
Author name | Michiel Bacchiani | |||||||
Author name (English) | Michiel Bacchiani | |||||||
Abstract | ||||||||
Description type | Other | |||||||
Description | Recent years have seen large-scale adoption of speech recognition by the public, in particular on mobile devices. Simultaneously, the state of the art in speech recognition has improved significantly with the adoption of deep neural networks as the key technology for acoustic modeling. This approach is particularly effective when applied at scale. This talk will first provide an overview of how the Google speech group has grown over the years. Starting from a research effort, it has developed into an integral input modality of the Android operating system and has launched in more than 50 languages worldwide. The presentation will provide some historical perspective on that growth, discussing earlier products and outlining the current mobile products and their future directions. The success and adoption of the Google speech technology is evident from the decade of speech our systems receive each day. The talk will then focus on recent advances in training algorithms and infrastructure for training large neural networks on large data sets. The talk will largely, though not exclusively, present a Google perspective on this problem. It will describe our asynchronous parallel training infrastructure and how key algorithmic improvements fit within that framework. Particular attention will be paid to optimization using a sequence objective and to recurrent network architectures (using Long Short-Term Memory models). | |||||||
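Bibliographic record ID | ||||||||
Source identifier type | NCID | |||||||
Source identifier | AN10442647 | |||||||
Bibliographic information |
IPSJ SIG Technical Reports on Spoken Language Processing (SLP; 研究報告音声言語情報処理), vol. 2014-SLP-104, no. 7, p. 1-1, issued 2014-12-08 |
|||||||
Notice | ||||||||
SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc. | ||||||||
Publisher | ||||||||
Language | ja | |||||||
Publisher | Information Processing Society of Japan (情報処理学会) |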