WEKO3
アイテム
A Multi-Label Convolutional Neural Network for Automatic Image Annotation
https://ipsj.ixsq.nii.ac.jp/records/145553
https://ipsj.ixsq.nii.ac.jp/records/1455535894e253-ed86-4aa6-b81e-2d6529225ab6
| 名前 / ファイル | ライセンス | アクション |
|---|---|---|
|
|
Copyright (c) 2015 by the Information Processing Society of Japan
|
|
| オープンアクセス | ||
| Item type | Journal(1) | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| 公開日 | 2015-10-15 | |||||||||
| タイトル | ||||||||||
| タイトル | A Multi-Label Convolutional Neural Network for Automatic Image Annotation | |||||||||
| タイトル | ||||||||||
| 言語 | en | |||||||||
| タイトル | A Multi-Label Convolutional Neural Network for Automatic Image Annotation | |||||||||
| 言語 | ||||||||||
| 言語 | eng | |||||||||
| キーワード | ||||||||||
| 主題Scheme | Other | |||||||||
| 主題 | [特集:E-Service and Knowledge Management toward Smart Computing Society] convolutional neural networks, multi-label classification, animation images | |||||||||
| 資源タイプ | ||||||||||
| 資源タイプ識別子 | http://purl.org/coar/resource_type/c_6501 | |||||||||
| 資源タイプ | journal article | |||||||||
| 著者所属 | ||||||||||
| Graduate School of Design, Kyushu University | ||||||||||
| 著者所属 | ||||||||||
| Faculty of Design, Kyushu University | ||||||||||
| 著者所属(英) | ||||||||||
| en | ||||||||||
| Graduate School of Design, Kyushu University | ||||||||||
| 著者所属(英) | ||||||||||
| en | ||||||||||
| Faculty of Design, Kyushu University | ||||||||||
| 著者名 |
Alexis, Vallet
× Alexis, Vallet
× Hiroyasu, Sakamoto
|
|||||||||
| 著者名(英) |
Alexis, Vallet
× Alexis, Vallet
× Hiroyasu, Sakamoto
|
|||||||||
| 論文抄録 | ||||||||||
| 内容記述タイプ | Other | |||||||||
| 内容記述 | Over the past few years, convolutional neural networks (CNN) have set the state of the art in a wide variety of supervised computer vision problems. Most research effort has focused on single-label classification, due to the availability of the large scale ImageNet dataset. Via pre-training on this dataset, CNNs have also shown the ability to outperform traditional methods for multi-label classification. Such methods, however, typically require evaluating many expensive forward passes to produce a multi-label distribution. Furthermore, due to the lack of a large scale multi-label dataset, little effort has been invested into training CNNs from scratch with multi-label data. In this paper, we address both issues by introducing a multi-label cost function adequate for deep CNNs, and a prediction method requiring only a single forward pass to produce multi-label predictions. We show the performance of our method on a newly introduced large scale multi-label dataset of animation images. Here, our method reaches 75.1% precision and 66.5% accuracy, making it suitable for automated annotation in practice. Additionally, we apply our method to the Pascal VOC 2007 dataset of natural images, and show that our prediction method outperforms a comparable model for a fraction of the computational cost. \n------------------------------ This is a preprint of an article intended for publication Journal of Information Processing(JIP). This preprint should not be cited. This article should be cited as: Journal of Information Processing Vol.23(2015) No.6 (online) ------------------------------ |
|||||||||
| 論文抄録(英) | ||||||||||
| 内容記述タイプ | Other | |||||||||
| 内容記述 | Over the past few years, convolutional neural networks (CNN) have set the state of the art in a wide variety of supervised computer vision problems. Most research effort has focused on single-label classification, due to the availability of the large scale ImageNet dataset. Via pre-training on this dataset, CNNs have also shown the ability to outperform traditional methods for multi-label classification. Such methods, however, typically require evaluating many expensive forward passes to produce a multi-label distribution. Furthermore, due to the lack of a large scale multi-label dataset, little effort has been invested into training CNNs from scratch with multi-label data. In this paper, we address both issues by introducing a multi-label cost function adequate for deep CNNs, and a prediction method requiring only a single forward pass to produce multi-label predictions. We show the performance of our method on a newly introduced large scale multi-label dataset of animation images. Here, our method reaches 75.1% precision and 66.5% accuracy, making it suitable for automated annotation in practice. Additionally, we apply our method to the Pascal VOC 2007 dataset of natural images, and show that our prediction method outperforms a comparable model for a fraction of the computational cost. \n------------------------------ This is a preprint of an article intended for publication Journal of Information Processing(JIP). This preprint should not be cited. This article should be cited as: Journal of Information Processing Vol.23(2015) No.6 (online) ------------------------------ |
|||||||||
| 書誌レコードID | ||||||||||
| 収録物識別子タイプ | NCID | |||||||||
| 収録物識別子 | AN00116647 | |||||||||
| 書誌情報 |
情報処理学会論文誌 巻 56, 号 10, 発行日 2015-10-15 |
|||||||||
| ISSN | ||||||||||
| 収録物識別子タイプ | ISSN | |||||||||
| 収録物識別子 | 1882-7764 | |||||||||