A Multi-Label Convolutional Neural Network for Automatic Image Annotation
https://ipsj.ixsq.nii.ac.jp/records/145553
Copyright (c) 2015 by the Information Processing Society of Japan
Open access
Item type | Journal(1) |
---|---|
Publication date | 2015-10-15 |
Title | A Multi-Label Convolutional Neural Network for Automatic Image Annotation |
Title language | en |
Language | eng |
Subject scheme | Other |
Keywords | [Special Issue: E-Service and Knowledge Management toward Smart Computing Society] convolutional neural networks, multi-label classification, animation images |
Resource type | journal article |
Resource type identifier | http://purl.org/coar/resource_type/c_6501 |
Author affiliation | Graduate School of Design, Kyushu University |
Author affiliation | Faculty of Design, Kyushu University |
Authors | Alexis Vallet; Hiroyasu Sakamoto |
Abstract | Over the past few years, convolutional neural networks (CNNs) have set the state of the art in a wide variety of supervised computer vision problems. Most research effort has focused on single-label classification, due to the availability of the large-scale ImageNet dataset. Via pre-training on this dataset, CNNs have also shown the ability to outperform traditional methods for multi-label classification. Such methods, however, typically require evaluating many expensive forward passes to produce a multi-label distribution. Furthermore, due to the lack of a large-scale multi-label dataset, little effort has been invested into training CNNs from scratch with multi-label data. In this paper, we address both issues by introducing a multi-label cost function adequate for deep CNNs, and a prediction method requiring only a single forward pass to produce multi-label predictions. We show the performance of our method on a newly introduced large-scale multi-label dataset of animation images. Here, our method reaches 75.1% precision and 66.5% accuracy, making it suitable for automated annotation in practice. Additionally, we apply our method to the Pascal VOC 2007 dataset of natural images, and show that our prediction method outperforms a comparable model for a fraction of the computational cost. This is a preprint of an article intended for publication in the Journal of Information Processing (JIP). This preprint should not be cited. This article should be cited as: Journal of Information Processing Vol.23 (2015) No.6 (online). |
Description type | Other |
Bibliographic record ID (NCID) | AN00116647 |
Bibliographic information | 情報処理学会論文誌, Vol. 56, No. 10, issued 2015-10-15 |
ISSN | 1882-7764 |