http://swrc.ontoware.org/ontology#Article
An Exhaustive Search and Stability of Sparse Estimation for Feature Selection Problem
en
[Original Paper] feature selection, exhaustive search, cross validation, exchange Monte Carlo method
The University of Tokyo
Kobe University
Technische Universität Berlin
Fukushima Medical University
The University of Toyama
The University of Tokyo／JST ERATO OKANOYA EMOTIONAL
Kenji Nagata
Jun Kitazono
Shinichi Nakajima
Satoshi Eifuku
Ryoi Tamura
Masato Okada
Feature selection is widely used in various fields. Sparse estimation in particular has the advantage that its computational cost is polynomial in the number of features. However, it suffers from the problem that the obtained solution changes when the dataset is altered only slightly. The goal of this paper is to investigate this problem of sparse estimation by exhaustively searching for the solutions that minimize the generalization error in the feature selection problem. We calculate the generalization error for every combination of features by cross validation and thereby obtain a histogram of the generalization error. Using this histogram, we propose a method to verify whether the given data contain information for binary classification, by comparing it with the histogram of the predictive error for random guessing. Moreover, we propose a statistical-mechanical method that efficiently calculates the histogram of the generalization error using the exchange Monte Carlo (EMC) method and the multiple histogram method. We apply the proposed method to the feature selection problem of selecting the neurons relevant to face identification.
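The exhaustive search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid classifier, the fold assignment, the toy data, and all function names (`cv_error`, `exhaustive_search`, `make_toy_data`) are assumptions chosen only to keep the example self-contained. It evaluates the cross-validation error for every non-empty feature subset and collects the errors into a histogram. The comparison with random guessing and the EMC-based acceleration are not covered here.

```python
import itertools
import numpy as np

def cv_error(X, y, n_folds=5):
    """K-fold cross-validation error of a nearest-centroid classifier.

    The classifier is an assumed stand-in; the abstract does not name one.
    """
    n = len(y)
    folds = np.arange(n) % n_folds  # deterministic fold assignment (assumption)
    n_wrong = 0
    for k in range(n_folds):
        test = folds == k
        train = ~test
        # Class centroids estimated from the training part only.
        mu0 = X[train & (y == 0)].mean(axis=0)
        mu1 = X[train & (y == 1)].mean(axis=0)
        d0 = ((X[test] - mu0) ** 2).sum(axis=1)
        d1 = ((X[test] - mu1) ** 2).sum(axis=1)
        pred = (d1 < d0).astype(int)
        n_wrong += int((pred != y[test]).sum())
    return n_wrong / n

def exhaustive_search(X, y, n_folds=5):
    """Return {feature subset: CV error} for all non-empty subsets."""
    n_features = X.shape[1]
    results = {}
    for size in range(1, n_features + 1):
        for subset in itertools.combinations(range(n_features), size):
            results[subset] = cv_error(X[:, list(subset)], y, n_folds)
    return results

def make_toy_data(n_samples=60, n_features=6, seed=0):
    """Hypothetical data: only features 0 and 1 carry class information."""
    rng = np.random.default_rng(seed)
    y = np.arange(n_samples) % 2
    X = rng.normal(size=(n_samples, n_features))
    X[:, 0] += 3.0 * y
    X[:, 1] -= 3.0 * y
    return X, y

X, y = make_toy_data()
results = exhaustive_search(X, y)

# Histogram of the CV error over all 2^6 - 1 feature combinations.
errors = np.array(list(results.values()))
hist, edges = np.histogram(errors, bins=10, range=(0.0, 1.0))
best_subset = min(results, key=results.get)
print("best subset:", best_subset, "CV error:", results[best_subset])
print("histogram counts:", hist)
```

The number of subsets grows as 2^N - 1 in the number of features N, so this brute-force loop is practical only for small N. That cost is what motivates a sampling-based estimate of the histogram, as the abstract proposes.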
AA11464803
IPSJ Transactions on Mathematical Modeling and Its Applications (TOM)
8
2
23-30
2015-07-24
1882-7780