Given the relatively small number of microarrays typically used in gene-expression-based
classification, all of the data must be used to train a classifier, and therefore the
same training data is used for error estimation. The key issue regarding the quality
of an error estimator in the context of small samples is its accuracy, which is
most directly analyzed via the deviation distribution of the estimator, that is,
the distribution of the difference between the estimated and true errors. Past studies
indicate that, for a feature set given in advance, cross-validation does not perform as
well in this regard as some other training-data-based error estimators. The purpose
of this study is to quantify the degree to which feature selection increases the variation
of the deviation distribution beyond the variation present in the absence of feature
selection. To this end, we propose the coefficient of relative increase in deviation
dispersion (CRIDD), which gives the relative increase in the deviation-distribution
variance using feature selection as opposed to using an optimal feature set without
feature selection. The contribution of feature selection to the variance of the deviation
distribution can be significant, accounting for over half of the variance in many
of the cases studied. We consider linear discriminant analysis, 3-nearest-neighbor,
and linear support vector machines for classification; sequential forward selection,
sequential forward floating selection, and the t-test for feature selection; and
k-fold and leave-one-out cross-validation for error estimation. We apply these to three
feature-label models and to patient data from a breast cancer study. In summary, the cross-validation
deviation distribution is significantly flatter with feature selection than
when cross-validation is performed on a given feature set. This is reflected
by the observed positive values of the CRIDD, which is defined to quantify the contribution
of feature selection towards the deviation variance.
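As an illustration of the quantities described above, the following is a minimal sketch of how the deviation (estimated error minus true error) and a CRIDD-like ratio might be computed from repeated experiments. The abstract does not state the exact formula, so the normalization by the variance with feature selection is an assumption, chosen so that the value reads as the fraction of deviation variance attributable to feature selection. The function names `deviations` and `cridd` and the sample numbers are hypothetical.

```python
from statistics import variance


def deviations(estimated_errors, true_errors):
    """Per-experiment deviation: estimated error minus true error."""
    if len(estimated_errors) != len(true_errors):
        raise ValueError("error lists must have the same length")
    return [est - tru for est, tru in zip(estimated_errors, true_errors)]


def cridd(dev_with_fs, dev_without_fs):
    """Relative increase in deviation variance due to feature selection.

    ASSUMPTION: normalized by the variance observed with feature selection;
    the abstract does not give the formula explicitly.
    """
    var_fs = variance(dev_with_fs)        # sample variance, with feature selection
    var_no_fs = variance(dev_without_fs)  # sample variance, fixed feature set
    if var_fs == 0:
        raise ValueError("variance with feature selection is zero")
    return (var_fs - var_no_fs) / var_fs


# Hypothetical deviations from three repeated experiments in each setting.
dev_fixed = deviations([0.10, 0.20, 0.30], [0.20, 0.20, 0.20])
dev_selected = deviations([0.05, 0.25, 0.45], [0.25, 0.25, 0.25])
kappa = cridd(dev_selected, dev_fixed)
```

Under this assumed definition, a positive value indicates that the deviation distribution is more dispersed with feature selection, and a value above 0.5 corresponds to feature selection accounting for more than half of the deviation variance.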
Research Article