Classification of High Dimensional and Imbalanced Hyperspectral Imagery Data

García, Vicente; Sánchez, Javier Salvador; Mollineda, Ramón A.

doi:10.1007/978-3-642-21257-4_80

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6669)

Included in the following conference series:

Iberian Conference on Pattern Recognition and Image Analysis

Abstract

This paper addresses the classification of hyperspectral images with multiple imbalanced classes and very high dimensionality. Class imbalance is handled by resampling the data set, whereas principal component analysis (PCA) is applied to reduce the number of spectral bands. This preliminary study aims to investigate the benefits of using these two techniques together and to evaluate which order of application leads to the best classification performance. Experimental results demonstrate the significance of combining these preprocessing tools to improve the performance of hyperspectral imagery classification. Although the most effective order appears to be a resampling algorithm first, followed by PCA, this question still needs a much more thorough investigation.
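
The two application orders discussed in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the resampling algorithm, so plain random oversampling is used here as an assumed stand-in, PCA is computed directly from an SVD, and the data are synthetic. The function names (`random_oversample`, `resample_then_pca`, `pca_then_resample`) are hypothetical.

```python
import numpy as np

def random_oversample(X, y, rng):
    """Duplicate random samples of each class until every class matches the largest one.
    Assumed stand-in for the paper's (unspecified here) resampling algorithm."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    idx = []
    for c, n in zip(classes, counts):
        members = np.flatnonzero(y == c)
        extra = rng.choice(members, size=target - n, replace=True)
        idx.append(np.concatenate([members, extra]))
    idx = np.concatenate(idx)
    return X[idx], y[idx]

def pca_fit(X, n_components):
    """Return the mean and the top principal axes of X (rows are samples)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, axes):
    """Project samples onto the principal axes."""
    return (X - mean) @ axes.T

def resample_then_pca(X, y, n_components, rng):
    # Order A: balance the classes first, then fit PCA on the balanced data.
    Xr, yr = random_oversample(X, y, rng)
    mean, axes = pca_fit(Xr, n_components)
    return pca_transform(Xr, mean, axes), yr

def pca_then_resample(X, y, n_components, rng):
    # Order B: fit PCA on the imbalanced data, then balance the reduced data.
    mean, axes = pca_fit(X, n_components)
    Z = pca_transform(X, mean, axes)
    return random_oversample(Z, y, rng)

# Synthetic stand-in for pixel spectra: 50 "bands", 90 majority and 10 minority samples.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (90, 50)), rng.normal(1.0, 1.0, (10, 50))])
y = np.array([0] * 90 + [1] * 10)

Za, ya = resample_then_pca(X, y, 5, np.random.default_rng(1))
Zb, yb = pca_then_resample(X, y, 5, np.random.default_rng(1))
print(Za.shape, np.bincount(ya))
print(Zb.shape, np.bincount(yb))
```

Both pipelines return a balanced set with the same reduced dimensionality. They differ in which data the principal axes are estimated from: in order A the minority class contributes as many samples to the covariance estimate as the majority class, whereas in order B the axes are dominated by the majority class. This difference is one plausible reason the order of application could affect classification performance.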

Partially supported by the Spanish Ministry of Education and Science under grants CSD2007–00018, AYA2008–05965–0596–C04–04/ESP and TIN2009–14205–C04–04, and by Fundació Caixa Castelló–Bancaixa under grant P1–1B2009–04.

Author information

Authors and Affiliations

Institute of New Imaging Technologies, Department of Computer Languages and Systems, Universitat Jaume I, Av. Sos Baynat s/n, 12071, Castelló de la Plana, Spain
Vicente García, Javier Salvador Sánchez & Ramón A. Mollineda

Editor information

Editors and Affiliations

Departament de Matemàtica Aplicada i Anàlisi, Universitat de Barcelona, Facultat de Matemàtiques, Gran Via de les Corts Catalanes 585, 08007, Barcelona, Spain
Jordi Vitrià
Instituto de Sistemas e Robótica / Instituto Superior Técnico, Av. Rovisco Pais, 1, 1049-001, Lisbon, Portugal
João Miguel Sanches
Institute for Intelligent Systems and Numerical Applications in Engineering (SIANI), Edificio de Informática y Matemáticas, University of Las Palmas de Gran Canaria, Campus Universitario de Tafira, 35017, Las Palmas, Spain
Mario Hernández

Cite this paper

García, V., Sánchez, J.S., Mollineda, R.A. (2011). Classification of High Dimensional and Imbalanced Hyperspectral Imagery Data. In: Vitrià, J., Sanches, J.M., Hernández, M. (eds) Pattern Recognition and Image Analysis. IbPRIA 2011. Lecture Notes in Computer Science, vol 6669. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21257-4_80

DOI: https://doi.org/10.1007/978-3-642-21257-4_80
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21256-7
Online ISBN: 978-3-642-21257-4
eBook Packages: Computer Science, Computer Science (R0)
