Abstract
In many real-world applications, labeled data are expensive to obtain, while unlabeled data are plentiful. To reduce the labeling cost, active learning seeks the most informative data points to label. The challenge is to decide which unlabeled samples, once labeled, would improve the classifier the most. Classical optimal experimental design algorithms are based on the least-squares error over the labeled samples only and ignore the unlabeled points. In this paper, we propose a novel active learning algorithm called neighborhood preserving D-optimal design. Our algorithm is based on a neighborhood preserving regression model that simultaneously minimizes the least-squares error on the measured samples and preserves the neighborhood structure of the data space. It selects the most informative samples, namely those that minimize the variance of the estimated regression parameters. We also extend our algorithm to the nonlinear case using the kernel trick. Experimental results on terrain classification show the effectiveness of the proposed approach.
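The selection idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact objective, so the sketch assumes LLE-style reconstruction weights for the neighborhood term, a ridge term for numerical stability, and a greedy search that adds one sample at a time. The function names `reconstruction_weights` and `greedy_d_optimal` and the parameters `k`, `lam1`, and `lam2` are hypothetical. Under these assumptions, the covariance of the regression parameters is proportional to the inverse of `Z^T Z + lam1 * X^T M X + lam2 * I`, where `Z` holds the selected samples, so a D-optimal choice maximizes the determinant of that matrix.

```python
import numpy as np

def reconstruction_weights(X, k=5, reg=1e-3):
    """LLE-style weights (an assumption): row i reconstructs x_i from its k nearest neighbors."""
    n = X.shape[0]
    W = np.zeros((n, n))
    # Pairwise squared Euclidean distances; exclude each point from its own neighbor list.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)
    for i in range(n):
        nbrs = np.argsort(d2[i])[:k]
        G = X[nbrs] - X[i]                  # neighbors centered on x_i
        C = G @ G.T                         # local Gram matrix, k x k
        C += reg * (np.trace(C) + 1e-12) * np.eye(k)  # keep C well conditioned
        w = np.linalg.solve(C, np.ones(k))
        W[i, nbrs] = w / w.sum()            # weights sum to one
    return W

def greedy_d_optimal(X, m, k=5, lam1=0.1, lam2=0.01):
    """Greedily pick m rows of X that maximize det(Z^T Z + lam1 X^T M X + lam2 I)."""
    n, d = X.shape
    R = np.eye(n) - reconstruction_weights(X, k)
    M = R.T @ R                             # neighborhood preserving penalty matrix
    A_inv = np.linalg.inv(lam1 * X.T @ M @ X + lam2 * np.eye(d))
    available = np.ones(n, dtype=bool)
    selected = []
    for _ in range(m):
        # Matrix determinant lemma: det(A + x x^T) = det(A) * (1 + x^T A^{-1} x),
        # so the best next sample maximizes x^T A^{-1} x.
        gains = np.sum((X @ A_inv) * X, axis=1)
        gains[~available] = -np.inf
        best = int(np.argmax(gains))
        selected.append(best)
        available[best] = False
        # Sherman-Morrison update of A^{-1} after adding x x^T.
        x = X[best]
        Ax = A_inv @ x
        A_inv -= np.outer(Ax, Ax) / (1.0 + x @ Ax)
    return selected
```

Calling `greedy_d_optimal(X, m=10)` on an `n x d` feature matrix returns the row indices of ten samples to send for labeling. The greedy update keeps each step at the cost of a rank-one inverse update, though it does not guarantee the globally optimal subset.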
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00521-012-1155-3/MediaObjects/521_2012_1155_Fig1_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00521-012-1155-3/MediaObjects/521_2012_1155_Fig2_HTML.jpg)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00521-012-1155-3/MediaObjects/521_2012_1155_Fig3_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00521-012-1155-3/MediaObjects/521_2012_1155_Fig4_HTML.gif)
Acknowledgments
This project is supported by the NSFC of China (Grant Nos. 60632050, 60705006, 60873151, and 60973098).
Cite this article
Gu, Y., Jin, Z. Neighborhood preserving D-optimal design for active learning and its application to terrain classification. Neural Comput & Applic 23, 2085–2092 (2013). https://doi.org/10.1007/s00521-012-1155-3
DOI: https://doi.org/10.1007/s00521-012-1155-3