Learning with positive and unlabeled examples using biased twin support vector machine
Abstract
The PU classification problem ('P' stands for positive, 'U' stands for unlabeled), in which the training set consists of a collection of positive and unlabeled examples only, has recently become a research hot spot. In this paper, we design a new classification algorithm to solve the PU problem: the biased twin support vector machine (B-TWSVM). B-TWSVM constructs two nonparallel hyperplanes such that the positive examples are classified correctly while the number of unlabeled examples classified as positive is minimized. Moreover, since the unlabeled set also contains positive data, B-TWSVM allows different penalty parameters for positive and negative data. Experimental results demonstrate that our method outperforms state-of-the-art methods in most cases.
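To make the idea concrete, the following is a minimal sketch of the two-nonparallel-plane construction with class-dependent penalties, in the style of a least-squares twin SVM. It is an illustrative simplification, not the authors' exact QP formulation: the function names, the closed-form least-squares solution, and the penalty parameters `c1`/`c2` (which realize the "bias" by weighting errors on the unlabeled side differently from errors on the positive side) are assumptions for this sketch.

```python
import numpy as np

def lstwsvm_fit(P, U, c1=1.0, c2=1.0, reg=1e-6):
    """Fit two nonparallel planes, least-squares twin-SVM style (a sketch).

    P: positive examples, shape (n_p, d).
    U: unlabeled examples, shape (n_u, d), treated as (mostly) negative.
    c1 penalizes unlabeled points landing on the positive plane's side;
    c2 penalizes positive points landing on the unlabeled plane's side.
    Choosing c1 != c2 is the "biased" penalty idea from the abstract.
    Returns (z1, z2); each z is the stacked [w; b] of one plane.
    """
    G = np.hstack([P, np.ones((P.shape[0], 1))])   # augmented positives [A e]
    H = np.hstack([U, np.ones((U.shape[0], 1))])   # augmented unlabeled [B e]
    I = reg * np.eye(G.shape[1])                   # small ridge for stability
    e_p = np.ones(P.shape[0])
    e_u = np.ones(U.shape[0])
    # Plane 1: lie close to positives (G z1 ~ 0) while pushing the
    # unlabeled set to the -1 side (H z1 ~ -e_u), weighted by c1.
    z1 = -c1 * np.linalg.solve(G.T @ G + c1 * (H.T @ H) + I, H.T @ e_u)
    # Plane 2: lie close to the unlabeled set (H z2 ~ 0) while pushing
    # positives to the +1 side (G z2 ~ e_p), weighted by c2.
    z2 = c2 * np.linalg.solve(H.T @ H + c2 * (G.T @ G) + I, G.T @ e_p)
    return z1, z2

def lstwsvm_predict(X, z1, z2):
    """Assign each row of X to the class whose plane is nearer."""
    Xa = np.hstack([X, np.ones((X.shape[0], 1))])
    d1 = np.abs(Xa @ z1) / np.linalg.norm(z1[:-1])
    d2 = np.abs(Xa @ z2) / np.linalg.norm(z2[:-1])
    return np.where(d1 <= d2, 1, -1)   # 1 = positive, -1 = negative
```

Each plane comes from an unconstrained least-squares problem, so training reduces to two small linear solves; the full B-TWSVM instead solves two quadratic programs with inequality constraints, which this sketch deliberately avoids for brevity.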
Keywords
Machine learning · Classification · Support vector machine
Acknowledgments
This work has been partially supported by grants from Funding Project for Academic Human Resources Development in Institutions of Higher Learning Under the Jurisdiction of Beijing Municipality (PHR201107123), the Key Laboratory for Urban Geomatics of National Administration of Surveying, Mapping and Geoinformation Foundation (20111216N), Beijing Municipal Commission of Education, Science and Technology Development (KM201210016014), and Scientific Research Foundation of Beijing University of Civil Engineering and Architecture (00331609054).