Neural Computing and Applications

Volume 25, Issue 6, pp 1303–1311

Learning with positive and unlabeled examples using biased twin support vector machine

Original Article

Abstract

The PU classification problem (‘P’ for positive, ‘U’ for unlabeled), in which the training set consists of a collection of positive and unlabeled examples only, has recently become an active research topic. In this paper, we design a new classification algorithm for the PU problem: the biased twin support vector machine (B-TWSVM). B-TWSVM constructs two nonparallel hyperplanes such that the positive examples are classified correctly while the number of unlabeled examples classified as positive is minimized. Moreover, since the unlabeled set also contains positive data, B-TWSVM allows different penalty parameters for the positive and negative data. Experimental results demonstrate that our method outperforms state-of-the-art methods in most cases.
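
To make the construction concrete, the following is a minimal sketch of the kind of problem pair B-TWSVM solves, written in the standard TWSVM template. The symbols used here are illustrative assumptions rather than the paper's notation: P is the matrix of labeled positive examples, U the matrix of unlabeled examples (treated as nominally negative), e_p and e_u are vectors of ones, and c_1, c_2 are the class-specific penalty parameters; the exact B-TWSVM formulation is given in the paper.

\min_{w_1, b_1, \xi}\; \tfrac{1}{2}\|P w_1 + e_p b_1\|^2 + c_1\, e_u^{\top}\xi
\quad \text{s.t.} \quad -(U w_1 + e_u b_1) + \xi \ge e_u,\; \xi \ge 0,

\min_{w_2, b_2, \eta}\; \tfrac{1}{2}\|U w_2 + e_u b_2\|^2 + c_2\, e_p^{\top}\eta
\quad \text{s.t.} \quad (P w_2 + e_p b_2) + \eta \ge e_p,\; \eta \ge 0.

In this reading, the first problem keeps the positive-class hyperplane close to the positive examples while pushing unlabeled examples away from it, so that few unlabeled points are classified as positive; taking c_2 larger than c_1 would express the bias that misclassifying a labeled positive example is costlier than misclassifying an unlabeled one, since the unlabeled set may itself hide positives. A new point is then assigned to the class whose hyperplane it lies closer to.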

Keywords

Machine learning · Classification · Support vector machine

Acknowledgments

This work has been partially supported by grants from the Funding Project for Academic Human Resources Development in Institutions of Higher Learning Under the Jurisdiction of Beijing Municipality (PHR201107123), the Key Laboratory for Urban Geomatics of National Administration of Surveying, Mapping and Geoinformation Foundation (20111216N), Beijing Municipal Commission of Education, Science and Technology Development (KM201210016014), and the Scientific Research Foundation of Beijing University of Civil Engineering and Architecture (00331609054).

Copyright information

© Springer-Verlag London 2014

Authors and Affiliations

  1. School of Science, Beijing University of Civil Engineering and Architecture, Beijing, China
  2. Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing, China
  3. Key Laboratory for Urban Geomatics of National Administration of Surveying, Mapping and Geoinformation, Beijing University of Civil Engineering and Architecture, Beijing, China