Abstract
We examine mathematical models for semi-supervised support vector machines (S3VM). Given a training set of labeled data and a working set of unlabeled data, S3VM constructs a support vector machine using both the training and working sets. We use S3VM to solve the transductive inference problem posed by Vapnik: the task is to estimate the value of a classification function at the given points of the working set. This contrasts with inductive inference, which estimates the classification function at all possible points. We propose a general S3VM model that minimizes both the misclassification error and the function capacity based on all the available data. Depending on how poorly estimated unlabeled data are penalized, different mathematical models result. We examine several practical algorithms for solving these models. The first approach converts the S3VM model for 1-norm linear support vector machines into a mixed-integer program (MIP), which is solved to global optimality with a commercial integer programming solver. The second approach formulates the problem as a nonconvex quadratic program and finds local solutions with variations of block-coordinate-descent algorithms. Embedding the MIP within a local learning algorithm produced the best results. Our experimental study of these statistical learning methods indicates that incorporating working data can improve generalization.
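For concreteness, the mixed-integer formulation mentioned above can be sketched as follows, along the lines of the Bennett and Demiriz reference below. The notation is ours rather than a verbatim statement of the chapter's model: labeled points $(x_i, y_i)$, $i = 1, \dots, \ell$, unlabeled working points $x_j$, $j = \ell+1, \dots, \ell+k$, a sufficiently large constant $M$, a misclassification penalty $C > 0$, and a binary variable $d_j$ that assigns each working point to a class:

\[
\begin{aligned}
\min_{w,\, b,\, \xi,\, z,\, t,\, d} \quad
  & C \Big[ \sum_{i=1}^{\ell} \xi_i
      + \sum_{j=\ell+1}^{\ell+k} (z_j + t_j) \Big] + \|w\|_1 \\
\text{subject to} \quad
  & y_i \, (w \cdot x_i - b) + \xi_i \ge 1,
    \qquad \xi_i \ge 0, \qquad i = 1, \dots, \ell, \\
  & w \cdot x_j - b + z_j + M(1 - d_j) \ge 1,
    \qquad z_j \ge 0, \\
  & -(w \cdot x_j - b) + t_j + M d_j \ge 1,
    \qquad t_j \ge 0, \\
  & d_j \in \{0, 1\}, \qquad j = \ell+1, \dots, \ell+k.
\end{aligned}
\]

When $d_j = 1$, the $M(1 - d_j)$ term vanishes and $x_j$ incurs only the class $+1$ slack $z_j$, while the class $-1$ constraint is relaxed; when $d_j = 0$ the roles reverse. Each unlabeled point is thus charged for the cheaper of its two possible labelings, and minimizing $\|w\|_1$ controls capacity, so the MIP simultaneously penalizes poorly estimated working data and limits function capacity as described above.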
References
C. G. Atkeson, A. W. Moore, and S. Schaal. Locally weighted learning. Artificial Intelligence Review, 11:11–73, 1997.
K. P. Bennett. Global tree optimization: a non-greedy decision tree algorithm. Computing Science and Statistics, 26:156–160, 1994.
K. P. Bennett. Combining support vector and mathematical programming methods for classification. In B. Schölkopf, C. Burges, and A. Smola, editors, Advances in Kernel Methods: Support Vector Learning, pages 307–326, Cambridge, MA, 1999. MIT Press.
K. P. Bennett and E. J. Bredensteiner. Geometry in learning. Web manuscript, Rensselaer Polytechnic Institute, http://www.rpi.edu/~bennek/geometry2.ps, 1996. Accepted for publication in Geometry at Work, C. Gorini et al., editors, MAA Press.
K. P. Bennett and A. Demiriz. Semi-supervised support vector machines. In M. Kearns, S. Solla, and D. Cohn, editors, Advances in Neural Information Processing Systems 11, pages 368–374, Cambridge, MA, 1999. MIT Press.
K. P. Bennett and O. L. Mangasarian. Robust linear programming discrimination of two linearly inseparable sets. Optimization Methods and Software, 1:23–34, 1992.
K. P. Bennett and O. L. Mangasarian. Bilinear separation in n-space. Computational Optimization and Applications, 4(4):207–227, 1993.
D. P. Bertsekas. Nonlinear Programming. Athena Scientific, Belmont, MA, 1996.
J. Blue. A hybrid of tabu search and local descent algorithms with applications in artificial intelligence. PhD thesis, Rensselaer Polytechnic Institute, Troy, NY, 1998.
A. Blum and T. Mitchell. Combining labeled and unlabeled data with co-training. In Proceedings of the 1998 Conference on Computational Learning Theory, Madison, WI, 1998. ACM.
E. J. Bredensteiner and K. P. Bennett. Feature minimization within decision trees. Computational Optimization and Applications, 10:110–126, 1997.
V. Castelli and T. M. Cover. On the exponential value of labeled samples. Pattern Recognition Letters, 16:105–111, 1995.
Z. Cataltepe and M. Magdon-Ismail. Incorporating test inputs into learning. In Advances in Neural Information Processing Systems 10, Cambridge, MA, 1997. MIT Press.
C. Cortes and V. N. Vapnik. Support-vector networks. Machine Learning, 20:273–297, 1995.
CPLEX Optimization Incorporated, Incline Village, Nevada. Using the CPLEX Callable Library, 1994.
R. Fourer, D. Gay, and B. Kernighan. AMPL: A Modeling Language for Mathematical Programming. Boyd & Fraser, Danvers, MA, 1993.
T. Hastie and R. Tibshirani. Discriminant adaptive nearest neighbor classification. IEEE PAMI, 18:607–616, 1996.
T. Joachims. Text categorization with support vector machines: Learning with many relevant features. In European Conference on Machine Learning (ECML), 1998.
T. Joachims. Transductive inference for text classification using support vector machines. In International Conference on Machine Learning, 1999.
S. Lawrence, A. C. Tsoi, and A. D. Back. Function approximation with neural networks and local methods: Bias, variance and smoothness. In Peter Bartlett, Anthony Burkitt, and Robert Williamson, editors, Australian Conference on Neural Networks, ACNN 96, pages 16–21. Australian National University, 1996.
O. L. Mangasarian. Arbitrary norm separating plane. Operations Research Letters, 24(1–2), 1999.
O. L. Mangasarian. Generalized support vector machines. In A. Smola, P. Bartlett, B. Schölkopf, and D. Schuurmans, editors, Advances in Large Margin Classifiers, pages 135–146, Cambridge, MA, 2000. MIT Press. ftp://ftp.cs.wisc.edu/math-prog/tech-reports/98-14.ps.
A. McCallum and K. Nigam. Employing EM and pool-based active learning for text classification. In Proceedings of the 15th International Conference on Machine Learning (ICML-98), 1998.
P.M. Murphy and D.W. Aha. UCI repository of machine learning databases. Department of Information and Computer Science, University of California, Irvine, California, 1992.
D. R. Musser and A. Saini. STL Tutorial and Reference Guide: C++ Programming with the Standard Template Library. Addison-Wesley, 1996.
K. Nigam, A. McCallum, S. Thrun, and T. Mitchell. Learning to classify text from labeled and unlabeled documents. In Proceedings of the 15th National Conference on Artificial Intelligence (AAAI-98), 1998.
S. Odewahn, E. Stockwell, R. Pennington, R. Humphreys, and W. Zumach. Automated star/galaxy discrimination with neural networks. Astronomical Journal, 103(1):318–331, 1992.
V. N. Vapnik. Estimation of Dependences Based on Empirical Data. Springer, New York, 1982. English translation; Russian version 1979.
V. N. Vapnik. The Nature of Statistical Learning Theory. Springer-Verlag, New York, 1995.
V. N. Vapnik. Statistical Learning Theory. Wiley-Interscience, New York, 1998.
V. N. Vapnik and A. Ja. Chervonenkis. Theory of Pattern Recognition. Nauka, Moscow, 1974. In Russian.
Copyright information
© 2001 Springer Science+Business Media Dordrecht
Cite this chapter
Demiriz, A., Bennett, K.P. (2001). Optimization Approaches to Semi-Supervised Learning. In: Ferris, M.C., Mangasarian, O.L., Pang, JS. (eds) Complementarity: Applications, Algorithms and Extensions. Applied Optimization, vol 50. Springer, Boston, MA. https://doi.org/10.1007/978-1-4757-3279-5_6
Print ISBN: 978-1-4419-4847-2
Online ISBN: 978-1-4757-3279-5