Mathematical Programming in Machine Learning

Abstract

We describe in this work a number of central problems of machine learning and show how they can be modeled and solved as mathematical programs of various complexity.

Copyright information

© 1996 Springer Science+Business Media New York

About this chapter

Cite this chapter

Mangasarian, O.L. (1996). Mathematical Programming in Machine Learning. In: Di Pillo, G., Giannessi, F. (eds) Nonlinear Optimization and Applications. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-0289-4_20

  • DOI: https://doi.org/10.1007/978-1-4899-0289-4_20

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4899-0291-7

  • Online ISBN: 978-1-4899-0289-4

  • eBook Packages: Springer Book Archive
