Annals of Operations Research, Volume 188, Issue 1, pp 215–249

A new column generation algorithm for Logical Analysis of Data

Article

Abstract

We present a new column generation algorithm for determining a classifier in the two-class LAD (Logical Analysis of Data) model. Unlike existing algorithms, which seek a classifier that simultaneously maximizes the margin of correctly classified observations and minimizes the violations of incorrectly classified observations, we fix the margin at a difficult-to-achieve target and minimize a convex piecewise linear function of the violations of incorrectly classified observations. Moreover, a part of the training set, called the control set, is reserved to select, among all feasible classifiers found by the algorithm, the one with the highest performance on that set. One advantage of the proposed algorithm is that it requires essentially no calibration. Computational results are presented that show the effectiveness of this approach.
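To make the two ingredients named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a classifier is a vector of signed pattern weights, that the violation of an observation is the shortfall of its signed score below a fixed target margin, and that the penalty is a maximum of affine pieces (one simple way to obtain a convex piecewise linear function). The slopes and intercepts, the function names, and the use of accuracy as the control-set performance measure are all hypothetical choices made for illustration; the column generation step that produces the candidate classifiers is not shown.

```python
from typing import List, Sequence, Tuple


def discriminant(weights: Sequence[float], coverage_row: Sequence[int]) -> float:
    """Weighted sum of the patterns covering one observation (0/1 entries)."""
    return sum(w * c for w, c in zip(weights, coverage_row))


def violation(score: float, label: int, target_margin: float) -> float:
    """Shortfall of the signed score below the fixed target margin."""
    return max(0.0, target_margin - label * score)


def penalty(v: float,
            pieces: Tuple[Tuple[float, float], ...] = ((1.0, 0.0), (3.0, -2.0))) -> float:
    """Convex piecewise linear penalty: the max of affine pieces slope*v + intercept.

    The pieces used here are hypothetical, not taken from the article.
    """
    return max(slope * v + intercept for slope, intercept in pieces)


def total_penalty(weights: Sequence[float], coverage: List[List[int]],
                  labels: Sequence[int], target_margin: float) -> float:
    """Sum of penalized margin violations over a set of observations."""
    return sum(
        penalty(violation(discriminant(weights, row), y, target_margin))
        for row, y in zip(coverage, labels)
    )


def accuracy(weights: Sequence[float], coverage: List[List[int]],
             labels: Sequence[int]) -> float:
    """Fraction of observations whose score has the sign of their label."""
    correct = sum(
        1 for row, y in zip(coverage, labels) if y * discriminant(weights, row) > 0
    )
    return correct / len(labels)


def select_on_control_set(candidates: List[List[float]],
                          control_coverage: List[List[int]],
                          control_labels: Sequence[int]) -> List[float]:
    """Return the candidate classifier with the highest control-set accuracy."""
    return max(candidates,
               key=lambda w: accuracy(w, control_coverage, control_labels))


# Toy data: two patterns (one positive, one negative), three control observations.
control_coverage = [[1, 0], [0, 1], [1, 1]]
control_labels = [1, -1, 1]
candidate_a = [1.0, -1.0]
candidate_b = [1.0, -0.5]
best = select_on_control_set([candidate_a, candidate_b],
                             control_coverage, control_labels)
```

In this toy example both candidates incur the same total penalty at a target margin of 1, yet they differ on the held-out observations, which is the kind of situation where a reserved control set can break the tie.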

Keywords

Logical analysis of data; Column generation; Classification



Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. GERAD & Méthodes Quantitatives de Gestion, HEC Montréal, Montreal, Canada
  2. GERAD, HEC Montréal, Montreal, Canada
