Abstract
Most classification methods assume that samples are drawn independently and identically from an unknown data-generating distribution, yet this assumption is violated in several real-life problems. To relax this assumption, we consider the case where batches or groups of samples may have internal correlations, whereas samples from different batches may be considered uncorrelated. Two algorithms are developed to classify all the samples in a batch jointly: one based on a probabilistic analysis and the other on a mathematical programming approach. Experiments on three real-life computer-aided diagnosis (CAD) problems demonstrate that the proposed algorithms are significantly more accurate than a naive SVM that ignores the correlations among the samples.
Keywords
- Support Vector Machine
- Pulmonary Embolism
- Training Point
- Multiple Instance Learning
- Mathematical Programming Approach
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Vural, V., Fung, G., Krishnapuram, B., Dy, J., Rao, B. (2006). Batch Classification with Applications in Computer Aided Diagnosis. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds) Machine Learning: ECML 2006. ECML 2006. Lecture Notes in Computer Science, vol 4212. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11871842_43
DOI: https://doi.org/10.1007/11871842_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45375-8
Online ISBN: 978-3-540-46056-5
eBook Packages: Computer Science, Computer Science (R0)