Abstract
The identification of breast cancer patients for whom chemotherapy could prolong survival time is treated here as a data mining problem. The identification is achieved by clustering 253 breast cancer patients into three prognostic groups: Good, Poor and Intermediate. Each of the three groups has a significantly distinct Kaplan-Meier survival curve. Of particular significance is the Intermediate group, because patients in this group who received chemotherapy do better than those in the same group who did not. This is the reverse of the pattern in the overall population of 253 patients, in which patients undergoing chemotherapy have worse survival than those who do not. We also prescribe a procedure that utilizes three nonlinear smooth support vector machines (SSVMs) for classifying breast cancer patients into the three prognostic groups above. Based on our survival curve analysis, these results suggest that patients in the Good group should not receive chemotherapy, while those in the Intermediate group should. To our knowledge, this is the first instance of a classifiable group of breast cancer patients for which chemotherapy can possibly enhance survival.
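The group comparison in the abstract rests on Kaplan-Meier survival curves. As a rough illustration of how such a curve is computed, the following is a minimal sketch of the standard product-limit estimator. The function name `kaplan_meier` and the toy follow-up data are hypothetical and are not taken from the paper or its dataset.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up time for each patient
    events -- 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)   # patients leaving the risk set at each time
    at_risk = len(times)       # patients still under observation
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk patients surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients drop out of the risk set.
        at_risk -= leaving[t]
    return curve


# Hypothetical toy data: follow-up in months, 0 marks a censored patient.
toy_times = [6, 12, 12, 20, 30, 45]
toy_events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(toy_times, toy_events)
for t, s in curve:
    print(f"t={t:>2}  S(t)={s:.3f}")
```

In an analysis like the one described, one such curve would be computed per prognostic group (and per treatment arm within a group), and the curves would then be compared with a significance test, which this sketch does not include.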
Cite this article
Lee, YJ., Mangasarian, O. & Wolberg, W. Survival-Time Classification of Breast Cancer Patients. Computational Optimization and Applications 25, 151–166 (2003). https://doi.org/10.1023/A:1022953004360
DOI: https://doi.org/10.1023/A:1022953004360