Clustering Analysis for Semi-supervised Learning Improves Classification Performance of Digital Pathology
Purpose: Completely labeled datasets of pathology slides are often difficult and time-consuming to obtain. Semi-supervised learning methods can learn reliable models from a small number of labeled instances and large quantities of unlabeled data. In this paper, we explored the potential of clustering analysis for a semi-supervised support vector machine (SVM) classifier. Method: A clustering analysis method was proposed that finds regions of high density before the decision boundary is found with a supervised SVM, and it was compared with a state-of-the-art semi-supervised technique. Different percentages of labeled instances were used to train supervised and semi-supervised SVM learners on an image dataset generated from 50 whole-mount images (8 patients) of breast specimens. Their cross-validated classification performances were compared using the area under the ROC curve (AUC). Result: The proposed clustering analysis for semi-supervised learning produced a reliable classification model from small amounts of labeled data. Compared with a well-known implementation of semi-supervised SVM, our method ran much faster and produced better results.
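The abstract does not spell out the clustering procedure, so the following is only a minimal sketch of the general cluster-then-label idea it describes, not the authors' implementation. It assumes a simplified DBSCAN-style density clustering (the radius `eps`, the threshold `min_pts`, and all function names are hypothetical), spreads the few known labels over each dense region by majority vote, and leaves points in sparse regions unlabeled. The resulting pseudo-labeled set could then be passed to any supervised SVM, which is omitted here. The `auc` helper illustrates the area under the ROC curve as the fraction of positive-negative pairs that are ranked correctly.

```python
from collections import deque
from math import dist


def density_clusters(points, eps, min_pts):
    """Simplified DBSCAN-style pass: returns one cluster id per point, -1 for sparse points."""
    n = len(points)
    # Neighborhood of each point, including the point itself.
    neighbors = [[j for j in range(n) if dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    core = [len(nb) >= min_pts for nb in neighbors]
    cluster = [-1] * n
    next_id = 0
    for seed in range(n):
        if not core[seed] or cluster[seed] != -1:
            continue
        cluster[seed] = next_id
        queue = deque([seed])
        while queue:
            i = queue.popleft()
            if not core[i]:
                continue  # border points join a cluster but do not extend it
            for j in neighbors[i]:
                if cluster[j] == -1:
                    cluster[j] = next_id
                    queue.append(j)
        next_id += 1
    return cluster


def propagate_labels(cluster, labels):
    """Give each unlabeled point (None) the majority label of its cluster, if any."""
    votes = {}
    for c, y in zip(cluster, labels):
        if c != -1 and y is not None:
            votes.setdefault(c, []).append(y)
    majority = {c: max(set(ys), key=ys.count) for c, ys in votes.items()}
    return [y if y is not None else majority.get(c)
            for c, y in zip(cluster, labels)]


def auc(scores, truths):
    """AUC as the probability that a positive scores above a negative (ties count half)."""
    pos = [s for s, t in zip(scores, truths) if t == 1]
    neg = [s for s, t in zip(scores, truths) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up for illustration): two dense groups, one labeled point each,
# plus one isolated point that stays unlabeled.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (2.0, 2.0), (2.1, 2.0), (2.0, 2.1), (2.1, 2.1),
          (5.0, 5.0)]
labels = [0, None, None, None, 1, None, None, None, None]

cluster = density_clusters(points, eps=0.2, min_pts=3)
pseudo = propagate_labels(cluster, labels)
print(cluster)
print(pseudo)
```

In this sketch the isolated point keeps the label `None`, so a downstream supervised learner would train only on points that lie in dense regions.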
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License (http://creativecommons.org/licenses/by-nc/2.5/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.