Clustering Analysis for Semi-supervised Learning Improves Classification Performance of Digital Pathology

  • Mohammad Peikari
  • Judit Zubovits
  • Gina Clarke
  • Anne L. Martel
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9352)


Purpose: Completely labeled datasets of pathology slides are often difficult and time-consuming to obtain. Semi-supervised learning methods can learn reliable models from a small number of labeled instances and large quantities of unlabeled data. In this paper, we explored the potential of clustering analysis for semi-supervised support vector machine (SVM) classifiers.

Method: A clustering analysis method was proposed to find regions of high density prior to finding the decision boundary using a supervised SVM, and it was compared with another state-of-the-art semi-supervised technique. Different percentages of labeled instances were used to train supervised and semi-supervised SVM learners on an image dataset generated from 50 whole-mount images (8 patients) of breast specimens. Their cross-validated classification performances were compared using the area under the ROC curve.

Result: Our proposed clustering analysis for semi-supervised learning was able to produce a reliable classification model from small amounts of labeled data. Compared with a well-known implementation of semi-supervised SVM, the proposed method ran much faster and produced better results.
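As a minimal illustration of the comparison measure named above, the area under the ROC curve can be computed from classifier scores as the probability that a randomly chosen positive instance receives a higher score than a randomly chosen negative one. The sketch below is not the authors' code: the function name `roc_auc`, the 0/1 label encoding, and the toy scores are assumptions made for illustration only.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney statistic.

    labels: iterable of 0/1 class labels (1 = positive); hypothetical encoding.
    scores: iterable of classifier scores, higher meaning "more positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative instance")

    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up scores, not data from the paper).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

An AUC of 1.0 means every positive instance outranks every negative one, while 0.5 corresponds to chance-level ranking. The pairwise loop is quadratic in the number of instances, so a rank-based implementation would be preferable for large datasets.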





Copyright information

© Springer International Publishing Switzerland 2015

Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License, which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Authors and Affiliations

  • Mohammad Peikari (1), email author
  • Judit Zubovits (2)
  • Gina Clarke (3)
  • Anne L. Martel (1, 3)

  1. Medical Biophysics, University of Toronto, Toronto, Canada
  2. Faculty of Medicine, University of Toronto, Toronto, Canada
  3. Physical Sciences, Sunnybrook Research Institute, Toronto, Canada
