Multimodal Deep Learning for Cervical Dysplasia Diagnosis

  • Tao Xu
  • Han Zhang
  • Xiaolei Huang (email author)
  • Shaoting Zhang
  • Dimitris N. Metaxas
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9901)


To improve the diagnostic accuracy of cervical dysplasia, it is important to fuse the multimodal information collected during a patient's screening visit. However, current multimodal frameworks suffer from low sensitivity at high specificity levels because they are limited in learning correlations among highly heterogeneous modalities. In this paper, we design a deep learning framework for cervical dysplasia diagnosis that leverages multimodal information. We first employ a convolutional neural network (CNN) to convert the low-level image data into a feature vector that can be fused with the non-image modalities. We then jointly learn the non-linear correlations among all modalities in a deep neural network. Our multimodal framework is an end-to-end deep network that learns better complementary features from the image and non-image modalities. It automatically gives the final diagnosis for cervical dysplasia with 87.83% sensitivity at 90% specificity on a large dataset, significantly outperforming both methods that use any single source of information alone and previous multimodal frameworks.
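The headline figure above is a sensitivity reported at a fixed specificity. The abstract does not say how that operating point was computed, so the following is only a minimal sketch of one common convention: sweep the decision threshold over the classifier scores and keep the highest sensitivity among thresholds whose specificity meets the target. The function name `sensitivity_at_specificity` and the scores and labels are hypothetical illustrations, not data or code from the paper.

```python
def sensitivity_at_specificity(scores, labels, target_specificity=0.90):
    """Best sensitivity among thresholds with specificity >= target.

    scores: classifier outputs, higher means more likely positive.
    labels: 1 for a positive (dysplasia) case, 0 for a negative case.
    Illustrative helper only; not the authors' evaluation code.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    best = 0.0
    # Candidate thresholds: every distinct score, plus one above all scores
    # (which predicts every case negative). Predict positive when score >= t.
    thresholds = sorted(set(scores)) + [float("inf")]
    for t in thresholds:
        true_negatives = sum(1 for s in negatives if s < t)
        specificity = true_negatives / len(negatives)
        if specificity >= target_specificity:
            true_positives = sum(1 for s in positives if s >= t)
            best = max(best, true_positives / len(positives))
    return best


# Hypothetical toy data: 10 negative cases followed by 5 positive cases.
scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.80,
          0.35, 0.60, 0.70, 0.85, 0.90]
labels = [0] * 10 + [1] * 5

result = sensitivity_at_specificity(scores, labels, target_specificity=0.90)
print(result)  # 0.8
```

In this toy example, a threshold of 0.60 leaves 9 of the 10 negatives below it (90% specificity) and 4 of the 5 positives at or above it, giving a sensitivity of 0.8. Reporting sensitivity at a fixed specificity in this way lets classifiers be compared at the same false-positive rate.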


Keywords: Cervical Intraepithelial Neoplasia; Convolutional Neural Network; Deep Neural Network; Late Fusion; Cervical Dysplasia



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Tao Xu (1)
  • Han Zhang (2)
  • Xiaolei Huang (1, email author)
  • Shaoting Zhang (3)
  • Dimitris N. Metaxas (2)

  1. Computer Science and Engineering Department, Lehigh University, Bethlehem, USA
  2. Department of Computer Science, Rutgers University, Piscataway, USA
  3. Department of Computer Science, UNC Charlotte, Charlotte, USA
