Diabetes-Related Topic Detection in Chinese Health Websites Using Deep Learning

  • Xinhuan Chen
  • Yong Zhang
  • Chunxiao Xing
  • Xiao Liu
  • Hsinchun Chen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8549)


With 98.4 million people diagnosed with diabetes in China, most Chinese health websites provide diabetes-related news and articles in a dedicated diabetes subsection for patients. However, most of these articles are uncategorized and lack a clear topic or theme, which makes information seeking time-consuming. To address this issue, we propose a deep learning approach to detect the topics of diabetes-related articles from health websites. Our research framework for topic detection on Chinese diabetes-related articles is the first to incorporate deep learning into topic detection for Chinese text. It identifies the topics of diabetes articles with high performance and can potentially assist health information seeking. To evaluate the framework, we conducted an experiment on a test bed of 12,000 articles. The results show that the framework achieved an accuracy of 70% in detecting topics and significantly outperformed the SVM-based approach.


Keywords: classification, topic detection, diabetes, Chinese, deep learning





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Xinhuan Chen (1)
  • Yong Zhang (1)
  • Chunxiao Xing (1)
  • Xiao Liu (2)
  • Hsinchun Chen (2)

  1. Research Institute of Information Technology, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, Beijing, China
  2. MIS Department, University of Arizona, United States
