Spatial Distribution Based Provisional Disease Diagnosis in Remote Healthcare

  • Indrani Bhattacharya
  • Jaya Sil
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10597)


Patients in rural India are unable to enquire about their health using appropriate disease-related keywords submitted as a query. Lack of domain knowledge prevents the patients from refining the query using the well-known feedback mechanism. Moreover, due to the scarcity of doctors in rural India, the health assistants who run the health centers do not have enough knowledge to treat the patients based on the imprecise query. In this paper, we propose an autonomous provisional disease diagnosis system that classifies the query after it has been expanded using the semantics of the domain knowledge. First, we apply the spatial distribution based nearest neighbor spacing distribution (NNSD) to the disease-related medical document corpus (MDC) to find the relevant terms, mostly symptoms, with respect to different diseases. We frame a symptom vocabulary (SV) from the unique terms present in different diseases, known a priori. Each query is expanded into a bag of symptoms (BoS) using a 5-gram collocation model, with the log likelihood ratio (LLR) measuring the association between the query and the terms in the MDC. The terms in the BoS may not exactly match the symptoms in the SV but have contextual similarity. We propose a novel approach to determine which symptoms in the SV are nearest in context to the corresponding terms in the BoS. The feature vector, obtained by encoding the SV with respect to (w.r.t.) each BoS, is sparse in nature. We apply a sparse representation based classifier (SRC) to classify the query into a particular disease. The proposed nearest neighbor spacing distribution based sparse representation classifier (NNSD-SRC) shows promising performance on the MDC dataset, and we validate the results with doctors, showing negligible error.
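The intuition behind spacing-based term relevance can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the exact NNSD formulation, so the sketch assumes one statistic commonly used in level-statistics keyword extraction, namely the standard deviation of the gaps between successive occurrences of a term divided by the mean gap. The function name `spacing_sigma` and the toy token stream are hypothetical. Terms that cluster in a few places (as symptoms tend to do within the passages describing a disease) score high, while evenly spread function words score near zero.

```python
from statistics import mean, pstdev


def spacing_sigma(tokens, term):
    """Ratio of std. deviation to mean of gaps between successive occurrences.

    Assumed relevance score (not taken from the paper): values near 0 mean
    the term is evenly spread; larger values mean it occurs in clusters.
    """
    positions = [i for i, tok in enumerate(tokens) if tok == term]
    if len(positions) < 3:
        return 0.0  # fewer than two gaps: too little data to score
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    return pstdev(gaps) / mean(gaps)


# Hypothetical toy token stream: "the" occurs at every fifth position,
# "fever" is clustered at positions 11-13 with one stray occurrence at 51.
tokens = ["the" if i % 5 == 0 else "x" for i in range(60)]
for pos in (11, 12, 13, 51):
    tokens[pos] = "fever"

sigma_the = spacing_sigma(tokens, "the")
sigma_fever = spacing_sigma(tokens, "fever")
print(sigma_the, sigma_fever)
```

On this toy input the evenly spaced word scores exactly zero and the clustered word scores above one, so ranking terms by this score would surface the clustered term first. A real corpus would additionally need tokenization, a frequency-dependent normalization, and a threshold, none of which are specified in the abstract.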


Keywords: Spatial distribution, Provisional diagnosis, Sparse classifier



This research was supported by grants from Information Technology Research Academy (ITRA), under the Department of Electronics and Information Technology (DeitY), Government of India.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Department of Computer Science and Technology, Indian Institute of Engineering Science and Technology, Shibpur, India
