Analyzing impact of parental occupation on child’s learning performance: a semantics-driven probabilistic approach


The effect of parents’ socioeconomic status on a child’s learning performance has been a popular research topic since the last century. However, the majority of these studies rely on traditional statistical models and involve subjective analysis of survey data, which often becomes impractical when the data volume is large. Consequently, there remains ample scope for developing improved formal methods that replace the subjective analysis with an appropriate embedding of data semantics and can thereby generate similar or even new insights into this domain. This paper proposes a semantically enhanced probabilistic model, based on a variant of the semantic Bayesian network, to analyze the impact of parents’ profession on their child’s learning performance. The novelty of the work lies in incorporating occupational semantics into the child-performance analysis model and developing a probabilistic framework to study the impact. Experiments on real-world datasets from a number of state-level schools in India reveal interesting facts, especially regarding the regional influence of parents’ occupation on child performance. The results of a comparative study demonstrate that the proposed model with embedded data semantics has better implication power than techniques that incorporate no semantics.





I would like to thank Mr. Kathirmani Sukumar (former senior data scientist, Gramener Technologies) for sharing the datasets used in the experiments.

Author information



Corresponding author

Correspondence to Monidipa Das.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Das, M. Analyzing impact of parental occupation on child’s learning performance: a semantics-driven probabilistic approach. Int J Data Sci Anal (2020).



  • Semantics
  • Domain knowledge
  • Bayesian network
  • Probabilistic inference
  • Child performance