Educational Data Mining: A Study on Socioeconomic Indicators in Education in INEP Database

  • Aurea T. B. Santos
  • Jonatã Paulino
  • Marcelino S. Silva
  • Liviane Rego
Conference paper
Part of the Lecture Notes on Data Engineering and Communications Technologies book series (LNDECT, volume 37)


Educational data mining enables the discovery of factors that can improve educational proposals, as well as the prediction of student performance and of the factors that influence learning. Accordingly, this work uses the database provided by INEP to better explain which socioeconomic variables influence the grades students obtained in ENEM 2016, one of the most important examinations and one with a large quantity of untapped data. Principal component analysis (PCA) was applied and Bayesian networks were generated to analyze performance. The results show that income, parental schooling, and school type are strong influencing factors.
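
As a minimal sketch of the kind of quantity a Bayesian network encodes, the Python snippet below estimates a conditional probability table P(score band | income band) by relative frequency. The records, band names, and the helper `conditional_table` are assumptions invented for illustration; they are not the ENEM 2016 microdata and do not reproduce the authors' actual pipeline.

```python
from collections import Counter, defaultdict

# Toy (income_band, score_band) records. Invented for illustration only;
# these are NOT drawn from the INEP/ENEM 2016 database.
records = [
    ("low", "below_avg"), ("low", "below_avg"),
    ("low", "above_avg"), ("low", "below_avg"),
    ("high", "above_avg"), ("high", "above_avg"),
    ("high", "below_avg"), ("high", "above_avg"),
]


def conditional_table(pairs):
    """Estimate P(child | parent) from (parent, child) pairs by counting."""
    counts = defaultdict(Counter)
    for parent, child in pairs:
        counts[parent][child] += 1

    table = {}
    for parent, child_counts in counts.items():
        total = sum(child_counts.values())
        # Normalise the counts for each parent value so they sum to 1.
        table[parent] = {c: n / total for c, n in child_counts.items()}
    return table


cpt = conditional_table(records)
for income, dist in sorted(cpt.items()):
    print(income, dist)
```

In a full Bayesian network, one such table would be attached to every node, conditioned on all of that node's parents; the single parent used here is only the simplest case.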


Data mining, Bayesian network, Government data, ENEM, Educational indicators

ACM Classification Keywords

Data mining, Artificial intelligence, Knowledge representation and reasoning



The authors thank CAPES and CNPq for the financial support that made this research possible.



Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  • Aurea T. B. Santos (1), corresponding author
  • Jonatã Paulino (1)
  • Marcelino S. Silva (1)
  • Liviane Rego (1)
  1. Universidade Federal do Pará (UFPA), Guamá, Belém, Brazil
