
Adolescent Family Experiences Predict Young Adult Educational Attainment: A Data-Based Cross-Study Synthesis With Machine Learning

  • Original Paper
Journal of Child and Family Studies

Abstract

Grounded in theory and research on the role of adolescent family experiences in young adult educational attainment, this study took the novel step of synthesizing results from prior studies and using a machine learning (ML) approach to address three questions: (1) By incorporating adolescent family experience factors examined across prior studies in a single analysis, how accurately can we predict young adult educational attainment? (2) Which family experience factors are the best predictors of young adult educational attainment? (3) What complex patterns among family experience predictors merit further examination? Based on a review of 101 publications that used National Longitudinal Study of Adolescent Health (Add Health) data to investigate links between adolescent family experiences and young adult attainment, we identified 53 family experience independent variables. We used an ML-based approach to train and test models with these 53 Wave I family variables (adolescents in Grades 7–12) as predictors of both college enrollment (N = 4598) and graduation (N = 4180) at Wave IV (young adult mean age = 28.88, SD = 1.76). Our models (1) obtained prediction accuracies of 73.43% and 72.33% for college enrollment, and 79.10% and 79.07% for college graduation, (2) identified the best predictors of college enrollment and graduation, including family socioeconomic characteristics and parent educational expectations, and (3) highlighted nonlinear patterns for further examination. This study advanced understanding of how adolescent family experiences may influence educational attainment and provided a paradigm for developmental research to synthesize existing findings into novel discoveries with large-scale datasets.
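
The train-and-test workflow summarized in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not name the algorithms, so the random forest, the 75/25 split, the permutation-based ranking, and the use of scikit-learn are stand-ins, and the data are synthetic values generated only to match the reported enrollment sample size (4598) and predictor count (53). The numbers it prints say nothing about the study's results.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 4598 rows and 53 predictors mirror the enrollment
# sample size and predictor count in the abstract; the values are random.
X, y = make_classification(
    n_samples=4598, n_features=53, n_informative=10, random_state=0
)

# Hold out a test set so that accuracy is measured on unseen cases.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Random forest is an assumed model choice for this sketch.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Question 1: how accurately can the outcome be predicted?
accuracy = accuracy_score(y_test, model.predict(X_test))

# Question 2: which predictors matter most? Rank each predictor by how much
# shuffling its values lowers accuracy on the held-out set.
result = permutation_importance(
    model, X_test, y_test, n_repeats=5, random_state=0
)
ranking = np.argsort(result.importances_mean)[::-1]

print(f"test accuracy: {accuracy:.4f}")
print("top 5 predictor indices:", ranking[:5].tolist())
```

Measuring accuracy and importance on held-out cases, rather than on the training data, is what makes the accuracy figure an estimate of out-of-sample prediction.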

Highlights

  • A machine learning paradigm for cross-study synthesis with large-scale datasets.

  • Synthesized 101 educational attainment studies with 53 adolescent family predictors.

  • Family experiences predicted young adult attainment with 72.33–79.10% accuracy.

  • Identified 12–19 key family predictors of young adult educational attainment.

  • Partial dependence plots highlight nonlinear patterns in prediction.
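
The partial dependence plots mentioned in the last highlight rest on a simple computation: fix one predictor at a chosen value for every case, average the model's predictions, and repeat across a grid of values. The sketch below spells this out in plain Python. The function name `partial_dependence`, the toy model `toy_predict`, and the three example rows are hypothetical and are not taken from the article.

```python
from statistics import mean
from typing import Callable, List, Sequence


def partial_dependence(
    predict: Callable[[List[float]], float],
    rows: Sequence[Sequence[float]],
    feature: int,
    grid: Sequence[float],
) -> List[float]:
    """Average prediction when `feature` is forced to each grid value."""
    curve = []
    for value in grid:
        predictions = []
        for row in rows:
            modified = list(row)       # copy so the original data stay intact
            modified[feature] = value  # override only the feature of interest
            predictions.append(predict(modified))
        curve.append(mean(predictions))
    return curve


def toy_predict(row: List[float]) -> float:
    # Hypothetical model: the predicted probability rises with feature 0
    # only up to a ceiling at 2.0, plus a small linear effect of feature 1.
    return 0.2 + 0.2 * min(row[0], 2.0) + 0.05 * row[1]


rows = [[0.0, 1.0], [1.0, 3.0], [3.0, 2.0]]
grid = [0.0, 1.0, 2.0, 3.0]
curve = partial_dependence(toy_predict, rows, feature=0, grid=grid)

for value, avg in zip(grid, curve):
    print(f"feature 0 = {value:.1f} -> mean prediction {avg:.2f}")
```

In this toy case the curve rises until feature 0 reaches 2.0 and is flat afterwards, which is the kind of nonlinear pattern a partial dependence plot is meant to expose.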



Data Availability

This research uses data from Add Health, a program project directed by Kathleen Mullan Harris and designed by J. Richard Udry, Peter S. Bearman, and Kathleen Mullan Harris at the University of North Carolina at Chapel Hill, and funded by grant P01-HD31921 from the Eunice Kennedy Shriver National Institute of Child Health and Human Development, with cooperative funding from 23 other federal agencies and foundations. Special acknowledgment is due Ronald R. Rindfuss and Barbara Entwisle for assistance in the original design. Information on how to obtain the Add Health data files is available on the Add Health website (http://www.cpc.unc.edu/addhealth). No direct support was received from grant P01-HD31921 for this analysis.


Funding

This material is based on work supported by the National Science Foundation under IGERT Grant DGE-1144860, Big Data Social Science.

Author information

Contributions

X.S. led the design of the study, the data analyses, and the writing. N.R. and S.M.M. collaborated on the design and writing of the study.

Corresponding author

Correspondence to Xiaoran Sun.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Ethical Approval

This study conducts secondary data analysis with a publicly available dataset. This article does not contain any studies with human participants or animals performed by any of the authors.

Informed Consent

The following declaration is obtained from the Add Health website (http://www.cpc.unc.edu/addhealth): “Add Health participants provided written informed consent for participation in all aspects of Add Health in accordance with the University of North Carolina School of Public Health Institutional Review Board guidelines that are based on the Code of Federal Regulations on the Protection of Human Subjects 45CFR46: http://www.hhs.gov/ohrp/humansubjects/guidance/45cfr46.html”.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Sun, X., Ram, N. & McHale, S.M. Adolescent Family Experiences Predict Young Adult Educational Attainment: A Data-Based Cross-Study Synthesis With Machine Learning. J Child Fam Stud 29, 2770–2785 (2020). https://doi.org/10.1007/s10826-020-01775-5

  • DOI: https://doi.org/10.1007/s10826-020-01775-5
