Abstract
Youth in the American foster care system are significantly more likely than their peers to face negative life outcomes, from homelessness to incarceration. Administrative data on these youth could yield insights that help identify ways to improve their path towards a better life. However, such data also suffer from a variety of biases, from missing data to reflections of systemic inequality. The present work proposes a novel, prescriptive approach to using these data to learn about both the biases in the data and the systems and youth the data track. Specifically, we develop a categorical clustering and cluster summarization methodology that lets us surface subtle biases in existing data on foster youth and pinpoint where further (often qualitative) research is needed to identify potential ways of assisting youth.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Sankhe, P., Hall, S.F., Sage, M., Rodriguez, M.Y., Chandola, V., Joseph, K. (2022). Mutual Information Scoring: Increasing Interpretability in Categorical Clustering Tasks with Applications to Child Welfare Data. In: Thomson, R., Dancy, C., Pyke, A. (eds) Social, Cultural, and Behavioral Modeling. SBP-BRiMS 2022. Lecture Notes in Computer Science, vol 13558. Springer, Cham. https://doi.org/10.1007/978-3-031-17114-7_16
DOI: https://doi.org/10.1007/978-3-031-17114-7_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-17113-0
Online ISBN: 978-3-031-17114-7
eBook Packages: Computer Science, Computer Science (R0)