What race and gender stand for: using Markov blankets to identify constitutive and mediating relationships

Abstract

A growing body of research points to the limitations of conceptualizing and measuring race and gender using a single, time-invariant categorical variable. Researchers have argued that the complex processes underlying race and gender cannot meaningfully be reduced to these categories, and that such measures tend to generate essentialist misconceptions. Yet even though more nuanced measures of race and gender have been developed, most datasets in the social sciences still contain single categorical variables to measure these constructs. In this paper, I argue that one way of empirically investigating the meaning and role of these variables in a specific system is to identify their Markov blanket, which is composed of the variables carrying all the information about the variable of interest. I illustrate this exploratory approach by searching for the Markov blanket of race and gender in a nationally representative dataset containing a wide range of factors related to child development.

Notes

  1. As Bollen and Diamantopoulos [29] explain, the constitutive elements (what they call “causal-formative indicators”) require “conceptual unity” in the sense that they need to correspond to the concept’s meaning, and as a consequence can be considered measures of a latent variable. For example, genetic ancestry, skin color and self-identification exhibit conceptual unity, as they are part of the meaning of race. On the other hand, the mediating factors of a variable need not be conceptually related to that variable. For example, parental socioeconomic status might affect the child’s academic achievement through the child’s self-efficacy beliefs, and these beliefs are not conceptually related to socioeconomic status (i.e., they are not part of the meaning of socioeconomic status). I will argue that in the case of race and gender this latter point does not necessarily hold, as the mediating elements can also be part of the meaning of the latent variable.

  2. The co-parents of \(X\) are in \({\text{Mb}}\left( X \right)\) due to the dependencies generated by conditioning on a collider. Consider the relationship between \(X\), the child of \(X\) (\(C\)), and the spouse of \(X\) (\(S\)), represented by the following DAG: \(X \to C \leftarrow S\). Based on the d-separation criterion, we know that \(P(X|C) \ne P(X|C,S)\). The predictive power of \(S\) on \(X\) depends on the coefficients of the paths \(X \to C\) (\(a\)) and \(C \leftarrow S\) (\(b\)). Specifically, the partial covariance is \(\sigma_{XS.C} = \sigma_{XS} - \frac{\sigma_{SC} \sigma_{CX}}{\sigma_{C}^{2}}\). Given that \(X\) and \(S\) are marginally independent (\(\sigma_{XS} = 0\)), and taking the variables to be standardized (so that \(\sigma_{C}^{2} = 1\), \(\sigma_{CX} = a\) and \(\sigma_{SC} = b\)), this expression simplifies to \(\sigma_{XS.C} = 0 - \frac{ab}{1} = -ab\) [38]. In other words, the covariance between \(X\) and \(S\) given \(C\) depends on the two structural coefficients (\(a\), \(b\)) in the model.
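     The collider result in this note can be checked numerically. The following is a minimal simulation sketch, not part of the original analysis: the coefficient values, the sample size and the helper name `partial_covariance` are assumptions chosen for illustration, and the variables are standardized so that the variance of \(C\) equals 1.

```python
import numpy as np

def partial_covariance(x, s, c):
    """Sample partial covariance of x and s given c: s_xs - s_sc * s_cx / s_cc."""
    cov = np.cov(np.vstack([x, s, c]))  # rows are variables: 0 = x, 1 = s, 2 = c
    return cov[0, 1] - cov[1, 2] * cov[2, 0] / cov[2, 2]

rng = np.random.default_rng(0)
n = 200_000          # illustrative sample size (assumption)
a, b = 0.6, 0.5      # illustrative path coefficients for X -> C and S -> C (assumption)

# X and S are independent standard normal variables.
x = rng.normal(size=n)
s = rng.normal(size=n)

# C is a collider; the noise variance is chosen so that Var(C) = a^2 + b^2 + noise = 1.
noise_sd = np.sqrt(1.0 - a**2 - b**2)
c = a * x + b * s + rng.normal(scale=noise_sd, size=n)

marginal = np.cov(x, s)[0, 1]          # should be close to 0
partial = partial_covariance(x, s, c)  # should be close to -a * b

print(f"marginal cov(X, S):    {marginal:.3f}")
print(f"partial cov(X, S | C): {partial:.3f}")
print(f"-a * b:                {-a * b:.3f}")
```

     In this setup the marginal covariance is close to zero while the partial covariance is close to \(-ab\), which is why a co-parent carries information about \(X\) once the common child is conditioned on.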

  3. It is worth noting that exploratory factor analysis can only be used with effect (reflective) indicators, not with causal (formative) indicators [29], and it is the latter that define the latent variables in this study.

  4. The choice of the dependent variables is based on the well-documented differences in reading achievement across racial groups [18] as well as differences in mathematics achievement across gender categories [54]. These differences are visible in the dataset used in this study. A regression of reading achievement in fifth grade on race yields an F-statistic of 121.5 (p < 0.001), and a regression of mathematics achievement on gender yields an F-statistic of 26.6 (p < 0.001). This indicates that there is strong evidence in the data that race is associated with reading achievement and gender with mathematics achievement. However, it is important to note that these predictive models are implemented primarily for illustrative purposes, and that I could have chosen other outcome variables. As a robustness check, Table S1 in the Supplementary Appendix presents the results of models using race to predict mathematics achievement and gender to predict reading achievement (rather than vice versa). I also present models using two unrelated outcomes (behavioral engagement and externalizing behaviors) with both race and gender as predictors.
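     For a regression with a single categorical predictor, the overall F-test coincides with the one-way ANOVA F-test. The sketch below shows how such a statistic is computed. It uses simulated data, not the ECLS-K:2011 data, and the group means, sample size and function name `one_way_f` are assumptions made for illustration, so the printed value is unrelated to the F-statistics reported in this note.

```python
import numpy as np

def one_way_f(y, groups):
    """F-statistic for regressing y on a single categorical predictor."""
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    n, k = y.size, labels.size
    grand_mean = y.mean()

    ss_between = 0.0  # variation explained by group membership
    ss_within = 0.0   # residual variation inside the groups
    for g in labels:
        y_g = y[groups == g]
        ss_between += y_g.size * (y_g.mean() - grand_mean) ** 2
        ss_within += ((y_g - y_g.mean()) ** 2).sum()

    # Ratio of mean squares with (k - 1) and (n - k) degrees of freedom.
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Simulated example: four hypothetical groups with different mean scores.
rng = np.random.default_rng(1)
group = rng.integers(0, 4, size=2000)
group_means = np.array([0.0, 0.3, -0.2, 0.5])
score = group_means[group] + rng.normal(size=2000)

f_stat = one_way_f(score, group)
print(f"F-statistic on simulated data: {f_stat:.1f}")
```

     A large F-statistic indicates that the variation in mean outcomes across categories is large relative to the variation within categories.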

  5. Researchers disagree on the existence of a gender gap in mathematics achievement [56]. However, researchers generally agree about the existence of a gender gap in reading achievement, which can also be found using historical as well as international data [57]. The gender gap in mathematics typically favors men, while the gender gap in reading favors women. Several reasons have been provided to explain these gaps, many of which emphasize how socialization and cultural influences associate gender stereotypes with particular behaviors [57, 58].

  6. The dataset used in the empirical analysis is publicly available and can be downloaded from https://nces.ed.gov/ecls/dataproducts.asp. Code to reproduce the prediction and mediation models can be found in the supplementary material. Pseudo-code of the FGES algorithm can be found in Ramsey et al. [49].

  7. This example also illustrates the difficulty of clearly differentiating between constitutive and mediating relationships, as play can be considered both an effect (or mediating factor) of gender, as well as part of the meaning of gender (i.e., a constitutive element). That is, one can at the same time claim that gender has an effect on how children play, and that how children play is integral to the social and cultural differences contributing to the meaning of gender.

References

  1. Martin, J. L., & Yeung, K.-T. (2003). The use of the conceptual category of race in American sociology, 1937–99. Sociological Forum, 18(4), 521–543.

  2. Westbrook, L., & Saperstein, A. (2015). New categories are not enough: rethinking the measurement of sex and gender in social surveys. Gender & Society, 29(4), 534–560.

  3. Roth, W. D. (2016). The multiple dimensions of race. Ethnic and Racial Studies, 39(8), 1310–1338.

  4. Saperstein, A., & Penner, A. M. (2012). Racial fluidity and inequality in the United States. American Journal of Sociology, 118(3), 676–727.

  5. Saperstein, A., & Westbrook, L. (2020). Categorical and gradational: Alternative survey measures of sex and gender. European Journal of Politics and Gender, 20, 11–30.

  6. Sen, M., & Wasow, O. (2016). Race as a bundle of sticks: Designs that estimate effects of seemingly immutable characteristics. Annual Review of Political Science, 19, 499–522.

  7. Helms, J. E., Jernigan, M., & Mascher, J. (2005). The meaning of race in psychology and how to change it: A methodological perspective. American Psychologist, 60(1), 1–27.

  8. Bailey, S. R., Saperstein, A., & Penner, A. M. (2014). Race, color, and income inequality across the Americas. Demographic Research, 31, 735–756.

  9. Dixon, A. R., & Telles, E. E. (2017). Skin color and colorism: Global research, concepts, and measurement. Annual Review of Sociology, 43, 405–424.

  10. Magliozzi, D., Saperstein, A., & Westbrook, L. (2016). Scaling up: Representing gender diversity in survey research. Socius, 2, 1–11.

  11. Vargas, N., & Kingsbury, J. (2016). Racial identity contestation: Mapping and measuring racial boundaries. Sociology Compass, 10(8), 718–729.

  12. Saperstein, A., Kizer, J. M., & Penner, A. M. (2016). Making the most of multiple measures: Disentangling the effects of different dimensions of race in survey research. American Behavioral Scientist, 60(4), 519–537.

  13. Hu, L., & Kohler-Hausmann, I. (2020). What’s sex got to do with machine learning. arXiv preprint arXiv:2006.01770. Retrieved from https://arxiv.org/pdf/2006.01770.pdf

  14. Saperstein, A., Penner, A. M., & Light, R. (2013). Racial formation in perspective: Connecting individuals, institutions, and power relations. Annual Review of Sociology, 39, 359–378.

  15. Pearl, J. (2014). Probabilistic reasoning in intelligent systems: Networks of plausible inference. Morgan Kaufmann.

  16. Pellet, J.-P., & Elisseeff, A. (2008). Using Markov blankets for causal structure learning. Journal of Machine Learning Research, 9(7), 1295–1342.

  17. Spirtes, P., Glymour, C. N., Scheines, R., & Heckerman, D. (2000). Causation, prediction, and search. MIT Press.

  18. Quintana, R., & Correnti, R. (2020). The concept of academic mobility: Normative and methodological considerations. American Educational Research Journal, 57(4), 1625–1664.

  19. Duncan, G. J., & Murnane, R. J. (2011). Whither opportunity? Rising inequality, schools, and children’s life chances. Russell Sage Foundation.

  20. Chen, J. M., de Paula Couto, M. C. P., Sacco, A. M., & Dunham, Y. (2018). To be or not to be (black or multiracial or white) cultural variation in racial boundaries. Social Psychological and Personality Science, 9(7), 763–772.

  21. Ritz, S. A., Antle, D. M., Côté, J., Deroy, K., Fraleigh, N., Messing, K., & Mergler, D. (2014). First steps for integrating sex and gender considerations into basic experimental biomedical research. The FASEB Journal, 28(1), 4–13.

  22. Dar-Nimrod, I., & Heine, S. J. (2011). Genetic essentialism: On the deceptive determinism of DNA. Psychological Bulletin, 137(5), 800–818.

  23. Prentice, D. A., & Miller, D. T. (2007). Psychological essentialism of human categories. Current Directions in Psychological Science, 16(4), 202–206.

  24. Ahn, W., Taylor, E. G., Kato, D., Marsh, J. K., & Bloom, P. (2013). Causal essentialism in kinds. Quarterly Journal of Experimental Psychology, 66(6), 1113–1130.

  25. Byrd, W. C., & Ray, V. E. (2015). Ultimate attribution in the genetic era: White support for genetic explanations of racial difference and policies. The Annals of the American Academy of Political and Social Science, 661(1), 212–235.

  26. Joel, D. (2021). Beyond the binary: Rethinking sex and the brain. Neuroscience & Biobehavioral Reviews, 122, 165–175.

  27. Reskin, B. (2012). The race discrimination system. Annual Review of Sociology, 38, 17–35.

  28. VanderWeele, T. J., & Robinson, W. R. (2014). On causal interpretation of race in regressions adjusting for confounding and mediating variables. Epidemiology (Cambridge, MA), 25(4), 473–484.

  29. Bollen, K. A., & Diamantopoulos, A. (2017). In defense of causal-formative indicators: A minority report. Psychological Methods, 22(3), 581–596.

  30. Stewart, A. J., & McDermott, C. (2004). Gender in psychology. Annual Review of Psychology, 55, 519–544.

  31. Ladyman, J., Lambert, J., & Wiesner, K. (2013). What is a complex system? European Journal for Philosophy of Science, 3(1), 33–67.

  32. Koller, D., & Friedman, N. (2009). Probabilistic graphical models: principles and techniques. MIT Press.

  33. Pearl, J. (2009). Causality. Cambridge University Press.

  34. Eberhardt, F. (2017). Introduction to the foundations of causal discovery. International Journal of Data Science and Analytics, 3(2), 81–91.

  35. Peters, J., Janzing, D., & Schölkopf, B. (2017). Elements of causal inference. MIT Press.

  36. Pearl, J. (2008). Probabilistic reasoning in intelligent systems: networks of plausible inference (Rev. 2. print., 12. [Dr.]). Kaufmann.

  37. Aliferis, C. F., Statnikov, A., Tsamardinos, I., Mani, S., & Koutsoukos, X. D. (2010). Local causal and Markov blanket induction for causal discovery and feature selection for classification part I: Algorithms and empirical evaluation. Journal of Machine Learning Research, 11(1), 171–234.

  38. Chen, B., & Pearl, J. (2014). Graphical Tools for Linear Structural Equation Modeling. University of California.

  39. Bollen, K. A. (2002). Latent variables in psychology and the social sciences. Annual Review of Psychology, 53(1), 605–634.

  40. Tourangeau, K., Nord, C., Lê, T., Wallner-Allen, K., Vaden-Kiernan, N., Blaker, L., & Najarian, M. (2018). Early childhood longitudinal study, kindergarten class of 2010–11 (ECLS-K: 2011): user’s manual for the ECLS-K: 2011 Kindergarten-Fourth Grade Data File and Electronic Codebook, Public Version. NCES 2018–032. National Center for Education Statistics.

  41. Hughes, D., Rodriguez, J., Smith, E. P., Johnson, D. J., Stevenson, H. C., & Spicer, P. (2006). Parents’ ethnic-racial socialization practices: A review of research and directions for future study. Developmental Psychology, 42(5), 747–770.

  42. Martin, C. L., & Ruble, D. (2004). Children’s search for gender cues: Cognitive perspectives on gender development. Current Directions in Psychological Science, 13(2), 67–70.

  43. Nguyen, C. D., Carlin, J. B., & Lee, K. J. (2017). Model checking in multiple imputation: An overview and case study. Emerging Themes in Epidemiology, 14(1), 8.

  44. Scutari, M., & Denis, J.-B. (2014). Bayesian networks: With examples in R. CRC Press.

  45. Drton, M., & Maathuis, M. H. (2017). Structure learning in graphical modeling. Annual Review of Statistics and its Application, 4(1), 365–393.

  46. Glymour, C., Zhang, K., & Spirtes, P. (2019). Review of causal discovery methods based on graphical models. Frontiers in Genetics, 10, 1–15.

  47. Chickering, D. M. (2002). Optimal structure identification with greedy search. Journal of Machine Learning Research, 3(Nov), 507–554.

  48. Andrews, B., Ramsey, J., & Cooper, G. F. (2018). Scoring Bayesian networks of mixed variables. International Journal of Data Science and Analytics, 6(1), 3–18.

  49. Ramsey, J., Glymour, M., Sanchez-Romero, R., & Glymour, C. (2017). A million variables and more: The Fast Greedy Equivalence Search algorithm for learning high-dimensional graphical causal models, with an application to functional magnetic resonance images. International Journal of Data Science and Analytics, 3(2), 121–129.

  50. Ramsey, J. D. (2015). Scaling up greedy causal search for continuous variables. arXiv preprint arXiv:1507.07749. Retrieved from https://arxiv.org/abs/1507.07749

  51. Constantinou, A. C., Liu, Y., Chobtham, K., Guo, Z., & Kitson, N. K. (2021). Large-scale empirical validation of Bayesian Network structure learning algorithms with noisy data. International Journal of Approximate Reasoning, 131, 151–188. https://doi.org/10.1016/j.ijar.2021.01.001

  52. Nandy, P., Hauser, A., & Maathuis, M. H. (2018). High-dimensional consistency in score-based and hybrid structure learning. Annals of Statistics, 46(6A), 3151–3183.

  53. Shen, X., Ma, S., Vemuri, P., & Simon, G. (2020). Challenges and opportunities with causal discovery algorithms: Application to Alzheimer’s pathophysiology. Scientific Reports, 10(1), 1–12.

  54. Fryer, R. G., Jr., & Levitt, S. D. (2010). An empirical analysis of the gender gap in mathematics. American Economic Journal: Applied Economics, 2(2), 210–240.

  55. Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Science & Business Media.

  56. Stoet, G., & Geary, D. C. (2012). Can stereotype threat explain the gender gap in mathematics performance and achievement? Review of General Psychology, 16(1), 93–102.

  57. Reilly, D., Neumann, D. L., & Andrews, G. (2019). Gender differences in reading and writing achievement: Evidence from the National Assessment of Educational Progress (NAEP). American Psychologist, 74(4), 445–458.

  58. Eagly, A. H., & Wood, W. (2013). The nature–nurture debates: 25 years of challenges in understanding the psychology of gender. Perspectives on Psychological Science, 8(3), 340–357.

  59. MacKinnon, D. P. (2008). Introduction to statistical mediation analysis. Taylor & Francis.

  60. Preacher, K. J., & Hayes, A. F. (2008). Asymptotic and resampling strategies for assessing and comparing indirect effects in multiple mediator models. Behavior Research Methods, 40(3), 879–891.

  61. Muthén, L. K., & Muthén, B. O. (2009). Mplus. Statistical analysis with latent variables. User’s guide, 7.

  62. Phelan, J. C., & Link, B. G. (2015). Is racism a fundamental cause of inequalities in health? Annual Review of Sociology, 41, 311–330.

  63. Owens, J. (2016). Early childhood behavior problems and the gender gap in educational attainment in the United States. Sociology of Education, 89(3), 236–258.

  64. Spilt, J. L., Hughes, J. N., Wu, J.-Y., & Kwok, O.-M. (2012). Dynamics of teacher–student relationships: Stability and change across elementary school and the influence on children’s academic success. Child Development, 83(4), 1180–1195.

  65. Rea-Sandin, G., Korous, K. M., & Causadias, J. M. (2021). A systematic review and meta-analysis of racial/ethnic differences and similarities in executive function performance in the United States. Neuropsychology, 35(2), 141–156.

  66. Hackman, D. A., Gallop, R., Evans, G. W., & Farah, M. J. (2015). Socioeconomic status and executive function: Developmental trajectories and mediation. Developmental Science, 18(5), 686–702.

  67. Pechtel, P., & Pizzagalli, D. A. (2011). Effects of early life stress on cognitive and affective function: An integrated review of human literature. Psychopharmacology (Berlin), 214(1), 55–70.

  68. Fay-Stammbach, T., Hawes, D. J., & Meredith, P. (2014). Parenting influences on executive function in early childhood: A review. Child Development Perspectives, 8(4), 258–264.

  69. Lucas, K., & Sherry, J. L. (2004). Sex differences in video game play: A communication-based explanation. Communication Research, 31(5), 499–523.

  70. Leaper, C., & Farkas, T. (2014). The socialization of gender during childhood and adolescence. In J. E. Grusec & P. D. Hastings (Eds.), Handbook of Socialization, Second Edition: Theory and Research (pp. 541–565). Guilford Publications.

  71. Li-Grining, C. P., Votruba-Drzal, E., Maldonado-Carreño, C., & Haas, K. (2010). Children’s early approaches to learning and academic trajectories through fifth grade. Developmental Psychology, 46(5), 1062–1077.

  72. VanderWeele, T. (2015). Explanation in causal inference: Methods for mediation and interaction. Oxford University Press.

  73. Heinze-Deml, C., Maathuis, M. H., & Meinshausen, N. (2018). Causal structure learning. Annual Review of Statistics and Its Application, 5(1), 371–391.

Author information

Corresponding author

Correspondence to Rafael Quintana.

Ethics declarations

Conflict of interest

The corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 29 KB)

Supplementary file2 (DOCX 35 KB)

Appendix

Table 5 Description of the 77 variables included in the analysis

About this article

Cite this article

Quintana, R. What race and gender stand for: using Markov blankets to identify constitutive and mediating relationships. J Comput Soc Sc 5, 751–779 (2022). https://doi.org/10.1007/s42001-021-00152-6

  • DOI: https://doi.org/10.1007/s42001-021-00152-6

Keywords

  • Race
  • Gender
  • Markov blanket
  • Graphical models