Abstract
This study examines the reliability of the skin tone measures in the widely used American National Election Studies data collection (ANES 2016 Time Series). Low reliability in skin tone measurement can lead to false conclusions regarding theoretically important relationships. Consistent with previous reliability analyses based on data from the General Social Survey, we find that different interviewers agree on Black and Latinx respondent skin tone less than 20% of the time and that inter-rater reliability coefficients are very low (< .3). We also exploit unique features of the ANES data that allow us to (1) assess intra-rater reliability using Krippendorff’s alpha and (2) compare observer skin tone judgments to respondent self-appraisals. We find that even when the same interviewer judges the same Black or Latinx respondent 2 months later, interviewers agree with their earlier assessment less than 35% of the time, only modestly exceeding expectations based on chance alone. Furthermore, we find weak correlations between how interviewers remember Black and Latinx respondent skin tone and how respondents self-describe. Importantly, our analyses indicate that these data patterns persist regardless of whether interviewer race/ethnicity matches that of the respondent. Thus, our results provide little support for the claim that measurement reliability can be significantly improved through a policy of matching respondents to interviewers of the same race and ethnicity. We discuss the implications for future research on skin tone’s relationship with social attitudes and outcomes.
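To illustrate the two agreement statistics named above, the following minimal Python sketch computes raw percent agreement and Krippendorff's alpha for the special case of exactly two ratings per respondent with no missing values. The ratings are hypothetical, the function names are ours, and the default squared-difference (interval) metric is an assumption made for simplicity; it is not necessarily the metric used in the analyses reported here.

```python
from itertools import combinations


def percent_agreement(r1, r2):
    """Share of respondents who received identical ratings on both occasions."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)


def krippendorff_alpha(r1, r2, metric="interval"):
    """Krippendorff's alpha for two complete sets of ratings of the same units.

    alpha = 1 - D_o / D_e, where D_o is the observed disagreement within
    units and D_e is the disagreement expected if ratings were paired at
    random from the pooled set of all ratings.
    """
    if metric == "interval":
        delta = lambda a, b: (a - b) ** 2
    elif metric == "nominal":
        delta = lambda a, b: float(a != b)
    else:
        raise ValueError("metric must be 'interval' or 'nominal'")

    # Observed disagreement: mean difference within each rated respondent.
    d_o = sum(delta(a, b) for a, b in zip(r1, r2)) / len(r1)

    # Expected disagreement: mean difference over all pairs of pooled ratings.
    pooled = list(r1) + list(r2)
    n = len(pooled)
    d_e = 2 * sum(delta(a, b) for a, b in combinations(pooled, 2)) / (n * (n - 1))
    if d_e == 0:
        raise ValueError("alpha is undefined when all ratings are identical")
    return 1 - d_o / d_e


# Hypothetical 1-10 skin tone ratings of eight respondents at two interviews.
time1 = [3, 5, 4, 7, 2, 6, 5, 8]
time2 = [4, 5, 6, 7, 3, 4, 5, 6]

print("percent agreement:", percent_agreement(time1, time2))
print("alpha (interval):", round(krippendorff_alpha(time1, time2), 3))
```

Percent agreement ignores how often raters would match by chance, whereas alpha is scaled so that 1 indicates perfect agreement and 0 indicates agreement no better than chance, which is why the two statistics are reported together.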
Notes
While the General Social Survey data do not include an interviewer identification number for tracking interviewers across survey panel waves, Hannon and DeFina (2016) filtered out cases where the interviewer had identical demographics for the time-1 and time-2 surveys (and thus could be the same person).
To consider whether fading suntans from the summer may drive longitudinal differences in skin tone ratings, we examined the correlation between the difference in interview dates and the difference in recorded skin tone. The estimated correlation coefficient (− .056, p > .05) suggested that this was not a major source of confounding variation. More generally, ratings from interviews that were conducted just a couple of weeks apart were not more aligned than those involving multiple months of separation, suggesting that seasonally fluctuating pigmentation levels do not play a prominent role in the results.
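The check described in this note can be sketched as follows. This is a minimal illustration under stated assumptions: the records are hypothetical, the rating change is taken as the signed difference (second rating minus first), and a plain Pearson coefficient is used. The note does not specify which of these choices were made in the actual analysis.

```python
from datetime import date
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical records: (first interview date, second interview date,
# first skin tone rating, second skin tone rating).
records = [
    (date(2016, 9, 10), date(2016, 11, 12), 5, 4),
    (date(2016, 9, 20), date(2016, 11, 9), 6, 6),
    (date(2016, 10, 1), date(2016, 11, 15), 3, 5),
    (date(2016, 10, 15), date(2016, 11, 30), 7, 6),
    (date(2016, 11, 1), date(2016, 11, 20), 4, 4),
]

# Days elapsed between the two interviews for each respondent.
gaps = [(second - first).days for first, second, _, _ in records]
# Signed change in the recorded rating (an assumption; see lead-in).
changes = [rating2 - rating1 for _, _, rating1, rating2 in records]

r = pearson_r(gaps, changes)
print("correlation between interview gap and rating change:", round(r, 3))
```

A coefficient near zero, as reported in the note, would indicate that longer gaps between interviews are not systematically associated with larger rating shifts.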
One of the anonymous reviewers was curious about these outlying observations. We note that of the 9 cases involving White respondents claiming dark skin tone (7 through 10), there was interviewer classification data for only 2 of them. In both of these cases, the interviewer classified the respondent as having light skin tone (recording a value of 2 in both cases).
Following Feliciano’s (2016) hypothesizing, we also examined whether female interviewers were more likely to classify respondent skin tone in line with the respondent’s self-rating (potentially reflecting culturally promoted greater awareness of others). Consistent with Feliciano’s (2016) results for racial classifications, we found similar skin tone rating correlations regardless of whether the interviewer was male or female. We also found no evidence of a stronger correlation when both the interviewer and the respondent identified as female.
We note that our finding of similar reliability for same-race and cross-race skin tone assessments is in line with research on the consistency of Black racial classifications by same-race and other-race observers (e.g., Herman 2010, but see Feliciano 2016 for some divergent results for classification as White).
We thank Verna Keith for pointing us to this literature.
References
Bae, G. Y., Olkkonen, M., Allred, S. R., & Flombaum, J. I. (2015). Why some colors appear more memorable than others: A model combining categories and particulars in color working memory. Journal of Experimental Psychology: General, 144, 744–763.
Bonilla-Silva, E. (2002). We are all Americans!: The Latin Americanization of racial stratification in the USA. Race and Society, 5, 3–16.
Bonilla-Silva, E. (2006). Racism without racists: Color-blind racism and the persistence of racial inequality in the United States. Boulder, CO: Rowman and Littlefield.
Dixon, A. R., & Telles, E. (2017). Skin color and colorism: Global research, concepts, and measurement. Annual Review of Sociology, 43, 405–424.
Feliciano, C. (2016). Shades of race: How phenotype and observer characteristics shape racial classification. American Behavioral Scientist, 60, 390–419.
Gans, H. J. (2012). ‘Whitening’ and the changing American racial hierarchy. Du Bois Review: Social Science Research on Race, 9, 267–279.
Garcia, D., & Abascal, M. (2016). Colored perceptions: Racially distinctive names and assessments of skin color. American Behavioral Scientist, 60, 420–441.
Gullickson, A. (2005). The significance of color declines: A re-analysis of skin tone differentials in post-civil rights America. Social Forces, 84, 157–180.
Hagiwara, N., Kashy, D., & Cesario, J. (2012). The independent effects of skin tone and facial features on Whites’ affective reactions to Blacks. Journal of Experimental Social Psychology, 48, 892–898.
Hannon, L., DalCortivo, A., & Mohammed, K. (2017). The case for taking White racism and White colorism more seriously. In Zulema Valdez (Ed.), Beyond Black and White: A reader on contemporary race relations. Thousand Oaks, CA: Sage Publications.
Hannon, L., & DeFina, R. (2014). Just skin deep: The impact of interviewer race on the assessment of African American respondent skin tone. Race and Social Problems, 6, 356–364.
Hannon, L., & DeFina, R. (2016). Reliability concerns in measuring respondent skin tone by interviewer observation. Public Opinion Quarterly, 80, 534–541.
Hayes, A. F., & Krippendorff, K. (2007). Answering the call for a standard reliability measure for coding data. Communication Methods and Measures, 1, 77–89.
Herman, M. R. (2010). Do you see what I am?: How observers’ backgrounds affect their perceptions of multiracial faces. Social Psychology Quarterly, 73, 58–78.
Hill, M. E. (2002). Race of interviewer and perception of skin color: Evidence from the multi-city study of urban inequality. American Sociological Review, 67, 99–108.
Hochschild, J. L. (2012). Lumpers or splitters: Analytic and political choices in studying colour lines and colour scales. Ethnic and Racial Studies, 35, 1132–1136.
Keith, V. M., Lincoln, K. D., Taylor, R. J., & Jackson, J. S. (2010). Discriminatory experiences and depressive symptoms among African American women: Do skin tone and mastery matter? Sex Roles, 62, 48–59.
Krippendorff, K. (2004). Content analysis: An introduction to its methodology. Thousand Oaks, CA: Sage.
Levin, D. T. (2000). Race as a visual feature: Using visual search and perceptual discrimination tasks to understand face categories and the cross-race recognition deficit. Journal of Experimental Psychology: General, 129, 559–574.
Maddox, K. B. (2004). Perspectives on racial phenotypicality bias. Personality and Social Psychology Review, 8, 383–401.
Massey, D. S., & Martin, J. A. (2003). NIS skin color scale. Princeton, NJ: Princeton University.
Monk, E. P. (2014). Skin tone stratification among Black Americans, 2001–2003. Social Forces, 92, 1313–1337.
Monk, E. P. (2015). The cost of color: Skin color, discrimination, and health among African-Americans. American Journal of Sociology, 121, 396–444.
Penner, A. M., & Saperstein, A. (2013). Engendering racial perceptions: An intersectional analysis of how social status shapes race. Gender & Society, 27, 319–344.
Piston, S. (2014). Lighter-skinned minorities are more likely to support Republicans. The Washington Post, Monkey Cage Blog, September 17, 2014.
Ramakrishnan, K. (2014). Light-skinned minorities won’t grow the Republican Party. The Washington Post, Monkey Cage Blog, September 24, 2014.
Reece, R. (2019). Coloring racial fluidity: How skin tone shapes multiracial adolescents’ racial identity changes. Race and Social Problems, 11, 290–298.
Roth, W. D. (2010). Racial mismatch: The divergence between form and function in data for monitoring racial discrimination of Hispanics. Social Science Quarterly, 91, 1288–1311.
Roth, W. D. (2016). The multiple dimensions of race. Ethnic and Racial Studies, 39, 1310–1338.
Telles, E., & Paschel, T. (2012). Beyond fixed or fluid: Degrees of fluidity in racial identification in Latin America. Working paper. Retrieved from: https://perla.princeton.edu/files/2012/05/BeyondFixedorFluid.pdf.
Telles, E., & Paschel, T. (2014). Who is Black, White, or mixed race? How skin color, status, and nation shape racial classification in Latin America. American Journal of Sociology, 120, 864–907.
Vargas, N. (2015). Latina/o whitening? Which Latina/os self-classify as White and report being perceived as White by other Americans? Du Bois Review: Social Science Research on Race, 12, 119–136.
Walker, D. A. (2016). Confidence intervals for Kendall’s Tau with small samples. Journal of Modern Applied Statistical Methods, 15, 868–883.
Cite this article
Hannon, L., DeFina, R. The Reliability of Same-Race and Cross-Race Skin Tone Judgments. Race Soc Probl 12, 186–194 (2020). https://doi.org/10.1007/s12552-020-09282-4
DOI: https://doi.org/10.1007/s12552-020-09282-4