Introduction

A substantial body of evidence has demonstrated that prenatal exposure to sex hormones is associated with neurobiological sexual differences. For example, fetal testosterone exposure is related to differences in grey matter volume in the hippocampus and the amygdala, whereas exposure to fetal estrogen is associated with differences in frontal cortex grey matter volume (Lombardo et al. 2012; Pletzer, 2019). These neurodevelopmental variations are in turn linked to sex differences in postnatal behavior and expression (e.g., Blakemore et al. 2010; De Bellis et al. 2001; Giedd et al. 1997; Giedd et al. 1999; Gur and Gur 2016; Gur et al. 1999; Lenroot and Giedd 2006; Lenroot et al. 2007). Interestingly, the ratio of the second (index) to fourth (ring) digit (i.e., 2D:4D) may serve as a marker of this prenatal hormone exposure, such that a larger ratio is indicative of higher relative exposure to estrogen in utero, whereas a smaller ratio is indicative of greater relative prenatal exposure to androgens (e.g., Auger et al. 2013; Manning et al. 1998; Zheng and Cohn 2011; although see Richards et al. 2020a, b, for recent studies that call into question the validity of this measure).

2D:4D ratio

There are widely reported sex differences in 2D:4D ratio such that females generally have larger ratios than do males (e.g., Hönekopp and Watson 2010; Manning 2002). This difference is especially pronounced for the right hand (e.g., Manning et al. 1998; Hönekopp and Watson 2010), and can be detected in very young children (e.g., Gillam et al. 2008), neonates (Ventura et al. 2013), and even prenatally (Galis et al. 2010; Malas et al. 2006). Digit ratio not only differs for males and females but is also associated with sexual orientation and gender diversity. For example, there is evidence that gay males may have larger 2D:4D ratios than straight males (e.g., Lippa 2003; Manning et al. 2007; Putz et al. 2004), whereas lesbian females have smaller 2D:4D ratios than straight females (e.g., Grimbos et al. 2010; Kraemer et al. 2009; Putz et al. 2004). However, these findings are often inconsistent and/or contradictory (e.g., Putz et al. 2004; McFadden et al. 2005). For example, a large meta-analysis by Grimbos et al. (2010) found no difference in 2D:4D ratio between gay and straight males but did find a difference between gay and straight females. In addition, 2D:4D ratio may be associated with gender identity, although the literature is decidedly equivocal (e.g., Kraemer et al. 2009; Leinung and Wu 2017; Manning et al. 2020; Sadr et al. 2020; Voracek et al. 2018).

Digit ratio has also been associated with a host of sexually dimorphic behaviors and abilities. For example, males with smaller 2D:4D ratios show greater levels of aggression (Bailey and Hurd 2005; Benderlioglu and Nelson 2004; Hönekopp and Watson 2010) and competitiveness (e.g., Borráz-León et al. 2018) as compared to males with larger ratios, and are more likely to engage in risk-taking behaviors (e.g., Stenstrom et al. 2011). Additionally, 2D:4D ratio is negatively related to physical fitness and athletic prowess in both males and females (Eklund et al. 2020; Hönekopp and Schuster 2010; Hönekopp et al. 2006; Hsu et al. 2015), such that smaller digit ratios are associated with superior physical performance. Smaller 2D:4D ratios have also been associated with better performance on cognitive tasks that typically favor males, such as mental rotation (e.g., Peters et al. 2007), spatial navigation (e.g., Csathó et al. 2003), and tasks of visuospatial ability (e.g., Collaer et al. 2007).

2D:4D and pro-environmental attitudes and behaviors

Given the relationships between 2D:4D and sexually dimorphic behaviors and traits, it follows that digit ratio may also be associated with other variables that are strongly sexually dimorphic or that show sex differences, specifically pro-environmental attitudes and behaviors. For example, males have consistently been shown to be more skeptical than females regarding the dangers of climate change (e.g., Hamilton 2011; Pickering 2015; Tranter and Booth 2015; Whitmarsh 2011; Zhou 2015), and females show greater environmental concern than males (Arcury and Christianson 1990; Blaikie 1992; Maineri et al. 1997; McCright 2010). Females also have a greater tendency to engage in environmentally friendly behaviors (e.g., Diamantopoulos et al. 2003; Fielding and Head 2012; Hunter et al. 2004; Tindall et al. 2003), such as reducing energy consumption (Räty and Carlsson-Kanyama 2010), using public transportation (Vicente-Molina et al. 2018), and consuming less red meat (Stea and Pickering 2018).

There is emerging evidence of a positive relationship between pro-environmental attitudes and behaviors, and empathy (Baird et al. 2020; Berenguer 2007; Brown et al. 2019). Empathy offers a potential “route to human action” by supporting a more interdependent understanding of the relationship between humans and the environment (Brown et al. 2019:11). Furthermore, empathy has been linked in some studies to 2D:4D ratio (e.g., van Honk et al. 2011; Von Horn et al. 2010; Wakabayashi and Nakazawa 2010; Weisman et al. 2015), with females consistently scoring higher than males on measures of empathy (Eisenberg and Lennon 1983). However, the evidence for a link between empathy and digit ratio is mixed, with some studies showing that 2D:4D ratio is only associated with empathy in men (Von Horn et al. 2010), and a considerable number of studies (e.g., Manning et al. 2010), including a recent meta-analysis (Hönekopp 2012), finding no evidence for a link between empathy and digit ratio. Thus, it is not entirely clear whether variations in empathy are explained by prenatal testosterone exposure, and subsequently whether pro-environmental behavior will also be related.

To our knowledge, only one study has examined the relationship between 2D:4D and environmental attitudes (Hand 2020). Hand’s (2020) study explored whether 2D:4D ratio was associated with pro-environmental behaviors, such as reducing the use of single-use plastics and using public transportation. Interestingly, they found that a small digit ratio was, counterintuitively, associated with higher engagement in pro-environmental behaviors in males, whereas there was no relationship between 2D:4D and environmental actions in females. Overall, this indicates that more research is needed on the possible links between 2D:4D and environmental attitudes and behaviors.

2D:4D measurement issues

As the research on 2D:4D continues to grow, a number of methodological inconsistencies have come to light. For example, some researchers directly measure the 2nd and 4th digit in person (i.e., direct measurements; e.g., Manning et al. 1998), whereas others use photocopies of hands or hand scans for their measurements (i.e., indirect measurements; Ribeiro et al. 2016). In addition, some researchers utilize specialized software to precisely calculate 2D:4D, whereas others use calipers or simple plastic rulers (Kemper and Schwerdtfeger 2009). Finally, in some studies, the participants themselves are asked to measure the length of their own digits (e.g., Hoskin and Ellis 2021; Osu et al. 2021; Reimers 2007). These methodological differences affect the precision of the measurements, which may contribute to inconsistencies within the literature. For example, direct measurements tend to yield different 2D:4D ratios than indirect measurements (Ribeiro et al. 2016; Manning et al. 2005), which in turn can affect the relationship between digit ratio and other variables of interest. Indeed, a 2010 meta-analysis of 116 studies on 2D:4D and sex found that sex differences were inflated when digit ratio was measured indirectly as compared to directly (Hönekopp and Watson 2010). In addition, they found that the magnitude of sex differences in 2D:4D is highly variable, with typical effect sizes ranging from quite small (d = 0.09) to moderate (d = 0.61).

2D:4D and online measurements

These issues are exacerbated when collecting data via an online sample. The use of online data collection services, such as Amazon Mechanical Turk, is an increasingly attractive option given the reduced costs compared to in-person recruitment, ability to access a much more diverse pool of participants, and the rapidity with which data can be collected (e.g., Horrell et al. 2015; Krantz and Reips 2017; Sheehan 2017). The use of such online pools is especially necessary for research fields that require large N’s, diverse samples, or samples from vulnerable or difficult-to-access populations (e.g., populations that are disproportionately affected by climate change, such as females in developing countries). Now, with the unprecedented limitations and restrictions imposed by the current COVID-19 pandemic, many researchers have been unexpectedly forced to transition to collecting data remotely (Arechar and Rand 2021; Lourenco and Tasimi 2020; Wood 2021).

However, online data collection comes with its own set of limitations, including variations in data quality (Chandler et al. 2019). With respect to 2D:4D, there is a much higher probability of measurement error from self-reported digit length (Hönekopp and Watson 2010), which is the technique primarily used when collecting data from online sources. For example, Hönekopp and Watson (2010) demonstrated that the reliability of self-reported measurements of 2D:4D was just 46% of the reliability of directly-measured ratios. One way to mitigate these issues is to recruit a large sample of participants (e.g., > 250,000 as in the BBC study on sex differences; Reimers 2007), thereby reducing the impact of random measurement error (Caswell and Manning 2007). However, for many research groups, particularly those with limited resources, it may be unfeasible to conduct research on this massive scale. As such, it is necessary to explore additional options for obtaining 2D:4D data from online samples.

One possible remedy is to have participants submit images of their hands using a mobile device which can then be measured using a variety of techniques, rather than having participants take their own measurements (Sandnes 2014). The use of participant-submitted images would eliminate some of the issues associated with indirect measurement of photocopies of hands and should reduce the amount of measurement error introduced by having participants complete their own digit measurements. As of 2020, an estimated 5.2 billion individuals worldwide reported access to a mobile phone that is capable of taking high resolution images (Kemp 2021). As such, it is reasonable to assume that most individuals who participate in online data pools will have access to a camera, either via their mobile phone or webcam. As the world transitions to online data collection, it is increasingly important to find ways to measure 2D:4D in a reliable manner that is applicable to both large and small samples.

Current study

The primary purpose of the current study is to examine whether 2D:4D ratio can be reliably determined from user-submitted images. Although previous online studies have calculated 2D:4D ratio from self-reported measurements (e.g., Hoskin and Ellis 2021; Osu et al. 2021; Manning and Fink 2008), these data are prone to user measurement error and thus potentially unsuitable for smaller-scale studies. Given the utility of collecting data from online participant pools, which provide rapid and easy access to representative samples from across the world, it is necessary to develop an appropriate methodology for collecting quality physiological data from virtual participants.

In addition to examining the viability of using participant-submitted images for assessing digit ratio, the secondary purpose of this study was to examine whether the resulting 2D:4D ratios were associated with differences in empathy and pro-environmental behaviors and attitudes. Previous research has demonstrated that males and females differ with respect to pro-environmental behaviors and attitudes, including those toward climate change (e.g., Tranter and Booth 2015; Wang and Kim 2018; Zhou 2015); thus, it is possible that these sex differences in pro-environmental attitudes may be related to differences in 2D:4D ratio. To address this question, we examined whether variables that are strongly associated with pro-environmental attitudes and behaviors, and that differ as a function of sex, were associated with 2D:4D ratio. Specifically, we hypothesized that larger 2D:4D ratios would be associated with greater pro-environmental behaviors and attitudes.

Method

Participants

A total of 1065 individuals from six countries participated in this online study. A breakdown of the demographics of this sample can be found in Supplementary Table S1. The study was administered through a third-party survey company, Dynata, and participants who completed the study were compensated with Dynata credits which could be exchanged for gift cards. In order to be eligible to participate, participants had to be fluent in English, at least 18 years of age, and reside in one of six target countries (Australia, Canada, India, South Africa, UK, and USA). All subjects provided written consent prior to participating. This study was approved by the Human Research Ethics Board at Brock University and conducted in accordance with Tri-Council ethical guidelines.

Stimuli and design

Individuals were instructed to complete a questionnaire, presented via the Qualtrics platform and accessible via their Dynata profile. The questionnaire consisted of four sections: (1) a demographics questionnaire, (2) a measure of dispositional empathy, (3) three measures designed to assess environmental attitudes, and (4) a section in which participants were instructed to upload an image of their right hand for the 2D:4D ratio measurement (see Footnote 1). All participants completed the questions in the same order, and after completion of the survey were debriefed and compensated.

Demographics

The demographics portion of the questionnaire asked participants to provide the following information: age in years, sex, country, type of area (urban, suburban, rural), and highest level of education. To ensure that questions were culturally appropriate, the survey was localized by an expert at Dynata with assistance from the research team. Specifically, education levels were tailored to each country, income ranges were adjusted to reflect each country’s currency, and, for the USA survey, the spelling was changed to American English.

Empathy questionnaire

To measure empathy, we employed the Adapted Empathy Scale (Loewen et al. 2009). This 8-item questionnaire assesses dispositional (i.e., naturally occurring) empathy and is a modified version of the shortened version of the Empathy Quotient (EQ) scale (Baron-Cohen and Wheelwright 2004). Participants were asked to rate their level of agreement with each of eight statements (e.g., “I find it easy to put myself in somebody else’s shoes”) on a 5-point Likert scale (1 = “strongly disagree”; 5 = “strongly agree”). An empathy score was generated by summing the ratings across the eight questions for a maximum possible score of 40 (higher scores = higher levels of empathy).

Environment questions

New Environmental Paradigm (NEP). The NEP (Whitmarsh 2011) is a widely used measure of environmental concern. Participants were presented with six statements (e.g., “Humans are severely abusing the planet”), and were asked to rate their level of agreement with each statement using a 7-point Likert scale (1 = strongly disagree; 7 = strongly agree). Scores were averaged (correcting for reverse-keyed items) for a total mean NEP score out of seven, with higher scores indicating greater environmental awareness/concern.

Environmental actions. Whereas the NEP measures environmental attitudes and values, we were also interested in examining how often participants had actually engaged in pro-environmental behaviors. Participants were asked to rate (1 = never; 5 = very often) how often they had engaged in each of the following pro-environmental actions (scale adapted from Larson et al. 2015): “Talked to others in my community about environmental issues,” “Worked with others to address an environmental problem or issue,” “Participated as an active member in a local environmental group,” “Voted to support a policy/regulation that affects the local environment,” “Signed a petition about an environmental issue,” “Donated money to support local environmental protection,” and “Wrote a letter in response to an environmental issue.” Scores were summed across the seven questions for a maximum possible score of 35 (high scores = greater participation in pro-environmental behaviors).

Environmental locus of control. We included an environmental locus of control scale (Fielding and Head 2012) to assess the degree to which participants felt that their actions could have an effect on the environment. Using a 7-point Likert scale (1 = “strongly disagree”; 7 = “strongly agree”), participants were asked to rate their level of agreement with each of three statements: “My individual actions can make a difference to the environment,” “I can influence decisions now that will help protect the environment in the future,” and “I am only one person; I can’t make a difference to the environment.” Scores were averaged for a total score out of seven (correcting for reverse-keyed items), with higher scores indicating a higher external environmental locus of control (i.e., the belief that individual actions cannot influence the state of the environment).
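The scoring rules described for the questionnaires (summing or averaging Likert ratings after correcting reverse-keyed items) can be sketched as follows. This is a minimal illustration rather than the scoring script used in the study: the function name `score_scale`, the example ratings, and the choice of which item is reverse-keyed are all assumptions, since the text does not list the reverse-keyed items.

```python
from statistics import mean

def score_scale(responses, reverse_items=(), scale_min=1, scale_max=7, how="mean"):
    """Score one participant's Likert responses.

    responses     : list of integer ratings, one per item
    reverse_items : zero-based indices of reverse-keyed items (hypothetical here)
    how           : "mean" (e.g., NEP, locus of control) or "sum" (e.g., empathy, actions)
    """
    corrected = []
    for i, r in enumerate(responses):
        if not scale_min <= r <= scale_max:
            raise ValueError(f"rating {r} outside {scale_min}-{scale_max}")
        # Reverse-keying mirrors the rating around the scale midpoint.
        corrected.append(scale_min + scale_max - r if i in reverse_items else r)
    return sum(corrected) if how == "sum" else mean(corrected)

# Three items on a 1-7 scale; the third item is ASSUMED to be reverse-keyed.
loc = score_scale([6, 5, 2], reverse_items={2}, scale_max=7)  # (6 + 5 + 6) / 3

# Eight items on a 1-5 scale, summed (maximum possible score of 40).
emp = score_scale([4, 5, 3, 4, 4, 5, 2, 3], scale_max=5, how="sum")
```

Whether a high score reflects an internal or an external orientation depends on which items are reverse-keyed, so the keying should be taken from the original scale documentation rather than from this sketch.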

Digit ratio

To assess 2D:4D digit ratio, participants were instructed to take a photo of their right hand with a mobile device or computer and then upload it to the questionnaire via a designated link. The instructions to participants were field-tested to ensure clarity. Participants were asked to place their right hand on a flat surface (preferably a table, desk, or wall) with their palm facing toward the camera and their fingers spread apart. They were provided with an example photo (refer to Supplementary Fig. 1) and were also instructed to take the photo in a well-lit room or have the flash turned on, ensure that their hand was in focus (image not blurry), and ensure that all fingers were visible in the photo.

For measuring digit lengths and thus deriving 2D:4D ratio from the submitted images, we first evaluated two software packages, ImageJ (https://imagej.nih.gov/ij/) and the GNU Image Manipulation Program (GIMP, version 2.8.4; www.gimp.org), and decided on GIMP as it offered greater functionality with the data. Photos were assessed in random order by one expert assessor. Prior to finger measurement, the assessor graded the quality of the image using the following criteria: Good: Hand resting flat, fingers straight and spaced apart, palm up, and the tip and base crease of both fingers visible; Moderate: Image is still considered usable, but is (i) partially blurry, (ii) lighting is suboptimal, and/or (iii) finger/hand orientation slightly angled or bent; additional time required to make the measurements as adjustments made where possible (i.e., improving contrast/brightness); Poor: Unable to determine digit length as (i) image is too blurry, (ii) 2D and/or 4D cut off, and/or (iii) finger/hand orientation too angled or bent.

Several techniques were trialed for determining finger length from the good and moderate quality images, and we selected a modification of the approach of Kornhuber et al. (2013). The length of the second (index) and fourth (ring) fingers of the right hand was measured from the middle of the basal crease to the tip of the finger, taking care to keep the measuring tool on the central axis of each finger, and was expressed in units of pixels using the GIMP “measure” tool. To improve accuracy, small horizontal marks were drawn digitally on the basal creases of each of the participants’ index and ring fingers before determining length. Digit lengths, derived 2D:4D ratio, image quality coding, and image metadata were compiled in MS Excel.
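Because 2D:4D is a ratio of two lengths, pixel units cancel and no physical calibration of the image is needed. The calculation can be sketched as below, assuming that each finger is represented by two landmark coordinates (the midpoint of the basal crease and the fingertip). The function names and coordinates are hypothetical; the study itself used the GIMP “measure” tool rather than code.

```python
import math

def digit_length(base_xy, tip_xy):
    """Euclidean distance in pixels between the basal-crease midpoint and the fingertip."""
    return math.dist(base_xy, tip_xy)

def digit_ratio(index_base, index_tip, ring_base, ring_tip):
    """2D:4D = index-finger length / ring-finger length (unit-free)."""
    ring = digit_length(ring_base, ring_tip)
    if ring == 0:
        raise ValueError("ring-finger length is zero; check landmark coordinates")
    return digit_length(index_base, index_tip) / ring

# Hypothetical landmark coordinates (x, y) in pixels for one image.
ratio = digit_ratio((400, 900), (430, 500), (700, 920), (690, 510))
```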

Results

A total of 329 of the 1065 participants were removed from the final analysis for not submitting images as instructed. Of those individuals, 190 submitted non-hand images (e.g., a picture of a dog), 43 submitted images that were in an unsupported format (e.g., video files, YouTube links), and 96 submitted photos of their left hand (see Footnote 2). This left a total of 736 images for further analysis.

Hand image quality and resolution

Information on the resolution and device type used for taking each image was extracted from the photo metadata where possible. The mean resolution of the submitted hand images was approximately 8.82 megapixels (MP; SD = 4.95) and ranged from less than 1 MP to 24.84 MP. Of the 736 submitted images, 166 were classified as “poor” quality, 274 were “moderate” quality, and 296 were “good” quality. In order to examine whether image resolution differed across images classified as poor, moderate, or good, we conducted a one-way ANOVA with a Scheffé post hoc test. There was a main effect of image quality, F(2, 733) = 13.77, p < 0.001, η2 = 0.04, such that the images rated as “good” were of a significantly higher resolution than the images classified as “moderate” (Mdiff = 1.36 MP; p = 0.004) or “poor” (Mdiff = 2.39 MP; p < 0.001; see Fig. 1). However, moderate and poor images did not significantly differ with respect to their resolution (Mdiff = 1.03 MP; p = 0.10), although the poor images had numerically lower resolution.
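The one-way ANOVA reported above, and the link between the F statistic and η2, can be illustrated with a short sketch. The resolution values below are made up for illustration and are not the study data, and the Scheffé post hoc comparisons are not implemented; the function names are hypothetical.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_b, df_w = len(groups) - 1, len(all_values) - len(groups)
    f = (ss_between / df_b) / (ss_within / df_w)
    return f, df_b, df_w, ss_between / (ss_between + ss_within)

def eta_squared_from_f(f, df_between, df_within):
    """Recover eta squared from a reported one-way F statistic."""
    return f * df_between / (f * df_between + df_within)

# Made-up resolutions (MP) for three quality groups, for illustration only.
poor, moderate, good = [3.1, 5.0, 8.2, 2.4], [6.0, 8.1, 9.5, 4.9], [12.0, 8.0, 10.2, 9.1]
f, df_b, df_w, eta2 = one_way_anova([poor, moderate, good])

# Consistency check on the reported values: F(2, 733) = 13.77.
reported_eta2 = eta_squared_from_f(13.77, 2, 733)  # about 0.036, i.e. 0.04 when rounded
```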

Fig. 1 Kernel density plot comparing image resolution in megapixels (MP) for images categorized as poor, moderate, and good

In addition to examining the relationship between resolution and image quality, we also assessed whether the type of device that participants used to take a photo had an effect on image quality (see Table 1) using the command-line tool, Exiftool (https://exiftool.org), to extract image device make and model. The vast majority of participants used a mobile phone (or the device type was unavailable from the metadata) so we were unable to perform analyses on these data, although we note that the use of a tablet device appears to yield poorer quality images.
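A metadata extraction step of this kind might look like the sketch below, which calls the Exiftool command-line program from Python and converts pixel dimensions to megapixels. It assumes that `exiftool` is installed and on the system path; the file name, device make, and device model in the sample are invented, and this is not the script used in the study.

```python
import json
import subprocess

def read_exif(paths):
    """Call the exiftool CLI and return its JSON output as text."""
    cmd = ["exiftool", "-json", "-Make", "-Model", "-ImageWidth", "-ImageHeight", *paths]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

def summarize(exif_json):
    """Turn exiftool JSON into rows of (file, make, model, megapixels)."""
    rows = []
    for rec in json.loads(exif_json):
        w, h = rec.get("ImageWidth"), rec.get("ImageHeight")
        # Megapixels are left as None when the dimensions are missing.
        mp = round(w * h / 1e6, 2) if w and h else None
        rows.append((rec.get("SourceFile"), rec.get("Make"), rec.get("Model"), mp))
    return rows

# Hypothetical exiftool output for a single image (values invented).
sample = ('[{"SourceFile": "hand_001.jpg", "Make": "ExampleCo", '
          '"Model": "Phone X", "ImageWidth": 4032, "ImageHeight": 3024}]')
rows = summarize(sample)
```

In practice, `summarize(read_exif(list_of_paths))` would produce one row per image; images whose metadata has been stripped simply yield `None` for the missing fields.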

Table 1 Device type and corresponding image quality and resolution

Reliability of image quality rating and 2D:4D measurements

Next, we examined the reliability of measuring 2D:4D ratios from user-submitted images. Twenty-nine images were randomly selected from within the pool of 736 images. A member of the research team blindly coded the image quality (poor, moderate, good) using the established criteria (refer to “Digit ratio” section in “Method”). They then repeated the exercise twice more on the same set of images with image identity re-blinded and order of presentation re-randomized. They then calculated the 2D:4D ratio for 27 of the images (16 “moderate,” 11 “good”; two “poor” quality images discarded) at three different time points (i.e., each of the 27 images was examined three times by the same assessor, and image identifiers were changed each time to ensure that they were blinded). Intraclass correlation estimates using a consistency two-way mixed-effects model were then calculated to compare both image quality ratings and 2D:4D ratio estimates across the three timepoints. The average intraclass correlation for image quality rating was 0.884 (95% CI = 0.784 to 0.941), and the average intraclass correlation for 2D:4D ratio estimate was 0.949 (95% CI = 0.966 to 0.991), demonstrating excellent intra-rater reliability.
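For readers unfamiliar with this statistic, a consistency-type intraclass correlation under a two-way mixed-effects model can be computed from the mean squares of a two-way layout (images by time points). The sketch below returns both the single-measure and the average-measure form, because the text does not state explicitly which form the reported "average" values refer to. The ratings are illustrative only and the function name is hypothetical.

```python
def icc_consistency(ratings):
    """ICC for a two-way mixed-effects, consistency model.

    ratings: list of rows, one per image, each holding k repeated measurements.
    Returns (single_measure_icc, average_measure_icc).
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between images
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between time points
    ms_rows = ss_rows / (n - 1)
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    single = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    average = (ms_rows - ms_error) / ms_rows
    return single, average

# Illustrative ratings: six targets, each measured four times.
data = [[9, 2, 5, 8], [6, 1, 3, 2], [8, 4, 6, 8],
        [7, 1, 2, 6], [10, 5, 6, 9], [6, 2, 4, 7]]
single, average = icc_consistency(data)
```

Because the consistency definition removes systematic differences between time points from the error term, a constant shift in all measurements at one time point does not lower the coefficient.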

2D:4D ratio

Data from the 166 individuals who provided images that were rated as “poor” quality were excluded from further analysis as finger length could not reliably be measured (e.g., image too blurry to identify crease, digits cut off in photo, fingers curled, jewelry covering finger crease). An additional 41 participants were removed for not answering the remaining questionnaire items as instructed (e.g., skipping all questions, submitting nonsense answers). Finally, the data from two individuals were removed for having a 2D:4D ratio that was more than 3 SDs from the mean. As such, the data from 527 participants (279 male, 247 female, 1 undisclosed) were retained for further analysis. The mean overall 2D:4D ratio was 0.989 (SD = 0.08) and ranged from 0.781 to 1.23. The ratio data were normally distributed, D(527) = 0.38, p = 0.07, with skewness of 0.231 (SE = 0.106) and kurtosis of 0.396 (SE = 0.212).
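The 3-SD exclusion rule can be expressed in a few lines. The sketch below assumes the sample standard deviation (n − 1 denominator) and a single pass over the data; the text does not specify either detail, and the ratios shown are invented.

```python
from statistics import mean, stdev

def exclude_outliers(values, n_sd=3.0):
    """Split values into (kept, removed) using a mean +/- n_sd * SD rule."""
    m, s = mean(values), stdev(values)  # sample SD (n - 1 denominator)
    kept = [v for v in values if abs(v - m) <= n_sd * s]
    removed = [v for v in values if abs(v - m) > n_sd * s]
    return kept, removed

# Invented ratios: twenty plausible values plus one extreme value.
ratios = [0.95, 0.97, 0.99, 1.01, 1.03] * 4 + [1.60]
kept, removed = exclude_outliers(ratios)
```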

For female participants, the mean 2D:4D ratio was 0.992 (SD = 0.082), whereas for males, the 2D:4D ratio was 0.986 (SD = 0.079; see Fig. 2). An independent-samples t-test showed that there was no sex difference in 2D:4D ratio in this sample, t(524) = −0.831, p = 0.406, d = 0.07. This was somewhat unexpected given that sex differences in 2D:4D are fairly well established (see Hönekopp and Watson 2010 for a meta-analysis). Given that recent studies have suggested that sex differences in 2D:4D ratio may differ as a function of ethnicity, we also ran a two (male/female) by four (white/black/Asian/Indian; see Footnote 3) factorial ANOVA to determine whether there was an overall effect of ethnicity on digit ratios, and whether there was a sex by ethnicity interaction. Although the self-identified black participants had numerically lower 2D:4D ratios than the other three groups (a finding that is supported by the literature; e.g., Manning et al. 2004), this difference was not statistically significant, nor was the interaction (all F’s < 1.1; all p’s > 0.36). Additionally, there was no relationship between participant country and digit ratio, nor an interaction between country and sex (all F’s < 1.7; all p’s > 0.14). The potential implications of these findings are considered in greater detail in the “Discussion” section.
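The effect size for the sex comparison can be approximately reproduced from the reported summary statistics. The sketch below assumes a pooled-variance (equal-variance) t-test and uses the group sizes given in the caption of Fig. 2; because the inputs are rounded means and SDs, the recomputed t will only approximate the reported value, and its sign depends on which group is subtracted from which.

```python
import math

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))

def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference (group 1 minus group 2)."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)

def t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Equal-variance independent-samples t and its degrees of freedom."""
    se = pooled_sd(sd1, n1, sd2, n2) * math.sqrt(1 / n1 + 1 / n2)
    return (m1 - m2) / se, n1 + n2 - 2

# Rounded summary statistics from the text (females first, then males).
d = cohens_d(0.992, 0.082, 247, 0.986, 0.079, 279)
t, df = t_from_summary(0.992, 0.082, 247, 0.986, 0.079, 279)
```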

Fig. 2 Distribution of 2D:4D ratio as a function of sex (Nmale = 279; Nfemale = 247)

Individual differences and 2D:4D ratio

To address the secondary goal of this study, we ran a series of exploratory ANCOVAs, with sex as a fixed factor and 2D:4D ratio as a covariate (see Footnote 4), to examine the relationships between digit ratio and our psychological and environmental measures. The frequencies, means, and standard deviations for these measures are presented in Supplementary Tables S1 and S2.

There was a marginally significant interaction between sex and 2D:4D ratio for the empathy variable, F(1, 524) = 3.78, p = 0.053, η2p = 0.01, such that males with higher digit ratios had lower empathy scores than males with lower digit ratios, F(1, 279) = 7.51, p = 0.007, η2p = 0.03, whereas females did not differ in empathy as a function of digit ratio, F(1, 244) = 0.01, p = 0.93, η2p < 0.001 (see Fig. 3).
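The within-sex follow-up tests can be thought of as simple regressions of the outcome on digit ratio, run separately for males and females. The sketch below shows one such regression with its F statistic; it is an illustration under that assumption, not a reproduction of the ANCOVA reported here, and the data points are invented.

```python
from statistics import mean

def simple_slope(x, y):
    """OLS slope of y on x with its F statistic on (1, n - 2) degrees of freedom."""
    n, mx, my = len(x), mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    slope = sxy / sxx
    ss_regression = slope * sxy          # variation explained by the line
    ss_residual = syy - ss_regression    # variation left unexplained
    f = ss_regression / (ss_residual / (n - 2))
    return slope, f, (1, n - 2)

# Invented (ratio, empathy) pairs for one sex, for illustration only.
ratios = [0.92, 0.95, 0.97, 1.00, 1.02, 1.05]
empathy = [33, 31, 32, 29, 28, 27]
slope, f, df = simple_slope(ratios, empathy)
```

A negative slope in one group alongside a slope near zero in the other is the pattern that a sex by ratio interaction term is designed to detect.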

Fig. 3 Empathy as a function of 2D:4D ratio and sex. Shaded area depicts the standard error of each condition mean

For pro-environmental behavior, there was a main effect of sex (see Footnote 5), F(1, 526) = 4.68, p = 0.03, η2p = 0.10, such that males performed fewer pro-environmental behaviors than females. There was also a significant sex by 2D:4D ratio interaction, F(1, 526) = 4.46, p = 0.04, η2p = 0.09, such that females showed a positive relationship between 2D:4D ratio and number of pro-environmental behaviors (i.e., as ratio increased, females reported more behaviors), whereas males showed a negative relationship between 2D:4D ratio and pro-environmental behaviors (i.e., as ratio increased, males reported fewer behaviors; see Fig. 4). There was no relationship between 2D:4D ratio and environmental attitudes (i.e., either NEP scores or environmental locus of control; all F’s < 1.18, all p’s > 0.255, all η2ps < 0.003).

Fig. 4 Pro-environmental behavior as a function of 2D:4D ratio and sex. Shaded area depicts the standard error of each condition mean

Discussion

Characteristics of online 2D:4D measurements

The first key finding was that expert assessments of both image quality and 2D:4D ratio were highly reliable, with intraclass correlations exceeding 0.88. Caswell and Manning (2007) have noted that self-reported 2D:4D measurements tend to include a considerable degree of measurement error, which in turn increases the difficulty of identifying relationships between digit ratio and other variables of interest. In particular, they note that the sexual dimorphism of 2D:4D is much smaller in internet-based studies, likely due to this added error variance. By asking participants to provide an image of their hand, rather than complete a self-measurement of their digits, some of this measurement error can be eliminated (although additional error sources are introduced). Given the recent proliferation of online studies, either due to the benefits of conducting such research or out of necessity given the current pandemic, this finding provides promising evidence for the utility of this methodology going forward.

Although one source of measurement error was reduced by using our approach rather than having participants measure their own digits, there was substantial variation in the quality of the images submitted by participants. For example, a full 23% of the submitted images were classified as “poor” quality, meaning that digit ratio could not be reliably estimated. These images were rated as poor quality for a variety of reasons, but primarily because they were too blurry for the expert rater to be able to identify the appropriate features for measuring digit length. Interestingly, there was a significant relationship between image resolution and image quality, such that images that were rated as high-quality for assessing 2D:4D had a resolution of between 8 and 12 megapixels, whereas images lower than 5 megapixels were typically rated as low quality. In addition, while most participants provided images that were taken with a mobile device, those who used a tablet submitted a higher proportion of poor-quality images; we speculate that this may be attributable to the difficulty of holding a tablet with one hand to capture an image of the other hand. As such, researchers who use this method going forward may wish to consider imposing restrictions on both the resolution of submitted images and the types of devices that participants use to take the images.

The second key finding was that, although estimations of digit ratio were reliable, there was a great deal of variability in 2D:4D ratio in this sample. Indeed, scores ranged widely from 0.781 to 1.23. This range is somewhat outside that recommended by Manning et al. (2007), who suggested that the majority of 2D:4D values should fall between 0.80 and 1.20 and that extreme values outside of this range might be due to measurement error (and thus, when using self-measurements, should be discarded). An additional finding was that although males in our sample had numerically lower mean 2D:4D ratios than females (Fig. 2), this difference was not statistically significant. One reason for this null result may be that while the mean female 2D:4D ratio was in line with those reported in previous studies (see Hönekopp and Watson 2010 for a meta-analysis), the males in our sample had larger 2D:4D ratios than are typically found. Although there are strong links between 2D:4D and sex, it should be noted that the magnitude of sex differences in 2D:4D is highly variable, with typical effect sizes ranging from quite small (d = 0.09) to moderate (d = 0.61; Hönekopp and Watson 2010). As such, it is possible that the lack of a sex effect here is simply attributable to the sample collected and is not indicative of a wider problem with using participant-submitted images for measuring 2D:4D. Indeed, the effect size obtained in the current study (d = 0.07) was outside of the range reported in Hönekopp and Watson (2010) and is therefore atypical for a study on sex differences in 2D:4D. More research is clearly needed into the utility of using submitted images for studying digit ratio in online studies. In particular, we encourage a validation study that directly compares the image method described here with more traditional methods for estimating 2D:4D (direct, indirect, self-report).

One important limitation that should be noted is that a high proportion (31%) of our online sample did not correctly follow instructions for submitting a legible photo of their right palm, despite very detailed instructions that had been field-tested prior to launching the survey. This noncompliance was not specific to any one country or age group (data not shown); thus, it is unclear why this issue occurred with this sample. This finding does suggest that researchers should take caution when collecting data from online samples, even when participants receive financial compensation or another incentive, as the quality of participant pools may vary considerably.

2D:4D and pro-environmental attitudes

In addition to examining the feasibility of measuring 2D:4D ratio from crowdsourced images, we were also interested in examining how digit ratio relates to environmental attitudes and behaviors. To explore this question, we examined the relationships between 2D:4D and performance on a number of measures that have previously been shown to differ between males and females: dispositional empathy and pro-environmental attitudes and behaviors (measured using the NEP, environmental locus of control, and an environmental actions scale). Unexpectedly, we found a marginally significant interaction between 2D:4D and sex for empathy, such that males with higher (more feminized) 2D:4D ratios showed less empathy than males with lower (more masculinized) ratios, whereas there was no such relationship between 2D:4D and empathy in females. This finding was contrary to what we had anticipated, since larger digit ratios are indicative of less exposure to prenatal testosterone and are usually linked to more “feminized” behaviors. As empathy is typically higher in females as compared to males, it was expected that higher digit ratios, regardless of sex, would be associated with increased empathy.

In addition, we found an interaction between digit ratio and engagement in pro-environmental behaviors. Specifically, for females a larger 2D:4D ratio was associated with greater engagement in pro-environmental behaviors, whereas for males a larger ratio was associated with fewer pro-environmental behaviors. For the female participants, this finding was expected, but the finding with the males was again contrary to our expectations. However, this result agrees with Hand (2020), who found a similar pattern whereby males with larger digit ratios were less likely to engage in pro-environmental behaviors (although Hand did not find any relationship between female digit ratio and pro-environmental behavior). It is unclear why this pattern of results emerged, but at the very least it suggests that the relationship between sex and pro-environmental attitudes and behaviors may be more complex than originally thought. Further research is necessary in order to explore the relationship between prenatal androgen exposure and environmental engagement.

Although we did find a relationship between digit ratio and pro-environmental behaviors, we did not find a similar relationship between 2D:4D and environmental concern (a measure of environmental attitudes; i.e., the NEP). This may be attributable to a ceiling effect, given that this sample had a pro-ecological worldview (mean NEP score 5.2/7; SD = 1.03). This speculation is supported by the absence of the expected main effect of sex (Arcury and Christianson 1990; Blaikie 1992; Maineri et al. 1997; McCright 2010), F(1, 522) = 0.24, p = 0.63, η2p < 0.001. Alternatively, it may also be the case that environmental concern or attitude does not necessarily map onto pro-environmental behaviors. For example, a recent study found only a weak relationship between pro-environmental intent and objective reductions in greenhouse gas emissions from 3369 Swedish households (Enzler and Diekmann 2019), with income emerging as the strongest predictor of pro-environmental behavior.

In addition, we did not find a relationship between 2D:4D and environmental locus of control. Given that females are more likely to engage in pro-environmental behaviors, and given that dispositional internal locus of control is associated with more pro-environmental attitudes (e.g., Baird et al. 2020), we anticipated that larger (feminized) 2D:4D ratios would in turn be associated with greater internal environmental locus of control. One explanation for the null finding here may be that we measured environmental locus of control, rather than dispositional locus of control. Interestingly, dispositional locus of control has previously been shown to associate with 2D:4D (Richards et al. 2015), such that higher 2D:4D ratios were associated with greater external locus of control, although this finding was for females only.

These examinations into the relationship between digit ratio and pro-environmental attitudes and behaviors were exploratory in nature, but the results suggest that some of the sex differences in pro-environmental behavior may be due to biological factors such as prenatal exposure to androgens. The current load of environmental challenges has been described as an existential crisis for humanity (e.g., IPCC 2018), and while wider political and social factors are key to meeting these challenges, human biological factors may also play a role in mediating our response at the individual level.

Limitations and other considerations

There were some key limitations to this study that must be considered. First, although the method of calculating 2D:4D from user-submitted images reduces the type of random measurement error that is typically found when participants self-measure their digits (Hönekopp and Watson 2010), it introduced a different source of error due to the quality of the submitted images. For example, as noted above, 23% of the submitted images were classified as poor quality and were therefore unusable. Of those images that were usable, there were also variations in image quality. These variations in turn may have affected estimates of digit ratio. Indeed, there was a small, but significant, difference in digit ratio as a function of image quality, such that moderate-quality images had a slightly higher ratio than did good-quality images, F(1, 584) = 4.02, p = 0.03, η2 = 0.01. As such, image quality is an important consideration for conducting online studies of digit ratio.

In addition, camera focal length varied across participants, thereby potentially leading to distortions that would have affected the length of the digits in the image. For example, images with short focal lengths become compressed, thus potentially affecting the length of the digits, particularly if the digits are not evenly pressed on a hard surface (e.g., Banks et al. 2014). Indeed, this lack of control over focal length may have, in part, contributed to the wide range of 2D:4D values that were obtained in this study, and to the null relationship between sex and digit ratio. Similarly, the angle of the camera lens relative to the hand could not be tightly controlled across participants, and this may have affected 2D:4D values.

Additionally, we expect there were variations in how much pressure participants exerted when placing their hand on a surface before photographing. These differences in pressure may in turn have led to subtle elongations of the fingers, which could affect digit ratio. A previous study that compared 2D:4D ratio estimates taken directly using calipers to estimates taken indirectly using photocopied images showed that the photocopies tend to yield lower digit ratio estimates, as well as longer digit measurements (Manning et al. 2005). The authors hypothesized that this effect may be partially due to variations in the fat pads and curvature of individual fingers and how light reflects off of them. As such, these variations may have similarly affected the images in this study. Future studies exploring the utility of assessing 2D:4D from user-submitted images should consider how variations in the way images were captured can alter digit ratio.

A second limitation is that although we were able to evaluate the intra-rater reliability of measuring digit ratio via user-submitted photographs, we were unable to assess the accuracy/validity of this method. As such, while our method may have yielded consistent estimates, they may not have been accurate estimates of actual digit ratio. Indeed, inaccuracies in assessing digit ratio may explain the lack of sex differences in this sample. Thus, future research should examine the differences between indirect estimates from user-submitted images and direct estimates using calipers to determine whether measurements from user-submitted photographs provide an accurate representation of actual digit size. We also recommend further investigation of the reliability of the 2D:4D ratio estimates by examining a greater number of images across multiple assessors.

A third limitation was that, ultimately, our sample included data from only 586 participants. Although a sample of this size provides good power to find even small effects, sex differences in digit ratio are highly variable (Hönekopp and Watson 2010). As such, even small errors in measurement can greatly impact digit ratio estimates and the ability to find an effect. A larger sample might thus be necessary when examining the relationship between 2D:4D and attitudes or behaviors. For instance, the oft-cited BBC study on sex and digit ratio included over 240,000 participants (Manning et al. 2007), thereby reducing the potential impact of measurement error on ratio estimates (Caswell and Manning 2007).
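The dependence of the required sample size on effect size can be sketched with a standard normal approximation for a two-sided comparison of two independent groups. This is an illustrative calculation under assumed settings (alpha = 0.05, power = 0.80, equal group sizes), not a power analysis conducted for this study; the approximation slightly underestimates the values given by an exact t-test calculation.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate participants needed per group to detect effect size d.

    Uses the normal approximation n = 2 * ((z_alpha/2 + z_power) / d) ** 2
    for a two-sided, two-sample comparison with equal group sizes.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


# Effect sizes at the two ends of the range reported by
# Hönekopp and Watson (2010): d = 0.09 and d = 0.61.
for d in (0.09, 0.61):
    print(d, n_per_group(d))
```

Because the required n scales with 1/d², effects near the lower end of the reported range demand far larger samples than effects near the upper end.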

Conclusion

Given the recent proliferation of studies that use online data pools, it is desirable to develop a more robust method for collecting physiological data remotely. As such, the primary purpose of our study was to examine whether 2D:4D could reliably be measured from online samples using participant-submitted hand images. Ultimately, we were able to show that although this method yields reliable estimates of 2D:4D (i.e., intra-rater reliability), the estimates themselves were highly variable and dependent upon the quality of the submitted images. We encourage further research on the relationship between traditional measurements of 2D:4D and online methods for estimating digit ratio in order to determine whether participant-submitted images represent a valid approach to the current challenges and opportunities in this field.

In addition, this study examined the relationship between 2D:4D and empathy, pro-environmental attitudes, and pro-environmental behaviors. Pro-environmental attitudes/behaviors are strongly linked to sex, such that females tend to engage in more pro-environmental behaviors and report significantly greater concern for the environment compared to males. As such, it follows that this sex difference may at least in part be attributable to neurophysiological differences between males and females due to prenatal hormone exposure. We found an interaction between 2D:4D and sex for our measures of empathy and pro-environmental behavior. Although we found the expected relationship between digit ratio and pro-environmental behavior in females, such that larger digit ratios were associated with engagement in more pro-environmental behaviors, we unexpectedly found the opposite result with male participants, whereby males with larger (i.e., more feminized) digit ratios had lower trait empathy and engaged in fewer pro-environmental behaviors. Additional research is required to further elucidate the complex relationship between empathy, pro-environmental attitudes and behaviors, and 2D:4D.

Availability of data

Data from this project will be available upon request.