Quantifying gendered participation in OpenStreetMap: responding to theories of female (under) representation in crowdsourced mapping

This paper presents the results of an exploratory quantitative analysis of gendered contributions to the online mapping project OpenStreetMap (OSM), in which previous research has identified a strong male participation bias. On these grounds, theories of representation in volunteered geographic information (VGI) have argued that this kind of crowdsourced data fails to embody the geospatial interests of the wider community. The observed effects of the bias, however, remain conspicuously absent from discourses of VGI and gender, which proceed with little sense of impact. This study addresses this void by analysing OSM contributions by gender and thus identifies differences in men’s and women’s mapping practices. An online survey uniquely captured the OSM IDs as well as the declared gender of 293 OSM users. Statistics relating to users’ editing and tagging behaviours, openly accessible via the ‘how did you contribute to OSM’ wiki page, were subsequently analysed. The results reveal that volumes of overall activity, as well as editing and tagging actions in OSM, remain significantly dominated by men. They also indicate subtle but impactful differences in men’s and women’s preferences for modifying and creating data, as well as in the tagging categories to which they contribute. Discourses of gender and ICT, gender relations in online VGI environments and competing motivational factors are implicated in these observations. As well as updating estimates of the gender participation bias in OSM, this paper aims to inform and stimulate subsequent discourses of gender and representation towards a new rationale for widening participation in VGI.


Introduction
Our engagement with maps has changed. Nowadays, if you order a taxi or a pizza, or use an app to navigate your way through an unknown urban district, the data that these location-based services use will not have been collected by a qualified surveyor tasked by a government-appointed agency and armed with the knowledge, skills and equipment to accurately locate and measure topographical features. Since the creation of Web 2.0 and the digital revolution in mobile computing, modern cartographic practices have changed. It is now more likely that this data originates from an online volunteered mapping platform, populated with 'crowdsourced' geospatial data. Despite the increasing influence of professional editors tasked by large corporations such as Apple and Microsoft to maintain the OSM database (Anderson et al. 2019; Brabham 2012), this data will likely have been derived from a remotely sensed satellite image, aerial photograph or GPS trace synthesized into cartographic data by an amateur mapper sitting at a desk in their home study. Empirical evidence also suggests that, overwhelmingly, the creators of this data, including those corporate editors, are young men with an interest in technology and the computer skills and knowledge to match (Budhathoki and Haythornthwaite 2013). This skewed participation model has subsequently been problematised by feminist GIS scholars on the grounds that it fails to represent the geospatial interests of the wider community, specifically women. However, these theories remain untested.

Gender representation in VGI
Recent socio-technological developments in computing have fostered a transformation in the way geospatial data is both produced and used. Transformations in digital technologies, coupled with the wireless connectivity brought about by Web 2.0, have effected an inherent interactivity (Flanagin and Metzger 2008) which lends itself to collaborative data models: distributed individuals connecting and sharing information online. This practice has extended to the collective production and consumption of geospatial data volunteered by distributed, non-expert individuals, through which the 'Geoweb', a virtual online assemblage of tools, data and practices, has been brought into existence. This process and the resulting data, the two often conflated as volunteered geographic information (VGI) (Goodchild 2007), can be understood as an expression of the growing and expanding 'wikification' of GIS (Sui 2008) and as a subset of the broader co-production of knowledge through the creation of user-generated content (UGC), or 'crowdsourcing' (Howe 2006).
VGI encompasses a range of mapping communities, data formats and knowledge types (Bittner 2017). However, it is the active contribution of geospatial data to online mapping projects which constitutes the focus of this work. The most successful (although conspicuously non-visible) example of this form of VGI, with over 4 million registered users, 1 is the online mapping project OpenStreetMap (OSM). Founded in 2004 with the mission of creating "a free, editable map of the world" (Mooney and Minghini 2016), the open source platform follows the peer production model that created Wikipedia (Haklay and Weber 2008), whereby non-experts contribute geospatial data by and for the use of other 'produsers' (Coleman et al. 2009).
The emergence of Neogeography in the mid-2000s was initially hailed as a democratising geospatial practice, seemingly promising access to both the production and consumption of a new form of geospatial information for anyone with a Wi-Fi connection. Neogeographers were resultantly ascribed a stake in the knowledge economy (McConchie 2015). Discourses around VGI therefore became synonymous with concepts of democratisation (Haklay 2013). However, research over the past decade has pointed to the limited capacity of this new geospatial practice to include the experiential geospatial knowledge of the wider global community, as demographic biases in participation have been identified. As well as by age and educational background, recent research has demonstrated that user-ship in OSM is strongly skewed by gender (Budhathoki and Haythornthwaite 2013; Coleman et al. 2009; Budhathoki et al. 2010; Haklay and Budhathoki 2010; Elwood et al. 2012; Stephens 2013), with 95-98% of all contributions to OSM being produced by men (Steinmann et al. 2013; Stephens 2013; Stephens and Rondinone 2012), who are often young and technologically enabled (Budhathoki and Haythornthwaite 2013). Contributors to OSM thereby comprise a demographic cohort that is particularly skewed. The identification of this participation bias has prompted leading critical and feminist GIS scholars to challenge the notion of democracy within VGI practices (Elwood 2008; Haklay 2013). Critical GIS thinkers have argued that maps (including those created by VGI) can never claim to be entirely impartial, as objectivity would require a value-free view from nowhere (Haraway 1991). Maps are ultimately embodied subjects that cannot be considered separate from the people that created them (Haraway 1988, 1991; Pavlovskaya and Martin 2007). They consequently reflect the norms, assumptions, traditions and political biases of the map-makers themselves (Harley 1989), in this case the contributors to VGI. These critiques ultimately question the notion of 'the crowd' in crowdsourced geospatial data models.
These critiques are perhaps rooted in the theoretical stance of the feminist critic Donna Haraway, who states that, as it is inherently imbued with the intent and context of the subjects that produce it, knowledge cannot be separated from its creators (Haraway 1991). Adopting this theoretical framework, some have subsequently asserted that due to the failure of crowdsourced mapping projects to represent the interests of the wider 'crowd' (Elwood 2010;Leszczynski and Elwood 2015), VGI has been unsuccessful in its bid to democratise geospatial practices (Haklay 2013).
Not only is this pertinent to the nature of the data itself but also to the technological dictums through which geospatial knowledge is shaped, for example the way online geospatial data is indexed (Zook and Graham 2007). Both the hard and soft computational infrastructures can be implicated in these processes, which lend themselves to certain cognitive traits, modes and spheres. As Elwood (2008) states: "when the epistemologies, vocabularies, and categories of data structures do not or cannot encompass the experiences, knowledge claims, and identities of some social groups or places, this produces their underrepresentation in digital data" (178). This view has also been articulated in the context of wider uneven geographies of access, research on which reveals that when only the specific perspectives of a particular demographic are shared, certain people and places remain hidden and invisible (Graham 2010; Graham et al. 2014).
Based on these potential exclusions, there is an inherent assumption as well as an explicit claim within critical discourses of VGI that data quality is compromised on the grounds of biased representation in the data. That is, that the interests of certain groups (manifest as geospatial features) are under-represented and others over-represented as a reflection of the structure of the demographic group that participate in the creation of the data. These imbalances in participation therefore contradict the nature of VGI as a tool for democratising access to and the production of geospatial data so hotly anticipated at its outset. Its credibility as a source of geospatial data and ultimately its scope is therefore constrained.
These discourses of VGI and representation therefore expose the potential implications of the bias, which are problematic on these grounds. However, they do so with little empirical grounding with regard to the impact of demographic biases on the topographical data that is and isn't volunteered, and therefore on what is and isn't visible. If, as critical GIS discourses of gender and VGI propose, the crowdsourced map is a reflection of the geospatial interests of those that create it, then, given the particular strength of the gender imbalance in participation, the interests of women specifically are repeatedly excluded by the process. Scholars have argued that this has led to gendered user-generated representations of the lived experience of only a small segment of users (Stephens 2013). This being the case, the impact of the underrepresentation of [the interests of] other demographic groups, such as the elderly or specific racial or ethnic groups, must also be considered, and the question of what else is missing posed. Several empirical studies have found that demographic participation biases in crowdsourced geospatial data in the context of nationality (as a proxy for cultural differences) (Comber et al. 2016; Mullen et al. 2014), culture (Quattrone et al. 2015) and cultural-religious affiliation (Bittner 2017) affect the nature of what is contributed or volunteered. The role of gender in shaping geospatial contributions has also been recognised (Gardner and Mooney 2018a; Gardner and Wardlaw 2018; Stephens 2013). Subjective perceptions of environment within crowdsourced data have also been shown to impact on the contribution of geospatial data (Solymosi et al. 2018). These collective findings therefore support the notion that demographic participation biases in crowdsourced data collection models impact on the nature of the data. In short, 'who' contributes the data matters (Brown 2017; Comber et al. 2016).
The direct impacts of the gender bias, that is, the specific nuances of how, remain woefully under-articulated within the limited body of work on gender dimensions in VGI, an observation that is also made in relation to the wider spectrum of demographically biased participation in crowdsourcing geospatial data (Haklay 2016; Quattrone et al. 2015). The fem2map 2 research project conducted by the Technical University of Vienna aimed to elevate female participation in VGI and resulted in several findings regarding female engagement in VGI (see below). Since then, scholarly interest in gender and VGI has been conspicuously absent and seemingly abandoned. Instead, limited evidence, both anecdotal and empirical, suggests that current debates around gender in VGI play out predominantly in online forums and are focused on overcoming hostility and creating acceptance of diversity within male-dominated virtual online mapping environments, 3 the prospect of which has been identified as a factor in alienating women from adopting technology more generally (Sørenson 2002). These efforts perhaps deflect attention away from attempting to understand the real value of widening participation, a step-change which is inherent within the rhetoric around gender and VGI. Current discourses of gender and VGI therefore proceed with little sense of impact. A deeper sense of these effects would work to empirically underpin the claims around gender and representation in VGI and focus the debate on the value to the map of gender inclusion and widening participation more generally. In short, revealing the cartographic effects of the gender participation bias would facilitate a clearer understanding of the value of male and female participation, and therefore provide the basis of a rationale for widening participation beyond its current focus on equitable opportunity. By identifying gender differences in editing and tagging preferences, this paper aims to contribute to this advance.

Gendering OSM contributions
The impacts of the gender participation bias in OSM and VGI more broadly cannot be identified without knowledge of the nature and characteristics of female participation. Broader quality issues of crowdsourced data, including its inherent biases, continue to occupy a central focus within discourses of VGI (Basiri et al. 2019). However, the gender participation bias and its effect on the data have evidently failed to receive the same attention. This dearth of knowledge can be largely attributed to the lack of availability of the demographic data of contributors, which is either unrecorded (e.g. in OSM) or inaccessible due to data protection issues (e.g. in Google MapMaker). Previous analyses of OSM contributor behaviours have therefore either been performed in isolation from contributors' real-world recorded actions or have relied on inferred socio-demographic indicators to evaluate their impact on data quality (Bittner 2017; Mullen et al. 2014; Quattrone et al. 2015). Making sound inferences requires a sample that is representative of the underlying population (Jensen and Shumway 2010). Despite the value of these studies in surmising effect, without knowing the unique demographic profile of users they can only ever offer a limited understanding of the direct impacts of participation biases on VGI processes.
Capturing the demographics of those that have created the data is therefore essential to making any kind of analysis of user behaviours based on these aspects. This relies on the active contribution of participants through direct surveying. However, for reasons explored in the "Sampling and analysing OSM users and their behaviours" section, and with the exception of Budhathoki and Haythornthwaite (2013), which linked user demographics to editing preferences, this methodological approach has been limited. Recent data protection regulation for the EU in the form of the General Data Protection Regulation (Georgiadou et al. 2019) presents an additional and particularly pertinent challenge to accessing OSM user demographic data, as research has demonstrated a significant European bias in active contributors (Budhathoki and Haythornthwaite 2013; Gardner and Mooney 2018b). 4 Although several studies have employed online surveys to analyse behaviour and motivational factors (Stephens 2013; Stephens and Rondinone 2012), none have performed these analyses in relation to users' demographic profiles, which is essential for evaluating the relationship between real-world behaviours and demographics. These studies thus offer more limited accounts of gendered behaviours, and in more generalised VGI domains.
Through a novel methodology, this study elevates our understanding of gendered mapping behaviours, by directly linking users' map edits (as recorded in the OSM full planet history file and subsequently collated on the 'how did you contribute to OSM' wiki pages) to their declared gender, captured by an online survey. This study therefore genders OSM contributor behaviours, which not only distinguishes it from other analyses of contributor activity, but also significantly furthers scholarly understanding of the gender differences in VGI participation and therefore of some of the impacts of the demographic participation bias, hitherto unreported. In doing so, this paper provides an empirical response to a need to identify the cartographic impacts of the gender bias and consequently an experiential foundation upon which future discourses of representation in VGI might proceed. Through this, as well as updating recent estimates of gendered participation in OSM, it contributes to discourses in demographic biases and gender dimensions in VGI as well as citizen science more broadly.
The paper continues by considering recent discourses in gender dimensions of VGI, to which broader gender and ICT literatures are relevant. This is followed by an account of the methodological approach to the study and the results sections which report gender differences in volumes of activity, 'modes' of contribution and tagging preferences. A discussion of these results follows, before a summary of the findings, concluding remarks and recommendations for further analyses.

Understanding female participation in VGI
The identification of gendered participation in VGI led to a small body of research which has sought to understand aspects of female engagement in these practices. The majority of this work was a result of the aforementioned fem2map project and manifested in two main strands of inquiry. The first examined women's involvement in active VGI projects through the broader lens of women's participation in UGC platforms, comparing social networking sites such as Facebook and Twitter with those with a more spatially explicit focus, i.e. OSM. Using this framework, Steinmann et al. (2013) observed an inverse relationship between female participation and sites with increasing geospatial content. These findings support earlier work by both Budhathoki et al. (2010) and Stephens and Rondinone (2012), the latter of which found that despite equal gender representation on sharing sites such as Picasa and Flickr, there was a substantial drop in female contributions of geospatial content (e.g. a geotagged photo on a social networking site) and an even further reduction of participation in sites of cartographic production (i.e. OSM and Google Maps).
This trend also correlates with observed models of co-production in the creation of specific specialist knowledge online. Both Lam et al. (2011) and Cohen (2011) observed a strong male bias in both contributions and content to the online encyclopedia Wikipedia, suggesting that topics of greater interest to women, as well as their perspective on events and issues, are less prevalent amongst Wikipedia's topics and subsequently that its content is gendered (Stephens 2013). Seemingly, the more technical or expertise-driven the 'volunteering' aspect, the more gender skewed the relationship becomes in favour of men.
The sharing element of social networking sites, which studies have found correlates with higher levels of female participation (Hampton 2011; Stephens and Rondinone 2012), is a key component of female engagement in UGC. In their study of women's motivation to contribute to VGI, Steinmann et al. (2013) found that where there was simply an option to create content with no feedback, even where there was no spatial dimension, female participation levels were significantly lower. The lack of interaction with other users was also cited by a quarter of respondents as a reason for abandoning OSM contributing. These collective studies evidently reveal that women are less likely to contribute to UGC platforms where there is an explicit geospatial dimension and also where there is an opportunity to contribute what might be perceived as 'expert' knowledge. Women are also seemingly more motivated than men by 'sharing' opportunities in UGC online environments.
Insights into these nuanced digital divides can be provided by considering both barriers to and motivations for female engagement with VGI projects. Research by  and , which explored motivations for and barriers to contributing to VGI projects respectively, support the notion that perceptions around the requirement of specialist and complex knowledge or skills sets negatively influences women's participation in spatially explicit VGI projects such as OSM. Considering these aspects in the context of gendered engagement with ICT more broadly offers further understanding of women's engagement with VGI. Previous research has attributed lower female engagement with ICT to socio-economic inequalities. Liff et al. (2004) and Gilbert et al. (2008) both found that women with lower incomes are more often tasked with childcare and household responsibilities which leaves less time and perhaps inclination for engagement with technology. The publication of this research prior to the digital revolution in mobile technologies, which has undoubtedly facilitated both men's and women's access to and opportunity to engage with UGC online, is acknowledged. However, more recent research has again implicated women's lifestyles in this relationship. Both  and  found that women perceived OSM editing as a time-consuming activity. Therefore time factors featured significantly as both a barrier to female participation ) and a motivation for women when asked to consider a return to OSM contributing .
The perception of caring and nurturing as particularly female qualities has also been implicated in women's participation in VGI. Specifically, women display higher levels of motivation to participate in these practices where they have an intrinsic, positive outcome (M. Schmidt and Klettner 2013), for example through the contribution of data which facilitates the support of marginalised groups. This notion is confirmed by recent work on participation in crowdsourced humanitarian mapping, which demonstrates significantly higher female participation rates (Gardner and Mooney 2018a; Humanitarian OpenStreetMap Team 2017). Humanitarian mapping organisations such as Humanitarian OpenStreetMap (HOTOSM), Crowd2Map, Missing Maps, GAL and YouthMappers prescribe specific mapping tasks to support sustainability, development and disaster response. Significantly, these projects also demonstrate inherent gendered objectives. For example, Geochicas and GAL encourage the mapping of facilities and amenities at a local, regional and national scale, which improve the living conditions, specifically of women, through various technology projects, public data imports and participatory mapping. 5
Empirical efforts to understand the causes of female participation in VGI, or the lack thereof, have been prioritised over attempts to decipher the direct effects of the bias. This may in part be attributed to the issues around surveying contributors and access to their demographic data, highlighted earlier. Stephens' (2013) work is an exception. Focusing on the feature tag approval process in OSM as well as the process of verifying local edits in Google MapMaker, Stephens (2013) found that the gender participation bias impacted on the range of feature tags available, in that there were fewer options for accurately tagging female interest (compared to male interest) amenities. Given the way spatial-semantic interaction can determine the way feature types are proposed (Mülligann et al. 2011), and as men feature overwhelmingly as the main participants in these practices, they subsequently serve as the gatekeepers to local knowledge, which, Stephens (2013) asserts, results in gendered representations of user-generated geospatial content. Given the widespread reliance of location-based service provision on crowdsourced geospatial data, Stephens (2013) argues that these subjective versions of the world are endlessly reproduced. She concludes that this reflects both a wider undervaluing of feminized spaces, also demonstrated in research by Lawson (2007) and Pratt (2003), and an underrepresentation of women's experiential knowledge in online geospatial domains, starkly illustrated in The Abortioneers Blog (2011).
The implication of the participation bias being identified is a skewed cartographic representation of the world, i.e. that of the dominant producers, who are overwhelmingly male: what might be perceived as female interest amenities become less visible than those of the dominant contributor group, i.e. men.
There is also an implied assumption within these discourses that this biased representation is problematic. Despite Stephens' (2013) efforts to identify the material ways in which the gender disparity in participation impacts on the map, to say that these effects remain under-articulated is to understate the value of identifying them.

Sampling and analysing OSM users and their behaviours
This analysis of gendered OSM editing behaviours entailed a three-step methodology: (1) An online demographic survey of OSM users; (2) recording of respondents' OSM editing statistics; (3) a statistical analysis. This process is detailed below.
In August 2017 an online survey was launched via the Bristol Online Surveys tool. 6 The questionnaire comprised six closed questions which captured respondents' OSM username and five demographic indicators (gender, age, educational background, location and country of origin). Disseminating the survey as widely as possible was considered imperative to maximise participation. This decision was made mindful of the potential sampling limitations of web-based surveys related to both coverage and non-response errors (Dillman and Smyth 2007), as well as contributors' increasing disinclination to participate (a phenomenon referred to as 'survey fatigue'), and was informed by the response rate (just over 1%) to an earlier user survey conducted by Budhathoki (2010). Given the singular interest in OSM contributors, the link to the survey was distributed through both the OSM user diary entries (visible to all OSM members) and five English language talk mailing lists. 7 It was accepted that latent users will not have received or read the message due to inactivity or inactive email addresses, which could thereby contribute to non-response bias (see "Response rates, non-response bias and representativeness" section). In an effort to minimise non-participation bias, an initial OSM diary entry introducing the research study was published by the author several weeks prior to its launch, followed by an additional entry 4 weeks post-launch encouraging further participation. Advance notification of survey launches as well as interim reminders have been shown to increase participation rates (Kaplowitz et al. 2004). Participation was also incentivised with a £15 shopping voucher for sixty randomly selected participants. 8 The survey remained open for 8 weeks.
Capturing respondents' OSM usernames enabled access to a set of pre-collated individual user statistics openly accessible via the 'how did you contribute to OSM' webpage. 9 This data, which records statistics about each user's activity, editing and tagging actions, is collated by OSM community member and commentator Pascal Neis. 10 Despite its limitation in largely excluding the contextual nature of OSM contributions, its quantification of particular types of edits facilitates the analysis of user preferences, essential for locating differences in both the volume and nature of what men and women do in OSM, and therefore identifying and evaluating their contribution. Nineteen variables, each of which serves as a quantitative measure of OSM contributor behaviours, were recorded for each user. These were organised into three categories (activity, editing and tagging) and are summarised in Table 1. Analysing these different types of edits (nodes, ways and relations) and modes of contribution (whether they were created, modified or deleted) supports the estimation of gendered roles in affecting degrees of coverage and completeness (by considering volumes of created data), as well as men's and women's relative contribution to levels of geometric and thematic accuracy (by considering the volumes of deletions and modifications). Identifying statistical differences in the tagging categories to which men and women contribute may inform potential areas of thematic over- or under-representation and, consequently, strategies for widening participation.
6 http://www.onlinesurveys.ac.uk, a free-to-use tool for registered staff at UK higher education institutions.
7 As the survey was written in English (which is also the accepted default language of OSM), it was disseminated to the mailing lists for English-speaking countries/regions where these existed. These were UK, USA, Africa, Australia and the global generic 'talk' mailing list.
Given the non-parametric distribution of the data (see "Response rates, non-response bias and representativeness" section), a Mann-Whitney U test, which compares mean ranks (as a means of normalising the distributions), was performed on each of the variables. By comparing mean ranks, the Mann-Whitney test also allows users' contributions to be explored in a way which suffers less from the influence of outliers (Field 2018).
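The rank-based comparison described above can be illustrated with a minimal sketch in Python. It is not the software used in the study: the function names and the example values are hypothetical, and the p-value relies on a normal approximation without a tie correction, which is an assumption made here for brevity.

```python
from math import erfc, sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney(group_a, group_b):
    """Return mean ranks, U statistics and an approximate two-sided p-value."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    rank_sum_b = sum(ranks[n_a:])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = rank_sum_b - n_b * (n_b + 1) / 2
    # Normal approximation (assumption: no tie or continuity correction).
    mean_u = n_a * n_b / 2
    sd_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (min(u_a, u_b) - mean_u) / sd_u
    return {
        "mean_rank_a": rank_sum_a / n_a,
        "mean_rank_b": rank_sum_b / n_b,
        "u_a": u_a,
        "u_b": u_b,
        "z": z,
        "p_two_sided": erfc(abs(z) / sqrt(2)),
    }


# Hypothetical 'days active' values, invented purely for illustration.
men_days = [3, 40, 12, 500, 75, 9, 230, 18]
women_days = [2, 15, 6, 90, 4]
result = mann_whitney(men_days, women_days)
```

Because only the ordering of values enters the calculation, the single value of 500 in the hypothetical male group counts as one rank position rather than pulling the comparison towards itself, which is the robustness to outliers noted above.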

Response rates, non-response bias and representativeness
The survey attracted 326 responses. Although at < 0.008% this represents a fraction of the 4.3 million registered users, it can be contextualised by considering: (1) the actual number of contributors (rather than the number of registered users) and (2) participation models in UGC platforms, which also characterise contribution patterns in OSM (Haklay 2016). 11 (1) According to the OSM statistical records, the number of users that have made an edit to the map is 1 million. 12 This represents a quarter of registered users and recalibrates the response rate here to just over 0.03%. (2) UGC participation models propose that a small proportion of users (1%) create the vast majority of the content (90%) and vice versa. This is often described as the '90-9-1' rule or 'long-tail effect' (Haklay 2016; Hristova et al. 2013; Kittur et al. 2007; Neis and Zielstra 2012; Panciera et al. 2009; Priedhorsky et al. 2007). If this rule is applied to current counts of contributors and around 1% (around 100,000, i.e. the most active) were exposed to the survey, this gives a more realistic response rate of around 0.3%.
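The three response rates quoted above follow from dividing the 326 responses by successively smaller reference populations. A minimal sketch of that arithmetic is given below; the variable names are illustrative, and the population figures are the rounded counts stated in the text (4.3 million registered users, 1 million editing users and around 100,000 most active users).

```python
# Counts as stated in the text; the names are illustrative only.
responses = 326
populations = {
    "registered_users": 4_300_000,
    "users_with_an_edit": 1_000_000,
    "most_active_users": 100_000,
}


def response_rate_percent(n_responses, population):
    """Responses as a percentage of the reference population."""
    return 100 * n_responses / population


rates = {
    label: response_rate_percent(responses, size)
    for label, size in populations.items()
}

for label, rate in rates.items():
    print(f"{label}: {rate:.4f}%")
```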
Of the 326 responses, 49 identified as women, 272 as men and 5 declined to say. Because the study focuses specifically on gender, these last 5 responses were excluded from the analysis. A further 28 responses were excluded due to duplication, a user's absence of OSM activity, or inadvertent error or deliberate sabotage in the username field, which rendered the respective username undiscoverable in OSM. After these exclusions a sample of 293 responses remained: 38 from women and 255 from men. This ratio updates previous estimates of the male participation bias to 87%, a reduction of 9 percentage points from the 96% reported by Budhathoki and Haythornthwaite (2013). The increased female participation rate suggests that the survey attracted a greater response from the female OSM user cohort (possibly due to the focus of the survey on contributor demographics). This therefore represents either an over-representation of women in the survey sample, or their increased participation in OSM since earlier estimates. These figures may also indicate that women are more likely to subscribe to mailing lists and message boards and were therefore more exposed to the survey. However, as there is currently no requirement for users to disclose personal information, and as mailing lists require administrator access, it is impossible to verify these propositions.
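The sample composition reported above can be checked with a few lines of arithmetic. The sketch below uses only the counts given in the text; the variable names are illustrative.

```python
# Raw responses by declared gender, as reported in the text.
raw_responses = {"women": 49, "men": 272, "declined_to_say": 5}
total_responses = sum(raw_responses.values())

# Responses that declared a gender, before the 28 further exclusions.
gendered_responses = raw_responses["women"] + raw_responses["men"]
further_exclusions = 28

# Final analysed sample, as reported in the text.
sample = {"women": 38, "men": 255}
sample_size = sum(sample.values())

male_share_percent = 100 * sample["men"] / sample_size
female_share_percent = 100 * sample["women"] / sample_size

# Difference from the 96% male share reported by
# Budhathoki and Haythornthwaite (2013), in percentage points.
previous_male_share_percent = 96
change_in_points = previous_male_share_percent - male_share_percent

print(f"sample size: {sample_size}")
print(f"male share: {male_share_percent:.1f}%")
print(f"female share: {female_share_percent:.1f}%")
```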
To gauge non-response bias and therefore consider the representativeness of the sample, levels of respondents' activity were compared with those of the community as a whole, by comparing male and female figures for the number of days a respondent has been active since registration and the number of changesets they have completed (see Figs. 1 and 2). Despite the difference in sample size between men and women, the 'long tail' distribution was observed for each of the variables for both groups, therefore mimicking the overall contribution pattern of VGI projects discussed above and supporting the premise that both groups are representative of the overall population (although given the unavailability of users' demographic data there is currently no official means of quantifying this). Median ranking data displayed in Table 2 suggests that the male sample of respondents contains some of the highest ranking contributors in terms of overall edits made to the OSM project, whereas the female sample of respondents are ranked considerably lower. Given a community of over 4 million users, for both gender groups these rankings can still be deemed relatively high with the range of rankings comfortably in the top 1% (43,000). However, it remains impossible to know whether the sample contains the most active female users as gender is unknown for the missing ranks (as is true for the male sample).

Gender differences in volumes of activity
Despite demonstrating similar distribution curves for each of the activity variables (Figs. 1 and 2), the volumes of activity (as measured by the 'activity' categories listed in Table 1) were statistically higher for men at a 0.05 level of significance. The mean ranks show that men have statistically more 'days active' (158.94 compared to 66.86) and statistically higher numbers of changesets (156.26 compared to 84.83) than their female counterparts, even though the female respondents are themselves drawn from the most active cohort of female users.
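Mean ranks of the kind quoted above are produced by rank-based two-sample comparisons such as the Mann-Whitney U test. The test is not named in this section, so treating it as such is an assumption. The sketch below shows how mean ranks and the U statistic are obtained; the toy values are invented for illustration and are not the survey data, and the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mean_ranks_and_u(group_a, group_b):
    """Return (mean rank of a, mean rank of b, U) for two pooled samples."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    rank_sum_b = sum(ranks[n_a:])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return rank_sum_a / n_a, rank_sum_b / n_b, min(u_a, u_b)


# Invented toy values for 'days active' (NOT the survey data).
toy_men = [400, 250, 90, 30, 12, 700]
toy_women = [60, 20, 12, 5]
mean_rank_men, mean_rank_women, u_stat = mean_ranks_and_u(toy_men, toy_women)

# Consistency check on the published mean ranks: for N pooled observations,
# the size-weighted average of the two mean ranks must equal (N + 1) / 2.
n_men, n_women = 255, 38
pooled_days_active = (n_men * 158.94 + n_women * 66.86) / (n_men + n_women)
expected_pooled = (n_men + n_women + 1) / 2
```

The final check is independent of the toy data: weighting the published 'days active' mean ranks by group size recovers a value very close to (293 + 1) / 2 = 147, which is what a ranking of the 293 pooled respondents would require.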
Higher levels of activity were also observed for the editing and tagging variables (see Table 1). For all nine of the editing variables, the hypothesis that there was no difference between men and women was rejected at a 0.05 significance level. Statistically, men made more edits in each of the editing categories, demonstrating that men are more active as OSM editors of all combined edits (nodes, ways and relations). Results across the tagging categories also reveal that men participate in statistically more tagging than women at a 0.05 significance level. For all 19 variables therefore, men made significantly more contributions, revealing that not only is there a male bias in the number of contributors, but that male contributors are statistically more prolific than women in their contributing.

Gender differences in object types and modes of editing
However, analysis of the nine variables used to measure user editing preferences revealed no statistically significant differences at the 0.05 significance level between men and women in either the object types (nodes, ways and relations) through which users chose to represent cartographic data or in the modes through which they chose to edit the map (creation, modification or deletion of data). The results are also reflected when charting the frequency distributions. For both groups, 'nodes' dominated the objects of interest with a mean value of approximately 87% of activity dedicated to mapping point data (combined value for creations, modifications and deletions of nodes; see Fig. 3). The majority of both groups' remaining activity was focused on the editing of ways, with only trace values for editing relations. This set of results corresponds directly with figures for overall contributions to OSM, which record that around 90% of all contributions are made as nodes. 13 More specifically, of the 9 editing variables, it was the creation of nodes which dominated activity for both groups (see Fig. 4). Anecdotal evidence reports a widely held notion amongst OSM contributors that nodes are easier, simpler and quicker to contribute than ways (either as lines or closed polygons) or relations. These results therefore reflect either this assumption, or accepted modes of co-production for particular types of data within OSM in terms of the proportion of objects mapped as each respective type, although the statistics to support this are unavailable.
These correlations between male and female object editing are also observed in users' 'modes' of editing, i.e. the proportion of each user's edits that are newly created data ('creates'), 'modifications' to existing data or 'deletions' of existing data. Although the results reflect little difference between men and women in their preferences for modes of contributing, men demonstrated a slightly higher propensity to modify objects than women, whereas the reverse was true for the creation of new data. Both groups demonstrated similar values for deletions (see Fig. 5). These preferences correspond with findings elsewhere, which, at the national scale and in a humanitarian mapping context, observed a male focus on geometric accuracy (Gardner and Mooney 2018a). The present results extend those findings by demonstrating that the same preferences are observed at the global scale and in a more general mapping context. This observation corresponds with research conducted by Fisher and Margolis (2002) on gendered engagement with ICT, which revealed that men are more motivated by the experience of technology and technological processes, whereas women are more motivated by the intrinsic outcomes offered by participating in ICT.

Gender differences in use of tagging keys
As with the editing variables, no statistically significant differences were observed in the tagging categories to which men and women contribute. However, presenting the data in box plot form (Fig. 6) reveals differences between men and women in the proportions of tags made to the 'buildings' category, in that the interquartile range for women extends to a higher percentage (60%) than that for men (53%). This suggests that more of women's tagging work in OSM is in this category than is men's, although it is acknowledged that both men and women demonstrated similarities in their preference for mapping buildings.
The same is true for 'highways', and these two categories dominate both groups' tagging activities over the remaining 6 categories. This is also observed in the bar graph presented in Fig. 7, which provides a comparative view of the tagging categories to which men and women contribute. This combined preference for tagging 'buildings' and 'highways' reflects the general pattern of OSM contributing, in that the two categories are amongst the top three most commonly used tags. 14 'Address' and 'name' are also amongst the top level categories of tagging, which again suggests the sample is representative of the overall OSM community of contributors. Men instead demonstrate higher values for the 'highways' category (27.45% of the overall contribution by male users, compared to 25.04% by female users). Figure 6 illustrates how the distribution of contributions towards the 'highways' category among male users is more skewed towards higher percentages than among female users, although the distributions are not statistically dissimilar. This may be related to the levels of knowledge and skill required to map and subsequently label linear features such as roads, which are mapped as ways rather than nodes (i.e. single points). Given the inclusion in the dataset of all users' edits to OSM since registration, it may also reflect the juncture at which users joined the project, with women possibly arriving later to the OSM party. The road network was the initial focus of the project and one of the first thematic components to reach high levels of coverage (Barrington-Leigh and Millard-Ball 2017). If the proportion of female contributors has increased over time, women may have found much of the road network already completed (although this would only be true for the creation of new data). Changing the temporal parameters of the inclusion of users' edits may return a different set of results.

Discussion
This analysis has demonstrated several variations in men's and women's OSM contributing habits, measured by their levels of overall activity and their editing and tagging preferences. The principal finding is the stark difference in activity levels between men and women, denoted by comparatively higher volumes of contributions as well as time devoted to OSM activity, both of which are statistically higher for men than women. This is undoubtedly a function of gender participation rates, which this analysis finds remain strongly in favour of men at 87%.
The (recent) historical context of female engagement with computing generally may offer some insights into this observed imbalance. Prior to the development of modern-day ubiquitous mobile computing, research proposed that ICT was dominated by men (Bimber 2000; Tømte 2008; Wasserman and Richmond-Abbott 2005). Tømte (2008) revealed that men used computers and accessed the Internet more than women, had wider computer experience, spent more time online, reported greater interest in and more positive attitudes towards computer-related activities and even appeared to be more motivated to learn digital skills. Although in western communities there appeared to be no significant difference in access to computers and the Internet (Bimber 2000; Wasserman and Richmond-Abbott 2005), there remained a divide in the scope of male and female online activities, with women lagging behind in both opportunity to engage with ICT and frequency and scope of use (Wasserman and Richmond-Abbott 2005). These disparities have been linked to socio-economic inequalities, as women have traditionally fulfilled the roles of primary caregivers and been responsible for household duties, which have limited their online activities (Gilbert et al. 2008; Liff et al. 2004; Sørenson 2002).
Stephens (2013) found that these gender disparities in ICT were reflected in participation in UGC, including in VGI. If access is seemingly equitable and it is the scope of activities that differs, then an understanding of the more specific reasons for the bias in participation in OSM must be sought. Whereas women equal or even exceed male participation in UGC, studies of gender in VGI suggest that the need for increased geographical or technical knowledge and skills reduces women's participation in VGI. Despite acknowledging its opaque nature, Stephens (2013) provides convincing evidence for the role of perceptions around cultures of computing in perpetuating the gender divide. Several factors are each implicated in the creation of a technological culture that discourages women: (mis)conceptions about men's natural predisposition to technology and the hegemonic discourse of men as computer 'wizards' and women as disinterested users (Corneliussen 2010); inherently gendered technology which embodies masculine values, content that favours men, and sexual differences in cognition and communication (Bimber 2000); and the 'nerd culture' stigmatisation of computing, few female role models and negative stereotypes about the antisocial nature of these activities and those that choose to participate in them (Clark Hayes 2010).
Research that accounts for the vastly changed ICT landscape, which has developed exponentially since the publication of most of these findings, is required to substantiate these correlations. It follows that, given the computing revolution, patterns of engagement and use, including by gender, may also have evolved, not least because a vast majority of ICT activities can now be conducted through wireless and mobile technologies, which are much less restrictive than desktop computing and far more accommodating of differences in lifestyle, enabling users to engage almost without restriction. If the aforementioned factors remain valid, it is clear that computing has an image problem that dissuades and discourages women from engaging with certain Internet-based activities. In this eventuality, work to mitigate these barriers and increase participation equality is required.
However, anecdotal evidence (discussed at the outset of this paper) which reports hostile responses to women in online VGI environments suggests that this sense of male dominance in computing, which in the past has been implicated in women's aversion to the adoption of technology more broadly (Sørenson 2002), remains pervasive in online VGI domains and should not be underestimated as a deterrent to female participation. 15 Cultural practices within UGC systems which are sometimes aggressive and misogynistic, and which result in further alienation (Cohen 2011), evidently remain at play. Steinmann et al.'s (2013) findings, that women are discouraged from participating in UGC platforms where there is no opportunity for feedback, should also be qualified, as this anecdotal evidence suggests that negative feedback (a component of 'sharing') may also work to deter female participation.
These results may also reflect differential levels of technological knowledge and skills amongst men and women, a certain level of which, it is accepted, is required for OSM editing. This is confirmed by Capineri (2016), who states that the neogeographic revolution requires not only the possession of suitable technology (e.g. a smartphone) but also the capability and skills to capture location, view the resulting information and, importantly, share it on a map. Earlier research has demonstrated that women's perceptions of these skills as necessarily advanced may work to dissuade or discourage women from participating in these activities. In a study of latent OSM users, the complexity of editing was implicated for 40% of respondents in their abandonment of the practice, and the fear of doing something wrong featured for 30%. These figures may therefore indicate not only a lack of knowledge and skills but also a lack of confidence in these abilities. Both gender participation ratios and women's activity levels might therefore be interpreted as a reflection of the level of self-efficacy women feel they require in order to participate with impunity in what is perceived as a male-dominated activity. The potential for negative feedback in online VGI environments, discussed above, may work to destabilise this.
Differences in educational attainment between male and female respondents could support this notion of women's heightened sense of capability. Additional analyses of the survey data reveal that 95% of female respondents were educated to at least degree level, compared to 73% of men (Gardner and Mooney 2018a). This is most starkly observed in comparative levels of post-graduate education: 54% of female participants hold a post-graduate qualification, while the same figure for men is 21%. At postdoctoral level the figures are 15% and 7% respectively. These figures show an inverse relationship between numbers of male users and increasing educational attainment. Put simply, a higher proportion of women are more highly educated than their male counterparts, which educationally is an atypical trend. Evidence seemingly suggests that women require both a sense of confidence in their abilities and a level of resilience to overcome potential hostility in online VGI domains. However, these aspects must not be overstated or implicated without further empirical support.
Women's confidence in their knowledge and skills may also be conveyed through the editing results, which report subtle gender differences in modifying and creating data. Modifying data, which is arguably more complex than its initial creation, involves making alterations to existing data, for example with a change to a tag (key or value) or to the geometry of the object, whereas the creation of new data is usually performed through an interface which automatically mediates a user's contribution. The creation of data is often prioritised in humanitarian mapping, a form of crowdsourced mapping that aims to increase the visibility of human settlement, through the identification of buildings and roads, for the planning of services in un- or poorly mapped developing countries. Survey results of contributors to these particular activities report significantly higher proportions of female participation than those in the overall OSM project. 16 The increased female participation in humanitarian mapping, which tends to be dominated by the creation of data, may support the notion of a negative correlation between female participation rates and the perceived complexity of certain mapping tasks, i.e. modifying existing map data. Clearly stated, women may feature more heavily as participants in humanitarian mapping due to its simpler methods of contribution, which emphasise the addition of new data over the modification of existing data. Humanitarian mapping, which tends to prioritise the creation of buildings and highways over the labelling (i.e. tagging) of features, may also contribute to the observed variations in both the volumes of tagging and the tagging category preferences reported in the "Gender differences in use of tagging keys" section. However, it is also important to recall the positive relationship between gendered participation in VGI and intrinsic reward, discussed in the "Understanding female participation in VGI" section, which may also account for these observations.
16 29% of respondents to the HOT survey were female (HOTOSM 2017).

These findings raise a series of questions about the role and influence of the technology and interfaces by which contributions to OSM are made. Debates might extend to the design of software and digital interfaces, which are overwhelmingly created by men and consequently tend towards male modal spheres of engagement and production. Research has demonstrated the centrality of the processes by which crowdsourced mapping is mediated to women's motivation, and therefore participation, in VGI (Schmidt and Klettner 2013; Steinmann et al. 2013); these processes are thus fundamental to what gendered participation looks like. That these aspects have passed somewhat under the radar within recent discourses of gender and VGI is conspicuous. Equally important as the nature of the technological practices themselves are users' perceptions of the processes as highly technical and complex. Therefore, demystifying (perceptions of) the modes and the technologies by which contributions are made is also key to addressing gender participation biases in VGI.
Before certain Internet technologies are written off as alienating to vast swathes of the population, however, the role of cognitive differences between men and women must also be considered. VGI evidently tends toward male modes of co-production: if technological cultures of VGI are geared towards the preferences of men with respect to innate cognitive differences, which current technologies fail to accommodate, then we must question how democratic we can ultimately expect participation in VGI to be. What must also be foregrounded is the increasingly accepted notion of gender as non-binary and rather a spectrum of feminine and masculine traits. Those who engage in OSM contributing may demonstrate particular combinations of these traits, which may more commonly be observed in men and which collectively underpin a proclivity to this kind of activity. This sense of varying predispositions to technology is taken up by Johnson (2014) in the context of the role of hard- and software in reinforcing existing socio-cultural privileges, which he argues is particularly relevant to open source software and its networks, which, he states, fail to understand the "constructed nature of data" (ibid., 263). Johnson (ibid.) observes three dimensions through which this is embedded and perpetuated: (1) the embedding of social privilege in datasets as the data is constructed; (2) the differential in capabilities of data users; and (3) the norms that data systems impose through their function as disciplinary systems. If these principles are applied to current findings on VGI and gender, including those expounded here, then they are supported in each respect.

Conclusions and further work
By allocating users' stated gender to their recorded mapping behaviours, this paper has demonstrated a range of effects that arise from the gender participation bias in OSM contributing. Supporting earlier studies, the results have shown that men continue to contribute the vast majority of edits to OSM. Extending this knowledge, this study has also shown that not only do more men participate in the project, but those that do are significantly more active than their female counterparts. This study also reveals subtle differences in modes of editing, as men demonstrate higher values than women for updating, altering or modifying existing data. These findings relay a sense of a male focus on the accurate cartographic representation of topographical features; conversely, women's focus on the creation of new data conveys instead an emphasis on the initial visibility of topographical features (if not their specific nature), i.e. demonstrating their existence where they might otherwise be entirely absent from the map.
Informed by critical discourses of representation in VGI, Stephens (2013) argues that if the content of VGI sites is skewed towards the demographics of those that have the inclination, skills and equipment to volunteer it, then the base maps will represent the lived experience of those who (co)produce its content. Although the analysis here has initiated the process of identifying specific effects of the gender bias, the results alone do not demonstrate a bias in representation in VGI, nor therefore do they provide an empirical foundation for calls to increase the participation of women. Aside from contributing in significantly lower volumes, women's OSM practices adopt a similar character to those of their male counterparts, albeit with some moderate variations in the objects, modes and tagging categories through which they contribute. Evidence detailed here and elsewhere suggests that this may indicate a female inclination towards humanitarian mapping efforts. In order to support such a proclamation, however, there remains significantly more analysis of gendered OSM contributions to be done. Were subsequent findings to support such a hypothesis, and the role of women in VGI found to be bound up with enhancing the online visibility of otherwise marginal and non-visible people and places, the argument for widening participation would adopt a rather different but more substantive, tangible and ultimately powerful rationale than those made in the name of equal online representation. For this, work to identify the thematic impacts of demographic participation biases, including gender, is required. This is the focus of a meta-analysis currently underway. Further work may also examine user motivations by gender, but this can only be achieved with the continued support and active participation of the vast OSM user community, a methodology which presents its own challenges (Gardner and Mooney 2018b).
Theories of gender representation in VGI are therefore not unfounded, but rather unsupported without further empirical research.
Deepening our understanding of women's roles in VGI activities such as the OSM project may work to focus the debate on the real issues around gender inclusion, which, the evidence detailed here suggests, go beyond interests, skills and knowledge and are instead immersed in issues around internet technologies, women's relationship with technology itself and a sense of inclusion in online environments in which female accounts report a sense of intimidation and therefore vulnerability. Updated research on issues of gender and ICT, inclusive of transformations resulting from the digital and mobile computing revolution, would also facilitate a more profound understanding of these dimensions. These debates are being played out in a global socio-cultural climate in which the status and profile of women are being increasingly mediated through a lens that is elevating respect for and value of women's contribution in almost every domain. Appreciating the role women play in VGI, and therefore its specific value, may feature as a corollary to this.