Introduction

Biodiversity conservation and environmental management are issues of broad public concern (Van Liere and Dunlap 1981; Hays 2000), and public opinion is a key factor driving the implementation and determining the success of policy and legislation (Hobolt and Klemmensen 2005; Whiteley 1981; Phillis et al. 2013). The willingness of the public to accept environmental policies and to contribute time or money to conservation efforts depends largely on their interest in conservation and environmental issues (Grob 1995; Kollmuss and Agyeman 2002; Barr 2003). It is therefore useful for conservation scientists and policy makers to be able to reliably gauge public interest in environmental issues, and survey and interview methods have been applied in the past to achieve this (McCann et al. 1997; Nisbet and Myers 2007; Wray-Lake et al. 2010). More recently, data mining methods have been used to monitor public opinion, as large sources of data such as Twitter, the number of catalogued websites mentioning specific topics, and the volume of internet searches made through Google for specific keywords have become publicly available to researchers (Baram-Tsabari and Segev 2011; Pak and Paroubek 2010; Evans and Foster 2011). Conservation scientists have used some of these data sources to quantify public interest in bird and butterfly species (Żmihorski et al. 2012), and to measure temporal trends in public interest in environmental issues (McCallum and Bury 2013). Online sources of data provide useful indicators of the background level of public interest or concern and are updated frequently, allowing policy makers to respond rapidly to changes in interest (Ginsberg et al. 2009; Scheitle 2011). However, they do not provide a long-term historical record of public interest in issues; search volume data from Google have been available only since 2004 (McCallum and Bury 2013).
Without a longer term baseline that is comparable to Google search volume, it is difficult to interpret the conclusion that interest in environmental issues has declined over time (McCallum and Bury 2013), although data from surveys of young people show similar declines since the late 1970s (Wray-Lake et al. 2010). It is thought that public awareness of environmental issues increased during the second half of the twentieth century, thus driving the mainstreaming of the environmental movement (Van Liere and Dunlap 1981; Clapp 1994; Hays 2000). If public interest in environmental issues has declined in more recent history (Wray-Lake et al. 2010; McCallum and Bury 2013), then it is important to know how long ago this decline began.

A longer term baseline for public interest in environmental issues can be established through analysis of another large dataset available online. Google Books’ Ngram is a database created from the largest digitised library of books in the world (Michel et al. 2011). The Ngram corpus catalogues the content of a subset of the digitised books in the Google Books library, and includes a count of the number of times that each word was mentioned every year. The proportional occurrence of environmental words in the Ngram literature could be used as an index of historical environmental interest which is comparable to the Google search volume index. The sample population that Ngram represents is limited because a broader range of the public uses Google to search the internet than publishes books. However, the supply of book content to the public is driven partly by demand (Hjorth-Andersen 2000), so Ngram data should indicate the topics that interested the public through time (Acerbi et al. 2013; Phillis et al. 2013). The database has previously been used to investigate patterns of interest in religion, philosophy, and food (Michel et al. 2011), and has received some limited attention in conservation and environmental research. The proportional occurrence of the terms “climate change” and “protected areas” has been used to indicate documentation of protected area creation (Ervin 2011), and a model describing the transfer of keywords between scientists and the public has used climate science as an example discipline (Bentley et al. 2012). Trends in the usage of natural catastrophe terms have been analysed from the perspective of cultural anthropology (Marriner and Morhange 2012). The increase in usage of catastrophic terms in the last quarter of the twentieth century suggests increased public interest in some environmental issues (Marriner and Morhange 2012), although this study focused mainly on event catastrophes rather than longer term ecological issues. 
Additionally, the occurrence of three specific environmental terms in a related data source, the Google News Archives database, has been used as an indicator of public interest in environmental issues (Phillis et al. 2013). In that study, the relationships between environmental stressors, scientific publications, public interest and policy change were analysed to investigate how these factors interact to allow conservation success (Phillis et al. 2013). No previous studies have used historical word usage in published books as a broad indicator of public interest in environmental issues.

To investigate historical public interest in conservation and environmental management issues, the proportional occurrence of nine environmental indicator keywords was analysed in the Ngram database. Long-term trends in public interest were assessed from the temporal trends in usage of these keywords over the period 1800–2009. To test whether interest has peaked and subsequently declined in more recent history, keyword usage was analysed separately for the years 1960–2009.

Methods

The 2012 version of the British English Ngram database for 1-grams (frequencies of single words rather than combinations of words) was downloaded (Google Ngram 2013), and the occurrence of each environmental keyword in the database was then recorded by year. Only years between 1800 and 2009 were analysed so that a reasonable sample of books was available each year (Michel et al. 2011). It is known that the composition of the Ngram corpus is not directly comparable prior to and following the year 2000 (Michel et al. 2011), but the most recent years (2000–2009) are of key interest for this study, so analyses were conducted over the full period. Additional analyses were conducted excluding the years 2001–2009 to establish that the conclusions were comparable. Six of the keywords chosen for analysis were the general terms used by McCallum and Bury (2013): “conservation”, “biodiversity”, “environment”, “ecology”, “wildlife”, and “fisheries”. An additional three slightly more specific keywords from their selection were also used: “pollution”, “extinction”, and “sustainability”. The number of words recorded in the Ngram database every year is several orders of magnitude smaller than the number of searches that Google receives each year, so it was not possible to analyse more specific combinations using multiple words. However, the nine keywords used in this analysis are still suitable indicators of broad environmental interest.
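The tabulation step described above can be illustrated with a minimal Python sketch. It is not the authors' R code. It assumes each 1-gram row is tab-separated as word, year, match count, volume count, and that the total number of words per year is supplied separately as a dictionary; the function names `yearly_counts` and `proportional_occurrence` are hypothetical. Folding upper and lower case together is also an assumption, since the text does not say how capitalised forms were treated.

```python
from collections import defaultdict


def yearly_counts(lines, keywords, first_year=1800, last_year=2009):
    """Sum match counts per keyword and year from tab-separated 1-gram rows.

    Assumed row layout: word <TAB> year <TAB> match_count <TAB> volume_count.
    """
    wanted = {k.lower() for k in keywords}
    counts = {k: defaultdict(int) for k in wanted}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip malformed rows
        # Assumption: capitalised and lower-case forms are pooled.
        word = fields[0].lower()
        year, matches = int(fields[1]), int(fields[2])
        if word in wanted and first_year <= year <= last_year:
            counts[word][year] += matches
    return counts


def proportional_occurrence(counts, totals):
    """Divide each yearly keyword count by the total words recorded that year."""
    return {
        word: {year: n / totals[year] for year, n in by_year.items() if totals.get(year)}
        for word, by_year in counts.items()
    }


# Tiny made-up example; real input would be the downloaded 1-gram files.
rows = ["conservation\t1900\t5\t3", "Conservation\t1900\t2\t1", "ecology\t1799\t1\t1"]
counts = yearly_counts(rows, ["conservation", "ecology"])
print(proportional_occurrence(counts, {1900: 1000}))
```

Rows outside 1800–2009 are dropped at read time, which mirrors the restriction to years with a reasonable sample of books.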

Proportional occurrence of environmental keywords was modelled using a generalised linear model with a quasibinomial error structure to account for overdispersion (Crawley 2007). Separate models were produced for the periods 1800–2009 and 1960–2009, so that any difference between long-term trends and more recent patterns could be investigated. For the 1800–2009 period, proportional occurrence of each keyword was modelled as a function of sample year only. For the 1960–2009 period, proportional keyword occurrence was modelled using quadratic and linear terms for sample year (2nd order polynomial), to determine whether or not there was a significant peak in usage during this period. The polynomial and simple linear models were compared using ANOVA to determine the presence of a statistically significant unimodal response. For the polynomial models, the peak year of word frequency was estimated from the coefficients as −a/(2b), where a was the coefficient of year and b was the coefficient of the quadratic term (Zar 2010). All Ngram database processing and statistical analyses were conducted in R version 2.15.2 (R Core Development Team 2012).
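The peak-year calculation can be illustrated with a short Python sketch. It is not the authors' R code, and the coefficient values are hypothetical, chosen only so that the curve peaks in 1995. It assumes the model's linear predictor is c + a·year + b·year² on the logit scale with raw (non-orthogonal) polynomial terms. In R, `poly()` returns orthogonal terms unless `raw = TRUE`, and coefficients on orthogonal terms cannot be used in −a/(2b) directly. Because the logit link is monotonic, the fitted proportion peaks at the same year as the linear predictor.

```python
import math


def peak_year(a, b):
    """Turning point -a / (2b) of the predictor c + a*year + b*year**2.

    Only a negative quadratic coefficient gives a maximum (a peak).
    """
    if b >= 0:
        return None  # linear or U-shaped: no peak
    return -a / (2.0 * b)


def fitted_proportion(c, a, b, year):
    """Inverse-logit of the quadratic predictor (assumed model form)."""
    eta = c + a * year + b * year ** 2
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients, not estimates from the paper.
c, a, b = -7972.05, 7.98, -0.002
print(peak_year(a, b))

# Cross-check: the year with the largest fitted proportion in 1960-2009.
print(max(range(1960, 2010), key=lambda y: fitted_proportion(c, a, b, y)))
```

Returning `None` when b is not negative reflects the point made in the text: a peak is only meaningful when the polynomial model describes a unimodal response.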

Results

The environmental keyword with the greatest total usage over the period 1800–2009 was “environment”, followed by “conservation” and “pollution” (Table 1). The keyword with the least usage was “biodiversity” (Table 1). Several keywords were not mentioned in the earliest years of the period; the first record for “wildlife” was made in 1806, “ecology” in 1816, “sustainability” in 1835, and “biodiversity” in 1965. All nine of the environmental keywords used in this study show statistically significant increases in frequency of usage between 1800 and 2009 (Fig. 1; Table 1). There is some variation in the shape of the trends, with the frequency of most keywords increasing rapidly over the last 40 or 50 years. However, the terms “fisheries” and “extinction” show more gradual increases in word occurrence throughout the entire period (Fig. 1).

Table 1 Total usage of nine environmental keywords, temporal trends and significance over the time periods 1800–2009 and 1960–2009, and peak date of keyword occurrence

Fig. 1 Nine keywords related to biological conservation or environmental management show increasing frequencies of occurrence through time in British English language books catalogued by Google Ngram, during the period 1800–2009

Seven of the nine environmental keywords show statistically significant unimodal patterns in frequency of occurrence between 1960 and 2009 (Fig. 2; Table 1). “Biodiversity” shows a statistically significant increasing frequency of occurrence, and “extinction” shows no statistically significant pattern in frequency of usage over this period. The modelled peaks in keyword frequency for the unimodal responses all occurred between 1992 and 2006 (Table 1).

Fig. 2 Seven keywords related to biological conservation or environmental management show unimodal frequencies of occurrence through time in British English language books catalogued by Google Ngram, during the period 1960–2009. One keyword shows an increasing frequency of occurrence over this period, and one shows no statistically significant trend

Discussion

The increasing occurrence of keywords relating to environmental issues in the British English Ngram corpus between 1800 and 2009 suggests that overall there has been increasing public interest in these issues over this period. This is consistent with the history of the environmental movement. It is thought that the modern environmental movement grew up in the post-war period (Sheal 1984) and began to gain momentum in the 1960s (Nerlich 2003). There is evidence that increasing public interest has impacted policy, for example in the rapid rate of designation of protected areas between the 1960s and 1990s (Ervin 2011; Radeloff et al. 2012). Environmental keyword usage thus provides an indicator of interest in environmental issues that extends further back into history than Google Trends or any known survey schemes, although the Ngram corpus does have some limitations. The relationship between book content and public opinion is not simple, as the majority of the public do not write books (Acerbi et al. 2013). The public audience has changed as literacy has increased, particularly through the earliest 100 years of the study period, and earlier texts may also disproportionately represent religious, academic and educational interests (Altick 1957). An additional issue is that some words have uses outside the discipline of ecology (Żmihorski et al. 2012). For example, use of the word “conservation” could be attributed to archaeological or architectural conservation, and a manual search of the Google Books database reveals that older records for “extinction” frequently refer to non-ecological extinctions (Google Books 2013). However, in this case, the observed patterns are consistent in direction between environmental-specific keywords and those with multiple uses. Despite the limitations of the Google Ngram database it is thought to reliably represent public interest in other disciplines (Michel et al. 2011; Acerbi et al. 2013), and as a noninvasive method it has advantages over traditional surveys which can be limited by their sample populations (Couper 2000), and may be impacted by nonresponse or respondent bias (Phillips and Segal 1969; Groves and Peytcheva 2008).

Environmental campaigners, policy makers and managers may be concerned that the usage of seven environmental keywords in a subset of published literature appears to have already peaked, particularly as in some cases the peak occurred around 20 years ago (Table 1). Such peaks may have been caused by information dilution in more recent years, as new book topics have been invented at a rapidly expanding pace. However, a declining trend in environmental awareness is consistent with the findings of previous studies using internet search data (McCallum and Bury 2013) and surveys of young people (Wray-Lake et al. 2010). If interest in environmental issues is in decline, a range of cultural and social factors may be responsible. It has been suggested that economic crises may reduce interest in the environment (Kahn and Kotchen 2010); however, all keyword peaks occurred before the 2007–2008 global financial crisis (Bordo 2008). It is possible that the public have undergone a form of compassion fatigue (Tester 2001; Moeller 1999), and have become desensitised to the environmental issues summarised by the keywords used in this study. The timing of the keyword peaks suggests that this may be the case: the earliest, “pollution”, was a major driver of the early environmental movement, following noticeable declines in air and water quality (Hays 2000), and fears of pesticide misuse (Phillis et al. 2013). However, the social and cultural issues that are of most interest to the public change over time (Burns 2008; Michel et al. 2011), and the keyword with the most recent peak is “sustainability”, a newer term applied to modern, holistic approaches to environmental management (United Nations 1987). It is concerning if interest in environmental issues follows fashion cycles because older issues may become neglected by policy makers without public support (Hobolt and Klemmensen 2005).
However, neglect of environmental issues in web searches and written literature may not present a problem for the environmental movement if the issues that they represent have been resolved (Burns 2008; Phillis et al. 2013), or if the public have accepted them as issues of importance and no longer desire to research them.

Despite the selection of the linear model over the polynomial for the term “biodiversity”, it is not clear whether this keyword followed an increasing or unimodal trajectory over the period 1960–2009 because of a large amount of variation over the last 10 years (Fig. 2). However, it can be concluded that this term has either not yet peaked, or peaked last out of the selected keywords. This suggests that new generations of environmental keywords can replace older terms; in this case “biodiversity” is commonly used as a more holistic replacement for “wildlife”. If terminology changes through time but the environmental issues that are of interest remain the same, then public interest in environmental problems may not actually be declining. In fact, continual rebranding through the use of new terminology may increase public environmental awareness.

Conclusions

Historical records of word occurrence frequencies in published books can provide a long-term indicator of public interest in environmental issues. As measured by these indices, awareness of environmental issues is greater now than it was for much of the nineteenth and twentieth centuries, although there is evidence that interest may now have peaked and begun to decline. Further study should establish the existence of emerging trends in the words used to describe biodiversity conservation and environmental management. It should also investigate whether emerging concepts are continuously replacing existing terminology, or whether the extent of public interest in the environment is reducing over time.