1 Introduction

When in May 1990 the World Health Organisation removed homosexuality from the International Classification of Diseases, LGBT movements around the world rejoiced. It was a cornerstone achievement in the struggle for civil rights, which generated an intense political debate and granted unprecedented visibility in the media (Fetner 2008; Ghaziani et al. 2016). While a significant step had been made towards the full recognition and integration of LGBT people, the era of discrimination was anything but over. On the contrary, the reaction of the religious right was vigorous in several countries (Stone 2016) and conservative forces drove various forms of opposition (Van de Meerendonk and Scheepers 2004; Ayoub 2014). Thus, the institutional shift was not immediately acknowledged by the general population, whose cultural models were (and largely still are) based on historical gender beliefs that define the essential, inherent and supposedly natural qualities of men and women (Drescher 2010). Gender beliefs, deeply rooted in culture, are expressed on a daily basis – mostly unconsciously and unwittingly – through language (Drescher 2015).

Several decades of anthropological and linguistic studies have revealed a bidirectional relation between language and culture: on the one hand, cultural categories shape the main characteristics of language structure; on the other hand, language characteristics influence the way people think, interact with each other and experience the world (Levinson et al. 2002; Cordona 2006; Athanasopoulos 2007; Anthony 2010). The constant and automatic usage of language structures naturally produces habits and standards in the perception of reality that may well differ across languages.

Grammar, for instance, plays a relevant role in shaping perception and behaviour. The speakers of Indo-European languages – English included – subconsciously associate actions with tenses and numbers, since verbal forms include compulsory grammatical markers that specify whether actions take place in the present, in the past, or in the future and whether a single agent or a multitude of agents is involved. In the Hopi language (spoken by a few thousand people in Northern Arizona), by contrast, verbs must specify whether the action mentioned has been personally witnessed, has been heard of, or is considered an immanent truth. Such differences automatically lead speakers to pay more attention to some aspects of actions and affect their overall reasoning and perception. Relatively little variation, however, is observed in grammatical structures, especially if the comparison is limited to Indo-European languages (Pagel et al. 2007). Much more variability is reported for lexicon. Among the components of a language, lexicon is certainly the one that most immediately conveys specific attitudes and value judgements, orienting behaviours (Polguère 2008). A straightforward example of the influence of lexicon on perception and behaviour is the lexical inventory of the Guugu Yimithirr language, spoken by Aboriginal Australians in Far North Queensland: words for ‘left’ and ‘right’ do not exist and speakers use geographical directions instead, such as ‘East’ and ‘West’. This lexical trait automatically and constantly trains their orientation skills, which gives them a very accurate sense of direction (Levinson et al. 2002; Majid et al. 2004).

More generally, lexicon represents the simplest form of rhetoric and allows speakers to subtly attach specific connotations to social behaviours and social groups (Reisigl and Wodak 2003). In this sense, lexicon performs two functions, i.e. self-identification and external attribution (Orrù 2012). The former is carried out by all the formulas used by social groups to define themselves in relation to their identity, socio-economic status and sexual/affective orientation. The latter relates to all the descriptive terms and epithets attached to social groups in order to highlight their specific characteristics, intending either offence or exaltation.

In the case of homosexuality, lexicon has often been used as a weapon, to stigmatise actions, attitudes and identities. Frequent usage of homophobic epithets has subtly trained individuals to marginalise LGBT people, just like frequent usage of cardinal points trains orientation. Several empirical studies found a significant correlation between exposure to homophobic insults and homophobia (Burn 2000; Carnaghi and Maass 2007; Bianchi et al. 2017), corroborating the theoretical proposition that language shapes thought. Not only does this phenomenon hurt a sizeable minority, but it also legitimises an unconscious form of violence and marginalisation, reinforcing a socio-cultural barrier based on conformity to heteronormative standards. Moreover, the intergenerational transmission of cultural categories, lexical inventories and gender beliefs makes discrimination persistent.

Abusive language targeting LGBT people is indeed not a recent phenomenon. Both Greek and Latin boasted vast repertoires of epithets characterising homosexuality in a negative fashion (Di Scepsi 2017). In classical antiquity, passive (but not active) homosexual practices and effeminate behaviours on the part of men were strongly stigmatised and viewed as a surrender of masculinity. Homophobic epithets mostly related to the feminisation of homosexuals. Given the lower social status of women, assimilation to a woman was received as a grave insult (Isaac 2006; Sidebottom 2018). Modern Romance languages smoothly perpetuated several cultural models of the past – including gender beliefs – most often in a subtle and unconscious way (Harris and Vincent 2003).

In Romance languages, a long history of lexical characterisation of homosexuals reiterated prejudice, enforced discrimination and ultimately contributed to marginalising LGBT people (Appleby 2001; Thurlow 2001). While this problem concerns all Romance languages, the present work focuses on two emblematic cases, i.e. Italian and French. Among the natural successors of Latin that currently hold the status of national languages, Italian is generally described by linguists as the most conservative, whereas French is generally considered the most innovative in terms of lexicon, phonology, grammar and syntax (Pei 1976). Lying at the opposite ends of the philological spectrum, these two languages jointly offer relevant insights for the whole Romance branch of the Indo-European family. Moreover, the pervasiveness of discrimination against LGBT people and the different historical experiences that characterised Italy and France in the twentieth century make the comparison especially interesting. While the Italian experience is historically centred around negation (Cordona 2006), the French case features a significant extent of early grassroots mobilisation in favour of civil rights. The activities of the French LGBT community in particular triggered a process of linguistic resemantisation, consisting in the appropriation of homophobic epithets, which were deprived of their negative connotation and were proudly worn like an armour (Mauri 2013).

These different trajectories have been drawn in qualitative studies, but to our knowledge empirical support for these narratives is still scant. Lexical discrimination is in fact a nuanced and evasive phenomenon, anything but straightforward to measure. To capture it, we use the frequencies of homosexuality-related epithets in published texts in Italian and French. The persistent nature of discrimination makes time series analysis particularly suitable. Following a common practice in econometrics, we decompose the series of insulting epithets into a cyclical component and a trend. The cycle may be viewed as the demand side of the market (Baffigi et al. 2013; Grant and Chan 2017), i.e. the extent to which the general population harbours a certain curiosity towards the topics related to homosexuality. The trend on the other hand may be viewed as the supply side, i.e. the willingness on the part of publishers and writers to provide texts tackling the subject, challenging the wall of silence that historically surrounded the cultural taboo of homosexuality. In this view, volatility in demand indicates a moment of transition and an opportunity for change: strong frequent oscillations mirror a bottom-up push originating from the general population and affecting writers and editors. The dissolution of cultural stereotypes however is hardly a smooth and linear process, as it often entails reactionary responses and discontinuities (Courouve 1986).

The rest of this work is organised as follows: Sect. 2 sums up the historical experiences of Italy and France with respect to the perception of LGBT people. Section 3 offers a lexicographic perspective on the epithets of interest. Section 4 outlines the empirical methods adopted in this work. Section 5 provides an overview of the dataset employed. Section 6 displays and discusses the results. Section 7 concludes.

2 Case study: historical experiences of LGBT discrimination in Italian and French

Discrimination against LGBT people takes on different aspects, ranging from institutional forms, such as legal restrictions and sanctions, to subtler and perhaps more pervasive forms, such as intolerant and abusive language (Woodford et al. 2014). While many speakers may fail to perceive their own words as insulting, feeling like they are merely reproducing well-established cultural models, the consequences of abusive language on the well-being of LGBT people entail adverse effects on many aspects of their lives, including schooling decisions (Schmidt et al. 2011), mental health (Burgess et al. 2007; Clark 2014) and labour market outcomes (Ahmed et al. 2013; Göçmen and Yılmaz 2017). Abusive language is strictly related to the historical trajectories that define the perception of LGBT people in society. This section offers a stylised depiction of the historical experiences of Italy and France.

2.1 Italy

In Italy, the role of religion has traditionally been primary in most domains of civil life, including culture, education, social norms and politics. The presence of the Vatican enclave is in part responsible for this phenomenon, but the historical roots of the so-called temporal power of the Catholic Church may be traced back to the sixth century C.E. (Sotinel 2010). The Church officially condemned sodomy as an unspeakable sin but did little in practice to discourage ‘outrageous’ private behaviours. The main strategy adopted by the clergy consisted in building a wall of silence around the problem.

Following the Catholic tradition, mainstream Italian culture is historically oriented towards negation rather than repression of homosexuality. In line with the Savoy regulations, which largely differed from the Bourbon experience of Southern Italy, the first penal code of unified Italy in 1889 eliminated any reference to, and sanction of, homosexual relations among consenting adults. Such omission was in fact limited to acts performed in private, while manifest public behaviours would fall under the ‘public scandal’ situation. According to several authors, this approach translated into a form of de facto tolerance towards a substantially widespread set of actions and behaviours, although formal reprobation was widely proclaimed (Dall’Orto 1990).

The fascist epoch kept up with this strategy, so that homosexuality was not listed among the offences of the penal code, in an attempt to lessen its social relevance. At the same time, some forms of repression – ranging from beating to confinement – grew rather popular among the national police. The official silence endured through the first decades of the newly established Republic. This peculiar characteristic of the Italian experience has been seen as a factor explaining the relatively late development of homosexual movements in the Peninsula, which lagged behind those arising in other European countries by at least one decade. While in the UK and in the US for example homosexual movements constituted a militant response to the repression and criminalisation of homosexuality, in Italy the consolidated strategy of silence did not push towards mobilisation (Marzullo 2015).

Italian LGBT movements developed with much delay with respect to France (De Pittà and De Santis 2005; Cristallo 2017). The gay movement in particular was established only in the 1970s and gained momentum in the second half of the decade, sharing many of the characteristics of other grassroots movements of those years, such as the proliferation of a network of small militant groups and a sympathy for the radical left (Prearo 2014). Over the 1970s, the Italian gay movement structured its articulation through widespread territorial clubs, while attempting to approach political institutions. From 1978, the first gay demonstrations in Italy took place. In the 1980s however, a generalised climate of demobilisation hit the composite groups of the homosexual movement, leading to the establishment of the Arcigay. In the same period, the outbreak of HIV started to represent an important mobilisation factor, which clashed with the silence of the government and of the Church, and with a substantial lack of prevention policies.

Although some groups within the gay movement counted lesbian women in their ranks, the social mobilisation of lesbians mostly originated from feminist movements and occurred ten years later, possibly as a result of the lower extent of capital accumulation featured by women (Addis and Joxhe 2017). The Italian lesbian movement kept strong ties with feminist issues and battles, thus featuring an element of specificity with respect to the gay movement. It is not surprising then that, although the Arcigay of the 1980s counted some women (who were also granted equal representation in the governing bodies in 1990), in 1996 the Arcigay and the Arcilesbian split up. During the 1990s, the visibility of the gay and lesbian movements increased, paralleling a national trend that shifted attention from homosexual and bisexual acts and behaviours to the defence and development of non-heterosexual identities. This shift also implied the demand for a public and social role, which led to a clash with the standard attitude of the time, consisting in the duality between private tolerance and public repression.

For this reason, in those years the Vatican’s stances against homosexuality grew increasingly frequent. During the 1990s, the juridical recognition granted to de facto homosexual couples became a symbolic objective of the gay and lesbian movements. A campaign initiated by Arcigay led to the creation in some municipalities of municipal registers of cohabiting couples, as a form of pressure towards a national-level legislative intervention. Regions and municipalities provided varying degrees of recognition to homosexual couples until 2016, when same-sex unions were finally recognised by the law, obtaining most of the legal rights enjoyed by married couples. The path to recognition was anything but linear and smooth (Baiocco et al. 2014).

2.2 France

The French penal code of 1808 abolished the previous laws against sodomy, allowing homosexuals to live in the shadow rather than being burnt alive (Foucault 1994). This institutional shift was not well received by the general population, which instead expressed its firm opposition to the depenalisation of homosexuality. The language of the 1800s was in fact far more intolerant towards homosexuality than that of the previous century (Courouve 1986), while formal repression in the name of public morality involved substantial waves of police intervention (Pastorello 2009). In a general climate of hostility, the city of Paris represented a notable exception. The French capital hosted a gay subculture that grew very active in the 1920s, as witnessed by an intense literary production covering the theme of homosexuality (Tamagne 2006). In particular, the ‘bohemian’ areas of Montmartre and Pigalle constituted a gay-friendly hub.

The outbreak of World War II, the Nazi occupation and the establishment of the Vichy regime contributed to generating a novel wave of persecution of homosexuals, which persisted well beyond the end of the war (Gauthier and Schlagdenhauffen 2019). A law of July 1949 tightened the grip of censorship, banning several ‘scandalous’ publications that tackled themes related to sexuality in general and to homosexuality in particular. The general consensus of the period, largely based on medical and psychological treatises of the previous century, maintained that human beings featured a ‘natural’ and normal sexual drive, attracting them to those of the opposite gender (Revenin 2007). This drive was described as active for men and passive for women, as well as monogamous, adult, marital and reproduction-oriented (Crozier 2003).

Thus, until the 1960s, homosexuality was perceived as a social scourge and homosexuals were often forced to undergo a variety of treatments, including electroshocks and lobotomy in the most extreme cases (Borghs 2016; Copley 2019). In 1968, the WHO’s classification that listed homosexuality among mental illnesses was officially adopted in France. Between May and June of the same year however, Paris was the epicentre of a wave of riots against traditional customs, which challenged several social norms, including the condemnation of homosexuality. A feminist action committee was organised at the Sorbonne, meeting with great resonance and stimulating public reflection on themes such as sexual freedom, physical pleasure, gender roles, abortion and homosexuality. This cultural dualism opposing centre and periphery persisted in the French public debate well beyond the gay-friendly experience of Montmartre and Pigalle in the 1920s.

After the Stonewall riots of 1969 in New York, the Front homosexuel d’action révolutionnaire was established in 1971 (not surprisingly in Paris), as a result of the joint activities of lesbian feminists and gay activists (Prearo 2014). The group was meant to challenge prejudice and discrimination, granting substantial visibility to the homosexual cause in the 1970s in the wake of the previous wave of social unrest. The following decades featured a steady increase in public awareness, paving the way for the recognition of homosexual rights. In 1990, the WHO removed homosexuality from the list of mental illnesses, officially opening the season of civil rights in France, which culminated in 1999 with the pacte civil de solidarité (i.e. ‘civil pact of solidarity’), a form of legal recognition of homosexual couples. Since 2013, civil marriage has also been an option for same-sex couples.

3 Lexicographic analysis

In spite of the success of LGBT associationism, paralleled by a growing degree of civil rights recognition, homophobia remains one of the most widespread forms of prejudice in Western countries (Herek 2007; Carnaghi et al. 2011). Homophobic insults typically enter a speaker’s lexicon during childhood – before their precise meaning is even clear – and are used persistently until adulthood (Plummer 2001; Collier et al. 2013). An official Italian survey carried out in 2012 revealed that 80% of the population hears acquaintances using homophobic epithets, either very often (47.4%) or occasionally (32.6%; see ISTAT, 2012). This example portrays the Italian context as strongly homophobic (Zotti et al. 2019), yet the use of derogatory epithets related to homosexuality is widespread in most EU countries (Eurobarometer 2012), including France (Provencher 2011; Pugnière 2013). The French website NoHomophobes.fr, launched in April 2016 to monitor the amount of homophobic words and expressions found in the French version of Twitter, offers some interesting information on the insulting lexemes referring to male homosexuality. In particular, the site keeps count of the categories of heteronormative insults, i.e. insults that refer to a socio-cultural model which perceives heterosexuality as the standard, expected and obvious status, to the detriment of other orientations, which are in turn viewed as anomalous and thus discouraged. Table 1, drawn from the website, displays the categories of heteronormative insults with some examples in French.

Table 1 Examples of categories of heteronormative insults.

The epithets related to male homosexuality observed on the French version of Twitter are rather frequent. The lemmas stigmatising gay sexual practices are the most commonly used as well as those with the longest tradition, whose origin may be ultimately traced back to Latin. The lemmas stigmatising gay identity on the other hand are either neologisms or terms that acquired novel meanings over time. Of all the lemmas listed in the table, this work focuses on pédé, which is by far the most common, with 2,743,477 occurrences between April 2016 and May 2020 (to get a quantitative idea of the phenomenon, consider that the second most common term is tapette, with 327,032 occurrences, i.e. less than 0.12 times as many). An important comparison term is the lemma homosexuel (i.e. homosexual), which represents a non-offensive alternative to insulting epithets. The terms we focus on and their equivalents in Italian and English are thus shown in Table 2.

Table 2 Epithets in French and Italian with their English equivalents

The lexicographic and semantic-pragmatic analysis of the lemmas is based on several dictionaries (see "Appendix"). Drawing on these sources, we define the speech sphere of insult, within which we focus on homophobic epithets.

The lemma injure (i.e. the French equivalent for ‘insult’) is defined as an attack on other people’s dignity, performed through abusive words or actions. This diaphasic element falls within the linguistic register of langage familier and langage populaire (Gadet 2007), constituted by epithets with a strong expressive and abusive characterisation. In the context of gender-related speech, insults constitute a linguistic act that hits the victim for a practical purpose, i.e. to hurt, lessen or even humiliate him/her. Insults against male homosexuals thus turn their targets into immoral and passive beings, whose passivity is further exacerbated by the feminisation (referring to the ‘weaker sex’), which underpins several homophobic epithets (Van Raemdonck et al. 2011). The notion of passivity in French materialises in two aspects, i.e. the conjugation of verbal forms and their definition (Larchet 2017). The former concerns the usage of the past participle forms, such as enculé. The latter intersects with feminisation, introducing an intrinsically sexist element, in an attempt to restructure a ‘fragile masculinity, that constantly needs to affirm its primacy through mock of non-manly others’ (Borillo 2000). The French terms enculé and sodomite are straightforward examples of passivation epithets, their Italian counterparts being checca and sodomita. These epithets relate to the passive role in sexual intercourse between men. In a more extensive sense, they hint at a lack of courage and virility, setting the target apart from heteronormative standards of masculinity and closer to womanly characteristics (Yaguello 2002). Passivising terms indeed reduce the sexual orientation of a person to non-conformity to male attributes, rather than capturing the real sexual and affective orientation of a person. The verbal forms pertaining to this class reiterate the stereotype of an effeminate, weak, hysterical and sometimes grotesque homosexual man (Chauvin and Lerch 2013).

At the opposite end of the lexical spectrum lies homosexuel, a lemma with a neutral connotation, originally employed in medical records and later in the demographic jargon. First attested during the early nineteenth century in the medical literature, its definition focuses on same-sex attraction. Its scientific connotation, free of any pejorative sense, gradually introduced it into educated speech. Its Italian counterpart is omosessuale, which shares the same characteristics. Lexicographic evidence suggests that both are strictly related to the sexual domain, through words equivalent to the English appetite, desire and sexual satisfaction (see "Appendix"). The verbal form affetto da (i.e. suffering from), which typically accompanies omosessuale in Italian definitions however, refers to the semantic domain of illness and tends to overshadow affection (Maturi 2013). Omosessuale on the other hand has historically represented the main rival of gay, a term that characterises more broadly social identity rather than sexual orientation. It was overtaken in popularity in the mid-twentieth century due to its scientific connotation – evoking a medical/psychological dimension that subtly recalls the dark eras of ‘treatment’ of homosexuality. However, omosessuale and homosexuel have largely lost this sexual connotation after 1970, thanks to their extensive usage on the part of LGBT movements (Rey 2011).

French lexicography points to both an active and a passive role of homosexual men, whereas Italian definitions revolve around passivity only. In general, French definitions are more focused on the historical and cultural aspects of the lemmas rather than on their sexual connotation.

3.1 Resemantisation as an identitarian claim: the case of pédé

Resemantisation or neosemy is a linguistic phenomenon that occurs when an already existing word holding a certain meaning is assigned a novel meaning (Rastier and Valette 2009). The French word pédé, previously featuring a negative connotation, was attributed a new and legitimised meaning as a by-product of an identitarian mobilisation of the homosexual movement. The lemma was first attested in dictionaries in the first half of the nineteenth century, as a short version of pédéraste, originally referring to the sphere of paedophilia. First used in the familiar register, its semantic scope widened as an insult related to male homosexuality in the early twentieth century. After the 1960s, its negative connotation reverted, thanks to a semantic upturning path strongly sponsored and implemented by the French homosexual movement, which aimed to turn the insulting epithet pédé into a respectable term.

While derogatory words may be hurtful for the identity of certain social groups, they also identify, certify and confirm their own existence. Sometimes, holding on to hurtful terms is the only way to establish some form of social and lexical existence (Butler 2010). This phenomenon of neosemy is observed in a similar fashion in German-speaking countries, where the insulting epithet schwul underwent resemantisation, ultimately coming to represent a general insult, devoid of any relation to male homosexuality (Maturi 2013). A clear example of the novel connotation of the lexeme pédé is the fact that it is currently employed in informal contexts even when referring to women, which proves the lack of any connotation related to male homosexuality (Rey 2011).

4 Method

Following the preliminary lexicographic and semantic-pragmatic analysis, we proceed with an empirical investigation of linguistic data. The first step in the analysis consists in the construction of an indicator of book supply, capturing permanent variations in output. The second step instead focuses on book demand, which induces transitory deviations from potential output. Such deviations, which may be viewed as output gaps, constitute a measure of the degree of pressure exerted on book production, potentially affecting culture, institutions and politics. To obtain a plausible and reasonably robust estimate of the long-run component of the production of books containing the lexemes of interest, we resort to the Hodrick-Prescott filter (HP from now on, see Hodrick and Prescott 1997), which allows us to decompose the data and highlight some characteristics of interest. Variations in output may be decomposed into (1) supply-side fluctuations, which alter book production permanently and (2) demand-side fluctuations, which cause transitory shocks to book production. This idea may be summed up by the following equation:

$$y_{t} = \tau_{t} + \delta_{t} \quad \text{with } t = 1, 2, \ldots ,T$$
(1)

where \(y_{t}\) represents the number of books (expressed in logs) containing one of the lexemes analysed, \(\tau_{t}\) represents the permanent component of output, i.e. the trend, and \(\delta_{t}\) is the transitory component, capturing deviations from the trend. In turn, deviations may be decomposed as:

$$\delta_{t} = c_{t} + \varepsilon_{t}$$
(2)

where \(c_{t}\) is the cyclical component, characterised by stationarity (which implies it is transitory) and \(\varepsilon_{t}\) is the disturbance component, often referred to as white noise. Although potential output is not directly observable, its trend \(\tau_{t}\) may be extracted from the time series of actual output \(y_{t}\).
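As an illustration of Eqs. (1) and (2), the following minimal sketch builds a synthetic series from its three components. All numerical values (series length, trend slope, cycle period, noise variance) are arbitrary assumptions chosen for illustration and are not drawn from the data analysed in this work.

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed so the sketch is reproducible

T = 100                                   # number of yearly observations (assumed)
t = np.arange(1, T + 1)

tau = 0.02 * t                            # permanent component: a smooth upward trend (assumed)
c = 0.3 * np.sin(2 * np.pi * t / 8)       # stationary cycle with an 8-year period (assumed)
eps = rng.normal(0.0, 0.05, size=T)       # white-noise disturbance (assumed variance)

delta = c + eps                           # Eq. (2): transitory component
y = tau + delta                           # Eq. (1): observed (log) output
```

In practice only `y` is observed; the filtering step described next attempts to recover `tau` from it.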

The HP filter is the most popular approach used in the economic literature in order to extract the trend component \(\tau_{t}\) (i.e. potential output) from actual output. The filter balances two opposite needs: on the one hand potential output must be to some extent close to actual output. On the other hand, potential output must be smoother than actual output, i.e. it must feature lower acceleration and deceleration rates. To balance these two desirable characteristics, bearing in mind that the distance between potential and actual output may be measured by \(\delta_{t}\) in Eq. (1), the HP filter estimates the series of potential output \(\tau_{t}\) as the solution of a minimisation problem:

$$\mathop {\min }\limits_{{\tau_{t} }} \mathop \sum \limits_{t = 1}^{T} \left[ {\left( {y_{t} - \tau_{t} } \right)^{2} + \lambda \left( {\Delta \tau_{t + 1} - \Delta \tau_{t} } \right)^{2} } \right]$$
(3)

In light of Eq. (1), the problem may be rewritten as:

$$\mathop {\min }\limits_{{\tau_{t} }} \mathop \sum \limits_{t = 1}^{T} \left[ {\delta_{t}^{2} + \lambda \left( {\Delta^{2} \tau_{t + 1} } \right)^{2} } \right]$$
(4)

where λ is the key parameter of the problem, since it determines the relative weight of the two opposite needs mentioned above. As such, λ sets the relative weight of demand shocks (i.e. cyclical effects in \(\delta_{t}\)) and supply shocks (relating to the behaviour of \(\Delta \tau_{t}\)), in explaining the evolution of output: the lower the value of λ, the larger the fluctuations allowed for potential output to maximise the fit with actual output and the lower the portion of volatility of actual output explained by the output gap. In their work, Hodrick and Prescott (1997) suggest choosing, for practical applications, a value of λ related to the periodicity \(p\) featured by the data:

$$\lambda = \left( \frac{p}{4} \right)^{2} \times 1600$$
(5)

The value of λ obtained from Eq. (5) allows the HP filter to remove any cyclical fluctuation (both deterministic and stochastic) that lasts for less than about three years, while any cyclical fluctuation lasting for more than 20 years ends up in potential output. Fluctuations lasting between 3 and 20 years are attributed to variations in potential output according to phase width (the wider the phase, the more they are attributed to changes in potential output). Using Eq. (5) with yearly data (\(p=1\)), the resulting parameter value is \(\lambda = 100\). In this case, the average period of cyclical fluctuations varies between five and eight years. Alternatively, Ravn and Uhlig (2002) proposed a modified version of Eq. (5) to improve the performance of the HP filter, where the exponent of the first term is four rather than two. As a result, the value obtained is \(\lambda_{RU} = 6.25\), a significantly lower value.
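The choice of λ and the minimisation in Eq. (3) can be sketched in a few lines. The code below is an illustrative implementation, not necessarily the software used for the estimates in this work: it relies on the standard closed-form solution of the problem, \(\tau = (I + \lambda D^{\prime}D)^{-1} y\), where \(D\) is the second-difference matrix, and it applies the smoothness penalty to interior observations only. The function names `hp_lambda` and `hp_filter` are hypothetical.

```python
import numpy as np

def hp_lambda(p, exponent=2):
    """Smoothing parameter from Eq. (5); exponent=4 gives the Ravn-Uhlig variant."""
    return (p / 4.0) ** exponent * 1600.0

def hp_filter(y, lam):
    """Return (trend, gap) solving the HP minimisation problem for a 1-D series."""
    y = np.asarray(y, dtype=float)
    T = y.size
    # Second-difference operator: row i encodes tau[i] - 2*tau[i+1] + tau[i+2]
    D = np.zeros((T - 2, T))
    for i in range(T - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    # First-order condition of the problem: (I + lam * D'D) tau = y
    trend = np.linalg.solve(np.eye(T) + lam * D.T @ D, y)
    return trend, y - trend

lam_hp = hp_lambda(1)       # yearly data, Eq. (5): 100.0
lam_ru = hp_lambda(1, 4)    # yearly data, Ravn-Uhlig exponent: 6.25
```

A lower λ lets the estimated trend follow the data more closely, so `hp_filter(y, lam_ru)` would typically attribute less of the volatility of `y` to the gap than `hp_filter(y, lam_hp)`.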

Once the technical aspects of the analysis have been outlined, it is important to explain how the results may be interpreted in light of time series econometrics. Fig. 1 shows an example of output decomposition, the trend component and the cyclical component of a sample series.

Fig. 1 Decomposition of series and underlying distributions. Source: original elaboration

Suppose that at the beginning of the series, reader orientation towards homosexuality follows a normal distribution, as shown in the upper left corner of the figure. In particular, most readers follow mainstream culture and lie around the mean value of the distribution, while only a few belong to the tails: conservatives demand a much lower than average amount of books treating homosexuality, while reformists are more curious about the topic and want to challenge prejudice, demanding a higher than average amount of books. Until the late 1950s, the series displays a very flat trend with values close to zero, implying that mainstream culture discourages open discussion of homosexuality. In other words, the mean value of the distribution is low (\(\mu_{1}\) in the lower left corner of the figure) and the standard deviation of the distribution is small (\(\sigma_{1}\) in the lower right corner of the figure), meaning that most readers are close to mainstream culture and very few belong to the tails. From the late 1950s to the early 1990s, the series displays a clear upward trend, while the cyclical component is more volatile. In other words, the international context and national events are producing an effect on mainstream culture, which is now becoming more open \(\left( {\mu_{2} > \mu_{1} } \right)\), while national readers are drifting apart from mainstream culture, creating more significant subcultures represented by thicker tails \((\sigma_{2} > \sigma_{1} )\). Reader orientation induces changes in the shape of the distribution, which may become right-asymmetric or left-asymmetric depending on contingent temporary shocks to public opinion. Periods where the distribution is asymmetric to the right represent spells of strong interest on the part of the public, while periods of left-asymmetry represent outbursts of conservativeness that may naturally arise in response to rapid social change (Courouve 1986; Baiocco et al. 2014).
Finally, from the 1990s onwards, the trend component is steeper, indicating a growing openness of mainstream culture \((\mu_{3} > \mu_{2} )\), and the volatility of the cyclical component grows larger, pointing to the further distancing of readers from the standard orthodox orientation \(\left( {\sigma_{3} > \sigma_{2} } \right)\). In particular, negative cycle values indicate an increase in the size of the left tail, meaning that conservatives grow more numerous, while positive cycle values are a symptom of the enlargement of the right tail, implying that reformists prevail over conservatives. This framework is especially useful when it comes to interpreting our results.
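One possible way to make this reading operational is sketched below. It is an illustration under stated assumptions, not the procedure followed in this paper: the mean of the trend over a sub-period is taken as a proxy for \(\mu\), the standard deviation of the cycle as a proxy for \(\sigma\), and the share of positive cycle values as an indicator of which tail prevails. The function name `summarise_period` and the toy numbers are hypothetical.

```python
from statistics import mean, pstdev


def summarise_period(years, trend, cycle, start, end):
    """Summarise one sub-period [start, end] of a trend/cycle decomposition."""
    idx = [i for i, yr in enumerate(years) if start <= yr <= end]
    if not idx:
        raise ValueError("no observations in the requested period")
    cyc = [cycle[i] for i in idx]
    return {
        "mu": mean(trend[i] for i in idx),                 # openness of mainstream culture
        "sigma": pstdev(cyc),                              # dispersion around the mainstream
        "share_positive": sum(c > 0 for c in cyc) / len(cyc),  # weight of the right tail
    }


# Toy decomposition, invented purely for illustration.
years = [1950, 1951, 1952, 1953]
trend = [1.0, 2.0, 3.0, 4.0]
cycle = [-1.0, 1.0, 1.0, -1.0]

print(summarise_period(years, trend, cycle, 1950, 1953))
print(summarise_period(years, trend, cycle, 1951, 1952))
```

Comparing these summaries across consecutive sub-periods mirrors the comparisons between \(\mu_{1}, \mu_{2}, \mu_{3}\) and \(\sigma_{1}, \sigma_{2}, \sigma_{3}\) made above.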

5 Data

In order to construct our dataset, we use the Google Books Ngram Viewer archive, which contains an exceptionally large number of written works published from the sixteenth century to 2009. The corpus includes novels, scientific publications, magazines, comics, essays and newspaper articles (Michel et al. 2011; Lin et al. 2012). Overall, the repository covers between 6 and 8% of all the written works ever produced in the history of humanity. We restrict the analysis to texts published in Italian and French from 1900 to 2009 (the last year currently available), obtaining a total of 792,118 written works.

Since Italian is spoken almost exclusively in Italy, it is safe to establish a direct correspondence between the national language and the national culture. French instead is an official language in 29 countries scattered around five continents (Ethnologue 2017), which makes it more complicated to assume a univocal correspondence between French publications and France. The publishing houses located in France however work in a regime of quasi-monopoly (Gnocchi 2004). Any literary work written in French that seeks legitimacy and acknowledgement from its readership needs to be published in France (or even better in Paris), as a result of the historical authoritativeness and primacy of the French representatives of culture within the broader context of la francophonie. In other words, the relation between the centre and the francophone periphery features a strong degree of subordination, as a result of the lack of prestigious publishing houses and high-profile francophone universities in much of the periphery (e.g. Sub-Saharan Africa). The guidance role played by French publishers acts as a cultural filter, influencing every aspect and phase of the publication process. Consequently, the vast majority of the books in our corpus are either published in France or heavily influenced by the technical and cultural standards set by French publishers.

Ngram Viewer allows us to isolate lemmas and syntactic lexemes, returning the frequency of usage of each word on a yearly basis. We count the texts that include at least one occurrence of a lemma related to male homosexuality, with either a neutral or a negative connotation. For French we use the neutral term homosexuel and the insulting epithet pédé. The Italian equivalents are omosessuale and pederasta, respectively. All terms are counted only when they are identified as nouns.

Of the 792,118 published works considered, 21,909 contain at least one of the words of interest, i.e. slightly less than 3%, almost evenly distributed between the two languages. Table 3 sums up the main features of the dataset, reporting the relative frequencies of the lemmas of interest. The values were rescaled by multiplying each frequency by 1,000,000 to obtain more readable numbers.
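The share quoted above and the rescaling step amount to simple arithmetic, sketched below. The two counts come from the text; the helper name `rescale` and the example relative frequency are hypothetical and serve only to illustrate the multiplication by 1,000,000.

```python
TOTAL_WORKS = 792_118      # works in Italian and French, 1900-2009
MATCHING_WORKS = 21_909    # works containing at least one term of interest

share = MATCHING_WORKS / TOTAL_WORKS
print(f"{share:.2%}")      # 2.77%, i.e. slightly less than 3%


def rescale(relative_frequency: float, factor: float = 1_000_000) -> float:
    """Express a relative frequency per million, as done for Table 3."""
    return relative_frequency * factor


# Hypothetical relative frequency, not taken from the dataset.
print(rescale(2.5e-6))
```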

Table 3 Descriptive statistics

6 Results

The first part of this section shows the time series of the selected terms, paired by correspondence. Fig. 2 in particular depicts the relative frequencies of the Italian and French neutral lexemes. Concerning the homosexuel/omosessuale series, a clearly parallel behaviour emerges. In particular, both series consist of a steady and almost flat line until 1962 and start growing steadily afterwards, pointing to the easing of censorship, the diffusion of awareness and the growing legitimisation of homosexuality. While displaying the same pattern over the whole timespan, the two series do not overlap perfectly: the French term is dominant until 1974 and becomes dominated afterwards.

Fig. 2 Time series of Omosessuale and Homosexuel, 1900–2008

The pédé/pederasta series shown in Fig. 3, on the other hand, displays very different trajectories in the two languages: while the Italian term looks rather stationary around low frequencies, its French equivalent encounters a turning point in the mid-1960s. Before 1964, its occurrences are relatively rare, and the term is dominated by its Italian counterpart. After 1964 instead, pédé grows increasingly popular, dominating the Italian series by a wide margin. This result may be explained in light of the resemantisation process described in previous sections.

Fig. 3 Time series of Pederasta and Pédé, 1900–2008

Summing up, this first descriptive analysis reveals the presence of two different cultural environments. On the one hand, French-speaking countries feel the effect of social change more intensely, as shown by the French homosexual community's identitarian appropriation of the lexeme pédé between the 1950s and the 1960s. On the other hand, Italian-speaking areas are more heavily influenced by institutional change. The prevalence of omosessuale is overwhelming and the term refers to sexuality in a purely medical context. It is interesting to notice how the mere observation of the series allows us to characterise the intellectual and cultural fabric of French-speaking and Italian-speaking communities, highlighting differences that perhaps represent the ‘film’ of their social, cultural and institutional realities.

Subsequently, we apply the HP filter to the series obtained from Ngram Viewer. For each pair analysed, we report four graphs. Graphs (a) and (b) show simultaneously the trend (potential supply of books) and the cycle (demand for books) for the Italian term and for its French equivalent. Graphs (c) and (d) show respectively the trend and the cycle of the lexemes for both languages in a comparative perspective, which makes them the most informative.
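For readers who wish to reproduce this step, the decomposition can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation used for the figures: the penalty is summed over interior observations only (the usual convention), the linear system \((I + \lambda K'K)\tau = y\) is solved by dense Gaussian elimination, and the default \(\lambda = 100\) follows Eq. (5) for yearly data. The function name `hp_filter` and the toy series are hypothetical.

```python
def hp_filter(y, lam=100.0):
    """Return (trend, cycle) minimising
    sum (y_t - tau_t)^2 + lam * sum (tau_{t+1} - 2 tau_t + tau_{t-1})^2."""
    n = len(y)
    if n < 3:
        raise ValueError("need at least three observations")

    # A = I + lam * K'K, with K the (n-2) x n second-difference matrix.
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for r in range(n - 2):
        coeffs = {r: 1.0, r + 1: -2.0, r + 2: 1.0}
        for i, ci in coeffs.items():
            for j, cj in coeffs.items():
                a[i][j] += lam * ci * cj

    # Solve A * tau = y by Gaussian elimination with partial pivoting.
    b = [float(v) for v in y]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            if f != 0.0:
                for c in range(col, n):
                    a[r][c] -= f * a[col][c]
                b[r] -= f * b[col]

    # Back substitution.
    tau = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(a[r][c] * tau[c] for c in range(r + 1, n))
        tau[r] = (b[r] - s) / a[r][r]

    cycle = [yv - tv for yv, tv in zip(y, tau)]
    return tau, cycle


# Toy yearly series, invented for illustration only.
series = [0.0, 1.0, 0.0, 2.0, 1.0, 3.0, 2.0, 5.0]
trend, cycle = hp_filter(series, lam=100.0)
print([round(v, 3) for v in trend])
print([round(v, 3) for v in cycle])
```

Two properties offer a quick sanity check: a perfectly linear series is returned entirely as trend, and the cycle values always sum to zero. The dense solver is adequate for series of about a hundred yearly observations; longer series would call for a banded or sparse solver.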

Figure 4 shows the decomposition of the omosessuale/homosexuel series pair. Both lemmas are very rare before the 1950s, when they start growing more popular. Although the trends are qualitatively similar, a clear dominance of the Italian series may be highlighted from 1980 to the early 2000s, when the rival term of foreign origin gay forcefully made its way into the Italian lexicon. The demand side of the market shows a high volatility for both series starting from the 1960s, quantitatively more marked in the Italian case. It follows that the distribution of the term omosessuale turns out to be flatter, indicating a greater dispersion around mainstream culture. These results allow us to conclude that, while the orientation of publishers in the two languages overlaps, reader responses differ, with Italian consumers featuring a comparatively higher degree of openness towards the topic of homosexuality when the phenomenon is worded in a scientific or medical fashion. The interest of Italian readers however pertains to the domain of science, rather than to the societal aspects of the matter, as clarified by the cycle of the series. In particular, Fig. 4d shows that the cycle of omosessuale predominantly features negative values, implying a certain prevalence of the conservative culture among readers, which disfavours institutional change and equal rights for homosexual people. On the contrary, the cycle of the term homosexuel mostly displays positive values, tending to flesh out the reformist heterodox culture that recognises full social integration for homosexuals.

Fig. 4 Decomposition of Omosessuale and Homosexuel, 1900–2008. a Omosessuale: Trend and cycle. b Homosexuel: Trend and cycle. c Both trends. d Both cycles

Concerning the pederasta/pédé pair depicted in Fig. 5, a strong difference emerges at first sight. The supply side of the market for the French term undergoes a strong increase after 1960, displaying an exponential growth pattern. On the other hand, the Italian epithet keeps a rather constant trend over the whole timespan. The cyclical component highlights a difference in the level of pressure exerted by demand on supply: while the bottom-up push is steady in the Italian case, it grows increasingly stronger after 1960 in the French case, mirroring a period of intense grassroots mobilisation that culminated in the establishment of a solid LGBT community. In this case too, positive values are more prevalent in the French series than in the Italian one, indicating a more reformist culture in French-speaking countries than in Italy.

Fig. 5 Decomposition of Pederasta and Pédé, 1900–2008. a Pederasta: Trend and cycle. b Pédé: Trend and cycle. c Both trends. d Both cycles

The spectacular increase observed for the trend component of the pédé series takes place in response to the ongoing process of resemantisation. Once firmly established and organised, the French LGBT community undertook a steady battle against discrimination, turning the most commonly used epithet directed against homosexuals into its own standard. Pédé thus gradually lost its negative sexual connotation and became a general term. This process may be analysed in the light of Bourdieu’s dichotomy between objective cultural capital and incorporated cultural capital (Edgerton and Roberts 2014). The former refers to the extent of knowledge and cultivation that an individual amasses over her life, including educational qualifications. The latter instead covers all mental attitudes, including cognitive schemes and acquisition methods. Being raised in a cultivated environment for example offers relevant social advantages, including a certain familiarity with legitimate culture and a higher degree of flexibility in learning (Bourdieu 2001), which we would define today as open-mindedness. Lexical discrimination is a by-product of a substantive lack of incorporated cultural capital, which may lead to a self-enforcing ‘bad’ equilibrium, where a certain share of the population is marginalised. Resemantisation functions as a social driver that spreads the incorporated component of cultural capital, intended à la Bourdieu, across society, depriving those featuring low objective cultural capital of one of the most popular weapons in their arsenal. Rather than forbidding the usage of a discriminatory lexeme, resemantisation encourages it, while stripping it of its negative connotation and anti-social effects.

Finally, one factor that contributes to shaping the two series depicted above is the dynamics of major historical events: while the self-enforcing mechanism underlying discrimination may be very hard to break, shifts in mainstream culture may in the long run produce tolerance and inclusion (Bianchi et al. 2008). Drastic changes, like the WHO’s decision to remove homosexuality from its list of mental illnesses in 1990, inevitably produce a top-down effect, influencing the media first and the general population afterwards.

7 Concluding remarks

Discrimination is a significant problem that LGBT communities have been facing for several decades. This work focuses on the historical experiences of Italy and France from the beginning of the twentieth century, proposing a quantitative approach to verify the validity of the narratives produced by several years of qualitative studies on discrimination. The previous literature and recent survey data depict Italy as a somewhat hostile cultural environment for male homosexuals, while France emerges as a more tolerant context.

Using time series analysis, we focus on the frequencies of usage of two very different terms, i.e. the equivalents of homosexual and the equivalents of pederast. While the former is a technical and non-discriminatory term, the latter features a derogatory connotation. In spite of the previous claims, we find that Italy and France display a similar historical behaviour when considering the series of terms that stand for ‘homosexual’. In other words, little difference emerges when the topic of homosexuality is worded technically and tackled without any connotation. On the other hand, the two languages differ significantly when considering the series of the terms that stand for ‘pederast’, which are negatively connoted and discriminatory: the term is seldom used in Italy, confirming a silent orientation towards negation rather than repression, whereas its French equivalent has grown more and more popular after the 1960s, as a result of the bottom-up process of resemantisation that was largely stimulated by the French LGBT community. The extensive grassroots mobilisation of French associationism exerted significant influence on mainstream culture, which also produced a shift in lexicon. This process is absent in the Italian language, where the strategy of silence highlighted by the previous literature is substantially confirmed by the data.

To our knowledge, this work represents a first attempt to study the lexicon of homosexuality in a diachronic fashion, using sophisticated analytical tools and the rich dataset of Google Books Ngram Viewer. Some limitations however must be highlighted. First, the timespan we cover ends in 2009, leaving the most recent developments of language and culture unexplored. Second, we consider only two paradigmatic lexemes, while several epithets are used in both Italian and French to characterise male homosexuality. Finally, the vast majority of the written works in French are published in Paris, which is also by far the most gay-friendly city in the country: this coincidence might lead us to overestimate the openness of French society.

Future studies may widen the scope of the analysis proposed here, considering more lexemes as well as more languages. Moreover, since discrimination takes many forms, similar investigations may be set up to analyse problems such as racism, xenophobia and discrimination against minorities.