1 Introduction

Fan cultures and communities have been productive actors in the literary sphere for centuries [Footnote 1]: After killing off his most famous creation, Sherlock Holmes, Arthur Conan Doyle was bombarded with hundreds of angry letters and had to cope with competition from quite successful fan writers, who distributed stories featuring the famous detective, brought back to life and quick-witted as ever (Cranfield 2014; Kuhns 2014). Wilkie Collins, whose The Woman in White (1860), a classic of mystery fiction, was published serially, is suspected of having incorporated plot developments suggested by his audience in fan mail to ensure that his satisfied readers kept on reading and thus kept on making him money (Pykett 2005, pp. 79–80). The consolidation of this productive power in the 20th century with the creation of fanzines and the institutionalization of fan cultures in clubs and conventions (Jenkins 1992) is accompanied by the rise of even more independent fan-made art, and more specifically, fanfiction. While fanfictions’ de-centralized publication structure in early mailing lists (see Zurek/Petrik 2015, p. 25) and fanzines enhanced the inter-personal connections in the fan communities and helped to solidify certain fanfiction tropes (Tosenberger 2014, p. 8), only the creation of self-organized online archives in the late 1990s and early 2000s provided fans without direct personal connections to any fan club or community access to fanfiction. With freely accessible platforms such as fanfiction.net (1998), Wattpad (2006), Archive of Our Own (AO3) (2009), and Fanfiktion.de (2004), the mass of literary fan-art is now read, written, and reviewed online, creating an extensive archive of what fanfiction is today.

Due to their accessibility, fanfiction archives do not only grant access to a world-wide community of reader-writers, but also enable quantitative analyses of mechanics, characteristics, and tropes of fanfiction. It is the aim of this contribution to address one of these characteristics, the issue of shifts in the conceptualizations of characters taken from the original universe, in the context of the Harry Potter fandom. By doing so, we will not only test and improve methods of character analysis, but also combine results from quantitative methods with existing theories of fanfiction. With our results, we will show that even though generally, fanfiction communities do not explicitly formulate rules and regulations defining acceptable deviations from the original canon, there are constraints in place that guide the major trends in the conceptualizations of characters.

In order to be able to link qualitative research on fanfictions and Harry Potter with our results, the first section summarizes existing research and provides an explicit definition of fanfiction in terms of its recursiveness. We will then go on to describe the corpus creation and metadata curation. To target different layers of the texts and the inherent textual complexity (see Weitin 2017), our comparative approach is divided into three parts: First, we will show with basic frequency distributions that there is in fact a shift of quantitative focus on specific characters between J. K. Rowling’s novels and the fanfictions. In a second step, we will identify central character pairings in both originals and fanfictions and will indicate how the portrayal of their relationship differs and which pairings gain or lose the fanfiction writers’ attention. Third, we will use word embeddings and a word vector based approach to sentiment analysis to quantify possible variations in the emotional charging of characters. In the discussion section, we will bring together the results obtained with our methodological triad and theories of fanfictions to highlight trends and tendencies in character shifts from original novels to fanfictions.

2 Theoretical Background

Fanfiction as a concept and practice has often been defined based on its derivative and appropriative nature, which, as Abigail Derecho (2006, p. 64) argues, automatically »throws into question the originality, creativity, and legality of th[e] genre«. In our consideration of fanfiction, we regard fanfiction as being primarily defined by its referentiality: There is always a referenced text that is to some extent »repeated with a difference« (Derecho 2006, pp. 73–74; cf. Genette 1993). In contrast to other forms of intertextual literature, the creation of this difference, its extent and nature, is »deeply embedded within a fannish artistic community with an existing set of interpretations, tropes, and narrative traditions« (Tosenberger 2014, p. 13). This double recursiveness, or, as Katherine Tosenberger calls it, »aesthetics of constraint« (2014, p. 22), is the mechanism by which trends and tropes in fanfictions are regulated: A text is a re-telling and re-imagining of an existing story, but is simultaneously contingent on the already existing re-tellings and re-imaginings in the community.

The choice of the specific »difference« that is created in a fanfiction can be linked to the receptive mode with which the reference text is consumed. Being part of a, as Tosenberger (2014, p. 13) calls it, »fannish artistic community« requires a degree of immersion in the narrative world and emotional involvement with the source material that surpasses other modes of reading and is »extreme, intense, [and] more powerful than simple appreciation« (2014, p. 6). Building upon analyses of historical reading responses, this reading mode typical for lay audiences can be described as being primarily concerned with a text’s affective-communicative value: Readers focus on aspects that seem especially realistic and/or close to their personal life and their own memories, disregarding the composition of the text itself (Heydebrand/Winko 1996, p. 212). As a consequence, the choice of »difference« is not primarily based on the reference text, but on the interplay of the individual reading experience, desires for more diverse reading material, and the recursiveness within the existing body of fanfiction, causing a multi-layered system of references whose disentanglement is the source of many fans’ most profound enjoyment (Tosenberger 2014, p. 17).

Due to being closely connected to the reader-writer’s personal reading experience, the specific type of »difference« that is presented in a fanfiction can be linked back to Wolfgang Iser’s concept of narrative gaps (»Leerstellen«) (1994) and, even more generally, to perceived and real temporal gaps in the reading experience. In the case of the Harry Potter series, temporal gaps in the reading experience were caused by, as Vera Cuntz-Leng (2017) points out, the extensive time periods between the publication of the individual novels in the series. In the two years between the publication of the penultimate and the final installment of the series, Half-Blood Prince (2005) and Deathly Hallows (2007), numerous versions of endings to the saga were floating around online (2017, p. 96). Similar to these gaps in the publication process, the omissions in the narrative itself leave room for productive reception. These omissions—Cuntz-Leng compares them to the cinematographic elements such as cuts and pan-shots (2017, p. 96)—provide fanfiction writers the opportunity to complete the narrative according to their own design by describing situations between the narrated plot points and by providing minor characters with background stories.

One prominent omission in the Harry Potter universe has been highlighted retrospectively by the author herself: By outing Dumbledore as homosexual, long after the last novel had been published and with only »ghostly traces of homosexuality« in the texts themselves (Pugh 2014, p. 93), Rowling invoked a so-called queer reading (Sedgwick/Koestenbaum 2016) of her oeuvre, which is in general already common in the fanfiction community (Cuntz-Leng 2015, pp. 77–104). The omissions of any mentions of Dumbledore’s sexual identity as well as the general avoidance of portraying sexuality in any form give fanfiction writers the opportunity to close these gaps with narratives that reflect their own (queer) reading of the text. Similarly, indeterminacies (Iser 1972) and contradictions in the reference text offer points of entry for fanfiction writers. Their »desire for correction« (Cuntz-Leng 2017, p. 100) of contradictions and plot holes is part of the fanfiction discourse, but also entrenched in the broader fandom.

As the »primary ›threshold fandom‹ of the internet era« (Tosenberger 2014, p. 9), the Harry Potter fandom is characterized by »a proliferation of specialized microfandoms« (2014, p. 9) and »an increasingly customizable fannish experience« (Coppa 2006, p. 54) in innumerable forums and fan clubs, both on- and offline. Crowd-sourced wikis (e.g. the German-language Harry Potter Wiki [2005]), Podcasts and YouTube channels dedicated to the fan theories (e.g. Harry Potter Theory [2019], Harry Potter Folklore [2015]), discussion panels at conventions such as LeakyCon, first hosted in 2009, and fan-made musical and theater productions (A Very Potter Musical [2009], Puffs [2015]) built upon omissions and contradictions in the original novels, are indicators of the fandom’s communal desire for completion, disambiguation, and correction (Cuntz-Leng 2017).

The centrality of the Harry Potter novels in the canon of contemporary juvenile fiction has led to increased research interest in the book series. Among many other research subjects, the series’ characters, their functions, and inspirations have been addressed in contributions on the novels’ genre conventions (Hiebert Alton 2009; Le Lievre 2003), their identity politics (Birch 2020; Byler 2016; Saraco 2020), and intertextual references to mythology and religion (Ciaccio 2009; Hartmann 2017). According to Maria Nikolajeva (2009, p. 226), the success of the series can be linked to the re-introduction of a romantic hero in contemporary children’s literature: The protagonist Harry, who functions with only a few exceptions as the series’ exclusive personal narrator and stand-in for the intended juvenile reader, is born under mystical circumstances, receives unlimited powers (although he has to train in them first), is marked as the chosen one and is, due to his inherent goodness, predestined to triumph over evil. In Rowling’s interpretation of the romantic hero, this leaves, as Nikolajeva (2009, p. 225) states, »a fortunate blend of the straightforward and the reasonably intricate, the heroic and the everyday«. The idea that the combination of archetypal qualities and the ordinary is a defining characteristic of the Harry Potter series is also supported by Paul Bürvenich’s description of Harry as a Cinderella-like figure who starts out constrained to forced labor, subjugated by his relatives, and living in dire conditions (the infamous cupboard under the stairs), and finally rises to the top of the social hierarchy as the wizarding world’s »Chosen One« and Christ-like savior. This set-up is then combined with Harry’s depiction as an ordinary schoolboy encountering everyday obstacles at school, such as bullying and academic struggles (Bürvenich 2001, 36 ff.).
Other archetypal characters in the series include Harry’s nemesis, the uber-evil Lord Voldemort, the not as menacing, but still annoying school bully Malfoy, the omniscient mentor Dumbledore, the funny best friend Ron, who is more experienced in the magical world, and the bookish female best friend with a defined moral compass Hermione.

In order to be able to make more generalizable claims about the innumerable fanfictions written about these characters, and fanfiction in general, quantitative approaches have to be considered (e.g. Milli/Bamman 2016; Jacobs 2019; Kleindienst/T. Schmidt 2020; Pianzola/Rebora/Lauer 2020; Pianzola 2021; Rowe/Henderson/Wang 2021). In their study on characters in fanfictions published on fanfiction.net, Smitha Milli and David Bamman (2016) show that fanfiction writers tend to focus on secondary characters: Juxtaposing the automatically extracted characters in fanfictions and their source material, they find that, for example, Mr. Darcy and Dr. Watson are more central in fanfictions than the corresponding main characters, Elizabeth Bennet and Sherlock Holmes. According to Milli and Bamman, fanfictions also show a statistically significant increase in interest in female characters. A similar tendency is observed by Federico Pianzola in his comparative analysis of screen time in the Harry Potter movies and character tags for Harry Potter fanfiction on AO3: While the 46 female characters in the movie franchise have 3.2 times less screen time than their 76 male counterparts, this gap decreases in fanfictions, with female characters being featured only 2.5 times less than male characters. On the character level, some minor characters appear drastically more often in fanfiction than in the movies: Blaise Zabini, one of the few black students at Hogwarts, and Bill Weasley, the oldest Weasley brother, both seem to be fan favorites even though they scarcely appear in the movies.
When comparing general fanfiction audiences’ preferences on character constellations and pairings surveyed in an online census (Lulu 2013) to metadata tags assigned to Harry Potter fanfictions on AO3, Pianzola also finds that both readers and authors seem to prefer M/M fanfictions, with 90% of surveyed readers expressing their predilection for homosexual slashes and 61% of analyzed texts featuring such pairings.

3 Data and Corpus

The corpus that we are using has been created in the context of a larger research project on German-language fanfiction (see, for example, Weitin et al. 2023) and can be seen as part of an ongoing effort to enable large-scale literary corpus analysis for languages other than English (see, for example, Odebrecht/Burnard/Schöch 2021). The entire corpus comprises over 24,000 fanfictions created or updated in the year 2020 on Fanfiktion.de, the largest German-language archive for fanfiction with over 412,000 individual texts in total. Due to our comparative approach, we are on the one hand limited to fanfictions which reference other textual sources (not, for example, comics, movies, or real-life people), and need, on the other, quite extensive source material for the creation of word embeddings (excluding small fandoms based on a single reference novel or shorter texts). Therefore, Harry Potter fanfictions are especially suitable for our analysis, as there is enough textual material for both originals and fanfictions for the creation of reliable text models. The corpus of the original Harry Potter novels consists of the plain text extracted from e‑book editions of the German translations; paratexts and preliminaries were removed manually.

The Harry Potter fanfiction sub-corpus has been created based on the fandom attribution that is part of the publication process on the Fanfiktion.de platform. Each text has to be attributed to a specific fandom; the available categories are not created bottom-up but top-down by suggesting additional categories to the platform’s administrator. The website Fanfiktion.de also offers certain subdivisions for larger fandoms according to the specific reference material. We chose to include only the main Harry Potter fandom (Harry Potter/Harry Potter - FFs) and excluded smaller fandom categories, such as fanfiction about the Fantastic Beasts franchise, because our methodological approach only allows for the comparison of characters present in both reference text and fanfiction [Footnote 2].

In addition to the content-related heterogeneity expected from such a broad fandom category, the texts themselves can be differentiated based on their word count, text type, and genre. With a mean of 5,739 words per text, the Harry Potter fanfiction sub-corpus includes individual fanfictions from 22 to 4,119,120 words; the longest of these is in fact more extensive than the entire corpus of original Harry Potter novels with its 1,124,139 words. Most of the shorter texts belong to structurally limited story types such as one-shots and drabbles, which consist of a single chapter or exactly 100 words, respectively. The genre categories, which are again pre-defined by the platform, represent a mixture of more traditional notions of genre (mystery, thriller, romance, adventure) and thematic foci (pain & solace, family, friendship). The customary indication of slashes, which Cuntz-Leng defines as »a ›fannish‹ concept in which fictional characters are removed from their preferred heteronormative exegesis and transferred into self-made homoerotic utopias« (Cuntz-Leng 2017, p. 93), is addressed by the category of pairing types.

4 Frequency Distributions

In general, all larger fanfiction forums share a distinct organizational structure. To find a fanfiction of interest, readers navigate through various fandoms to find a list of recently updated texts. To search for a fanfiction to their liking, they can filter by common subcategories or tags such as genre (e.g. adventure, drama, romance), the explicitness of the text (mostly defined by an age restriction or recommendation), or the major characters of the fanfiction. With this information, readers get a first impression of the plot, be it a classic adventurous re-telling of an episode from the original books starring Harry, Ron and Hermione or a complete turn-over of the original character constellations with Harry and his teacher Snape entering a (sexual) relationship. Quantifying which characters appear to which extent in originals and fanfictions can therefore already indicate whether these characters are over- or underrepresented in fanfictions compared to the original universe. Frequency distributions are a suitable indicator for general shifts of interest between originals and fanfictions and constitute points of reference as to which shifts are connected to canonical omissions identified and filled by fanfiction writers.

Working with character names requires a preceding preprocessing step: In most cases, characters are not referred to by their full name, which causes inconsistencies and the need for disambiguation. To tackle these issues, we created a dictionary with all name variations and an identifying entity name, consisting of the standard first and second name in the German translations in all capital letters, divided by an underscore (e.g. HARRY_POTTER, RON_WEASLEY, and HERMINE_GRANGER). Replacing all occurrences of the identified name variations with the entity name, beginning with the most general variation and then moving on to pet names and more obscure synonyms, we created modified text versions for both originals and fanfictions. The issue of identically named characters, which is even used as a narrative device in the fourth installment of the series, is pragmatically resolved: By reordering the dictionary by the expected frequency of characters sharing name variations (for example, Ron Weasley and his father Arthur Weasley being referred to by »Mr. Weasley«), name variations for the character with a higher expected frequency are retrieved first. To be able to compare the frequency distributions between the two corpora (originals and fanfictions), the frequency of each entity is calculated relative to the total sum of character mentions in each corpus.
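The preprocessing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the study: the variant list `NAME_VARIANTS`, the function names, and the decision about which character receives the shared variant »Mr. Weasley« are all hypothetical. The sketch only shows that the order of the list encodes priority, since a variant that has already been replaced can no longer be matched by a later entry.

```python
import re
from collections import Counter

# Hypothetical, heavily abridged variant list (assumption for illustration).
# Order encodes priority: full names come before their parts, and for a shared
# variant the first listed character wins.
NAME_VARIANTS = [
    ("Harry Potter", "HARRY_POTTER"),
    ("Ron Weasley", "RON_WEASLEY"),
    ("Arthur Weasley", "ARTHUR_WEASLEY"),
    ("Hermine Granger", "HERMINE_GRANGER"),
    ("Mr. Weasley", "RON_WEASLEY"),     # shared variant, assigned first
    ("Mr. Weasley", "ARTHUR_WEASLEY"),  # never fires: already replaced above
    ("Harry", "HARRY_POTTER"),
    ("Ron", "RON_WEASLEY"),
    ("Hermine", "HERMINE_GRANGER"),
]


def normalize_names(text, variants=NAME_VARIANTS):
    """Replace every name variant with its entity name, in list order."""
    for variant, entity in variants:
        # Match the variant only as a whole word (case-sensitive), so that
        # entity names inserted earlier, e.g. HARRY_POTTER, are left alone.
        pattern = r"(?<!\w)" + re.escape(variant) + r"(?!\w)"
        text = re.sub(pattern, entity, text)
    return text


def relative_frequencies(text, entities):
    """Frequency of each entity relative to all entity mentions in the text."""
    counts = Counter(tok for tok in re.findall(r"\w+", text) if tok in entities)
    total = sum(counts.values())
    return {entity: count / total for entity, count in counts.items()}


ENTITIES = {entity for _, entity in NAME_VARIANTS}
sample = "Harry Potter traf Ron. Harry und Hermine lachten."
normalized = normalize_names(sample)
frequencies = relative_frequencies(normalized, ENTITIES)
```

Dividing by the sum of all character mentions, rather than by the total token count, is what makes the values comparable across two corpora of very different size.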

As expected, the frequency distribution of characters in the original texts (see Figure 1) is dominated by the protagonist Harry Potter with a relative frequency of 0.264, which means that he accounts for over 25% of all occurrences of names in the originals. With a great gap, he is then followed by his friends Ron (f = 0.088) and Hermione (f = 0.076) and the headmaster Albus Dumbledore (f = 0.048). After some more characters with relative frequencies over 0.016 (Rubeus Hagrid, Severus Snape, Voldemort, Draco Malfoy and Sirius Black), the distribution of the remaining 315 characters flattens out rather quickly.

Fig. 1 Relative frequencies of the top 50 character mentions in the Harry Potter series

Differences between the frequency distribution of character names in the originals and the fanfictions are apparent at first glance (see Figure 2). First, the predominance of the eponymous hero falters; Harry accounts for less than 9% of all character names in fanfictions. Second, the gap between the top characters is less pronounced: In contrast to the originals, Harry is more closely followed by Draco Malfoy (f = 0.053), Hermione Granger (f = 0.044), and Severus Snape (f = 0.042). After the fourth position in the frequency ranking, we find a larger gap to a group of characters led by Ron Weasley (f = 0.017), Albus Dumbledore (f = 0.016), and Sirius Black (f = 0.015). As with the frequency distribution of character names in the originals, the distribution of a total of 335 characters (the fanfictions include cross-over characters or newly created characters) then flattens out.

Fig. 2 Relative frequencies of the top 50 character mentions in fanfictions

With this first and most basic approach to analyzing character shifts, we can already circumscribe some general tendencies for Harry Potter fanfictions. Most strikingly, the attention the series’ main character receives in fanfictions diminishes significantly. As the protagonist for most of the plot, the narrative’s internal focalizer, and its most fleshed-out character, Harry Potter seems to provide fewer omissions and indeterminacies to be filled by fannish interpretations. Other central characters, such as Draco Malfoy, Hermione Granger and Severus Snape, seem to offer more such opportunities, particularly as in the canon, readers encounter them only in interactions with and through the mediation of the main character. Especially for the two canonically negatively portrayed characters of Malfoy and Snape, this can also be seen as a first indication of the fanfiction trope of the centralized villain. Several minor characters, such as the clique around Sirius Black and Harry’s father James Potter, whom novel readers only encounter in flashbacks, are represented more prominently in fanfictions, again suggesting that narrative gaps relating to events before the narrated time spark fanfiction authors’ interest.

5 Co-occurrences

The first findings indicate that some characters are more likely to be featured in fanfictions than others. The logical next step is to examine which of the characters co-occur in fanfictions most often. Co-occurrences are not only markers of adapted plot points and filled omissions, as characters are paired together who might not interact directly in the originals, but also function as a direct symptom of slashes. These predominantly non-canonical romantic and/or sexual pairings are key elements in the fanfiction repertoire which can also be linked to more general character shifts.

5.1 Pairing Shifts

In contrast to plays, which have a clearly operationalizable form of co-appearance on stage, alternatives for the operationalization of character co-appearance and thus implied interaction in narrative texts have to rely on co-occurrences of entities in text segments. We decided to implement these co-occurrences on the comparatively narrowly defined sentence level, as this limited textual frame suggests a close personal connection between characters. For this, we identified sentences in which a pair of entity names co-occurred and then calculated in how many sentences of the corpora co-occurrences were detected. With these calculated co-occurrence values between characters, the pairing differences and potential shifts across the two corpora can be identified.
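A sentence-level co-occurrence count of this kind might look like the following sketch. It assumes that names have already been replaced by entity names such as HARRY_POTTER, and it uses a deliberately naive sentence splitter; the function name and the splitting rule are assumptions of this example, not details reported for the study.

```python
import re
from collections import Counter
from itertools import combinations


def sentence_cooccurrences(text, entities):
    """Count, for each pair of entities, the sentences in which both occur."""
    counts = Counter()
    # Naive sentence split after ., ! or ? followed by whitespace (assumption;
    # a real pipeline would likely use a proper sentence tokenizer).
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        present = sorted({tok for tok in re.findall(r"\w+", sentence) if tok in entities})
        # Each unordered pair is counted once per sentence, no matter how
        # often the two names are repeated within that sentence.
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


entities = {"HARRY_POTTER", "RON_WEASLEY", "HERMINE_GRANGER"}
text = (
    "HARRY_POTTER sah RON_WEASLEY. RON_WEASLEY ging. "
    "HARRY_POTTER und RON_WEASLEY und HERMINE_GRANGER lachten!"
)
pair_counts = sentence_cooccurrences(text, entities)
```

Sorting the entities within each sentence gives every pair a canonical order, so that (HARRY_POTTER, RON_WEASLEY) and (RON_WEASLEY, HARRY_POTTER) are not counted separately.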

As a first approach to understanding differences in co-occurrences, we use hypothesis testing to fit a statistical model to our observed data. This means that we measure the difference between the observed data, in our case the distribution of character co-occurrences in fanfictions, and the canonical norm of character co-occurrences presented in the originals. These one-directional shifts indicate whether a character pairing is more common in fanfictions than in the originals, and vice versa.

Treating the values in the original and fanfiction texts as expected and observed, a suitable hypothesis test is the chi-square goodness of fit test (Pearson 1900) defined as

$$X^{2}=\sum \frac{\left(\textit{observed}-\textit{expected}\right)^{2}}{\textit{expected}}.$$

While the numerator measures the difference between the observed and the expected and therefore represents a shifting behaviour, the formula as a whole intends to take the shape of the sum-of-squared differences (Eckle-Kohler and Kohler 2017). This test can be applied to the whole set of co-occurrence shifts, but as it is especially interesting to evaluate individual character pairs, single fractions can be used to represent a shift of the respective pairing:

$$\frac{\left(\textit{observed}-\textit{expected}\right)^{2}}{\textit{expected}}$$

However, due to the square, all fractions have a positive value and therefore only indicate if there is a difference between the observed and expected data points. To distinguish whether the observed value is larger or smaller than the expected value, or, in other words, whether a pairing occurs more or less frequently in the fanfictions than in the originals, the shift direction has to be disambiguated by using Pearson residuals (Agresti 2003), which are the signed square roots of these fractions:

$$r_{j}=\frac{\textit{observed}-\textit{expected}}{\sqrt{\textit{expected}}}$$

Due to the same difference in the numerator, Pearson residuals have the same purpose as the chi-square goodness of fit test, but allow for a more fine-grained description of individual co-occurrences of characters: A negative value corresponds to a shift where the frequency of the co-occurrence of two characters is higher in the original than in the fanfictions; a positive value indicates that the co-occurrence of two characters is more prominent in fanfictions than in the originals. For a set of characters, these values can then be plotted in so-called chi-grams [Footnote 3] (see Figure 3).
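As a worked illustration of the two formulas, the following sketch computes Pearson residuals for a set of pairings and recovers the chi-square statistic as the sum of their squares. How the expected values are scaled is not specified in the text; here the shares from the reference corpus (the originals) are scaled to the total of the observed corpus (the fanfictions), which is an assumption of this example, as are the function names and the toy counts.

```python
import math


def pearson_residuals(observed, reference):
    """Signed shift per pairing: (observed - expected) / sqrt(expected)."""
    observed_total = sum(observed.values())
    reference_total = sum(reference.values())
    residuals = {}
    for pair, reference_count in reference.items():
        if reference_count == 0:
            continue  # no expectation can be formed, so no residual is defined
        # Assumption: the expected count is the reference share of the pairing
        # scaled to the size of the observed corpus.
        expected = reference_count / reference_total * observed_total
        residuals[pair] = (observed.get(pair, 0) - expected) / math.sqrt(expected)
    return residuals


def chi_square(residuals):
    """The goodness-of-fit statistic is the sum of the squared residuals."""
    return sum(r * r for r in residuals.values())


# Toy counts (invented for illustration only).
originals = {("HARRY_POTTER", "RON_WEASLEY"): 50, ("HARRY_POTTER", "DRACO_MALFOY"): 50}
fanfictions = {("HARRY_POTTER", "RON_WEASLEY"): 20, ("HARRY_POTTER", "DRACO_MALFOY"): 80}
shifts = pearson_residuals(fanfictions, originals)
statistic = chi_square(shifts)
```

In this toy case the expected count is 50 for both pairings, so the residuals are 30/√50 and −30/√50: the sign carries the direction of the shift, while the squared values sum to the chi-square statistic. Pairings that never occur in the reference corpus are skipped, because their expected value would be zero.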

Fig. 3 Chi-gram of central characters sorted by their frequency in the originals

In the heat map, a selection of the most central characters in the Harry Potter universe can be seen (sorted by the relative frequency of their entity names in the original texts). Every cell contains the Pearson residual of the pairing of two of those characters in the corresponding row and column. The underlying shades of red and blue denote a shift of decreasing or increasing number of co-occurrences towards fanfiction texts, respectively. The intensity of the color describes the strength of the shift. The most notable shifts in darker shades of red can be seen in pairings with some of the series’ main characters, especially Harry, Hermione, and Ron. This can be seen as further support for the hypothesis that minor characters are foregrounded in fanfictions: Main characters co-occur less frequently with other main characters, as they are more frequently presented in situations with canonically minor characters. As the characters are arranged according to the decreasing frequency of their entity name (see Figure 1), the color evolution from the top left to the bottom right also supports this claim: A clear trend of more and more positive Pearson residuals (i.e. the dominance of blue cells) demonstrates how in fanfiction, less frequent characters are moved to the foreground.

A character’s sum of Pearson residuals over all pairings displayed in Figure 3 offers a more general understanding of under- or over-representation in fanfictions compared to pairings in the originals (see Table 1). For example, in addition to a lower relative frequency of his entity name, the protagonist Harry also appears less frequently in pairings with other major characters. Of the central novel characters, Draco Malfoy, Remus Lupin, Ginny Weasley, and James and Lily Potter co-occur more often with other central characters in fanfiction than in the originals. This can be seen as an indication that these characters are more frequently set in relationship to others, and that they are likely to fulfill more central roles in fanfictions (both on the level of narrative and plot).
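Because the chi-gram is symmetric, a character's row sum is simply the sum of the residuals of all displayed pairings that involve this character. A minimal sketch, with a hypothetical function name and invented residual values:

```python
from collections import defaultdict


def residual_sums(residuals):
    """Sum the Pearson residuals of all pairings each character takes part in."""
    totals = defaultdict(float)
    for (first, second), residual in residuals.items():
        # A pairing contributes to the row of both characters involved.
        totals[first] += residual
        totals[second] += residual
    return dict(totals)


# Invented residuals, for illustration only.
example = {
    ("HARRY_POTTER", "RON_WEASLEY"): -2.0,
    ("DRACO_MALFOY", "HARRY_POTTER"): 1.5,
    ("DRACO_MALFOY", "RON_WEASLEY"): 0.5,
}
row_sums = residual_sums(example)
```

A negative sum then marks a character whose pairings are, on balance, less prominent in the fanfictions than in the originals, and a positive sum the opposite.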

Table 1 Sum of rows for chi-gram in Figure 3

In Figure 4, only characters with at least one pairing shift of over 0.5 are displayed (sorted by the relative frequency of their entity names in the fanfictions). In addition to the apparent under-representation of co-occurrences of the main characters Harry, Hermione, and Ron, Harry’s relationships with his mentor figures, Dumbledore, Hagrid, and, to some extent, Sirius Black, are also not as strongly featured in fanfiction. Interestingly, the novels’ main conflict between good and evil, represented by Harry and Voldemort, is also less central: Voldemort co-occurs less frequently with both Harry and Dumbledore, who are marked out to be his most prominent opponents in the originals. In contrast to this, characters who are mostly absent from the original narrative, such as Lily and James Potter, frequently co-occur with other characters in fanfictions. Similarly, a canonically flat character such as Blaise Zabini is more often paired with others, which is also true for the series’ antagonists, Snape, Draco and Lucius Malfoy.

Fig. 4 Chi-gram of characters with strongest shifts sorted by their frequency in the fanfictions

5.2 Event Shifts

To contextualize these shifts further, we introduce another level of complexity by analyzing words in collocation with character pairs to clarify the situations in which these pairs appear together. First, we calculated the relative frequencies for all words appearing in the contextual sentences for each pairing. By comparing the relative frequencies of words that appear in contextual sentences for a pairing in both corpora, i.e. the intersection of collocations for the two corpora, we are able to determine shifts in words that are associated with character pairings. To be able to determine the direction of these shifts, we again used Pearson residuals.
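The collocation comparison for a single pairing can be sketched as follows. The example assumes that the contextual sentences of the pairing have already been collected for both corpora, and that the relative frequencies themselves serve as observed (fanfiction) and expected (original) values; both points, like the function names and the whitespace tokenization, are assumptions of this illustration.

```python
import math
from collections import Counter


def relative_word_frequencies(sentences):
    """Relative frequency of each lower-cased word in a list of sentences."""
    counts = Counter(word.lower() for sentence in sentences for word in sentence.split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def collocation_shifts(fanfiction_sentences, original_sentences):
    """Pearson residuals for words found in the contexts of both corpora."""
    observed = relative_word_frequencies(fanfiction_sentences)
    expected = relative_word_frequencies(original_sentences)
    shared = observed.keys() & expected.keys()  # intersection of collocations
    return {
        word: (observed[word] - expected[word]) / math.sqrt(expected[word])
        for word in shared
    }


# Invented contextual sentences, punctuation already removed.
fanfiction_contexts = ["er lächelte leise", "er lächelte"]
original_contexts = ["er fluchte", "er lächelte zornig"]
word_shifts = collocation_shifts(fanfiction_contexts, original_contexts)
```

Only »er« and »lächelte« occur in both toy contexts; »er« has the same share in both and therefore a residual of zero, while »lächelte« receives a positive residual, i.e. a shift towards the fanfictions.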

Figure 5 depicts the top 20 shifts in both directions for the character pair of Harry and Snape. The blue bars at the top stand for collocations that occur proportionally more often in the fanfiction texts, red bars at the bottom for collocations more frequent in originals. As expected, negatively connoted words such as »Fluch« (curse), »Hass« (hate), and »zornig« (angry) occur especially often in sentences describing interactions between Harry and Snape in the original texts. In contrast to this, their relationship in the fanfictions can be claimed to be more positively portrayed, with words such as »lächelte« (smiled), »grinste« (grinned) and »nickte« (nodded) being over-represented in sentences featuring the pairing. Moreover, with words such as »bett« (bed), »zog« (pulled/attracted) and »flüsterte« (whispered) frequently appearing in fanfiction sentences implying an at least private, if not intimate relationship between the two characters, the issue of slash fanfiction is, again, evoked.

Fig. 5 Intersection of collocations for Harry and Snape (fanfictions at the top, original texts at the bottom)

When examining collocations that are unique to either one of the corpora, i.e. collocations that represent the set difference between contextual sentences of a pairing in originals and fanfictions, it is not necessary to calculate Pearson residuals, as the words only exist in one set of contextual sentences. Consequently, relative frequencies are enough to indicate the contrast between the collocations.
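For the set difference, a ranking by relative frequency suffices, as the following sketch shows. The function name, the whitespace tokenization, and the toy sentences are assumptions of this example.

```python
from collections import Counter


def unique_collocations(own_sentences, other_sentences, top_n=20):
    """Rank words that occur only in the contexts of one corpus."""
    own = Counter(word.lower() for sentence in own_sentences for word in sentence.split())
    other = {word.lower() for sentence in other_sentences for word in sentence.split()}
    total = sum(own.values())
    # Keep only words absent from the other corpus (the set difference) and
    # express their frequency relative to all words in the own contexts.
    unique = {word: count / total for word, count in own.items() if word not in other}
    return sorted(unique.items(), key=lambda item: item[1], reverse=True)[:top_n]


# Invented contextual sentences, punctuation already removed.
fanfiction_contexts = ["er küsste ihn", "kuss kuss"]
original_contexts = ["er sah ihn"]
only_in_fanfiction = unique_collocations(fanfiction_contexts, original_contexts)
```

Calling the function a second time with the two arguments swapped yields the collocations that are unique to the originals.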

Figure 6 shows the top 20 most frequent collocations in the set difference of all collocations between Harry and Snape; blue bars at the top represent shifts towards fanfictions and red bars at the bottom those towards the original texts. While the red bars denote rare and extremely specific compounds such as »Schuljahresbeginn« (the beginning of the school year), the blue bars again suggest a potential intimate relationship in fanfictions with words such as »kuss« (kiss) and »küsste« (kissed).

Fig. 6 Set difference of collocations for Harry and Snape (fanfictions at the top, original texts at the bottom)

6 Word Embeddings

6.1 Model

Word embeddings in their implementation as word2vec (Mikolov et al. 2013) have been used fairly regularly in wider digital literary studies (e.g. B. Schmidt 2015a; B. Schmidt 2015b; Heuser 2016; Grayson et al. 2017; Schöch 2022) and are often employed to tackle semantic questions, such as the extraction of especially similar or dissimilar words. Word2vec models are created via a neural net, with hyperparameters determining, among others, the vector size, the size of the context window (how many tokens before or after a certain token are considered relevant), the number of iterations, the minimum frequency of the words for which vectors are created, and the chosen architecture. For both architectures, Skip-gram (SG) and Continuous Bag of Words (CBOW), we used the gensim implementation of word2vec.

In addition to the package’s default hyperparameters (a vector size of 100, a context window of 5, 5 iterations, a minimum count of 5, and CBOW as architecture; Řehůřek 2021), we surveyed related work for variations of these settings. While some researchers, for example Ryan Heuser (2016), used the standard vector size of 100 with a Skip-gram architecture, others opted for a combination of Skip-gram with a larger vector size of 500 and a wider context window of 12 words (B. Schmidt 2015a, B. Schmidt 2015b). Grayson et al. (2017) and Christof Schöch (2022) both used versions of Skip-gram models with 300 dimensions, context windows of five to six words, and an absolute frequency threshold of 50 to 100.

With these contributions as points of reference, we compared various models and parameter settings to find the most suitable model for our corpora. Using three semantic test sets belonging to the Deep Semantic Analogies lexicon assembled by the Universität Stuttgart (Köper/C. Scheible/Schulte im Walde 2015), different parameter settings were evaluated to ensure that the chosen parameters yield a valid model. As parameters of interest, we tested the model architecture (CBOW vs Skip-gram), the vector size (100, 200, or 300), and the number of epochs (5, 10, 15, 20, or 25); all tested parameter combinations are summarized in Table 2.

Table 2 Models for testing parameters

Figure 7 and Figure 8 show each model’s accuracy for the two analogy test sets, de_trans_Google_analogies (Mikolov et al. 2013) and de_sem-para_SemRel (S. Scheible/Schulte im Walde 2014). These data sets encode semantic meaning as the analogical relationship between words, i.e. in the form of A is to B as C is to D. Using the corresponding function in the Python library gensim, the result of an analogy task in the model is matched with the solutions in the data set and accordingly, the accuracy of the correct matching is measured. The accuracy values on the left side correspond to the German version of the Google semantic/syntactic analogy data sets de_trans_Google_analogies, which was manually translated and contains 18,552 analogy tasks. On the right, accuracy values yielded with the paradigmatic semantic relation data set de_sem-para_SemRel are presented, which contains 2,462 analogies for German. Different parameter combinations yield the highest accuracies for the original and the fanfiction corpus: While model D has the highest accuracy (accuracy = 0.33) for the original corpus, model E with its additional five epochs works best for the fanfiction corpus (accuracy = 0.31). For the paradigmatic data set, several models perform similarly well for the original corpus (A, C, G, H); for the fanfiction corpus, it is model K that performs slightly better than all other models.
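To make the analogy evaluation more tangible, the following Python sketch shows the general principle on toy data: the vector offset B - A + C is compared with all remaining word vectors, and the task counts as solved if the nearest word equals D. This is a simplified stand-in and not the gensim routine itself, which additionally normalizes vectors and handles vocabulary restrictions; the function names and the three-dimensional toy vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def solve_analogy(vectors, a, b, c):
    """Return the word closest to b - a + c, excluding the three input words."""
    target = [vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

def analogy_accuracy(vectors, tasks):
    """Share of tasks (a, b, c, d) whose predicted word equals d.

    Tasks containing out-of-vocabulary words are skipped (an assumption).
    """
    known = [t for t in tasks if all(w in vectors for w in t)]
    correct = sum(solve_analogy(vectors, a, b, c) == d for a, b, c, d in known)
    return correct / len(known) if known else 0.0

# Toy vectors, invented for illustration only
vectors = {
    "mann": [1.0, 0.0, 0.0], "frau": [1.0, 1.0, 0.0],
    "könig": [1.0, 0.0, 1.0], "königin": [1.0, 1.0, 1.0],
    "zauberer": [0.0, 0.0, 1.0], "hexe": [0.0, 1.0, 1.0],
}
tasks = [
    ("mann", "frau", "könig", "königin"),   # solved
    ("mann", "frau", "zauberer", "hexe"),   # solved
    ("mann", "könig", "frau", "hexe"),      # not solved (nearest word is "königin")
    ("mann", "frau", "onkel", "tante"),     # skipped, words not in the toy model
]
accuracy = analogy_accuracy(vectors, tasks)
```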

Fig. 7 Accuracy tests for original models

Fig. 8 Accuracy tests for fanfiction models

As an additional evaluative step, we used the Schm280 (S. Schmidt 2001) data set, which contains 280 word pairs with a value describing their semantic relatedness (see Figure 9a & Figure 9b). Using the corresponding function in gensim, these values are compared with the respective similarity in the models and a correlation score is calculated to measure how well they match in total. Here, model D is identified as the most suitable parameter combination for the fanfiction corpus while for the original texts, models L and P achieve comparatively high results. Nevertheless, a summary of all evaluation tests identifies model D as the overall most suitable parameter combination for both corpora, as its accuracy and correlation values are comparatively stable across models and corpora, thus ensuring a higher degree of reliability.
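The logic of such a relatedness test can be sketched in a few lines of Python: for every rated word pair, the model's cosine similarity is computed and then correlated with the human rating. The sketch assumes a Pearson correlation (a rank correlation such as Spearman's would be an equally plausible choice); the function names, the toy vectors, and the ratings are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def relatedness_correlation(vectors, rated_pairs):
    """Correlate human relatedness ratings with model similarities.

    Pairs with out-of-vocabulary words are skipped (an assumption).
    """
    human, model = [], []
    for w1, w2, rating in rated_pairs:
        if w1 in vectors and w2 in vectors:
            human.append(rating)
            model.append(cosine(vectors[w1], vectors[w2]))
    return pearson(human, model)

# Toy vectors and ratings, invented for illustration only
vectors = {"zauberer": [1.0, 0.0], "hexe": [1.0, 0.2],
           "besen": [1.0, 1.0], "kuchen": [0.0, 1.0]}
rated_pairs = [("zauberer", "hexe", 4.0), ("zauberer", "besen", 2.0),
               ("zauberer", "kuchen", 0.5), ("zauberer", "eule", 3.0)]
score = relatedness_correlation(vectors, rated_pairs)
```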

Fig. 9 Correlation tests for original and fanfiction models. a Correlation test for original models, b Correlation test for fanfiction models

6.2 Cluster

In his paper »Sentiment Analysis for Words and Fiction Characters From the Perspective of Computational (Neuro‑)Poetics«, Arthur M. Jacobs proposes an innovative approach to sentiment analysis using word embedding models (Jacobs 2019). Jacobs computes emotional and figure personality profiles based on a fastText vector space model for the German Harry Potter book series (Skip-gram, 300 dimensions, no minimum count). With set label words representing high and low arousal values, Jacobs computes the cosine similarity between these labels and all other words, including character names. By subtracting the similarities to the negative poles from the similarities to the positive poles, he computes valence and arousal values for all words in the model. Furthermore, Jacobs defines an emotional potential as the absolute value of the valence multiplied by the arousal. In comparing the relative values of seven selected main characters, Jacobs identifies Harry, Hermione, and Hagrid, who have the highest relative valence values, as protagonists and Voldemort, who has a low valence but a high emotional potential, as the expected antagonist (Jacobs 2019).

While adopting Jacobs’ method of calculating the valence, arousal, and emotional potential for literary characters, we propose an alternative way of selecting meaningful label words that is based not on theoretical considerations but on an empirical inquiry into emotionally charged vocabulary. For this, we use the Berlin Affective Word List - Reloaded (BAWL-R), a catalog of nearly 3000 German words and, among other values, their empirically retrieved valence and arousal values (Võ et al. 2009). In contrast to Jacobs’ approach, the BAWL list contains more potential words occurring at least 50 times in the corpus; due to this larger pool of possible sentiment representations, the four polarities of high and low valence and high and low arousal can be operationalized reliably. The first step is to subset words for each category whose values are closer to the end of the given spectrum compared to all other BAWL words. More concretely, this means that, for example, the subset representing high valence consists of words that occur at least 50 times in the corpus and, at the same time, have high valence values. In order to ensure the reliability of the sentiment scores, we additionally introduced an upper threshold for the standard deviation of the words’ sentiment scores. Even after extracting the subset of top words for high valence, the problem might arise that their embedded vectors are spread across the vector space. Starting from the assumption that closeness of vectors corresponds at least partially to semantic similarity, such a scattered subset does not form a clear and compact cluster characterizing the semantic space that stands for high valence. As a consequence, summing the similarities between a character vector and this subset of vectors might lead to less interpretable results, as the measurement spans multiple unconnected semantic subspaces.
To find a suitable subspace for this purpose, the subset of vectors can itself be clustered so that groups are formed whose members are closer to each other. In other words, the intra-cluster similarity should be as high as possible. This approach allows us to define the centroid of such a cluster as a representative of high valence. The high valence aspect of a character is then given as the similarity between its vector and this centroid.
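The two notions used here, the centroid of a cluster and its intra-cluster similarity, can be illustrated with a short Python sketch. It assumes cosine similarity as the similarity measure and takes the intra-cluster similarity to be the mean similarity over all unordered pairs of cluster members; both choices, the function names, and the toy vectors are assumptions made for the illustration.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def centroid(cluster):
    """Component-wise mean of all vectors in the cluster."""
    n = len(cluster)
    return [sum(dim) / n for dim in zip(*cluster)]

def intra_cluster_similarity(cluster):
    """Mean cosine similarity over all unordered pairs of cluster members."""
    pairs = list(combinations(cluster, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)

# Toy vectors, invented for illustration only
high_valence_cluster = [[0.9, 0.1], [0.8, 0.3], [1.0, 0.2]]
character_vector = [0.7, 0.4]

representative = centroid(high_valence_cluster)
compactness = intra_cluster_similarity(high_valence_cluster)
high_valence_aspect = cosine(character_vector, representative)
```

A compact cluster yields a `compactness` value close to 1, in which case the centroid is a meaningful stand-in for all of its members.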

In this context, the number of potential clusters for a polarity is variable, as the corresponding word vectors can spread arbitrarily across the whole vector space. For automatic clustering, this greatly reduces the options in terms of suitable algorithms. More established methods such as k-means (Hartigan and Wong 1979) require the number of clusters to be specified in advance and are therefore not applicable without additional effort (although it is possible to arrive at an adequate number of clusters by optimizing the resulting cluster shape). In contrast, the chosen clustering method, affinity propagation (Dueck 2009), makes it possible to identify a suitable number of clusters automatically by evaluating the best centroids for hubs of high similarity.

After applying the clustering algorithm, multiple clusters of high intra-cluster similarity are obtained. The four clusters defining either end of the valence and arousal scales can be chosen with different strategies: The straightforward method would be to hermeneutically identify the most concise cluster for each position by evaluating its content in the context of the corresponding corpus. However, an automated ranking system judging the suitability of all clusters allows a more transparent and objective choice. For this purpose, favorable features of a cluster must be expressed numerically so that a ranking score can be used to compare the adequacy of different clusters. In the following, this score is defined as a suggestion value SVp(c) for each cluster c and polarity p (high/low valence/arousal). The intention of this score is to find the best cluster c0 by maximizing SVp(c). Even though the clustering process increases the intra-cluster similarity of the resulting clusters, this similarity should still be as high as possible to avoid using overly broad semantic areas that do not clearly represent either end of the valence/arousal spectrum. Thus, the average similarity A(c) between items in a cluster c is added to the value. Next, the corresponding polarity p must be well represented in the cluster. As a first step, the average BAWL sentiment score of all items in the cluster is calculated. On this basis, the representation score R(c, p) for a cluster c, which is added to SV, is defined in Table 3. Note that for low valence, the absolute value of the average does not have to be manipulated further, as the average value is expected to be negative. This does not apply to the value for low arousal, which has to be subtracted from the maximum value of the arousal scale.

Table 3 Calculation of representation score

As these emotional scores express higher certainty when paired with a low standard deviation as given in the BAWL list, the average standard deviation S(c, p) for cluster c and polarity p is subtracted from SV. Finally, the number of items N(c) in a cluster c should be taken into account, since a bigger cluster encloses a bigger semantic space in the word embedding. Hence, after combining the previous three values, the result is multiplied by 1.05 for every item in the cluster, i.e. it increases by 5% for every additional item.

$$SV_{p}\left(c\right):= \left(A\left(c\right)+R\left(c{,}p\right)-S\left(c{,}p\right)\right)\times 1.05^{N\left(c\right)}$$
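As a worked illustration of this formula, the following Python sketch computes the suggestion value for two candidate clusters and picks the higher-ranked one. Since Table 3 is not reproduced here, the representation score is reconstructed from the prose only: the mean score is used directly for the high polarities (an assumption), its absolute value for low valence, and the distance to the scale maximum for low arousal. The scale maximum of 5, the function names, and all numbers are likewise assumptions made for the example.

```python
AROUSAL_MAX = 5.0  # assumed upper end of the arousal scale

def representation_score(mean_score, polarity, arousal_max=AROUSAL_MAX):
    """R(c, p): how well the cluster's mean sentiment score represents polarity p."""
    if polarity in ("high_valence", "high_arousal"):
        return mean_score                 # assumption: mean used as is
    if polarity == "low_valence":
        return abs(mean_score)            # mean is expected to be negative
    if polarity == "low_arousal":
        return arousal_max - mean_score   # distance to the top of the scale
    raise ValueError(f"unknown polarity: {polarity}")

def suggestion_value(avg_similarity, mean_score, mean_sd, n_items, polarity):
    """SV_p(c) = (A(c) + R(c, p) - S(c, p)) * 1.05 ** N(c)."""
    r = representation_score(mean_score, polarity)
    return (avg_similarity + r - mean_sd) * 1.05 ** n_items

# Toy cluster statistics, invented for illustration only
clusters = {
    "cluster_1": dict(avg_similarity=0.40, mean_score=-2.1, mean_sd=0.8, n_items=4),
    "cluster_2": dict(avg_similarity=0.55, mean_score=-1.2, mean_sd=0.9, n_items=3),
}
scores = {name: suggestion_value(polarity="low_valence", **stats)
          for name, stats in clusters.items()}
best_cluster = max(scores, key=scores.get)
```

In this toy case, the first cluster wins despite its lower internal similarity, because its members sit closer to the negative end of the valence scale and it contains one more item.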

The ranking is then given by ordinal numbers, where the highest suggestion value marks the most suitable cluster. Note that this process of choosing suitable clusters is done for each corpus separately to ensure a better sentiment representation in the corresponding model. Table 4 displays the chosen clusters with their respective words.

Table 4 Chosen clusters with their respective words

In summary, the clustering process for both models, originals and fanfictions, can be described in the following steps:

  1. Subsetting the BAWL for words occurring at least 50 times in the corpus with a standard deviation below 1.

  2. Subsetting the resulting words into four polarities (high/low, arousal/valence).

  3. Clustering each polarity with the affinity propagation algorithm.

  4. Calculating the suggestion value SV for each cluster and picking the cluster with the highest value for each polarity.

  5. Calculating the centroid of each cluster as a representative.

6.3 Emotional Profiles

After having calculated the centroids for high and low valence and arousal, this section is concerned with creating emotional profiles for characters in Harry Potter. As previously mentioned, this characterization is performed using the similarity (sim) between the vector representing an entity name (c) and the previously discussed sentiment cluster centroids (e.g. high valence). As a direct comparison to Jacobs’ approach (Jacobs 2019), we first reproduced his relative depiction of percentiles of raw valence, arousal, and emotional potential values. For this, valence is given by the difference between the similarity to the high valence centroid and the similarity to the low valence centroid. Analogously, the arousal value is given by the similarities to the high and low arousal centroids. The emotional potential is calculated as the product of the absolute value of the valence and the arousal. For a vector c representing an entity name, the three values are calculated with the following functions:

$$\textit{valence}\left(c\right)=sim\left(high\,\textit{valence}{,}c\right)-sim\left(low\,\textit{valence}{,}c\right)$$
$$\textit{arousal}\left(c\right)=sim\left(high\,\textit{arousal}{,}c\right)-sim\left(low\,\textit{arousal}{,}c\right)$$
$$\textit{emotional}\,\textit{potential}\left(c\right)=\left| \textit{valence}\left(c\right)\right| \times \textit{arousal}\left(c\right)$$
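The three functions translate directly into code. The following Python sketch assumes cosine similarity for sim and uses two-dimensional toy centroids; the function names and all vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def emotional_profile(char_vec, centroids):
    """Valence, arousal, and emotional potential of one character vector."""
    valence = (cosine(centroids["high_valence"], char_vec)
               - cosine(centroids["low_valence"], char_vec))
    arousal = (cosine(centroids["high_arousal"], char_vec)
               - cosine(centroids["low_arousal"], char_vec))
    return {
        "valence": valence,
        "arousal": arousal,
        "emotional_potential": abs(valence) * arousal,
    }

# Toy centroids and character vectors, invented for illustration only
centroids = {
    "high_valence": [1.0, 0.0], "low_valence": [-1.0, 0.0],
    "high_arousal": [0.0, 1.0], "low_arousal": [0.0, -1.0],
}
positive_character = emotional_profile([1.0, 1.0], centroids)
negative_character = emotional_profile([-1.0, 1.0], centroids)
```

The two toy characters have valence values of opposite sign but the same emotional potential, which shows the effect of taking the absolute value of the valence.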

The functions are applied to all character names present as vectors in the word2vec model, i.e. generalized entity names that occur at least 50 times in the respective corpus. Using the resulting values for the 100 most frequent characters in each model, a character’s valence, arousal, and emotional potential values are then expressed relative to the 99 other most frequent characters by employing a so-called unity-based normalization (Dodge, Cox, and Commenges 2006). Re-scaling the characters’ values to an interval from 0 to 1, the normalized values still rank at the same relative positions while simplifying comparisons between characters. For the normalization, the minimal value across all characters xmin for arousal, valence, or emotional potential is subtracted from a specific character’s value x and then divided by the difference between the xmax and xmin value:

$$x'=\frac{x-x_{\min }}{x_{\max }-x_{\min }}$$
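A minimal Python sketch of this re-scaling, applied to a dictionary of raw values; the function name, the placeholder character labels, and the numbers are hypothetical.

```python
def unity_normalize(values):
    """Re-scale a mapping name -> raw value to the interval [0, 1]."""
    x_min, x_max = min(values.values()), max(values.values())
    if x_max == x_min:
        raise ValueError("all values are identical; normalization is undefined")
    return {name: (x - x_min) / (x_max - x_min) for name, x in values.items()}

# Toy raw valence values, invented for illustration only
raw_valence = {"char_a": -0.2, "char_b": 0.1, "char_c": 0.4}
normalized = unity_normalize(raw_valence)
```

The character with the lowest raw value is mapped to 0 and the one with the highest to 1, while the order of all characters in between is preserved.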

In Figure 10, the same list of characters analyzed by Jacobs is depicted in terms of our methodology and corpus (Footnote 4). Here, the valence values for Harry, Hermione, and Hagrid show a high similarity to Jacobs’ results (Jacobs 2019). While Voldemort also has a low valence value, he has a much higher arousal value in this model. In contrast to Jacobs’ model, our results indicate that Dobby and Dumbledore have higher values for all three categories, which, especially for Dobby, corresponds to our reading experience.

Fig. 10 Valence, arousal, and emotional potential values for selected characters in the originals

A more detailed depiction of the individual values for both valence and arousal scales can be found in Figure 11. Clearly, Voldemort leans strongly towards the lower end of the valence spectrum, i.e. is more negatively connoted than all other characters displayed here. Nevertheless, his similarity to the centroid of high valence is comparable to other characters. This means that Harry, Hermione, and Hagrid achieve relatively high valence values in comparison to other characters (see Figure 10) not because they are closer to the centroid of high valence, but because they are not associated with low valence.

Fig. 11 Character comparison in the model of original texts

Even though normalization just simplifies values inside a single model to compare multiple characters, it is a very useful tool when comparing values of one character across corpora. Depending on the model and the texts it was trained on, the absolute values calculated for characters can be spread over completely different number ranges and have completely different intervals. Although unity-based normalization is performed for each model separately, the relative scaling between characters uses the same interval for both originals and fanfictions. This allows a comparison of characters within a model, but also, and in our case more importantly, of changes observed between the position of a character in the two models.

In Figure 12, the same characters as in Figure 10 are displayed with their corresponding scores in fanfictions. While Dumbledore and Voldemort both achieve higher arousal scores than in the originals, Hagrid’s value for arousal is the lowest across all 100 most frequent characters in the fanfiction model. Most notably, all characters depicted here have, with the exception of Voldemort, lower emotional potential in the fanfiction model than in the originals. As the emotional potential is dependent on both valence and arousal, this change indicates a general shift for the 100 characters that are used for the relative computation of valence and arousal, most of whom are not shown here: With minor characters being fleshed out more in fanfictions, they are attributed higher valence and/or arousal values and can consequently rank higher than canonically more central characters.

Fig. 12 Valence, arousal, and emotional potential values for selected characters in the fanfiction

As with the previous analysis on the original texts, a more detailed depiction of these characters in fanfiction can be found in Figure 13. The previously mentioned notable shifts in Figure 12 can be seen here. In comparison to Figure 11, Voldemort has comparatively extreme values for both low valence and high arousal, while Hagrid is attributed the maximal value in low arousal.

Fig. 13 Character comparison in the model of fanfiction texts

To describe an actual (geometric) shift from characters’ emotional profiles in the originals to those in the fanfictions, a direct comparison across corpora can be achieved by relying on the intersection of the sets of most frequent characters used independently above: By using the 67 characters that belong to the 100 most frequent characters in both originals and fanfictions as a constant reference system, changes of valence and arousal can be visualized, because the normalized values of a specific character are not only mapped to the same interval but also calculated in contrast to the same set of characters. As a consequence, a character’s relative emotional position in different corpora can be compared directly and all characters can be displayed on the same plane.

In Figure 14, such a planar representation using the two normalized scales of valence and arousal is depicted. The red and blue dots mark the positions of the characters in the intersection in the originals and the fanfictions, respectively. Characters featured in Figures 10–13 are labeled and highlighted by a cross. With this visualization, we can not only compare characters based on their valence and arousal values, but can also quantify the intensity of shifts between the two models, represented by the geometric distance between a character’s position in the originals and the fanfictions. Hagrid moves, for example, from the quadrant of high valence and high arousal in the originals to that of high valence and low arousal. Other characters, such as Dumbledore and Draco, almost seem to switch positions: The canonical versions of both characters are slightly negatively connoted, with Draco additionally scoring a high arousal value, while Dumbledore occupies a position on the mid-to-lower end of the arousal spectrum. In the fanfictions, Dumbledore takes Draco’s original position (high arousal, mid-to-low valence), and Draco moves to the quadrant of low arousal and high valence, which puts him into the vicinity of the canonical characters of Harry and Hermione. For other characters, the shifts are less pronounced: Snape is placed only slightly more centrally on the valence scale in the fanfictions than in the originals and remains in a comparably central position on the arousal scale for both originals and fanfictions.
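The shift measure described here can be sketched in Python as follows: raw valence and arousal values are restricted to the characters present in both models, normalized per model over this shared set, and the shift is the distance between the two resulting points. The sketch assumes the Euclidean distance as the geometric distance; function names, placeholder labels, and all values are hypothetical.

```python
import math

def normalize(values):
    """Unity-based normalization of a mapping name -> value to [0, 1]."""
    lo, hi = min(values.values()), max(values.values())
    return {name: (v - lo) / (hi - lo) for name, v in values.items()}

def positions(raw, shared):
    """Normalized (valence, arousal) points, computed over the shared characters only."""
    valence = normalize({name: raw[name][0] for name in shared})
    arousal = normalize({name: raw[name][1] for name in shared})
    return {name: (valence[name], arousal[name]) for name in shared}

def character_shifts(raw_orig, raw_fanfic):
    """Euclidean distance between each shared character's two positions."""
    shared = raw_orig.keys() & raw_fanfic.keys()
    pos_orig = positions(raw_orig, shared)
    pos_fanfic = positions(raw_fanfic, shared)
    return {name: math.dist(pos_orig[name], pos_fanfic[name]) for name in shared}

def top_shifts(shifts, k=10):
    """The k largest shifts, ties broken alphabetically."""
    return sorted(shifts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]

# Toy raw (valence, arousal) values, invented for illustration only
raw_orig = {"a": (0.0, 0.0), "b": (1.0, 2.0), "c": (2.0, 4.0), "only_orig": (9.0, 9.0)}
raw_fanfic = {"a": (0.0, 10.0), "b": (5.0, 5.0), "c": (10.0, 0.0), "only_fanfic": (-9.0, -9.0)}
shifts = character_shifts(raw_orig, raw_fanfic)
largest = top_shifts(shifts, k=2)
```

Characters that occur in only one of the two models are dropped before normalization, so that both sets of points are scaled against the same reference group.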

Fig. 14 Character shifts

In Figure 15, the top 10 largest shifts of characters, determined by the distance between their portrayal in the originals and fanfictions, are depicted in the same plane. All remaining characters can be, due to the low relative frequency of their entity names (see Figure 1), defined as minor characters. This, again, confirms our assumption that the canonically minor characters are not only depicted more frequently in fanfiction (see Figure 3), but that they are also subject to re-interpretation.

Fig. 15 Top 10 biggest shifts

7 Discussion

On all levels of our analyses, the results indicate that there is in fact a general shift from original texts to fanfictions. With this, we can not only support claims presented in previous qualitative research on fanfictions, but also elaborate on the concept of aesthetics of constraint: Canonical characters and their established traits and features are not copied and pasted into new fannish narratives, but are appropriated and adapted in a specific manner to fit the narrative constraints of a fannish universe defined by already existing re-tellings and re-imaginings. As alluded to above, the choice of characters at the center of this re-imagination is not arbitrary; neither is the way they are modified and re-designed.

Our results indicate that the choice of central fanfiction characters is motivated by omissions and lack of narrative representation in the originals. The originals’ focus on Harry and his perspective as the narrative focal point inevitably influences other characters’ portrayal: Through Harry’s eyes, his friends are depicted in more detail, as well as more favorably than his antagonists. With the exception of some scenes, flashbacks, and visions, the entire narrative is told through his eyes, which limits the audience’s exposure to other characters to interactions with Harry as the main character. This seems to be an incentive for fanfiction authors: Characters who are not as directly or prominently featured in the original universe gain importance and are moved to the foreground. This applies to three categories of characters: minor characters, who are fleshed out; characters absent from the main narrative, who are included in fanfiction pre- and sequels; and antagonists, whose points of view are presented, often more positively.

Minor characters, such as Blaise Zabini and Pansy Parkinson, two Slytherin students who are part of Malfoy’s clique, are much more prominently featured in fanfictions than in the originals. It can be assumed that this discrepancy is caused by fanfiction authors’ desire for completion: With a strong focus on Gryffindor students, other Hogwarts houses, their members, and designated areas in the castle remain mostly unexplored in the original novels. A shift towards this narrative completion in fanfictions is noticeable even on very basic levels: While frequencies for the house name »Gryffindor« remain constant over both corpora, »Slytherin« appears three times as frequently in fanfictions as in the originals.

For Blaise, the mere quantitative rise in the occurrences of his character name is also connected to an increased intensity of connections to other characters. As a consequence, he is a prime example of a fanfiction shift: Examining shifts for pairings featuring Blaise, we can observe positive Pearson residuals indicating an overrepresentation in fanfictions for the three original main characters (Harry: rj=0.15, Ron: rj=0.30, Hermione: rj=0.67), as well as for other fanfiction favorites such as Draco (rj=1.80) and Snape (rj=0.20). In contrast to this, other already more established minor characters, such as Ron’s brothers Fred and George, are less frequently paired with other characters in fanfictions. The status of a minor character alone thus does not guarantee an inclusion in the fannish canon; other factors seem to be at play. In addition to being one of the few people of color (POCs) in the canonical universe (see Pianzola 2021), Blaise offers two main incentives for fanfiction authors: As a flat character, his background and identity are not overly explored or spelled out. In fact, he is so undetermined that his gender-neutral name caused some confusion among translators: Earlier Dutch translations introduce Blaise as a female character and were only updated after he was revealed to be a male character in the sixth installment of the series (Blaise Zabini 2022). Although he can thus act as a tabula rasa for fannish projections, the few canonical qualities ascribed to him (mainly his exceptional good looks and snobbishness) are a useful shorthand for a character type in fanfictions.

The same combination of a blank slate with some defining features applies to characters relevant for several prequels of the main narrative in the universe. Besides Dumbledore’s backstory and his relationship with Gellert Grindelwald, it is especially the adolescence of the so-called Marauders (James Potter, Sirius Black, Remus Lupin, Peter Pettigrew, and, by extension, Lily Potter and Severus Snape) that seems to be a focus for fanfictions. In both the relative frequencies and the co-occurrences (see Figures 1 & 2; Figures 3 & 4), these characters are, taken together, more prominently represented in fanfictions.

In general, co-occurrences can not only be interpreted as indicators of shared plot lines, but also as markers of slashes. High co-occurrences with characters who are particularly popular slash partners (Harry, Draco, Snape, and Hermione, among others) can therefore be seen as clues for additional sexualized pairings. Interestingly, the canonical pairings outlined in the originals, such as Harry and Ginny, and Ron and Hermione, do not seem to resonate with fanfiction authors. They seem to ignore these pre-defined relationships and swap them for explicitly non-canonical slashes, crossing age differences (e.g. Blaise/Snape: rj=0.20, Hermione/Snape: rj=0.20) and inverting character dynamics (e.g. Harry/Draco: rj=0.25), often along the lines of common fanfiction tropes such as »Enemies to Lovers«. In general, a character’s sexualizability seems to play an important part in their applicability for fanfictions: Characters who are, because of their looks, their personality, or their intense relationship to others, available and adaptable for slashes, tend to be more frequently featured in fanfictions. Characters who lack this potential, such as Hagrid, Dobby, and Voldemort, seem to be less central for fanfictions.

Although in general, slashes cannot be equated with romantic relationships, as these pairings are often sexualized rather than romanticized, they can be linked to a wider tendency of redeeming villains: Through perspective shifts and their involvement in sexualized and/or romanticized relationships, or more generally, when they are shown in interactions with characters other than Harry, antagonists tend to be presented less negatively. An extreme example for contrastive emotional profiles for canonical antagonists are Lucius and Narcissa Malfoy (see Figure 15), whose shifts are, however, probably mainly caused by the diversification of their plot lines. While Snape’s emotional profile does not change significantly—possibly due to his overall ambivalent portrayal, even when placed in a relationship—, Draco is portrayed more positively in fanfictions (see Table 5), indicating that his character redemption is an integral part of his fanfiction persona.

Table 5 Valence and arousal values for Severus Snape and Draco Malfoy

8 Conclusion

As a literary phenomenon, fanfiction is more than the re-phrasing of limited aspects of its reference material. Fanfictions often go much further than superficial modifications or additions and subvert their source text. For these re-workings, character conceptions and constellations are central, as they are equally influential in the broader fanfiction community. Especially for Harry Potter as the first »threshold fandom of the internet era« (Tosenberger 2014, p. 9) and one of the most dominant fandoms in juvenile and young adult fiction, characters and their identification potential for readers are important structuring devices.

With the three-part structure of our methodological approach, we were able to outline possible shifts in characters on three levels: the relative frequency of character names, the co-occurrence of the character names and collocations of these character pairings, and the characters’ emotional profiles, derived from the position of character names in word embeddings. On all levels, we found that characters who fulfill the following prerequisites are particularly likely to experience a shift, be it quantitative or qualitative, in fanfictions: Generally, minor characters gain importance in fanfictions, which corresponds to the idea of narrative gaps as motivators for fanfiction writers. Similarly, characters who canonically only appear in pre- or sequels are more frequently featured. If characters from either of these groups, but also more frequently appearing characters, such as antagonists, lend themselves well to fanfiction tropes such as »Enemies to Lovers«, they tend to be even more over-represented when compared to the original novels. Characters who undergo such a transformation, for example Draco Malfoy, are consequently also candidates for qualitative shifts. The general diversification of plot lines and contexts a character appears in is, however, also responsible for a character’s changed valence and arousal values.

The combination of three methods and the resulting three degrees of granularity helped us relate our results to theories presented in qualitative research. Especially the word embedding-based analysis of emotional profiles and characters’ valence and arousal values has proven to be a useful tool for the analysis of characters’ emotional charging. As the approach is easily adaptable for other fandoms, comparisons of other groups of texts, and for other research questions related to sentiment analysis, we will be working on the refinement of these methods in other contexts.