Orthogonal to the scheduling dilemma, an application of the data-oriented standard methodology from Computer Science/Computational Linguistics in hermeneutically oriented research contexts may run up against what one may call the subjectivity problem. As laid out in Sect. 2, within the computational disciplines the “proper use” of computational modules in an analysis chain has to adhere to the established annotation-based methodology for specifying the modules’ input/output relations: annotation guidelines have to operationalize the categories of annotation, such that an intersubjectively stable observation about language use in context is captured. By measuring inter-annotator agreement in multiple annotation experiments, the effectiveness of guidelines can even be tested empirically. Target categories leading to low levels of agreement in human annotation are generally considered problematic for data-driven modeling.
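To make the notion of empirically testing guidelines concrete, the following minimal sketch computes Cohen's kappa, one common chance-corrected agreement measure for two annotators. The choice of kappa, the function name `cohen_kappa`, and the toy label sequences are illustrative assumptions and are not taken from the studies discussed here.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    # Observed agreement: share of items on which both annotators agree.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, estimated from each annotator's label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Invented toy annotations (not real data): is a mention "focal" or not?
annotator_1 = ["focal", "focal", "other", "other", "focal", "other"]
annotator_2 = ["focal", "other", "other", "other", "focal", "focal"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

Values near 1 indicate agreement well above chance, while values near 0 indicate agreement no better than chance; which threshold counts as acceptable depends on the task.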
Now, when aspects of literary or historical text interpretation are targeted in a text study, the postulate of intersubjectively stable “results” becomes highly controversial. In the hermeneutic context, the process of text interpretation/textual criticism (targeting the relational notion of significance) is not aimed at a single, “correct” target for a given text, even if the full text production context is taken into account in all facets. Rather, throughout the reception history of important texts, new interpretations have been and will be obtained, taking different points of view such as a psychological dimension, societal considerations, production aesthetics, emphasis on intertextual links with other works, etc. In most cases, a new interpretation does not invalidate earlier interpretations. In Literary Studies, a broadly shared hypothesis is that literary texts are inherently ambiguous or “polyvalent” (Footnote 32). As a consequence, for text properties connected with interpretive differences, intersubjectively stable interpretation results cannot be assumed.
What does this imply for the applicability of the standard annotation-based methodology in the study of literary or historical texts? A plausible reaction would seem to be to completely exclude the sphere of interpretation (in the literary or historical sense) from the scope of formal/computational modeling, leaving it to traditional hermeneutics, and rather concentrate the operationalized annotation guidelines and computational modeling efforts on descriptive categories for surface-related text properties, for which intersubjective agreement can generally be reached (Footnote 33). The annotation approach in the heureCLÉA project (Gius and Jacke 2016), focusing on narrative literary texts, implements such an approach, including reconciliation steps for resolving disagreements.
At the same time, excluding from formalized annotation those text properties that are contingent on interpretive decisions seems awkward too: one of the purposes of the traditional practice of (individual, subjective) text annotation has been for readers/annotators to record their subjective reading impressions, which may provide the basis for observing systematic patterns among text properties in a second pass. The objective of systematicity in annotation and the concession that certain annotations are influenced by subjective judgements do not necessarily exclude each other. It would seem that a computationally enhanced hermeneutic approach could benefit from computational models based on subjective annotations, even though these do not follow the rules of “proper” data-driven modeling.
Base for illustration: point of view in narrative text
The desideratum to address the subjectivity problem in more lenient ways becomes particularly clear when considering the interplay across levels of “depth” in text analysis. As I argue in Kuhn (in preparation), most categories of text analysis that would under most circumstances be considered plainly descriptive (i.e., candidates for inclusion in the strict annotation methodology) can appear in ambiguous patterns, which effectively open up a disambiguation choice that depends on preference among alternative interpretations.
Consider for instance the classification of narrative point of view in narrative texts by the Austrian author Arthur Schnitzler (1862–1931). Many of his shorter narrative texts (e.g., Berta Garlan/Frau Bertha Garlan, 1900) are written in third-person narrative voice, limited to the subjective viewpoint of the focal character (Footnote 34). In his novel The Road to the Open (Der Weg ins Freie, 1908), the viewpoint of the third-person narration alternates to a certain degree between several characters’ subjective viewpoints and an objective viewpoint (predominant is narration from the narrow subjective scope of the Catholic composer George von Wergenthin-Recco, but occasional passages also take the Jewish writer Heinrich Bermann’s and other characters’ subjective viewpoint).
At first glance, the following two passages from chapter 2 and from chapter 3 appear to be typical depictions of George’s and Heinrich’s viewpoint respectively.
(6) It was striking nine from the tower of the Church of St. Michael when George stood in front of the café. He saw Rapp the critic sitting by a window not completely covered by the curtain, with a pile of papers in front of him on the table. He had just taken his glasses off his nose and was polishing them, and the dull eyes brought a look of absolute deadness into a face that was usually so alive with clever malice. Opposite him with gestures that swept over vacancy sat Gleissner the poet in all the brilliancy of his false elegance, with a colossal black cravat in which a red stone scintillated. When George, without hearing their voices, saw the lips of these two men move, while their glances wandered to and fro, he could scarcely understand how they could stand sitting opposite each other for a quarter of an hour in that cloud of hate.
[Arthur Schnitzler: The Road to the Open, translated by Horace Samuel (Footnote 35) (Chapter 2)]
(7) [George has just asked Heinrich a question]
Heinrich nodded. […] He sank into meditation for a while, thrust his cycle forward with slight impatient spurts and was soon a few paces in front again. He then began to talk again about his September tour. He thought of it again with what was almost emotion. Solitude, change of scene, movement: had he not enjoyed a threefold happiness? “I can scarcely describe to you,” he said, “the feeling of inner freedom which thrilled through me […].”
George always felt a certain embarrassment whenever Heinrich became tragic. “Perhaps we might go on a bit,” he said, and they jumped on to their machines.
[Arthur Schnitzler: The Road to the Open, translated by Horace Samuel (Chapter 3)]
Passage (6) directly and indirectly conveys sensory perceptions by George (e.g., him seeing Rapp polishing his glasses). Seemingly similar, passage (7) depicts the mental state of Heinrich, in part through direct attribution (“He thought of it again with what was almost emotion.”), in part through free indirect discourse (“Solitude, change of scene, movement: had he not enjoyed a threefold happiness?”). Annotating the subjective viewpoint accordingly would hence seem to be relatively uncontroversial (George for (6), Heinrich for (7)).
However, when looking at the novel as a whole (and at Schnitzler’s shorter, limited-viewpoint narrations), it turns out that there are many passages with an extended build-up establishing one character’s inner view, which can then be kept up for quite some time, including the perception of other characters’ actions. Since formally we find different variants of third-person narration, the question of whose viewpoint we are being presented with at a given point is quite subtle.
Assuming that Schnitzler likes to play with this uncertainty (which is an interpretive postulate!), passage (7) can be convincingly analyzed as depicting George’s perception of Heinrich’s actions: Heinrich’s pushing of the bike is deictically related to George’s position (“a few paces in front”), and we do not learn about the content of Heinrich’s meditations until he begins to speak (so we can hear him through George’s ears). The most misleading sentence is “He thought of it again with what was almost emotion.” What appears like a switch of the narrator’s voice towards Heinrich’s inner view can of course also be free indirect discourse, conveying George’s perception of Heinrich saying “I think of it again with what is almost emotion” (Footnote 36). There is nothing in the passage about Heinrich’s mental state that is not conveyed through an indirect or direct quote of what Heinrich uttered in the situation. The closing sentence “George always felt a certain embarrassment whenever Heinrich became tragic” resolves the tension, revealing whose viewpoint we were confronted with earlier on in passage (7). (Note that this is an interpretation of the aesthetics of the passage, which presumably cannot be defended on intersubjectively uncontroversial grounds, although it could, hopefully, be made plausible by appealing to fine-grained distinctions in the linguistic form and comparisons with other passages in the novel and other texts by the author, i.e., elements of a hermeneutic process.)
So, what we can observe when analyzing text passages using largely descriptive narratological categories is the following: the inherent ambiguity of many linguistic characterizations can easily lead to situations where “deeper” interpretive decisions percolate down to more superficial ones. (In our sample scenario, an interpretive hypothesis percolates down the recursive embedding of narrative levels: are we seeing one character’s inner state, or is it another character’s perception of the first one talking about his inner state?)
If one takes the subjectivity problem to exclude a formal annotation approach (because no sufficient inter-annotator agreement can be reached), then the possibility of such percolation implies that there might be no level of descriptive text analysis that is perfectly “safe” from interpretive biases. Conversely, one might take it as a plausibility argument for an approach in which certain systematic modeling efforts (or formal annotations) conditionally depend on the acceptance of some subjective pre-understanding.
Modeling subjective categorizations: Another place for rapid probing?
For the subjectivity problem, the rapid probing idea of methodological integration presented in Sect. 4.2 can also be realized on the basis of a standard NLP analysis chain, augmented with a task-specific machine learning classifier that is trained in a rapid prototyping fashion, similar to the previous section. The corpus data are narrative literary texts, and the research questions concern narratological categorizations that may be correlated with interpretive decisions.
In Kuhn (in preparation), pilot experiments on a corpus of Schnitzler texts are discussed, targeting annotation of character-specific viewpoint in the narration. The idea is to explore the implications of (different subjective) interpretive pre-assumptions by integrating them in a machine learning classifier.
The experiments adopt a straightforward mention-based operationalization of point of view that is compatible with the formally precise descriptive framework worked out by Wiebe (1994) for predicting psychological point of view in narrative texts. Her model takes the form of an algorithm that predicts at each mention of a character in the linear text sequence, whether or not the previously established point of view stays the same, or whether it is shifted to the character now mentioned. The algorithm is formulated deterministically, taking into account a differentiated set of linguistic features; so whenever there are competing interpretation options, Wiebe’s algorithm would enforce a decision. However, the decision relies on the auxiliary notion of subjective elements, which would be the natural place for including non-determinism in the algorithm.
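The mention-by-mention decision structure can be illustrated with a deliberately simplified sketch. It is not Wiebe’s (1994) algorithm, whose differentiated linguistic rules are not reproduced here. It only assumes, hypothetically, that each mention carries a single boolean flag `subjective_cue` standing in for the presence of a subjective element, and it shifts the point of view exactly when that flag is set.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    character: str
    # Hypothetical stand-in for the "subjective element" notion: True if the
    # mention occurs together with some cue of subjectivity.
    subjective_cue: bool


def track_point_of_view(mentions, initial=None):
    """Deterministically decide, at each mention, whose viewpoint holds.

    Toy rule (an assumption, not Wiebe's rule set): the point of view shifts
    to the mentioned character only if the mention carries a subjective cue;
    otherwise the previously established point of view stays the same.
    """
    current = initial
    trace = []
    for mention in mentions:
        if mention.subjective_cue:
            current = mention.character
        trace.append(current)
    return trace


# Invented toy sequence of mentions in linear text order.
sequence = [
    Mention("George", True),
    Mention("Rapp", False),
    Mention("Heinrich", True),
    Mention("George", False),
]
trace = track_point_of_view(sequence)
print(trace)
```

Because the rule is deterministic, every ambiguous case is forced into one decision; making the `subjective_cue` judgement itself a matter of annotation or probability is where non-determinism could enter.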
With modern machine learning techniques, a simple mention classification framework is a sufficient basis for rapid probing experiments that test the effects of a model following a particular approach to reading point of view in Schnitzler’s text. Linguistic indicators (explicit attribution of speech or thought, deictic elements, adverbial modifications, reference to sensory perception, etc.) and contextual build-up, including certain patterns of character references, are included in the feature set, and so are style indicators (as Brooke et al. 2017 showed in a detailed analysis of free indirect discourse in the writings of Virginia Woolf and James Joyce).
Due to the subtle interactions, we cannot expect a machine learning approach to reliably and robustly predict the “correct” subjective viewpoint. However, following the rapid probing idea, the behavior of alternative predictive models trained on manual viewpoint annotations can be systematically compared, potentially allowing for conclusions about the role different factors play; similarly, models trained on distinct texts can provide indications for a contrastive analysis.
The relevant datapoints in the machine learning classification are defined to be all mentions of characters, in their respective context. For (8), an excerpt of passage (6) above, there would for instance be seven datapoints. The annotation decision is binary: whether or not the character referred to by the mention is the focus of perception at the current point of narration. What counts is the informed reading impression, i.e., readers who have the individual impression that frequent switches of viewpoint occur will make different annotations than readers who perceive long build-ups of embedded narration levels (as discussed above). For (8), the annotation would be uncontroversial: the first two mentions refer to the focus of perception, the remaining ones do not.
(8) [George] stood in front of the café. [He] saw [Rapp the critic] sitting by a window not completely covered by the curtain, with a pile of papers in front of [him] on the table. [He] had just taken [his] glasses off [his] nose
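As a minimal sketch of how such a bracketed excerpt can be turned into datapoints, the following code extracts every bracketed mention together with a small window of context and attaches the binary annotation described above (the first two mentions focal, the others not). The function name `extract_mentions`, the dictionary keys, and the context window size are illustrative assumptions.

```python
import re

EXCERPT = (
    "[George] stood in front of the café. [He] saw [Rapp the critic] "
    "sitting by a window not completely covered by the curtain, with a "
    "pile of papers in front of [him] on the table. [He] had just taken "
    "[his] glasses off [his] nose"
)


def extract_mentions(text, window=30):
    """Return one datapoint per bracketed mention, with left/right context."""
    datapoints = []
    for match in re.finditer(r"\[([^\]]+)\]", text):
        datapoints.append({
            "mention": match.group(1),
            "left": text[max(0, match.start() - window):match.start()],
            "right": text[match.end():match.end() + window],
        })
    return datapoints


datapoints = extract_mentions(EXCERPT)

# Binary annotation as described in the text: the first two mentions refer
# to the focus of perception, the remaining ones do not.
labels = [1, 1, 0, 0, 0, 0, 0]
for datapoint, label in zip(datapoints, labels):
    datapoint["focal"] = label

print(len(datapoints), [d["mention"] for d in datapoints])
```

A reader with a different reading impression would simply supply a different `labels` list for the same datapoints.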
Subjective annotations of this kind can be performed quite fast; for a pilot study, about a thousand data points could be annotated within a few hours. Note that no distinction between explicit thought attribution, free indirect discourse, etc., is made, since by design the emphasis in this pilot study is placed on the pattern of switches in viewpoint.
On the dataset, supervised classifiers can be trained using a standard machine learning library (e.g., the Python library scikit-learn, http://scikit-learn.org/). As features, the output from multiple NLP analysis tools can be used (Footnote 37), including syntactic structure (which is important for detecting attributions of speech and thought), co-reference, but also verb class membership (which may lead to better generalizations).
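A training setup of this kind might look roughly as follows in scikit-learn. This is a sketch under stated assumptions, not the configuration used in the pilot experiments: the feature names (`is_subject`, `verb_class`, `pronoun`), their values, and the toy labels are invented for illustration, and a real feature set would be derived from the NLP tool output.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented toy datapoints: one feature dictionary per character mention.
train_features = [
    {"is_subject": 1, "verb_class": "perception", "pronoun": 0},
    {"is_subject": 1, "verb_class": "cognition", "pronoun": 1},
    {"is_subject": 0, "verb_class": "motion", "pronoun": 0},
    {"is_subject": 0, "verb_class": "communication", "pronoun": 1},
    {"is_subject": 1, "verb_class": "perception", "pronoun": 1},
    {"is_subject": 0, "verb_class": "motion", "pronoun": 1},
]
# 1 = mention refers to the focus of perception, 0 = it does not.
train_labels = [1, 1, 0, 0, 1, 0]

# DictVectorizer one-hot encodes string-valued features and passes numeric
# ones through; LogisticRegression is the binary classifier.
model = make_pipeline(DictVectorizer(sparse=False), LogisticRegression())
model.fit(train_features, train_labels)

new_mentions = [
    {"is_subject": 1, "verb_class": "perception", "pronoun": 1},
    {"is_subject": 0, "verb_class": "motion", "pronoun": 0},
]
predictions = model.predict(new_mentions)
# Probability of the "focal" class, usable as a graded viewpoint score.
scores = model.predict_proba(new_mentions)[:, 1]
print(predictions, scores)
```

Keeping the features as plain dictionaries makes it easy to add or remove feature groups between probing runs and to compare the resulting models.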
Besides training the classifier on manually annotated data, one can also experiment with systematic automatized annotations. As mentioned above, some shorter Schnitzler narrations are entirely told from a single character’s viewpoint, e.g., Berta Garlan. Using automatic co-reference resolution, with a few minutes of manual post-correction, a dataset marking all mentions referring to the main character as “focal” (Footnote 38) can be generated, and one can experiment with an “intertextual” model transfer: the automatized Berta Garlan data are used to train a supervised classifier, and this is applied to data from Road to the Open, in which psychological point of view varies more among the protagonists.
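The automatized labelling step can be sketched as follows, assuming (hypothetically) that the co-reference resolver delivers each mention as a pair of surface form and entity identifier. The function name, the entity identifiers, and the toy mentions are invented for illustration and are not taken from the novel.

```python
def label_focal_mentions(mentions, focal_entity):
    """Mark every mention of `focal_entity` as focal (1), all others as 0.

    `mentions` is a list of (surface_form, entity_id) pairs, an assumed
    simplification of co-reference resolver output.
    """
    return [
        {"mention": surface, "entity": entity,
         "focal": int(entity == focal_entity)}
        for surface, entity in mentions
    ]


# Invented toy co-reference output for a single-viewpoint narration.
coref_output = [
    ("Berta", "BERTA"),
    ("she", "BERTA"),
    ("the visitor", "OTHER"),
    ("her", "BERTA"),
    ("he", "OTHER"),
]
labelled = label_focal_mentions(coref_output, focal_entity="BERTA")
print([d["focal"] for d in labelled])
```

The resulting dictionaries can be combined with feature extraction and used as training data in the same way as manual annotations.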
Table 1 displays evaluation results of a number of different training experiments on held-out test data, the idea being to give some indication of what kinds of considerations can be made. The rows in the table ((A) through (C)) vary the training and test data in the experiments, and the columns show evaluation results for classifiers trained with Logistic Regression (Footnote 39).
In scenario (A), a classifier is trained on the automatized Berta Garlan data and tested on manually annotated test data from Road to the Open, i.e., the “intertextual” transfer setting described above. The fact that relatively decent accuracy scores (0.78) can be reached in transfer across texts seems to indicate that the model picks up a certain level of abstraction (across texts, it cannot be highly text-specific clues that help in testing).
In scenario (B), with training and test data from the same text (but with a smaller training set than in (A)), the prediction accuracy is slightly lower than in (A) (0.75) (Footnote 40). Scenario (C) includes “mixed” training data, testing on the same data as in (B). The classifier benefits from the increased amount of training data, which could be an indication of relatively homogeneous patterns of narrative viewpoint across texts.
Finally, it can be interesting for a text study aiming at interpretive aspects to check the machine-learned classifier on other texts or parts of the development text that were not taken into account in the annotation. About Road to the Open it has for example been observed that George’s mistress, Anna Rosner, is very rarely focalized. One can now look for text passages in which the classifier (trained on viewpoint contexts for other characters) nevertheless predicts a subjective viewpoint for clusters of references to Anna. Passage (9) (from the end of chapter 2) is an example of such a passage. This can be compared with passages where references to her are assigned low scores by the subjective viewpoint classifier, e.g., (10) (from the beginning of chapter 3).
(9) She had for the first time in her life the infallible feeling that there was a man in the world who could do anything he liked with her.
(10) Anna had given herself to him without indicating by a word, a look or gesture that so far as she was concerned, what was practically a new chapter in her life was now beginning.
(9) is indeed one of the few passages in which the narrator merges with Anna’s subjective perception, whereas in (10), the subjective viewpoint is intuitively George’s. So, we do find indications that machine-learned classifiers, which a scholar can adjust to his or her individual pre-understanding within a few hours, could indeed be of use for advanced text-analytical explorations.
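The search for clusters of high-scoring references to a rarely focalized character can be sketched as follows. The input format (text position, character name, classifier score), the threshold, the minimum cluster size, and all scores below are illustrative assumptions rather than values from the pilot study.

```python
def focal_clusters(scored_mentions, character, threshold=0.5, min_size=2):
    """Find runs of successive mentions of `character` scored as focal.

    `scored_mentions` is a list of (position, character, score) triples in
    text order. Mentions of other characters are skipped; a run ends when a
    mention of `character` falls below `threshold`.
    """
    clusters, current = [], []
    for position, name, score in scored_mentions:
        if name != character:
            continue
        if score >= threshold:
            current.append((position, score))
        else:
            if len(current) >= min_size:
                clusters.append(current)
            current = []
    if len(current) >= min_size:
        clusters.append(current)
    return clusters


# Invented classifier scores for mentions in text order.
scored = [
    (10, "Anna", 0.2), (14, "George", 0.9), (21, "Anna", 0.7),
    (25, "Anna", 0.8), (30, "George", 0.4), (33, "Anna", 0.6),
    (40, "Anna", 0.1),
]
clusters = focal_clusters(scored, "Anna")
print(clusters)
```

The positions of each cluster point the scholar to candidate passages for close reading; the decision whether the viewpoint is in fact Anna’s remains an interpretive one.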
Of course, if a pilot study converges on certain correlations, structural patterns, etc., tentative insights from rapid probing have to be followed up by efforts for building relevant components more systematically and subjecting the models to a strict empirical evaluation on independently obtained target annotations.