Introduction

The writing process of professional and creative writers is never linear (Fenoglio, 2015; Flower & Hayes, 1981). Text production, planning, (re)searching and revision take turns in an iterative manner. This impacts the way the writer moves through and works on different sections of the developing text. We define a writer's breaks in linear text production for 'visits' or 'jumps' to other text parts as non-linearity (cf. Perrin & Wildi, 2008). This non-linearity has become an established part of our conception of writing. Measuring non-linearity can help to find out how different writers engage with their developing manuscript over time. Interestingly, non-linearity is understudied for multi-session unprompted professional writing (Alamargot & Lebrave, 2010).

Alamargot and Lebrave (2010) flag two characteristics of literary processes of professional writers: the writer works with an open problem space and the processes stretch out over a longer time period than most other writing that is studied. These characteristics give rise to a metacognitive process management that differs markedly from that of, for instance, technical writers or short-form journalists, and in which the text produced so far plays an important role as an "external form of memory" (p. 18). Non-linearity is a way to study these patterns of interaction with the text-produced-so-far over time.

Non-linearity has been investigated through a variety of methods, including post-hoc interviews, think-aloud protocols, studying drafts and versions, and, since the 1990s, video and keystroke registrations of digital writing processes (Kellogg & Whiteford, 2012). Interestingly, although a large part of our day-to-day writing spans several days, divided over multiple sessions, most studies, including those studying non-linearity, focus on short, single-session writing (cf. Fürer, 2017; Leijten et al., 2014; Perrin & Wildi, 2008). Accordingly, the methods employed in these studies to analyse non-linearity cannot always be generalized to longer-term multi-session writing processes, such as literary writing.

Within non-linearity analyses we need to distinguish two locations: the leading edge and the point of utterance. The leading edge is the lower/outer boundary of the text, whereas the point of utterance describes the current cursor location, provided that text is being produced or revised there (Lindgren et al., 2019). Current non-linearity analyses either use the lower boundary of the document (the leading edge) as a border between linear text production and non-linear actions in pre-existing text, or require extensive manual data annotation. For example, Inputlog's built-in Revision Matrix uses the leading edge to separate text production events and insertions (Leijten & Van Waes, 2013). Similarly, the S-notation encodes each jump away from the leading edge, as well as the distances between text changes, as an indication of non-linearity. Here, jumps back to the leading edge are left out, as these do not entail distant revision, but rather new text production (Kollberg & Severinson Eklundh, 1996).

We argue that current non-linearity approaches are not suitable for multi-session writing. First, using the leading edge as a border is less informative for complex and long working documents, where new text is often produced in various locations, not necessarily including the leading edge. Secondly, current operationalisations of non-linearity take a binary approach (either something is non-linear or it is linear), which does not allow further differentiation into various types of non-linear actions.

Accordingly, this paper aims to propose a new, automatized, non-linearity analysis. Although it is specifically developed to analyse multi-session writing processes, it can equally well be applied to single-session writing. Moreover, the analysis does not strive to provide a binary label of linear production versus non-linear production. Of course, as a starting point we apply a binary distinction between linearity and non-linearity at the event level (with each event represented as a single action in the logfile). However, we introduce a non-linearity continuum at the session level. More specifically, the study situates each session along a non-linearity continuum by considering process characteristics such as the size in characters and duration of non-linear movements. We contend that this approach assists researchers in discovering more fine-grained characteristics of session management and in describing various ways in which writers interact with the text-produced-so-far (without resorting to manual annotation).

Background

Experienced writers employ non-linearity for various reasons, which shall be briefly discussed below. Next, an overview will be given of related studies measuring writers' movements through their working documents. As non-linearity is not uniformly formalised, this will be followed by an example of our own delineation of the concept.

Causes of non-linearity

Experienced writers and writers working on complicated tasks employ more non-linear writing strategies than novice writers and writers working on simple or routine tasks (Severinson Eklundh, 1994). Bereiter and Scardamalia (1987) have developed a model to account for the differences between experienced and novice writers. They distinguish between a knowledge-transforming strategy for experienced writers and a knowledge-telling strategy for novice writers. In their approach, all breaks in forward linearity are considered evidence of a knowledge-transforming strategy. During and through the act of writing, the writers' knowledge about the concepts they are writing about is transformed, and this prompts writers to rearrange and rework the text. With knowledge-telling, on the contrary, there is a linear relationship between the order in which the concepts are retrieved from memory and the order in which they are put into words. Kellogg (2008) adds to this an expert-writer strategy, 'knowledge-crafting', which also takes the readers' perspective into account. This stage of writing expertise is also part of Galbraith and Baaijen's (2018) dual process model, in which they show that knowledge can be constituted during writing, as well as retrieved. Knowledge constitution implies the potential to discover new ideas through the act of writing ('discovery writing'), rather than (only) the transformation of existing ideas, as Bereiter and Scardamalia's (1987) model suggests. Both Kellogg's (2008) and Galbraith and Baaijen's (2018) proposed strategies imply a more non-linear workflow, as they involve continuous interaction with the text-produced-so-far and intensive (distant) revision. In discovery writing, the new ideas found can lead to changes to the text-produced-so-far.

Operationalisations of non-linearity

Two main approaches can be seen in studies using non-linearity measures. Most studies employ the concept of non-linearity to measure distant revision: revisions taking place away from the leading edge (Matsuhashi, 1987). Another group of studies uses a navigation-based approach, using cursor location, with the aim of distinguishing episodes of similar activities in a process without linking non-linearity one-to-one with the number of distant revisions.

Distant revisions

Both Groenendijk et al. (2008) and Baaijen and Galbraith (2018) use a product-based linearity measure based on distant revisions, though they differ in their operationalisations. Groenendijk et al.’s (2008) linearity measure is based on comparing the line number of the final text with the timing of its production and revision. For example, line number 1 was the first line of the final product, but may have been produced after lines 2 and 3 of the final product. In contrast, Baaijen and Galbraith (2018) checked which percentage of sentences in the final text had been moved from an earlier position, taking transpositions as their unit of analysis.

These product-process approaches are very rich and informative, but have the disadvantage of requiring manual annotation. Automatic collation of text versions still struggles tremendously with the tracing of transpositions (Bleeker, 2017). Furthermore, it can be philosophically challenging to determine which sentences are 'the same' when sentences are rewritten up to 40 times (as is not unusual in multi-session long-term literary writing projects, see Bekius, 2021). In the field of genetic criticism it is common to take into account those fragments that did not 'make it' to the final version. The genetic scholar distributes their attention equally over those parts of the text that did not 'make it' and those that did (Ferrer, 2011). This is another reason not to restrict the research perspective to a product-based approach when working with literary texts.

A process-based approach to distant revisions is the S-notation, developed by Kollberg and Severinson Eklundh (1994), based on work by Matsuhashi (1987). The S-notation is a text annotation system which orders all revisions and production events. Breaks in linear text production at the leading edge are numbered chronologically and labelled as deletions and insertions. The S-notation offers insights on episodes of revisions taking place consecutively, and accordingly gives insight into local non-linearity. However, the approach makes it harder to get a general overview of the non-linearity of the full text, in the case of longer and complex documents.

In these cases, linear text production does not take place exclusively at the leading edge. First, a document can consist of a list of (preliminary) chapter headings, used as an outline underneath which new text can be produced. Second, there might be either a detached text fragment or even white space (returns) at the end of the document which is pushed along. The status of such a text fragment could be unclear (e.g., it could be notes, bibliography, key words, loose ends, or a so-called 'text graveyard' with earlier attempts at text production that are not yet deleted). The existence of these fragments theoretically implies that no text production will take place at the leading edge anymore, even though new text is being produced elsewhere. This implies that using the leading edge as a boundary between linear and non-linear activities is not the best indicator of a process's non-linearity.

There is an alternative usage of the term leading edge, namely in its conceptual meaning as: the location(s) where new content is added to the document – not necessarily situated at the bottom of the text. Lindgren et al. (2019) define the leading edge as 'anywhere' in the text where "new meaning is being created" and "at the end of insertions within previously written text where a writer inserts new ideas (not only revises form)" (p. 346). This approach requires a manual localization, for example through a selection of process fragments.

Working from the premise that writing is never 'perfectly' linear, Severinson Eklundh built in a threshold for non-linearity by focussing on the amount of high-level (words or higher) text changes. She defines non-linear writing as "high-level texts—(sic!) editing operations —insertions and substitutions of large text passages—are regularly made at a distance from the current point of inscription." (Severinson Eklundh, 1994, p. 204). In addition to counting high-level (phrase-length or larger) insertions per writing session, she calculates their distance to the (previous) point of utterance. With the focus on insertions, not deletions (though those are included in another measure in the same paper), the way in which the text grows is taken as the basis for determining non-linearity. Distinctions between revision and new text production are not explicitly made.

Baaijen and Galbraith (2018) choose a different approach, and focus on the breaches in linear text production. They opt for a range of variables, rather than one or two measures as in the previously discussed studies. Crucial to their approach is that they select those parts of the keystroke logs where new text is being produced. They observed the boundaries between P-bursts (Pause bursts or text produced in between two larger pauses), words and sentences and divided these into linear transitions ('uninterrupted transitions to the next unit of text', p. 9) and event transitions, which include 'scrolling, movements and other operations' (p. 9). Building on these transitions, a range of variables was measured to characterise each work process, together forming a 'global linearity scale'.

Most of the approaches discussed above require manual interpretation and annotation to distinguish between locations where new text is being produced and locations of revision (an adaptation of existing text). They also all select certain parts of the process logs for inclusion: for example, only the additions and not the deletions (Severinson Eklundh, 1994), or only activities taking place away from the leading edge. A benefit of these approaches is that there is a clear link between writing goals (in these cases: revising and new text production) and characteristics of the observed writing process. The focus on distant revisions implies that other types of non-linearity, triggered by other writing goals such as planning and searching, are not included.

Navigation-based

Non-linearity has also been approached in a more abstract manner, by taking the cursor position as the starting point, and not linking keystroke data to writing goals. Instead of taking the leading edge as a boundary, or manually subdividing process logs into sequences of revision and production, changes in cursor position over time are the units of analysis. This approach has mainly been taken in visualisations, such as the LS-graph (Lindgren & Sullivan, 2002), Inputlog's process graph (Leijten & van Waes, 2013), the progression diagram (Perrin, 2019), the session graph included in the GenoGraphiX-Logger (Usoof et al., 2020), and the progressive visualisation (Bécotte-Boutin et al., 2015, 2019; Caporossi & Leblay, 2011). These graphs can provide a clear overview of a single writing session, and are also useful for comparing sessions, providing these are roughly equal both in terms of duration and document length. If the sessions vary too much, the scales of the graphs will not align. For very large documents and for longer processes, it becomes more difficult to build comprehensive graphs like these. Moreover, manually comparing a large number of sessions through visual inspection of graphs is challenging. Here, quantitative analyses can offer a solution. Using the changes in cursor position as an indication of production location over time, Perrin and Wildi (2008) have developed a non-linearity measure and an accompanying visualisation. They operationalize non-linearity as distances in actual cursor position from an ideal linear progression, providing information on phases. The corpus they work with is based on workplace keystroke registrations done in Swiss newsrooms, which makes it an example of multi-session writing. To handle the 'noise' in this real-life data, they transformed both time and textual changes (not split out into revisions and production) into an ordinal scale (preserving only the chronological ordering of the cursor jumps). 
Their visualisation (Fig. 1) provides an overview of the increases and decreases in linearity during the process. It compares actual cursor movements with an 'ideal' perfectly linear forward progression of the cursor. This quantitative approach is scalable, but the high-level abstraction from the process data makes this approach less suitable to characterize different (consecutive) writing processes or to quantify different levels of non-linearity. It is difficult to link this graph to the textual development as both actual cursor location and actual time stamp have been removed. Secondly, similar to the other visualisations discussed above, when comparing hundreds of sessions, it is difficult to do so using a single graph for each session.

Fig. 1 Elementary de-trending of progression graph (reference trend: perfectly linear writer) [from Perrin and Wildi (2008, p. 8)] (reprinted with permission)

Fürer (2017) builds on the work of Perrin and Wildi to develop a classification of phases in the writing processes of newsroom journalists. The data consists of a chronological ordering of text events (‘revisions’ in his definition, but consisting of all text production and deletion events) and the location of those events in the developing document. A progression graph was used to visualize the text event chronology and location, which was used to identify non-linearity patterns. This manual annotation could be substituted by a quantitative selection procedure which has as a benefit that the rules for assigning a pattern are transparent (allowing others to judge them) and replicable.

Using automatic versioning rather than keystroke logging, Sardo et al. (preprint, 2023) study switches between text production and 'exploration' (planning and revising existing text accordingly) in the multi-session work processes of academic writers. They do this by tracking at a sentence-level which parts of the text were adjusted (added, deleted or changed) in each (one-minute) version. Then, they measure the complexity of the editing process by looking at where in the document these changes took place. By mapping out and visualising the chronologic versions and their respective edit locations, any deviation from producing the sentences in the order they take in the final text is seen as non-linearity and a sign of switches between text production and 'exploration'. This seems like a promising approach for working with intermediate text versions.

In conclusion, both distant revision-based methods and navigation-based methods have their drawbacks when considering their application to a large multi-session writing corpus. For multi-session writing data, using the leading edge as a boundary between text production and distant revision is not accurate enough as the foundation of a non-linearity analysis. Separating linear text production from distant revision manually, as several of the revision-based approaches discussed above do, is highly demanding on a large corpus. Navigation-based approaches seem a more fruitful route for long-term process data. Navigation-based visualisations provide insights into different types of session management, but are difficult to scale up to a multi-session corpus. Perrin and Wildi's (2008) quantitative analysis meets many of the criteria, but is quite an abstract measure that we expect to be more difficult to connect to different writing activities such as text production and distant revision.

An automatized and data-driven approach to non-linearity

The proposed analysis aims to characterize and classify the sessions of multi-session writing processes from a non-linearity perspective. First, a generic working definition of non-linearity will be presented, which uses the point of utterance rather than the leading edge. This is followed by a more specific approach to segmenting the process logs into 'jumps' and other events. To conclude this section, a technical implementation will be presented, in which the jumps and text bursts from the segmented process logs are the basis for several variables that together comprise the non-linearity analysis.

Delineating non-linearity

In this approach, we use the point of utterance rather than the leading edge to separate linear from non-linear operations. We consider a non-linear action as any movement (through mouse, cursor or key combination shortcuts) away from the current point of utterance, regardless of the purpose of that move (whether the move is indicating a correction, reading, consulting sources or switching to new text production elsewhere, that is insertions in the text-produced-so-far).

For example, in Fig. 2, a small sequence of three moments is represented: (A) a sentence is written in one go; (B) the cursor is then moved back into the sentence to delete the word roses, after which the cursor is at position 2; (C) the word flowers is then inserted, which places the cursor at position 3. (A) is a fully linear sequence of events. Moving the cursor from position 1 to position 2, in order to delete the word roses, is an example of a non-linear action. Typing the word flowers is a linear event, but when the cursor is moved from position 3 to 1 in order to start the next sentence, this is a non-linear movement. The locations of the point of utterance and the leading edge are shown for each of these three moments in the composition process.

Fig. 2 Example of leading edge and point of utterance using our working definitions, using the opening line from Mrs Dalloway (1925), Virginia Woolf

Approach

Our approach is a synthesis and quantification of various existing methods, which is suitable for use on multi-session and other large corpora of keystroke data. Our approach is most closely related to Perrin and Wildi’s (2008) exploration of shifts in non-linearity, but more fine-grained. Specifically, we segment our data such that cursor movements for (semi-)linear text production are separated from movements away from the point of utterance. In addition, we include several indicators to characterize jumps. For example, we include duration where they use an ordinal time scale (the order of events), to be able to measure the relative amount of time spent on non-linear events.

In our approach, all changes in cursor position (except for 1 character forward, indicating normal typing, and 1 character backwards, indicating deletion) are identified as a 'jump' away from the point of utterance. The jumps represent the switches a writer makes towards working on another part of the document. The textual bursts in between those jumps are labelled as 'linear bursts'. These are sequences of characters that are produced and deleted without leaving the point of utterance. A linear burst is not bordered by pauses (as a P-burst is, see e.g., Leijten, Van Horenbeeck & Van Waes, 2019) but by jumps. When a move away from the point of utterance is followed by subsequent cursor movements, whether through arrow keys, mouse clicks or scrolls, these are all combined into a single jump. In contrast to Baaijen and Galbraith (2018), we use the jumps as our unit of analysis, rather than the textual bursts in between. So, much like earlier approaches, we distinguish linear events from non-linear ones. However, our boundaries between the two are different, as are our subsequent analyses.
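As a rough illustration of this rule, the following Python sketch labels the transitions between logged cursor positions. It is not the authors' implementation: the function name `segment_cursor_positions`, the input format (one cursor offset per logged event) and the treatment of an unchanged position as linear are all assumptions made for the example.

```python
from typing import List, Tuple


def segment_cursor_positions(positions: List[int]) -> List[Tuple[str, int, int]]:
    """Group transitions between cursor positions into jumps and linear bursts.

    A change of +1 (typing) or -1 (deletion) is treated as linear; an unchanged
    position is also treated as linear (an assumption). Any larger change is a
    jump, and consecutive jump movements are merged into a single jump.
    Returns (label, first_event_index, last_event_index) tuples.
    """
    segments: List[Tuple[str, int, int]] = []
    for i in range(1, len(positions)):
        delta = positions[i] - positions[i - 1]
        label = "linear_burst" if abs(delta) <= 1 else "jump"
        if segments and segments[-1][0] == label:
            # Extend the current segment instead of opening a new one.
            segments[-1] = (label, segments[-1][1], i)
        else:
            segments.append((label, i - 1, i))
    return segments


# Hypothetical cursor offsets: typing, a jump made in two movements, typing,
# a jump back, and typing again.
example = [0, 1, 2, 3, 40, 38, 39, 40, 41, 3, 4]
segments = segment_cursor_positions(example)
```

Because this sketch only looks at positions, it cannot recognise a one-character arrow-key move that continues a jump; distinguishing such cases requires the event types recorded in the keystroke log.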

For these jumps, we register a number of different descriptive features in two phases.

In a first phase, we explored the data in a so-called global analysis. We selected two variables related to non-linear behaviour to assess to what extent these two variables are sufficient to explain the variance of non-linearity in a corpus of writing sessions. The variables are the mean jump size and the jump count, both relative to the document size (total characters), to control for different text lengths. Taken together, these are an approximation of Perrin and Wildi's (2008) dynamics of the cursor position, where the jump size and jump count are comparable to their distance from the ideal linear progression.

After the global analysis, we opted for a more detailed, multi-perspective exploration of non-linearity. Creating this detailed analysis was motivated by, firstly, the exploratory nature of our work on multi-session writing. It will provide the opportunity to discover which measures show the largest variance over time, and which variables will be most useful to characterise individual writing sessions. Secondly, a range of features will offer a more fecund basis to connect different types of non-linear movements to different writing goals and strategies in follow-up studies. The detailed analysis includes a range of features, including the two variables used in the global analysis, that can be used to automatically explore non-linearity in text composition, allowing for a scalable approach.

Implementation

The implementation requires a keystroke log file including timing of keystrokes, mouse movements and mouse clicks, as well as an indication of the writing session. A writing session can be defined by the author or by the researcher, for example using a time-based rule, such as a new writing session starts after one hour of inactivity.
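A time-based session rule of this kind could be sketched as follows. The function name `assign_sessions`, the millisecond timestamps and the choice to treat a gap of exactly one hour as a session break are assumptions for illustration, not details taken from the authors' notebook.

```python
from typing import List

ONE_HOUR_MS = 60 * 60 * 1000


def assign_sessions(timestamps_ms: List[int], gap_ms: int = ONE_HOUR_MS) -> List[int]:
    """Give each event a session number; a new session starts after a long gap.

    timestamps_ms must be sorted in ascending order.
    """
    session_ids: List[int] = []
    current = 0
    for i, stamp in enumerate(timestamps_ms):
        # Start a new session when the inactivity gap reaches the threshold.
        if i > 0 and stamp - timestamps_ms[i - 1] >= gap_ms:
            current += 1
        session_ids.append(current)
    return session_ids


# Hypothetical log: two events, more than an hour of inactivity, two more events.
stamps = [0, 1_000, 1_000 + ONE_HOUR_MS + 5, 1_000 + ONE_HOUR_MS + 900]
session_ids = assign_sessions(stamps)
```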

The keystroke events are first segmented based on the following rules to indicate an interruption of linear text production:

  1. A typist moves from a 'character' key press (excluding, e.g., delete/backspace) to a mouse click or combination key (to delete, select, navigate, or paste text), or vice versa. The shift key (when used for capitalisation), the caps lock key and the dead keys are considered part of text production.

  2. A typist moves from a character key press (excluding delete/backspace) to a delete/backspace key press, or vice versa.

  3. A typist moves from one mode of deletion to another (e.g., from delete key to backspace key press).

  4. A typist moves from typing a character to pressing arrow keys, or vice versa.

  5. A typist moves from the main text to a text-external source (e.g., online dictionary or webpage), or vice versa.

Sequences bounded by these interruptions of linear text production are labelled as jumps; text production bursts, which are text production events taking place consecutively at a single location; deletions, which are removals of either whitespace or visible characters; or focus events, which are events taking place outside the main document (e.g., when searching for information in Google). Cutting text is seen as a deletion, pasting text is labelled as an insertion.
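The segmentation rules and the resulting labels can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the event categories (`char`, `backspace`, `delete`, `navigation`, `focus`) are hypothetical names, a real keystroke log would need its own mapping onto them, and combination keys used to delete or paste are not modelled separately.

```python
from itertools import groupby
from typing import List, Tuple

# Hypothetical event categories mapped onto the segment labels used in the text.
SEGMENT_LABEL = {
    "char": "text_burst",
    "backspace": "deletion",
    "delete": "deletion",
    "navigation": "jump",  # arrow keys, mouse clicks, navigation shortcuts
    "focus": "focus_event",
}


def segment_events(categories: List[str]) -> List[Tuple[str, int]]:
    """Return (segment label, number of events) for each run of one category.

    A new segment starts whenever the category changes, so a switch from
    backspace to delete opens a new deletion segment (rule 3).
    """
    return [
        (SEGMENT_LABEL[category], len(list(run)))
        for category, run in groupby(categories)
    ]


# Hypothetical event stream for one short stretch of writing.
log = (
    ["char"] * 5
    + ["navigation"] * 2
    + ["backspace"] * 3
    + ["delete"]
    + ["char"] * 4
    + ["focus"]
    + ["char"] * 2
)
segments = segment_events(log)
```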

Text selection events have been categorised based on the action that follows them. If they are followed by a deletion, they are labelled as a deletion. If they are followed by a typing burst, they are a part of that typing burst. In other cases, they are seen as (part of) jumps, as they are mouse movements away from the point-of-utterance.
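The look-ahead rule for selections can be illustrated with a short sketch. The label strings and the function name `resolve_selections` are assumptions for the example; the sketch also assumes that a run of several selection segments takes the label of the first non-selection segment that follows it.

```python
from typing import List


def resolve_selections(labels: List[str]) -> List[str]:
    """Relabel 'selection' segments according to the segment that follows them."""
    resolved = list(labels)
    # Walk backwards so that consecutive selections inherit the same label.
    for i in range(len(resolved) - 1, -1, -1):
        if resolved[i] != "selection":
            continue
        following = resolved[i + 1] if i + 1 < len(resolved) else None
        if following in ("deletion", "text_burst"):
            resolved[i] = following
        else:
            # Any other continuation (or none) counts as movement away
            # from the point of utterance.
            resolved[i] = "jump"
    return resolved


labels = [
    "text_burst", "selection", "deletion",
    "selection", "text_burst",
    "selection", "jump",
    "selection",
]
resolved = resolve_selections(labels)
```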

Table 1 provides an overview of the labelled events. Based on the labelled events, several features are calculated per jump. The relative jump frequency and mean jump size are used in the global analysis. For the detailed analysis, a variety of features is calculated. A source of inspiration for these variables were visualisations of non-linearity such as Inputlog's process graph (see Fig. 4) and the progression diagram (Perrin, 2019). These combine timing and cursor position in relation to the text's development. A full list of the variables can be found in Table 2 (first and second column). The variables are calculated for each writing session. They consist of both absolute as well as relative metrics.

Table 1 A breakdown of how events from the general analysis files are segmented using the example from Fig. 2
Table 2 Overview of variables for the Detailed non-linearity analysis

In Table 2, we list all the variables used in the detailed analysis. Below, these variables are briefly discussed, in the same order as they are presented in the table. The subtitles correspond to the Aspect column. All variables are calculated per writing session.

Jump frequency

The relative jump count indicates the ratio between the number of jumps (non-linearity) and the number of characters produced (linearity). It is calculated by dividing the total number of characters produced by the number of jumps.

Jump size

The absolute jump size is the number of characters between the two points of utterance that the jump connects. As multi-session processes contain a wide range of document lengths, a relative measure was also included. It divides the size in characters of each individual jump by the document length at the time of that jump. The mean and standard deviation are then calculated per session. The detour is a second relative measure. This is a ratio obtained by dividing the summed jump size by the total number of characters typed per session.
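The three jump size measures can be written down compactly. The sketch below is an illustration only: the `Jump` record, its field names and the function `jump_size_metrics` are hypothetical, and the sample standard deviation is an assumed choice.

```python
from statistics import mean, stdev
from typing import Dict, List, NamedTuple


class Jump(NamedTuple):
    start: int       # point of utterance before the jump (characters from document start)
    end: int         # point of utterance after the jump
    doc_length: int  # document length in characters at the time of the jump


def jump_size_metrics(jumps: List[Jump], chars_typed: int) -> Dict[str, float]:
    """Per-session jump size measures; needs at least two jumps for the SD."""
    absolute = [abs(j.end - j.start) for j in jumps]
    relative = [size / j.doc_length for size, j in zip(absolute, jumps)]
    return {
        "mean_absolute": mean(absolute),
        "mean_relative": mean(relative),
        "sd_relative": stdev(relative),
        # Detour: summed jump size relative to the characters typed in the session.
        "detour": sum(absolute) / chars_typed,
    }


# Hypothetical session with three jumps and 150 typed characters.
jumps = [Jump(100, 20, 200), Jump(25, 125, 250), Jump(130, 10, 400)]
metrics = jump_size_metrics(jumps, chars_typed=150)
```

In this made-up session the detour is 2.0: the cursor travelled twice as many characters in jumps as were typed.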

Jump direction

Furthermore, we specified the jump direction, as shifts in direction may indicate a switch in underlying writing activities. Backward jumps may occur more often for revisions close to the point of utterance and for coherence revisions, whereas forward jumps could indicate episodes of distant revision.

Jump location

To further our understanding of the linguistic units more or less likely to be disrupted by jumps, the functional jump location was included. This is closely related to and inspired by Baaijen and Galbraith's (2012, 2018) measure of non-linearity. Even though we reject using the leading edge as the boundary for non-linearity, information relating to the leading edge was incorporated to define the relative jump location. Obtaining relative locations for these activities can help our understanding of non-linearity, as we can measure the bandwidth of the locations in which the author is working. For each jump, both the start- and end location (expressed in characters from the start of the document) are divided by the document length.
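Both the direction and the relative location of a jump follow directly from its start and end offsets. The sketch below shows one way to derive them; the names `JumpLocation` and `describe_jump` are hypothetical and the example values are made up.

```python
from typing import NamedTuple


class JumpLocation(NamedTuple):
    direction: str         # "backward" or "forward"
    relative_start: float  # start offset divided by document length
    relative_end: float    # end offset divided by document length


def describe_jump(start: int, end: int, doc_length: int) -> JumpLocation:
    """Express a jump's start and end as fractions of the document length."""
    if doc_length <= 0:
        raise ValueError("doc_length must be positive")
    direction = "backward" if end < start else "forward"
    return JumpLocation(direction, start / doc_length, end / doc_length)


# A jump from the middle of a 300-character document back towards its start.
location = describe_jump(start=150, end=30, doc_length=300)
```

A value close to 1 means the activity takes place near the end of the document, a value close to 0 near its beginning.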

Jump duration

To assess jump duration, two variables were used: the percentage of session time taken up by jumps, and the duration of each individual jump (for which session mean and standard deviation were derived). Pauses (and interkey intervals) are not treated separately, rather, they are included in the duration of jumps.
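As a minimal sketch, the two duration variables could be computed from the start and end times of each jump. The input format (pairs of millisecond timestamps) and the function name `jump_duration_metrics` are assumptions; pauses inside a jump are simply part of its interval, in line with the description above.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple


def jump_duration_metrics(
    jump_intervals_ms: List[Tuple[int, int]], session_duration_ms: int
) -> Dict[str, float]:
    """Share of session time spent in jumps, plus mean and SD of jump duration."""
    durations = [end - start for start, end in jump_intervals_ms]
    return {
        "pct_session_time": 100 * sum(durations) / session_duration_ms,
        "mean_ms": mean(durations),
        "sd_ms": stdev(durations),  # sample SD; needs at least two jumps
    }


# Hypothetical one-minute session containing three jumps.
intervals = [(1_000, 3_000), (10_000, 14_000), (20_000, 26_000)]
duration_metrics = jump_duration_metrics(intervals, session_duration_ms=60_000)
```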

Jump eventfulness

Concerning the jump eventfulness, a 'slope' measure is computed to capture the relation between the duration and the position change, as can be seen in visualisations such as the process graph in Inputlog. The number of clicks, keypresses and scrolls during the jumps is measured to get an indication of how 'eventful' the jumps themselves are.
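The text does not spell out the formula for the slope. One plausible reading, used in the sketch below purely as an assumption, is the cursor displacement per second of jump time; the event type names `click`, `keypress` and `scroll` are likewise hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def jump_slope(start_pos: int, end_pos: int, duration_ms: int) -> float:
    """Characters of cursor displacement per second (an assumed definition)."""
    if duration_ms <= 0:
        raise ValueError("duration_ms must be positive")
    return abs(end_pos - start_pos) / (duration_ms / 1000)


def jump_eventfulness(event_types: Iterable[str]) -> Dict[str, int]:
    """Count the clicks, keypresses and scrolls logged during one jump."""
    counts = Counter(event_types)
    return {kind: counts.get(kind, 0) for kind in ("click", "keypress", "scroll")}


# A two-second jump of 400 characters made with three scrolls, a click and a keypress.
slope = jump_slope(start_pos=500, end_pos=100, duration_ms=2_000)
events = jump_eventfulness(["scroll", "scroll", "click", "keypress", "scroll"])
```

Under this reading, a steep slope corresponds to a large displacement in little time, and a shallow slope to a slow, short move.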

Text burst size

The text burst size measure indicates how fragmented text production is. The bursts are demarcated by shifts to either a jump, a focus event, or a deletion.

Text burst location

Similar to the jump location, this measure was designed to grasp the bandwidth of the author's actions in the document, or how close to the leading edge the text bursts are taking place.

Text burst duration

Similar to the jump duration variables, the text burst duration is measured using two variables. First, the percentage of session time that was spent on producing text bursts, and secondly, the duration of each text burst was calculated. For these durations, the time between the first key press of the burst and the last key release of the burst was calculated. Pauses within the burst were included in this duration metric. Pauses at the beginning and at the end of a linear burst are removed.
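The burst duration rule (first key press to last key release, with internal pauses included) can be sketched as follows. The input format, a list of (press, release) timestamps in milliseconds per burst, and both function names are assumptions for illustration.

```python
from typing import List, Tuple

KeyEvent = Tuple[int, int]  # (press time, release time) in milliseconds


def burst_duration_ms(key_events: List[KeyEvent]) -> int:
    """Time from the first key press to the last key release of one burst.

    Pauses between keys inside the burst are included; time before the first
    press and after the last release is not.
    """
    first_press = min(press for press, _ in key_events)
    last_release = max(release for _, release in key_events)
    return last_release - first_press


def pct_session_time_in_bursts(bursts: List[List[KeyEvent]], session_duration_ms: int) -> float:
    """Percentage of the session spent producing text bursts."""
    total = sum(burst_duration_ms(burst) for burst in bursts)
    return 100 * total / session_duration_ms


# Two hypothetical bursts; the first contains a pause of several seconds.
bursts = [
    [(1_000, 1_080), (1_200, 1_290), (5_000, 5_100)],
    [(20_000, 20_090), (20_500, 20_600)],
]
first_duration = burst_duration_ms(bursts[0])
burst_share = pct_session_time_in_bursts(bursts, session_duration_ms=47_000)
```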

The full non-linearity analysis, including the segmentation of the general Inputlog files into jumps, text bursts, deletions, focus events and replacements, as well as the extraction of the jump characteristics, is available as an annotated R Markdown notebook at https://github.com/FloorBuschenhenke/NonLinearityMethod. The analysis can be applied to left-to-right languages, with no direct upper limit for the size of the corpus. The corpus needs to consist of at least two jumps and one text burst to be able to calculate all metrics.

Proof of concept

In this section, we provide an exploratory non-linearity analysis based on the approach described above. We use this analysis to identify different patterns of interaction with the text-produced-so-far over a long-term writing process. Tracking and comparing non-linearity between writing sessions will help in identifying different types of writing sessions and will enable us to describe session management strategies in multi-session writing. We also take into account the temporal distribution of these differences in non-linearity as we expect, for instance, that non-linearity increases in the final editing stages. We therefore aimed to determine the effect of timing on non-linearity.

As non-linearity in multi-session writing is not yet widely studied in an empirical manner, we opted for an exploratory cluster analysis rather than a statistical testing of hypotheses. The multi-session process data collected from the novel writer Gie Bogaert composing his novel Roosevelt were used for this analysis. First, we briefly describe the corpus, and then we present a more global and a more detailed analysis to illustrate the proposed non-linearity approach.

Method

Materials

The Flemish novelist Gie Bogaert kindly agreed to participate in this study and to track the writing process of his tenth novel, Roosevelt, written in Dutch. From the summer of 2013 until the end of 2015, he diligently used Inputlog (Leijten & Van Waes, 2013) to record his writing sessions until reaching one of the last versions of his book, creating a corpus of 386 writing sessions spanning a total of 276 h of keystroke data. The sessions vary in length from a couple of minutes to over three hours. The final published text consisted of 34,105 words, or 224 pages. The corresponding keystroke data are not openly available due to privacy concerns. However, a digital edition of a segment from the writing process, which includes a detailed picture of textual changes as well as the author's movements through the document, will become available via Bekius (2023).

From interviews with Bogaert and from studying the drafts and notes comprising his work process, a general picture emerged of two distinct phases in his writing. Before starting the keystroke-logged composition process, Bogaert spent a long period working on character development, research, plot structure and style (the various narrators each speak in a distinct voice). This pre-composition phase is captured in a paper notebook. Once the composition proper started, the entire chapter structure of the book had already been devised. Given that planning and revising are recursive, and sometimes overlapping, writing activities, this presumably results in extensive revision (leading to non-linearity). In addition, this planning stage allows for a non-linear method of composition. As the various chapters have already been conceptualised, one could start writing chapter 5, move on to chapter 2, etc. Alternatively, in this case, as the novel is built on intertwining stories told by different characters, each character could be elaborated separately. Bogaert's stage-based process thus provides a well-suited case for studying fluctuations in the non-linearity of writing sessions with the current approach.

The writer worked in a single document throughout the process. The intermediate written products contain a work-in-progress mix of structuring elements (chapter and paragraph headings; metamarks, i.e. small notes about the writing, mostly planning ahead); and the 'actual' text, fragments of which are first drafted in a telegram-style with some gaps and notes, incomplete sentences, and then later fully fleshed out by the writer.

Data preparation and data analysis

After applying the non-linearity analysis, we conducted some additional data cleaning steps. All jumps with a duration of more than three times the standard deviation for that session were removed for the calculation of jump duration. This was done because many long breaks were taken from the writing, ranging from 5 to sometimes 30 min. These breaks were most likely not directly related to the writing activity (cf. the 'down time' concept in Leijten et al., 2014) and therefore distorted our view of the duration of jumps. In total, 927 'break-like' jumps (2.7%) were removed. After this, only writing sessions with at least two jumps and one typing burst were included in the analysis, resulting in 373 of the 386 writing sessions.
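A minimal sketch of this cleaning step is given below. It reads the criterion literally (duration greater than three times the session's standard deviation); an alternative reading, mean plus three standard deviations, is exposed through the hypothetical `offset_by_mean` flag. Function and parameter names are illustrative and not taken from the authors' notebook.

```python
from statistics import mean, stdev


def drop_break_like_jumps(durations_s, n_sd=3.0, offset_by_mean=False):
    """Remove jumps whose duration exceeds n_sd standard deviations.

    `durations_s` holds the jump durations (seconds) of ONE session.
    With offset_by_mean=True the cutoff becomes mean + n_sd * SD instead.
    """
    if len(durations_s) < 2:
        return list(durations_s)  # SD is undefined for fewer than two jumps
    cutoff = n_sd * stdev(durations_s)
    if offset_by_mean:
        cutoff += mean(durations_s)
    return [d for d in durations_s if d <= cutoff]


# Ten ordinary jumps and one 30-minute break recorded as a jump.
session = [2.0, 3.5, 1.2, 4.0, 2.8, 3.1, 1.9, 2.4, 3.3, 2.7, 1800.0]
kept = drop_break_like_jumps(session)
```

The filter is applied per session, so a writer's unusually slow session does not lose jumps merely because other sessions were faster.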

In this proof of concept we explored our data in two steps. First, we present the global analysis, and then we present a more detailed approach, including cluster analysis.

In the more detailed analysis, we extracted the variables presented in Table 1 for each of the 373 writing sessions. To control for multi-collinearity, Pearson bivariate correlation was applied as a variable selection method. As a result, 10 of the 39 variables were removed because they correlated more than 0.90 with other variables, including the relative jump count from the global analysis. This left 29 variables for the cluster analysis.
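The correlation-based selection can be sketched as follows. The source does not state which member of a highly correlated pair was dropped, so the greedy rule used here (keep the variable encountered first) is an assumption, as are the variable names and toy values.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # undefined (division by zero) for constant columns


def drop_collinear(columns, threshold=0.90):
    """Keep a variable only if |r| with every already-kept variable <= threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# One value per session for each (hypothetical) variable.
columns = {
    "jump_count": [10, 20, 30, 40, 50],
    "jump_count_relative": [1.1, 2.0, 3.2, 3.9, 5.1],  # nearly duplicates jump_count
    "burst_duration": [5, 1, 4, 2, 3],
}
kept = drop_collinear(columns)
```

Because the rule is order-dependent, listing the variables in a different order can retain a different member of each correlated pair.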

To explore the similarities and differences between the writing sessions, a hierarchical cluster analysis was run on the sessions using the 'hclust' function in R. Ward's method was used, as it is quite robust for outliers, focusing on the sum of squares of the distances between pairs of data points. For each of the clusters, we identified which variables were responsible for the largest differences between clusters.
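To make the merge criterion concrete, the sketch below implements a naive agglomerative clustering with Ward's criterion in plain Python. It is an illustration only: the authors used R's hclust, and the source does not state which Ward variant was chosen or whether the variables were standardised first. The function name and toy data are hypothetical.

```python
def ward_clusters(points, k):
    """Naive agglomerative clustering with Ward's criterion.

    Repeatedly merges the two clusters whose union gives the smallest
    increase in within-cluster sum of squares, until k clusters remain.
    Returns one integer label per input point.
    """
    clusters = [([i], list(p)) for i, p in enumerate(points)]  # (members, centroid)
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (ma, ca), (mb, cb) = clusters[a], clusters[b]
                d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
                # Increase in within-cluster sum of squares if a and b merge.
                cost = len(ma) * len(mb) / (len(ma) + len(mb)) * d2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (ma, ca), (mb, cb) = clusters[a], clusters[b]
        n = len(ma) + len(mb)
        centroid = [(len(ma) * x + len(mb) * y) / n for x, y in zip(ca, cb)]
        clusters[a] = (ma + mb, centroid)
        del clusters[b]
    labels = [0] * len(points)
    for label, (members, _) in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Six toy sessions described by two (hypothetical) variables.
sessions = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (5.0, 5.1), (5.2, 4.9), (9.0, 0.5)]
labels = ward_clusters(sessions, k=3)
```

This version rescans all cluster pairs at every step and is only meant for small examples; for a few hundred sessions a library implementation is the practical choice.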

Results

Global analysis

In order to compare the writing sessions on their non-linearity and on the possible chronological variance of that non-linearity, we conducted a global analysis as a first exploration of the large data set. For both the relative jump size and the relative jump count, density plots were generated to examine the distribution of the mean scores (per session). The sessions had a median relative jump size of 0.0037 (SD = 0.03), where jump size was divided by document size in characters (for a hypothetical document of 1000 characters, this corresponds to a median jump size of 3.7 characters). The relative jump count had a median of 7.4 jumps per character produced (SD = 7.6). These values indicate the general level of non-linearity of this writer for this process. Without other processes to compare these scores with, it is not possible to say whether this is 'high' or not. The indicators show highly skewed distributions, as visualised in Fig. 3, with the vast majority of relative jump sizes and relative jump counts being small, and a long right tail of very high values. These variables help us to find the small set of clearly distinguishable, highly non-linear sessions. However, the bulk of the sessions prove to be quite similar. In other words, the single-perspective approach to non-linearity does not seem to allow for a well-differentiated characterization of Bogaert's writing sessions. As that is what we wish to accomplish, the detailed analysis (4.2.2) contains a larger range of variables.
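The normalisation behind the relative jump size can be written out as a short sketch. The tuple format and the toy values are assumptions; only the division by document size and the 0.0037 figure come from the text.

```python
from statistics import mean


def relative_jump_size(jump_size_chars, document_size_chars):
    """Jump size as a fraction of the document size at the time of the jump."""
    return abs(jump_size_chars) / document_size_chars


# (jump size in characters, document size in characters) for one toy session.
session_jumps = [(12, 4_000), (3, 4_010), (150, 4_020)]
session_mean = mean([relative_jump_size(s, d) for s, d in session_jumps])

# The reported median of 0.0037, rescaled to a hypothetical 1000-character document.
equivalent_chars = 0.0037 * 1000
```

Dividing by document size makes early sessions (short document) and late sessions (long document) comparable, at the cost of shrinking the same absolute jump as the manuscript grows.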

Fig. 3

Left and right panels show the skewed distributions of the relative jump size and the relative jump count, as used in the global approach. (As the x-axes of Fig. 3 have quite different ranges, the values on the y-axes also differ. The area underneath each curve equals 1, so for the mean relative jump size on the left, where values on the x-axis are below 1, the values on the y-axis rise above 1.)

Detailed analysis through clustering

Thereafter, a cluster analysis was performed on 373 sessions using 29 variables from the detailed analysis (see Table 3). The clustering was optimized by setting a threshold for the maximum number of sessions that were allowed in any one cluster, where clusters could not contain more than half of all sessions. Thus, the maximum cluster size was set at 186 sessions. Using an iterative approach, the number of clusters was set at nine, as this allowed for an optimal differentiation between the sessions while maintaining interpretative strength.
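The size cap can be sketched as a simple search over candidate cluster counts. This covers only the cap itself; the choice of nine clusters also rested on interpretability, which is not captured here. The callable `labels_for_k` (returning one label per session for a given number of clusters) and the toy cuts are hypothetical.

```python
from collections import Counter


def smallest_k_within_cap(labels_for_k, n_sessions, k_max=20):
    """Return the smallest k for which no cluster holds more than half of all sessions."""
    cap = n_sessions // 2  # for 373 sessions this gives 186
    for k in range(2, k_max + 1):
        sizes = Counter(labels_for_k(k))
        if max(sizes.values()) <= cap:
            return k
    return None  # no candidate up to k_max satisfies the cap


# Toy cluster assignments for 10 sessions at k = 2, 3 and 4.
toy_cuts = {
    2: [0] * 8 + [1] * 2,
    3: [0] * 6 + [1] * 2 + [2] * 2,
    4: [0] * 5 + [1] * 1 + [2] * 2 + [3] * 2,
}
chosen_k = smallest_k_within_cap(toy_cuts.__getitem__, n_sessions=10, k_max=4)
```

In practice the returned value would serve as a lower bound, after which larger values of k can be inspected by hand for interpretability.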

Table 3 Means for all variables per cluster

The clusters vary widely in size. The first four clusters were very small (all < 11 sessions) and showed extremely high non-linearity. Although these sessions could be interesting because of their divergence from the general picture, and outliers like these are sometimes very informative, for the purpose of clarity and brevity we will not further discuss them. Table 3 shows the means of all variables for the remaining five clusters, as well as the cluster size. Below, these five clusters are described in more detail.

Medium-sized clusters

There are two medium-sized clusters of 20 and 33 sessions respectively.

  • High-detour cluster: The sessions in this cluster focus on jumps rather than typing bursts, as shown by the highest detour value. In addition, the jumps are performed very fast, as the slope is the highest of all clusters. This cluster has a very large absolute jump size. As it did not have very high backwards jumps, this means that the forwards jumps are rather large.

  • Typing cluster: In contrast, this cluster is focused on typing, with typing bursts that take more than twice as long as in all other clusters. As the size in characters is highly correlated with duration, it can be assumed that larger text fragments are being produced in one burst. The jumps have both the lowest mean count of events and the lowest mean count of scroll movements. Also, the smallest share of session time is spent on jumps. The absolute jump size (not included in the clustering) was also smallest for this cluster. Lastly, the size of the backwards jumps is the second lowest. This is therefore the most linear of the clusters.

Baseline clusters

This leaves us with three large clusters. These clusters do not contain any extreme values and differ less from one another than the other clusters do.

  • More-jump-time cluster: A notable feature of this cluster is that half of the session time was spent on jumps, compared to 29% and 35% in the other two large clusters. The backwards jumps are larger, and jumps are more eventful, that is, they include more scrolls and key presses/mouse clicks. This cluster contains more jumps per character typed, as the detour is the highest of the three. Furthermore, the mean jump slope was the highest.

  • Minimalist cluster: This cluster is characterized by simple jumps (low on events and scrolls), short backwards jumps, and the least time spent on jumps. The detour is high, so although not that much time is spent on jumps, there is even less time spent on text production. Possibly these sessions were used for text evaluation, deletions and planning.

  • Middle-ground cluster: The low detour and high typing duration in this cluster point to a more even split between typing and non-linear jumps than in the other clusters. There is a slightly higher proportion of backwards jumps than in the other two large clusters.

To illustrate the characteristics of each cluster more concretely, we added process graphs of one example session per cluster (Fig. 4). A process graph shows the cursor position, the number of characters produced, and the document length at any moment in time during the writing process. The x-axis represents the timeline. These example sessions have been selected because they occurred closely together on the global process timeline. Moreover, they are similar in total document length, making it easier to compare and contrast the sessions.

Fig. 4

Process graphs. Session 33 is part of the More-jump-time cluster, session 27 is part of the Minimalist cluster, and session 29 is part of the Middle-ground cluster

The More-jump-time cluster is represented by sample session 33. In this session, the text is not significantly expanded. The process graph for this session shows one large backwards jump, typical of this cluster, which is followed by a staircase-shaped trajectory back 'up' through the text, with small pauses and modest text expansion. In contrast, the sample sessions from the Minimalist and Middle-ground clusters (Fig. 4) contain more small jumps in quick succession, pointing to local text alterations. The detour is represented in the process graph as the difference in line length between the characters produced and the cursor position: the characters-produced line stays rather flat, whereas the cursor position shows large movements. The steep slope typical of the More-jump-time cluster can be found in the process graph in the verticality of the cursor position line during jumps. In this session, larger parts of the document are traversed more quickly than in the other sessions.

In the process graph for sample session 27, the document shrinks while new text is produced, pointing to a focus on short revisions at different locations in the text (see grey line) rather than the production of new material. It matches the overall Minimalist cluster picture of a high detour caused by relatively little text production and frequent but short jumps. Compared to the other two sample sessions, a large part of the process is spent in revision-rich upwards movements.

Sample session 29 presents a graph quite similar to that of session 27. The detour for this cluster was smaller than for the other two, and the typing bursts had the longest duration of the three large clusters. The document hardly grows, but a considerable number of characters were produced, pointing to a highly iterative process in this session.

Looking at the sample sessions in their chronological order (other sessions were present in between, though), it is clear that in session 33, the segment of the document that was edited in the previous sessions is revisited—but now without many additional revisions or text growth. This sheds light on the phasing of non-linearity and its connection to writers' revision strategies.

The cluster analysis brought to the fore five clusters with differing non-linearity characteristics. The most linear cluster contained sessions in which the writer focussed on text production. The other four clusters were all highly non-linear, but in various ways. For example, in one cluster, the High-detour sessions, the writer moved through vast areas of the working document and made relatively few text changes. Perhaps the writer wanted to refresh his memory, check for coherence and plan ahead by reading the text-produced-so-far. In the More-jump-time cluster, half of the working time was devoted to jumps, also indicating a central role for the text-produced-so-far, but here the detour was lower, suggesting that more textual changes were made and less of the document was traversed compared to the High-detour sessions. Comparing this range of variables shows that the writer had several distinct types of writing sessions, which hint at the varying underlying motivations of the writer to 'jump' through the document: episodes where reading and reflection might have been key, and episodes where textual changes were more important.

To determine the effects of the timing of the writing session on non-linearity, we examined the temporal distribution of the clusters (see Fig. 5A, B). There does not seem to be a clear temporal pattern across the clusters, as all clusters include both early and late sessions. However, most of the sessions from the High-detour cluster take place from session number 250 onward, in the last third of the process, and this cluster does not appear among the first 100 sessions at all. The Minimalist cluster has a larger presence in the first half of the process, before session 200. The Typing cluster is largely absent in the period around sessions 300 to 400. This is when the High-detour cluster 'takes over', together with the More-jump-time cluster. This suggests that the focus of the writer switched from text expansion towards (distant) revisions, leading to a different non-linearity profile. It further indicates that it might be of interest to study the sessions before and after such a switch in more depth. The textual development and the writer's activities (distant revisions, reading, text production, researching) around such a switch could provide more insight into why the switch was made.
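The cumulative session counts per cluster, as plotted in a line graph of this kind, can be derived from the chronological sequence of cluster labels. The sketch below assumes one label per session in recording order; the function name and the toy sequence are hypothetical.

```python
def cumulative_counts(cluster_by_session):
    """Running number of sessions per cluster, in chronological order.

    Returns {cluster: [count after session 1, count after session 2, ...]}.
    """
    names = sorted(set(cluster_by_session))
    running = {name: 0 for name in names}
    series = {name: [] for name in names}
    for label in cluster_by_session:
        running[label] += 1
        for name in names:
            series[name].append(running[name])
    return series


labels = ["Typing", "Minimalist", "Typing", "Minimalist", "High-detour", "High-detour"]
series = cumulative_counts(labels)
```

A cluster that is absent during a stretch of the process shows up as a flat segment in its series, and a cluster that 'takes over' shows up as a steep one.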

Fig. 5

A: Line graph of cumulative session count represented in each cluster. B: Boxplots showing distribution of sessions over the clusters, including scatter plot of the sessions

To conclude, this exploratory case study showed that this writer switched between five distinctive non-linearity profiles throughout his process, which spanned nearly 400 sessions. One of these profiles was clearly focussed on text production, and the others showed different non-linearity patterns, which could in part, and tentatively, be linked to reading (for memory refreshment) and to distant revision activities. The expectation that non-linearity would increase in a final editing stage is confirmed by the analysis of the distribution of the clusters throughout the timespan. The most non-linear cluster, the High-detour cluster, was mostly present in the last stage of the process. However, the other switches between different types of clusters cannot be easily linked to session management without a further study of the textual development and the writer's activities.

General discussion and conclusion

Non-linearity analysis can shed light on the role of the text-produced-so-far and the role of the monitor in process models. In addition, non-linearity could be used as an indication of cognitive complexity, as Severinson Eklundh (1996) found a task effect where more complex writing tasks correlated with more non-linearity of the process. This study proposed a novel non-linearity analysis, which can be used to characterize multi-session writing processes. The analysis extracts jumps away from the point of utterance, and calculates a set of variables related to these jumps. This allows for a detailed characterisation of non-linearity. In the proof of concept, this analysis was applied to Gie Bogaert's writing process spanning two-and-a-half years and nearly 400 keystroke-logged writing sessions. The descriptive statistics, as well as a cluster analysis, offered new insights into session management, showing the variance in the organisation of this long-term writing process. This supports the usefulness of this approach for studying non-linearity in writing. Although the analysis was created for long-term (single-writer) processes, it could equally be applied to group studies comparing single-session processes of individual writers.

Linking non-linearity to process management

A next step would be to develop complementary hermeneutical and data-driven analyses. To close the gap between data-driven and qualitative research into writing, manual interpretation is necessary. The quantitative non-linearity analysis and the resulting visualisations could be used as a tool for exploring the materials and selecting those process segments that are most worthy of manual inspection. Hermeneutical interpretation of the textual changes can shed light on the links between non-linearity patterns and process management (see, for instance, Bowen & Van Waes, 2020). For example, implementing new ideas into the text may have a distinctive non-linearity signature compared to stylistic revisions. Natural language processing (NLP) offers exciting possibilities, such as word-to-vector (word2vec) neural networks, which could be used to calculate the semantic distance between two texts (or versions) and, therefore, the extent of rewriting (see Lang, 2019, for an application to literary texts). Semantic distances could be calculated for the start and end points of cursor jumps to see whether, and to what extent, the text fragments in question are semantically related. For our case study, it would be insightful to see whether non-linear patterns link up segments written from the perspective of the same character.
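As a rough illustration of a semantic distance between the text at the start and end point of a jump, the sketch below averages word vectors per fragment and takes the cosine distance. The two-dimensional embeddings are invented toy values; in a real application they would come from a trained word-vector model. Averaging is one simple choice among several, not a method described in the source.

```python
from math import sqrt


def mean_vector(tokens, embeddings):
    """Average the vectors of all tokens that have an embedding."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    dims = len(next(iter(embeddings.values())))
    if not vectors:
        return [0.0] * dims
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dims)]


def cosine_distance(u, v):
    """1 - cosine similarity; undefined if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


# Toy embeddings (hypothetical values, for illustration only).
embeddings = {"sea": [1.0, 0.0], "ocean": [0.9, 0.1], "tax": [0.0, 1.0]}

jump_start = mean_vector(["the", "sea"], embeddings)
near = cosine_distance(jump_start, mean_vector(["ocean"], embeddings))
far = cosine_distance(jump_start, mean_vector(["tax"], embeddings))
```

A small distance between the fragments at both ends of a jump would suggest that the writer moved between semantically related passages, for instance segments narrated by the same character.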

A follow-up step is to demarcate text-in-progress from segments of the document that have other functions, such as notes, comments and plans. By manually localising these segments, jump behaviour to and from these segments can be studied and non-linearity can be more firmly connected to the writers' strategies. Additionally, taking the internal text structure into account by, for example, annotating chapter boundaries and paragraphs or scenes, can further elucidate non-linear movements in the keystroke data. Flagging telegram-style writing would be possible using NLP-tools instead of manual annotation, but the other text types (discarded fragments, notes and plans) require manual localisation.

Limitations and fine-tuning of variable selection

For researchers applying this analysis to new samples, we would like to address some decisions that can be made to fit the approach to the research question at hand. First, the non-linearity analysis generates a large number of both absolute and relative variables. We optimised our variable selection for the proof-of-concept's cluster analysis, but other studies could use other selections of the variables to better match their objectives. For example, L2-learners focus more on low-level issues and less on high-level issues (e.g., Barkoui, 2019). Using the functional jump location variable can show whether text production is interrupted more often by non-linear jumps at lower linguistic levels (such as within-word) in L2 writing than in L1 writing. Working with the jump size variables could demonstrate this focus on word finding and phrase building if the median jump size is consistently lower in L2 writing than in L1 writing.

The global approach, with only two variables, offers a quick insight, but more studies on a diverse corpus are needed to select the most suitable variables for this global approach, finding the optimal trade-off between discriminatory power and simplicity of application. Within the detailed approach, a number of variables correlated too highly for inclusion in the clustering set. Therefore, more research is needed to know if the same holds for other (semi) long-term processes, and to refine the variable set by excluding any consistently high-correlating variables.

Second, at the moment, events of text selection that are not followed by deletion of content or typing of new content have been categorised as jumps and are, therefore, part of non-linearity. However, text selection can also be followed by layout changes, which often fall outside the current scope of Inputlog's logging. Expanding the logging environment to these kinds of formatting changes could be informative, as these actions often indicate the addition of user-oriented visual structuring or highlighting (Schriver, 2012).

Third, although deletion events are a part of our non-linearity analysis, we did not include these as a variable for our study's clustering. For researchers seeking a broader picture of writing characteristics, it could be insightful to add a relative deletion measure (a process/product-ratio) to include a global indication of the recursiveness of the writing process. Our analysis can complement an in-depth revision analysis, which would look at the characteristics and content of the textual changes in a systematic way (e.g., Conijn et al., 2020; Lindgren & Sullivan, 2006).

There are two aspects of a non-linearity analysis we would like to propose as directions for future studies. First, an expansion of the operationalisation of the concept of 'time' beyond taking the session as a unit. Second, a narrowing down of the operationalisation of 'non-linearity' through implementing a threshold for jump size.

Phases and time scales

Although strictly segregated 'phases' were not found, it did become clear that writing sessions with different non-linearity characteristics, as expressed through the clustering, alternated in a non-random way. The clusters were not all evenly spread out over the entire process; rather, some occurred much more often at certain stages than at others. A study of the probability of sessions from the different clusters following each other could elucidate dynamic patterns and interactions between these clusters, and further our understanding of long-term writing processes and the strategies writers employ to manage them.
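Such a study could start from first-order transition probabilities between the clusters of consecutive sessions. The sketch below is one possible operationalisation, not an analysis reported here; the function name and the toy sequence are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequence):
    """Estimate P(next cluster | current cluster) from consecutive sessions."""
    counts = defaultdict(Counter)
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    return {
        state: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for state, row in counts.items()
    }


# Cluster labels of six sessions in chronological order (toy data).
sequence = ["Typing", "Typing", "Minimalist", "Typing", "High-detour", "High-detour"]
probs = transition_probabilities(sequence)
```

Rows with probabilities that deviate strongly from the overall cluster frequencies would point to non-random alternation, although with small clusters the estimates rest on very few transitions.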

Although a writing session as defined by the writer is a strong conceptual unit, a holistic, whole-process approach offers a different perspective. For example, in our current study, some sessions were very short, with little time between them; hence, one could argue for combining these sessions. Similarly, some sessions contained large breaks and could be divided into two (or more) separate sessions. To find out whether writers have specific ways to manage their sessions, a within-session interval is useful. Fürer (2017) points out that the timing of a non-linearity pattern within the session is related to its function. Comparing and contrasting these three time perspectives (the session, intervals within sessions, and intervals in the whole process) can help in determining the scale at which writers alternate their planning, production, revision and online research, complementing the microscale of moment-by-moment actions.
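Combining sessions that follow each other closely could be sketched as follows. The ten-minute gap is an arbitrary illustrative value, and the representation of a session as a start and end time in minutes is an assumption.

```python
def merge_close_sessions(sessions, max_gap_min=10):
    """Merge consecutive sessions separated by at most max_gap_min minutes.

    `sessions` is a list of (start_min, end_min) tuples sorted by start time.
    """
    if not sessions:
        return []
    merged = [list(sessions[0])]
    for start, end in sessions[1:]:
        if start - merged[-1][1] <= max_gap_min:
            merged[-1][1] = max(merged[-1][1], end)  # extend the previous session
        else:
            merged.append([start, end])
    return [tuple(s) for s in merged]


# Two sessions five minutes apart, then a third one more than two hours later.
sessions = [(0, 30), (35, 60), (200, 240)]
combined = merge_close_sessions(sessions)
```

The reverse operation, splitting a session at long internal breaks, would apply the same kind of gap threshold to the pauses inside a session.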

Threshold of non-linearity

In writing process research, studies addressing relatively short, single-draft writing sessions dominate, mostly in an educational context (Fürer, 2017). Thanks to these studies, we are able to build a better basis for connecting the dots between keystroke actions and the writer's cognitive sub-processes, such as (goal setting for) planning, composing and revising. However, we contend that it could be productive to strive for scaling up and aggregating keystroke logs into larger segments of similar activities while observing multi-session professional writing processes. We expect this approach to lead to complementary insights, for instance, in relation to higher-level session management. The current exploration of non-linearity and the related automatized analysis clearly demonstrate this.

One of the central issues in operationalizing non-linearity, however, is defining and selecting the perspectives that optimally address it. For instance, the boundary of a non-linear movement is now set at one key backwards and at least one key forwards. As we know that new text production usually includes small movements back and forth around the point of utterance (e.g., caused by typos, see Conijn et al., 2019), further research could determine where to draw the line between small, local jumps that might still be considered part of text production, and bigger jumps that signify the writer's switch in attention to a different text segment and to acts of distant revision. Setting a 'less strict' boundary for non-linearity could be done through various approaches. Either a fixed number (for example, 100 characters away from the point of utterance) or a writer-dependent threshold (for example, excluding the shortest 5% of jumps) could be implemented in our analysis, using the jump size variable. Another approach would be to take the sentence as a boundary, with every jump within an unfinished sentence considered part of linear production. These approaches could all be implemented without performing a manual annotation of the data (cf. Baaijen & Galbraith, 2012).
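The two jump-size thresholds mentioned above can be sketched as simple filters. The sign convention (negative values for backwards jumps), the function names and the toy data are assumptions; the values of 100 characters and 5% are the examples given in the text.

```python
def keep_jumps_beyond(jump_sizes, min_chars=100):
    """Fixed threshold: keep only jumps of at least min_chars characters."""
    return [s for s in jump_sizes if abs(s) >= min_chars]


def drop_shortest_fraction(jump_sizes, fraction=0.05):
    """Writer-dependent threshold: drop the shortest fraction of this writer's jumps."""
    n_drop = int(len(jump_sizes) * fraction)
    by_size = sorted(range(len(jump_sizes)), key=lambda i: abs(jump_sizes[i]))
    excluded = set(by_size[:n_drop])
    return [s for i, s in enumerate(jump_sizes) if i not in excluded]


# Signed jump sizes in characters (negative = backwards); toy values.
jump_sizes = [1, -2, 250, -4000, 3, 120, -1, 15, 900, -60,
              2, 5, 7, 300, -8, 40, 1, 33, -500, 10]

distant = keep_jumps_beyond(jump_sizes)
trimmed = drop_shortest_fraction(jump_sizes)
```

The fixed threshold is comparable across writers, whereas the writer-dependent one adapts to individual typing habits but removes a share of jumps even for a writer who makes no local corrections at all.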

In summary, in this paper we introduced and demonstrated a new non-linearity measure to analyse keystroke data collected from (long-term, multi-session) writing processes. In the near future, we hope to apply this approach to a range of other keystroke log data in order to further fine-tune and optimize the current non-linearity analysis.