Eye tracking analysis of computer program comprehension in programmers with dyslexia

McChesney, Ian; Bond, Raymond

doi:10.1007/s10664-018-9649-y

Open access
Published: 10 September 2018

Volume 24, pages 1109–1154, (2019)

Empirical Software Engineering

Abstract

This paper investigates the impact of dyslexia on the reading and comprehension of computer program code. Drawing upon work from the fields of program comprehension, eye tracking, dyslexia, models of reading and dyslexia gaze behaviour, a set of hypotheses is developed with which to investigate potential differences in the gaze behaviour of programmers with dyslexia compared to typical programmers. The hypotheses posit that, in general terms, programmers with dyslexia will show gaze behaviour of longer duration and a greater number of fixations on program features than typical programmers. An experiment is described in which 28 programmers (14 with dyslexia, 14 without dyslexia) were asked to read and explain three simple computer programs. Eye tracking technology is used to capture the gaze behaviour of the programmers. Data analysis suggests that the code reading behaviour of programmers with dyslexia is not what would be expected based on the dyslexia literature relating to natural text. In conjunction with further exploratory analysis, observations are made in relation to spatial differences in how programmers with dyslexia read and scan code. The results show that the gaze behaviour of programmers with dyslexia requires further study to understand effects such as code layout, identifier naming and line length. A possible impact on dyslexia gaze behaviour is from the visual crowding of features in program code which might cause certain program features to receive less attention during a program comprehension task.

1 Introduction

Dyslexia is defined as “a specific learning difficulty which affects the ability to recognize words fluently and/or accurately; causes problems with spelling, auditory short-term memory, phonic skills, multi-tasking, remembering instructions, and organizational skills” (OUP 2015). Approximately 10% of people live with dyslexia (Sexton et al. 2012). Individuals with dyslexia experience the condition in different ways and there is much debate surrounding its identification and support (Armstrong and Squires 2014). Computer programming is primarily a text-based activity and, as such, it may present additional challenges to the programmer with dyslexia over and above the normal cognitive challenges of software development. The impact of dyslexia on programming tasks, whether learning to program or professional programming practice, has been investigated directly and indirectly by a number of researchers. Powell et al. (2004) consider its impact on programming in terms of both its negative aspects (such as poor handwriting, spelling and short term memory) which can lead to reading deficiencies, and its positive manifestations (such as strong visualization, spatial awareness, and creativity) which characterize positive alternative learning styles. Powell et al. propose a mapping between these characteristics and stages in the program development process, suggesting that for tasks such as problem definition and system design, traits such as visualization and creativity bring benefits, whereas, for tasks related to coding and testing, traits such as poor spelling and short term memory are disadvantageous. Their mapping is supported by qualitative and anecdotal evidence from conversations with programmers with dyslexia. 
The link between the strong visual-spatial processing of a programmer with dyslexia and their ability to effectively problem solve in a programming context is also noted by Coppin (2008), who extends this observation to suggest how a workspace can be designed to capitalize on these traits (Coppin and Hockema 2009).

In a wider context, there is a long established research interest in the link between computer programming and personality. This has ranged from its relevance to the individual programming task (Bishop-Clark 1995), through to its impact on pair programming (Salleh et al. 2014) and into the wider sphere of team-based software engineering (Cruz et al. 2015). A recent line of enquiry has been in relation to learning disabilities, across the spectrum, and their impact on the individual’s approach to computer programming. Morris et al. (2015) present results from a survey of professional programmers who have a range of conditions such as autism spectrum disorder, attention deficit hyperactivity disorder and dyslexia. Results from interviews with 10 neurodiverse technology workers and from a survey of a further 59 neurodiverse technologists are presented. The work reported refers to challenges they face during software development, such as rigid interpretation of rules, difficulty committing to certain types of tasks perceived as mundane or expression of, at times, inappropriate emotions. Though the number of programmers with dyslexia in the survey was small (16 identifying with dyslexia or other learning difficulties, other than Asperger Syndrome, Attention Deficit Disorder or Attention Deficit Hyperactivity Disorder), it represents a significant empirical attempt at identifying how neurodiverse programmers approach programming in ways which are different from the neurotypical programmer. For example, when asked to self-rate their skill at certain programming tasks, neurodiverse programmers’ self-rated skill was significantly higher in tasks such as detecting patterns in code and adopting good programming style, whereas they rated themselves as less skilled in, for example, reviewing others’ code and writing test cases. 
If it is possible to identify ways in which programmers with dyslexia engage with programming which are not typical, then the workplace in general, and software engineering tools in particular, can be adapted to support these ways of working.

This paper contends there is a need for empirical work in understanding how programmers with dyslexia actually develop, test and comprehend program code. The primary research question here is, when reading program code for the purpose of comprehension, do the eye movements of programmers with dyslexia differ from those of programmers with typical reading profiles? In pursuing this question, other subsidiary questions become apparent which cannot be answered directly from the study described here but are noted as areas for further investigation. For example, do models of reading such as the Dual Route Model (Coltheart et al. 1993) apply when reading program code? How does the visual aspect of program code (indentation, camel case and code editor features) assist programmers with dyslexia? Do orthographic and phonological deficiencies, as exhibited by readers with dyslexia when reading prose, persist as deficiencies when reading program code? If so, are such deficiencies amplified or attenuated by the external representation of the program and/or the mental models at work in program comprehension? To seek to answer the primary question, this exploratory study uses eye tracking technology to gather data on the gaze behaviour of programmers with dyslexia during code reading and program comprehension tasks.

The paper is organized as follows. In Section 2 related work from a number of areas is drawn together to help formulate the hypotheses for the study. This work is reviewed in relation to the role of eye tracking in program comprehension studies, reading models and eye movements, and eye movement studies of readers with dyslexia. Informed by this work, the study design is presented in Section 3, including the hypotheses which have been formulated to guide the enquiry. This is followed in Section 4 by a detailed presentation of the results arising from the experiment eye gaze data. The discussion in Section 5 explores possible interpretations of the results in relation to the code reading behaviour of programmers with dyslexia for the three programs in the study. Section 6 identifies threats to the validity of the study after which overall conclusions and areas for further investigation are presented in Section 7.

2 Related Work

2.1 Program Comprehension Models

Program comprehension is an established area of research within the discipline of computer science (Brooks 1978; Shneiderman and Mayer 1979; Shaft and Vessey 1995). Its study seeks to explicate the factors at work when a programmer reads program source code to understand its overall purpose and to identify the particular syntactic and semantic components from which the program is constructed. Program comprehension is a function of properties of the programmer, such as their cognitive processes and programming language experience, and properties of the program artifact, such as code layout, identifier naming style or the code editor in use. Various models of program comprehension have been developed to reflect the range of cognitive strategies adopted by programmers. For example, bottom-up models propose that programmers seek to understand individual statements and program features and then assimilate these into higher level semantic blocks of code (Shneiderman and Mayer 1979; Pennington 1987). Top-down models propose that an initial view of the program’s purpose is formed, for example by using recognizable constructs in the code and then reading individual statements to support, reject or refine this initial view (Brooks 1983). In practice, an integrated approach may be used, with programmers switching between levels of abstraction as they move towards an understanding of a program’s purpose (Von Mayrhauser et al. 1997). Maalej et al. (2014) found that in real-world settings, professional programmers adopt sophisticated program comprehension strategies which involve not only bottom-up and top-down strategies but also viewing a program’s behaviour from the user’s perspective thereby constructing a mental model of the program by visualizing its input and output.

Schulte et al. (2010) suggest that the range of program comprehension models which have been proposed have a number of elements in common. These are (i) the external representation of the program. This is typically the program source code but can also include representations such as class diagrams and dynamic code inspectors; (ii) an assimilation process by which a programmer views the external representation and assembles the building blocks for (iii) an internal, cognitive representation of the program, complemented by existing mental models and cognitive structures which are part of the programmer’s experience and problem solving capacity (Fig. 1). With reference to this simple framework, the focus of this study is the assimilation process of the programmer with dyslexia as she reads the program artifact and seeks to build an understanding, using her cognitive model, of the purpose of the program. Specifically, when reading program code for the purpose of comprehension, do the eye movements of programmers with dyslexia differ from those of programmers with typical reading profiles?

2.2 Program Comprehension and Eye Tracking

In recent years eye tracking has been used as a mechanism for direct measurement of the reading processes of programmers and, from the data generated, for inferring strategies of program comprehension. It is accepted that eye gaze is a strong indicator of attention (Rayner 2009; Reichle and Sheridan 2015). As such, when used to study the reading of program code, eye movement gives an insight into the reading behaviour of the programmer and the mental model she is constructing. Bednarik and Tukiainen (2006) used eye tracking to identify differences in program comprehension strategies between expert and novice programmers when reading a program in conjunction with an execution visualization tool. They found that an experienced programmer’s approach to understanding was to read the code first, then confirm their mental model by running the visualization. Novice programmers had a greater reliance on the visualizer to aid understanding. Busjahn et al. (2011) conducted a comparison of reading natural text and reading program code using eye tracking. They observed some differences when reading normal text compared with reading program code, exhibited by differences in key gaze metrics such as mean fixation times and the number of regressions. Whereas reading natural text generally proceeds in a linear fashion, leading to serial-attention reading models such as the E-Z Reader Model (Reichle and Sheridan 2015), reading program code appears to be a mixture of linear and non-linear reading behaviour. The study described in Busjahn et al. (2015) further showed a combination of linear and non-linear behaviors, with notable differences between novice and expert programmers. Novice programmers showed a “fairly strong linear character” with 70% of their eye movements on source code being linear, compared with 60% for expert programmers. 
It is suggested this reflects the experts’ ability to follow the execution order of a program and/or to seek out beacons in the code as an aid to understanding. Sharma et al. (2012) studied gaze transitions between the three program elements of identifiers, structural elements (e.g., loops) and expressions. Findings suggested that the gaze of those who understood a program was focused on transitions between identifiers and expressions, reflecting a control flow or execution-based reading of the code. Those who did not exhibit a good understanding of the program tended towards a systematic, structural reading of the code. Other work has also shown differences between reading natural text and program code; for example, in natural text reading, there is a correlation between first fixation duration and word frequency. The less frequent the word in the lexicon, the greater the first fixation duration. However, with respect to keywords in Java, keyword frequency is not a predictor of first fixation duration (Busjahn et al. 2014a). Jbara and Feitelson (2017) used eye tracking to compare the reading of regular code and non-regular code. They found that reading is done non-linearly using scan patterns such as scanning and jumping ahead. Binkley et al. (2013) report a series of experiments investigating the impact of identifier style on code comprehension. As part of this, they also considered the differences in reading natural text and program code. They concluded that reading natural text and reading code are fundamentally different processes – on the basis that the representational structure of code (such as indentation and white space) and code beacons enable programmers to assimilate and understand parts of a program quite quickly – a phenomenon less common in natural language texts.

The First International Workshop on Eye Movements in Programming Education (Bednarik et al. 2014) devised a coding scheme to describe gaze behaviour when reading program code. This scheme is useful for illustrating the ways in which reading code is different from reading natural text. The scheme includes the notion of gaze patterns to describe sequences of fixations. Patterns can be linear, for example LinearHorizontal (where a programmer reads elements in a whole line of code in an equally distributed time pattern), or non-linear, for example Flicking (where gaze moves back and forth between two related items), JumpControl (movement to the next line according to execution order), and LinearVertical (following the code line by line). The categories of the coding scheme provide a valuable vocabulary for describing the non-linear components of reading code at the program level (Busjahn et al. 2014b).
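The distinction between linear and non-linear gaze movement can be made concrete with a small sketch. The Python code below is a deliberately simplified illustration and is not the workshop's coding scheme: it assumes each fixation has already been mapped to a source code line number, and the labels `same_line`, `next_line` and `jump`, as well as the function names, are hypothetical names chosen for this example.

```python
def label_transitions(lines):
    """Label each move between consecutive fixations by the code line it lands on."""
    labels = []
    for prev, curr in zip(lines, lines[1:]):
        if curr == prev:
            labels.append("same_line")   # gaze stays within one line of code
        elif curr == prev + 1:
            labels.append("next_line")   # gaze moves to the line directly below
        else:
            labels.append("jump")        # any other move, forwards or backwards
    return labels


def count_flicks(lines):
    """Count A-B-A returns between two different lines (a crude proxy for Flicking)."""
    return sum(
        1
        for a, b, c in zip(lines, lines[1:], lines[2:])
        if a == c and a != b
    )


def linear_share(lines):
    """Fraction of transitions that stay on a line or move to the next line."""
    labels = label_transitions(lines)
    if not labels:
        return 0.0
    linear = sum(1 for label in labels if label != "jump")
    return linear / len(labels)


# Hypothetical sequence of fixated line numbers for one participant.
fixated_lines = [1, 1, 2, 3, 7, 3, 7, 8]
labels = label_transitions(fixated_lines)
```

A measure of this kind is only a rough analogue of the linearity figures reported in the literature, which are defined on richer gaze data than line numbers alone.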

Other work has used eye tracking technology to study aspects of the software development process other than programming. Recognizing that most real-world software development involves complex programs spanning multiple screens and files, Sharif et al. (2016) describe iTrace, a tool for enabling the use of eye tracking technology when the software artifact is not a static representation on screen but rather a dynamic artifact such as a scrolling code listing or the folder structure in a code editor. Using iTrace, Kevic et al. (2017) investigated software change tasks. As well as eye tracking data, code editor interaction data was collected. They found that in a software change task, developers only looked at very few lines of code within a program subroutine. Also, developers “chase” variable flow (execution flow) within code. This is consistent with the patterns of expert gaze already mentioned. In their work, Rodeghero et al. (2014) seek ways to augment automated code summarization tools by using data from the programmer’s gaze when performing summarization tasks (gaze time, number of fixations and regressions). Results include the observation that professional programmers exhibited a preference regarding the type of code regions they read. Rather than focusing on control flow (as suggested by Sharma et al. 2012), professional programmers tended to focus on method signatures and the code locations from where the methods were called. Ali et al. (2012) investigated the construction of requirements traceability links between requirements and source code. By identifying the sections of source code which developers focused on when verifying requirements, using metrics such as total fixation duration, they sought to find better ways of constructing accurate links between source code entities and their originating requirements. The use of eye tracking in software development research is not limited to studying gaze on source code. De Smet et al. (2014) describe three experiments investigating the impact of widely used program design patterns on the time and effort to perform maintenance and program comprehension tasks. Eye tracking technology was used to record participants’ gaze behaviour (fixation duration) when looking at various types of program structure diagrams. In keeping with findings from the program comprehension work described earlier, novice programmers tended to browse structure diagrams systematically whereas experts used their experience to scan and gather the salient information more quickly.

2.3 Reading Models and Dyslexia

The reading of program code has similarities and differences to the reading of natural text. While it does have some linear characteristics, it is also characterized by scanning, jumping and regression. Nevertheless, the assimilation process does require a reading capability. Dyslexic readers exhibit deficiencies when they read and comprehend natural text. To paraphrase the research question from the introduction, do programmers with dyslexia read and comprehend program code differently from programmers who do not have dyslexia? Do programmers with dyslexia see things differently?

The Dual Route Model (DRM) of reading is a widely accepted abstraction of the reading process (Coltheart et al. 1993; Coltheart et al. 2001; Law and Cupples 2017). The first stage of reading is orthographic visual analysis and letter identification. The model describes the next stage of reading as taking place through two separate processes, or routes, from print to speech. The so-called direct or “lexical” route involves the reader, having visually acquired the word to be read, looking up this word in her orthographic lexicon – the set of words she has previously recognized through reading. The indirect or “non-lexical” route involves the reader, having visually acquired the word to be read, applying explicit conversion rules for parsing the word into graphemes and their corresponding phonemes. These phonemes are combined to form the word. Both routes are active when reading is taking place. However, exception words (words that do not conform to standard phonetic rules, such as “tough” or “know”) are only processed through the lexical route as they do not conform to the reader’s grapheme-phoneme mapping. Words which have not been encountered previously by the reader, i.e., are not part of her orthographic lexicon, are processed using the non-lexical route, leading to a successful or unsuccessful attempt at reading the new word.

Considering the Dual Route Model when reading program code, typical reading events would include reading familiar words, such as program language keywords, which according to the model, would be processed using the lexical route. This would include exception words such as new or byte. Words not previously encountered can be common in program source code, especially when reading code written by someone else. For example, the identifier name cakePriceArray would, according to the DRM, be processed through the non-lexical route, though because of its compliance with English grapheme-phoneme mapping rules, would typically be processed without difficulty at the word level.
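As a toy illustration of this routing, the following Python sketch looks a token up in a small lexicon and reports which route would handle it. It is not a computational implementation of the Dual Route Model; the lexicon contents, the function names and the camel-case splitting helper are all assumptions introduced for this example.

```python
import re

# Hypothetical orthographic lexicon: words this reader has already met in print.
LEXICON = {"new", "byte", "int", "for", "return", "cake", "price", "array"}


def reading_route(token, lexicon=LEXICON):
    """Return 'lexical' for a known word, otherwise 'non-lexical'."""
    return "lexical" if token.lower() in lexicon else "non-lexical"


def split_identifier(name):
    """Split a camel-case identifier into its component words."""
    return re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", name)


# The keyword is found in the lexicon; the unseen identifier is not.
keyword_route = reading_route("new")
identifier_route = reading_route("cakePriceArray")

# Each component of the identifier is itself a familiar, regular word,
# which is one way to picture why decoding it is usually unproblematic.
component_routes = [reading_route(part) for part in split_identifier("cakePriceArray")]
```

The split into component words is an addition for illustration only; the text above treats the identifier as a single unseen word handled by the non-lexical route.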

As suggested by the Dual Route Model, dyslexia itself is a multi-faceted condition that has many subtypes which can be present to varying degrees in the reader. Friedmann and Coltheart (2016) provide a comprehensive summary of the types of dyslexia using the Dual Route Model as a reference framework. Deficits in the orthographic visual analysis stage of reading are examples of peripheral dyslexia (also known as visual dyslexia). These include letter position dyslexia, attentional (letter migration) dyslexia, letter identity dyslexia (the reader cannot abstract a letter), and neglect dyslexia (neglecting one side of a word). Deficits in the lexical and non-lexical routes of the model are described as central dyslexia. Examples include surface dyslexia which is a deficiency in the lexical route of the model. In such cases, the reader will have difficulty reading words such as “receipt”, “new” or “gnu”. Phonological dyslexia arises from a deficiency in the non-lexical (phonetic) route of the model whereby reading can only proceed via the lexical route, leading to a difficulty in reading new or non-words. Friedmann and Coltheart note in particular that readers with this type of dyslexia “usually encounter this severe difficulty again when they learn to read a new language”. Other examples of central dyslexia relate to deficits in the “phonological output buffer” such that the reader cannot properly read, process or articulate long words. Deep dyslexia describes semantic errors or erroneous word associations such as reading, in a programming context, “variable” as “value”, or “get” as “set”.

There is an extensive body of work related to dyslexia and possible interventions, a review of which is beyond the scope of this paper. Refer to Pennington and Peterson (2015) for an overview.

2.4 Eye Movements in Reading

Reading models such as the Dual Route Model have been extended to take account of eye movements when reading. Schroeder et al. (2015) state that with respect to reading, monitoring eye movements is “an excellent tool to help us understand how comprehension during reading takes place via interactions between visual and language processing systems”. Radach and Kennedy (2013) have noted three perspectives from which eye movement in reading research has been conducted. There is research which has focused on visual processing and sensorimotor control, for example, the relationship between vision, attention and saccade preparation. The second category of research is informed by cognitive science, focusing on reading as an information processing and word-level processing activity. The third category is research which has used direct measurement of eye gaze to develop and test hypotheses.

Certain types of gaze metrics can be associated with particular stages in the Dual Route Model. For example, first fixation duration measures can be associated with early stage orthographic processing; gaze duration can be associated with later stages of the model such as lexical access. Many eye movement measures used in the analysis of reading are temporal in nature. Early orthographic processing time can be inferred by first fixation duration on a word. Later stages of reading a word, including lexical analysis, can be related to total fixation duration on the word. Reading processes concerned with word integration and sentence semantics can be inferred from metrics such as total viewing time and regression path duration.
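The temporal measures named above can be sketched in a few lines of Python. The code assumes fixations have already been assigned to a word or program feature and are given in time order as `(feature, duration_ms)` pairs; that input format and the function name are assumptions for this example. Gaze duration is computed here using its common definition as the sum of fixations on the target during the first visit, before the gaze leaves it.

```python
def fixation_metrics(fixations, target):
    """Compute basic temporal gaze metrics for one target feature.

    fixations: list of (feature, duration_ms) tuples in time order.
    """
    first_fixation = None   # duration of the very first fixation on the target
    gaze_duration = 0       # sum of fixations during the first visit only
    total_duration = 0      # sum of all fixations on the target
    count = 0               # number of fixations on the target
    first_pass_open = True  # False once gaze has left the target after a visit

    for feature, duration in fixations:
        if feature == target:
            count += 1
            total_duration += duration
            if first_fixation is None:
                first_fixation = duration
            if first_pass_open:
                gaze_duration += duration
        elif first_fixation is not None:
            # Gaze moved away after visiting the target: the first pass is over.
            first_pass_open = False

    return {
        "first_fixation_duration": first_fixation,
        "gaze_duration": gaze_duration,
        "total_fixation_duration": total_duration,
        "fixation_count": count,
    }


# Made-up fixation sequence over the tokens of `int total = ...`.
example = [("int", 180), ("total", 220), ("total", 150), ("=", 100), ("total", 300)]
metrics = fixation_metrics(example, "total")
```

Regression path duration and total viewing time need sentence-level ordering information and are left out of this sketch.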

Other gaze metrics are spatial in nature, for example the length of the target word, the launch distance of a saccade, and the position of the target word on the line of text. The extent to which such eye movement measures can infer cognitive processes when reading or can explain the essence of the reading process is the subject of much debate. For example, computational models of the Dual Route Model differ in their assumptions regarding reading as a sequential or parallel activity. In sequential attention shift models such as E-Z Reader (Reichle and Sheridan 2015), the processing window is one word wide. In a parallel processing model, perceptual spanning across a word boundary can be processed in parallel (Engbert et al. 2005). The case for sequential models includes the fact that, for example, attention is necessary to combine features of words into a unitary representation, that the sequential order of word recognition aligns with grammatical order (facilitating comprehension), and that the lexical processing of multiple words is not adequately described with any existing model. However, there is evidence that letter processing within words is conducted in parallel (Adelman et al. 2010).

2.5 Eye Movements Associated with Dyslexia

Eye movement data provides an insight into the reading process. Conceptual and computational reading models provide a theoretical framework in which this can be understood. It follows that eye movement data pertaining to the reading behaviour of dyslexic readers can provide some empirical basis for distinguishing this behaviour from that of typical readers.

Bellocchi et al. (2013) present a review of the literature pertaining to eye movement reading behaviour in developmental dyslexia. The nature of the link between dyslexia and eye movement is still under debate. Some of the observed characteristics are simply those of younger readers, and some disappeared when the task required sequence or pattern recognition rather than reading. Their review is presented in terms of research conducted in three broad areas. First, studies which have focused on visual motor behaviour have found that:

(A) At word, pseudoword or sentence level, dyslexic eye movements are characterized by more and longer fixations, shorter saccades, and more regressions (e.g., Hawelka et al. 2010).
(B) Dyslexic eye movements show a smaller number of words that receive a single fixation or are skipped, a greater number of words with multiple fixations, a marked effect of word length on gaze duration, and prolonged gaze durations for singly fixated words (e.g., de Luca et al. 2002).

With reference to the Dual Route Model, these findings are interpreted as a failure of orthographic whole word recognition and an inefficient lexical route.
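The word-level measures in observation (B) can be tallied with a short Python sketch. It assumes each fixation has been mapped to the index of the word it landed on; the function name, the category labels and the sample data are hypothetical and serve only to show the bookkeeping.

```python
from collections import Counter


def fixation_profile(words, fixated_indices):
    """Count how many words were skipped, fixated once, or fixated several times."""
    per_word = Counter(fixated_indices)
    profile = {"skipped": 0, "single": 0, "multiple": 0}
    for index in range(len(words)):
        hits = per_word.get(index, 0)
        if hits == 0:
            profile["skipped"] += 1
        elif hits == 1:
            profile["single"] += 1
        else:
            profile["multiple"] += 1
    return profile


# Made-up example: five tokens, with the second token fixated twice.
tokens = ["int", "total", "=", "0", ";"]
profile = fixation_profile(tokens, [0, 1, 1, 3])
```

Comparing such profiles between groups would show whether one group skips fewer words or refixates more, in the sense described in (B).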

Second, several studies have found defective visuo-attentional processes in dyslexia such that:

(C) Dyslexic readers are influenced more by crowding (visual distractions around the centre of the word target) (Spinelli et al. 2002) and that inter-letter, inter-word spacing improves legibility for dyslexics (e.g., Perea et al. 2012).
(D) However, crowding has a confounding effect. It affects some dyslexics more than others. Those with a moderate reading deficit tend to be sensitive to crowding. Those with a severe reading deficit tend not to be sensitive to crowding.
(E) The dyslexic reader exhibits sluggish attentional shifting associated with deficits in spatial position encoding, affecting phonological representation (e.g., Hari and Renvall 2001). There can be asymmetrical allocation of attention to the right visual field in dyslexia, resulting in a so called left mini-neglect phenomenon (Facoetti and Molteni 2001).
(F) Dyslexic readers can only process a few letters at each fixation, suggesting that a smaller visual-attention span prevents dyslexics from processing many letters simultaneously (Prado et al. 2007). However, this was not true for non-reading tasks such as visual search, leading to the conclusion that the observed differences between normal and dyslexic readers may apply only to text reading.

When considering eye movement behaviour relating to saccades, studies have shown the existence of an optimal viewing position (OVP) to maximize the efficiency of word recognition which, for normal readers, is slightly to the left of the word’s centre, with recognition efficiency decreasing on both sides of this point.

(G) For dyslexic readers, there appears to be an absence of this left-right asymmetry in the OVP when initially fixating upon a word. Rather than the saccade landing on the OVP, it tends to land in the middle of the word, suggesting dyslexic readers are less able to focus on the OVP as the most information rich part of the word (Ducrot et al. 2003).
(H) Positioning errors are more frequent for the dyslexic reader, leading to more refixations (Hawelka et al. 2010).

Bellocchi et al. (2013) argue that dyslexia can be best observed and described using (a) characteristics of global eye movement measures (number of fixations, fixation duration) and (b) characteristics of specific eye movement measurements relating to OVP and saccade landing sites, as indicators of attention allocation during reading or word identification, notwithstanding the heterogeneous nature of dyslexia manifestations and causes.

2.6 Summary of Related Work

Reflecting on the work described in the previous sections, Fig. 2 summarizes the contribution of these research areas to the present study. The relationship between dyslexia and the programming task has been considered in other studies (see Section 1). However, there has been no empirical work on the gaze behaviour of programmers with dyslexia. The program comprehension and eye tracking literature has shown that reading code involves both sequential and scanning reading patterns. Scanning will typically be guided by the program structure and its control flow, with sequential reading taking place at the word and pseudoword level. Reading models provide a framework for understanding the different types of dyslexia, with deficiencies arising in different circumstances depending on the need to process, for example, exception words, new words or non-words – scenarios which are common when reading computer program code. The literature on the eye movement of dyslexic readers enables the formulation of hypotheses to test for differences in the reading behaviour of programmers with dyslexia compared with typical programmers.

However, there are limitations in taking findings from the realm of reading natural text and applying these to the reading of program code. Many of the studies investigating eye movement in dyslexia have been conducted under tightly controlled experimental conditions in terms of how gaze objects such as word lists, letters and rapid automatized naming (RAN) tasks can be manipulated. Also, many of the experiments in developmental dyslexia have been based on the observation of children’s reading performance. In this study, the reading artifact (program code) is static in nature and the research is conducted using adults with dyslexia. Nevertheless, reading models and the related dyslexia research provide a reasonable starting position for exploring how programmers with dyslexia might read code. They enable the formulation of hypotheses regarding eye movement in order to explore potential differences in code reading behaviour between programmers with dyslexia and typical programmers. This study uses the observations A, B and E from Section 2.5 above as the basis for formulating hypotheses testable using program code gaze metrics. Observations C, D, F, G and H are less straightforward in terms of their formulation into testable hypotheses given the experimental design described here. Exploration of these observations will require further study.

When measuring eye gaze activity of natural text reading, the unit of observation is typically the word, pseudoword or sentence construct. In terms of reading program code, the unit of observation adopted here is that of a program code feature, which may be an identifier, keyword or line of code, depending on context. The hypotheses of this study are formulated in terms of eye gaze behaviour in relation to such features and are presented in section 3.1 below.
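To make the notion of a program code feature as the unit of observation concrete, the following is a minimal sketch of how fixation count, total fixation duration and gaze visit count might be derived per feature. It assumes each feature is represented by a rectangular area of interest in screen coordinates and that fixations arrive as an ordered list. The names `Fixation`, `Feature` and `gaze_metrics` are hypothetical and are not taken from the study's instrumentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fixation:
    """One fixation: screen position in pixels and duration in milliseconds."""
    x: float
    y: float
    duration_ms: float


@dataclass(frozen=True)
class Feature:
    """A program code feature, assumed here to be a rectangular area of interest."""
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, fixation: Fixation) -> bool:
        return (self.left <= fixation.x < self.right
                and self.top <= fixation.y < self.bottom)


def gaze_metrics(fixations, features):
    """Return per-feature fixation count, total duration and visit count.

    A visit is counted each time gaze enters a feature from somewhere else,
    so consecutive fixations on the same feature belong to a single visit.
    """
    metrics = {
        feature.name: {"fixations": 0, "duration_ms": 0.0, "visits": 0}
        for feature in features
    }
    previous = None
    for fixation in fixations:
        # First feature whose rectangle contains the fixation, if any.
        current = next(
            (f.name for f in features if f.contains(fixation)), None
        )
        if current is not None:
            entry = metrics[current]
            entry["fixations"] += 1
            entry["duration_ms"] += fixation.duration_ms
            if current != previous:
                entry["visits"] += 1
        previous = current
    return metrics
```

Under these assumptions, a reader who fixates an identifier twice, moves to another identifier and then returns would register three fixations and two visits on the first identifier. A feature that is never fixated keeps a visit count of zero.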

3 Study Design

The following sub-sections present the hypotheses which have been formulated to guide the study and the experimental setting in which these were tested. In summary, the experiment involved a study group (14 programmers with dyslexia) and a control group (14 programmers without dyslexia). The participants were presented with three unseen Java programs, and in each case they were asked to read and describe the program’s purpose. The experimental session was recorded using an eye tracking device. Before eye gaze recording commenced, participants completed a profiling questionnaire to capture details such as age, programming language experience, and whether or not they had dyslexia. The study was reviewed and approved by the university ethics filter committee and all participants were recruited according to the agreed protocol.^{Footnote 1}

3.1 Hypotheses

3.1.1 Hypothesis 1

Based on observation (A) in Section 2.5 above, at the word, pseudoword or sentence level, dyslexia eye movements are characterized by more fixations:

H1₀ – Programmers with dyslexia have the same number of fixations on program code features as the control group.
H1₁ – Programmers with dyslexia have a greater number of fixations on program code features than the control group.

3.1.2 Hypothesis 2

Based on observation (A) above, at the word, pseudoword or sentence level, dyslexia eye movements are characterized by longer fixations:

H2₀ – Programmers with dyslexia have fixations on program code features of the same duration as the control group.
H2₁ – Programmers with dyslexia have fixations on program code features with greater duration than the control group.

3.1.3 Hypothesis 3

From observation (B), when reading at the word, pseudoword or sentence level, dyslexia eye movements are characterized by more regressions:

H3₀ – Programmers with dyslexia have the same number of gaze visits to program code features as the control group.
H3₁ – Programmers with dyslexia have more gaze visits to program code features than the control group.

3.1.4 Hypothesis 4

From observation (B), when reading, dyslexic readers exhibit a smaller number of words that receive a single fixation or are skipped:

H4₀ – For programmers with dyslexia, the number of program code features with a gaze visit count of [1|0] is the same as the control group.
H4₁ – For programmers with dyslexia, the number of program code features with a gaze visit count of [1|0] is less than the control group.
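As an illustration of the quantity compared in Hypothesis 4, the sketch below counts, for one participant, the features with a gaze visit count of 0 (skipped) and of 1 (visited once). It assumes visit counts have already been computed per feature and are held in a dictionary. The function name and the choice to report the two counts separately are assumptions made for this example.

```python
def skipped_and_single_visit_counts(visit_counts):
    """Count features with a gaze visit count of 0 and of 1.

    visit_counts maps a feature name to its number of gaze visits.
    Returns a (skipped, single) tuple.
    """
    skipped = sum(1 for visits in visit_counts.values() if visits == 0)
    single = sum(1 for visits in visit_counts.values() if visits == 1)
    return skipped, single
```

For example, with hypothetical visit counts of `{"total": 2, "count": 1, "i": 0, "sum": 1}`, one feature is skipped and two are visited exactly once.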

3.1.5 Hypothesis 5

From observation (B), dyslexic readers spend more time on longer words, and there is a stronger correlation between word length and fixation duration:

H5₀ – The correlation between identifier length and fixation duration is the same for programmers with dyslexia and the control group.
H5₁ – The correlation between identifier length and fixation duration is stronger for programmers with dyslexia than the control group.
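One common way to compare the strength of two independent correlations, such as those in Hypothesis 5, is to compute Pearson's r for each group and compare the two values using Fisher's r-to-z transformation. The sketch below illustrates that technique only. It is not claimed to be the statistical procedure used in this study, and the function names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between, e.g., identifier lengths and fixation durations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def compare_correlations(r1, n1, r2, n2):
    """One-sided Fisher z-test of whether correlation r1 exceeds r2.

    n1 and n2 are the numbers of observations behind each correlation
    (both must exceed 3, and |r| must be below 1).
    Returns (z, p), where p is the upper-tail probability under the
    standard normal distribution.
    """
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (math.atanh(r1) - math.atanh(r2)) / standard_error
    p = 0.5 * math.erfc(z / math.sqrt(2.0))
    return z, p
```

Here `r1` would be the correlation for the dyslexia group and `r2` that for the control group, so a small `p` would favour H5₁ over H5₀. The test assumes the two samples are independent.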

3.1.6 Hypothesis 6

From observation (E), dyslexic readers have an asymmetric visual attention gradient (fixation count), tending to an increased level of attention on the right-hand side (RHS) of a word:

H6₀ – Programmers with dyslexia exhibit the same number of fixations on the RHS of a program code feature as the control group.
H6₁ – Programmers with dyslexia exhibit a greater number of fixations on the RHS of a program code feature than the control group.

3.1.7 Hypothesis 7

From observation (E), dyslexic readers have an asymmetric visual attention gradient (fixation duration), tending to an increased level of attention on the right-hand side (RHS) of a word:

H7₀ – Programmers with dyslexia exhibit the same fixation duration on the RHS of a program feature as the control group.
H7₁ – Programmers with dyslexia exhibit a greater fixation duration on the RHS of a program feature than the control group.
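The right-hand-side measures in Hypotheses 6 and 7 can be illustrated with a short sketch. It assumes that a feature is divided at the horizontal midpoint of its bounding box and that each fixation is given as an `(x, duration_ms)` pair. Both the midpoint rule and the function name are assumptions made for illustration and may differ from the split actually applied in the study.

```python
def split_left_right(fixations, left, right):
    """Summarise fixations falling on the left and right halves of a feature.

    fixations: iterable of (x, duration_ms) pairs in screen coordinates.
    left, right: horizontal bounds of the feature's bounding box.
    Fixations outside [left, right) are ignored.
    """
    midpoint = (left + right) / 2.0
    summary = {
        "lhs": {"fixations": 0, "duration_ms": 0.0},
        "rhs": {"fixations": 0, "duration_ms": 0.0},
    }
    for x, duration_ms in fixations:
        if not left <= x < right:
            continue
        side = "rhs" if x >= midpoint else "lhs"
        summary[side]["fixations"] += 1
        summary[side]["duration_ms"] += duration_ms
    return summary
```

The `rhs` fixation count corresponds to the quantity in Hypothesis 6 and the `rhs` duration to the quantity in Hypothesis 7.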

In addition to investigating these hypotheses, the dataset from the experiment has also enabled exploratory data analysis of gaze behaviour across the two groups using metrics not immediately suggested by the dyslexia literature. The exploratory study examined metrics such as time to first fixation and fixations before, in order to identify possible differences in behaviour. This is discussed in Section 4.3 below.

3.2 Participants

Participants were recruited from computing undergraduate programmes at Ulster University. A total of 30 participants were recruited for the study. Data was successfully collected from 28 (one study session was void due to computer failure during the session and one due to unsuccessful calibration). Fourteen participants were programmers with dyslexia (the dyslexia group) and 14 did not have dyslexia (the control group). Students were recruited to take part in the study through two types of invitation. One was an email invitation to all students on the institution’s undergraduate computing programmes, explaining the need for participants with and without dyslexia. The second type was an email invitation to students registered as dyslexic with the university’s student support department. Student support assembled the distribution list for this email from their own records and issued the invitation, requesting that replies be sent directly back to the student support department. They then returned the list of participants to the authors. As an incentive, participants were offered an online gift voucher for taking part. For the purposes of the study, students self-designated as having dyslexia or not on the profiling questionnaire administered at the beginning of each recording session. While the student support dyslexia register was useful in gauging whether a sufficient number of students with dyslexia had responded, it was not known with certainty which participants were dyslexic until the study was underway.

Of the 14 participants with dyslexia, three were female and 11 male. The mean age of the dyslexia group was 23.4 years (SD = 6.50). Of the 14 participants without dyslexia (the control group), there were similarly three female and 11 male, with a mean age of 21.5 years (SD = 3.32).

Participants were asked how long they had been programming. In the dyslexia group, the mean duration was 3.32 years (SD = 2.44). For the control group, mean programming experience duration was 2.89 years (SD = 1.47).

Participants were also asked to rate as low, medium or high (i) their overall programming expertise and (ii) their programming expertise in Java. The responses are summarized in Table 1.

Table 1 Self-assessment of programming expertise
