The world’s writing systems contain graphs that span a wide variety of visual forms. Much of this variety is associated with variable mappings that graphic units can have to linguistic units (abjad, alphabetic, syllabary, alphasyllabary, and morphosyllabary). This mapping variety has been the focus of comparative reading research (e.g., universal grammar of reading, Perfetti, 2003; phonological grain size, Ziegler & Goswami, 2005; orthographic depth, Katz & Frost, 1992; semantic transparency, Wydell, 2012; for reviews, see Frost, 2012; Perfetti & Harris, 2013; Seidenberg, 2011). The actual forms of the graphic units have received less attention. However, the visual forms of graphs—reflecting their visual complexity and discriminability—have the potential to affect the identification of both individual graphs and graph combinations (e.g., single letters and letter combinations in alphabets and abjads, akshara in alphasyllabaries, syllables in syllabaries, and characters in morphosyllabaries; Pelli, Burns, Farell, & Moore-Page, 2006), and thus to affect learning to read.

To study the effects of graphic challenges to learning to read in a universal way, free of biases based on a particular writing system, it is important to have a measure of graphic complexity that is sensitive to the variety of devices used in writing. We report here such a measure, a multidimensional measurement system for quantifying graphic complexity, GraphCom, and its application to 131 written languages. We demonstrate the value of the system in predicting similarity ratings made by speakers of different languages.

In what follows, we first discuss the broader context for descriptions of graphic units, delineating the units that are the object of our study and reviewing previously developed measures of graphic complexity (i.e., perimetric complexity: Pelli et al., 2006; Watson, 2012) and considering perceptual principles in human cognition. We then present the rationale and descriptions of our new multi-dimensional measure, the results of applying the measure to 131 languages, and the performance of the measure in predicting visual similarity judgments.

Graphic units: Graphs and graphemes

It is common in alphabetic reading research to refer to a grapheme as the basic unit of writing—in particular, one or more letters that map onto a single phoneme. Such a definition lacks universality (e.g., Chinese characters do not map to phonemes) and departs from the logic of linguistic descriptions. A definition of grapheme that conforms to linguistic analysis by being parallel to descriptions of phoneme and morpheme is this: a grapheme is a functional unit of writing that abstracts over variations in graphs—allographs; for instance, all the fonts for the letter b that exist in a given language. The unit is functional in that the grapheme is the minimal graphic unit distinguishing two written morphemes, thus analogous to the phoneme, which distinguishes two spoken morphemes. For example, in English all letters are graphemes as well as graphs, because all letters distinguish among written English morphemes. According to this definition, the functional role of graphemes does not depend on mapping to phonemes, as attested by the contrast between homophonic morphemes such as buy/bye and reel/real. This technical definition of grapheme also includes nonletter graphemes such as the apostrophe, which distinguishes teacher’s from teachers. What counts as a grapheme is language-dependent even within a writing system. Thus, a capital letter and a lowercase letter seem to be allographs of a single grapheme in English, but probably not in German, where capitalization distinguishes between grammatically derived morphemes (Wissen as noun vs. wissen as verb). This sense of a grapheme as a distinguisher of written morphemes is more systematic and universal than the commonly used definition in the English-language research literature. Thus, it applies to Chinese as well, where a character is a grapheme as well as a morpheme and distinguishes between multimorpheme words.

For purposes of measuring graphic complexity, our view is that the common psychological use of “grapheme,” which originated in alphabetic research, is too narrow. However, the more universal linguistic definition requires a detailed morphological analysis of each written language, a goal that is beyond the scope of our research. These issues of the definition of grapheme have led us to focus instead on the minimal unit of the graph, a written form that can be combined with other graphs to form graphemes (in any sense of grapheme). These graphs are readily recognized by literate users of a language as basic writing units—several thousand characters in Chinese, 26 letters in English, 33 letters in Russian, and 46 kana in the Japanese syllabaries—that are combined to produce written language, whatever their mapping. Most important is that a metric based on writing graphs (rather than graphemes) can be applied to any written language according to the goals of a researcher. For an English example, the complexity measure applied to the letters S and H separately can also be applied to the combination SH if one wants to measure “grapheme” complexity.

Writing graphs, as cultural products, are different from other visual categories

Every writing graph (henceforth, simply “graph”) is a basic, two-dimensional visual form that participates alone or in combination in coding a linguistic unit (e.g., phoneme, syllable, or morpheme). In the information they convey, these graphic forms are different from other visual categories such as natural scenes, objects, and faces, but similar to line drawings (Changizi, Zhang, & Shimojo, 2006). Scenes carry more complex information about color, texture, shading, illumination, and occlusion (Sayim & Cavanagh, 2011). Objects, similar to scenes, provide more information about three-dimensional space, depth, and texture than line drawings do. Faces, although they are composed of fewer elements than scenes and objects, are still more complex than line drawings because faces are usually seen from many viewpoints. In contrast, line drawings are simple. Because their complexity varies along fewer visual dimensions, indices that are useful for natural visual categories—for example, entropy for information content, Fourier analysis for spatial-frequency content, and JPEG compression for image size (see Chikhman, Bondarko, Danilova, Goluzina, & Shelepin, 2012)—are not applicable.

Although graphs and line drawings share the general properties of two-dimensional simplicity, graphs become differentiated from line drawings with the earliest emergence of literacy contexts. Letters, the graphs used in alphabetic systems, for example, are differentiated from line drawings by children by the age of three (Levin & Bus, 2003; Robins & Treiman, 2009). English-speaking preschoolers 3–5 years of age who are preliterate have some understanding that a written word represents a specific spoken word, differing in this way from a drawing (Treiman, Hompluem, Gordon, Decker, & Markson, 2016). Moreover, these children are sensitive to the visual spatial layout of their own writing system, as compared to foreign writing systems (Treiman, Mulqueeny, & Kessler, 2014). When they are asked what writing is, these young children are more likely to choose sets of graphs from their own language (i.e., English) as instances of writing than graphs in other languages (e.g., Chinese characters; Lavine, 1977). These observations point to a categorical importance of graphs, as they become functionally distinct perceptual objects in learning to read.

Previous measures of graphic complexity

One well-defined and well-attested dimension for quantifying graphs is perimetric complexity (Pelli, Burns, Farell, & Moore-Page, 2006): the ratio of the square of the sum of the inside and outside perimeters to the product of 4π and the area of the foreground (Pelli et al., 2006; Watson, 2012; see Tables 1 and 2 for examples, and the Method section for the algebraic expression). More informally, perimetric complexity captures the density of the written marks (“black ink”) relative to the background space in which they are located. Perimetric complexity has some valuable characteristics. First, it is objective, quantitative, and size-invariant. Thus, its values are not affected by font size. Second, it is empirically tested and correlates well with subjective measures, such as pattern goodness and information load (for a discussion, see Jiang, Shim, & Makovski, 2008). Third, it is computerized, and the algorithm can be used for binary-code (black-and-white) images (Watson, 2012), making it a tool that is general across visual categories.

Table 1 Comparison of two graphs in terms of their visual complexity
Table 2 Five graphs with complexity values using GraphCom, the measurement system with four dimensions

Pelli et al. (2006) applied perimetric complexity to a range of graphs and demonstrated that perimetric complexity is inversely proportional to graph identification efficiency. Specifically, they sampled graphs across a wide range of written languages (i.e., Arabic, Armenian, Chinese, Devanagari, English, and Hebrew) and different fonts (e.g., Bookman, Courier, Helvetica, Künstler, and Sloan). They asked participants (ranging from 3 to 68 years of age) to look at a briefly displayed graph and then to identify it from a list of graphs in the given language. Graphic complexity was negatively correlated with human identification efficiency. Given the reliability and validity of perimetric complexity, it became a useful measure for controlling the complexity of stimuli in studies on learning to read (e.g., Liu, Chen, & Wang, 2016; Wang, McBride-Chang, & Chan, 2014; Yin & McBride, 2015).

Research on the relation between visual complexity and learning to read across writing systems suggests that learning to read a more visually complex first language (L1) may require stronger visual skills and may, in turn, strengthen such skills. In particular, a series of studies by Nag and colleagues provided evidence that the visual skills required for reading Indian languages tend to be relatively high as compared with alphabetic languages (e.g., Nag, 2008; Nag & Snowling, 2011; Nag, Snowling, Quinlan, & Hulme, 2014; Nag, Treiman, & Snowling, 2010). These high demands come from the large number of graphs in these “extensive” written languages (Nag, 2007, 2014) and impose a strong influence on the pace of learning to read (Nag, Caravolas, & Snowling, 2011). It is possible that meeting the higher learning demands imposed by visually complex writing systems leads to improved visual skills: in a cross-writing-system study (McBride-Chang et al., 2011), children learning to read traditional Chinese outperformed age-matched kindergarteners who were learning to read less complex languages (Hebrew and Spanish) in a visual–spatial processing task. Similarly, in a comparison of 8- to 14-year-old readers of Chinese and Greek, controlling for reading experience, Chinese readers of all ages outperformed their age-matched Greek counterparts on visual–spatial processing (Demetriou et al., 2005). Collectively, these findings underscore the role of graphs’ visual complexity in learning to read across writing systems.

Complexity characteristics in different writing systems

Not all characteristics of graphs found to be important in reading research are captured by perimetric complexity. There are many examples of two graphs that share the same perimetric complexity value while differing substantially in other ways. For instance, in Table 1, perimetric complexity quantifies both the graph <w> (an English letter) and the graph <> (a Thai letter) as 13; however, the two graphs have salient visual differences in their numbers of disconnected components (<w> has one component, and <> has two) and thus also in their numbers of connected points: <w> has three connected points, each formed by two lines, and <> has two, each formed by one circle and one line.

Variation in disconnected components is typical of alphasyllabaries, and variation in the number of connected points is typical of alphabets. The connected points in letters of the Roman alphabet used for English (e.g., the line terminations in <R>) are among the features most critical for letter identification (Fiset et al., 2008). In alphasyllabaries, letters featuring disjointed components (e.g., the Thai letter <>) are highly associated with visual confusion in early literacy (Winskel, 2010). However, it is unclear whether the number of connected points, an important factor in the recognition of alphabetic writing, also affects letter identification in an alphasyllabary; similarly, we do not know whether the number of disconnected components, a salient measure in an alphasyllabary, plays a role in early alphabetic literacy.

In the morphosyllabic system for Chinese languages, the number of strokes (usually defined as a single continuous movement of the pen) has long been used as a complexity index with demonstrated psychological reality. For instance, Su and Samuels (2010) reported that, in a Chinese character recognition task, response latencies to characters increased with the number of strokes for Chinese-speaking second-graders. In a study of Japanese kanji, Tamaoka and Kiyama (2013) found that both lexical decision times and naming depended on the number of strokes as well as on kanji frequency. Chinese character reading studies have also examined (and experimentally controlled) the number of strokes in both simplified (Wu, Zhou, & Shu, 1999) and traditional (Y. P. Chen, Allport, & Marshall, 1996) Chinese. Although all writing varies in the number of strokes, this measure has not been applied to writing systems other than Chinese.

To examine complexity characteristics of graphs in different written languages and to examine and compare reading and writing across writing systems, a general, multidimensional measure that can apply to all writing systems is needed.

Gestalt principles for perceptual organization of graphs

Some of the features highlighted in research (i.e., the numbers of connected points, disconnected components, and strokes) seem to echo principles of the perceptual organization of relations among visual components (proximity, symmetry, convexity, closure, connectedness, and continuation) that were emphasized in Gestalt theory (Koffka, 1935/1963). These principles were proposed as a partial answer to the question of how individual elements group into parts that then group into the larger perceptual object that is separated from other perceptual objects (Ehrenstein, 2008; Spillmann & Ehrenstein, 2004). For example, continuation affords clues to the relationship between simple features (Biederman, 1987), and connectedness is sensitive to information regarding continuity (Lanthier, Risko, Stolz, & Besner, 2009). In contrast, discontinuity highlights relations between more complex features.

An emphasis on continuity and discontinuity echoes the criteria for a well-designed written language suggested by Watt (1983, 1994; see also Treiman & Kessler, 2011). Watt argued that the shapes in such a written language should be (1) similar, or have a degree of homogeneity; (2) contrasting, or distinguishable from one another; (3) economical, or easy to perceive and produce; (4) redundant; (5) attractive; and (6) expressive. The systematicity of graph shapes was also emphasized by Treiman and Kessler (2014), who observed that, across writing systems, there is a tendency for graphs to look similar. This similarity may reflect basic principles of learning, one of which is that learners abstract patterns that hold across a set of graphs and use these patterns to supplement their memory for individual graphs.

Consideration of the different ways in which graphs vary across writing systems led us to develop a new measure that uses these different complexity-related variations while also building on perimetric complexity. This measure, GraphCom, consists of four dimensions: perimetric complexity, number of disconnected components, number of connected points, and number of simple features (strokes). We applied this visual complexity measure to a large number of written languages, representing all five of the major writing systems. To the best of our knowledge, this is the first attempt to apply a multidimensional complexity measure to quantify such a large number of graphs, in order to provide a valid tool for studying the visual forms of those graphs.

The graph complexity measure, GraphCom

GraphCom includes four dimensions of graph measurement. The three dimensions added to perimetric complexity are quantified in terms of the following basic units: A simple feature, following Pelli et al.’s (2006) definition, is a discrete element of an image that can be discriminated independently from other features. For example, <T> has two simple features, a vertical segment and a horizontal segment. A connected point (or junction) is an adjoining of at least two features. For example, <T> has one connected point (the junction of the horizontal line and the vertical line), and <F> has two (the junctions of one vertical line with two horizontal lines). A disconnected component is a simple feature that is not linked to other features in a set. For example, <i> has two disconnected components (the dot and the vertical line), and <> has two (the horizontal line on the top and the integral-shaped component at the bottom). Given these basic definitions, we can describe our four dimensions:

Perimetric complexity (PC)

PC is the ratio of the squared perimeter of a graph (in pixels) to the foreground area of the graph (in pixels), scaled by 4π. Specifically, PC is \( \frac{P^2}{4\pi A} \): the square of the sum of the inside and outside perimeters of the foreground (P), divided by the foreground area (A), divided by 4π (Pelli et al., 2006; Watson, 2012). For example, if upper-case <W> has a 4,656-pixel perimeter and a 136,602-square-pixel area, its perimetric complexity is 12.6287 (= 4,656 × 4,656 / 136,602 / 4π). This dimension is sensitive to changes in luminance across space (i.e., the spatial frequency) of a graph, and its value is invariant to the size of the graph (Grainger, Rey, & Dufau, 2008).
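
As a concrete illustration, PC can be approximated directly from a binary image. The sketch below (Python with NumPy) counts foreground–background pixel edges as the perimeter; this is an illustrative approximation, not the exact algorithm of Pelli et al. (2006) or Watson (2012).

```python
import numpy as np

def perimetric_complexity(img):
    """Perimetric complexity PC = P^2 / (4*pi*A) for a binary image.

    img: 2-D array, nonzero = foreground ("ink").
    P counts foreground-background pixel edges (inside and outside
    boundaries alike); A is the foreground area in pixels.
    A simplified pixel-edge approximation of the published measure.
    """
    fg = (np.asarray(img) != 0)
    padded = np.pad(fg, 1, constant_values=False)
    perim = 0
    for axis in (0, 1):
        for d in (1, -1):
            neighbor = np.roll(padded, d, axis=axis)
            # foreground pixels whose neighbor in this direction is background
            perim += np.count_nonzero(padded & ~neighbor)
    area = fg.sum()
    return perim ** 2 / (4 * np.pi * area)
```

For a solid square this yields 4/π regardless of the square's size, illustrating the size invariance noted above.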

Number of disconnected components (DC)

A DC is a simple feature, or a group of connected features, that is not linked to the other features of a graph. If a graph is composed of multiple disconnected components, there are spaces among these components; for instance, <> has four disconnected components, created by the spaces among the circle and the three dots. This dimension is sensitive to discontinuity information (Gibson, 1969).
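
The DC count can be computed mechanically by connected-component labeling of a binary image. The following sketch uses a plain flood fill; the choice of 8-connectivity (diagonal contact joins features) is an assumption for illustration, not a specification from the text.

```python
import numpy as np

def count_disconnected_components(img):
    """Count disconnected components in a binary graph image via
    flood fill with 8-connectivity."""
    fg = (np.asarray(img) != 0)
    seen = np.zeros_like(fg, dtype=bool)
    h, w = fg.shape
    n = 0
    for r in range(h):
        for c in range(w):
            if fg[r, c] and not seen[r, c]:
                n += 1                      # found a new component
                stack = [(r, c)]
                seen[r, c] = True
                while stack:                # flood-fill the component
                    y, x = stack.pop()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and fg[ny, nx] and not seen[ny, nx]):
                                seen[ny, nx] = True
                                stack.append((ny, nx))
    return n
```

Applied to an <i>-like image (a dot above a vertical line), this returns 2.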

Number of connected points (CP)

A CP is a point of contact between features. This dimension is sensitive to information regarding continuity (Lanthier et al., 2009) and provides clues to the relations between simple features (Biederman, 1987), in contrast to the DC dimension. Note that CP is not simply the inverse of DC; for instance, the Vai syllables <> and <> have the same number of disconnected components (three), but the number of connected points of <> is four (for the diamond), whereas the number of connected points of <> is zero.
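
For illustration, connected points can be approximated on a one-pixel-wide skeleton of a graph by flagging pixels with three or more stroke neighbors. The 4-neighbor junction test below is a hypothetical stand-in for the manual CP counting described in the text, and will miss cases a human coder would judge differently.

```python
import numpy as np

def count_connected_points(skel):
    """Approximate CP count on a one-pixel-wide 'skeleton' image:
    a junction is taken to be a stroke pixel with three or more
    4-neighbors that are also stroke pixels (a heuristic, not the
    manual counting procedure in the text)."""
    s = np.pad((np.asarray(skel) != 0).astype(int), 1)
    # number of stroke pixels among the up/down/left/right neighbors
    nb = (np.roll(s, 1, 0) + np.roll(s, -1, 0)
          + np.roll(s, 1, 1) + np.roll(s, -1, 1))
    return int(np.count_nonzero((s == 1) & (nb >= 3)))
```

On a skeletal <T> (a bar meeting a stem) this finds the single T-junction, matching the count given for <T> above, while an L-shaped corner is correctly not counted as a junction.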

Number of simple features (SF)

An SF is a discrete element that can be discriminated from others (Pelli et al., 2006); a typical example is a stroke within a Chinese character (Wu, Zhou, & Shu, 1999). Other examples of simple features include a line, a dot, a circle, and a curved line. To keep the measure size-invariant, length, width, and thickness are not considered properties of features. This dimension is sensitive to the extent to which a graph combines simple features.

Collectively, these four dimensions provide objective, quantitative, and size-invariant estimations of graphic complexity. Table 2 shows how these four dimensions of GraphCom capture different characteristics of five example graphs.

Method

The written languages

For the application of GraphCom to actual writing, we selected 131 written languages to represent five writing systems (Footnote 1): alphabet, 60; abjad, 16; alphasyllabary, 41; syllabary, 11; morphosyllabary, 3. We used languages examined in previous cross-writing-system (Changizi & Shimojo, 2005), cross-alphabet (Seymour, Aro, & Erskine, 2003), and cross-script (traditional vs. simplified Chinese; H. C. Chen, Chang, Chiou, Sung, & Chang, 2011) studies. To identify the inventory of graphs and writing system categories for these languages, we followed Changizi and Shimojo (2005), who used Ager’s Omniglot: A Guide to Writing Systems (Ager, 1998). For the three languages on which Omniglot offers no information, we consulted other sources: H. C. Chen et al. (2011) for the two major scripts of Chinese (i.e., traditional and simplified Chinese), and an official list of 1,006 Japanese kanji by school year (Ministry of Education in Japan, 2015). Finally, for purposes of the complexity measure, we used only the forms of isolated graphs. For most written languages (ignoring handwriting), this is of no consequence. However, in some, especially the akshara of alphasyllabaries, graphs can change shape when they are combined in actual writing: Vowel graphs are reduced to diacritics when conjoined with consonants. These variations, which are important in actual writing, are not captured in our analyses, which define the graphs of every language in their canonical forms.

Graphic complexity quantification

We generated images of each of the 21,550 graphs using the Processing software (www.processing.org; Reas & Fry, 2010). Graphs were presented in black Arial font against a 500 × 500 pixel white background. In all, 25% of the selected languages are not supported by the Arial font; for these, an alternative font similar to Arial was adopted. Appendix Table 10 summarizes the detailed information about these 131 written languages. Measures on the four dimensions of GraphCom were then applied to each of these 21,550 images.

Results

Complexity variation along individual dimensions

We describe the complexity of a graph as a set of values along the four dimensions of GraphCom. Figure 1 shows the complexity variations across writing systems as boxplots for each of the four dimensions: perimetric complexity (PC), number of disconnected components (DC), number of connected points (CP), and number of simple features (SF).

Fig. 1

Boxplots comparing graphic complexity across writing systems, for each dimension. The boxplots indicate, for each writing system (coded by color), the range of written languages (in terms of Quartile 1, Quartile 2, Quartile 3, and outliers)

To assess the relationships among these dimensions, we correlated the complexity values on each of the four dimensions across the five writing systems, as well as separately for each writing system. Table 3 summarizes the overall correlations, collapsed across writing systems: All correlations are greater than .82 (all ps < .001), except for the r = .65 correlation of DC (the number of disconnected components) with CP (the number of connected points). Perimetric complexity shows high correlations with the other dimensions, although a lower correlation with DC, reflecting PC’s ability to capture indirectly much of what the other dimensions target specifically. However, the measure with the greatest shared variance is the number of simple features, the building blocks of the graphs. Finally, the correlations show that the number of disconnected components (DC) is the most distinctive measure, sharing no more than 67% of its variance with the other measures, and only 42% with the number of connected points. Notably, not all writing systems showed the same pattern of correlations among the dimensions. These specific writing-system differences are discussed in Chang (2015).
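
The shared-variance figures quoted here are squared Pearson correlations (e.g., r = .65 gives .65² ≈ 42%; r = .82 gives ≈ 67%). A minimal sketch:

```python
import numpy as np

def shared_variance(dim_a, dim_b):
    """Pearson correlation between two complexity dimensions and the
    proportion of variance they share (r squared)."""
    r = np.corrcoef(dim_a, dim_b)[0, 1]
    return r, r * r
```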

Table 3 Correlations of graphic complexity across writing systems

Dimensions differentiate writing system pairs

Next, we determined which dimension best differentiates among the writing systems. If different dimensions play a role in such differentiation, this would support the value of the multidimensional approach. For this analysis, we used the nonparametric Kolmogorov–Smirnov (KS) distance (Footnote 2; Stephens, 1974), one of the most commonly used distance measures for comparing two samples; in our case, the two samples correspond to two writing systems. The KS distance, which does not assume a normal distribution, is sensitive to the difference in the cumulative distribution functions of two samples, and thus is suitable for the highly nonnormal distributions of our writing systems on the various dimensions. Our five writing systems yielded ten writing system pairs. For each pair, we calculated the KS distance on each dimension (see Table 4); the dimension responsible for the greatest KS distance was taken as the one most sensitive to differences between those two writing systems.
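
The two-sample KS distance is the maximum absolute difference between the two samples' empirical cumulative distribution functions. A minimal sketch (equivalent in intent to the statistic returned by scipy.stats.ks_2samp):

```python
import numpy as np

def ks_distance(a, b):
    """Two-sample Kolmogorov-Smirnov distance: the maximum absolute
    difference between the empirical CDFs of samples a and b."""
    a, b = np.sort(a), np.sort(b)
    values = np.concatenate([a, b])
    # ECDF of each sample evaluated at every observed value
    cdf_a = np.searchsorted(a, values, side="right") / a.size
    cdf_b = np.searchsorted(b, values, side="right") / b.size
    return np.abs(cdf_a - cdf_b).max()
```

Identical samples give a distance of 0; fully separated samples give 1.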

Table 4 KS distances between two given writing systems for each dimension

Table 5 shows the complexity dimension that maximally differentiates each pair of writing systems. Thus, the alphabet and abjad writing systems are most differentiated by their numbers of disconnected components; alphasyllabaries and morphosyllabaries are most differentiated by their numbers of simple features; and so forth. The number of connected points was not a maximal differentiator for any pair of systems. Interestingly, perimetric complexity, which has been the only dimension used to compare graphic complexity across writing systems in prior research (Pelli et al., 2006), was the most reliable differentiator only for the alphasyllabary–alphabet pair. These results suggest that the most effective dimension for differentiating writing system pairs is the number of disconnected components (DC); in Table 5, DC provides maximal differentiation for six of the ten writing system pairs. More generally, the results highlight the value of the multidimensional approach. No single dimension is universally the most effective at distinguishing any two arbitrarily selected writing systems.

Table 5 Dimensions that maximally differentiate writing system pairs

Behavioral validation: Similarity ratings of graph pairs

To provide a behavioral test of GraphCom and its individual dimensions, we had participants with different first language (L1) backgrounds make similarity ratings on pairs of graphs from a single written language. We chose similarity ratings because they represent a paradigm commonly used in visual science and psychology over the past 130 years (for a review, see Mueller & Weidemann, 2012). The assumption is that two graphs that are more similar in complexity will be judged more similar than two graphs that are less similar in complexity.

Method

Stimuli

To select a language representative of its writing system, we identified a centroid for each writing system within a multidimensional complexity space (Footnote 3) defined by the four dimensions of GraphCom. A centroid is the geometric center of a set of points in a multidimensional space; in our case, the centroid of a writing system is the location of the unweighted mean of all the written languages within that writing system—that is, the average of their coordinates along the four dimensions. For each writing system, the language closest to this point was designated its centroid written language: Hebrew (abjad), Russian (alphabet), Telugu (alphasyllabary), Cree (syllabary), and Chinese (morphosyllabary). The stimuli included graphs from these five centroid written languages.
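
The centroid-language selection can be sketched as follows: average the languages' coordinates on the four dimensions and pick the language nearest that point. Whether the dimensions were standardized first is not specified in the text, so this unstandardized Euclidean version is an assumption, and the language names below are placeholders.

```python
import numpy as np

def centroid_language(names, coords):
    """Given each language's mean complexity coordinates along the four
    GraphCom dimensions (PC, DC, CP, SF), return the language closest
    (Euclidean distance) to the unweighted mean of all languages."""
    X = np.asarray(coords, dtype=float)
    center = X.mean(axis=0)               # the centroid of the writing system
    dists = np.linalg.norm(X - center, axis=1)
    return names[int(dists.argmin())]
```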

We created two categories for Chinese, because it contains thousands of characters and more than one type of graphic unit. Basic components (including radicals), which are the functional “building blocks” in the Chinese language (Shen & Ke, 2007), can stand alone as characters and are composed of a small number of strokes (average: 4.52). Compound characters, which are composed from these building blocks, have a large number of strokes (average: 13.21). Thus, we classified all characters as either basic or compound; note that these characters have the same forms in the traditional and simplified Chinese systems.

The division of Chinese into basic and compound types resulted in six groups of graphs based on the centroid written languages. In order of increasing complexity, these are Hebrew, Russian, Cree, Telugu, basic Chinese characters, and compound Chinese characters. For the similarity ratings, graphs were paired within each written language, with the graphs in each pair matched on case (upper or lower) and, where applicable, on vowel versus consonant status; all graphs in each written language (except Chinese) were used exhaustively.

We created four stimulus lists, each consisting of six groups of graphs. Each list contained 180 pairs, more than the 120 pairs that are adequate to induce meaningful similarity judgments (Simpson, Mousikou, Montoya, & Defior, 2013). Appendix Table 11 shows the graph pairs for each list; Table 6 provides further information regarding these pairs of graphs.

Table 6 Characteristics and numbers of graph pairs for each written language in similarity rating (per list; four lists in total)

Observers

A total of 180 observers participated in this experiment. All reported normal or corrected-to-normal vision. Table 7 presents demographic information about these observers. We chose observers whose first language was among the most widely spoken languages worldwide (Arabic, English, and Hindi) and for whom the graphs to be judged were not from their first language.

Table 7 Demographic information for the participating observers (n = 180 in total)

Procedure

The experiment was carried out via a large crowdsourcing platform, Amazon Mechanical Turk (MTurk). MTurk data have been demonstrated to be indistinguishable from laboratory data in different research fields (e.g., economics: Horton, Rand, & Zeckhauser, 2011; politics: Berinsky, Huber, & Lenz, 2012; social science: Buhrmester, Kwang, & Gosling, 2011; psycholinguistics: Sprouse, 2011; and psychology: Simcox & Fiez, 2014); to ensure data quality, we also followed the principles for using MTurk (Chandler, Mueller, & Paolacci, 2014) in designing our online experiment. Four human intelligence tasks (HITs) for recruiting observers from four writing systems were posted on MTurk’s online recruitment interface. Each HIT had a two-hour completion limit. Consent was obtained prior to the experiment; after MTurk volunteers agreed to participate, they were directed via a Web link to one of the four stimulus lists for similarity ratings.

The sequence of tasks was the same for each observer: a similarity rating, a language history questionnaire, a demographic background task, and a translation task (except for the English HIT) for verifying the observer’s L1 background. After completing the last task, a unique 13-digit code associated with the observer’s responses appeared on the screen automatically, along with debriefing information. The observer was instructed to report the code to MTurk to obtain monetary compensation. Successful generation of the 13-digit code also indicated that all of the observer’s responses had been successfully sent from their local machine to our server. Each task is briefly described below.

Similarity rating task

This task was designed to tap variability in observers’ judgments of visual similarity. Each trial began with a black fixation cross appearing for 300 ms, followed by a pair of graphs at the center of the screen for up to 5,000 ms, followed by a blank for 1,000 ms. The scale “1 = very different 2 = mainly different 3 = mainly similar 4 = very similar” appeared at the bottom of the screen. Observers rated how visually similar the two graphs were by pressing one of four number keys in the alphanumeric portion of the keyboard (not the numeric keypad). Once the observer had responded, the screen moved on to the next trial.

After the instructions, observers completed 12 demonstration trials with explicit statements about the degree of similarity, 12 practice trials without feedback, and 180 experimental trials; the order of graph pairs was randomized. Table 8 shows example trials from the demonstration, practice, and experimental phases. Responses and response times were recorded. The task took approximately 15 minutes to complete.

Table 8 Examples for graph pairs at different phases in the similarity rating task

Language questionnaire

The language history questionnaire (Tokowicz, Michael, & Kroll, 2004) was used to assess participants’ language-learning experience both quantitatively (e.g., ratings of general language-learning skill) and qualitatively (e.g., comments about language-learning experience). Observers were encouraged to give their best answers to the questions, with no time limit.

Demographic background questionnaire

The demographic background questionnaire was developed to learn more about observers’ educational, cultural, and health status (e.g., vision and hearing problems) and their surroundings during participation in this study. Responses to the vision and hearing questions were used to filter the data for quality. We imposed no time limit for completing this survey.

Translation task

The translation task was developed to filter the data for quality. It consisted of 20 English words chosen from the instructions for this experiment. Observers saw one word at a time and were asked to type the first L1 translation that came to mind within 12 seconds; the time limit was determined in a pilot study. Observers who failed to provide translations in a written language consistent with their reported L1 were excluded from the analysis.

Results

Each dimension played a prominent role in predicting human similarity ratings.

To test the effects of complexity on the perceptual judgments, we used a mixed-effects modeling approach, which is well suited to assessing the effects of both items (graphs) and subjects (observers) (Baayen, Davidson, & Bates, 2008). We assumed that two graphs that were more similar in complexity, as measured in GraphCom, would be judged more visually similar than two graphs that were less similar in complexity. Accordingly, we expected that a model using all four dimensions of graphic complexity would provide the best fit to the human similarity ratings. We therefore tested alternative models by means of a backward elimination procedure, which ensured that any joint predictive capability of the dimensions could be observed (Burnham & Anderson, 2003). We first tested the full model containing all four predictors (the four complexity dimensions); we then constructed a second model that removed one of the predictors, to test whether removing that predictor reduced predictive performance. If so, this was evidence that the predictor should remain in the model.

Our predictors were the absolute differences between the two graphs of each pair on each of the complexity dimensions (i.e., PC, DC, CP, and SF). We performed a series of model comparisons with Laplace estimation, using the lmer() function of the lme4 package (Bates, Maechler, & Dai, 2010) to fit the models and the likelihood ratio test (Lehmann, 1986) to determine model performance. The lme4 model formulae used to fit each model are displayed in Appendix A.

The full mixed-effects model (FULL) included fixed effects of the four predictors and crossed random effects for subjects and items. The additional four models had fixed effects for three predictors (one predictor removed for each model) and crossed random effects for subjects and items. Thus, four model comparisons were carried out. Table 9 summarizes the model tests in terms of the Akaike information criterion (AIC) and Bayesian information criterion (BIC), two common criteria for model selection, as well as the chi-square values (and associated degrees of freedom) for the likelihood ratio test. A lower AIC/BIC indicates a better-fitting model (Wasserman, 2006). As is shown in Table 9, both AIC and BIC suggested that the FULL model scored best on these criteria. Similarly, for all likelihood ratio tests, the FULL model showed significant advantages over any reduced model (p values below .001): [FULL without PC vs. FULL, χ²(8) = 372.88; FULL without DC vs. FULL, χ²(8) = 138.44; FULL without CP vs. FULL, χ²(8) = 390.08; FULL without SF vs. FULL, χ²(8) = 558.83]. These tests indicate that removing any one of the predictor dimensions made the model significantly worse in accounting for variance in the data. This suggests that each dimension played a role in accounting for observers’ judgments of similarity.
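The logic of this nested-model comparison can be sketched in Python. The sketch below is illustrative only: the original analysis used R’s lme4 with crossed random effects, whereas this minimal version uses ordinary least squares on synthetic data, with made-up coefficients standing in for the four complexity-difference predictors. It shows how the log-likelihoods of a full and a reduced model yield the AIC, BIC, and likelihood-ratio chi-square statistic.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 500
# Synthetic predictors standing in for the four absolute-difference
# dimensions (PC, DC, CP, SF); coefficients are hypothetical.
X = rng.normal(size=(n, 4))
y = X @ np.array([0.5, 0.3, 0.4, 0.6]) + rng.normal(scale=1.0, size=n)

def fit_gaussian_ols(X, y):
    """Fit OLS by maximum likelihood; return (log-likelihood, n parameters)."""
    X1 = np.column_stack([np.ones(len(y)), X])     # add intercept
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    sigma2 = resid @ resid / len(y)                # ML variance estimate
    ll = -0.5 * len(y) * (np.log(2 * np.pi * sigma2) + 1)
    return ll, X1.shape[1] + 1                     # coefficients + variance

def aic_bic(ll, k, n):
    return 2 * k - 2 * ll, k * np.log(n) - 2 * ll

ll_full, k_full = fit_gaussian_ols(X, y)
ll_red, k_red = fit_gaussian_ols(X[:, :3], y)      # drop one predictor

# Likelihood ratio test: the reduced model is nested in the full model
chi2 = 2 * (ll_full - ll_red)
df = k_full - k_red
p = stats.chi2.sf(chi2, df)
```

A lower AIC/BIC for the full model, together with a significant chi-square, mirrors the pattern reported in Table 9: dropping a genuinely predictive dimension worsens the fit.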

Table 9 A summary for four model comparisons, using dimensions to predict human similarity ratings (n = 180)

Discussion

The multidimensional measure of graphic complexity, GraphCom, is a useful tool for assessing visual complexity in any writing system. Its dimensions are grounded in basic perceptual factors—the number of simple visual features (lines, curves, and dots), the number of connected points, and discontinuities in the configural form. These dimensions are added to perimetric complexity, a proven measure that captures overall configurational complexity (Pelli et al., 2006). We applied GraphCom to 131 written languages across the world’s five major writing systems, demonstrating that this measurement system surpassed previous measures in predicting human perceptual judgments. Importantly for research, GraphCom can be applied to any of the many other written languages beyond our sample of 131.

The value of GraphCom is supported by several results. First, it produced an ordering of complexity among the 131 languages that aligns with informal observations of those languages. Thus, Chinese written in its traditional script is measured as the most complex written language, more complex than the simplified Chinese script. At the other end of the scale, abjads and alphabets show similarly low levels of complexity and are distinguished primarily by their numbers of discontinuous elements. Of course, these alignments are to be expected to some extent, because we developed the GraphCom measures to reflect properties of real writing. Thus, the ordering of the written languages is not a validation, but a demonstration that the measure produces sensible outcomes.

More interesting are the results concerning the individual dimensions. Perimetric complexity, the measure most commonly used in prior research to capture the configurational complexity of graphs, may not be suitable in all situations. When we applied each dimension to pairs of writing systems, using the nonparametric Kolmogorov–Smirnov (KS) distance measure, we found that perimetric complexity was not the best differentiator among writing systems: it was the most successful differentiator only for separating alphabetic from alphasyllabic languages. The number of disconnected components was generally the strongest differentiator of writing systems.
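The KS distance used here is the maximum vertical gap between two empirical cumulative distribution functions. A minimal Python sketch, using hypothetical per-graph scores for two writing systems (the distributions and parameters below are invented for illustration, not taken from the GraphCom database):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical per-graph counts on one dimension (e.g., disconnected
# components) for graphs sampled from two writing systems.
alphabet_scores = rng.poisson(lam=1.2, size=300)
morphosyllabary_scores = rng.poisson(lam=3.5, size=300)

# Two-sample Kolmogorov-Smirnov test: the statistic is the largest
# absolute difference between the two empirical CDFs (0 = identical
# distributions, 1 = fully separated).
result = stats.ks_2samp(alphabet_scores, morphosyllabary_scores)
```

A larger KS statistic for a given dimension means that dimension separates the two writing systems’ complexity distributions more cleanly, which is how the differentiating power of each dimension was compared.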

Also relevant are the results of a modeling study that simulated graph learning across hundreds of languages (Chang, Plaut, & Perfetti, 2016). In the learning model, each dimension of GraphCom was found to uniquely account for the training time the model needed to reach mastery. Indeed, perimetric complexity was the weakest predictor in the graph-learning simulation; the number of simple features was the strongest.

The most direct validation of the measure comes from its prediction of human perceptual similarity judgments. In fitting the perceptual judgment data to regression models, we found that all dimensions contributed to explaining the data. Removing any one dimension score from the model significantly reduced the model’s ability to predict visual similarity judgments.

We emphasize that these dimensions are not independent; indeed, they are highly intercorrelated when the data are collapsed across writing systems to allow a correlation based on 21,550 graphs. Writing systems differ in how they use the visual, graphic characteristics that are measured by the GraphCom dimensions (Chang, 2015). In alphabets, connected points (or line junctions, such as in <L>, <T>, and <Y>) are especially important in letter identification (Lanthier et al., 2009; Szwed, Cohen, Qiao, & Dehaene, 2009). This importance reflects the relatively small number of graphs needed in most alphabetic languages, which allows the reuse of a small set of simple features that can be combined at junctions to form unique graphs. In contrast, morphosyllabic writing (Chinese and Japanese kanji) requires a very large number of graphs to code syllable morphemes. As the number of graphs increases, recombining features through connected points becomes impossible; instead, additional graphs must add more simple features, which also create more connected points and discontinuous components. Overall, graphic complexity is largely driven by the number of graphs needed in a written language. Collapsed over all 131 orthographies in our study, the number of graphs is highly correlated with the GraphCom measure of written-language complexity, r = .78 (p < .001). This correlation is governed by how the written language manages the mapping of graphs to linguistic units in spoken language, because the writing system largely determines the number of graphs required (for a discussion, see Perfetti & Harris, 2013; Perfetti & Verhoeven, in press).

Although the neural basis of visual perception is beyond the scope of our study, it seems relevant to consider the relation between the properties of the graphs developed for written language and the properties of human vision. Hubel and Wiesel (1962, 1965) established that receptive fields in the cat visual system include line, curvature, and edge detectors, along with computations that estimate the numbers of these features. Primate visual systems have layered receptive fields that respond selectively to specific dimensions—for instance, V1 neurons to orientations; V2 neurons to corners; and V4 neurons to linear gratings, colors, angles, and curves—and computational abilities that operate across these layers (Van Essen, Anderson, & Felleman, 1992; for more recent work, see Coen-Cagli & Schwartz, 2013; Grill-Spector & Malach, 2004; Troncoso, Macknik, & Martinez-Conde, 2011).

Details aside, it is reasonable to suggest that the development of written graphs has become aligned with human vision capabilities, within other constraints—especially the time and effort of graph production (Changizi & Shimojo, 2005). The three dimensions GraphCom adds beyond perimetric complexity seem to align with basic detection functions (simple features) and computational capabilities (connected points, discontinuous components) of human vision. Perimetric complexity seems to capture most of these detection and computational capabilities indirectly: graphs’ simple features and their junctions contribute substantially to its measures of inside and outside perimeters. Indeed, the number of simple features and the number of connected points together account for over 88% of the variance in perimetric complexity.
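The variance-explained figure above comes from regressing perimetric complexity on the two other dimensions. A minimal Python sketch of that computation on synthetic data (the generating coefficients and Poisson parameters below are invented; only the R² procedure itself mirrors the analysis):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
# Hypothetical per-graph counts (illustrative, not the database values)
simple_features = rng.poisson(lam=4, size=n).astype(float)
connected_points = rng.poisson(lam=3, size=n).astype(float)
# Assume perimetric complexity grows with both, plus unexplained variation
perimetric = (2.0 * simple_features + 1.5 * connected_points
              + rng.normal(scale=1.5, size=n))

# Ordinary least squares with an intercept, then R-squared
X = np.column_stack([np.ones(n), simple_features, connected_points])
beta, *_ = np.linalg.lstsq(X, perimetric, rcond=None)
resid = perimetric - X @ beta
r_squared = 1 - resid @ resid / np.sum((perimetric - perimetric.mean()) ** 2)
```

An R² above .88, as reported for the actual graph data, indicates that most of what perimetric complexity measures is already carried by the simple-feature and connected-point counts.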

Finally, we note the practical value of GraphCom as a research tool. Researchers can access the dimension-specific complexity values of the 21,550 graphs from 131 written languages in the graphic complexity database, available at https://dl.dropboxusercontent.com/u/28768192/GraphemeAll/GraphDataset_131_languages.zip. The database can be used in various applications, depending on the research goals. For example, within a single language, graphic complexity measures can be applied to the graphs a child encounters in reading instruction; across languages, graphic complexity in one language can be compared with that of another. For some research aims, specific complexity dimensions can be applied to within-language and between-language comparisons; for other aims, researchers can create composite scores at the level of individual graphs, the language using them, or the writing system to which the language belongs. More generally, data at the graph, grapheme, written-language, or writing-system level can be useful for a wide range of applications, from comparative writing studies to learning to read to models of graph processing—in short, for studies that take account of visual factors in written language.

Summary and conclusion

We introduced GraphCom, a multidimensional measurement system for quantifying the visual complexity of graphs across the world’s writing systems. Starting with perimetric complexity, a well-validated single measure of complexity, GraphCom adds three dimensions that reflect the ways that graph forms differ in their composition of simple features, their connected points, and their discontinuities. These four dimensions were validated by their ability to predict human perceptual judgments on graphs that varied in complexity as measured by GraphCom. As a tool for research, the GraphCom measures are available online for 131 written languages and 21,550 graphs. In addition, its measures are defined precisely, allowing application to any of the world’s writing systems. This provides a practical research tool for constructing studies of perception and orthographic learning by children and adults, as well as cross-language studies of reading and writing.