1 Uniform Measures

The world of education has long been waiting for a sunrise. Believe it or not, a popular compilation of educational tests lists 97 different reading tests (Mitchell, The Ninth Mental Measurements Yearbook, University of Nebraska Press, 1985). This situation produces 97 different “reading ability measures.” What a mess! But now, with the dawn of uniform educational measures, the sun is rising here in Melbourne.

When I was a physicist, I came to appreciate the essential part uniform measures play in science. In the seventeenth century, there were many ways to observe the effects of heat. It was thought, therefore, that there were many kinds of heat. After all, is not bathtub heat different from teacup heat, different from cauldron heat, different from fireplace heat—all of which are different from the heat of the sun? That belief in “57 varieties” was a brutal barrier to progress. Eventually, it was discovered that it was not only desirable but also necessary to have just one kind of heat. Today, for science and commerce, we do our thinking about heat in terms of one entirely abstract unit, the “degree.” Whether it is a Kelvin, Celsius, or Fahrenheit degree does not matter. We know exactly how to get from one to another. They all measure what we insist is the same one kind of heat.

Measures are older than talking. Birds measure. So do bees. Our own measures evolved from our bodies—our feet, our arms, our hands, our fingers. An inch is the distance from thumb to knuckle. A span is the distance between thumb tip and little finger. A cubit is the length of a forearm. A fathom is the distance between outstretched arms. A pace is two steps. A furlong is 200 paces. A mile is 1,000.

Abstractly equal units of length were relied on before the oldest fragments of writing. Figure 1 is Moses’ plan for the Tabernacle.

Fig. 1 Exodus 26

Without approximations to equal units, Babylonians, Egyptians and Hebrews could not have imagined, let alone built, their towers.

Fair measurement is embedded in Judeo-Christian morality. But the “perfect and just measure” demanded in Deuteronomy 25, Fig. 2, is an ideal that can only be approximated in practice.

Fig. 2 Deuteronomy 25

The “weight” referred to is a shekel stone, understood to weigh about 11.4 grams. However, archeologists have never found two shekel stones that weighed exactly the same. No technology, no matter how advanced, can fabricate perfect weights. Nevertheless, even when Deuteronomy was written, we already understood the essential necessity and justice of fair units.

The necessity of uniformity in the representation of quantity appears again in King John's Magna Carta, Fig. 3. Without the ideal of uniform measures, there would be no money. There would be no fitted clothes, because there would be no way to fit them. Imagine what life would be like if there were no abstract unit of length like the inch.

Fig. 3 The Magna Carta

Suppose that taking an inch were complicated—differing with every situation and material. Imagine that wood inches were different from brick inches, and brick inches different from steel inches. We would not have civilization. We would have a mess—a mess like the mess that permeates most of what we misleadingly refer to as “educational tests and measurements.”

2 The Evolution of Science

The study of any subject begins with tangles of speculations. Ideas branch in all directions. But as we work through the tangle, we connect what we experience with what we see. We coax our ideas into shape, form unities, develop lines of inquiry. We fit our ideas together and make them into something. We evolve our bush of ideas into a tree of knowledge, Fig. 4.

Fig. 4 A tree of knowledge

The bush was a tangle. The tree has direction. Our final step in wrestling a useful abstract assertion from a complex concrete confusion is to carve a ruler out of our tree. The ruler does not exist until we imagine it and carve it. The carving is not perfect. It is just an approximation. But what it approximates—a perfectly straight line—enables us to use it as though it were marked off in perfectly equal intervals.

We can pace off land in somewhat equal steps. But steps inevitably vary according to conditions. To produce reliable measurements, we need something more reproducible than pacing. The scientific measurement of length was born as we connected our experience of stride with man-made marks on straight pieces of wood extracted from tree trunks. A piece of tree is more stable than anyone's paces. A ruler does not change its benchmarks. When we grow a confusing bush of tangled ideas into a tree of useful knowledge and make a ruler, then we can plan and build a pyramid, a temple, a house—and also measure the height of a child (Rasch, 1980).

3 The Imaginary Inch

An inch is pure, abstract and without content. It has no meaning of its own. It is an imaginary unit of length. A height of inches, however, has meaning. As we grow, we learn the advantages of growing taller. Brick size has meaning. As we build, we learn the advantages of same-sized bricks. What makes bricks useful is that their interchangeability is maintained by approximations to the fiction we call an inch.

It is essential that our idea of an abstract inch is always the same. If we let our idea of an inch change each time we made a measurement, we could not produce useful bricks or keep track of a child's growth. As our child grew, we would not know by how much they had grown. But with a uniform unit of measure, like an inch, we can measure the height of our children and compare it with last year's—or perhaps with the height of an average second grader because, as it turns out, child height is related to school grade. We can guess what grade a child is in by how tall they are—and how tall they are by what grade they are in. That is an understanding based entirely on applications of rulers. The applications would be useless without that single, unvarying inch that our rulers approximate.

No metric has content of its own. The ruler, with its equal measurement units, is merely an approximate realization of a pure idea—an ideal which we invented from tangled experiences of length—invented to make uniform measures available for any application we care to undertake.

4 One Kind of Reading Ability

Let's turn to the measure of reading. We can think of reading as the tree in Fig. 5. It has roots like oral comprehension and phonological awareness. As reading ability grows, a trunk extends through grade school, high school and college branching at the top into specialized vocabularies. That single trunk is longer than many realize. It grows quite straight and singular from first grade through college.

Fig. 5 A tree of reading

Reading has always been the most researched topic in education (Thorndike, 1965). There have been many studies of reading ability, large and small, local and national. When the results of these studies are reviewed, one clear picture emerges. Despite the 97 ways to test reading ability (Mitchell, 1985), many decades of empirical data document definitively that no researcher has been able to measure more than one kind of reading ability. This has proven true in spite of intense interest in discovering diversity. Consider three examples: the 1940s Davis Study, the 1970s Anchor Study and six 1980s and 1990s ETS studies.

4.1 Davis—1940s

Fred Davis went to a great deal of trouble to define and operationalize nine kinds of reading ability (1944). He made up nine different reading tests to prove the separate identities of his nine kinds. He gave his nine tests to hundreds of students, analyzed their responses to prove his thesis, and reported that he had established nine kinds of reading. But when Louis Thurstone reanalyzed Davis’ data (1946), Thurstone showed conclusively that Davis had no evidence of more than one dimension of reading.

4.2 Anchor Study—1970s

In the 1970s, worry about national literacy moved the U.S. government to finance a national Anchor Study (Jaeger, 1973). Fourteen different reading tests were administered to a great many children in order to uncover the relationships among the 14 different test scores. Millions of dollars were spent. Thousands of responses were analyzed. The final report required 15,000 pages in 30 volumes—just the kind of document one reads overnight, takes to school the next day and applies to teaching (Loret et al., 1974). In reaction to this futility, and against a great deal of proprietary resistance, Bashaw and Rentz obtained a small grant to reanalyze the Anchor Study data (Rentz & Bashaw, 1975, 1977). By applying new methods for constructing objective measurement (Rasch, 1980; Wright & Stone, 1979), Bashaw and Rentz were able to show that all 14 tests used in the Anchor Study—with all their different kinds of items, item authors, and publishers—could be calibrated onto one linear “National Reference Scale” of reading ability.

The essence of the Bashaw and Rentz results can be summarized on one easy-to-read page (1977)—a bit more useful than 15,000 pages. Their one-page summary shows how every raw score from the 14 Anchor Study reading tests can be equated to one linear National Reference Scale. It also shows that the scores of all 14 tests can be understood as measuring the same kind of reading on one common scale. The Bashaw and Rentz National Reference Scale is additional evidence that, so far, no more than one kind of reading ability has ever been measured. Unfortunately, their work had little effect on the course of U.S. education. The experts went right on claiming there must be more than one kind of reading—and sending teachers confusing messages as to what they were supposed to teach and how to do it.

4.3 ETS Studies—1980s and 1990s

In the 1980s and 1990s, the Educational Testing Service (ETS) did a series of studies for the U.S. government. ETS (1990) insisted on three kinds of reading: prose reading, document reading and quantitative reading. They built a separate test to measure each of these kinds of reading—greatly increasing costs. Versions of these tests were administered to samples of school children, prisoners, young adults, mature adults, and senior citizens. ETS reported three reading measures for each person and claimed to have measured three kinds of reading (Kirsch, Jungeblut, & Campbell, 1991). But reviewers noted that, no matter which kind of reading was chosen, there were no differences in the results (Kirsch & Jungeblut, 1993, 1994; Reder, 1996; Salganik & Tal, 1989; Zwick, 1987). When the relationships among reading and age and ethnicity were analyzed, whether for prose, document, or quantitative reading, all conclusions came out the same.

Later, when the various sets of ETS data were reanalyzed by independent researchers, no evidence of three kinds of reading measures could be found (Bernstein & Teng, 1989; Reder, 1996; Rock & Yamamoto, 1994; Salganik & Tal, 1989; Zwick, 1987). The correlations among ETS prose, document and quantitative reading measures ranged from 0.89 to 0.96. Thus, once again and in spite of strong proprietary and theoretical interests in proving otherwise, nobody had succeeded in measuring more than one kind of reading ability.

5 Lexiles

Figure 6 is a reading ruler. Its Lexile units work just like inches. The Lexile ruler is built out of readability theory, school practice, and educational science. The Lexile scale is an interval scale. It comes from a theoretical specification of a readability unit that corresponds to the empirical calibrations of reading test items (Rasch, 1980; Stenner, 1997). It is a readability ruler. And it is a reading ability ruler.

Fig. 6 Educational status by average Lexile

Readability formulas are built out of abstract characteristics of language. No attempt is made to identify what a word or a sentence means. This idea is not new. The Athenian Bar Association used readability calculations to teach lawyers to write briefs in 400 B.C. (Chall, 1988; Zakaluk & Samuels, 1988). According to the Athenians, the ability to read a passage was not the ability to interpret what the passage was about. The ability to read was just the ability to read. Talmudic teachers, who wanted to regularize their students’ studies, used readability measures to divide the Torah readings into equal portions of reading difficulty in 700 A.D. (Lorge, 1939). Like the Athenians, their concern was not with what a particular Torah passage was about, but rather with the extent to which passage readability burdened readers.

In the twentieth century, every imaginable structural characteristic of a passage has been tested as a potential source for a readability measure: the number of letters and syllables in a word; the number of sentences in a passage; sentence length; balances between pronouns and nouns, verbs and prepositions (Stenner, 1997). The Lexile readability measure uses word familiarity and sentence length.

5.1 Lexile Accuracies

Table 1 lists the correlations between readability measures from the ten most studied readability equations and student responses to different types of reading test items. The columns of Table 1 report on five item types: Lexile Slices; SRA Passages; Battery Test Sentences; Mastery Test Cloze Gaps; Peabody Test Pictures.

Table 1 Correlations between empirical and theoretical item difficulties

The item types span the range of reading comprehension items. The numbers in the table show the correlations between theoretical readability measures of item text and empirical item calibrations calculated from students’ test responses. Consider the top row. The Lexile readability equation predicted empirical item difficulty at a correlation of 0.90 for Lexile slices, 0.92 for SRA passages, 0.85 for Battery sentences, 0.74 for Mastery cloze gaps and 0.94 for Peabody pictures (Stenner, 1996, 1997). With the exception of the cloze items, these predictions are nearly perfect. Also note that the simple Lexile equation, based only on word familiarity and sentence length, predicts empirical item responses as well as any other readability equation—no matter how complex. Table 1 documents, yet again, that one, and only one, kind of reading is measured by these reading tests. Were that not so, the array of nearly perfect correlations could not occur. Table 1 also shows that we can have a useful measurement of text readability and reader reading ability on a single reading ruler!

An important tool in reading education is the basal reader. The teaching sequence of basal readers records generations of practical experience with text readability and its bearing on student reading ability. Table 2 lists the correlations between Lexile readability and basal reading order for the 11 basal readers most used in the United States. Each series is built to mark out successive units of increasing reading difficulty. Ginn has 53 units—from book 1 at the easiest to book 53 at the hardest. HBJ Eagle has 70 units.

Table 2 Correlations between basal reader order and Lexile readability

Teachers work their students through these series from start to finish. Table 2 shows that the correlations between Lexile measures of the texts of these basal readers and their sequential positions from easy to hard are extraordinarily high. In fact, when corrected for attenuation and range restriction, these correlations approach perfection (Stenner et al., 1992, 1996, 1997, 1998).

Each designer of a basal reader series used their own ideas, consultants and theory to decide what was easy and what was hard. Nevertheless, when the texts of these basal units are Lexiled, the Lexiles predict almost exactly where each book stands on its own reading ladder—more evidence that, despite differences among publishers and authors, all units end up benchmarking the same single dimension of reading ability.

Finally, there are the ubiquitous reading ability tests administered annually to assess each student's reading ability. Table 3 shows how well theoretical item text Lexiles predict actual readers’ test performances on eight of the most popular reading tests. The second column shows how many passages from each test were Lexiled. The third column lists the item type.

Table 3 Correlations between passage Lexiles and empirical item difficulties

Once again there is a very high correlation between the difficulty of these items as calculated by the entirely abstract Lexile specification equation and the live data produced by students answering these items on reading tests. When we correct for attenuation and range restriction, the correlations are just about perfect. Only the Mastery Cloze test, well-known to be idiosyncratic, fails to conform fully (Stenner, 1997).

What does this mean? Not only do all of these reading comprehension tests measure just one reading ability, but we can replace all the expensive data used to calibrate these tests empirically with one formula—the abstract specification equation. We can calculate the reading difficulty of test items by Lexiling their text without administering them to a single student!

Putting the relationship between theoretical Lexiles and observed item difficulties into perspective, the uncorrected correlation of 0.93, when disattenuated for error and corrected for range restrictions, approaches 1.00. The Lexile equation produces an almost perfect correlation between theory and practice (Stenner, 1997).
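For readers who want the arithmetic behind those corrections, here is a minimal Python sketch of the two standard formulas: classical disattenuation and the Thorndike Case 2 range-restriction correction. The reliabilities and the standard-deviation ratio below are illustrative assumptions; the text reports only the uncorrected 0.93 and the corrected result near 1.00.

```python
import math

def disattenuate(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Classical correction of a correlation for measurement error."""
    return r_xy / math.sqrt(rel_x * rel_y)

def unrestrict(r: float, u: float) -> float:
    """Thorndike Case 2 correction for range restriction;
    u = SD(unrestricted) / SD(restricted)."""
    return (r * u) / math.sqrt(1 + r * r * (u * u - 1))

r_observed = 0.93                           # reported theory-practice correlation
r_true = disattenuate(r_observed, rel_x=0.95, rel_y=0.95)  # assumed reliabilities
r_full = unrestrict(r_true, u=1.2)                         # assumed SD ratio
print(round(r_true, 2), round(r_full, 2))   # 0.98 0.99, approaching 1.00
```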

Variation comes from item response options, which have to compete with each other or they do not work. Yet there must be only one correct answer. Irregularity in the composition of multiple choice options, even when they are reduced to choosing one word to fill a blank, is unavoidable. What the item writer chooses to ask about a passage, and the options they offer the test taker to choose among, are not only about reading ability. They are also about personal differences among test writers.

There are also variations among test takers in alertness and motivation that disturb their performances. In view of these unavoidable contingencies, it is surprising that the correlation between Lexile theory and actual practice is so high.

How does this affect the measurement of reading ability? The root mean square measurement error for a one-item test would be about 172 Lexiles (Stenner, 1997). What are the implications of that much error? The distance from First Grade school books to Second Grade school books is 200 Lexiles. So we would undoubtedly be uneasy with measurement errors as large as 172 Lexiles. However, when we combine the responses to a test of 25 Lexile items, the measurement error drops to 35 Lexiles. And when we use a test of 50 Lexile items, the measurement error drops to 25 Lexiles—one eighth of the 200 Lexile difference between First and Second Grade books. Thus, when we combine a few Lexile items into a test, we get a measure of where a reader is on the Lexile reading ability ruler, precise enough for all practical purposes. We do not plumb their depths of understanding. But we do measure their reading ability.
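Those error figures are consistent with the familiar square-root law for combining independent items: the standard error of an n-item test is roughly the one-item error divided by the square root of n. A minimal check in Python (the square-root law is our assumption about how the quoted figures were obtained; the text's 35 and 25 are presumably rounded):

```python
import math

def sem_lexiles(n_items: int, one_item_rmse: float = 172.0) -> float:
    """Standard error of measurement shrinks with the square root of
    the number of comparably informative items."""
    return one_item_rmse / math.sqrt(n_items)

for n in (1, 25, 50):
    print(n, round(sem_lexiles(n)))  # 1 -> 172, 25 -> 34, 50 -> 24
```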

5.2 Lexile Items

One might now ask, how hard is it to write a Lexile test item? Fig. 7 describes a study to find out whether Lexile items written by different authors produce usefully equivalent results (Stenner, 1998).

Fig. 7 Stability study

Five apprentice item authors were each asked to choose their own text passages and to write their own response-illustration missing-word options (Fig. 8). Each author wrote 60 items spanning 900 to 1300 Lexiles. From these (5 × 60 = 300) items, five 60-item tests were constructed by drawing 12 items at random from each author. Then seven grade school students were given a different test each day for five days. This produced five measures for each student over the five days and, by pooling days, five measures for each student over the five authors.

Fig. 8 An 800 Lexile slice test item

The question becomes “Is the variation by author in a student’s reading ability measure any larger than the variation by day?” If not, that would imply that writing useful Lexile test items, as in Fig. 8, was not a problem, since even apprentice authors can do it well enough to obtain measures as stable as the differences in a person's reading performance from day to day.

Findings indicate that no more noise is introduced into the Lexile way of making a reading measure by a difference among item authors than by the difference a day makes.
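A back-of-the-envelope version of that comparison: pool one student's measures by day and by author, then compare the spreads. The numbers below are invented for illustration; the study's actual measures are not reproduced here.

```python
import statistics

# One student's five Lexile measures, pooled two ways (invented data).
by_day = [1010, 985, 1030, 995, 1005]      # same-day items pooled over authors
by_author = [1000, 1020, 990, 1005, 1010]  # same-author items pooled over days

# If the author-to-author spread is no larger than the day-to-day
# spread, item authorship adds no detectable noise to the measure.
print(statistics.stdev(by_day), statistics.stdev(by_author))
```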

These five Lexile item authors were not experts. They were just well-educated persons, instructed in Lexile item writing for four hours. Courtney, 27, is a psychology student. John, 23, is a math student. Gail, 35, is a law student. Chris, 22, is a football player. Gayle, 45, is a teacher.

5.3 Calculating Lexiles

Lexile measures of reading are easy to understand and easy to use. Lexile readability—measured by word familiarity and sentence length—establishes how difficult a text is to read. Lexile reading ability—measured by how well a reader is able to recognize words and connect them into sentences—establishes how able a reader is to read a text (Stenner, 1982, 1983, 1987).

Readability is passage reading difficulty.

Reading Ability is ability to read passages.

Lexile reading ability is measured by finding out what Lexile passage readability a person can read with 75 percent success.

Success is defined as recognizing what words are needed to mend gaps inserted in passages.

The Lexile formula is based on two axioms.

The semantic axiom: the more familiar the words, the easier the passage is to read; the more unfamiliar the words, the harder.

The syntactic axiom: the shorter the sentences, the easier the passage is to read; the longer the sentences, the harder.

These axioms apply to whatever is read, quite apart from content. They apply whether we like what we are reading or not, whether it is prose, document or quantitative.

The Lexile system calculates passage readability from just these two characteristics—both of which are explicit in the passage. Sentence lengths are there to see. We count and average them. Word familiarities are obtained from compilations of word usage. The Lexile Analyzer uses John Carroll’s sample of 5 million words (Carroll, Davies, & Richman, 1971).

If readers do not know the words, they cannot read the passage. If they do know the words, they can begin to make the passage take shape by stringing its words into sentences. If they can make the sentences, they can read the passage and then, and only then, begin to think about what the passage has to say. Knowing the words and making the sentences sets the threshold for reading (Hitch & Baddeley, 1974; Liberman et al., 1982; Shankweiler & Crain, 1986; Miller & Gildea, 1987).

To Lexile a passage, we look up the occurrence frequency of each word. The Lexile Analyzer uses the average log word frequency and the logarithm of average sentence length. The final Lexile measure for the passage is a weighted sum of these two logarithms. Figure 9 shows how to Lexile a book.

Fig. 9 How to Lexile a book
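A sketch of the calculation, with loud caveats: the word-frequency table below is a toy stand-in for Carroll's corpus, and the weights and anchor are placeholders we made up, since the text describes but does not print the actual Lexile coefficients.

```python
import math
import re

# Toy stand-in for Carroll's 5-million-word frequency table.
WORD_FREQ = {"the": 68000, "cat": 120, "sat": 80, "on": 30000, "mat": 25}

# Placeholder weights and anchor (assumptions, not the real coefficients).
W_SL, W_WF, ANCHOR = 9.8, -2.1, 200.0

def lexile(passage: str) -> float:
    """Weighted sum of the log of mean sentence length and the mean
    log word frequency, as the text describes."""
    sentences = [s for s in re.split(r"[.!?]+", passage) if s.strip()]
    words = re.findall(r"[a-z']+", passage.lower())
    log_mean_sentence_length = math.log(len(words) / len(sentences))
    mean_log_word_freq = sum(
        math.log(WORD_FREQ.get(w, 1)) for w in words
    ) / len(words)
    return ANCHOR + W_SL * log_mean_sentence_length + W_WF * mean_log_word_freq

print(round(lexile("The cat sat on the mat.")))
```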

The coefficients in the formula are set to provide the most efficient balance between log word familiarity and log sentence length and to define a metric that spans the 1000 Lexiles from the books used in First Grade at 200 Lexiles to the books used in Twelfth Grade at 1200 Lexiles. The full Lexile range of readability goes from zero to 1800. The equation is simple. Word familiarity and sentence length are all there is to it. Figure 10 shows how to Lexile a reader.

Fig. 10 How to Lexile a reader

5.4 Lexile Relationships

When a reader with a Lexile ability of 1000L is given a 1000L text, we expect them to experience a 75% success rate (Stenner, 1992). If the same reader is given a 750L text, then we expect their success rate to improve to 90%. If the text is at 500L, their success rate should improve to 96%. The more readers’ Lexile abilities surpass the Lexile readability of a text, the higher their expected success rates. However, the more a text Lexile readability surpasses readers’ Lexile reading abilities, the lower their expected success rates.

Success rates are relative. They are the results of Lexile differences between readers and texts. The 250L difference between a 750L text and a 1000L reader results in the same success rate as the 250L difference between a 1000L text and a 1250L reader. Each reader-text combination produces 90% reading success. Success rates are centered at 75% because readers forced to read at 50% success report frustration, while readers reading at 75% report comfort, confidence and interest.
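The quoted rates are consistent with a Rasch-type model in which only the reader-text difference matters. In the sketch below, the scale of roughly 225 Lexiles per logit is our inference from the 75/90/96 percent figures, not a constant stated in the text.

```python
import math

S = 250 / math.log(3)  # ~225 Lexiles per logit, inferred from the quoted rates

def success_rate(reader_lexile: float, text_lexile: float) -> float:
    """Expected comprehension rate as a function of the reader-text gap;
    calibrated so a zero gap gives 75% success."""
    logit = (reader_lexile - text_lexile) / S + math.log(3)
    return 1 / (1 + math.exp(-logit))

for text in (1000, 750, 500):
    print(text, round(success_rate(1000, text), 2))  # 0.75, 0.9, 0.96
```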

Each reader has their own range of reading comfort. As a result, there is a natural range of text readability that most motivates each reader to improve their reading ability. Some readers are challenged by a success rate as low as 60%. Others find that burdensome. Once a reader places themselves and their books in the Lexile Framework, they can discover what Lexile difference between their reading ability and text readability challenges them in the most productive way.

Book readability varies from page to page. Some books have a narrow range: their passages cluster around a common level. As we read these books, the reading challenge stays level. There are no hills or valleys. Other books have a wide range of readability. There are easy passages and hard passages. These books can enable us to use the momentum we gain from the easier passages to surmount the challenge of the harder ones. Overcoming this kind of resistance improves reading ability.

When we want to help a student read, we can Lexile them and then offer them books with a readability that matches their reading ability. It is also helpful to know the book’s passage difficulty variation. If we want our students to learn to read by reading, then we want to give them material that fascinates, motivates, absorbs and also challenges them.

We do that best by giving them books they want to read that are a little too hard for them, with passages that vary in passage difficulty. Then as they read along, they speed up and slow down. The speed-ups give them the energy and confidence needed to work through the slow-downs.

5.5 Using Lexiles

Books are brought into the Lexile Framework by Lexiling the books. Tests are brought into the Framework by Lexiling their items and using these Lexile calibrations as the basis for estimating readers’ reading abilities.

To write a Lexile test item, we can use any natural piece of text. If we wish to write an item at 1000 Lexiles, we select books that contain passages at that level. We choose a 1000 Lexile passage and add a relevant continuation sentence at the end with a crucial word missing. This is the “response illustration.” Then we compose four one-word completions, all of which fit the sentence but only one of which makes sense. Thus, the only technical problem is to make sure all choices complete a perfectly good sentence, but that only one choice fits the passage.

The aim of a Lexile item is to find out whether the student can read the passage well enough to complete the response illustration sentence with the word that fits the passage. Lexiled items like this are available at the Lexile website (www.lexile.com).

The Lexile Slice is a simple, easy-to-write item type. But in practice, we may not even need the slice to determine how well a person reads. Instead, we may proceed as we do when we take a child's temperature. Since the Lexile Framework provides a ruler that measures readers and books on the same scale, we can estimate any person's reading ability by learning the Lexile level of the books they enjoy.

6 The One Minute Self-report

When our child says, “I feel hot!” we infer they have a fever. When a person says, “I like these books,” and we know the books’ Lexile levels, we can infer that the person reads at least that well.

7 The Three Minute Observation

To find out more about our child, we feel their forehead. The three minute way to measure a person's reading is to pick a book with a known Lexile level and ask the person to “Read me a page.” If they read without hesitation, we know they read at least that well. If they stumble, we pick an easier book. With two or three choices, we can locate the Lexile level at which the person is competent, just by having them read a few pages out loud.

With a workbook of Lexile calibrated passages, we can implement the three minute observation just as simply, by opening the workbook and turning the pages to give them successive passages to read.
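The procedure amounts to a binary search over Lexile levels. A minimal sketch, where the reads_fluently check is hypothetical and stands in for listening to the person read a calibrated page aloud:

```python
def locate_reading_level(reads_fluently, lo=200, hi=1800, step=100):
    """Narrow the bracket of Lexile levels until it is one step wide,
    then return its lower edge as the estimated comfortable level."""
    while hi - lo > step:
        mid = (lo + hi) // 2
        if reads_fluently(mid):  # the person reads a mid-Lexile page aloud
            lo = mid             # fluent: try something harder
        else:
            hi = mid             # stumbling: try something easier
    return lo

# Example: a person who is comfortable up to about 900L.
print(locate_reading_level(lambda level: level <= 900))  # 900
```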

8 The Fifteen Minute Measurement

To find out more, we use a thermometer to take our child's temperature, perhaps several times. For reading, we give the person some Lexiled passages, each ending with an incomplete sentence. To measure their reading ability, we find the level of Lexiled passages at which that person correctly recognizes what words are needed to replace the missing words 75% of the time.

The Lexile reading ruler connects reading, writing, speaking, listening with books, manuals, memos and instructions. This stable network of reproducible connections empowers a world of opportunities of the kind that the inch makes available to scientists, architects, carpenters and tailors (Luce & Tukey, 1964).

In school, we can measure which teaching method works best and manage our reading curricula more efficiently and easily. In business, we can Lexile job materials and use the results to make sure that job and employee match. When a candidate applies for a position, we can know ahead of time what level of reading ability is needed for the job and evaluate the applicant's reading ability by finding out what books they are reading and asking them to read a few sentences of job text out loud. This quick evaluation of an applicant's reading ability will show us whether the applicant is up to the job. When an applicant is not ready, we can counsel them, “You read at 800 Lexiles. The job you want requires 1000 Lexiles. To succeed at the job you want, you need to improve your reading by 200 Lexiles. When you get your reading ability up to 1000, come back so that we can reconsider your application.”

8.1 Lexile Perspectives

Jobs—Twenty-five thousand adults reported their jobs to the 1992 National Adult Literacy Study (Campbell et al., 1992; Kirsch et al., 1993, 1994). Their reading abilities were also measured. In 1992, the average laborer read at 1000 Lexiles. The average secretary at 1200. The average teacher at 1400. The average scientist at 1500.

When we can see so easily how much increasing our reading ability can improve our lives, we cannot help but be motivated to improve, especially when what we must do is so obvious. If we want to be a teacher at 1400 Lexiles but read at only 1000, it is clear that we have 400 Lexiles to grow to reach our goal. If we are serious about teaching, the Lexile Framework shows us exactly what to do. As soon as we can take 1400 Lexile books off the shelf and read them easily, we know we can read well enough to be a teacher. But if we find that we are still at 1000 Lexiles, then we cannot avoid the fact that we are not ready to qualify for teaching, not yet, not until we teach ourselves how to read more difficult text.

School—Reading is learned in school. The 1992 National Adult Literacy Study shows that there is a strong relationship between the last school grade completed and subsequent adult reading ability. On average, we are never more literate than the day we left school. The average 7th grade student reads at 800 Lexiles. The average high school graduate reads at 1150 Lexiles. College graduates can reach 1400 Lexiles. For many of us, the last grade of school we successfully complete defines our reading ability for the rest of our lives. Once we leave school—and no longer benefit from the reading challenge that school provides—we tend to stop learning.

Income—Reading ability also limits how much we can expect to earn. The average incomes of readers in the 1992 National Adult Literacy Study indicated that, on average, an adult reading at 950 Lexiles made $10,000; at 1200 Lexiles, $30,000; at 1400 Lexiles, $60,000; and at 1500 Lexiles, $100,000. From 1000 to 1300 Lexiles, each reading ability increase of 150 Lexiles doubles our earning expectations. If we read at 1000 Lexiles and want to double our potential, then we have to improve our reading to 1150 Lexiles.

When students can see the financial consequences of reading ability on an easy to understand scale that connects reading ability and income, then they have a persuasive reason to spend more time improving their reading abilities. No need to berate students, “Do your homework!” Instead, we can show them, “You want more money? You want to be a doctor? Here is the road. Learn to read better. It's up to you. But we'll help you learn.”

8.2 Reading Education

Education can only succeed if we connect learning to each learner's selfish motives. We need to involve our students individually, to engage their desires and arouse their drives. When we do that, student education will drive itself. Then, all we need do is to add support and guidance. Otherwise, we will continue to deceive ourselves into running a penitentiary system that keeps some troublesome kids off the street, but only for a while.

Remember, when we know text readability, all we need do to learn how well a student reads is to ask them to read a page or two aloud. If they succeed, we can give them a harder page. If not, we know their reading ability is below the readability of the text we asked them to read. No need for debate. No need for guesswork. No need for confession or reproach. The student's status is plain to us and plain to them. We have not tricked them with a mysterious test score. All we have done is to help them see for themselves how high they can read.