Assessment and Learner Identity
The individual in contemporary society is not so much described by tests as constructed by them. (Hanson 1994)
Standardized assessments play an ever-increasing role in modern society. They often determine the life chances of those assessed, shaping learner identities. Assessment is a social activity and is value laden, despite attempts to make it seem objective and neutral. The appeal of testing, particularly in relation to selection, is that it is seen as fairer than other methods. However, claims have sometimes been made for testing that are hard to support, for example, the inferences drawn from intelligence tests, which have profoundly affected learner identities. Finally, ways in which assessment can contribute positively to learner identities are considered.
Assessment as a Social Activity
Assessment is a social activity which has always been part of the fabric of life. Used in the broad sense of gathering information to make judgments, it has been part of decision making through the ages. Deciding where to start a settlement, or who is innocent or guilty, are judgments of this kind.
While this is still the case, for example, in risk assessment or legal procedures, the focus has shifted to the deliberate gathering of information to make judgments about individuals or groups. The most familiar of these are the tests used for selection or to determine competence. These are tests which are deliberately designed to gather specific information about individuals so a judgment can be made, for example, about whether they should be selected for progression to a particular institution or be allowed to practice in an occupation.
When used for selection, these assessments have essentially become written, standardized tests, a tradition which stretches back over a thousand years to the Chinese Civil Service selection tests. Standardized testing has seen an exponential worldwide growth over the last hundred years and has become a major industry, particularly in the United States. These tests have increasingly become assessments of achievement, how well the curriculum is known or skills mastered, though generalized ability/intelligence tests still play a powerful role in some cultures (see below). Occupational assessments for the “license to practice” generally involve practical elements as well, for example, medical practicals or making a product (a tradition which goes back to the medieval guilds).
Assessment has increasingly become a powerful social tool because of the consequences for the life chances of those assessed. Failing a selection test when 11 years old has meant for many children either no or limited secondary education, just as passing examinations at 18 may lead to a university place and better job opportunities. These are the “high-stakes” assessments which have such serious consequences for individuals and which shape their identity as learners.
Tests as Fairer and More Meritocratic
How do such assessments come to have this power? The key appeal has always been that they are the fairest way of selection as they rely on individual merit rather than patronage or family connections. In this way they have provided opportunities for many who would not have had access to selection by patronage. It is estimated that in the Chinese Civil Service examinations of the Ming Dynasty (1368–1644) up to 60% of the successful candidates were not from the families of the administrative elite (a better ratio than today’s administrations?).
However, just because candidates all take the same test at the same time under the same conditions does not automatically make tests fair or meritocratic. Throughout the history of such testing, certain groups have been excluded from entry. Gender is the most obvious example, with females excluded from most selection tests well into the twentieth century. There is a similar history for lower social classes (slaves, laborers, and actors could not enter the Chinese examinations) and for racial and religious groups. This sends the identity message that these groups are not capable of doing this kind of work.
This has been compounded by the neglect of differences in preparation for the test. Those who had a privileged preparation for tests as a result of elite education would put their success down to merit (they performed better than others) rather than to advantaged preparation. This in turn fed into beliefs that the privileged had more natural ability than the disadvantaged, beliefs which fed into identities of social superiority. The worldwide phenomenon of an exam preparation industry funded by more affluent parents and operating outside regular schooling is part of this. Affirmative action programs, for example in university entrance, have recognized this problem by allowing for imbalances in the quality of schooling, and have often met opposition from the privileged elite.
Bias. Linked to this is the issue of the fairness of test content. Tests can never be culture-free, so whose culture is being tested? This is about bias in testing. Does one item advantage a specific group by assuming a particular cultural knowledge, for example, a particular interpretation of a country’s history or literature? There are now techniques for identifying those biases which favor one group against another, for example, Differential Item Functioning (DIF), but these may still underestimate bias within the broader curriculum. We also know that the mode of assessment will affect results: for example, girls are likely to perform better than boys on open-ended writing tasks, while some students will do better on practical projects than on tests.
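As a rough illustration of how a DIF check can work, the sketch below computes the Mantel-Haenszel common odds ratio, one widely used DIF statistic (the text does not say which method is meant, so this choice is an assumption). Test takers are first matched on total score; within each score band, the odds of answering a single item correctly are compared between a reference group and a focal group. The function names, group labels, and response counts are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def make_rows(group, band, n_correct, n_incorrect):
    """Build hypothetical (group, score_band, answered_correctly) records."""
    return [(group, band, True)] * n_correct + [(group, band, False)] * n_incorrect


def mantel_haenszel_odds_ratio(rows):
    """Common odds ratio for one item across matched score bands.

    Values above 1 mean that matched reference-group members were more
    likely to answer the item correctly than focal-group members.
    """
    # Per band: [ref correct, ref incorrect, focal correct, focal incorrect]
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for group, band, correct in rows:
        offset = 0 if group == "reference" else 2
        tables[band][offset + (0 if correct else 1)] += 1

    numerator = denominator = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio undefined for these counts")
    return numerator / denominator


# Hypothetical responses to a single item, matched on total test score.
rows = (
    make_rows("reference", "low", 10, 10)
    + make_rows("focal", "low", 5, 15)
    + make_rows("reference", "high", 16, 4)
    + make_rows("focal", "high", 12, 8)
)

alpha = mantel_haenszel_odds_ratio(rows)
# ETS delta scale: negative values indicate the item disadvantages the focal group.
mh_d_dif = -2.35 * math.log(alpha)
print(f"MH odds ratio: {alpha:.2f}, MH D-DIF: {mh_d_dif:.2f}")
```

A ratio near 1 suggests the item behaves similarly for matched members of both groups. Because the matching variable is the total test score, a statistic of this kind can only flag individual items for review; it cannot show whether the test as a whole, or the curriculum behind it, favors one group.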
Assessment as a social rather than objective activity. The development of measurement techniques (“psychometrics”) to improve the quality of standardized tests has led to the perception that assessment is a neutral “scientific” activity rather than a socially constructed process. This positivistic position assumes that tests are simply measuring “what’s there” and are independent of it. So psychometrics is presented as a scientific and detached measurement activity.
The more sociocultural position adopted here is that any assessment is essentially a social activity, so that what is assessed, how it is assessed, and how the results are interpreted are all value-laden social activities. There is no such thing as culture-free assessment – even nonverbal tests involving abstract mental reasoning (for example Raven’s Matrices) are rooted in particular cultural understandings and experiences.
Because assessment is a social activity, it shapes how individuals and groups see themselves. The philosopher of science Ian Hacking writes about how “sometimes our sciences create kinds of people that in a sense did not exist before. This is making up people” (2006, p. 2). He applies this to conditions like Multiple Personality, which was “discovered” in the 1970s and was followed by a dramatic increase in the number of people with the condition. More contemporary examples might be the way in which increasing numbers of children are identified as “dyslexic”, or as having Attention Deficit Hyperactivity Disorder (ADHD) or Asperger’s syndrome. This is not to say that such children do not have reading, attentional, or interactional difficulties, but the labels we choose and how we respond to these labels are part of a social process. Hacking identifies ten “engines of discovery” that drive this process. These are 1. Count, 2. Quantify, 3. Create Norms, 4. Correlate, 5. Medicalize, 6. Biologize, 7. Geneticize, 8. Normalize, 9. Bureaucratize, 10. Reclaim our identity. These offer a useful framework with which to understand how educational assessments can shape learner identities. They fit particularly well with the development of intelligence testing, a form of testing that has had a powerful impact on the identities of individuals and groups for over a century.
Intelligence Testing: A Case Study in “Making up People”
In 1923 Edwin Boring defined intelligence as “what the tests test”. This is more profound than it first appears as it signals that how we understand intelligence is largely the result of how it has been tested. What is included in an intelligence test in turn involves social judgments about what the concept of intelligence represents.
The history of intelligence testing is instructive in understanding this. Intelligence testing as we know it began with the work of Alfred Binet and Theodore Simon in France at the beginning of the twentieth century. They were looking for ways of identifying children in Paris who, with the introduction of universal primary education, would not be able to cope with regular schooling and might need specialist help. Binet took a pragmatic approach to the development of his tests, which focused on what was required for school-based learning. In terms of the “engines of discovery,” he used the first four processes to construct his tests and to provide a “mental age.” His social philosophy was that intelligence was “the capacity to learn and assimilate instruction” (1909, p. 104) and that the task of educators was to improve pupils’ intelligence. He himself developed “mental orthopedic” exercises to help with this.
Binet’s tests became the basis for many tests in the English-speaking world; the Stanford-Binet intelligence test was widely used throughout the twentieth century. However, in the hands of Anglophone psychologists and statisticians whose social views about the nature of intelligence were very different from Binet’s, intelligence was, in Hacking’s terms, medicalized, biologized, and geneticized. Statisticians such as Francis Galton, Charles Spearman, and Cyril Burt in England and Lewis Terman and Edward Thorndike in the United States all had strong beliefs about the inherited and fixed nature of intelligence. Intelligence was reified (treated as a physical entity somewhere in the brain) into something that could be quantified and standardized, using statistical techniques that they developed. Their philosophical beliefs underpinned these developments – intelligence was inborn and fixed, and the successful, including certain races, had more of it. The poor were as they were because of limited intelligence, which they then transmitted genetically to their children. While presented as objective scientific findings, their conclusions reflected deeply held cultural beliefs, as evidenced by the proponents’ involvement in related social programs. Galton coined the term “eugenics” and wanted the breeding of the poor restricted, as did Terman in the United States, who also called for restrictive immigration laws. These claims have not gone away – they are echoed in best sellers such as Herrnstein and Murray’s The Bell Curve: Intelligence and Class Structure in American Life (1994).
Intelligence testing became part of the social fabric in many countries (Hacking’s “normalize” and “bureaucratize”). In the UK, it was widely used for selection to secondary school; the 11+ test used the standardized scores to select the top twenty per cent or so of students for a prestigious grammar school education. It was also widely used in job selection and for identifying pupils in need of special education.
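The arithmetic behind such a cut-off can be sketched as follows. The sketch assumes that scores are scaled to a normal distribution with a mean of 100 and a standard deviation of 15 (a common convention for standardized scores, not stated in the text) and that exactly the top 20% are selected. The two pupil scores are hypothetical.

```python
from statistics import NormalDist

# Assumed score scale: mean 100, standard deviation 15.
scale = NormalDist(mu=100, sigma=15)

# Score below which 80% of the cohort falls, i.e. the top-20% threshold.
cutoff = scale.inv_cdf(0.80)


def selected(score):
    """Return True if a standardized score reaches the selection threshold."""
    return score >= cutoff


# Two hypothetical pupils whose scores differ by a single point.
for score in (113, 112):
    outcome = "selected" if selected(score) else "not selected"
    print(f"score {score}: {outcome} (cut-off {cutoff:.1f})")
```

Under these assumptions the threshold falls between 112 and 113, so a one-point difference, well within the measurement error of any single test, separates a pupil who is selected from one who is not.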
The issue here is how a single test can shape the life chances and identities of cohorts of students. To fail the 11+ sent a signal to the student that they did not have the capacity for academic study – in Patricia Broadfoot’s words “Intelligence testing, as a mechanism of social control, was unsurpassed in teaching the doomed majority that their failure was the result of their own inbuilt inadequacy” (1979, p. 44).
While the claim that intelligence was a unitary “hard-wired” capacity (Spearman’s “g”) has dominated much Anglophone culture, there have always been those who have opposed this interpretation (“reclaim our identity”). Statisticians such as Louis Thurstone used different statistical techniques that produced seven primary abilities that were independent of each other and could not be aggregated into a single scale. Also in this tradition is Howard Gardner’s theory of multiple intelligences, which presents eight separate intelligences, for example linguistic and bodily-kinaesthetic, that cannot simply be assessed by pencil and paper tests or put on a single scale. Daniel Goleman’s influential Emotional Intelligence (1995) attacked the notion that the intelligence test score (IQ) was the critical measure, arguing that our social intelligence was more important to success.
Other cultures do not assume that intelligence is fixed at birth and are closer to Binet’s more pragmatic concept of malleable intelligence. For example, Confucian heritage societies, with their emphasis on effort and motivation to improve, see intelligence as something that can be developed.
Achievement Testing and Identity
In our current education systems, there has been a reduction in the reliance on intelligence testing for selection, though it has sometimes been replaced by “ability tests” which closely resemble them and which produce a less emotive reaction. There may be many reasons for this decline, not least the move to more comprehensive secondary education systems – so there is not the same need for selection for types of school. It also represents a lessening of confidence in IQ scores as their claims and predictive accuracy have been challenged.
It has increasingly been recognized that ability scores are indicators of general achievement rather than the cause of it; this is why they correlate well with school achievement. Robert Sternberg, a leading intelligence researcher, concludes that “what distinguishes ability tests from other kinds of assessments is how the ability tests are used (usually predictively) rather than what they measure. There is no qualitative distinction between the various kinds of assessment. All tests measure developing expertise.” (1999, p. 60)
A good example of this is the shift in claims about the SAT test in the United States. This is a test taken at the end of high school which is important in the college application process. The SAT began life as the Scholastic Aptitude Test, an ability test to predict college success. In 1990 it changed to the Scholastic Assessment Test because it was recognized that it could not be accurately described as an ability test as it also measured achievement. By 1997, even this title was thought to be unhelpful and so the letters SAT became an empty acronym – they did not stand for anything. By then the purpose of the test was more modestly defined in terms of determining how well students analyze and solve problems – skills that are learned in school that will be needed in college.
The use of achievement tests which measure the level of understanding of the given curriculum is now widespread. These typically take the form of end of school examinations, the grades or marks from which may determine university selection. These assessments are high-stakes because of the consequences for individuals. The French sociologist Pierre Bourdieu observed that “between the last person to pass and the first person to fail, the competitive examination creates differences of all or nothing that can last a lifetime” (1991, p. 120). In some countries, a single mark in the national examination can make the difference between a university place and a year in military service.
These are dramatic examples of how test results may affect a student’s identity as a learner. However, the assessments conducted in education systems and in schools may have a pervasive impact on students’ identities as learners. Assessments used to set or stream students into different ability groupings will affect their identities as learners, especially when there is little movement between groupings. Research on mathematics groupings in England, where ability grouping has been encouraged by the government, has shown that around nine out of ten children placed in a bottom group at 5 years old will still be in the bottom group at 16 years of age, and will no doubt see themselves as “no good at maths”. Many high-performing countries, in which the gap between high- and low-achieving students is much smaller (for example Finland and South Korea), do not permit ability grouping in primary education.
Assessment and Identity: Some Positive Steps We Can Take
Limit assessment ambitions by focusing on achievement. Assessment is a social activity which informs us about what has been learned. It is not culture-free nor can it tell us about any underlying ability independent of what has been experienced and learned.
Ensure the assessments are as fair as possible. Fairness is about more than standardizing test-taking conditions. It involves reducing bias in the cultural assumptions about what is required and seeking to make sure students are clear about what will be assessed and have the resources to support this. If groups or individuals are disadvantaged in these processes, it will affect their identity as learners in a particular culture. Where there are cultural or resource disparities, allowances can be made for this through affirmative action.
Interpret results more cautiously. There has been a history, particularly in relation to intelligence testing, of inferring more from assessment results than they can validly support. Using a one-off IQ score to pronounce on someone’s lifelong ability to learn impacts powerfully on learner identity. Imagine never being allowed to drive because you failed your first driving test. This caution also applies to achievement tests when grades are interpreted without reference to factors affecting performance.
Create sustainable assessment. David Boud has developed the concept of the double duty of assessment in which “any assessment act must also contribute in some way to learning beyond the immediate task…assessment that meets the needs of the present and prepares students to meet their own future needs” (Boud 2002, pp. 8–9). In this way, assessment can help shape positive learner identities, equipping learners with confidence to face unknown futures.
In summary, assessment is a powerful social tool in the shaping of learner identities. The labels and judgments which assessments generate through scores and grades affect how we view our ability, attainments, and potential. Their impact has often been negative, especially as tests can never fully capture what we know, understand, and can do. We live in testing times but we need not be at the mercy of them.
- Binet, A. (1909). Les idées modernes sur les enfants. Paris: Flammarion.
- Boud, D. (2002). The unexamined life is not the life for learning: Rethinking assessment for lifelong learning. Middlesex: Professorial Lecture given at Trent Park.
- Bourdieu, P. (1991). Language and symbolic power. Cambridge, MA: Harvard University Press.
- Broadfoot, P. (1979). Assessment, schools and society. London: Methuen.
- Goleman, D. (1995). Emotional intelligence. New York/London: Bantam Books.
- Hacking, I. (2006). Kinds of people: Moving targets. The Tenth British Academy Lecture. http://www.britac.ac.uk
- Hanson, F. A. (1994). Testing testing: Social consequences of the examined life. Berkeley: University of California Press.
- Herrnstein, R. J., & Murray, C. A. (1994). The bell curve: Intelligence and class structure in American life. New York: Free Press.