Improving our understanding of the history of psychometrics was the main reason for doing an oral history project. Before we continue with the presidents’ perceptions, we will sketch a general (historical) framework that helps to contextualize the interviews.
Psychometrics originated at the end of the nineteenth century and early twentieth century, with the work of academics like Francis Galton, Karl Pearson, Charles Spearman, and Louis L. Thurstone. It has seen a number of shifts which closely resemble the four generations of test theory that Paul Holland (one of our interviewees) has conceptualized (Dorans, 2011). Holland’s delineation starts in the early twentieth century when test theory’s first generation started with developments in classical test theory, reliability, and validity. The second generation, which started in the 1940s and peaked in the 1970s, was concerned with the development of models for item-level data. The third generation, which started in the 1970s, focused on the statistical advancement of item-level models. The fourth generation attempts to bridge the gap between the psychometrician and the testing enterprise, by developing methods for differential item functioning or test equating.
When we transpose this delineation to psychometrics, we find that it lacks a clear role for factor analysis (a first-generation development and, as we will see later, considered crucial in the history of psychometrics) and structural equation modeling and multidimensional scaling as part of the second and third generation. Moreover, we consider the fourth generation to be broader than just bridging the gap between psychometrics and the testing enterprise: As we will see below, many fourth-generation psychometricians aim at finding connections with a variety of other sciences and enterprises, not just the testing industry.
Importantly, Holland argues that none of these generations has permanently ended: All generations, though some might have drastically shrunk over the years, are still active research domains, and Psychometrika still publishes research from all four. The most cited papers from the past two decades concern topics in structural equation modeling, reliability estimates, and advances on a variety of latent variable models (a mixture of topics from different generations). Articles on item response theory still make up a significant part of Psychometrika’s content, and historically speaking, articles on the analysis of proximities have also been one of Psychometrika’s pillars (Heiser et al., 2016). More recent directions are cognitive diagnosis, Bayesian methods for model estimation, and computer adaptive testing. What is interesting about this list is that topics like the replication crisis, questionable research practices, and the practice of educational measurement are, exceptions granted, usually not addressed in Psychometrika. Psychometrika mainly publishes in-depth theoretical and technical papers, not commentaries on research or testing practices. Psychometrics, as understood in this paper, is thus a highly technical, abstract, and model-based research domain.
Key Moments in the History of Psychometrics
In the interviews, we asked the presidents how they perceive the history of psychometrics, and especially what they believe were psychometrics’ key moments and main achievements.
One of the questions we asked was what the presidents believe is the most significant work or the most important psychometrician in the history of psychometrics. The most common answer (given by eight interviewees) was that this must be Lord & Novick’s Statistical Theories of Mental Test Scores (1968). Statistical Theories of Mental Test Scores came out at ETS (Bennett & Von Davier, 2017) and was one of the first works in psychometrics to give a formal treatment of classical test theory (Traub, 1997). Its publication took place in the midst of the shift from classical test theory to modern test theory, possibly the quintessential paradigm shift in psychometrics. Though classical test theory was, strictly speaking, never falsified, modern test theory became dominant in most psychometric research. Lord & Novick (1968) is one of the first comprehensive works to treat topics from both classical and modern test theories. Brian Junker praises it for having ‘everything from factor analysis to IRT and other things that are relevant to standard measurement questions in psychometrics. [...] there is a real effort to connect psychometrics to current thinking in statistics.’ Ivo Molenaar praises it for being:
on the transition of the old classical correlation-based and classical test theory-based models, to the item response models and latent trait models. [...] Fred Lord was the classical one, and Mel Novick brought in the logistic models, which was definitely a very important step for the psychometric community as a whole.
This strong consensus on the central importance of Lord & Novick’s Statistical Theories of Mental Test Scores is remarkable and invites further research on the effect the work has had on the development of the field.
Some presidents go further back in time to the early twentieth century and consider either Charles Spearman or Louis L. Thurstone, the founders of factor analysis, as the most important psychometrician in the history of psychometrics. Klaas Sijtsma regards Spearman as revolutionary:
He actually combined psychological problems he was struggling with, with the development of statistical tools that he needed to tackle those problems, and in a way, he is the founding father of classical test theory and factor analysis, which is not a small accomplishment; it is incredible.
Paul De Boeck states that, between Charles Spearman and Louis Thurstone, he prefers the latter, referring to Thurstone (1934): ‘He [Louis Thurstone] was doing factor analysis, but not just to measure. His paper was called ‘Vectors of Mind’, so he wanted to explain the human mind. He both had an interest in measurement, and an interest in understanding how the mind functions.’ Larry Hubert commends Thurstone for training and educating so many prominent psychometricians, like Paul Horst and Ledyard Tucker. And it was also Thurstone whom David Thissen admires most:
Thurstone made everything. Thurstone made the discipline; he came from nowhere, received degrees in things like engineering, and created quantitative psychology; he created scaling, he changed factor analysis into multiple factor analysis. He started the Psychometric Society.
Willem Heiser and Robert Mislevy consider Lee Cronbach as one of the most influential psychometricians in history. According to Heiser, Cronbach’s paper on the reliability coefficient is one of his most significant contributions (Cronbach, 1951), due to its applicability to practical problems in research, not only in psychology but also in medical science or other fields where measurement plays a central role. Mislevy praises Cronbach for thinking critically about psychological measurement and the inferences or conclusions you can draw based on certain data, referring here to generalizability theory (Cronbach et al., 1972): ‘he laid down some real mileposts, about how psychometrics is not just about measurement, it is about the quality and the nature of inferences that you’re making.’
Some presidents do not mention specific people, but rather focus on a typical psychometric idea that was historically significant. For example, Peter Bentler mentions the theory of error as an essential scientific contribution by psychometrics:
Very influential was the idea of errors in measurement, which of course, had been around for a long time in astronomy – it is not like Spearman invented it – but Spearman thought about it in a way that made it relevant to psychological measurement.
Jos ten Berge agrees: ‘The very simple fact that when you measure someone’s intelligence twice, you don’t get the same results, means that at least one of the two measurements cannot be correct, and that must be error.’ Not only is the idea of the quantification of error in measurement an important scientific contribution of psychometrics, but it also marks the attitude of the psychologist or psychometrician as a researcher. Jos ten Berge argues the following:
It is a very interesting fact that psychologists have a routine of evaluating their measurements, for instance, by reliability and validity studies. It is a form of self-criticism that often isn’t sufficiently appreciated. It is a very beautiful situation: a discipline that distrusts its own results.
The conceptualization of measurement error and its incorporation in psychometric models are thus seen as unique contributions of psychometrics to the sciences. Moreover, these contributions characterize how the psychometrician practices research: with a strong awareness of the imperfection of (psychological) measurement. Ten Berge’s remark underscores that the characteristic viewpoint of the psychometrician involves the recognition and appreciation of the problems involving psychological and educational measurement.
The Dark Ages of Psychometrics
According to several presidents, psychometrics’ most important contribution to society is psychological and educational testing. Testing has pervaded several phases in people’s lives, and psychometricians turned it into a standardized and reliable enterprise. However, for some of our respondents, measurement and testing do not resonate as something purely positive, and with good reason. Although the controversial part of the history of psychometrics was not an official interview topic, some presidents bring it up themselves, often torn between psychometrics’ controversial history on the one hand and its important achievements on the other. When David Thissen states that it was indeed testing that put psychometrics on the map, twice, he states that this was ‘for better or for worse.’ Jacqueline Meulman says that she:
was amazed by how many bad things had happened in psychometrics, I was flabbergasted. On the other hand, I was intrigued by the mathematical background of the methods I was reading about [...]. Although I did realize that many of the great psychometricians didn’t have very good political backgrounds, I was intrigued by the methods themselves [...].
The interviewees refer here to the controversial history of mental measurement, which was strongly intertwined with nineteenth- and twentieth-century politics, and especially eugenics. Eugenics, a scientific and political movement that aimed to improve the genetic quality of the human population and that thrived in the late nineteenth and early twentieth century (Chitty, 2007), was a popular ideology among many psychometricians, among them Charles Spearman, Lewis Terman, and James McKeen Cattell. In these times, the measurement of intelligence was often misinterpreted and misused to attribute differences in intelligence test scores to genetics (Jackson & Weidman, 2004; Richards, 2012). Predominantly during the late nineteenth century and early twentieth century (though not exclusively so), differences in scores on intelligence tests served as ‘scientific’ proof for the claim that some groups (African Americans, women, people of lower classes) were less intelligent and thus less worthy than upper-class white males. And though the Psychometric Society did not have an explicit eugenic ideology (or any political motivation for that matter), at least one president entertained similar ideas. Henry Garrett, president in 1943, supported the idea of hereditary racial differences in intelligence and racial segregation (Winston, 1998). The history of psychometrics is thus not a sequence of one groundbreaking scientific achievement after the other, nor were all psychometricians always distrusting of their results.
Other presidents also refer to the adverse effects of psychometric research. Bill Stout states that when done well, psychometrics can be very important, but psychometricians have also sometimes ‘oversimplified a very complicated subject.’ Here, Stout refers to the Bell Curve controversy (Herrnstein & Murray, 1994), a more recent example of how differences in intelligence scores are used to justify differences between races and social groups. Larry Hubert is highly critical of psychometrics’ past, and where other presidents see testing as a relatively positive contribution of psychometrics, Hubert is not so sure: ‘[...] I’m not sure if all in all the idea of measuring intelligence hasn’t brought more ill stuff than it has brought good stuff. The whole politics of race and psychometrics is not a very happy one.’ Though the dark ages of psychometrics were not an official interview topic, several presidents touch upon them on their own initiative, implying that these dark ages should not be overlooked in further historical research.
The Relationship Between Psychometrics, Psychology, and Statistics
As we discussed in the introduction, what is intriguing about psychometrics is its position relative to other disciplines. Though psychometrics originated in psychology, it is now closely affiliated with statistics as well. In this section, we will discuss how the presidents perceive the relationship between psychometrics and two of its closest neighbors: psychology and statistics.
Psychometrics, Psychology, and Educational Measurement
The relationship between psychometrics and psychology is hard to define, but the detachment between psychometrics and psychology (and also the detachment between psychometrics and educational measurement) rises to the surface in several interviews. What the psychometricians disagree on is whether this detachment is indeed an issue and, if it is, how psychometricians should act on it.
A particularly vivid illustration of the disconnected relationship between psychology and psychometrics is formed by the similarly detached attitude of some of the interviewees towards psychology. Some presidents express a certain ignorance of or lack of interest in what is going on in psychological research: They explicitly mention knowing little of psychology, or just not being interested in it. For example, statistician Bill Stout stresses the importance of statistics in psychological research but mentions not knowing enough about what is going on in the field of psychology to see how psychometrics can contribute. Jacqueline Meulman expresses her discomfort with topics in psychology or educational measurement and states she feels more at home in biostatistics. Though they appreciate fellow psychometricians who do psychological research, their own interests lie elsewhere.
This indicates an important change with respect to the early twentieth century: It is hard to imagine a similar attitude in the early days of psychometrics, when psychometrics and psychology were still closely connected. The remarks of some of the presidents show that it is currently possible to be a successful psychometrician and a president of the Psychometric Society without having either a background or an active interest in psychology or educational measurement: Having strong ties with mathematics or biostatistics is equally relevant and appropriate. Modern psychometrics has thus evolved into a field that is no longer dedicated to psychology alone and can no longer be defined as psychology’s statistical counterpart; instead, psychometrics has developed ties with different fields, which shows in the backgrounds and interests of the presidents of the Psychometric Society.
Several presidents argue that standardized testing or educational measurement is the most important contribution of psychometrics. However, some stress that psychometrics also has trouble reaching educational measurement: Similar to psychology, educational measurement is missing out on some of the newest psychometric methods. Susan Embretson explains that this is because ‘testing is the hardest thing to change’; people in education are slow in adopting cognitive theory for item construction. According to Jacqueline Meulman, educational measurement is missing out on psychometrics because ‘major testing institutes in the US don’t use the work of psychometricians, and there are even institutes or agencies that do testing that use nothing that comes out of the psychometric community.’ However, the detachment might be less severe than with psychology: Psychometricians like Wim van der Linden and Hua-Hua Chang also see many possibilities for psychometrics in educational measurement, especially for adaptive testing. According to Van der Linden and Chang, there is high demand for adaptive methods and they see this continuing in the future.
There are a number of possible explanations for the growing distance between psychology and psychometrics. David Thissen explains that, before the 1950s, a psychologist was also trained in psychometrics, but that, for the sake of the grant system, psychology departments have since been divided into subfields. ‘It is now almost inconceivable to get to this state of the art in more than one of these subareas, in one brain. You can never know enough.’ In other words, one becomes a social psychologist, a developmental psychologist, or a psychometrician, and there is very little mingling between the three professions. Related to this, Jan de Leeuw states that he considers it the job of the psychologist, not of the psychometrician, to engage with building psychological theories. According to De Leeuw, the psychologist and the psychometrician simply have different job descriptions, which means that the work they are doing is fundamentally different.
A second explanation has to do with how psychometric research is communicated to external parties. Bengt Muthén, Larry Hubert, and Peter Bentler express their opinion that Psychometrika or other psychometric literature can sometimes be too narrow in terms of content, and perhaps also too technical and too theoretical for the psychologist or educational researcher to read and use. Consequently, Psychometrika has become out of reach for applied researchers without thorough psychometric or statistical training. Psychometrics might thus have become too much of a niche, and consequently, detached from psychology.
For several presidents, the growing distance between psychology and psychometrics is a reason to worry. Klaas Sijtsma states that he now encourages ‘everybody to engage in theory building. So, to become a psychologist, rather than a psychometrician.’ He pleads for a more unified psychology, where once again people are trained both as psychometricians and as psychologists. De Boeck also pleads against using psychometrics as purely a statistical toolkit. ‘I think psychometrics is a way of thinking about substantive issues, and it’s possible to come up with ideas, substantive ideas, based on a certain way of understanding psychometric models.’ According to these presidents, psychometrics is not just a toolbox of purely statistical, data-analytic models, but a set of models and techniques that can inspire substantive thinking about psychological problems and thereby aid psychological theory building.
A reason why building psychological theory is no longer one of psychometrics’ priorities is given by Susan Embretson:
There is a whole breed of psychometricians out there who seem to have less of a substantive background, and I do not think that’s a good thing. I think they might be dealing with rather narrow statistical issues that are not really going to make a difference in the discipline [...]. So, I really see a necessity to keep quantitative methods attached to a discipline so it can influence that discipline.
According to Embretson, psychometricians can sometimes be too involved with technical details, whereas they should pay more attention to what they can contribute to psychological research. As mentioned earlier, Psychometrika mostly publishes articles on narrow, statistical issues, rather than articles that are relevant and readable for the psychologist. Psychologists might, therefore, not be inclined to look for relevant literature there.
However, the reason for the detachment does not lie with psychometrics alone. Several presidents mention the lack of interest of the psychologist in applying proper psychometrics. When we ask James Ramsay to characterize the relationship between psychologists and psychometricians, he answers:
I would say it is both distant and uneasy because the psychologist needs psychometricians badly, but quite frankly, once they have what they need, they do not want to hear anything else, so statistically speaking, it is a very conservative community.
It is hard to escape a sense of disappointment or frustration here. Psychometricians are not able to get their expertise across, whereas helping psychologists with their methodological problems is often considered part of the job description of the psychometrician. The psychometrician is supposedly the consultant who offers statistical or methodological advice, but psychometricians can only do their job if the psychologist seeks the psychometrician’s help when in need. In practice, this does not happen frequently enough, and that is a shame. Wim van der Linden states that psychometricians ‘could be a major support to psychology, make their measurement rigorous, and then plan their experiments better, help them model. [...] it could feed psychology.’ Psychometrics could thus provide valuable input for the psychologist, which the psychologist is now missing out on.
The interviews show that the relationship between psychology and psychometrics is nothing short of complicated. What makes the psychology–psychometrics relationship even more challenging is that psychometrics is also strongly affiliated with statistics, the topic of the next section.
Psychometrics and Statistics
After psychology, statistics is probably psychometrics’ closest kin, and the relationship between the two was frequently touched upon in the interviews. According to Brian Junker, the separation of psychometrics and psychology is not necessarily a reason to worry: ‘In a certain sense, psychometrics is by definition tied to psychology, but the methods are really just the methods of latent variable modeling for individual differences, and that may or may not be tied to psychology.’ According to Junker, psychometrics may have its origins in psychology, but this does not imply that psychology should be its only connection. Many presidents stress that it would be beneficial for psychometrics if it were to extend its influence to other fields. They believe psychometrics should make more effort to be taken seriously by other fields, like statistics, since it could make important contributions there as well.
Willem Heiser uses the metaphor of a river system to describe the relationship between statistics and other disciplines with a strong quantitative component:
A river system starts with small little rivers, and which is where I consider the various disciplines, like biology, psychology, economy, econometrics, chemistry. Those are the areas where people do quantitative things. Sometimes, they invent something for themselves which is useful for others, and then these techniques that are invented in a substantive area go down the stream to the big river. The big river is statistics, so to speak. That is where everything ends up.
According to Heiser, scientific disciplines with a quantitative focus each develop their own statistical methods, which at first are devoted to solving a specific substantive research question, but then get stripped of substantive interpretation. These models are subsequently free to move from the small river to the big river of statistics, which is filled with models developed in a wide variety of research areas. Not uncommonly, quantitative methods developed in one river find their way to other disciplines as well. An example of such a method in psychometrics would be factor analysis, which was originally developed to describe general intelligence, and has now found its way to other research areas both in and outside psychology (Young & Pearce, 2013).
The close connection between statistics and psychometrics becomes clear when we find that a number of presidents do not have a background in psychology, but in statistics or mathematics. Paul Holland articulates this close connection between the two: ‘I think that psychometrics has a very strong statistical side, I keep thinking of psychometrics as being part of statistics, not so much “psycho”. Even though the guys that invented the field all came from psychology.’ Like Willem Heiser, Paul Holland stresses that methods developed in psychometrics are no longer restricted to psychological research alone and can be used by other disciplines. Taking Holland’s perspective a bit further, we might say that psychometrics has lost its ‘psycho’-affiliation over the years and has become a type of modeling that is relevant for a variety of research domains (psychology, sociology, medical science, artificial intelligence) and can be gathered under the statistics umbrella.
Even though psychometrics and statistics have a close relationship, several presidents point out that psychometrics has a problem making that connection beneficial for both sides: There is plenty of sound, technically well-thought-out psychometric work that is useful for the statistician but is not recognized as such by statisticians. Jan de Leeuw gives a reason why original psychometrics did not strike a chord with the statisticians: ‘It was mostly because of the way the original factor analysts, who were psychologists, like Spearman and Cattell, presented [factor analysis] as some magical tool that could discover laws of nature by simple inductive data analysis.’ Interestingly, the same magic-jargon is mentioned by Bengt Muthén, who says that ‘statisticians think of that [factor analysis and structural equation modeling] as hocus pocus machinations.’ Psychometricians magically pulling ‘factors,’ such as intelligence, out of the hat did not sit well with the statisticians, who were possibly less interested in making strong substantive claims about the identity of latent variables than the psychometricians and psychologists at the time.
Moreover, some interviewees point out that, on a number of occasions, research done under the name of statistics had already been done in psychometrics. But because psychometrics is too much of a niche field, researchers from other fields simply do not know this. This leads to frustration among some of the presidents, since psychometrics could, in fact, contribute a lot to the field of statistics. According to Muthén:
[...] it is a strong tendency in statistical journals to refer to early statistical articles referring to the psychometric literature [instead of referring directly to the original psychometric literature] [...]. It seems psychometric publishing seems to be too separated from general mainstream statistical modeling [...].
Interestingly, the public relations issues of psychometrics seem to come up both with the psychology-oriented presidents and with the statistics-oriented presidents: Psychometricians are not able to reach out to either group and fail to receive acknowledgment for their work.
The Identity of the Psychometrician: A Multitude of Approaches
The sections above show that there are multiple ways in which psychometricians perceive their own field, and that contemporary psychometrics consists of a variety of approaches, each with its own ideas and visions. Below, we distinguish between five approaches we have recognized in the interviews. Our intention here is not to categorize each respondent and define them as a specific type of researcher, but to show there are different ways in which psychometrics research can or should be practiced, each prioritizing different characteristics or elements of psychometric research. The types discussed below underscore the plurality of approaches in a field that, to the outside, might seem relatively uniform.
First of all, unsurprisingly perhaps, we find psychometricians who identify themselves as both a psychometrician and a psychologist. The psychology-oriented psychometrician uses psychometrics as a way to improve psychological understanding and always has a substantive interest. According to the psychology-oriented perspective, psychometric models do not only describe or summarize psychological data but can help in understanding or explaining the data as well. The division between the psychometrician and the psychologist then becomes rather fuzzy: Psychometricians who are driven by substantive questions take on a double identity (being both a psychometrician and a psychologist) rather than identifying themselves as solely a psychometrician. For reasons cited earlier, people like Klaas Sijtsma, Susan Embretson, and Paul De Boeck are psychometricians who have a psychology-oriented approach.
Closely related to the psychology-oriented approach, but not entirely equivalent, is the consultant approach. The consultant aims to maintain a close relationship with psychologists and encourages collaborations, in which the psychologist comes up with a substantive research question, and the psychometrician offers methodological advice. The difference between the psychology-oriented approach and the consultant approach is that psychometricians of the first kind have an intrinsic interest in psychological theory and use psychometrics as a way to build psychological theories, whereas the psychometrician with a consultant approach prefers to aid psychologists in solving methodological and statistical problems and leaves the actual theory building to the psychologist. Peter Bentler and Bengt Muthén, who often collaborated with psychologists or other applied researchers and helped them solve complex methodological problems, might recognize themselves as taking up such a role in their research.
The Data Analyst
Third, we find that a number of presidents have more of a data analytic approach. These psychometricians view psychometrics as a toolbox that contains a set of models that are mostly of the latent variable type, which they consider applicable to a wide variety of data and disciplines. Though some of these models were perhaps originally designed for psychological measurement, in a data-analytic approach, these models are not necessarily used as substantive models and can be translated to several types of data for different types of purposes. The goals for the data analyst are usually not explaining the data or understanding the underlying mechanisms (which would be major motives for the psychology-oriented psychometrician) but rather to make predictions or summarize the data. Brian Junker, who, as quoted earlier, considers psychometric models to be translatable to all sorts of research problems. His view aligns with the data-analytic approach.
A fourth type we encountered is the engineer. Engineers are people who are interested in ‘making’ technologically advanced artifacts, which then find a clear application in society. Examples of such artifacts in psychometrics are innovative types of tests, like computer adaptive tests or simulation assessments, but also software programs. These applications then find their way to testing agencies, educational measurement, or the scientific community. Through these artifacts, the engineer may try to explain human behavior or solve challenging technical problems, but this takes place through a real-world application, rather than doing foundational or theoretical work only. People like Hua-Hua Chang, Wim van der Linden, and Robert Mislevy are co-builders of such applications and share an engineering-approach.
Lastly, we distinguish the mathematician, who derives the most joy from proving a mathematical theorem or solving a technical problem, without necessarily feeling the need to find an application or answer a substantive research question. The mathematician approach therefore does not require collaboration with psychologists or other applied researchers. For the mathematician, knowledge for the sake of knowledge (not for the sake of application) is sufficient. Moreover, the indisputable quality of mathematics, proving a theorem once and for all, has an incredible appeal to some of the presidents. Jos ten Berge stresses that what he likes so much about psychometrics is ‘the absolute certainty with which you can decide about what is true or isn’t true. The mathematical part of it.’ This sentiment is shared by Jan de Leeuw, who finds psychology too ‘debatable, or uncertain, or up in the air,’ and who appreciates the beauty of mathematics.
Two Dimensions of Psychometric Research
Naturally, a psychometrician does not necessarily fall under only one of the categories above: A combination of approaches is equally plausible. For example, someone who is a designer of technologically advanced tests—whom we might characterize as having an engineering approach—may also be interested in learning mechanisms in school children and thus have a substantive or psychological interest as well. For this reason, we summarize these categories in two dimensions, one ranging from ‘psychology’ to ‘statistics,’ the other ranging from ‘theoretical’ to ‘applied.’ Our respondents differ from each other in whether their research is driven by psychological questions or technical statistical issues, and at the same time, they differ in how strongly they concern themselves with applied or theoretical topics. Someone with a mathematical approach is more on the theoretical and statistical side of both dimensions, whereas the psychometrician with a strong interest in psychology can be located in the psychology/theoretical corner (or more on the applied side, if this psychometrician has a strong focus on doing applied research). These dimensions thus describe core aspects of the multifaceted identity of psychometric research.
The Future of Psychometrics
The interviews provided an excellent opportunity to invite the presidents to look into the future of psychometrics and ponder the directions it might take. Some presidents think psychometrics will remain relevant. Jos ten Berge stresses that since psychologists do not have the technical training that psychometricians have, there will always be a need for psychometricians. According to David Thissen: ‘[...] testing will continue to develop and continue to be a thing that is done for placement in education, in jobs. [...] I think testing still has some decades, if not centuries in it.’ Testing thus remains an important application of psychometrics. Analyzing test data well and making the right decisions based on test scores are still crucial in today’s society and will most likely remain so in the upcoming decades. Moreover, testing now transcends traditional paper-and-pencil formats, and new types of tests are continuously being developed. The expertise of the psychometrician is therefore relevant and will remain so in the future.
However, the future relevance of psychometrics does not seem guaranteed. A number of interviewees express uncertainty about whether psychometrics has a fruitful future. Though the interviewees disagree on what the future holds, several presidents agree that a prosperous future for psychometrics is not a given. Psychometricians will have to put in the effort to make themselves relevant.
Some presidents point out that psychometrics has a serious PR problem and has to work hard to be heard, whether by psychologists or by other possible collaborators, and many see challenges in selling psychometric research to relevant parties. In fact, Wim van der Linden considers the inability of psychometrics to market itself its biggest pitfall. He blames this inability on the slow development of good, user-friendly software in psychometrics, which would have paved the way for selling psychometric models at an earlier stage. Robert Mislevy states that ‘it is easier to get people to recognize the value and the use of psychometric techniques if you do not call them psychometric techniques until you have worked with them for a couple of months at least!’ The presidents think it is crucial that psychometric knowledge is not lost over time, but psychometrics will have to devise a plan to remain influential. Mislevy continues: ‘there are very rapid advances today in technology, in psychology, in learning analytics, and the biggest challenge of psychometrics is not getting left in the dust.’
When asked about what the future holds for psychometrics, some respondents refer to the big data era, and how psychometrics could contribute to such new developments. Some say that the big data era provides an opportunity for psychometrics, and that again, we should not miss the boat. Ulf Böckenholt is full of optimism: ‘We live in the age of big data, the age of self-quantification. I carry a Fitbit. It is the dream of the psychometrician!’. And, according to Paul Holland, ‘The future of psychometrics is about the open-mindedness of all the different varieties of the ways that people collect data and try to draw conclusions and to make sense of it.’ It is the age of big data, and human response data are anything but extinct. In fact, more and more different types of data, in need of thorough analysis, are coming our way. And, according to Hua-Hua Chang, psychometricians have relevant knowledge that other researchers do not:
Everyone is talking about big data, but what is big data? How is the data collected? I think our psychometricians should do a good job of making sure data is collected reliably. How was the data collection designed? Does it have high validity? [...] That will make psychometricians even more important.
Thus, big data need to be analyzed appropriately, and psychometricians have the tools to get involved, also when the nature of these data is significantly different from traditional testing data.
But even though the big data movement seems more than promising, Jacqueline Meulman warns against the hype. According to Meulman, both psychometricians and statisticians should be critical of this development. Instead, psychometricians should claim back their own field:
They should say, ‘psychometrics is our area, and testing is from our origins, and we should claim it back.’ I am amazed sometimes by things I see on the Internet, that major agencies that do testing have no clue what psychometrics is all about.
Meulman stresses that it is by no means her intention to ignore developments in data science, but that it is essential to be on guard against these modern trends and to remain influential where psychometrics has always been needed the most: the testing industry. Ivo Molenaar also warns against the rise of big data: ‘I think that they [the psychometricians] have more computational possibilities now and have what they call big data [...]. I am getting old-fashioned, so I think maybe you should not collect that many data because it is only going to cause you problems.’ Molenaar refers here to the danger of overfitting and the lack of critical thinking in a mostly computer-driven process.
The future of psychometrics is thus regarded with careful optimism. Several presidents believe that psychometrics will remain relevant for psychology and the testing industry. But where some presidents stress the importance of opening up to contemporary scientific ideas, others explicitly warn against these new developments. Both sides are afraid psychometrics might remain too isolated and out of touch with the scientific playground.
Psychometrics might thus benefit from a change of course. But what change? It is challenging to extract a single recommendation from all twenty transcripts. What we can safely conclude is that contemporary psychometrics is essentially a pluralist research area, and it is this plurality that needs cherishing. This does not mean that we should just ‘let things be pluralist’ and each go our own ways, which is perhaps what is happening now. Instead, psychometrics needs to make explicit what a plurality of goals and approaches actually entails. What are the avenues that psychometrics aims to tread? What is psychometrics’ mission, and what are its priorities? Where and how does psychometrics want to contribute? We recommend that the Psychometric Society and other psychometric institutes list their priorities and make the resulting mission statement public. Based on the interviews, these priorities could include: (1) building psychological theory, (2) improving educational measurement in terms of fairness or reliability, (3) constructing and distributing user-friendly software for the analysis of behavioral data, and (4) developing new methods for data analysis. Such a list of priorities not only makes it easier to communicate to external parties what psychometrics does and values (a concern of many of our presidents), it can also offer guidance on relevant topics for sessions at meetings and for the publication of articles. With this recommendation, we have no intention of preventing researchers from pursuing a path that is not listed as a priority. However, a more active policy may provide some clarity and guidance for a field that, if current trends continue, will only become more fragmented and diverse over time.
A second recommendation concerns psychometrics’ relationship with its past and how its history shapes contemporary psychometrics. Early psychometricians like Francis Galton, Lewis Terman, and James McKeen Cattell were often devoted to a specific social ideal (often associated with the highly controversial ideas of eugenics), and they expressed these ideals in their academic work. It is interesting to see that contemporary psychometricians do not often engage in public debate, even when educational measurement is again part of a heated discussion, and that Psychometrika rarely publishes articles about such themes. Perhaps psychometrics’ controversial history functions as a warning against strong social involvement. Instead, contemporary psychometrics engages in highly technical work that, on the face of it, often seems to be detached from social reality. Psychometricians’ reluctance to speak publicly does not help improve their visibility, and importantly, it might lead to outcomes that are completely undesirable to the psychometrician (e.g., the possible decline of reliable measurement in schools or the rise of irresponsible data analysis). Whatever the reason for psychometrics’ current absence from public debate, we recommend that psychometricians engage in matters that touch upon their expertise, not only as a way to increase their visibility but, more importantly, because they have expertise that matters.