Introduction

Facility with evolutionary concepts is foundational to a rich understanding of biology, and several large, collaborative efforts to improve undergraduate education have outlined this importance (American Association for the Advancement of Science 2011; Association of American Medical Colleges and the Howard Hughes Medical Institute 2009; National Research Council 2003, 2009, 2012). Thinking Evolutionarily, a report summarizing a convocation organized by the National Research Council and the National Academy of Sciences, lays out the value of and practical approaches to infusing the teaching of evolution throughout biology courses across K-12 and undergraduate curricula (National Research Council 2012). Focusing on undergraduate curricula, the American Association for the Advancement of Science report Vision and Change identifies core concepts within evolutionary biology for developing biological literacy (American Association for the Advancement of Science 2011). That succinct description of concepts has since been interpreted and elaborated for specific fields of biology (American Society of Plant Biologists and the Botanical Society of America 2016; Merkel et al. 2012; Tansey et al. 2013), and translated into a framework to help instructors align their departmental educational goals with Vision and Change (Brownell et al. 2014). However, even with clear educational goals in mind, carefully measuring student learning and adjusting teaching practices to achieve these goals is a daunting task (Handelsman et al. 2004).

One way to measure student learning, usually within the context of a single course or module, is by using a concept inventory. Concept inventories are test-based assessments of a concept or set of concepts, usually using multiple-choice questions (D’Avanzo 2008; Knight 2010). The incorrect choices for a question are called distractors, and are ideally based around common student misconceptions (Haladyna et al. 2002; Sadler 1998). For example, to create the Genetic Drift Inventory (GeDI), a concept inventory of genetic drift, the authors used student interviews and built upon previous work to identify six common student misconceptions about genetic drift, then designed many of the inventory’s questions to assess these (see Table 3 in Price et al. 2014, as well as Andrews et al. 2012). One misconception they identified was that “Natural selection is always the most powerful mechanism of evolution, and it is the primary agent of evolutionary change”, and four of the 22 questions on the inventory test some aspect of this misconception.

Despite the growing number of concept inventories assessing topics in evolution, there are many impediments to their widespread use among college instructors. First, the current concept inventories cover only a few of the major topics that may be taught in an undergraduate evolution course. In an analysis of peer-reviewed evolution education research, Ziadie and Andrews (2018) found that the majority of published papers pertaining to assessment of evolutionary concepts relate only to natural selection or phylogenetics (particularly tree-thinking). Many common topics in undergraduate evolution courses had limited or no coverage. In addition, Ziadie and Andrews note there are few literature reviews of such assessments, and that college instructors who wish to use these assessments in their teaching would benefit from a review of evolution-related assessments that summarizes both the topics and misconceptions covered and the differences in approach to their development.

Alongside the challenge of uneven coverage, college instructors also face barriers to translating this work into practical use (Anderson 2007). Instructors often have limited time and training to apply new teaching methods (American Association for the Advancement of Science 2011; Henderson et al. 2011; Henderson and Dancy 2007), and may face tensions with professional norms about scientific identity (Brownell and Tanner 2012). In some cases, discipline-based educational research may not be presented in a way that is clearly connected to classroom application (Kempa 2002). In other cases, instructors may not have confidence in the validity of the interpretation of educational research (Herron and Nurrenbern 1999).

Concept inventories avoid some of these concerns, as they are generally designed to be easily used within the current framework of a course. However, there are limitations to their effective use. The target audience is not always clear, and instructors may be unsure of exactly how to interpret results. Furthermore, concept inventories are often limited in their scope and interpretation, and can be influenced by the specific design of the test questions and logistics of test implementation. Understanding how the inventory creators gathered evidence about its validity (Box 1) is critical (Adams and Wieman 2011).

This paper aims to be a resource for college instructors in evolution, helping to minimize the challenges and maximize the benefits of using concept inventories in teaching. We present the logic of why and how an instructor might choose to use a concept inventory in their teaching, and summarize current evolution concept inventories. We also briefly outline the general process of concept inventory validation. To ground the discussion in practice, we explain several ways an instructor might use the inventory to support their teaching, including applications that do not require formal student test-taking.

Why and how to use concept inventories

Many papers have examined the goals and benefits of using concept inventories to inform undergraduate teaching (Adams and Wieman 2011; D’Avanzo 2008; Garvin-Doxas et al. 2007; Knight 2010; Libarkin 2008; Marbach-Ad et al. 2010; Smith and Tanner 2010; Steif and Hansen 2007). Here, we synthesize and build upon these goals, highlighting several key benefits of using concept inventories to inform teaching of evolutionary concepts.

Concept inventories with validity evidence based on test content can inform learning objectives within a course or across a broader curriculum

The majority (14 out of 16) of concept inventories relating to evolution that we identified had empirical evidence for the validity of the test content (see Box 1 and Table 1), meaning that there were several steps in the development of the concept inventory where content experts (i.e. evolution experts) or other sources of expert knowledge (e.g. peer-reviewed literature or textbooks) were consulted. A subset of these concept inventories also attempt to cover all major themes relevant for the given topic assessed in the concept inventory by asking the content experts to delineate main learning goals and concepts related to the topic. As such, these concept inventories can be used to identify potential core ideas related to a topic, which can in turn influence an instructor’s preparation for a course. If the instructor follows principles of backward design (Wiggins and McTighe 2005), then these concept inventories provide a ready-made list of learning goals and concepts relevant to the evolutionary topic.

Table 1 Types of test validity evidence

Concept inventories can identify key misconceptions students hold about an evolutionary topic

Most concept inventories are designed specifically to identify student misconceptions; the multiple-choice concept inventories often rely on distractor answer choices that align with common misconceptions. In addition, several of the concept inventory publications we examined directly identify (either with empirical data or by reviewing peer-reviewed literature) common student misconceptions related to that evolutionary topic. Instructors can benefit from knowledge of these common student misconceptions, given the empirical evidence that a powerful and engaging way to promote deep learning is by eliciting and addressing misconceptions in a systematic manner (e.g. Allen and Tanner 2005; Andrews et al. 2011; Gregory 2009; Nelson 2008). By examining the list of misconceptions identified during development of the GeDI (Price et al. 2014), JLH was able to design activities to directly confront these misconceptions, and incorporated a homework assignment where students were asked to reflect upon their own genetic drift misconceptions and explain why they were incorrect. Students were also challenged to explain why several common misconceptions about drift were incorrect. Once these misconceptions are identified, instructors may draw upon articles that provide further insight into these misconceptions (e.g. Andrews et al. 2012; Gregory 2008) and may look into peer-reviewed curricula for activities designed to counter misconceptions about evolution (e.g. Andrews et al. 2011; Govindan 2018; Kalinowski et al. 2013; Meisel 2010).

Concept inventories allow for measuring student knowledge in a topic before a course or module

In addition to the identification of common misconceptions about a given topic, instructors who have students take a concept inventory at the beginning of a course (or before the topic is covered in the course) can better identify the level of expertise the students have on the given topic, thus allowing the instructor to tailor the instruction to the students’ background knowledge on the topic. The concept inventory can also identify specific misconceptions that students in the class harbor, again allowing the instructor to design specific learning activities to counter those misconceptions.

Concept inventories can be used to compare students’ background knowledge on a topic across different course sections

Concept inventories can be used to compare student levels across different course sections. For instance, one of the authors (JLH) teaches a course that has several lecture sections, with different sections each having a different instructor. The instructors of the course each give a pre-course assessment with questions from several concept inventories. If one section has many more students holding a particular misconception than another section, the instructor of the former can spend more time addressing the misconception while the other instructors may not need to spend as much time. The scores on this standardized pre-course assessment also contextualize scores on other standardized assessments (e.g. mid-semester and final exams) that are shared in common across the course sections. The instructors have found, unsurprisingly, that in years where students have performed significantly lower in the pre-course assessment in one section, those same students tend to perform worse on the standardized mid-semester and final exams. Without these data, the instructors might have mistakenly attributed the differences in scores to differences in grading or teaching. While there might still be differences in these latter categories (despite the instructors’ best efforts to standardize teaching and grading), the scores from the pre-course assessment provide greater context on student background levels.

Concept inventories can be used to assess student learning during a course, module, or activity

Many concept inventories can be used for a pre/post assessment, where the concept inventory is given on the first day of class (or is assigned outside of class for homework or a small amount of participation or bonus points) and then again on the last day of class or embedded in the final exam. Such pre/post use can assess student learning of the particular evolutionary topic, and can also inform the instructor about which misconceptions, if any, the students still hold after the class, module or activity. In addition, there are some concept inventories (e.g. EcoEvo-MAPS; Summers et al. 2018) designed for longitudinal assessment of a given student cohort. Such an assessment can be given at multiple points throughout an undergraduate cohort’s college career, and provide valuable information on student learning throughout their time in the undergraduate program. Assessment data are crucial for the process of scientific teaching (Handelsman et al. 2004), and these data can also be used to identify demographic variables (e.g. ethnicity, gender, etc.) that correlate with learning or preparation if the instructors also collect this demographic information (Marbach-Ad et al. 2010).
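As a minimal sketch of the pre/post comparison described above, the following Python tallies, for each question, the share of students choosing each answer before and after instruction. The data layout, the function name `choice_shares`, and the sample responses are hypothetical illustrations and are not part of any published inventory or its scoring procedure.

```python
from collections import Counter


def choice_shares(responses):
    """Return {question: {choice: fraction of students}} for one administration.

    `responses` is a list of dicts, one per student, mapping a question
    label to the answer choice that student selected (hypothetical layout).
    """
    shares = {}
    questions = sorted({q for student in responses for q in student})
    for q in questions:
        # Only count students who actually answered this question.
        answers = [student[q] for student in responses if q in student]
        counts = Counter(answers)
        shares[q] = {choice: n / len(answers) for choice, n in counts.items()}
    return shares


# Invented example data: four students, two questions.
pre = [
    {"Q1": "B", "Q2": "A"},
    {"Q1": "B", "Q2": "C"},
    {"Q1": "A", "Q2": "C"},
    {"Q1": "B", "Q2": "C"},
]
post = [
    {"Q1": "A", "Q2": "A"},
    {"Q1": "A", "Q2": "C"},
    {"Q1": "A", "Q2": "A"},
    {"Q1": "B", "Q2": "A"},
]

pre_shares = choice_shares(pre)
post_shares = choice_shares(post)

# Report how the share of each choice changed from pre to post.
for q in pre_shares:
    choices = sorted(set(pre_shares[q]) | set(post_shares[q]))
    for c in choices:
        before = pre_shares[q].get(c, 0.0)
        after = post_shares[q].get(c, 0.0)
        print(f"{q} choice {c}: {before:.2f} -> {after:.2f}")
```

A distractor whose share stays high after instruction may flag a misconception that needs more attention. With small classes these fractions are noisy and should be read cautiously.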

Concept inventories can inform changes in instruction from year to year

The use of concept inventories to assess student learning in a course, track a cohort’s progress throughout their undergraduate careers, and identify remaining misconceptions can provide valuable feedback to instructors as they reflect on a course. These data can thus help identify both strengths and weaknesses in a given course, module, or activity, and the instructor can use these data to make changes as appropriate to the course. For instance, one of the authors (JLH) has made changes to his mid/upper-level evolution course, spending additional time on activities related to genetic drift, after questions from the GeDI in the first iteration of the course identified that students still harbored major misconceptions about drift and were not mastering the main learning objectives in a way that the instructor had hoped for. These questions from the GeDI will be used this semester to assess the impact of the changes made in the evolution course this year. Similarly, the use of concept inventories in a longitudinal fashion can also inform broader program-wide curricular discussions.

Concept inventories can inspire instructors to create their own activities and assessments

Finally, concept inventories can be a source of inspiration for instructors in terms of designing new activities and assessments. Concept inventories that have validity evidence based on test content have been reviewed by content experts, and looking at the concepts, misconceptions, and question formats can generate new ideas for instruction and assessment.

How to administer the concept inventory as a test

Several of the approaches above do not require you to actually administer the concept inventory as a test. However, you may wish for students to take the concept inventory to measure student learning or background knowledge. At this point several common questions arise. Is it okay to use a subset of the inventory questions? Should students take this in class, or can it be administered online? Will offering extra credit bias the participation? Choosing only a subset of questions may be practical, as it allows a shorter assessment that can be tailored to your course learning goals. However, the process of validation for an inventory is based around the complete question set. You can still learn useful information about student learning, but data cannot be easily compared with other instances of test implementation. When possible, refer to the statistical analyses of a test’s internal structure, which may reveal clusters of conceptually related questions that either form a natural subset or provide a basis to select questions that still span some breadth of content. Regarding test location and incentives, Madsen et al. (2017) review many studies of concept inventory implementation, noting that a small amount of extra credit may increase test completion without unduly influencing scores. Madsen et al. also argue strongly for the assessment to be taken in some supervised setting, though the format could be paper or online. This eliminates concerns about students using outside resources or saving and sharing questions outside of class, and can increase completion rates.

General steps to use concept inventories

While there is no set “formula” for how to use concept inventories, we delineate five general steps.

  1. Determine your goals for using concept inventories. In other words, how do you want to use concept inventories to inform your teaching? Which of the above goals do you wish to accomplish, and for which topic within evolution? Which classes are you thinking of using the concept inventory for? Is the class a non-majors class or one for biology majors? Is it an introductory or advanced class? Are you hoping to assess learning throughout the whole course, or for a specific module or activity? Thinking carefully about your goals and objectives is essential before you start looking at specific concept inventories.

  2. Identify and obtain relevant concept inventories. Once you have thought carefully about your goals, you can identify any concept inventories relevant to your chosen topic. Table 2 provides a current list of all concept inventories with content relevant to evolution as of the time of publication, as well as how to obtain them. Concept inventories are often, but not always, found in the relevant paper or its supplement.

    Table 2 Evolution concept inventories
  3. Review the details of the concept inventory and its development. We have summarized some features of each concept inventory (e.g. target population, time it takes to complete the concept inventory, types of validation evidence; Table 2). This information can help you check the appropriateness of the concept inventory to your class and your goals. If you plan to administer the concept inventory as a test and use the results to draw conclusions about student learning, make sure that the validation population is similar to your focal student population, and that the evidence the inventory creators present is convincing. When in doubt, consider ways that you might gather additional evidence to strengthen your confidence in the inventory’s use. For example, you could conduct student think-aloud interviews or use additional free-response questions (Table 1); Furtak et al. (2011) model this process as they performed additional validation and adjusted the Concept Inventory of Natural Selection (Anderson et al. 2002) for use with high school students. In addition, be sure to review the inventory’s associated paper for more details about the concept inventory’s development. These details can be a valuable resource to reveal student thinking about the concept.

  4. Establish a plan for how and when you will use the concept inventory. Once you have reviewed this information, you can then establish a plan of how and when you want to use the concept inventory for your class. For example, you might want to use the inventory both before and after a course or set of lessons, or you may only plan to use the assessment at a single time point.

  5. Assess and reflect on your data, if appropriate. Finally, after implementing your plan, it is vital that you assess and reflect on any data you may have gathered from utilizing concept inventories. These data should allow you to make changes as appropriate to your teaching, and you may then iterate through this process again to continually assess and improve student learning.

Limitations of concept inventories

We hope that concept inventories will prove useful to some readers who had not previously considered their application. However, there are limitations to the use of concept inventories that all instructors should be aware of prior to use. We group these limitations into three main categories: validation-based, cognition-based, and logistical.

For validation-based limitations, concept inventories can be influenced by students’ ability to think critically and understand advanced vocabulary and jargon (Knight 2010; Smith and Tanner 2010). While promoting critical thinking and knowledge of evolution vocabulary are important goals, the lack of a foundation in either may confound students taking a concept inventory even if they do have a good conceptual framework of the topic. As such, scores on the concept inventory may not necessarily reflect students’ true understanding of the topic. In addition, given that most of these concept inventories rely primarily on multiple choice questions (or agree/disagree questions with even fewer choices), student scores may be artificially inflated by guessing, which can lead instructors to overestimate students’ mastery. Several authors of concept inventories (e.g. Price et al. 2014) caution against relying on a single data point of student performance on a concept inventory, and instead advise faculty to focus on comparing student scores across different times (e.g. a pre/post test). Summers et al. (2018) also note that student motivation on a given assessment plays a role in student performance. Instructors are advised to emphasize to students that they should take each assessment seriously, or to use class time or incentives to encourage effortful completion.
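To make the guessing concern concrete, the expected score from blind guessing can be worked out with simple arithmetic: with n questions and k equally likely choices each, a student guessing at random answers n/k questions correctly on average. The sketch below assumes independent, uniformly random guesses, which is a simplification. The function name `expected_guess_score` and the question counts are illustrative and not tied to any particular inventory.

```python
def expected_guess_score(num_questions, num_choices):
    """Expected number correct if every answer is a uniform random guess."""
    if num_questions < 0 or num_choices < 1:
        raise ValueError("need num_questions >= 0 and num_choices >= 1")
    return num_questions / num_choices


# Illustrative numbers only: a hypothetical 20-question test in two formats.
four_choice = expected_guess_score(20, 4)  # multiple choice with 4 options
two_choice = expected_guess_score(20, 2)   # agree/disagree

print(f"4 options: {four_choice} of 20 expected by chance")
print(f"2 options: {two_choice} of 20 expected by chance")
```

Under these assumptions, a raw score of 10 out of 20 on an agree/disagree instrument is what chance alone would produce on average, which is one reason to compare scores across time points rather than interpret a single score in isolation.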

In addition, concept inventories may be limited by cognitive biases. Students’ mental models of an evolutionary concept may influence the accuracy of the concept inventory as an assessment of skill and knowledge. Novice students who have constructed naïve models of the concept may focus on (and thus be influenced by) surface features of the problem, such as the type of organism, while expert thinkers are able to identify the key biological concepts (Smith et al. 2013a). Studying student open responses to questions about evolutionary change, Nehm and Ha (2011) discovered that students perform worse when asked about evolutionary trait loss versus evolutionary trait gain, despite the two having similar explanations based on natural selection. Many other cognitive biases have been identified, including differences in student performance on questions testing identical evolutionary concepts when using familiar organisms versus unfamiliar taxa or when testing changes between versus within species (Nehm et al. 2012; Novick and Catley 2014; Opfer et al. 2012). Concept inventories that do not draw upon this body of knowledge to shape their design and validation may produce inaccurate results that are influenced by these cognitive factors, and instructors should be aware of these cognitive biases when teaching these subjects and using the concept inventories. For example, one may expect different patterns of student responses from a concept inventory on tree-thinking that uses only familiar organisms in its trees versus one that uses a mix of familiar and unfamiliar organisms.

There are also several logistical challenges to implementing concept inventories. While most of the evolution concept inventories that we identified (13 out of 16) rely on multiple choice questions, some assessments use open-ended questions. These questions require more time to grade, and there may be variation in scoring from one instructor to another, even with a given rubric. Furthermore, some concept inventories are not found in the associated peer-reviewed paper and thus may not be immediately accessible to instructors; we have attempted to alleviate this challenge by providing a column for how to access each concept inventory in Table 2. Despite this, some of the concept inventories require emailing authors, and other concept inventories may have restrictions on how they may be used. Finally, there may be problems with instrument validity if instructors use a partial set of questions from concept inventories, or even if they use questions in a different order (Balch 1989; Federer et al. 2015; Hambleton and Traub 1974), although a study that included analysis of question order did not find an effect for the GeDI (Tornabene et al. 2018). Using a partial set of questions may still provide valuable information to an instructor. However, it limits the instructor’s ability to generalize student performance to a measure of overall student facility with the broader concept, and restricts comparisons with other studies that use the assessment. In many cases this may not be a problem for practical use.

Identifying evolution concept inventories

To identify the currently published concept inventories, we conducted a comprehensive literature search with both Google Scholar and PubMed, using the search terms “evolution* ‘concept inventory’” and “biology ‘concept inventory’”. Although this helped us locate many inventories of evolutionary concepts, we continued to find others through published references to other, non-peer-reviewed work. After building the complete list, both authors conducted another search and double-checked each published inventory’s references, and the papers citing each inventory, finding no additional evolution concept inventories as of October 24, 2018.

In total, we identified 14 concept inventories assessing specific topics in evolution, 2 broader concept inventories that had some questions assessing evolutionary topics, and 2 genetics concept inventories with questions that may be useful to instructors teaching evolution. Table 2 summarizes these inventories. We categorized each concept inventory by topic, and created a table with inventory details including: target students, question types and number, validation population, and types of validity evidence. The authors each independently coded each inventory, and any discrepancies were resolved through discussion.

Opportunities for new assessments

Even with 14 evolution-focused concept inventories, coverage across topics was uneven (Table 3). Seven inventories assessed natural selection, four assessed phylogenetics, and other topics generally had coverage by one or no inventories. We also mapped the questions from the two broader inventories, Ecology and Evolution–Measuring Achievement and Progression in Science (EcoEvo-MAPS; Summers et al. 2018) and the Biological Concepts Instrument (Klymkowsky et al. 2010), onto the topics outlined above. The authors of EcoEvo-MAPS also have their own categorization for each of their questions, available by contacting the corresponding author. Natural selection and phylogenetics were similarly well-covered here, as well as macroevolution and population genetics. However, many topics were sparsely or not at all covered by any inventories: speciation, evolution of behavior, human evolution, molecular evolution, sexual selection, quantitative genetics, evolutionary medicine, biodiversity, and human impact. As new concept inventories are created, the process of validation (particularly student think-aloud interviews and other response-process validation) will hopefully continue to reveal new misconceptions and forms of assessment for these less-covered topics.

Table 3 Topic coverage by current evolution concept inventories

Conclusion

This paper argues for the varied and flexible potential uses of concept inventories to support undergraduate learning of evolution. Although concept inventories may not always be the ideal assessment instrument for your learning goals, published descriptions of their creation and validation offer a rich additional resource for assessment and curricular development. Despite the large number of topic-specific inventories, many concepts in evolution remain uncovered and could benefit from new assessments. By summarizing the evolution concept inventories and outlining their details and validation approaches, we hope that instructors can quickly identify instruments for further examination. There are surely many other creative ways to use these inventories; usefulness in service of student learning is the key objective.