Encyclopedia of Science Education

Living Edition
Editors: Richard Gunstone

Inquire, Assessment of the Ability to

  • Wynne Harlen
Living reference work entry
DOI: https://doi.org/10.1007/978-94-007-6165-0_62-2




Keywords: Formative; Inquiry skills; Reliability; Summative; Test; Validity

To inquire (also spelled enquire) is a term used in both daily life and education, meaning to investigate or seek information to answer questions. In education, the ability to inquire is relevant in many subject domains, such as history, geography, and the arts, as well as science, mathematics, technology, and engineering, whenever questions are raised and the skills of generating, collecting, and using data are employed in developing understanding. In science, understanding of the natural and made world is developed through using skills such as raising questions, collecting data, reasoning and reviewing evidence in the light of what is already known, drawing conclusions, and communicating results. Although an inquiry is generally initiated by a question, in education the value of the activity is more than finding an answer; it contributes both to the understanding of the “big ideas” that apply beyond the specific event or phenomenon being studied and to the development of skills that enable further learning.

Science Inquiry Skills

Skills used in scientific investigation and inquiry are identified in slightly different ways in different curricula and standards statements. However, they have much in common and generally include the following:
  • Asking questions

  • Generating hypotheses or possible answers

  • Making predictions

  • Planning and carrying out investigations

  • Analyzing and interpreting data

  • Constructing explanations based on evidence

  • Evaluating and communicating findings

The assessment of the ability to use these skills has to take account of three key points. First, students must be using, or be given the opportunity to use, a skill in order for their ability to be assessed. Second, any skill has to be used in relation to some subject matter: there can be no “content-free” skill. Questions are asked about something, observations are made of particular objects and events, and investigations are planned to answer questions about particular phenomena. Some subject matter is therefore always involved when skills are used, and what that subject matter is makes a difference to whether the skills are used. For example, a student may be able to plan an appropriate investigation in a situation where he or she knows which variables are likely to need controlling, but fail to do so when the subject matter is unfamiliar. This has important consequences for assessment. The subject of a particular task or test item is just one of a potentially large number of alternatives. Practice bears out that a student’s result could differ if an alternative subject had been chosen; thus there is a variation in the results associated with the choice – an unavoidable error – since no two tasks with different subject matter or contexts can be exactly equivalent. To emphasize the role of content knowledge, the National Research Council adopted the term “practices” in place of “skills” (NRC 2012, p. 30). However, given the acknowledgment that some knowledge of the content is always involved, it does not seem necessary to abandon this familiar term.

A third key point follows from the general recognition that assessment should be aligned with the educational goals and student learning objectives to be achieved through the curriculum content and pedagogy. This alignment is essential for validity of the assessment. Following the conception of the validity of an assessment as the extent to which there is evidence supporting inferences drawn from the results (Messick 1989), the validity of an assessment of ability to inquire depends on evidence that, when being assessed, students are engaged in activity that involves the use of some or all of the inquiry skills. This would not be the case, for instance, if a student simply recalls the answer to a question rather than using skills to work it out. Furthermore, the processes being used by the students should reflect the view of learning underlying inquiry-based pedagogy. This view is that students are active agents in their learning, bringing their existing experience and ideas to bear in pursuing questions or addressing problems that engage their attention and thinking. By collecting information for themselves, they have the evidence of what works and what does not work in helping them to make sense of different aspects of the world around. In addition, there is emphasis on individuals making sense of experience with the help of others, indicating a sociocultural constructivist perspective on learning and underlining the value of collaboration, communication, dialogue, and argumentation.

Assessment Purposes

Assessment of students’ learning involves generating, collecting, and interpreting evidence for some purpose. Three main purposes of assessment are commonly identified: assessment to assist learning (formative assessment), assessment of individual students’ achievement (summative assessment), and assessment to evaluate programs (Pellegrino et al. 2001). The focus here is on the first two purposes, which have a direct impact on individual students.

Formative Assessment of the Ability to Inquire

The practice of formative assessment, through teachers and students collecting data about learning as it takes place and feeding back information to regulate teaching and learning, is clearly aligned with the goals and practices of inquiry-based learning. It also supports student agency in learning through promoting self-assessment and participation in decisions about next steps, helping students to take some responsibility for their learning at school and beyond. Thus formative assessment fosters inquiry-based learning through supporting students in gathering and interpreting evidence in a manner that develops their understanding.

Gathering and Interpreting Data

Formative assessment is essentially in the hands of teachers, who gather evidence of students’ skills and understanding by:
  • Using questions designed to elicit students’ thinking and reasons for their actions

  • Promoting classroom dialogue

  • Reviewing students’ notebooks

For this purpose, teachers’ questions are best framed to show interest in the students’ thinking (“What are your ideas about what’s happening here?”) and to encourage the use of inquiry skills (“How are you going to test that idea?” “What will you do with these results?”). Promoting collaboration and dialogue among students not only fosters shared thinking but provides opportunity for teachers to observe how students interact and to listen to what they are paying attention to and how they are using words. The contributions and thinking of individual students can be gleaned from review of their notebooks.

However, whether the assessment is formative depends on how the evidence is interpreted and used. In formative assessment, interpretation is in terms of progress toward the specific goals of the lesson or unit of work. Both teacher and students should be aware of these goals, which determine the kind of evidence required to judge students’ progress. Through discussion with students, questioning that elicits their understanding of what they are doing and listening to how they explain what they are doing, the teacher decides about the relevant next steps, which may be to intervene or simply to move on. Assessment does not need to lead to action in order to be formative – an appropriate decision may be to take no action to change the ongoing activity.

Feedback to Students and into Teaching

The main use of evidence in formative assessment is to provide feedback, which is a two-way process: feedback from teacher to students and feedback from students into teaching. Feedback to students is the mechanism by which future learning opportunities are affected by previous learning and as such has the potential to be a powerful influence on learning. Feedback is most obviously given by teachers to students orally or in writing but also, perhaps unconsciously, by gesture, intonation, and indeed action, such as when assigning tasks to students. The focus and form of the feedback have to be carefully judged by the teacher. The focus of the feedback influences what students pay attention to, and the form it takes determines whether it can be used to advance learning. The work of Butler (1988) has been of considerable influence in distinguishing between judgmental and nonjudgmental feedback. Feedback that helps learning should be nonjudgmental, that is, it should:
  • Focus on the task, not the person

  • Encourage students to think about the work, not about how “good” they are

  • Indicate what to do next and give ideas about how to do it

In contrast, feedback that is judgmental is expressed in terms of how well the student has done (this includes praise as well as criticism) rather than how well the work has been done, making a judgment that encourages students to label themselves and compare themselves with others.

In formative assessment, feedback into teaching, using information that teachers pick up from observing their students, is used to inform teachers’ decisions about how to help students take their next steps in learning. This feedback enables teachers to adjust the challenges they provide for students to be neither too demanding, making success out of reach, nor too simple to be engaging. In this way, teaching is regulated so that the pace of moving toward the learning goals is adjusted to ensure the students’ active participation.

Student Self-Assessment

An important source of feedback to the teacher comes from students’ self-assessment and peer assessment, since the criteria students use in judging the success of their work reflect their understanding of what they are trying to do. Involving students in self-assessment and in making decisions about what their next steps should be and how to take them is a shared aim of formative assessment and of inquiry-based teaching. A prerequisite for being able to judge their work is that students understand what they are trying to do, not in terms of what is to be found, but in terms of the question to be addressed or problem to be solved. In addition, they need to have some notion of the standard they should be aiming for, that is, what is “good work” in a particular context. The criteria to be used by students in assessing their work can be conveyed implicitly through feedback from the teachers or developed more explicitly through brainstorming with students about, for instance: What makes a good plan for an investigation? What should be included in a good report of an inquiry? Understanding the goals of their work and the quality criteria to be applied supports the aim of increasing students’ responsibility for their work and develops their recognition of what is involved in learning (metacognition).

In Summary

Key practices of using assessment formatively to develop students’ ability to inquire are:
  • Students being engaged in expressing and communicating their understandings and skills through classroom dialogue, initiated by questions framed to elicit students’ thinking

  • Feedback to students that provides advice on how to improve or move forward and avoids making comparisons with other students

  • Teachers using information about ongoing learning to adjust teaching so that all students have opportunity to learn

  • Students understanding the goals of their work and having a grasp of what is good quality work

  • Students being involved in self-assessment so that they take part in identifying what they need to do to improve or move forward

  • Dialogue between teacher and students that encourages reflection on their learning

Summative Assessment of Ability to Inquire

Summative assessment is not a continuous part of teaching and learning as is the case for formative assessment where skills and understanding are assessed during inquiry-based activities. Rather, it takes place at certain times when a summary of students’ achievement is needed in order, for example, to report to parents, students’ next teachers, and the students themselves; to select students for courses; to accredit their learning; or to monitor progress of individuals and groups of students as they pass through the school. Information for these purposes may be gathered in various ways, the most common falling into three main groups:
  • Tests or special tasks given under controlled conditions or embedded in classroom activities

  • Summarizing information gathered by teachers during their work with the students over a period of time

  • Building a record over time, as in a portfolio created by teachers and/or their students

The choice of method will depend on the use to be made of the result and the demand that the particular use makes for reliability of the results. The reliability of an assessment refers to the extent to which the results can be said to be of acceptable consistency or accuracy for a particular use. Reliability is defined as the extent to which the assessment, if repeated, would give the same result, and where possible it is estimated quantitatively. In the case of formative assessment, judgments are made about action to take in a particular situation involving only the teacher and students, and the notion of making a repeatable judgment is not relevant. No judgment of grade or level is involved, so reliability in this formal sense is not an issue in formative assessment. However, when assessment results are used by others and may involve students being compared or selected, as in summative assessment, reliability becomes important.
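One common way of estimating reliability quantitatively is an internal-consistency coefficient such as Cronbach’s alpha, which treats the items of a test as repeated measurements of the same underlying ability. The sketch below is purely illustrative – the student-by-item scores are invented, and the entry itself does not prescribe any particular coefficient – but it shows the standard calculation:

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Internal-consistency reliability estimate.

    scores: 2-D array, rows = students, columns = test items.
    """
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # sample variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of students' total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical scores for five students on four inquiry-skill items (0-3 scale)
scores = np.array([
    [2, 3, 2, 3],
    [1, 1, 2, 1],
    [3, 3, 3, 2],
    [0, 1, 1, 1],
    [2, 2, 3, 3],
])
print(round(cronbach_alpha(scores), 2))  # → 0.91
```

A value near 1 indicates that the items rank students consistently; values are conventionally read against rough thresholds (e.g., above about 0.7–0.8 for comparisons between students), though what counts as “acceptable” depends on the use to be made of the results, as the entry emphasizes.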

It is important to realize that the extent to which the reliability of an assessment can be raised is limited by the interaction of reliability and validity and the effect that optimizing one has on the other. This is best illustrated in relation to items in a test. Attempts to ensure high reliability will inevitably favor the inclusion of items that can be consistently marked or marked by machine, limiting the range of outcomes that can be covered in the test and lowering its validity. Extending the range of what is assessed to the application of knowledge and skills requires the use of more open-response items where judgment is needed in marking, inevitably reducing the reliability. Thus there is a trade-off between reliability and validity which applies to all summative assessment whatever form it takes. It presents a particular problem in the assessment of skills such as those involved in the ability to inquire.

Using Tests or Special Tasks for Assessing Ability to Inquire

The use of tests or special tasks is a time-honored approach to summative assessment. It is attractive because the tasks can be controlled and presented to all students in the same way, thus appearing to give the same opportunities for students to show what they can do. Tests and tasks can take different forms (e.g., written or performance) and can be presented in various ways from highly formal tests to special tasks embedded in normal work.

For ability to inquire to be validly assessed, the tasks or test items should require the use of inquiry skills. But, as noted earlier, skills are used in relation to some content, and so the task will be set in a context, requiring the skills to be used in relation to particular subject matter. Various steps can be taken to reduce the influence of knowledge of the subject matter. For instance, in Fig. 1 the subject is chosen as likely to be very familiar to the 11-year-old students concerned and thus does not constitute a barrier to engagement. In Fig. 2, all information needed about the subject for answering is given in an attempt to ensure that this knowledge is not a barrier.
Fig. 1

From APU Report of Science at age 11, DES 1985

Fig. 2

PISA assessment of Science 2000

Figure 1 is an item used in a survey of students aged 11 years. The subject matter is likely to be familiar to these students; thus the level of knowledge required is low, and the main burden of the task is conducting a fair test. The format for answering – and the requirement of the scoring rubric for the answer in each box to be correct – makes the chance of succeeding by guessing very low. But it also means that students have to read and understand the instructions for recording their answer; otherwise, there is a risk of failure for reasons other than not having the skill needed to answer the question.

Figure 2 is an item written for the PISA surveys of 15-year-olds (OECD 2000). Students are asked to use the given information to support alternative conclusions about action that could be taken. The information is authentic and presents the sort of problem that students able to inquire should be able to engage with. The two parts to the task illustrate the uncertainty of interpreting scientific information in certain cases. In theory, all the information is provided, and the students are told how to interpret the chart. They do not need to know how carbon dioxide, methane, and particles and their effects on clouds cause heating and cooling. However, it is arguable that without any knowledge of these things, the question is likely to be meaningless, and they are unlikely to engage with the problem posed.

Figures 1 and 2 illustrate some of the features of written test items that endanger the validity of the test. The most obvious is the inevitable demand for reading and understanding the question and, depending on the answer format, for writing ability. In addition, the attempt to place the task in a context that can seem real to the student means that some sort of “story line” is presented as a context for the task. Students have to read and engage with the context in order to respond to the question. There is evidence that these features of the item do affect students’ measured attainment. The effect of the choice of a particular context can be reduced by using a range of contexts, balancing out the effect of any one. But since there is a limit to the length of a test, this would mean a larger number of shorter items, each assessing a small part of the ability to inquire. It raises the question of whether this is a valid way of assessing the ability to combine different skills in conducting a whole inquiry.
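The balancing-out effect of using a range of contexts follows from simple sampling logic: if the choice of context adds roughly independent random variation to a student’s score, averaging over n contexts shrinks the context-related spread by a factor of about the square root of n. The simulation below is an illustrative sketch only – the “true ability” and context spread are invented numbers, not data from the surveys discussed here:

```python
import numpy as np

rng = np.random.default_rng(0)

TRUE_ABILITY = 0.6   # hypothetical underlying inquiry-skill score
CONTEXT_SD = 0.15    # hypothetical spread introduced by the choice of context

def observed_scores(n_contexts: int, trials: int = 10_000) -> np.ndarray:
    """Simulate many assessments, each averaging over n randomly chosen contexts."""
    context_effects = rng.normal(0.0, CONTEXT_SD, size=(trials, n_contexts))
    return TRUE_ABILITY + context_effects.mean(axis=1)

for n in (1, 4, 16):
    # Spread of results shrinks roughly as 1/sqrt(n): ≈ 0.15, 0.075, 0.038
    print(n, round(observed_scores(n).std(), 3))
```

This is the same reasoning that motivates shorter items in greater number: each extra context sampled reduces the error attributable to any single choice, at the cost, as noted above, of fragmenting the inquiry into small parts.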

Some of the deficiencies of written tests – particularly the dependence on reading and writing – can be avoided by performance items, where students carry out a whole or part of an investigation with real objects and equipment. The question still has to be presented to the student, who has to engage with it as if it were his or her own, and the situation is far from that of a normal classroom, since the students may be working alone (sometimes in pairs) with an administrator present to observe their actions. However, it does give an opportunity for students to explore, try out approaches, and start again if necessary. The main problem is one of generalizing from the very small number of extended investigations that it is feasible for any one student to undertake. Again it is the context that has a strong influence on the outcome. There is strong research evidence that students who perform well in one investigation will not necessarily do so in another testing the same skills but in a different context. Consequently, it is useful to consider alternatives to tests.

Summarizing Teacher-Based Assessment

One of the main alternatives to tests draws on the fact that the experiences that students need in order to develop desired skills also provide opportunities for their progress to be assessed. The key factor is judgment by the teacher. Assessment by teachers can use evidence from regular activities supplemented, if necessary, by evidence from specially devised tasks introduced to provide opportunities for students to use the skills to be assessed. The limitation on the range of evidence that can be obtained through a test does not apply when assessment is teacher based.

There are other advantages that go beyond more valid assessment of understanding and inquiry skills, since a greater range of competences can be included. Observation during regular work enables information to be gathered about processes of inquiry rather than only about products.

Assessment by teachers is not just a matter of teachers using their individual judgments about what evidence to use and how to interpret it. Summative assessment by teachers must follow agreed procedures and be subject to quality control measures appropriate to the use of the results, that is, stricter control for higher-stakes use. Evidence of students’ use of inquiry skills will be gathered by various means, as for formative assessment, and interpreted using broad criteria relating to skills development. Procedures for making judgments generally involve some to-ing and fro-ing between data and criteria to make an “on-balance” judgment as to which particular criteria are met. It is common for criteria to be identified at different “levels,” so that the outcome of the assessment can be expressed in terms of the level at which a student is performing. Levels are produced by mapping the progress of students in a particular area of learning, using evidence from research and from teachers’ experience. Some care has to be taken in using levels, however, as there is a risk of students becoming labeled and indeed labeling themselves in terms of levels achieved (Harlen 2013).

The most commonly expressed criticism of assessment by teachers concerns the reliability of the results. It can indeed be the case that, when no steps are taken to assure quality, teachers’ judgments are prone to a number of potential errors. However, there are various ways in which the reliability of teacher-based judgments can be brought to a level comparable with that of tests. These include group moderation and the use of exemplars. In group moderation, teachers meet to review samples of students’ work. The purpose is not to verify decisions about particular students’ work, rather to arrive at shared understandings of criteria and how they are applied, thus improving the reliability of future assessments. The provision of examples of students’ work (which can be in the form of video recording of inquiry in action) shows how certain aspects relate to the criteria of assessment, clarifying the meaning of the criteria in operation. Good examples also indicate the opportunities that students need in order to show their achievement of skills.

Building a Record over Time

This approach to summative assessment creates a portfolio that is not a sample of all a student’s work over a period of time, but reflects the best performance at the time of reporting. The evidence is accumulated gradually by retaining what is best at any time in a folder, or other form of portfolio (including computer files), and replacing pieces with better evidence as it is produced. The evidence can take a variety of forms, from photographs, videos, and artifacts to writing and drawings. The approach enables students to have a role in their summative assessment by taking part in the selection of items for the folder or portfolio, a process for which they need some understanding of the broad goals and quality criteria by which their work will be judged. It is important that time is set aside at regular intervals specifically for students to review their work. This gives them time not only to decide what to put in the “best work portfolio” but also to consider what they can improve.

The final form of the portfolio is assessed at the time when a summative judgment is needed, either by the teacher or by external assessors, depending on the purpose and requirements of the assessment procedures. The process involves comparing evidence from the portfolio with the criteria to identify the “best fit.”

In Summary

Some key features of summative assessment of ability to inquire are:
  • Taking place at certain intervals when achievement has to be reported

  • Requiring methods which are as reliable as possible without endangering validity

  • Involving students using inquiry skills within a context, the nature of which is likely to affect students’ engagement and performance

  • Reporting achievement in terms of criteria describing the extent of use of inquiry skills

  • Involving some quality assurance procedures commensurate with the use made of the results

  • Where appropriate, involving students in the assessment and in this way contributing to their learning



  1. Butler R (1988) Enhancing and undermining intrinsic motivation: the effects of task-involving and ego-involving evaluation on interest and performance. Br J Educ Psychol 58:1–14
  2. Harlen W (2013) Assessment and inquiry-based science education: issues in policy and practice. http://www.lulu.com/content/paperback-book/assessment-inquiry-based-science-education-issues-in-policy-and-practice/13672365
  3. Messick S (1989) Validity. In: Linn R (ed) Educational measurement, 3rd edn. American Council on Education/Macmillan, Washington, DC, pp 13–103
  4. NRC (National Research Council) (2012) A framework for K-12 science education. National Academies Press, Washington, DC
  5. OECD (2000) Measuring students’ knowledge and skills: a new framework for assessment. OECD, Paris
  6. Pellegrino JW, Chudowsky N, Glaser R (eds) (2001) Knowing what students know: the science and design of educational assessment. National Academy Press, Washington, DC

Copyright information

© Springer Science+Business Media Dordrecht 2014

Authors and Affiliations

  1. Graduate School of Education, University of Bristol, Bristol, UK