Educational Psychology Review, Volume 21, Issue 3, pp 219–246

A Review of Self-Report and Alternative Approaches in the Measurement of Student Motivation


  • Sara M. Fulmer
    • Psychology, University of Notre Dame
  • J. C. Frijters
    • Departments of Child and Youth Studies & Psychology, Brock University
Review Article

DOI: 10.1007/s10648-009-9107-x

Cite this article as:
Fulmer, S.M. & Frijters, J.C. Educ Psychol Rev (2009) 21: 219. doi:10.1007/s10648-009-9107-x


Within psychological and educational research, self-report methodology dominates the study of student motivation. The present review argues that the scope of motivation research can be expanded by incorporating a wider range of methodologies and measurement tools. Several authors have suggested that current study of motivation is overly reliant on self-report measures, warranting a move toward alternative approaches. This review critiques self-report methodology as a basis for examining alternative conceptualizations of motivation (e.g., phenomenological, neuropsychological/physiological, and behavioral) and related measurement tools. Future directions in motivational methodology are addressed, including attempts at integration or combination of these approaches and a preliminary functional framework for the development of novel, multidimensional approaches to the study of motivation.



Motivation consists of the biological, physiological, social, and cognitive forces that direct behavior. Motivation has a long history within educational research (see reviews by Ball 1982; Weiner 1992; Young 1950). However, recent research has been driven by a predominant focus on the cognitive, intrapsychological aspects, discounting the importance of additional personal and contextual factors in the relationship between motivation and academic achievement. Past theories of motivation have focused on biological instincts, drives, and arousal. Current theories of achievement motivation, such as self-determination, cognitive evaluation, achievement goal, and expectancy–value theories, predominantly examine cognitive and, to a lesser extent, social processes that influence motivation for a particular activity.

In order to advance research in this field, we must understand the methodological deficits currently at play and consider the application of lesser used methodological approaches to this construct (Hickey 1997). Despite efforts of various approaches to encapsulate the construct of motivation, a single approach is unable to capture its complexities. Limited discussion has begun from within various approaches on the value of incorporating alternative measurement strategies (Byrne 2002; Dörnyei 2000; Zimmerman 2008). Alternative approaches differ in how motivation is defined and theorized, the processes believed to mediate the relationship between motivation and actual behavior, and the measurement tools and tasks designed to assess motivational states.

The purpose of this review is to survey four diverse methodological traditions and formats of motivation measurement, supporting the need for the development of novel, multidimensional methodological strategies. While the following review maintains a close focus on methods and measures, we provide the following broad definition of motivation, recognizing that the theoretical heritage of several approaches reviewed below departs significantly from this definition. Motivation, derived from the Latin root movere, implies self-directed movement (Pintrich 2003) and represents the primary intrapersonal dynamic that orients an individual to particular learning goals. Motivation is pre-decisional and constitutes an impulse toward a goal (Heckhausen and Kuhl 1985). Within the framework of Rheinberg et al. (2000) for how conative (e.g., motivation, volition, achievement orientations, action controls, etc.) factors influence learning, motivation influences which goals receive commitment, affecting the strength and quality of that commitment.

A rich history of motivation research in education has consistently linked motivation with specific targets such as reading or math (Wigfield 1997) and has shown that academic motivation is highly differentiated across particular subjects (Marsh 1990, 1992a). Thus, our review will focus on measures designed to assess student motivation for academic achievement, addressing methods for younger elementary school students and adolescents in middle and high school. An initial review of the limitations of self-report methodology will provide a rationale for the subsequent examination of alternative approaches to the measurement of motivation, including phenomenological/authentic, neuropsychological/physiological, and behavioral. Discussion of each approach will include a definition of the theoretical conceptualization of motivation and the motivational tools and tasks utilized. An understanding of each approach will lead to guidelines for future directions in motivation methodology, including preliminary attempts at integration or combination of these approaches and characteristics of an ideal measure of motivation.

It is important to note that, although this review is extensive, it is not exhaustive or entirely comprehensive, especially in the examination of self-report measures that are numerous in quantity and diverse in construction. While the review of self-report measures is exclusive to measures of achievement motivation, the examination of alternative approaches (e.g., phenomenological, neuropsychological/physiological, and behavioral) encompasses a wider range of applications due to the limited use of these measures in educational motivation research to date. However, in compiling each section, we ensured that the measurement tools and approaches reviewed were applicable to and compatible with the study of achievement motivation. Representative exemplars from each of the four different approaches have been included to highlight possible cross-pollination of methods and potentially derive novel methods for motivation measurement. An additional challenge for a review attempting to cover methods and measurement traditions with widely discrepant theoretical and conceptual heritages is varying definitions of what constitutes a quality measure. Traditional self-report scales have widely appreciated standards and definitions of reliability and validity. In contrast, one of the approaches we review does not admit reliability as a standard of quality for a measure and has a markedly different definition of what constitutes a “valid” measure. Thus, no single framework can be applied to the four measurement traditions, but the assumptions and internal measurement quality standards of the various approaches will be highlighted.

Self-Report Measures of Motivation with Students

This review of self-report measures of motivation will focus on a brief assessment of measures designed to investigate various constructs of motivation for academics and in learning situations. Research in the field of motivation has primarily relied on paper-and-pencil self-report measures. As a result, some authors have argued that methodological progress in motivation research is stagnant and that this stagnation has both resulted from and contributed to theoretical fragmentation, owing to a lack of progress in the elaboration and validation of motivational constructs (Snow et al. 1996). The weight of focus on cognitive rather than affect-related measures may be due to the common framing of motivational constructs as based on an individual’s disposition (Murphy and Alexander 2000). Since motivational orientations are often assumed to be conscious and accessible to the individual, students should be able to self-report these beliefs. In terms of correlations with academic achievement, motivation has been partitioned into several dimensions; however, the operationalization and definition of motivation and the dimensions involved continue to be debated. Murphy and Alexander’s (2000) review of motivational terminology included goals, intrinsic and extrinsic factors, interest, and self-schema factors as central in the relationship between motivation and academic achievement. Marsh et al. (2003) have articulated the beginnings of an organizing template for such constructs. The Big-Two-Factor Theory of academic self-concept organizes motivational constructs under the two broad dimensions of learning and performance orientations, both considered as stable personal traits. Learning orientation involves mastery, competence, effort, and interest, while performance orientation consists of social comparison and extrinsic evaluations (Byrne 2002). This model expands on the achievement goal framework (Dweck 1986; Nicholls 1984) to include additional constructs under higher order learning and performance orientations, such as intrinsic and extrinsic motivation and cooperative, competitive, and individualist orientations.

At a methodological level, broad approaches to the study of motivation and academic achievement have been via academic self-concept, achievement goal, and intrinsic motivation measures, to name just a few. While there is significant overlap in the use of self-report methodology, these approaches differ in the specifics of construct definition. Academic self-concept is multidimensional and hierarchical, and findings have revealed moderate and consistent correlations with academic performance, especially when content-specific measures are employed (Marsh 1992a). Academic self-concept is often measured by collecting descriptive and evaluative perceptions of scholastic competence (Byrne 1996). Competence is a central feature of achievement goals, which have typically been defined in terms of mastery or performance orientations (Dweck 1986; Nicholls 1984). Further work added the approach–avoidance dimension to performance goals and more recently to mastery goals in a 2 × 2 framework (Elliot and McGregor 2001). Achievement goals are often measured through ratings of statements regarding academic beliefs or behaviors that represent each of the goal orientations (e.g., “It is important to me to do better than other students”). Intrinsic motivation has been defined in terms of attitudes, enjoyment, importance/value, and interest for a particular activity or learning domain, such as reading or mathematics (Wigfield and Guthrie 1997). Intrinsic motivation is most often measured via agreement to self-descriptive statements about orientation to an activity or set of activities (e.g., “I enjoy reading”).

Measurement tools

As self-report scales for motivation continue to proliferate, it would be unfeasible to discuss the numerous measures of achievement motivation within this review. Self-report measures in educational and psychological domains have been exponentially increasing in number and complexity, without sufficient consideration for developmentally suitable scales and formats for psychometric and theoretical validation. Furthermore, the use of self-report methods with students creates additional challenges, presenting a greater need for understanding and overcoming these limitations. The following brief review of particular extant and/or historical measures illustrates various broad categories of self-report measures. The particular scales selected reveal the range of achievement motivation constructs, encompassing the four central areas of motivation research as reviewed by Eccles and Wigfield (2002): expectancies for success (i.e., self-concept theories), task value (i.e., intrinsic motivation and self-determination theories), expectancy and value theories (i.e., expectancy–value theory), and motivation and cognition (i.e., achievement goal theories). Although these scales are not representative of all achievement motivation measures due to the abundance of both published and non-published measures in this field, these measures have been extensively applied within achievement motivation research or adapted in ad hoc researcher-constructed measures.

Self-report measures vary in length and design but typically entail a Likert-scale response format and, for younger children, the addition of pictorial representations (e.g., The Pictorial Scale of Perceived Competence and Social Acceptance for Young Children, Harter and Pike 1984). In this format, proposed by Likert in 1932, statements are rated according to categories such as strongly disagree, disagree, neutral, agree, and strongly agree. Consecutive numbers are assigned to represent the degree of endorsement of particular statements. Self-report measures that cover motivation typically include self-descriptive statements of behavior (e.g., “I choose to read books over other activities”) or attitude (e.g., “I like math”). Overall, the ease of administration and standardized design of self-report measures are conducive to comparing motivational orientations across ages, groups, and developmental or academic levels. It is important to note that many studies have taken an ad hoc approach to the use of existing published measures and non-published simple questionnaires, incorporating specific subscales or items of a measure. An example is Fortier et al.’s (1995) selection of items from Harter’s Perceived Competence Scale. Other studies have changed the scale to correspond with a different subject area, for example, Liu et al.’s (2006) adaptation of the Intrinsic Motivation Inventory (IMI) for measuring motivation in collaborative project, math, and science domains.
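The numeric coding described above can be illustrated with a minimal sketch. The 1 to 5 coding, the function name `score_scale`, and the sample responses are assumptions made for illustration; they are not taken from any of the measures reviewed here.

```python
from statistics import mean

# Hypothetical five-point coding; the labels follow the categories named in the text.
LIKERT_CODES = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}

def score_scale(responses):
    """Return the mean numeric code for one student's responses to a scale."""
    return mean(LIKERT_CODES[r] for r in responses)

# Made-up responses from one student to a four-item scale.
student = ["agree", "strongly agree", "neutral", "agree"]
print(score_scale(student))
```

Averaging the codes in this way already treats the distance between adjacent categories as equal, an assumption examined later in this section.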

Self-report measures of achievement motivation have generally moved from a focus on a single dimension, such as self-concept [e.g., Piers’ (1984) Piers–Harris Children’s Self-Concept Scale (PHCSCS); Boersma and Chapman’s (1992) Perception of Ability Scale for Students (PASS)], to measures focusing on multiple dimensions of achievement motivation. For example, measures have integrated achievement goals and intrinsic motivation [School Motivation Questionnaire (SMQ) of Marsh et al. (2003)], intrinsic/extrinsic orientations, expectancy beliefs, value judgments, and affect [Motivated Strategies for Learning Questionnaire (MSLQ) of Pintrich et al. (1993)], and both social and cognitive aspects of motivation, such as achievement goals, achievement-related beliefs, and academic behaviors and strategies [Patterns of Adaptive Learning Scales (PALS) of Midgley et al. (2000)].

Some measures of achievement motivation focus on a general motivational orientation toward school [e.g., Academic Motivation Scale (AMS) of Fortier et al. (1995)], while others assess these constructs in multiple domains (Boersma and Chapman’s (1992) PASS; Bracken’s (1992) Multidimensional Self-Concept Scale; Gottfried’s (1986) Children’s Academic Intrinsic Motivation Inventory, CAIMI; Marsh’s (1992b) Academic Self-Description Questionnaire). As mentioned, measures such as the IMI (McAuley et al. 1989) have been modified for use in different domains. Recent recognition of the strong link between personal motivation and particular behaviors has resulted in the development of domain-specific measures. For example, self-report measures have been designed to assess self-concept and self-efficacy particular to the domain of reading (Chapman and Tunmer’s (1995) Reading Self-Concept Scale, RSCS; Henk and Melnick’s (1995) Reader Self-Perception Scale). Wigfield and Guthrie’s (1995, 1997) Motivations for Reading Questionnaire assesses 11 dimensions of reading motivation, including self-efficacy and intrinsic and extrinsic purposes for engaging in reading. Finally, in recognizing the developmental differences in motivation and understanding of scale items, measures such as the PALS (Midgley et al. 2000), the Reading Motivation Questionnaire (Peklaj and Bucik 2003; Wigfield and Guthrie 1997), and the CAIMI (Gottfried 1986, 1990) have been modified for elementary and middle school students.

Critique of self-report measurement scales

Self-report measures are conducted in both laboratory and field settings and can be used on one occasion or repeatedly to compare results longitudinally. A specific motivating task is often unnecessary, as self-report measures can be completed in various settings at an unspecified point in time. It is often assumed that motivation functions as a trait (e.g., domain-specific interests and academic self-concept) and is relatively stable over time and across situations; however, some theories (e.g., self-efficacy theory, Schunk 1991) and facets of particular theories (e.g., situational interest, Hidi et al. 1992) treat motivation as fluid and situationally dependent. Self-report devices are also used to assess situational interest and task-specific motivations, which, in this case, are assumed to change over time. A growing body of evidence has documented both developmental and instruction-related changes in motivation (e.g., Marcoulides et al. 2008; Nurmi and Aunola 2005; Wigfield et al. 1997). When self-reported motivation is correlated with academic achievement, achievement is typically measured through grades or test results and/or psychoeducational assessments of domain-specific skills.

Self-reports have also been incorporated to assess the effectiveness of pedagogical methods. In the assessment of a biology curriculum program by Paris et al. (1998), self-reported interest was gathered before and after the instructional period, and academic achievement was defined in terms of results on quiz and problem solving tasks. Similarly, Veermans and Tapola (2004) assessed the effectiveness of a 4-year inquiry learning projects method through measurements of self-reported motivation at the beginning of each year.

Researchers continue to contribute to the development of self-report measures of motivation for academics, possibly due to the advantages evident in using this method. The psychometric benefits of well-constructed self-report measures are high internal consistency and specificity in construct definition (Marsh et al. 2003). Furthermore, evidence linking reading achievement and self-perceptions of competence and motivation was initially found using self-report methodology (Chapman and Tunmer 1995). Self-reports are compatible with large-sample research studies, are easy and quick to administer, and have advantageous scaling properties suitable for the use of inferential statistics, allowing for straightforward statistical analysis and standardization (Elliott 2004). On the other hand, several widely recognized weaknesses of self-reports warrant attention to improved measurement and methodology (Elliott 2004; Keith and Bracken 1996). The following examination of methodological weaknesses for self-reports consists of developmental, construct definition, and measurement/reliability issues.

Developmental issues

One difficulty in the use of self-report measures for a range of age groups is the developmental differences in self-concept and motivation (Chapman and Tunmer 1995; Nicholls 1978). For example, research using the MSLQ (Pintrich et al. 1993) has found varying factor structures of motivation in populations of different ages (Pintrich and De Groot 1990; Rao and Sachs 1999). Challenges associated with creating suitable measures have resulted in a lack of well-constructed, developmentally appropriate self-report measures for use with younger children (Chapman and Tunmer 1995; Marsh et al. 1991). Designing self-report measures for younger children and children with disabilities is complicated due to problems with developmentally suitable item and response formats (Chapman and Tunmer 1995), sentence structure and word choice (Elliott 2004), and misleading negative items (Marsh 1986). Administration problems may occur if younger students are expected to complete the scale without assistance for clarification and ensuring understanding (Elliott 2004), especially if students read the items and response choices independently.

Younger children, in particular, may not be able to cognitively process all concepts within an item if it contains a motivational construct as well as a contextual reference, such as the teacher or classroom (Karabenick et al. 2007). For example, a performance–avoid goal item in the PALS (Midgley et al. 2000) requires students to consider the beliefs of their teacher, the abilities of other classmates, as well as the perception of their own knowledge or intelligence (“It’s important to me that my teacher doesn’t think that I know less than others in class”). Likewise, attempts to reduce the memory requirements for students via a modified response format, such as the forced-choice paradigm in Harter’s (1982) Perceived Competence Scale, increase other cognitive processing demands.

However, issues of item interpretation are not limited to young children. Karabenick et al. (2007) discussed the challenges involved in the cognitive processing of self-report items and the conceptually abstract terminology and multiple concepts included in many scale items. Interviews with middle school children revealed that students often experience difficulty understanding common words in motivation items (e.g., improvement), and the reasons for judging self-efficacy in commonly used items (PALS, Midgley et al. 2000; Motivated Strategies for Learning Questionnaire, Pintrich and De Groot 1990). Urdan and Mestas (2006) also found that high school students interpreted performance goals from the PALS (Midgley et al. 2000) in different ways than the researcher intended (e.g., responding to avoidance items using approach explanations). Similarly, the mastery–avoidance goal items in Elliot and McGregor’s (2001) Achievement Goal Questionnaire include complex wording and interpretations that would be difficult for children and adolescents to understand (e.g., “I am often concerned that I may not learn all that there is to learn in this class”). As a result, the use of additional methods in the measurement of motivation may increase accuracy and interpretation of results by both validating and supplementing self-report data (Veermans and Tapola 2004). Behavioral or physiological measures that assess motivation in terms of physical responses may be especially important for understanding motivation in younger children, as the accuracy of children’s perceptions of their thoughts and actions in relation to their motivation has been questioned (Nicholls 1978; Pintrich and Schunk 1996).

Construct issues

Due in part to continued debates regarding core motivational concepts, an absence of clear operational definitions, and issues with theoretical foundations, confusion over the distinctions between terms and inconsistent conclusions regarding motivational concepts persist (Bear et al. 2002; Keith and Bracken 1996; Murphy and Alexander 2000). A number of self-report measures consider motivation from a single theoretical perspective, oversimplifying the complexities of motivation and resulting in a lack of understanding of the conceptual difficulties within this construct (Elliott 2004). For example, the PHCSCS (Piers 1984) has been criticized for measuring self-concept as a unitary, rather than multidimensional, construct (Keith and Bracken 1996).

Additional issues arise in the measurement of constructs that involve opposing or separate orientations. For instance, Harter’s measure of intrinsic/extrinsic motivation (1981) has been criticized for placing intrinsic and extrinsic motivation at opposite ends of a continuum, while there is evidence that intrinsic and extrinsic factors independently, but simultaneously, influence academic performance and behavior (Lepper et al. 2005). Similarly, performance and mastery goals have been measured as separate orientations, while findings have demonstrated that students hold both of these goals in academic contexts (Meece et al. 2006; Pintrich 2000). Thus, items that contain references to both mastery and performance competencies or that situate these goals in competition with each other may not be conceptually accurate, depending on the theory underlying these goals (Elliot and Murayama 2008).

The measurement of achievement goals has also been criticized for lacking rigor in the conceptualization and operationalization of goals in self-report measures, such as the Achievement Goal Questionnaire (Elliot and McGregor 2001) and the PALS (Midgley et al. 2000; Elliot and Murayama 2008). For example, students often have a variety of underlying reasons for adopting and pursuing particular goals, and self-report items often merge the goal with the underlying reason (Elliot and Murayama 2008; Urdan and Mestas 2006). Elliot and Murayama (2008) also discussed the problems inherent in self-report items that do not assess intentional commitments or goals but rather reflect values (“It is important to me…”) or concerns (“I worry that…”). It is evident that these constructs are more complex than can be measured and interpreted based on survey data alone. Similarly, the IMI has been criticized for problems with construct validity due to the evaluation of both determinants and consequences of intrinsic motivation (Guay et al. 2000).

Issues in conceptual understanding occur due to the sheer number of measures created for each motivational construct. Researchers following the same theoretical framework may find disparate results if they choose different scales to measure the same construct (Bong 1996). An additional problem is that similarly named subscales across different measures may appear equivalent but do not always cover similar domains (Bear et al. 2002; Marsh et al. 2003). Researchers may invent their own labels for these subscales, exacerbating problems with constructs lacking discriminant validity (Bong 1996). This is especially true of intrinsic motivation, with measures using different definitions and dissimilar subscales to derive a generalized picture of students’ motivation (e.g., Gottfried’s (1986) CAIMI assesses curiosity, persistence, and desire to master challenging tasks; Fortier et al.’s (1995) AMS assesses motivation to know, to accomplish things, and to experience stimulation; Harter’s (1981) intrinsic/extrinsic scale assesses preference for challenge, curiosity, and independent mastery, as well as independent judgment or working for one’s own satisfaction and internal criteria for success). Terminology confusion is also evident in the overlap between self-concept, self-esteem, and self-efficacy (Bong 1996; Hattie and Marsh 1996) and the multiplicity of definitions and subscales used to assess self-concept for academics. For example, Chapman and Tunmer’s (1995) RSCS assesses perceived competence, perceived difficulty, and attitudes. Conceptualizing equivalent constructs via different subscales, Marsh’s (1992b) Self-Description Questionnaire school subscale measures perceived ability and belief about the potential for success at school. In the same way, disparity in the conceptualization of achievement goals (learning versus performance; task-involved versus ego-involved; mastery versus ability) has resulted in conflicting outcomes on motivation, performance, and achievement behaviors (Grant and Dweck 2003).

In addition, existing self-report measures of motivation are often highly general, though current theory and research indicate that context specificity is essential to motivational processes (Murphy and Alexander 2000). Although some scales, including the IMI, maintain acceptable reliability and validity when modified for use with a specific academic domain (McAuley et al. 1989), the creation of novel measures for particular subject areas or activities without appropriate validation is common. These problems with construct definition in the application of self-report formats to the study of motivation suggest that multiple methods may be needed in order to uncover the complexities of motivation.

Reliability/measurement issues

Few scales of motivation are thoroughly developed, extensively used, or published, resulting in a lack of information about the effectiveness of the scales and an inability to compare results across studies using several different measures (Bear et al. 2002; Keith and Bracken 1996). Due to the lack of validated measures for particular age groups or academic domains, researchers tend to use non-published self-report measures rather than psychometrically and theoretically robust scales (Keith and Bracken 1996).

Poor construction and limited validation of self-report measures lead to several psychometric weaknesses. For example, items that reference the context of the behavior as well as beliefs or attitudes about the behavior may result in inconsistent responses due to individual participants’ interpretations of different elements of the item (e.g., PALS item previously discussed, “It’s important to me that my teacher doesn’t think that I know less than others in class”). Initial internal validation of self-report scales by the researchers who have constructed the scale often consists of a single unidimensional indicator of internal consistency, Cronbach’s alpha. This may not be sufficient reliability or validity evidence because it does not test for unidimensionality but rather assumes it (Sijtsma 2009). Watkins and Coffey’s (2004) empirical reanalysis of the Motivations for Reading Questionnaire suggested that the scale should be revised to address lack of subscale reliability, problematic items, and discrepant factor analytic results. Elliott (2004) found that when additional measures such as interviews and observations were combined with self-report of motivation, behavior was often inconsistent with participant report. Other common measurement problems include a lack of representative normative samples (Bear et al. 2002; Keith and Bracken 1996), an absence of testing in authentic situations (Veermans and Tapola 2004), and cross-cultural challenges due to differences in the definition and conceptualization of motivation (Pintrich 2003). Keith and Bracken’s (1996) review of self-report measures also suggested that several widely used measures of self-concept (i.e., PHCSCS and PASS) lack test–retest reliability.
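The point that Cronbach’s alpha assumes rather than tests unidimensionality can be illustrated with a small sketch. The data below are made up for the purpose, and the function name `cronbach_alpha` is an illustrative choice; the formula is the standard one, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores).

```python
from statistics import pvariance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score lists (one list per item,
    respondents in the same order in every list)."""
    k = len(items)
    # Total scale score per respondent: sum across items.
    totals = [sum(scores) for scores in zip(*items)]
    summed_item_variance = sum(pvariance(item) for item in items)
    return k / (k - 1) * (1 - summed_item_variance / pvariance(totals))

# Made-up data: four respondents, six items. Items 1-3 are copies of one
# response pattern and items 4-6 of another; the two patterns are uncorrelated.
cluster_a = [1, 5, 1, 5]
cluster_b = [1, 1, 5, 5]
items = [cluster_a] * 3 + [cluster_b] * 3

alpha = cronbach_alpha(items)
print(round(alpha, 2))
```

In this contrived case alpha comes out at about 0.80, a value often read as acceptable, even though the six items form two entirely unrelated clusters. A factor analytic check would be needed to reveal that structure.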

Two broader and more problematic lines of criticism have implications for the use and development of self-report measures of motivation. The primary response format used to generate quantitative levels of endorsement to scale items is the Likert format. Such response formats have been criticized for conceptually inaccurate scoring formats, resulting in imprecision in interpretation. For example, an individual with a neutral response does not necessarily have more motivation than an individual with a “disagree” response. The neutral category has also been criticized as difficult to interpret accurately, as individuals choose this response for a number of different reasons (e.g., truly neutral, indecision, absence of opinion, protest, etc.; see Klopfer and Madden 1980), which may be unrelated to the construct being measured. Revising response formats to present an even number of response options removes the neutral category but leads to other difficulties, including a tendency for responses to be slightly biased toward the positive end of any scale. An additional structural problem with individual items is mixed formulation of negatively and positively worded items (e.g., “I like to read stories” versus “I do not enjoy reading”), despite evidence that the cognitive demands this places on students may reduce the reliability of such scales (Marsh et al. 2003). Lucas and Baird (2006) have recently provided a comprehensive review of the various internal structural problems with self-assessment scales and scale items. Krosnick et al. (2005) also reviewed best practices in self-report item construction, though neither of these two sources includes treatment of the special concerns in constructing items for children.

Perhaps the most important but least frequently articulated assumption of the Likert scale format for items is that the resulting scale can be considered interval level measurement. Most mainstay statistical methods (e.g., mean-based group comparisons such as t tests and F tests; correlational techniques such as regression) require that input variables be intervally scaled; that is, that there is a range of possible scores and that the intervals from one score to the next are equal. Rating a statement with the Likert-style categories has been argued to be an ordinal level of measurement at best, with the student providing amount of agreement on a ranking scale. When scale items are summed or averaged, there is an additional assumption that a difference of one point at the top of the scale is the same as the difference at the bottom of the scale. Although there is some debate among research methodologists, the assumption of interval level measurement is generally accepted as a requirement, and there are simultaneously questions about whether self-report scales fulfill this requirement (see Velleman and Wilkinson 1993, for a full discussion of these issues). Although analytic techniques that do not require variables to be measured on an interval scale are available (e.g., polychoric correlations), the scores that result from self-report scales are more often analyzed with techniques that require interval level measurement (e.g., Pearson product moment correlations).
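
The practical consequence of treating ordinal ratings as interval data can be sketched with a small, invented example. In the code below, all data, the re-spacing map, and the function names are hypothetical assumptions for illustration. The same Likert responses are scored twice: once with the usual 1 to 5 codes and once with a different but equally order-preserving spacing. Because only the order of the categories is known, both codings are defensible, yet the Pearson correlation changes while a rank-based (Spearman) correlation does not. Spearman is used here as a simple stand-in for ordinal methods; it is not the polychoric correlation mentioned above.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical Likert ratings (1-5) and an outcome score for eight students.
likert = [1, 2, 2, 3, 3, 4, 5, 5]
outcome = [55, 60, 58, 70, 65, 80, 78, 82]

# An alternative coding that keeps the category order but not the spacing.
respacing = {1: 1, 2: 2, 3: 3, 4: 6, 5: 10}
likert_respaced = [respacing[v] for v in likert]

print(pearson(likert, outcome), pearson(likert_respaced, outcome))
print(spearman(likert, outcome), spearman(likert_respaced, outcome))
```

The first printed pair differs and the second pair is identical, which is the sense in which interval-level statistics depend on an assumption about category spacing that the response format itself does not guarantee.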

The second line of criticism is more fundamental, emerging from the problematic assumption that Likert-derived scales qualify as interval level measurement. The psychometric basis of self-report scales that assess motivation has traditionally been via classical psychometric theory (Kline 2000). Known variously as True Score Theory (TST) or classical test theory, this quantitative basis for self-report measures has been implemented largely without consideration of its limitations. The use of TST in self-report measures has been criticized for non-linearity in scoring and, therefore, lacking accuracy in interpretation of results. Additional limitations of classical test theory include the following: (1) reliability and consistency of responses across the full range of measurement, (2) scaling of item difficulties or the degree of trait in question, and (3) provision of a single estimate for error of measurement, which does not account for differential reliability across the full range of a scale. Michell (1990, 1999) has argued that an isometric interpretation of self-report scale scores is problematic, as an individual with a higher score does not necessarily have higher motivation than an individual with a lower score. The Rasch measurement model (RMM; Andrich 2005; Waugh 2006) is one approach that has recently been more broadly applied to research measurement problems, having had a role within standardized test development for at least 30 years. Application within research measurement promises to address these limitations of TST, with a solid theoretical and empirical basis and statistical tools coming within reach of motivation researchers (Alagumalai et al. 2005). This approach, also identified as a type of item response theory, is fundamentally a model that fits an individual’s item responses to the latent variable representing the conceptual target (i.e., the attitude, trait, motivation dynamic, etc.).
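
As a rough illustration of the idea behind the Rasch model, the sketch below implements the simplest (dichotomous) form, in which the probability of endorsing an item depends only on the difference between a person’s trait level and the item’s difficulty, both expressed in logits. This is a simplification chosen for brevity: Likert-type items would normally call for a polytomous extension, and the function names and numeric values here are assumptions for illustration rather than anything taken from the sources cited. The second function converts a raw proportion into logits to show why equal raw-score differences need not reflect equal differences on the latent scale.

```python
from math import exp, log


def rasch_probability(theta, b):
    """Dichotomous Rasch model: P(endorse) given person level theta and item difficulty b."""
    return 1.0 / (1.0 + exp(-(theta - b)))


def logit(p):
    """Log-odds of a proportion p (0 < p < 1)."""
    return log(p / (1.0 - p))


# Only the difference theta - b matters, so these two probabilities are equal.
print(rasch_probability(1.0, 0.0), rasch_probability(2.0, 1.0))

# The same 10-point raw difference covers different distances in logits.
middle_gap = logit(0.60) - logit(0.50)
upper_gap = logit(0.95) - logit(0.85)
print(middle_gap, upper_gap)
```

In this sketch the upper gap is roughly three times the middle gap, which is one way to picture the non-linearity of raw scores that the criticism of TST refers to.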

While most theorists agree that motivation is a multidimensional construct and some extant self-report measures build such dimensionality into their content domain, validation of dimensional structure remains a problematic issue. For ad hoc researcher-constructed scales, exploratory factor analysis has often been employed to check whether items fall into coherent groups, hopefully consistent with the generating theory. For example, see Worrell’s (1997) replication of the Harter Self-Perception Scale (1985) dimensions with academically talented adolescents. A better approach has been the use of confirmatory factor analysis to fit a priori theoretically motivated models to self-report data, confirming dimensionality within a motivation measure. Marsh and colleagues, along with others (e.g., Marsh et al. 2003; Wilson and Trainin 2007), have done extensive studies using this technique, with results supporting the notion that motivation is multidimensional. Within this measurement paradigm, a comprehensive validation may also utilize both exploratory and confirmatory factor analysis, along with strong theoretical hypotheses about the dimensionality of motivation (e.g., Lau 2004; Swalander and Taube 2007). The RMM discussed earlier provides a viable alternative method for investigating the dimensionality of self-report scales. When applied via multidimensional Rasch modeling, there is some evidence that the RMM has advantages over the factor-analytic method, operating more efficiently to parse multiple dimensions within one self-report measure of motivation (Smith 1996; Wright 1996; Waugh and Chapman 2005).

At the level of construct validity, self-report measures confound the measurement of motivation with other variables, such as ability and attention. For example, Chapman and Tunmer (1995) observed that reading self-concept scores were related to students’ linguistic ability to interpret the negatively worded scale items. Younger students inaccurately interpreted negative items that were affirmatively worded (e.g., “I am bad at reading”) and were better able to interpret negative items when they were interrogatively worded and used the pronoun you, rather than I (e.g., “Do you make a lot of mistakes when reading?”). This resulted in greater consistency of positive and negative item responses. Furthermore, the use of self-report methodologies is based on the assumption that motives are conscious, accessible, and can be communicated to others (Murphy and Alexander 2000), even though motivation is based in cognitions and emotions that can only be partially accessed by the individual (Hannula 2006). Thus, several challenges and weaknesses in the exclusive use of self-report measures for the study of motivation warrant consideration of alternative approaches. Since the limitations of self-report measures are highlighted when additional methods are used, a multi-method approach would contribute to an increased understanding of the complexities of motivation (Elliott 2004). Although multi-method approaches are often time-consuming and difficult to construct and may appear to be deficient in objectivity and precision, the study of motivation could benefit from a significant transformation. Consequently, the following is a review of alternative approaches to the study of motivation, including phenomenological/authentic, neuropsychological/physiological, and behavioral. Methods and approaches sampled below are drawn from both within and outside of educational research, providing examples of ways to enhance methodological richness in the study of motivation. Where possible, motivation for academics will be the primary focus.

Alternative Approaches to Motivation Measurement


The phenomenological approach provides a flexible, holistic methodology for the study of motivation, emphasizing individuals’ subjective experiences, meanings, and perceptions of their motivational states (Yeung 2004). Using a qualitative, descriptive approach, phenomenological methods attempt to illustrate the meaning of a commonly shared, yet individually diverse phenomenon such as motivation (Shedivy 2004). It is believed that qualitative measures provide more depth to the evaluation of motivation because they are based on students’ own constructions of experience and emphasize idiographic patterns of motivation (Shedivy 2004), rather than the description of broad patterns or correlations across many students. The fundamental epistemological difference between the phenomenological and self-report approaches lies in child-derived versus experimenter/researcher-derived categories being the primary unit of analysis. In the phenomenological approach, the researcher does not construct “items” to which the student indicates agreement or degree of endorsement; rather, meaning emerges from the student’s experience of motivation and their language for articulating that experience.

Phenomenology considers motivation as more than a determined, biologically based construct. The central focus of phenomenology on individual intentionality complements the application of this method to motivation research. In this view, motivation is influenced by environmental, historical, and individual factors and involves a temporal aspect, illustrating that motivation is not a stable emotional or mental state but is subject to change over time (Dörnyei 2000) and, in the case of motivation for academics, change across learning contexts. This is especially relevant for the study of motivation in the classroom, as motivation must be maintained over time at various scales (e.g., lesson, period, semester, and year) and shifting contexts (e.g., peers, teachers, learning materials, rewards, and other contingencies). In this concern for context, the phenomenological approach recognizes the multi-faceted nature of motivation, highlighting interconnections among and the diversity of motivational states (Yeung 2004).

Within this approach, a consistent definition of motivation does not exist, as the definition and themes of individual motivational orientations are idiographic and derived during the analysis stage. However, several phenomenological researchers offer broad functional characterizations of motivation in relation to the individual self. Yeung (2004) suggests that motivation functions as a link between motives and commitment. Shedivy (2004) has characterized motivational orientations as self-organizing tools, functioning in either an instrumental (predominantly extrinsic) or integrative (predominantly intrinsic) mode. Others such as Gardner (1985) have suggested that motivation operates as a filter or orienting device, providing direction to energy through goals, wants, and efforts and self-consciously marking an individual’s attitudes toward an activity.

Advantages of the phenomenological method over self-report include the ability to interpret and integrate multiple factors that influence motivation, recognizing motivation as a process rather than a state that can be measured by a questionnaire at a single point in time (Dörnyei 2000). This approach identifies specific and general motives for behavior and different stages of motivation through a microanalysis of various factors, processes, circumstances, and challenges that determine motivation and behavior (Dörnyei 2000). Furthermore, phenomenology integrates cognitive, affective, and contextual factors in the study of motivation (West 2002). In this way, the phenomenological approach introduces a wider range of meanings for human experience, potentially approaching complexities of motivation more satisfactorily than the self-report method. Schiefele and Csikszentmihalyi (1995) provide an example of experience sampling along with self-report methods of assessing interest, achievement motivation, and ability in mathematics. Csikszentmihalyi’s flow theory of motivation, which involves a state of intense, experiential engagement with an activity, emerges directly from a phenomenologically based method of experience sampling (Csikszentmihalyi et al. 2005). The use of multiple qualitative methods, such as interviews and observations, can overcome the challenges often experienced in measuring young children’s motivation while measuring motivation in a natural context, such as the classroom (Jarvenoja and Jarvela 2005; Perry et al. 2002). In a case study scenario, a rich portrait of a student’s motivation can be directly translated into interventions to increase that child’s motivation (e.g., Pierson 1999).

Limitations of phenomenological methods include the notion that analysis is subjective and interpretative and not isometric with empirical objective methods such as self-report. According to this criticism, findings from these methods only apply to the particular participants involved (Spinelli 1989). However, Yeung (2004) has argued that phenomenological analyses of human experiences and meanings can result in significant knowledge that can be generalized. If assumptions are not properly bracketed (Husserl 1958), the identification and analysis of core themes is merely a reflection of the biases and assumptions of the researcher, resulting in similar limitations to the self-report method as existing theories of motivation are maintained. This presents an additional challenge in integrating the phenomenological approach with self-report methods, as the phenomenological premise of bracketing assumptions and the idea that motivation is continuously redefined counter the underlying assumptions of self-report as grounded in specific theory and definitions of motivation.

Phenomenological methods share limitations with the self-report method. Due to the attention directed to individual experience, Ratner (1993) stated that the social and historical character of experience may be neglected; thus, an integrated focus of both individual and social perspectives is essential. The phenomenological approach assumes that humans can consciously assess and explain their motivation; yet, it is understood that motivational behaviors are influenced by unconscious drives and needs (Dörnyei 2000). Finally, the phenomenological method has been criticized for an overdependence on verbal descriptions (Spinelli 1989), which is problematic when considering the lack of correlation between individuals’ attitudes and their actual behavior (Brehm and Self 1989; Schneider et al. 2004). Although the developmental ability of children to reflect on their motivational states and experiences may be considered a limitation of this method, the researcher must overcome this challenge by developing skills to structure an interview and atmosphere that will elicit children’s narratives of their experiences.

Measurement tools

The phenomenological approach involves diverse tools, both in methodology and analysis, and therefore does not have a set instructional method for gathering or analyzing data. There is an underlying assumption that experiences and meanings have a structure and can be narrated (Shedivy 2004). Since phenomenological approaches to studying motivation are qualitative in nature, the use of in-depth, open-ended, thematic interviews is common in order to capture individuals’ descriptions, perceptions, and interpretations of their motivations (e.g., Perry et al. 2002; Shedivy 2004; Yeung 2004). With this basic material, the researcher aims to derive the meaning of participant experiences (Creswell 1998).

In other authentic approaches to measuring motivation, running records of participant observation, case studies, and semi-structured, retrospective interviews with students are combined, which addresses many of the challenges with student reports of motivation (Perry et al. 2002). These approaches enrich our understanding of the developmental differences and changes in students’ perceptions of their own motivation, especially in terms of what individuals believe influences their motivation, similarities and differences between students’ actual actions and their beliefs about their motivation and actions, and how classroom contexts are connected to students’ motivation (Perry et al. 2002). Interviews, in connection with observational data, also provide an understanding of unobservable aspects of behavior, such as metacognition, and the ability to compare multiple methods to increase reliability (Jarvenoja and Jarvela 2005; Perry et al. 2002). The experience sampling method has emerged as a more authentic set of self-report techniques, capturing multiple reports of students’ beliefs, affect, and behaviors in natural contexts in real time (Larson and Csikszentmihalyi 1983). This approach allows for a better understanding of the dynamic patterns in an individual’s motivation across time through assessing immediate reactions to different contexts or experiences (e.g., a specific math lesson or a social studies test). This method has also been considered more valid and more strongly related to physiological and behavioral responses (Csikszentmihalyi and Larson 1987), likely because students reflect on their current emotional and cognitive state, rather than years of school experiences.

Phenomenological analysis involves an inductive, rather than deductive, process, as the themes emerge from the interview data, rather than from the researcher’s hypotheses, biases, or assumptions (Shedivy 2004). This focus on analyzing individuals’ accounts, experiences, and themes to search for meanings is also defined as a methodology of reduction (Creswell 1998). Analysis of data in a phenomenological study consists of four major stages, with the primary goal of describing a phenomenon. First, the transcriptions of the interviews must be read in their entirety to gain a sense of the broad themes. This is followed by distinguishing significant elements of the construct (i.e., motivation) in individual statements. Next, the researcher must formulate the individual statements into overall meanings, which are then reduced to themes. Finally, these themes are synthesized into an overall narrative description of the phenomenon, which may aid in creating models of motivational orientations (Creswell 1998; Yeung 2004).

Measurement tasks

The phenomenological approach often assesses motivation in natural, authentic contexts through experiences, rather than specific tasks (Jarvenoja and Jarvela 2005). For example, interviews have examined students’ motivation to continue studying a foreign language (Shedivy 2004) and adults’ motivation to volunteer (Yeung 2004). In the field of reading and literacy, Perry et al. (2002) studied students’ self-regulatory behaviors during literacy tasks, while West (2002) employed qualitative techniques to study motivation for literacy learning. Furthermore, Pierson (1999) implemented a literacy program based on students’ stated interests to determine whether intrinsic motivation and performance were enhanced. Thus, it is believed that more authentic understandings of motivation are based on natural conditions in connection to individually pertinent experiences. Gambrell et al. (1996) followed an approach that incorporates some of the principles of authentic assessment in developing their Motivation to Read Profile (MRP). The MRP not only consists of a self-report survey component but also incorporates a semi-structured conversational interview that queries personal experiences across multiple reading contexts that students are likely to encounter (e.g., narrative and informational text, reading at home versus school, and general reading experiences). The MRP is an example of an open-ended interview that does not fully cohere with the phenomenological approach in that, while responses are open-ended, categories and the language to speak about motivation are provided for students in the interview prompts (e.g., “What do you think you have to learn to be a better reader?” pulling for researcher-defined notions of competence and achievement).

Students’ genuine descriptions and reflections of their experiences and beliefs may be more authentic and valid than their responses to predetermined statements in self-report methods or the unnatural, experimental tasks often used to elicit responses regarding motivational states. The incorporation of this approach is also based on Husserl’s assumption that individuals can describe their lived experiences, which will lead to an understanding of the essence and structure of achievement motivation for a particular student (Creswell 1998; Moustakas 1994). However, it is important to note that these methods share developmental challenges with self-report methods. While students may have difficulties comprehending or conceptualizing the content of a self-report measure and responding to a Likert-type scale, these methods encounter developmental limits in children’s abilities of self-expression. Validity in this case is not the psychometric validity of the self-report methods (i.e., does the measure adequately cover the construct as defined by the researcher) but is the degree to which knowledge about motivation derived from research aligns with the lived reality of the individual. As language represents truth and conveys the rational and conscious content of the mind (Gergen 2001), it is assumed that the products of these methods represent students’ truth experiences and individual knowledge regarding their motivation. The phenomenological approach asserts that, to create knowledge of students’ motivation, children’s perspectives, language, and experiences should be the dominant source of our knowledge in this domain.


Two additional approaches to the measurement of motivation emerge from different traditions than either the self-report or phenomenological. These traditions, the neuropsychological/physiological and the behavioral, are not completely distinct from the previously reviewed approach but do have unique measurement strategies that emerge from unique assumptions about the nature of motivation. The neuropsychological approach to the study of motivation has integrated motivation and cognition with a focus on measuring neural specificity for different aspects of motivational states and influences (Taylor et al. 2004). Past basic research with animals (for a review, see Jones and Gosling 2008) and humans with brain injury has shown that specific regions of the brain function to establish the learned value of objects and activities and to predict response to rewards. For example, a mid-brain structure, the diencephalon, has long been recognized as essential to goal-directed behavior, helping the organism identify what stimuli in the environment are important. Animals and humans with missing or damaged diencephalon can manifest both sham rage and sham motivation (intense emotion and activity, respectively) with no particular goal or endpoint (Goltz 1960; Grill and Norgren 1978). In addition to neural responses, this approach asserts that complex interactions among biological, cognitive, and psychological systems determine behavior (Beauchaine 2001). It is believed that physiological changes in the nervous system in response to motivational stimuli and conditions, sometimes defined as motivational arousal, provide a useful measure of the influence of motivation on behavior (Blair et al. 2004). Motivational arousal is believed to function primarily to produce or avoid a potential outcome (Brehm and Self 1989).

Although the history of neuropsychological measures involves a predominant focus on simple motivations (e.g., hunger, sex, and avoidance of pain), more complex human motivations that are evident in the absence of a biological need are now being examined (Arana et al. 2003). Motivational concepts provide a crucial link for behavioral neuroscience between the limbic brain systems and psychological processes or behavior (Berridge 2004). Neuropsychological and physiological approaches also admit the possibility that unconscious (i.e., not explicitly recognized as internal states by individuals) drives influence motivation and behavior. This dynamic, proposed by Freud and others, formed the basis in the 1970s and 1980s for projective measures of motivation (i.e., Rorschach and Thematic Apperception Test). In particular, the Thematic Apperception Test was designed to assess the role of affect in achievement motivation, with motivational traits manifesting themselves via implicit motives (Vestewig and Paradise 1977). However, projective tests were later criticized for subjective scoring procedures and questionable construct validity and, therefore, were considered to be inadequate measures of self-reported motivation (Keith and Bracken 1996; but see also Schultheiss and Brunstein 2005 for a more recent integrated model of implicit and explicit motives).

The neuropsychological approach utilizes several definitions of motivation. There is a common belief that reward is a basic goal in human behavior, and motivation and goal-directed behavior are guided by appraisals of incentive values, priorities, rewards, and punishments within the neural systems in the brain (Arana et al. 2003; Zalla et al. 2000). Executive functions then access these appraisals to organize behavior (Taylor et al. 2004), which is manifested through affective responses, including behavioral, autonomic, and physiological reactions (Berridge 2004). These responses are typically categorized as approach or avoidance behavior and/or positive and negative emotional states (Lang et al. 1998; Zalla et al. 2000). As a result, research has focused on the influence of motivation and value, usually in terms of reward anticipation and reinforcement, on cognitive task performance and activation in specific brain regions (Taylor et al. 2004).

In terms of broad conceptualizations, the neuropsychological approach postulates two basic motivational systems, appetitive and defensive, which vary in both activation and arousal (Lang et al. 1998). This basic division has been used in measure design. For example, Carver and White (1994) designed a behavioral rating scale of the Behavioral Activation System and Behavioral Inhibition System, which are presumed to be correlated with increased activity in specific areas of the frontal regions of the brain, resulting in either approach or avoidance behaviors and emotional states (Beauchaine 2001; Lang et al. 1998). Furthermore, brain reactions have been categorized into “liking”, which is the brain reaction to sensory pleasure as a result of reward, and “wanting”, or the incentive value or salience of the reward (Berridge 2004). Wanting and liking emerge from two subcortical neurobiological systems that trigger motivation, where wanting is associated with mesolimbic dopamine activation, and liking is associated with activation within the nucleus accumbens. Although these neural activations are beyond conscious awareness, it is believed that activations in these regions are reflected in emotional states (Berridge 2004).

Neuropsychological and physiological methods have distinct advantages over self-report, arising from strong face validity and content specificity (Elbaum and Vaughn 2003). These advantages are detailed below; however, it can be difficult to demonstrate psychometric reliability and replicability across situations. Furthermore, increases in neuronal activity may be attributable to additional factors, such as increased effort, arousal, or attention, as a result of increased motivational states (Taylor et al. 2004). The construct of motivation, as defined by this approach, does not delineate the boundary between affect and motivation clearly enough. An individual’s approach to a reward or punishment task, either in terms of fearing punishment or anticipating reward, may be driven equally by affective concerns as by motivation (Taylor et al. 2004). Generalizability to children and adolescents is restricted due to a predominant focus on animal and adult participants and the incomplete brain development of children, which may result in different regions and pathways used for certain tasks, including motivation. Finally, it continues to be argued whether physiological measures of behavior are reliable, due to weak correlations with emotion and behavior (Litman 2005).

Measurement tools

In the fields of neuropsychology and brain imaging, motivational states are commonly measured using functional magnetic resonance imaging (fMRI) scans (Arana et al. 2003; Elliott et al. 2003; Mizuno et al. 2008; Taylor et al. 2004; Zalla et al. 2000) and electroencephalographs (Dubrovinskaya and Machinskaya 2002). Through an examination of blood–oxygen-level-dependent (BOLD) signals, fMRI scans have shown increased activity in the putamen for motivation to learn, compared to motivation for reward (Mizuno et al. 2008). Taylor et al. (2004) found that motivation, in the form of rewards, interacts with similar neural networks as working memory. When monetary reward was used as an incentive, fMRI scans showed BOLD effects in the right superior frontal sulcus and bilateral intraparietal sulcus. Activation in the dorsolateral prefrontal cortex also occurred during retrieval from working memory in the context of reward, showing a possible integration of information about value when using working memory processes. As shown through fMRI scans, the amygdala and orbitofrontal cortex have been associated with the presence of rewards and may be involved in comparing the incentive values of different stimuli in order to select between competing goals (Arana et al. 2003; Elliott et al. 2003). The amygdala has also been associated with responses to success or failure feedback (Zalla et al. 2000). For increased reliability, these measures are often combined with behavioral measures, such as response time (Arana et al. 2003; Elliott et al. 2003; Ernst et al. 2004; Taylor et al. 2004), approach/avoidance behavior (Carver and White 1994), and self-report ratings of difficulty or value, anticipation of reward, desire to learn and goals, and emotional states during the task (Arana et al. 2003; Mizuno et al. 2008; Zalla et al. 2000).

Within the physiological approach, motivation has been measured using eye trackers or pupillometers (Washburn and Putney 2001), which can provide a straightforward measure of mental effort and the mechanisms of interest and attention (Beatty 1982). These authors argue that increases in pupil dilation and more accurate visual gaze are associated with increased attention brought about by motivational stimuli. The correlation of pupil dilation with improved accuracy and response time may be explained by increased attention as an automatic response to the growing demands of a task, which is mediated by increased arousal. Measures of the sympathetic nervous system and cardiovascular reactivity, such as heart rate and blood pressure, have been taken as indicative of motivational arousal (Brehm and Self 1989) and demonstrate the activation and intensity level of an individual’s current motivational state (Lang et al. 1998). Increases in motivational arousal due to reward or punishment correlate with increases in heart rate and blood pressure, while the heart rate and blood pressure of amotivated individuals remain stable (Brehm and Self 1989). Research in the field of sport psychology has found consistencies between self-reported arousal and actual physiological responses to motivational stimuli (e.g., Cumming et al. 2007). However, an individual’s judgment of their own motivational arousal in achievement contexts often does not correlate with these objective measures of arousal, suggesting reasons for the inconsistencies often observed between self-report and neuropsychological/physiological methods (Brehm and Self 1989; Schneider et al. 2004). Other measures have included skin conductance (Lang et al. 1998), salivary cortisol (Blair et al. 2004), and cardiac vagal tone combined with additional autonomic nervous system measures (Beauchaine 2001). Finally, affective reactions to stimuli can be measured using electromyography measures of facial muscle activity (Lang et al. 1998).

Measurement tasks

Motivation is often measured during controlled laboratory experiments and may involve motor detection or response, working memory, and mathematical or cognitive tasks. In addition, researchers have used positive, neutral, and negative pictures, words, or sounds to prompt emotional arousal, in an attempt to correlate motivation with behavioral responses to stimuli (Hillman et al. 2004; Lang et al. 1998). Moreover, Arana et al. (2003) used high- versus low-incentive stimuli in a decision-making situation to measure the correlation of motivating conditions with difficulty of choice. In order to measure motivation, participants are provided with various types and magnitudes of reward or punishment feedback that are random or based on performance. These rewards or punishments may include win/loss, auditory feedback, point rewards, and real or imaginary monetary rewards (Elliott et al. 2003; Taylor et al. 2004; Washburn and Putney 2001; Zalla et al. 2000). For example, Mizuno et al. (2008) used a working memory task to understand brain activation differences for academic rewards (feedback on correct answers designed to elicit feelings of competence and success) versus monetary rewards (obtaining points). It is assumed that participants are motivated by rewards, in addition to competition, success, and researcher encouragement. In a series of compelling studies on the motivational salience of the color red, Elliot and colleagues (Elliot and Maier 2007; Elliot et al. 2009) employed a variety of behavioral and physiological measures of approach/avoidance motivation. For example, motivation was conceptualized as degrees of inclination toward (approach) or away (avoidance) from a test stimulus as measured by an inclinometer. In another related experiment, inhibited motor action as an index of avoidance motivation was measured via the number of knocks on a door beyond which participants expected to encounter experimental stimuli.

A common target of neuropsychological assessment has been the detection of malingering, which is conceptualized in motivational terms as motivation to perform poorly and detected via the assessment of incomplete effort (Ross et al. 2006). In these tasks, incentive to perform poorly is associated with chance performance and is taken as a dysfunction of normal motivational systems. Several assessment tools have been validated, with robust determination of both sensitivity and specificity at detecting motivation levels [e.g., see Axelrod et al. (2006) using the Digit Span subtests of the Wechsler intelligence tests for children and adults; see also Binder (2002) reviewing the use of the Portland Digit Recognition Task].


While the neuropsychological/physiological measurement strategy emerges from a structural approach, in which the central assumption is that motivation is a primary biological system, the behavioral approach makes no claims as to the origins of motivation. Instead, the behavioral approach to the study of motivation is a functional one. Proponents of this approach assert that traditional theories of motivation are inadequate because they do not provide context-specific behavioral measures of motivation in authentic, natural settings (Hickey 1997; Jarvenoja and Jarvela 2005). This approach assumes that overt behaviors and reactions reflect an individual’s motivational state, and that these behaviors are evident before, during, and after a task (Hillman et al. 2004). Hillman et al. (2004) have asserted that many studies of motivation and emotions have incorporated self-report and physiological measures, with few studies observing behavioral reactions. The behavioral approach also recognizes the need for multiple methods to study complex motivational variables (Jarvenoja and Jarvela 2005; Veermans and Tapola 2004).

Motivation is defined within the behavioral approach according to the objective presence or absence of specific overt behaviors and emotions. Intrinsic motivation is often measured through behaviors, such as the choice to pursue and engage in tasks, and attending to and investigating a particular task, which may be due to feelings of arousal or drive (Henderlong and Paris 1996; Reeve and Nix 1997). Similar to the neuropsychological treatment of motivation, approach and avoidance behaviors are considered to be reflective of motivational states (Hillman et al. 2004).

Advantages over self-report include the study of motivation in natural contexts and learning situations (Henderlong and Paris 1996) and the measurement of affective responses, rather than simply cognitive or self-evaluative reactions, to motivational stimuli and conditions. Behaviors such as involvement and engagement are especially important for success in reading and literacy and, therefore, are a significant aspect of motivation to measure. In addition, behavioral measures tend to have increased context-specificity and face validity when compared to self-report measures. Like self-report, most behavioral measures are relatively non-intrusive, especially when considered against the neuropsychological/physiological approach (Reeve and Nix 1997), and can be completed by observers other than the participant (e.g., teachers or parents).

Challenges with the behavioral approach include limitations of the behavioral tasks themselves, as behavior is affected by the knowledge of these tasks, level of challenge, interest level of the participants, the influence of competing motives and activities, previous exposure to the task, and attentional constraints (Henderlong and Paris 1996; Hillman et al. 2004; Wicker et al. 1990). As suggested by Bong (1996), social desirability may affect behavioral results in the same way as self-report measures. Using both methods can inform the researcher about the validity of self-report responses (Bong 1996). Furthermore, if methodology is solely observational, researchers are often unable to assess how the activity influences students’ emotions, confidence, or self-concept, resulting in uncertainty in determining which behaviors are reflective of motivation (Henderlong and Paris 1996). For this reason, most behavioral studies also include a self-report measure.

Measurement tools

The behavioral approach focuses on operationally defining behaviors that are considered to be reflective of motivational states. This may include analyzing video data, viewing behaviors through a one-way mirror, or using common measures to assess behaviors. Behaviors are measured either during the task or post-performance. Behaviors believed to reflect motivation, and intrinsic motivation in particular, include whether an individual approaches or pursues a task, latency of initiation, length of engagement, comprehensiveness of involvement, effort, and task-oriented activity versus off-task behavior (Henderlong and Paris 1996; Justice et al. 2003; Reeve and Nix 1997; Veermans and Tapola 2004). Response or reaction time is commonly used in both neuropsychological and behavioral methodology, and it is assumed that improvement in response time due to motivation or rewards is attributable to increased arousal, which results in improved attention and motor preparedness to respond to highly valued stimuli (Elliott et al. 2003; Washburn and Putney 2001). Interest and motivation measures have also included displays of positive affect during task-related behavior (Gilmore et al. 2003). More specific bodily behavioral measures consist of hand speed and facial displays, including eye contact and frequency of eyes closed, although Reeve and Nix (1997) concluded that there may not be a consistent, reliable group of facial displays related to intrinsic motivation.

Free choice time is the most often implemented behavioral measure of intrinsic motivation and is operationally defined in terms of the time the participant engages with the target task after the experimenter has left the room for a specific period of time (Reeve and Cole 1987; Reeve and Nix 1997; Wicker et al. 1990). The operationalization of free choice time is consistent with the psychological definition of intrinsic motivation (Guay et al. 2000). However, the use of this measure has been challenged due to limitations including the assessment of post-performance motivation, the effect of the attractiveness of alternative activities in the room, discrepancies over correlations with self-reported intrinsic interest, the inability to use free choice time in natural, authentic settings, and the neglect of other aspects of motivation (Guay et al. 2000; Reeve and Nix 1997; Wicker et al. 1990).
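
To make the operational definition of free choice time concrete, the following is a minimal sketch of how such a score might be computed from coded observation records. It is an illustration only and does not reproduce the procedure of any cited study: the record format (start second, end second, activity label), the function name `free_choice_time`, the assumption that coded intervals do not overlap, and the example values are all hypothetical.

```python
from typing import Iterable, Tuple

# Hypothetical coding format: (start_s, end_s, activity label),
# with intervals assumed not to overlap one another.
Interval = Tuple[float, float, str]


def free_choice_time(intervals: Iterable[Interval], target: str,
                     period_start: float, period_end: float) -> float:
    """Sum the seconds spent on the target task within the free-choice period."""
    total = 0.0
    for start, end, activity in intervals:
        if activity != target:
            continue
        # Clip each coded interval to the free-choice window.
        overlap = min(end, period_end) - max(start, period_start)
        if overlap > 0:
            total += overlap
    return total


# Invented example: a 300-second free-choice period after the experimenter leaves.
coded = [(0, 90, "puzzle"), (90, 200, "magazine"),
         (200, 320, "puzzle"), (320, 400, "idle")]
engaged = free_choice_time(coded, "puzzle", 0, 300)
print(engaged)
```

In this invented record, time coded as "magazine" illustrates one of the limitations noted above: the attractiveness of alternative activities in the room directly reduces the score, independent of interest in the target task.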

Behavioral measures are becoming quite common in research on student motivation; for example, Henderlong and Paris (1996) studied the choice and persistence behaviors of children in a museum exhibit scenario, while Pierson (1999) focused on task engagement and persistence in personally interesting literacy activities. Behavioral measures for literacy motivation have also been constructed and include the Kaderavek–Sulzby Rating of Orientation to Book Reading (Justice et al. 2003), which involves a rating of children’s engagement, participation, and interest in a book-reading task. Patrick et al. (1997) developed a protocol to observe several aspects of the classroom environment, including student motivated behavior (e.g., student affect associated with tasks, student help-seeking behavior, and student–student interactions). This measure is useful for a wider range of grade levels and emphasizes the importance of context in students’ motivation.

Measurement tasks

The tasks used in behavioral methods, especially for children, consist mainly of natural, authentic learning environments and activities. This may include informal learning settings, such as museums (Henderlong and Paris 1996), or formal learning situations, such as intervention programs for literacy (Justice et al. 2003). Students typically select tasks that are personally motivating, instead of the researcher choosing or creating tasks that are assumed to be motivating for the student. Some similarities exist between behavioral and neuropsychological approaches with the occasional use of cognitive tasks that are believed to be intrinsically motivating (Reeve and Cole 1987; Reeve and Nix 1997; Wicker et al. 1990) and extrinsic motivators, such as monetary rewards (Wicker et al. 1990).

In reviewing the approaches to the measurement of motivation, it is evident that each of these approaches provides strengths to the study of motivation that resolve some of the limitations in self-report methodology. However, the challenge lies in the integration of these approaches, which has been rare in empirical research to date. The following future directions will provide a review of accounts of preliminary attempts at integration or combination of the previously discussed approaches and a summary of the characteristics of an ideal measure of motivation.

Future Directions

The preceding critiques of four approaches to motivation measurement imply that each perspective has specific and distinct strengths. Within each approach, researchers should take care to protect these strengths by engaging in best practices for that method. Reliability and construct specificity are the main advantages of the self-report method. These can be preserved by attending carefully to item construction, with particular attention to developmental issues in the writing of items and in response scale mechanics. Whether using an existing measure or creating a new measure, construct validation via a dimensionality analysis is also essential for preserving the strengths of the self-report format. In contrast, the idiographic, dynamic, and context-specific nature of the phenomenological approach is the major strength of that measurement perspective. Any procedure that imposes researcher-derived categories, classifications, or labels for motivational constructs nullifies this measurement advantage. The main advantages of the behavioral and physiological approaches are their use of natural contexts and the focus on face-valid motivated behaviors (e.g., attention to task, persistence, and task choice) or physiological responses. Preserving the specificity of the target context and behavior maximizes this strength. Each of the four approaches has an established literature that details best measurement practices, and starting points have been identified in the critiques above. An additional caveat to these recommendations is that, although each approach has specific strengths that should be preserved, the approach chosen for a particular study should fit with the research questions and theories driving the research.

As cumbersome as a full dimensionality assessment for a self-report measure or an authenticity audit for a phenomenologically based measure may be, attending to and conserving the strengths of each measurement approach is the easier of the two tasks presented to researchers by the present critique. Accounting for and mitigating the weaknesses of each approach is a more difficult task. Where self-report methodology has strength in construct specificity, it may lack face validity; where the behavioral approach has strength in face validity, the tasks or responses chosen may be misaligned with the lived motivational experience and learning contexts of the students involved; where the phenomenological approach captures the idiographic aspects of motivation, generalization to developmental and educational processes may be difficult. As a result, one potential mechanism for attending to the weaknesses of individual approaches may be a process whereby the strength of an alternate approach is incorporated into the measurement scheme or the broader study at hand.

Preliminary combination/integration

With the predominance of self-report methodology in the study of motivation, integrative methodological approaches should consist of the best of self-report with the incorporation of alternative measurement techniques from other approaches. Ideally, measures of motivation will be multidimensional in both theoretical perspectives and measurement techniques, reflecting multiple perspectives and approaches to capture the complexities of students’ motivational profiles with regard to academics. The aforementioned approaches to the study of motivation have been integrated in research in diverse ways; however, the predominant focus continues to emphasize either individual or environmental determinants of motivation (Veermans and Tapola 2004). With the limitations of each approach and the ability of other fields to partially compensate for these limitations, there is a need to form integrated, multi-pronged approaches to the study of motivation. Our comments on combination and/or integration of methodological approaches do not veer into the problems of theoretical integration. Acknowledging that each of the four measurement approaches arises from different and sometimes incompatible theoretical traditions, any specific attempt to combine techniques will need to weigh the extensive literature on problems of theoretical integration (e.g., Green 2007).

The most common multi-method approach is the combination of self-report and behavioral measures (e.g., Reeve and Cole 1987; Reeve and Nix 1997). As Bong (1996) suggested, self-report data can be integrated with behavioral methods, including observations in the learning environment and more overt behavioral indexes. Disparate findings have emerged from these multi-method approaches, possibly due to a superficial integration of the methods, or a limited, unidirectional conceptualization of the relationship between the individual and context (Veermans and Tapola 2004). Across multiple areas of study in learning and instruction, reciprocal relationships have been identified between the individual and environment, and therefore, the relationship with motivation should be investigated in a similar way (Reeve and Cole 1987).

Reeve and Cole (1987) found that self-report and behavioral measures of intrinsic motivation independently contribute to the variance in intrinsic motivation, concluding that both methods are crucial in the study of motivation. Reeve and Nix (1997) also found that self-reported interest and competence correlated with behavioral measures and with some facial displays of intrinsic motivation. An unanticipated finding by Frijters et al. (2005) resulted in the identification of a behavioral component to a self-report measure. Median time to complete each item strongly predicted degree of response to reading remediation. Children who took longer to complete each item grew at the fastest rate on reading outcomes, suggesting that persistence or effort on the self-report task functioned as a proxy for engagement with reading material of any kind, including items on a self-report task. Jarvenoja and Jarvela (2005) and Veermans and Tapola (2004) reached a similar conclusion, emphasizing the importance of a profile-oriented approach that combines individual (self-report) and situational measures to understand classroom motivation in natural settings. Relationships between self-report questionnaire responses and neurological activity have also been found, with the strength of reported subjective motivational state linearly related to the intensity of brain activity in the putamen (Mizuno et al. 2008).
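
As an illustration of how a behavioral index can be derived from the administration of a self-report measure, the sketch below computes a median time per item for each respondent. It is a hypothetical example rather than the analysis reported by Frijters et al. (2005): the function name `median_item_time`, the data layout, and the latency values are invented for demonstration.

```python
from statistics import median


def median_item_time(item_seconds):
    """Return the median number of seconds spent per questionnaire item."""
    if not item_seconds:
        raise ValueError("at least one item latency is required")
    return median(item_seconds)


# Invented per-item completion times (in seconds) for two respondents.
latencies = {
    "child_a": [4.0, 6.5, 5.0, 30.0, 5.5],
    "child_b": [2.0, 2.5, 3.0, 2.0],
}
medians = {child: median_item_time(times) for child, times in latencies.items()}
print(medians)
```

The median, rather than the mean, keeps a single unusually long pause (the 30-second item for the first respondent) from dominating the index.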

Attempts to integrate approaches have not been completely successful, possibly due to a lack of understanding about the particular theoretical underpinnings of the construct of motivation within each approach. This was evident in Reeve and Nix’s (1997) claim of incorporating a phenomenological approach, which was actually a self-report measure. This emphasizes the need for familiarity with the different approaches to motivation and proper integration suggestions, as highlighted in this review. Although some studies have combined self-report and behavioral methods, caution is required when integrating these methods, as they may be incompatible on some dimensions or draw on different aspects of motivation. Wicker et al. (1990) compared self-report and behavioral measures of intrinsic motivation for a task and found that self-reported motivation (e.g., feeling interested, engaged, and successful) and expressed motivated behavior were not correlated. Furthermore, their behavioral measure was negatively correlated to affect (e.g., feeling competent, reporting the task was appealing and fun) and goals (e.g., autonomous achievement). The authors concluded that self-report and behavioral measures are not equivalent in their measurement of intrinsic motivation and may measure different aspects of motivation. Wicker et al. (1990) stated that some motivational factors may have a similar influence on the self-report and behavioral measures, while other factors may have opposing effects, advising “some distrust of all extant measures of intrinsic motivation” (p. 85). The authors concluded that if the measures are to be combined to study similar aspects of motivation, clarity as to how self-report and behavioral methods differ and overlap in the measurement of motivation is required.

Similarly, Elliott (2004) discussed challenges in combining interviews and self-reports, as responses during an interview may not reflect self-report data. Thus, when integrating multiple methods in the study of motivation, we must be attentive to the dimensions of motivation measured by each instrument and the validity of these measures. A surface combination of methods is insufficient, as these methods must be rigorously compared and integrated to ensure that a reliable, valid measure of motivation is developed. At this point, the characteristics of a more robust measure of motivation are discussed, based on the strengths of the approaches explored throughout this review.

Integrated measurement strategy for motivation

Optimizing the measurement of motivation has been a topic of recent interest, especially with commentaries on the limitations of self-report (e.g., Kagan 2007) and the expanding methodological options available to overcome these limitations (e.g., for measuring self-regulation, see Zimmerman 2008). Although the consideration of alternative approaches is needed, it is also vital to extend our current understanding of existing methods through further research and theory development. This is especially relevant in the self-report approach in order to understand the most reliable and valid applications of existing and novel measures. Particularly in the case of a diverse, complex construct such as motivation, multiple perspectives and the interactions between these perspectives must be utilized (Yeung 2004). The review of the approaches to the study of motivation and basic attempts at integration provides a rationale for a measure that retains the best characteristics of self-report measures (e.g., reliability and desirable scaling properties for use in statistical analyses) and behavioral measures (e.g., context-specificity and face validity). Construction of complementary measurement devices should integrate the two significant aspects of motivation, as suggested by the theoretical work of Ainslie (1992): (1) an attitudinal or emotional orientation component typically assessed through self-report or phenomenological methodologies and (2) a task-oriented component assessing the engagement and involvement with an activity, which corresponds to behavioral methodologies.

Pintrich (2003) recommended the integration of cognitive-individual and social-cultural approaches, as individual or contextual approaches on their own will not generate new knowledge in the field. Cognition and motivation must be examined from the outside in through a contextual and cultural lens, rather than focusing solely on individual and intrapsychological processes (Pintrich 2003). The integration of behavioral and neuropsychological models of motives and needs with social-cognitive approaches, and the combination of implicit and explicit constructs, will give rise to a more comprehensive understanding of motivation through a multiplicity of experimental designs and methodologies (Pintrich 2003). Furthermore, the measurement of motivation should be carried out in natural, authentic learning contexts (Jarvenoja and Jarvela 2005; Veermans and Tapola 2004).

One possibility that we propose for the functional integration of motivation measures from different methodological traditions is the input–output approach. As mentioned above, superficial combination of measurement methods has led to inconsistent results and questions about the validity of each measure separately. To combine measures functionally, the output from one measure could form the input for another. For example, a multidimensional self-report scale of motivation for reading may be used initially, generating a score profile across interest, effort, and perceived competence. Low scale scores on one or more of these dimensions could form the basis for open-ended or phenomenological assessment focusing on that aspect of motivation. Similarly, the output from a behavioral measure of motivation could be used to detect changes in self-report of motivation. The degree of discrepancy between self-reports before and after either success or failure conditions in a behavioral measure may indicate whether the self-appraisal motivation system is intact or robust. Such a functional approach would capitalize on the strength of individual approaches but simultaneously be able to address complex motivational systems that operate in learning situations.
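
The input–output logic described above can be sketched in a few lines of code. The following is a minimal illustration under stated assumptions, not a validated scoring procedure: the subscale names follow the example in the text, but the cutoff value, the function names `flag_for_follow_up` and `self_report_shift`, and all scores are hypothetical.

```python
def flag_for_follow_up(profile, cutoff):
    """Return the subscales scoring below the cutoff (output of the self-report
    stage), which become the focus (input) of a later open-ended assessment."""
    return sorted(name for name, score in profile.items() if score < cutoff)


def self_report_shift(pre, post):
    """Return the post-minus-pre change on each subscale present in both
    administrations, e.g. before and after a success or failure condition."""
    return {name: post[name] - pre[name] for name in pre if name in post}


# Invented profile on a hypothetical 1-5 scale; the cutoff of 3.0 is arbitrary.
profile = {"interest": 4.2, "effort": 2.1, "perceived_competence": 2.8}
flagged = flag_for_follow_up(profile, cutoff=3.0)
print(flagged)

# Invented self-reports before and after a failure condition in a behavioral task.
pre = {"interest": 4.0, "effort": 2.0, "perceived_competence": 3.0}
post = {"interest": 3.5, "effort": 2.0, "perceived_competence": 1.5}
shift = self_report_shift(pre, post)
print(shift)
```

In practice, any cutoff and any interpretation of the size of a shift would need to be justified empirically for the specific scale and population; the sketch shows only how the output of one measure can be passed as input to the next.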

Using one method of measurement to contextualize the results of another reframes the lack of inter-measure consistency from a methodological problem to a potential way to advance theory in the area. For example, Turner et al. (2002) used student self-report to identify classroom-level differences in the relation between middle school students’ avoidance strategies and their perceptions of the classroom goal structure. To thoroughly understand and contextualize these relationships and between-classroom differences, audio transcripts and observational notes of teacher discourse were examined in nine classrooms. The selected classrooms had dissimilar patterns of avoidance strategies and perceptions of goal structure (e.g., low avoidance/high mastery; high avoidance/low mastery). The triangulation of methods led to a more detailed analysis of how variations in instructional and motivational discourse may have related to quantitative differences in students’ self-report responses. This methodology also advanced theory in the area of classroom mastery goal structure by noting the importance of both cognitive and motivational support in classroom discourse, where the focus had been primarily on cognitive features.

As suggested recently by Kagan (2007), the lack of consistency between measurement modes may stem from the differing origins and intrapersonal contexts of responses to a particular self-report item or score on an observational checklist. Contextually qualified measurement opens the possibility for contextually relevant theories of motivation. In summary, although this review of various approaches to motivation methodology is not exhaustive, it does demonstrate the usefulness and necessity of understanding various theoretical foundations of motivation and the development of an integrative measurement approach. Novel measures to study the relationship between motivation and academic achievement must incorporate ideal characteristics, including the strengths of self-report, while overcoming the limitations of this method through the incorporation of additional methodological and conceptual approaches. The diversities within and between the phenomenological/authentic, neuropsychological/physiological, and behavioral approaches allow for multiple permutations in designing novel instruments and scales. Although integration will be challenging due to diverse definitions of motivation and instruments that measure different aspects of motivation, it is essential if we are to advance the study of motivation.

Copyright information

© Springer Science+Business Media, LLC 2009