Background and Objective
This review is an update of a previous review published in 2010, and aims to summarize the available studies on the measurement properties of physical activity questionnaires for young people under the age of 18 years.
Systematic literature searches were carried out using the online PubMed, EMBASE, and SPORTDiscus databases up to 2018. Articles had to evaluate at least one of the measurement properties of a questionnaire measuring at least the duration or frequency of children’s physical activity, and be published in the English language. The standardized COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist was used for the quality assessment of the studies.
This review yielded 87 articles on 89 different questionnaires. Within the 87 articles, 162 studies were conducted: 103 studies assessed construct validity, 50 assessed test–retest reliability, and nine assessed measurement error. Of these studies, 38% were of poor methodological quality and 49% of fair methodological quality. A questionnaire with acceptable validity was found only for adolescents, i.e., the Greek version of the 3-Day Physical Activity Record. Questionnaires with acceptable test–retest reliability were found in all age categories, i.e., preschoolers, children, and adolescents.
Unfortunately, no questionnaires were identified with conclusive evidence for both acceptable validity and reliability, partly due to the low methodological quality of the studies. This evidence is urgently needed, as current research and practice are using physical activity questionnaires of unknown validity and reliability. Therefore, recommendations for high-quality studies on measurement properties of physical activity questionnaires were formulated in the discussion.
PROSPERO Registration Number
No conclusive evidence was found for both the validity and reliability for any of the included physical activity questionnaires for youth.
High-quality studies on the measurement properties of the most promising physical activity questionnaires are urgently needed, e.g., by using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist.
More attention on the content validity of physical activity questionnaires is needed to confirm that questionnaires measure what they intend to measure.
Numerous studies have demonstrated beneficial effects of physical activity, in particular of moderate to vigorous intensity, on metabolic syndrome, bone strength, physical fitness, and mental health in children and adolescents [1, 2]. In order to monitor trends in physical activity, examine associations between physical activity and health outcomes, and evaluate the effectiveness of physical activity-enhancing interventions, valid, reliable, responsive, and feasible measures of physical activity are needed.
Accelerometers are considered to provide valid and reliable measures of physical activity in children and adolescents. However, accelerometers are not a gold standard and underestimate activities such as cycling, swimming, weight lifting, and many household chores. Moreover, physical activity estimates vary depending on subjective decisions in data reduction such as the choice of cut-points for intensity levels, the minimum number of valid days, the minimum number of valid hours per day, and the definition of non-wear time. Furthermore, accelerometers cannot provide information on the type and context of the behavior and are labor-intensive and costly, especially in large populations.
Self-report or proxy-report questionnaires are seen as a convenient and affordable way to assess physical activity that can provide information on the context and type of the activity [5, 6]. However, questionnaires have their limitations as well, such as the potential for social desirability and recall bias [6, 7]. Thus, for measuring physical activity a combination of the more objective measures such as accelerometers and self-report questionnaires seems most promising.
A great many questionnaires measuring physical activity in children and adolescents have been developed, with varying formats, recall periods, and types of physical activity recalled. To be able to select the most appropriate questionnaire, an overview of the measurement properties of the available physical activity questionnaires in children and adolescents is highly warranted. In 2010, Chinapaw et al. reviewed the measurement properties of self-report and proxy-report measures of physical activity in children and adolescents. As many studies assessing measurement properties of physical activity questionnaires have been published since then, an update is timely.
Therefore, we aimed to summarize studies that assessed the measurement properties (e.g., responsiveness, reliability, measurement error, and validity) of self-report or proxy-report questionnaires in children and adolescents under the age of 18 years published since May 2009. Furthermore, we aimed to provide recommendations regarding the best available questionnaires, taking into account the best available questionnaires from the previous review.
This review is an update of the previously published review of Chinapaw et al. We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines and registered the review on PROSPERO (international prospective register of systematic reviews; registration number: CRD42016038695).
Systematic literature searches were carried out in PubMed, EMBASE, and SPORTDiscus (from January 2009 up until April 2018). In PubMed more overlap in time was maintained (search from May 2008), as our previous searches showed that the PubMed time filter can be inaccurate, e.g., due to incorrect labeling of publication dates. The full search strategy can be found in the Electronic Supplementary Material (Online Resource 1).
Search terms in PubMed were used in AND-combination, and related to physical activity (e.g., motor activity, exercise), children and adolescents (e.g., schoolchildren, adolescents), measurement properties (e.g., reliability, reproducibility, validity), and self- or proxy-report measures (e.g., child-reported questionnaire). Medical Subject Headings (MeSH), title and abstract (TIAB), and free-text search terms were used, and a variety of publication types (e.g., biography, comment, case reports, editorial) were excluded. In EMBASE, search terms related to physical activity, measurement properties, and self- or proxy-report measures were used in AND-combination. The search was limited to children and adolescents (e.g., child, adolescent), and EMBASE-only. EMBASE subject headings, TIAB, and free-text search terms were used. In SPORTDiscus, TIAB and free-text search terms were used in AND-combination, related to physical activity, children and adolescents, and self- or proxy-report measures.
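The AND-combination of concept groups described above can be sketched as follows. This is only an illustration of the query structure: the terms are the examples named in the text, the `[tiab]` field tag is applied to every term for simplicity, and the helper names `or_group` and `and_combine` are hypothetical. The full strategy is the one in Online Resource 1, not this sketch.

```python
def or_group(terms, field):
    """Join synonyms for one concept with OR, tagging each term with a search field."""
    return "(" + " OR ".join(f'"{term}"[{field}]' for term in terms) + ")"


def and_combine(groups):
    """Join the concept groups with AND, so a record must match every concept."""
    return " AND ".join(groups)


# Example terms taken from the text; the real strategy uses many more terms.
concepts = {
    "physical activity": ["motor activity", "exercise"],
    "population": ["schoolchildren", "adolescents"],
    "measurement properties": ["reliability", "reproducibility", "validity"],
    "instrument": ["questionnaire"],  # simplified stand-in for self-/proxy-report terms
}

query = and_combine(or_group(terms, "tiab") for terms in concepts.values())
print(query)
```

Keeping each concept as its own OR-group makes it easy to drop a group for a database where it does not apply, as was done for EMBASE and SPORTDiscus.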
Inclusion and Exclusion Criteria
Studies were eligible for inclusion when (1) the aim of the study was to evaluate at least one of the measurement properties of a self-report or proxy-report physical activity questionnaire, or a questionnaire containing physical activity items; (2) the questionnaire under study at least reported data on the duration or frequency of physical activity; (3) the mean age of the study population was < 18 years; and (4) the study was available in the English language. Studies were excluded in the following situations: (1) studies assessing physical activity using self-report measures administered by an interview (one-on-one assessment) or using a diary; (2) studies evaluating the measurement properties in a specific population (e.g., children who are affected by overweight or obesity); (3) studies examining structural validity and/or internal consistency for questionnaires that represent a formative measurement model; (4) construct validity studies examining the relationship between the questionnaire and a non-physical activity measure, e.g., body mass index (BMI) or percentage body fat; and (5) responsiveness studies that did not use a physical activity comparison measure, e.g., accelerometer, to assess a questionnaire’s ability to detect change.
Titles and abstracts were screened for eligible studies by two independent researchers [Lisan Hidding (LH) and either Mai Chinapaw (MC), Mireille van Poppel (MP), Teatske Altenburg (TA), or Lidwine Mokkink (LM)]. Subsequently, full texts were obtained and screened for eligibility by two independent researchers (LH and either TA or MP). A fourth researcher (MC) was consulted in the case of doubt.
For all eligible studies, two independent reviewers (LH and either TA or MP) extracted data regarding the characteristics of studies and results of the assessed measurement properties, using a structured form. Extracted data regarding the methods and results of the assessed measurement properties included study population, questionnaire under study, studied measurement properties, comparison measures, time interval, statistical methods used, and results regarding the studied measurement properties. In the case of disagreement regarding data extraction, a fourth researcher (MC) was consulted.
Methodological Quality Assessment
Two independent reviewers (LH and either MC or LM) rated the methodological quality of the included studies using the standardized COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist [10,11,12]. For each measurement property, the design requirements were rated using a 4-point scale (i.e., excellent, good, fair, or poor). The lowest score counts method was applied, e.g., the final methodological quality was scored as poor in the case of a poor score on one of the items. The lowest rated items that determined the final score for each study are shown in Electronic Supplementary Material Online Resource 2. The methodological quality of the content validity studies was not assessed, as often little or no information on the development of the questionnaire or on the assessment of relevance, comprehensiveness, and comprehensibility of items was available. One minor adaptation to the original COSMIN checklist, also described in a previous review, was applied: Percentage of Agreement (PoA) was removed from the reliability box and added to the measurement error box as an excellent statistical method. To assess the methodological quality of test–retest reliability studies, standards previously described by Chinapaw et al. regarding the time interval were applied: between > 1 day and < 3 months for questionnaires recalling a standard week; between > 1 day and < 2 weeks for questionnaires recalling the previous week; and between > 1 day and < 1 week for questionnaires recalling the previous day.
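As a minimal sketch of the two rules in this paragraph, the code below applies the lowest score counts method to a list of item ratings and checks a test–retest interval against the stated bounds. The function names, the dictionary keys, and the conversion of 3 months to 90 days are assumptions made for illustration; they do not come from the COSMIN checklist itself.

```python
# Ordered from worst to best, following the 4-point scale named in the text.
RATINGS = ["poor", "fair", "good", "excellent"]


def lowest_score_counts(item_ratings):
    """Return the worst rating among the items rated for one measurement property."""
    if not item_ratings:
        raise ValueError("at least one item rating is required")
    return min(item_ratings, key=RATINGS.index)


# Upper bounds (in days) for the test-retest interval, per recall period.
# Assumption of this sketch: 3 months is approximated as 90 days.
MAX_INTERVAL_DAYS = {
    "standard_week": 90,
    "previous_week": 14,
    "previous_day": 7,
}


def interval_is_appropriate(recall_period, interval_days):
    """True when the interval is > 1 day and below the bound for the recall period."""
    return 1 < interval_days < MAX_INTERVAL_DAYS[recall_period]


print(lowest_score_counts(["good", "poor", "excellent"]))
print(interval_is_appropriate("previous_week", 7))
```

Both bounds are exclusive, mirroring the "> 1 day" and "<" wording of the standards.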
Questionnaire Quality Assessment
Reliability is defined as “the degree to which a measurement instrument is free from measurement error”. Test–retest reliability outcomes were considered acceptable under the following conditions: (1) intraclass correlation coefficients and kappa values ≥ 0.70; or (2) Pearson, Spearman, or unknown correlations ≥ 0.80. Measurement error is defined as “the systematic and random error of a score that is not attributed to true changes in the construct”. Measurement error outcomes were considered acceptable when the smallest detectable change (SDC) was smaller than the minimal important change (MIC).
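The review does not state how the SDC is derived. A relation commonly used in the measurement literature is SDC = 1.96 × √2 × SEM, where SEM is the standard error of measurement, and one way to estimate SEM is SD × √(1 − ICC). The sketch below assumes both relations; the function names and the input values are hypothetical, and the MIC used in the comparison is an invented number, since no MIC is available for the included questionnaires.

```python
import math


def sem_from_icc(sd, icc):
    """Estimate the standard error of measurement from a pooled SD and an ICC (assumed relation)."""
    return sd * math.sqrt(1.0 - icc)


def smallest_detectable_change(sem, z=1.96):
    """SDC at the individual level: z * sqrt(2) * SEM (assumed relation)."""
    return z * math.sqrt(2.0) * sem


def measurement_error_acceptable(sdc, mic):
    """Criterion from the text: acceptable when the SDC is smaller than the MIC."""
    return sdc < mic


# Hypothetical inputs: SD of 30 min/day and ICC of 0.75.
sem = sem_from_icc(sd=30.0, icc=0.75)
sdc = smallest_detectable_change(sem)
print(round(sem, 1), round(sdc, 1))
print(measurement_error_acceptable(sdc, mic=20.0))  # MIC of 20 is invented for illustration
```

Even with an ICC that passes the 0.70 threshold, the SDC in this made-up case exceeds the SEM by a factor of almost three, which illustrates why reliability and measurement error are reported separately.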
The majority of the included studies reported multiple correlations per questionnaire for test–retest reliability, e.g., separate correlations for each questionnaire item. Therefore, an overall evidence rating was applied in order to obtain a final test–retest reliability rating, incorporating all correlations per questionnaire for each study. A positive (+) evidence rating was obtained if ≥ 80% of correlations were acceptable, a mixed (±) evidence rating was obtained when ≥ 50% and < 80% of correlations were acceptable, and a negative (–) evidence rating was obtained when < 50% of correlations were acceptable. For measurement error, no final evidence rating could be applied, as to our knowledge no information on the MIC is available for the included questionnaires. Furthermore, in the case of PoA, higher scores represent less measurement error.
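The overall evidence rating can be sketched as below, combining the acceptability thresholds (≥ 0.70 for intraclass correlation coefficients and kappa; ≥ 0.80 for Pearson, Spearman, or unknown correlations) with the 80% and 50% cut-offs. The function names, the statistic labels, and the example values are assumptions for illustration, and the ASCII "-" stands in for the negative rating.

```python
ICC_KAPPA_MIN = 0.70      # threshold for intraclass correlations and kappa
CORRELATION_MIN = 0.80    # threshold for Pearson, Spearman, or unknown correlations


def is_acceptable(value, statistic):
    """Apply the threshold that matches the type of statistic."""
    threshold = ICC_KAPPA_MIN if statistic in {"icc", "kappa"} else CORRELATION_MIN
    return value >= threshold


def evidence_rating(results):
    """Rate one questionnaire in one study from its (value, statistic) pairs."""
    if not results:
        raise ValueError("at least one correlation is required")
    share = sum(is_acceptable(v, s) for v, s in results) / len(results)
    if share >= 0.80:
        return "+"
    if share >= 0.50:
        return "±"
    return "-"


# Hypothetical item-level ICCs for one questionnaire: 4 of 5 are acceptable.
items = [(0.75, "icc"), (0.80, "icc"), (0.71, "kappa"), (0.90, "icc"), (0.60, "icc")]
print(evidence_rating(items))
```

Note that a value of 0.75 counts as acceptable when it is an ICC but not when it is a Pearson correlation, so the type of statistic must be carried along with each value.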
For validity, three different measurement properties can be distinguished, i.e., content validity, construct validity, and criterion validity. Content validity is defined as “the degree to which the content of a measurement instrument is an adequate reflection of the construct to be measured”. Construct validity is “the degree to which the scores of a measurement instrument are consistent with (a priori drafted) hypotheses”. Hypotheses can concern internal relationships, i.e., structural validity, or relationships with other instruments. Criterion validity is defined as “the degree to which the scores of an instrument are an adequate reflection of a gold standard”.
Content validity could not be assessed, as for most studies a justification of choices, e.g., comprehensibility findings based on input from the target population or experts in the field, was missing. A summary of the studies examining content validity has been added in the results section. Since a priori formulated hypotheses for construct validity were often lacking, in line with previous reviews [13, 18] we formulated criteria with regard to the relationships with other instruments; see Table 1 for criteria. The criteria were subdivided by level of evidence, level 1 indicating strong evidence, level 2 indicating moderate evidence, and level 3 indicating weak evidence. Table 1 also includes criteria for criterion validity, e.g., when doubly labeled water was used as a comparison measure for questionnaires aiming to assess physical activity energy expenditure.
Most construct validity studies examined relationships with other instruments, reporting separate correlations for each questionnaire item. As with reliability, an overall evidence rating was applied incorporating all available correlations for each questionnaire per study (i.e., a positive, mixed, or negative evidence rating was obtained). Since no hypotheses were available for mean differences and limits of agreement, only a description of these results is included in the Results section (Sect. 3).
Inclusion of Results from the Previous Review
To draw definite conclusions regarding the best available questionnaires, the most promising questionnaires based on the previous review, i.e., published before May 2009, were also taken into account. As the previous review combined the methodological quality assessment and the questionnaire quality (i.e., results regarding measurement properties) in one rating, we reassessed the methodological and questionnaire quality of these previously published studies. We included only the studies that received a positive rating in the previous review for each measurement property. However, in the previous review, no final rating for measurement error was applied; therefore, all measurement error studies were reassessed and included in the current review. In addition, for construct validity, no final rating was applied in the previous review, as the majority of studies did not formulate a priori hypotheses. We chose to reassess the two studies showing the highest correlations between the questionnaire and an accelerometer, for each age category. The studies below this ‘top 2’ showed such low correlations that they would receive a negative evidence rating using our criteria. Furthermore, we assessed three other studies that formulated a priori hypotheses, as these studies may score higher regarding methodological quality. The reassessed studies are included in Tables 2, 3, and 4 in the Results section.
We chose to divide the included studies into three age categories, i.e., preschoolers, children, and adolescents, and draw conclusions on the best available questionnaire(s) for each age category. A questionnaire was considered of interest when at least a fair methodological quality and a positive evidence rating were achieved. Additionally, for construct validity, the level of evidence (see Table 1) was taken into account, so questionnaires with a higher level of evidence comparison measure were considered more valuable. Because no evidence ratings were available for measurement error, this measurement property was not taken into account when drawing conclusions about the best available questionnaire.
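The selection rule in this paragraph (at least fair methodological quality, a positive evidence rating, and preference for a stronger level of evidence) can be sketched as below. The `StudyResult` class and the function names are hypothetical. The first two entries use the construct validity ratings reported in this review; "Questionnaire X" is an invented entry added to show the filter at work.

```python
from dataclasses import dataclass

QUALITY_ORDER = {"poor": 0, "fair": 1, "good": 2, "excellent": 3}


@dataclass
class StudyResult:
    questionnaire: str
    quality: str         # methodological quality of the study
    evidence: str        # "+", "±", or "-"
    evidence_level: int  # Table 1 level: 1 = strong, 2 = moderate, 3 = weak


def of_interest(result):
    """At least fair methodological quality and a positive evidence rating."""
    return QUALITY_ORDER[result.quality] >= QUALITY_ORDER["fair"] and result.evidence == "+"


def rank_candidates(results):
    """Keep questionnaires of interest; strongest evidence level first, then best quality."""
    kept = [r for r in results if of_interest(r)]
    return sorted(kept, key=lambda r: (r.evidence_level, -QUALITY_ORDER[r.quality]))


results = [
    StudyResult("Godin Leisure-Time Exercise Questionnaire", "fair", "+", 2),
    StudyResult("3DPARecord (Greek version)", "fair", "+", 1),
    StudyResult("Questionnaire X", "poor", "+", 1),  # invented example, filtered out
]
for r in rank_candidates(results):
    print(r.questionnaire, r.evidence_level)
```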
Systematic literature searches using the PubMed, EMBASE, and SPORTDiscus databases yielded 15,220 articles after removal of duplicates. After title and abstract screening, 110 eligible articles remained. Another 21 articles were found through cross-reference searches. Therefore, 131 full-text articles were screened, which resulted in the inclusion of 71 articles examining 76 (versions of) questionnaires. After additionally including 16 articles from the previous review, this resulted in 87 articles examining 89 (versions of) questionnaires. See Fig. 1 for the full selection process. Within the 87 articles, 162 studies were conducted, with 103 assessing construct validity, 50 test–retest reliability, and nine measurement error. Four of the included questionnaires were assessed by two of the included studies, i.e., the 3-Day Physical Activity Recall (3DPARecall) [19, 20], the Activity Questionnaire for Adults and Adolescents (AQuAA) [21, 22], the Oxford Physical Activity Questionnaire (OPAQ) [23, 24], and a physical activity, sedentary behavior, and strength questionnaire [25, 26]. Furthermore, two of the questionnaires were assessed by three of the included studies, i.e., the Physical Activity Questionnaire for Older Children (PAQ-C) [27,28,29], and the Previous Day Physical Activity Recall (PDPAR) [30,31,32]. In addition, various modified versions of questionnaires were assessed by the included studies.
The construct validity results are summarized in Table 2. Of the 72 questionnaires that were assessed on construct validity, eight were from the previous review. Fifteen of the questionnaires were assessed by two studies, two were assessed by three studies, one by four, one by five, and one by six studies. Six questionnaires were assessed in preschoolers, 29 in children, and 38 in adolescents (one questionnaire was assessed in both children and adolescents). The methodological quality rating of the construct validity studies ranged from poor to good: 49 studies received a poor, 49 a fair, and five a good rating. The low methodological scores were predominantly due to comparison measures with unacceptable or unknown measurement properties, and a lack of a priori formulated hypotheses. No definite conclusion could be drawn regarding the best available questionnaires for preschoolers, as studies on construct validity within this age category were of low methodological quality or received negative evidence ratings. For children, the best available questionnaire was found to be the Godin Leisure-Time Exercise Questionnaire (fair methodological quality and positive level 2 evidence). Although the moderate level 2 evidence hampered our ability to draw conclusions on its validity, this questionnaire merits further investigation. We concluded that the most valid questionnaire in adolescents was the Greek version of the 3-Day Physical Activity Record (3DPARecord) (fair methodological quality and positive level 1 evidence rating). Note that the 3DPARecord uses a different format (i.e., different time segments and categories) than the frequently used 3DPARecall.
Six of the included questionnaires were qualitatively assessed on content validity, one of which was assessed by two studies [25, 26, 34,35,36,37]. Studies used cognitive interviews, semi-structured interviews, and focus groups with children and adolescents and/or experts (e.g., researchers in the field of sports medicine, pediatrics, and measurement) to assess the comprehensibility, relevance of items, and comprehensiveness of the questionnaires. Due to a lack of details on the methods used regarding testing or developing these questionnaires, the methodological quality of these studies and the quality of the questionnaires could not be assessed. Ten of the included questionnaires were pilot-tested with children and/or parents on, for example, comprehensiveness and time to complete [33, 38,39,40,41,42,43,44,45]. However, again, the study quality could not be assessed due to the minimal amount of information provided. Lastly, 15 of the questionnaires were translated versions [33, 35, 39, 40, 43, 46,47,48,49,50,51,52,53]; the majority of these studies provided little information on the translation processes. These studies did not assess the cross-cultural validity, and thus no definite conclusion about the content validity of the translated questionnaires could be drawn.
The test–retest reliability results are summarized in Table 3. Of the 46 questionnaires assessed on test–retest reliability, five were from the previous review. Four of the questionnaires were assessed by two studies. Five questionnaires were assessed in preschoolers, 16 in children, and 26 in adolescents (one questionnaire was assessed in both children and adolescents). The methodological quality of the studies was rated as follows: 13 scored poor, 26 fair, and 11 good. The majority of poor and fair scores were due to the lack of a description about how missing items were treated and inappropriate time intervals between test and retest. The most reliable questionnaire in preschoolers was the Energy Balance Related Behaviors (ERBs) self-administered primary caregivers questionnaire (PCQ) (fair methodological quality and positive evidence rating). In children, the most reliable questionnaires were the Chinese version of the PAQ-C and the Active Transportation to school and work in Norway (ATN) questionnaire (both good methodological quality and positive evidence rating). The most reliable questionnaires in adolescents were a single-item activity measure and the web-based and paper-based PAQ-C (both good methodological quality and positive evidence rating).
Table 4 summarizes the measurement error outcomes. Of the nine questionnaires assessed on measurement error, two were from the previous review. One questionnaire was assessed in preschoolers, three in children, and five in adolescents. Four of the studies received a good methodological quality rating, and five received a fair one. Fair scores were predominantly due to the lack of a description about how missing items were treated.
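The percentages of poor and fair studies quoted in the abstract can be reproduced from the per-property counts reported in this section. The sketch below is only an arithmetic check; the dictionary layout is an assumption, and the zero for poor measurement error studies is inferred from the statement that all nine received a good or fair rating.

```python
# Methodological quality counts per measurement property, as reported in the text.
counts = {
    "construct validity": {"poor": 49, "fair": 49, "good": 5},
    "test-retest reliability": {"poor": 13, "fair": 26, "good": 11},
    "measurement error": {"poor": 0, "fair": 5, "good": 4},  # poor = 0 is inferred
}

total = sum(sum(by_rating.values()) for by_rating in counts.values())

# Sum each rating across the three measurement properties.
per_rating = {
    rating: sum(by_rating[rating] for by_rating in counts.values())
    for rating in ("poor", "fair", "good")
}

# Express each rating as a whole-number percentage of all studies.
percentages = {rating: round(100 * n / total) for rating, n in per_rating.items()}

print(total, per_rating, percentages)
```

The totals come to 162 studies, with 62 poor and 80 fair, which matches the 38% and 49% given in the abstract.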
This review summarizes studies that assessed the measurement properties of physical activity questionnaires for children and adolescents under the age of 18 years. Questionnaires varied in the (sub)constructs measured, recall periods, number of questions, and format, and studies differed in the measurement properties assessed, e.g., construct validity, test–retest reliability, or measurement error. Unfortunately, most studies had low methodological quality scores and low evidence ratings, especially for construct validity. Additionally, no questionnaire was identified with both high methodological quality and positive evidence ratings for reliability and validity. Furthermore, for the majority of questionnaires there was a lack of data on both reliability and validity. Consequently, no definite conclusion regarding the most promising questionnaire can be drawn.
For adolescents, one valid questionnaire was found, i.e., the Greek version of the 3DPARecord. The 3DPARecord is a questionnaire using a segmented day structure that divides the previous 3 days (including 1 weekend day) into timeframes of 15 min each, with the adolescents reporting their activity using nine categories ranging from 1 (sleep) to 9 (vigorous physical activity and sport) for each of the timeframes.
Due to the predominantly low methodological study quality and negative evidence ratings for study results in children and preschoolers, no valid questionnaires were identified. The low methodological quality of the studies was predominantly due to a lack of a priori formulated hypotheses and the use of comparison measures with unknown or unacceptable measurement properties. Moreover, in some studies comparisons between non-corresponding constructs were made, e.g., moderate to vigorous physical activity (MVPA) measured by a questionnaire compared with total accelerometer counts.
Test–Retest Reliability and Measurement Error
For preschoolers, one reliable questionnaire was identified: the ERBs self-administered PCQ; two reliable questionnaires were identified for children: the Chinese version of the PAQ-C and the ATN questionnaire; and two for adolescents: a single-item activity measure and the web- and paper-based PAQ-C.
Many questionnaires received a positive evidence rating, but due to the low methodological quality of the studies, no definite conclusions regarding their reliability could be drawn. The low methodological quality was mainly due to inappropriate time intervals between test and retest, and the lack of a description about how missing items were handled. Unfortunately, no final evidence rating for measurement error could be computed as none of the studies provided information on the MIC.
Strengths and Limitations
A strength of this review is the separate assessment of the questionnaire quality (i.e., results for measurement properties) and the methodological quality of the study in which the questionnaire was assessed. This provides transparency in the conclusion regarding the best available questionnaires. Furthermore, data extraction and assessment of methodological quality were carried out by at least two independent researchers, minimizing the chance of bias. In addition, cross-reference searches were carried out, thereby increasing the likelihood of finding all relevant studies. However, we only included English-language studies, which may have excluded relevant studies published in other languages.
Recommendations for Future Research
Due to the methodological limitations of existing studies, we cannot draw definite conclusions on the measurement properties of physical activity questionnaires. This hampers the identification of the most suitable questionnaires for assessing physical activity in children. To improve future research we recommend the following:
Using appropriate translation methods;
Using, in validation studies, the mode of administration that is intended for use in the field;
Defining the context of use and the measurement model of the questionnaire to determine which measurement properties are relevant to examine;
For construct validity, choosing a comparison measure that measures a similar construct and formulating hypotheses a priori;
For reliability studies, ensuring that test and retest concern the same day/week when recalling a previous day/week;
Conducting more research on the responsiveness of valid and reliable questionnaires;
Building on or improving the most promising existing questionnaires rather than developing new questionnaires;
Providing open access to the examined questionnaire; and
Encouraging journal editors to request that reviewers and authors use a standardized tool such as COSMIN for studies on measurement properties.
Unfortunately, conclusive evidence for both validity and reliability was not found for any of the identified physical activity questionnaires. The lack of high-quality studies examining both the reliability and the validity of a questionnaire hampered the ability to draw definite conclusions about the best available physical activity questionnaire for children and adolescents. Thus, high-quality methodological studies examining all relevant measurement properties are highly warranted. We strongly recommend researchers adopt standardized tools, e.g., the COSMIN methodology [11, 56, 57], for the design and reporting of future studies. Researchers currently using physical activity questionnaires should keep in mind that their results may not adequately reflect children’s and adolescents’ physical activity levels, as most questionnaires lack appropriate validity and/or reliability.
Bangsbo J, Krustrup P, Duda J, Hillman C, Andersen LB, Weiss M, et al. The Copenhagen Consensus Conference 2016: children, youth, and physical activity in schools and during leisure time. Br J Sports Med. 2016;50:1177–8.
Janssen I, Leblanc AG. Systematic review of the health benefits of physical activity and fitness in school-aged children and youth. Int J Behav Nutr Phys Act. 2010;7:40.
Trost SG, McIver KL, Pate RR. Conducting accelerometer-based activity assessments in field-based research. Med Sci Sports Exerc. 2005;37:531–43.
Toftager M, Kristensen PL, Oliver M, Duncan S, Christiansen L, Boyle E, et al. Accelerometer data reduction in adolescents: effects on sample retention and bias. Int J Behav Nutr Phys Act. 2013;10:140.
Welk GJ, Corbin CB, Dale D. Measurement issues in the assessment of physical activity in children. Res Q Exerc Sport. 2000;71:59–73.
Sallis JF. Self-report measures of children’s physical activity. J Sch Health. 1991;61:215–9.
Kohl HW, Fulton JE, Caspersen CJ. Assessment of physical activity among children and adolescents: a review and synthesis. Prev Med. 2000;31:S54–76.
Chinapaw MJM, Mokkink LB, van Poppel MNM, van Mechelen W, Terwee CB. Physical activity questionnaires for youth: a systematic review of measurement properties. Sports Med. 2010;40:539–63.
Terwee CB, Jansma EP, Riphagen II, de Vet HCW. Development of a methodological PubMed search filter for finding studies on measurement properties of measurement instruments. Qual Life Res. 2009;18:1115–23.
Terwee CB, Mokkink LB, Knol DL, Ostelo RWJG, Bouter LM, de Vet HCW. Rating the methodological quality in systematic reviews of studies on measurement properties: a scoring system for the COSMIN checklist. Qual Life Res. 2012;21:651–7.
Mokkink LB, Terwee CB, Patrick DL, Alonso J, Stratford PW, Knol DL, et al. The COSMIN checklist for assessing the methodological quality of studies on measurement properties of health status measurement instruments: an international Delphi study. Qual Life Res. 2010;19:539–49.
Terwee CB. COSMIN checklist with 4-point scale. 2011. https://www.cosmin.nl. Accessed 1 Apr 2016.
Hidding LM, Altenburg TM, Mokkink LB, Terwee CB, Chinapaw MJM. Systematic review of childhood sedentary behavior questionnaires: what do we know and what is next? Sports Med. 2017;47:677–99.
de Vet HCW, Mokkink LB, Terwee CB, Hoekstra OS, Knol DL. Clinicians are right not to like Cohen’s κ. BMJ. 2013;346:f2125.
Mokkink LB, Terwee CB, Patrick DL, Alonso J, Stratford PW, Knol DL, et al. The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. J Clin Epidemiol. 2010;63:737–45.
Terwee CB, Bot SDM, de Boer MR, van der Windt DAWM, Knol DL, Dekker J, et al. Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol. 2007;60:34–42.
de Vet HCW, Terwee CB, Mokkink LB, Knol DL. Measurement in medicine: a practical guide. 1st ed. Cambridge: Cambridge University Press; 2011.
van Poppel MNM, Chinapaw MJM, Mokkink LB, van Mechelen W, Terwee CB. Physical activity questionnaires for adults. Sports Med. 2010;40:565–600.
McMurray RG, Ring KB, Treuth MS, Gregory J, Pate RR, Schmitz KH, et al. Comparison of two approaches to structured physical activity surveys for adolescents. Med Sci Sports Exerc. 2004;36:2135–43.
Pate RR, Ross R, Dowda M, Trost SG, Sirard JR. Validation of a 3-day physical activity recall instrument in female youth. Pediatr Exerc Sci. 2003;15:257–65.
Chinapaw MJM, Slootmaker SM, Schuit AJ, van Zuidam M, van Mechelen W. Reliability and validity of the Activity Questionnaire for Adults and Adolescents (AQuAA). BMC Med Res Methodol. 2009;9:58.
Slootmaker SM, Schuit AJ, Chinapaw MJM, Seidell JC, van Mechelen W, Sallis J, et al. Disagreement in physical activity assessed by accelerometer and self-report in subgroups of age, gender, education and weight status. Int J Behav Nutr Phys Act. 2009;6:17.
Scott JJ, Morgan PJ, Plotnikoff RC, Lubans DR. Reliability and validity of a single-item physical activity measure for adolescents. J Paediatr Child Health. 2015;51:787–93.
Lubans DR, Sylva K, Osborn Z. Convergent validity and test–retest reliability of the Oxford Physical Activity Questionnaire for secondary school students. Behav Change. 2008;25:23–34.
Tucker CA, Bevans KB, Teneralli RE, Smith AW, Bowles HR, Forrest CB. Self-reported pediatric measures of physical activity, sedentary behavior, and strength impact for PROMIS: conceptual framework. Pediatr Phys Ther. 2014;26:376–84.
Tucker CA, Bevans KB, Teneralli RE, Smith AW, Bowles HR, Forrest CB. Self-reported pediatric measures of physical activity, sedentary behavior, and strength impact for PROMIS: item development. Pediatr Phys Ther. 2014;26:385–92.
Kowalski KC, Crocker PRE, Faulkner RA. Validation of the physical activity questionnaire for older children. Pediatr Exerc Sci. 1997;9:174–86.
Storey KE, McCargar LJ. Reliability and validity of Web-SPAN, a web-based method for assessing weight status, diet and physical activity in youth. J Hum Nutr Diet. 2012;25:59–68.
Crocker PR, Bailey DA, Faulkner RA, Kowalski KC, McGrath R. Measuring general levels of physical activity: preliminary evidence for the physical activity questionnaire for older children. Med Sci Sports Exerc. 1997;29:1344–9.
Trost SG, Ward DS, McGraw B, Pate RR. Validity of the Previous Day Physical Activity Recall (PDPAR) in fifth-grade children. Pediatr Exerc Sci. 1999;11:341–8.
Welk GJ, Dzewaltowski DA, Hill JL. Comparison of the computerized ACTIVITYGRAM Instrument and the previous day physical activity recall for assessing physical activity in children. Res Q Exerc Sport. 2004;75:370–80.
Trost SG, Marshall AL, Miller R, Hurley JT, Hunt JA. Validation of a 24-h physical activity recall in indigenous and non-indigenous Australian adolescents. J Sci Med Sport. 2007;10:428–35.
Argiropoulou EC, Michalopoulou M, Aggeloussis N, Avgerinos A. Validity and reliability of physical activity measures in Greek high school age children. J Sports Sci Med. 2004;3:147–59.
Aggio D, Fairclough S, Knowles Z, Graves L. Validity and reliability of a modified English version of the physical activity questionnaire for adolescents. Arch Public Health. 2016;74:3.
Bervoets L, Van Noten C, Van Roosbroeck S, Hansen D, Van Hoorenbeeck K, Verheyen E, et al. Reliability and validity of the Dutch Physical Activity Questionnaires for Children (PAQ-C) and Adolescents (PAQ-A). Arch Public Health. 2014;72:47.
DiStefano C, Pate R, McIver K, Dowda M, Beets M, Murrie D. Creating a physical activity self-report form for youth using Rasch methodology. J Appl Meas. 2016;17:125–41.
Gray HL, Koch PA, Contento IR, Bandelli LN, Ang I, Di Noia J. Validity and reliability of behavior and theory-based psychosocial determinants measures, using audience response system technology in urban upper-elementary schoolchildren. J Nutr Educ Behav. 2016;48:437–52.
Saint-Maurice PF, Welk GJ. Validity and calibration of the youth activity profile. PLoS One. 2015;10:e0143949.
Tetali S, Edwards P, Murthy GVS, Roberts I. Development and validation of a self-administered questionnaire to estimate the distance and mode of children’s travel to school in urban India. BMC Med Res Methodol. 2015;15:92.
Bacardi-Gascón M, Reveles-Rojas C, Woodward-Lopez G, Crawford P, Jiménez-Cruz A. Assessing the validity of a physical activity questionnaire developed for parents of preschool children in Mexico. J Health Popul Nutr. 2012;30:439–46.
Bere E, Bjørkelund LA. Test-retest reliability of a new self reported comprehensive questionnaire measuring frequencies of different modes of adolescents commuting to school and their parents commuting to work—the ATN questionnaire. Int J Behav Nutr Phys Act. 2009;6:68.
Lee KS, Trost SG. Validity and reliability of the 3-day physical activity recall in Singaporean adolescents. Res Q Exerc Sport. 2005;76:101–6.
Wang JJ, Baranowski T, Lau WP, Chen TA, Pitkethly AJ. Validation of the Physical Activity Questionnaire for Older Children (PAQ-C) among Chinese children. Biomed Environ Sci. 2016;29:177–86.
Thomas EL, Upton D. Psychometric properties of the physical activity questionnaire for older children (PAQ-C) in the UK. Psychol Sport Exerc. 2014;15:280–7.
Zelener J, Schneider M. Adolescents and self-reported physical activity: an evaluation of the Modified Godin Leisure-Time Exercise Questionnaire. Int J Exerc Sci. 2016;9:587–98.
González-Gil EM, Mouratidou T, Cardon G, Androutsos O, De Bourdeaudhuij I, Góźdź M, et al. Reliability of primary caregivers reports on lifestyle behaviours of European pre-school children: the ToyBox-study. Obes Rev. 2014;15:61–6.
Cerin E, Sit CHP, Huang Y-J, Barnett A, Macfarlane DJ, Wong SSH. Repeatability of self-report measures of physical activity, sedentary and travel behaviour in Hong Kong adolescents for the iHealt(H) and IPEN—adolescent studies. BMC Pediatr. 2014;14:142.
Singh AS, Vik FN, Chinapaw MJM, Uijtdewilligen L, Verloigne M, Fernández-Alvira JM, et al. Test-retest reliability and construct validity of the ENERGY-child questionnaire on energy balance-related behaviours and their potential determinants: the ENERGY-project. Int J Behav Nutr Phys Act. 2011;8:136.
Gioxari A, Kavouras SA, Tambalis KD, Maraki M, Kollia M, Sidossis LS. Reliability and criterion validity of the self-administered physical activity checklist in Greek children. Eur J Sport Sci. 2013;13:105–11.
Huang YJ, Wong SHS, Salmon J. Reliability and validity of the modified Chinese version of the Children’s Leisure Activities Study Survey (CLASS) questionnaire in assessing physical activity among Hong Kong children. Pediatr Exerc Sci. 2009;21:339–53.
Malan GF, Nolte K. Measuring physical activity in South African grade 2 and 3 learners: a self-report questionnaire versus pedometer testing. S Afr J Res Sport Phys Educ Recreation. 2017;39:79–91.
Benítez-Porres J, López-Fernández I, Raya JF, Álvarez Carnero S, Alvero-Cruz JR, Álvarez Carnero E. Reliability and validity of the PAQ-C questionnaire to assess physical activity in children. J Sch Health. 2016;86:677–85.
Zaragoza Casterad J, Generelo E, Aznar S, Abarca-Sos A, Julián JA, Mota J. Validation of a short physical activity recall questionnaire completed by Spanish adolescents. Eur J Sport Sci. 2012;12:283–91.
Terwee CB, Mokkink LB, Hidding LM, Altenburg TM, van Poppel MN, Chinapaw MJM, et al. Comment on “Should we reframe how we think about physical activity and sedentary behavior measurement? Validity and reliability reconsidered”. Int J Behav Nutr Phys Act. 2016;13:66.
Terwee CB, Prinsen CAC, Chiarotto A, Westerman MJ, Patrick DL, Alonso J, et al. COSMIN methodology for evaluating the content validity of patient-reported outcome measures: a Delphi study. Qual Life Res. 2018;27:1159–70.
Prinsen CAC, Mokkink LB, Bouter LM, Alonso J, Patrick DL, de Vet HCW, et al. COSMIN guideline for systematic reviews of patient-reported outcome measures. Qual Life Res. 2018;27:1147. https://doi.org/10.1007/s11136-018-1798-3
Mokkink LB, de Vet HCW, Prinsen CAC, Patrick DL, Alonso J, Bouter LM, et al. COSMIN Risk of Bias checklist for systematic reviews of Patient-Reported Outcome Measures. Qual Life Res. 2018;27:1171. https://doi.org/10.1007/s11136-017-1765-4
Dwyer GM, Hardy LL, Peat JK, Baur LA. The validity and reliability of a home environment preschool-age physical activity questionnaire (Pre-PAQ). Int J Behav Nutr Phys Act. 2011;8:86.
Rice KR, Joschtel B, Trost SG. Validity of family child care providers’ proxy reports on children’s physical activity. Child Obes. 2013;9:393–8.
Corder K, Van Sluijs EMF, Wright A, Whincup P, Wareham NJ, Ekelund U. Is it possible to assess free-living physical activity and energy expenditure in young people by self-report? Am J Clin Nutr. 2009;89:862–70.
Sarker H, Anderson LN, Borkhoff CM, Abreo K, Tremblay MS, Lebovic G, et al. Validation of parent-reported physical and sedentary activity by accelerometry in young children. BMC Res Notes. 2015;8:735.
Määttä S, Nuutinen T, Ray C, Eriksson JG, Weiderpass E, Roos E. Validity of self-reported out-of-school physical activity among Finnish 11-year-old children. Arch Public Health. 2016;74:11.
Eisenmann JC, Milburn N, Jacobsen L, Moore SJ. Reliability and convergent validity of the Godin Leisure-Time Exercise Questionnaire in rural 5th-grade school-children. J Hum Movement Stud. 2002;43:135–49.
Ridley K, Olds TS, Hill A. The Multimedia activity recall for children and adolescents (MARCA): development and evaluation. Int J Behav Nutr Phys Act. 2006;3:10.
Ayala-Guzmán CI, Ramos-Ibáñez N, Ortiz-Hernández L. Accelerometry does not match with self-reported physical activity and sedentary behaviors in Mexican children. Bol Med Hosp Infant Mex. 2017;74:272–81.
Nascimento-Ferreira MV, De Moraes ACF, Toazza-Oliveira PV, Forjaz CLM, Aristizabal JC, Santaliesra-Pasías AM, et al. Reliability and validity of a questionnaire for physical activity assessment in South American children and adolescents: the SAYCARE study. Obesity. 2018;26:S23–30.
Colley RC, Wong SL, Garriguet D, Janssen I, Gober SC, Tremblay MS. Physical activity, sedentary behaviour and sleep in Canadian children: parent-report versus direct measures and relative associations with health risk. Health Rep. 2012;23:45–52.
Gwynn JD, Hardy LL, Wiggers JH, Smith WT, D’Este CA, Turner N, et al. The validation of a self-report measure and physical activity of Australian Aboriginal and Torres Strait Islander and non-Indigenous rural children. Aust N Z J Public Health. 2010;34:57–65.
Van Hoye A, Nicaise V, Sarrazin P. Self-reported and objective physical activity measurement by active youth. Sci Sports. 2014;29:78–87.
Tremblay MS, Inman JW, Willms JD. Preliminary evaluation of a video questionnaire to assess activity levels of children. Med Sci Sports Exerc. 2001;33:2139–44.
Moore HJ, Ells LJ, McLure SA, Crooks S, Cumbor D, Summerbell CD, et al. The development and evaluation of a novel computer program to assess previous-day dietary and physical activity behaviours in school children: the Synchronised Nutrition and Activity Program (SNAP). Br J Nutr. 2008;99:1266–74.
Harro M. Validation of a questionnaire to assess physical activity of children ages 4-8 years. Res Q Exerc Sport. 1997;68:259–68.
Bringolf-Isler B, Mäder U, Ruch N, Kriemler S, Grize L, Braun-Fahrländer C. Measuring and validating physical activity and sedentary behavior comparing a parental questionnaire to accelerometer data and diaries. Pediatr Exerc Sci. 2012;24:229–45.
Muthuri SK, Wachira LJM, Onywera VO, Tremblay MS. Direct and self-reported measures of physical activity and sedentary behaviours by weight status in school-aged children: results from ISCOLE-Kenya. Ann Hum Biol. 2015;42:239–47.
Børrestad L, Østergaard L, Andersen LB, Bere E. Associations between active commuting to school and objectively measured physical activity. J Phys Act Health. 2013;10:826–32.
Reichert FF, Menezes AMB, Araujo CL, Hallal PC. Self-reporting versus parental reporting of physical activity in adolescents: the 11-year follow-up of the 1993 Pelotas (Brazil) birth cohort study. Cad Saude Publica. 2010;26:1921–7.
Veitch J, Salmon J, Ball K. The validity and reliability of an instrument to assess children’s outdoor play in various locations. J Sci Med Sport. 2009;12:579–82.
Sithole F, Veugelers PJ. Parent and child reports of children’s activity. Health Rep. 2008;19:19–24.
Rääsk T, Lätt E, Jürimäe T, Mäestu J, Jürimäe J, Konstabel K. Association of subjective ratings to objectively assessed physical activity in pubertal boys with differing BMI. Percept Mot Skills. 2015;121:245–59.
Beltrán-Carrillo VJ, González-Cutre D, Sierra AC, Jiménez-Loaisa A, Ferrández-Asencio MÁ, Cervelló E. Concurrent and criterion validity of the 7 Day-PAR in Spanish adolescents. Eur J Hum Mov. 2016;36:88–103.
McCrorie PRW, Perez A, Ellaway A. The validity of the Youth Physical Activity Questionnaire in 12–13-year-old Scottish adolescents. BMJ Open Sport Exerc Med. 2016;2:e000163.
Rääsk T, Mäestu J, Lätt E, Jürimäe J, Jürimäe T, Vainik U, et al. Comparison of IPAQ-SF and two other physical activity questionnaires with accelerometer in adolescent boys. PLoS One. 2017;12:e0169527.
Troped PJ, Wiecha JL, Fragala MS, Matthews CE, Finkelstein DM, Kim J, et al. Reliability and validity of YRBS physical activity items among middle school students. Med Sci Sports Exerc. 2007;39:416–25.
Wang C, Chen P, Zhuang J. Validity and reliability of International Physical Activity Questionnaire-Short Form in Chinese youth. Res Q Exerc Sport. 2013;84:S80–6.
Murphy MH, Rowe DA, Belton S, Woods CB. Validity of a two-item physical activity questionnaire for assessing attainment of physical activity guidelines in youth. BMC Public Health. 2015;15:1080.
Dollman J, Stanley R, Wilson A. The concurrent validity of the 3-Day Physical Activity Recall in Australian youth. Pediatr Exerc Sci. 2015;27:262–7.
Ridgers ND, Timperio A, Crawford D, Salmon J. Validity of a brief self-report instrument for assessing compliance with physical activity guidelines amongst adolescents. J Sci Med Sport. 2012;15:136–41.
Kowalski KC, Crocker PRE, Kowalski NP. Convergent validity of the physical activity questionnaire for adolescents. Pediatr Exerc Sci. 1997;9:342–52.
Al-Hazzaa HM, Al-Sobayel HI, Musaiger AO. Convergent validity of the Arab teens lifestyle study (ATLS) physical activity questionnaire. Int J Environ Res Public Health. 2011;8:3810–20.
Gråsten A, Watt A. A comparison of self-report scales and accelerometer-determined moderate to vigorous physical activity scores of Finnish school students. Meas Phys Educ Exerc Sci. 2016;20:220–9.
Hallal PC, Reichert FF, Clark VL, Cordeira KL, Menezes AMB, Eaton S, et al. Energy expenditure compared to physical activity measured by accelerometry and self-report in adolescents: a validation study. PLoS One. 2013;8:e77036.
Stanley R, Boshoff K, Dollman J. The concurrent validity of the 3-day Physical Activity Recall questionnaire administered to female adolescents aged 12–14 years. Aust Occup Ther J. 2007;54:294–302.
Campbell N, Gaston A, Gray C, Rush E, Maddison R, Prapavessis H. The Short QUestionnaire to ASsess Health-enhancing (SQUASH) physical activity in adolescents: a validation study using doubly labeled water. J Phys Act Health. 2016;13:154–8.
Ottevaere C, Huybrechts I, De Bourdeaudhuij I, Sjöström M, Ruiz JR, Ortega FB, et al. Comparison of the IPAQ-A and Actigraph in relation to VO2max among European adolescents: the HELENA study. J Sci Med Sport. 2011;14:317–24.
Martínez-Gómez D, Calabro MA, Welk GJ, Marcos A, Veiga OL. Reliability and validity of a school recess physical activity recall in Spanish youth. Pediatr Exerc Sci. 2010;22:218–30.
Ekelund U, Neovius M, Linne Y, Rossner S. The criterion validity of a last 7-day physical activity questionnaire (SAPAQ) for use in adolescents with a wide variation in body fat: the Stockholm Weight Development Study. Int J Obes. 2006;30:1019–21.
LeBlanc AGW, Janssen I. Difference between self-reported and accelerometer measured moderate-to-vigorous physical activity in youth. Pediatr Exerc Sci. 2010;22:523–34.
Sallis JF, Buono MJ, Roby JJ, Micale FG, Nelson JA. Seven-day recall and other physical activity self-reports in children and adolescents. Med Sci Sports Exerc. 1993;25:99–108.
Tian H, Du Toit D, Toriola AL. Validation of the Children’s Leisure Activities Study Survey Questionnaire for 12-year old South African children. Afr J Phys Health Educ Recreat Dance. 2014;20:1572–86.
Telford A, Salmon J, Jolley D, Crawford D. Reliability and validity of physical activity questionnaires for children: the Children’s Leisure Activities Study Survey (CLASS). Pediatr Exerc Sci. 2004;16:64–78.
Bonn SE, Surkan PJ, Trolle Lagerros Y, Bälter K. Feasibility of a novel web-based physical activity questionnaire for young children. Pediatr Rep. 2012;4:127–9.
Treuth MS, Sherwood NE, Butte NF, McClanahan B, Obarzanek E, Zhou A, et al. Validity and reliability of activity measures in African–American Girls for GEMS. Med Sci Sports Exerc. 2003;35:532–9.
Strugnell C, Renzaho A, Ridley K, Burns C. Reliability of the modified child and adolescent physical activity and nutrition survey, physical activity (CAPANS-PA) questionnaire among Chinese–Australian youth. BMC Med Res Methodol. 2011;11:122.
Costa-Tutusaus L, Guerra-Balic M. Development and psychometric validation of a scoring questionnaire to assess healthy lifestyles among adolescents in Catalonia. BMC Public Health. 2016;16:89.
Barbosa N, Sanchez CE, Vera JA, Perez W, Thalabard J-C, Rieu M. A physical activity questionnaire: reproducibility and validity. J Sports Sci Med. 2007;6:505–18.
Rangul V, Holmen TL, Kurtze N, Cuypers K, Midthjell K, Biddle S, et al. Reliability and validity of two frequently used self-administered physical activity questionnaires in adolescents. BMC Med Res Methodol. 2008;8:47.
Liu Y, Wang M, Tynjälä J, Lv Y, Villberg J, Zhang Z, et al. Test-retest reliability of selected items of Health Behaviour in School-aged Children (HBSC) survey questionnaire in Beijing, China. BMC Med Res Methodol. 2010;10:73.
Bobakova D, Hamrik Z, Badura P, Sigmundova D, Nalecz H, Kalman M. Test–retest reliability of selected physical activity and sedentary behaviour HBSC items in the Czech Republic, Slovakia and Poland. Int J Public Health. 2014;60:59–67.
Prochaska JJ, Sallis JF, Long B. A physical activity screening measure for use with adolescents in primary care. Arch Pediatr Adolesc Med. 2001;155:554–9.
The contribution of Lisan Hidding was funded by the municipality of Amsterdam, Amsterdam Healthy Weight Programme.
Conflict of interest
Lisan Hidding, Mai Chinapaw, Mireille van Poppel, and Teatske Altenburg declare that they have no conflicts of interest. The institute of which Lidwine Mokkink is a part receives royalties for one of the references cited in this review (de Vet HCW, Terwee CB, Mokkink LB, Knol DL. Measurement in medicine: a practical guide. 1st ed. Cambridge: Cambridge University Press; 2011).
Cite this article
Hidding, L.M., Chinapaw, M.J.M., van Poppel, M.N.M. et al. An Updated Systematic Review of Childhood Physical Activity Questionnaires. Sports Med 48, 2797–2842 (2018). https://doi.org/10.1007/s40279-018-0987-0