Introduction

Measuring the physical activity (PA) of older adults is a challenge for researchers (Sun et al., 2013). A systematic review by Prince et al. (2008) identified a number of direct and indirect tools, which measured different parameters. Generally speaking, these tools can be divided into two types. Self-reported measurement offers information about how the older adult personally assesses activity intensity, whereas objective tools provide externally quantified information, such as energy expenditure, recorded steps and activity duration. Both forms of assessment offer distinct advantages and disadvantages. Objective tools measure energy expenditure or actual movement; they are generally considered more reliable because they are free of response and recall biases, and they are helpful for validating subjective measures of PA (Kowalski et al., 2012). However, they require more expense and time, and place a higher burden on both the participant and the researcher than self-reported measures. Furthermore, some measures (e.g., accelerometers, pedometers) provide limited information about activity and cannot currently measure certain forms of PA such as swimming, resistance exercise, upper body movements, cycling and complex movements (Warms, 2006).

Subjective measures relying on self-reporting are practical, easy to administer to large groups, and cost efficient. They are also generally well accepted, produce low client burden, and do not interfere with usual routines; however, either over- or underestimation may occur due to inaccurate recall, perceived social desirability and misinterpretation. In addition, existing indirect tools do not accurately measure lower intensity PA, which occurs frequently in older adults, and are susceptible to interference from a range of factors such as health status, medical conditions and medications, fatigue, pain, concentration and distractibility, and changes in mood, as well as depression, anxiety, and problems with memory and cognition (Colbert et al., 2014; Kowalski et al., 2012; Meijer et al., 2001; Shephard, 2003). Nevertheless, the International Physical Activity Questionnaire (IPAQ) is used worldwide to indirectly evaluate volumes of sedentary behavior and moderate to vigorous physical activity over the previous seven days (Craig et al., 2003).

Activity monitors, a direct measurement option, are becoming increasingly popular in studies with older adults, but they also present implementation issues (Bento et al., 2012; Davis & Fox, 2007; de Bruin et al., 2008; Garatachea et al., 2010; Murphy, 2009; Taraldsen et al., 2012). Such concerns include the variety of monitoring devices, the lack of standard protocols, and the considerable variability in the parameters obtained from the devices. Even so, among these monitors, accelerometers provide the most accurate and reliable PA data for older adults (Copeland & Esliger, 2009; Lee & Shiroma, 2014). One of the most commonly used accelerometers in older populations is the ActiGraph, which is often employed as a reference standard when evaluating other activity monitors (Straiton et al., 2018).

As no individual measurement method currently exists for comprehensive assessment of PA, Cervantes and Porretta (2010) emphasize the importance of using a range of simultaneous methods to accurately triangulate varied PA input, and Kortajarena et al. (2019) recommend a combination of both accelerometers and questionnaires to establish precise relationships between PA and cardiovascular risk in studies on older populations. Indeed, Skender et al. (2016) found such a combination to yield the most complete set of data for assessing PA.

Studies indicate that PA level among older adults differs by age, with the oldest (80–85 years) displaying as much as 50% lower activity than younger groups (Lohne-Seiler et al., 2014). Age appears to be accompanied by a shift toward spending time on low-intensity activities at the expense of moderate- and high-intensity activities (Meijer et al., 2001).

The aim of this study was to compare the results of direct and self-reported measurements of PA in three age groups of women aged over 60 years, and to determine whether direct and indirect measurements of PA depend on participant age. A secondary aim was to evaluate the relationship between direct and indirect measurements of PA across the three age groups.

Material and Methods

The study was approved by the University Ethics Committee and was performed in accordance with the Declaration of Helsinki. Prior to any testing, the participants provided their written informed consent.

Participants

The study included 200 women. The inclusion criteria comprised age 60 years or older, independence in everyday life, and no self-reported mobility limitation impacting daily activities. None of the participants had any health contraindications to physical activity; all were volunteers recruited from 18 senior centers in Warsaw which offered various forms of activity. All of the women were retired. The majority of the women had secondary education (53%), with 37.5% having completed higher education, 5% vocational education and 4.5% primary education. The participants were divided into three groups according to age, with 71 women in the youngest group (age 60–65 years), 72 in the middle group (age 66–70 years), and 57 in the oldest group (> 70 years).

Procedure

Initial screening was conducted either in person or by senior center staff. Eligible volunteers were then scheduled for an in-person session in the senior center on a Monday. A researcher explained the purpose and overall procedure of the study to the participants. After providing their signed informed consent, the participants completed a baseline questionnaire that included information on general characteristics (age, height), and their weight was measured on a weight scale while they wore lightweight sport clothing without shoes. The participants were then trained on how to use the ActiGraph GT3-BT and given written instructions and contact information for the researcher, in case they needed any assistance. At the end of the seven-day home monitoring period (the following Monday), they returned the ActiGraph GT3-BT to the senior center and completed the self-reported International Physical Activity Questionnaire (IPAQ).

Outcomes

The self-reported International Physical Activity Questionnaire (IPAQ) – short version was used to assess PA in all three groups. This questionnaire was originally developed for people under 69 years old but is commonly used to assess adults older than 69 years (Grimm et al., 2012; Wassink-Vossen et al., 2014). The psychometric properties of IPAQ for older adults are considered moderate/acceptable (Cleland et al., 2018). When administering the IPAQ in the older population (Heesch et al., 2010), caution is required regarding questions about PA, including vigorous and moderate intensity, walking and sitting time, during the previous seven days.

As walking is regarded as moderate intensity PA, the analysis combined all forms of walking into one category. Moderate-to-vigorous physical activity (MVPA) was calculated according to the IPAQ scoring protocol (minutes of walking × 3.3 METs + minutes of moderate activity × 4 METs + minutes of vigorous activity × 8 METs).
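The scoring formula above can be illustrated with a minimal Python sketch. The MET weights are those stated in the text; the function name and the example minutes are hypothetical and are not taken from the study data.

```python
# MET weights as stated in the text (walking 3.3, moderate 4, vigorous 8).
MET_WALKING = 3.3
MET_MODERATE = 4.0
MET_VIGOROUS = 8.0


def ipaq_mvpa_met_minutes(walking_min, moderate_min, vigorous_min):
    """Return MVPA in MET-minutes from self-reported minutes of activity.

    Hypothetical helper: it applies the weighted sum given in the text
    to whatever reporting period the minutes refer to.
    """
    return (
        walking_min * MET_WALKING
        + moderate_min * MET_MODERATE
        + vigorous_min * MET_VIGOROUS
    )


# Invented example: 30 min walking, 20 min moderate, 10 min vigorous.
example_score = ipaq_mvpa_met_minutes(30, 20, 10)
print(round(example_score, 1))  # 30*3.3 + 20*4 + 10*8 = 259.0
```

Because walking is weighted at 3.3 METs, a participant who reports only walking still accumulates MVPA under this scheme.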

Physical activity was measured with the ActiGraph GT3-BT (ActiGraph, LLC; Pensacola, FL, USA). The device records movement in three planes with a detection frequency set at 30 Hz. The ActiLife v6.11.8 program was used to analyze the data. Sedentary time was defined as 100 counts per minute or less (Matthews et al., 2008), low intensity as 100–2019, moderate as 2020–5998 and vigorous as above 5999 (Troiano et al., 2014). The ActiGraph is reliable and valid for assessing PA among older adults and is often used as a standard for measuring physical activity (Straiton et al., 2018). MVPA time and step data were also collected. Additionally, to allow comparison with the IPAQ, MVPA was calculated in MET-minutes per day (minutes of moderate activity × 4 METs + minutes of vigorous activity × 8 METs). Each participant wore the accelerometer at the waist for seven consecutive days with instructions to remove the device before contact with water (e.g., bathing, swimming, exercising in water). Participants recorded the times when the device was put on and taken off each day on a log provided by the research team. The participants were asked not to alter their usual activity during this seven-day period. The analysis included only days when the device was worn for at least 10 h.
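A minimal Python sketch of the accelerometer processing described above, under stated assumptions, is given here. It is not the ActiLife implementation. The function names and the example day are hypothetical; the handling of the exact boundary values (100 treated as sedentary, 5999 treated as vigorous) is an assumption, as is treating every listed minute as wear time.

```python
def classify_intensity(counts_per_minute):
    """Map one minute of accelerometer counts to an intensity category.

    Thresholds follow the cut-points quoted in the text; boundary
    handling at exactly 100 and 5999 is an assumption of this sketch.
    """
    if counts_per_minute <= 100:
        return "sedentary"
    if counts_per_minute <= 2019:
        return "low"
    if counts_per_minute <= 5998:
        return "moderate"
    return "vigorous"


def summarize_day(minute_counts):
    """Count the minutes spent in each intensity category for one day."""
    totals = {"sedentary": 0, "low": 0, "moderate": 0, "vigorous": 0}
    for cpm in minute_counts:
        totals[classify_intensity(cpm)] += 1
    return totals


def is_valid_day(wear_minutes, min_wear_hours=10):
    """A day is analyzed only if the device was worn for at least 10 h."""
    return wear_minutes >= min_wear_hours * 60


def accelerometer_mvpa_met_minutes(moderate_min, vigorous_min):
    """MVPA in MET-minutes: moderate minutes x 4 + vigorous minutes x 8."""
    return moderate_min * 4.0 + vigorous_min * 8.0


# Invented day of per-minute counts (610 minutes, all assumed wear time).
day = [50] * 400 + [500] * 180 + [3000] * 25 + [6500] * 5
summary = summarize_day(day)
if is_valid_day(len(day)):
    mvpa = accelerometer_mvpa_met_minutes(summary["moderate"], summary["vigorous"])
    print(summary)  # {'sedentary': 400, 'low': 180, 'moderate': 25, 'vigorous': 5}
    print(mvpa)     # 25*4 + 5*8 = 140.0
```

In the study itself, wear time came from the participants' logs rather than from the count series, so the validity check would take the logged minutes as input.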

Statistical analysis

The statistical analysis was performed using STATA (version 13; StataCorp, TX, USA). As the data were not normally distributed (Kolmogorov–Smirnov test), differences in general characteristics and PA data between the groups were assessed using the non-parametric Kruskal–Wallis test. The effect size was determined by calculating η² according to the formula presented by Tomczak and Tomczak (2014). Values of η² > 0.36, > 0.04 and > 0.01 (Lenhard & Lenhard, 2016) were considered to represent large, moderate and small effect sizes, respectively. Where significant differences were identified, post hoc pairwise comparisons were performed using the Mann–Whitney test with a significance level of 0.02, obtained as α/[k(k − 1)/2], where k is the number of groups, to adjust the a priori alpha for multiple comparisons. The t-test for independent samples was used to test for differences between direct (accelerometer) and indirect (IPAQ) measurements of PA intensity in all three age groups. Additionally, the effect size was determined using Cohen’s d for all statistically significant results, and d values of > 0.80, > 0.50 and > 0.20 were considered to represent large, moderate, and small effects, respectively. The relationships between accelerometer-derived and IPAQ-reported activity measures, as well as between age and PA data, were assessed using Pearson’s r, with the statistical significance set at p < 0.05.
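The effect size and alpha-adjustment calculations described above can be sketched in Python as follows. This is an illustration only and not the STATA procedure used in the study. It assumes an a priori α of 0.05 (consistent with the reported adjusted level of 0.02), the η² expression commonly attributed to Tomczak and Tomczak (2014), η² = (H − k + 1)/(n − k), and the pooled-standard-deviation form of Cohen's d. All function names and example values are hypothetical.

```python
import math


def bonferroni_alpha(alpha, k):
    """Adjusted alpha for all pairwise comparisons among k groups."""
    n_pairs = k * (k - 1) / 2
    return alpha / n_pairs


def eta_squared_h(h_statistic, k, n):
    """Eta squared from the Kruskal-Wallis H statistic.

    Assumed formula: (H - k + 1) / (n - k), with k groups and n observations.
    """
    return (h_statistic - k + 1) / (n - k)


def cohens_d(sample_a, sample_b):
    """Cohen's d for two independent samples, using the pooled SD."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    var_a = sum((x - mean_a) ** 2 for x in sample_a) / (n_a - 1)
    var_b = sum((x - mean_b) ** 2 for x in sample_b) / (n_b - 1)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (mean_a - mean_b) / pooled_sd


# Three age groups with an assumed alpha of 0.05: 0.05 / 3 comparisons.
print(round(bonferroni_alpha(0.05, 3), 4))  # 0.0167, reported as 0.02

# Invented H statistic for 3 groups and 200 participants.
print(round(eta_squared_h(12.0, 3, 200), 4))  # (12 - 3 + 1) / 197 = 0.0508

# Invented samples whose means differ by one pooled SD.
print(cohens_d([1, 2, 3], [2, 3, 4]))  # -1.0
```

With three groups there are three pairwise comparisons, which is why the adjusted level rounds to the 0.02 quoted in the text.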

Results

Statistically significant differences in height and weight were observed between the youngest and oldest groups as well as between the middle and oldest groups (p < 0.02) (Table 1). Group one (youngest) differed significantly from group three (oldest) in terms of moderate and vigorous intensity PA, MVPA, and steps per day measured by the accelerometer. Groups two (middle) and three (oldest) differed significantly with regard to low and moderate intensity PA, MVPA and steps per day measured by the accelerometer.

Table 1 General Characteristics of the Groups and Physical Activity Data

The comparison analysis results (Table 2) revealed significant differences between direct (accelerometer) and indirect (IPAQ) measurements of PA intensity (moderate, vigorous and MVPA) for all three age groups. Also, sedentary time measured by the accelerometer was significantly different at p = 0.0001 (large effect size) from sedentary time reported in the IPAQ in all three groups. Moreover, no significant correlations between subjective and objective PA measurements were observed in any of the three age groups (Table 2). For all participants, significant correlations between age and PA indices measured by the accelerometer were found for low and moderate intensity PA (r = -0.168, p = 0.020; r = -0.295, p = 0.001, respectively), MVPA (r = -0.295, p = 0.001) and number of steps per day (r = -0.291, p = 0.001) (Fig. 1).

Table 2 Accelerometer-derived and IPAQ-estimated Indices of Physical Activity and Sedentary Behavior in the Three Age Groups

Fig. 1 Significant correlations between age and low intensity physical activity (A), moderate intensity physical activity (B), MVPA (C) and number of steps per day (D)

Discussion

A significant difference was observed between objective and subjective PA assessment in women over 60 years old; however, this was not dependent on age group. Kowalski et al. (2012) suggest that measuring PA with a combination of direct and self-reported methods provides more holistic information, and hence, studies should employ multiple methods to fully evaluate PA. Indeed, our findings indicate that agreement between accelerometer-derived and IPAQ-reported PA measures was poor.

Prince et al. (2008) found the results yielded by PA self-reporting measures to be both higher and lower than those obtained by direct measurement, and emphasized the need for valid, accurate, and reliable PA measures. Additionally, while motion detectors are considered reliable PA assessment tools, standardized data collection methods and units for data reporting are needed to allow result comparison across studies (Bento et al., 2012; Taraldsen et al., 2012). A comparison of ActiGraph PA data with questionnaire data in Spanish people over 60 years old by Kortajarena et al. (2019) found that while objective methods appear more appropriate for measuring PA in women, a combination of objective and subjective methods seems to be more suitable in men.

In the present study, a decline in PA was observed with age, which was similar to findings by other authors (Lohne-Seiler et al., 2014; Meijer et al., 2001). Our findings indicate that older adults are sedentary for a majority of the day, or participate in low-intensity PA. Accelerometer data confirmed a statistically significant decrease in all types of PA (except vigorous) and steps with age. The lack of correlation between age and vigorous PA could be explained by the large standard deviation of vigorous PA, indicating high data variability and limited predictability. No correlation was observed between age and PA measured indirectly. This may be because the IPAQ included sitting questions, which do not address all types of low intensity PA. Our findings confirm those of Celis-Morales et al. (2012) and Grimm et al. (2012), indicating that sedentary behavior is greatly underestimated by the IPAQ compared to ActiGraph recording, whereas moderate and vigorous PA are overestimated by the IPAQ. Grimm et al. (2012) identified significant relationships between IPAQ walking and accelerometer moderate walking, IPAQ total PA and MVPA, and IPAQ sitting and accelerometer sedentary behavior in older adults; however, in the present study, all measured parameters differed between the indirect and direct methods and no correlation existed. The differences between the studies might be due to the fact that the present study used a triaxial ActiGraph accelerometer, whereas the other studies used the uniaxial version. While Grimm et al. (2012) confirm that a correlation exists between measured direct and indirect parameters, they conclude that the IPAQ appears to be a poor indicator of individual older adult PA behavior; even so, they suggest that it may be better suited for larger population-based samples.

Celis-Morales et al. (2012) report a divergence between objective and subjective PA measurement methods based on the ActiGraph and the IPAQ long version in a group ranging in age from 18 to 73 years. They found that using the IPAQ to determine activity measures led to significant over-reporting of PA and under-reporting of sedentary behavior compared to the accelerometer-derived measures; for example, the difference between sedentary time measured with those two methods was 13%. In contrast, the difference between these two measurement methods in our present study was almost 60%. Celis-Morales et al. (2012) also found a strong correlation between accelerometer-derived and IPAQ-reported sedentary time, concluding that the IPAQ quantified sedentary behavior more accurately than it quantified PA. In the present study, no correlation was observed between direct and indirect methods; however, this may have been influenced by the age of the study participants, as all were over 60 years. A study comparing the IPAQ long version to a commercially available wearable activity tracker in a group of Portuguese older adults found considerable variation between self-reported and direct measurements (Domingos et al., 2021). The authors conclude that although accelerometry may be a more accurate method, self-report questionnaires could provide valuable information about the context of the activity. It has also been suggested that the IPAQ long version does not offer sufficient reliability and validity for measuring sedentary behavior and MVPA in people over 60 years old (Ryan et al., 2018); as observed in our present study, it was found to underestimate sedentary behavior and overestimate MVPA. It seems that both the long and short versions of the IPAQ should be used with caution in older populations.

Several researchers have tried to adapt the IPAQ for use with older adults. Van Holle et al. (2015) assessed PA in older Belgian adults with an adapted long version of the IPAQ. The main changes in the questionnaire included combining items on vigorous PA with moderate PA and adding gait speed and recreational cycling items. Similar to our present results, they found that older adults tended to over-report their MVPA. The authors suggested including more items describing low-intensity PA in the IPAQ, as these activities are a significant part of daily life for the older adult group. Neither the long nor the short version of the IPAQ includes any questions about low-intensity PA, which might be a limitation for use with older populations. Hurtig-Wennlöf et al. (2010) adapted the short version of the IPAQ for older populations in Sweden. The main changes included providing specific examples of activities and switching the question order, to start with sitting and end with vigorous PA. Their findings indicate that the adapted IPAQ was more accurate in assessing sitting time than the original version. Cleland et al. (2018) reported that the IPAQ validity scores could be strengthened by providing additional detail for activities older adults might perform on a daily basis, potentially improving recall. Underestimation of sedentary behavior in older adults has also been reported with two other PA questionnaires (Gennuso et al., 2015). Innerd et al. (2015) evaluated PA in a very old population, aged over 85 years, using an accelerometer and a specially designed PA questionnaire. The relationship between the direct and indirect PA measurement methods was weak, even when using a questionnaire developed for this particular age group. Therefore, even standardized questionnaires dedicated to older populations do not remove the existing issues with self-reported measures, which require continued attention.

Study limitation

One limitation of the current study is that it was based on a convenience sample. Therefore, any statistical generalization must be made with caution, due to the potential self-selection bias. However, as our objective was to compare direct and indirect measurement of PA in women over 60 years old, the atypical attributes of the volunteers, such as motivation, activity level and other correlates, will potentially affect the outcomes of the two types of analyzed measurements equally. Nevertheless, the risk of bias has to be taken into account when drawing conclusions from our present findings. In addition, due to the lack of a normal distribution of the data in the three age groups, nonparametric tests were used for statistical analysis; although distribution-free tests are less powerful than the corresponding parametric tests, the sample sizes used in the present study, i.e., above 50 participants per group, may improve the power. Despite these methodological limitations, we believe our findings are of value for practitioners working with older women similar to those in our present study.

In summary, direct PA measurement indicated a decline in PA with age across all three age groups of women over 60 years old. Moreover, inconsistencies were found between the objective (accelerometer) and self-reported (IPAQ) measurements of PA intensity (moderate, vigorous and MVPA) for all three age groups. Our findings support the use of multiple PA measurement methods to provide more accurate information on older adult PA, as sedentary behavior was highly underestimated with IPAQ use, while moderate and vigorous PA were overestimated. Thus, researchers and clinicians should carefully select adequate older adult physical activity outcome measures.

Contribution

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Anna Ogonowska-Slodownik, Natalia Morgulec-Adamowicz and Malgorzata Kalbarczyk. The first draft of the manuscript was written by Anna Ogonowska-Slodownik and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.