Background

Mobility is essential to everyday life, with significant positive impacts on active aging, physical activity and quality of life in older adults [1]. Conversely, impaired mobility is an early predictor of physical disability [2]. Mobility can be achieved through various motor actions, such as locomotion (e.g. walking, running) in ambulatory people or displacement using a manual wheelchair in people with physical disabilities [3, 4]. Walking is reported as the first form of locomotion in which people engage worldwide [5]. For most individuals, walking forms the foundation for maintaining mobility and also contributes substantially to daily physical activity (through active transportation, activities of daily living and exercise), which is important for health [6, 7]. It is low-cost, accessible to most people and easily incorporated into everyday life. Running is also a low-cost form of physical activity for those who are able to do it.

In clinical settings, many common rehabilitation measures (e.g., timed walking tests, balance tests) are used to assess components of ambulatory capacity. However, many tests of ambulatory capacity have floor effects that limit their ability to detect changes in frail older adults [8]. The World Health Organization recommends using performance measures to determine the impact of disease in daily life, and to avoid the floor and ceiling effects that are often related to capacity tests [9]. Thus, wearable devices, which provide objective measures, are commonly used for walking performance assessment in research and clinical settings.

Locomotor activities can be measured by estimating distance travelled or by quantifying the number of steps (e.g., walking or running) [10]. With the growing interest in the development of technological innovations, many easily wearable devices offer the possibility to obtain these locomotion-based parameters (e.g., distance, number of steps) during walking or running in daily life [11]. Among these devices, ActiGraph is one of the most commonly used activity monitors for research in various populations [12, 13]. ActiGraph is used to quantify the volume of walking (e.g., step count and distance) in people with incomplete spinal cord injury [14], hospitalized elderly [15, 16], stroke survivors [17,18,19], children aged 10–17 years [20, 21], people with multiple sclerosis [22, 23], or in healthy people [24,25,26,27]. There are several models of ActiGraph that vary according to the type of sensors (e.g., accelerometer, gyroscope) and data processing (e.g., filter). Because walking speed may differ between older adults and younger adults, it is relevant to determine whether walking speed affects the accuracy of the results. Studies have reported that walking speed affects the accuracy of step counting and distance estimation [28,29,30]. Furthermore, results can be affected by the positioning and the model of the ActiGraph device [31,32,33].

Although several studies have used the ActiGraph as a reference system for step counting [21, 34,35,36,37,38], little is known about its psychometric properties (e.g., criterion validity). Criterion validity is an estimate of the extent to which a measure agrees with a gold standard. A recent systematic review has reported the reliability and criterion validity of commercially available wearable devices for step counting, energy expenditure and heart rate, but some devices such as ActiGraph were excluded due to the unmanageable number of returned studies following title and abstract screening [39]. Full et al. [39] reported that 133 studies out of 169 were performed in healthy people. To the best of our knowledge, no systematic review has been conducted regarding the criterion validity of ActiGraph in adults (less than 65 years) or older adults (65 years and more). The aims of this systematic review were: (1) to summarize evidence related to the criterion validity of ActiGraph devices for step counting and distance travelled in adults or older adults, and (2) to compare the criterion validity of different ActiGraph devices according to positioning, walking speed and data processing in adults or older adults.

Methods

This systematic review followed the “Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA)” guidelines [40].

Description of the ActiGraph devices

ActiGraph devices (manufactured by ActiGraph LLC, Pensacola, FL) are small and lightweight activity monitors (mass: 19–27 g; width: 33–39 mm; height: 11–37 mm; thickness: 18–46 mm) that are equipped with an accelerometer and sometimes also with a gyroscope. The accelerometer measures linear acceleration in one or three orthogonal directions. ActiGraph detects dynamic accelerations (resulting from motion) ranging from ± 3 to ± 16 g and static acceleration (e.g., force of gravity detected when stationary) depending on device types [41]. The acceleration signal is digitized by an 8- or 12-bit analog-to-digital converter at a frequency of 10–100 Hz (in multiples of 10 Hz, e.g., 10 Hz, 20 Hz) depending on the device, filtered and reported as an "activity count". The signal is filtered at a bandwidth of 0.21–2.5 Hz using a band-pass filter. ActiLife software is a post-processing environment that determines steps per epoch. There are different ActiGraph devices: the ActiGraph 7164, the ActiGraph GT1M, the ActiGraph wGTX +, the ActiGraph GT3X +/wGT3X +, the ActiGraph GT3X-BT, and the ActiGraph GT9X +. These devices differ according to the filter, the mechanism used by the sensors (e.g. piezoelectric, micro-electro-mechanical system capacitive), battery life or the addition of other sensors (e.g., gyroscope and magnetometer for the ActiGraph GT9X +). The ActiGraph wGT3X + differs from the ActiGraph GT3X + by adding a specific function for heart rate monitoring. The price varies between US$325 (+ US$349 for the ActiLife software) and US$1016 depending on the device. According to the manufacturer's recommendations, ActiGraph devices can be positioned using wristbands or elastic bands on the wrist, ankle, thigh and/or hip. Characteristics of ActiGraph types are presented in Table 1.

Table 1 Characteristics of ActiGraph devices

Search strategy

For the literature search, keywords were designed around three concepts, namely (#1) the activity monitor (“ActiGraph”), (#2) the psychometric property (“validity”) and (#3) outcomes (“distance” or “step count”). A more detailed search strategy was then developed by AMN (Armelle-Myriane Ngueleu) and CB (Corentin Barthod), including key words related to the three basic concepts and their synonyms. The search strategy was run in each database (Medline (OVID), Embase, IEEE Xplore, CINAHL, Engineering Village and Web of Science) using free and controlled terminologies. The initial search was performed on February 15, 2020 and an update was performed on August 3, 2021.

Study selection

To be included in this systematic review, the studies should have: (1) reported results pertaining to the criterion validity of an ActiGraph device, (2) analyzed variables for walking distance or step count, (3) used at least one of the following reference systems: motion capture, manual counting, video recording (for counting steps or distance), other valid device (for distance) or a predefined distance (for distance estimation), (4) included healthy participants (aged 18 and over) and (5) been published in English or French. Titles and abstracts of the identified articles were screened independently by two reviewers (AMN, CB) who determined their eligibility. In case of discrepancies, consensus was reached through discussion. In the absence of consensus, a third reviewer (CSB: Charles Sèbiyo Batcho) screened the study and new discussions took place until consensus was reached. The same procedure was used for full-text selection.

Methodological quality

The two reviewers (AMN, CB) independently conducted a quality analysis of the articles based on two quality assessment tools. First, the COSMIN grid (“consensus-based standards for the selection of health measurement instruments”) [42] was used to critically appraise the quality of the criterion validity (box H). Second, MacDermid's grid was used to evaluate the structural and methodological qualities (research questions, study design, measurements, analyses and recommendations) of the articles [43]. These two grids provide information on the overall article quality. An initial meeting was held beforehand so that the two reviewers could agree on each evaluation criterion. A second meeting was held following the independent critical appraisal by the two reviewers to reach consensus on the evaluation.

For each grid, the score was converted into a percentage. For the COSMIN grid, we assigned the value 1 to the rating "excellent" and 0 to the ratings “good”, “fair” or “poor”. The quality score for both grids was characterized as follows: Very low quality (VLQ) = 0–25%, low quality (LQ) = 26–50%, moderate quality (MQ) = 51–75% and high quality (HQ) = 76–100% [44]. Pre-consensus inter-rater agreement was calculated using the Gwet-weighted coefficient on each individual item of the COSMIN grid. The level of inter-rater agreement was defined as: poor < 0.0; slight 0.0–0.2; fair 0.21–0.4; moderate 0.41–0.6; substantial 0.61–0.8; excellent 0.81–1 [45]. An intraclass correlation coefficient (ICC) was calculated to assess inter-rater agreement on the overall MacDermid grid score. The ICC score was defined as follows: values < 0.5 indicate poor agreement, values between 0.5 and 0.75 indicate moderate agreement, values between 0.76 and 0.9 indicate good agreement, and values between 0.91 and 1 indicate excellent agreement [46].
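The scoring rules above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the handling of values that fall between two published bands (e.g., 25.5%) is an assumption, since the text lists bands with integer or two-decimal bounds only.

```python
def quality_category(score, max_score):
    """Convert a raw grid score to a percentage and a quality label.

    Bands follow the text: VLQ 0-25%, LQ 26-50%, MQ 51-75%, HQ 76-100%.
    Assumption: a percentage above a band's upper bound goes to the next band.
    """
    pct = 100.0 * score / max_score
    if pct <= 25:
        label = "VLQ"
    elif pct <= 50:
        label = "LQ"
    elif pct <= 75:
        label = "MQ"
    else:
        label = "HQ"
    return pct, label


def gwet_agreement(coefficient):
    """Label a Gwet coefficient with the agreement levels listed in the text."""
    if coefficient < 0.0:
        return "poor"
    if coefficient <= 0.2:
        return "slight"
    if coefficient <= 0.4:
        return "fair"
    if coefficient <= 0.6:
        return "moderate"
    if coefficient <= 0.8:
        return "substantial"
    return "excellent"


def icc_agreement(icc):
    """Label an ICC with the agreement levels listed in the text."""
    if icc < 0.5:
        return "poor"
    if icc <= 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"
```

For instance, a grid on which 3 of 4 items are rated 1 gives 75%, which this sketch labels MQ.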

Data extraction

Each reviewer performed a complete data extraction from the articles included in this review. The following targeted variables were extracted using a standard data extraction tool [43]: sample size, participants’ characteristics (age, body mass index), ActiGraph devices, ActiGraph positioning, reference systems, parameter evaluated (step count or distance), evaluation duration and validity index. The indices extracted for criterion validity were: accuracy (percentage), r (simple correlation coefficient), ICC (intraclass correlation coefficient), LoA (limits of agreement) and t-test.

Data analysis

For studies reporting a comparison of means, a Cohen’s d (D) was calculated (see Eq. 1) in order to quantify the size of the difference, as follows:

$$D=\frac{\text{Average of difference}}{\text{Standard deviation of gold standard}}.$$
(1)
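As a minimal sketch, Eq. 1 can be computed from the summary statistics that a validation study typically reports. The function name and the numbers in the usage note are hypothetical; the sketch also assumes that the "average of difference" equals the device mean minus the gold standard mean, which holds when both are measured on the same participants.

```python
def cohens_d_eq1(mean_device, mean_reference, sd_reference):
    """Effect size as defined in Eq. 1.

    Numerator: average difference between device and gold standard
    (assumed here to be mean_device - mean_reference).
    Denominator: standard deviation of the gold standard.
    """
    if sd_reference <= 0:
        raise ValueError("sd_reference must be positive")
    return (mean_device - mean_reference) / sd_reference
```

With made-up values of 480 steps for the device and 500 ± 40 steps for the gold standard, `cohens_d_eq1(480, 500, 40)` returns -0.5. Only the magnitude is used for interpretation; the sign indicates under- or overestimation by the device.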

The criterion validity of the ActiGraph devices was determined using three interpretation tables depending on the index reported (Pearson correlation coefficient, intra-class correlation coefficient or average comparison). A measure was considered to be valid if it had a correlation qualified as “Good” or “Excellent” according to the magnitude of r or ICC, or if the effect size was “trivial” according to Cohen’s d.

The Pearson correlation coefficient is interpreted using the Cohen scale [47]:

• < 0.3: Very low
• Between 0.3 and 0.49: Moderate
• Between 0.5 and 0.69: Good
• Between 0.7 and 1: Excellent

The intra-class correlation coefficient is interpreted using the Cicchetti scale [48]:

• < 0.4: Very low
• Between 0.4 and 0.59: Low
• Between 0.6 and 0.74: Good
• Between 0.75 and 1: Excellent

The effect size (Cohen’s d) associated with average comparisons is interpreted using the Hopkins scale [49]:

• < 0.2: Trivial
• Between 0.2 and 0.59: Low
• Between 0.6 and 1.19: Moderate
• Between 1.2 and 1.99: Important
• Between 2 and 4: Very important
• > 4: Extremely important

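The three interpretation scales and the validity rule described in this section can be sketched as follows. The function names are hypothetical, and two choices are assumptions rather than part of the published scales: the bands are treated as half-open intervals so that values between two listed bounds (e.g., 0.495) still receive a label, and the absolute value of r and d is used.

```python
def interpret_r(r):
    """Cohen scale for a Pearson correlation coefficient."""
    r = abs(r)
    if r < 0.3:
        return "Very low"
    if r < 0.5:
        return "Moderate"
    if r < 0.7:
        return "Good"
    return "Excellent"


def interpret_icc(icc):
    """Cicchetti scale for an intra-class correlation coefficient."""
    if icc < 0.4:
        return "Very low"
    if icc < 0.6:
        return "Low"
    if icc < 0.75:
        return "Good"
    return "Excellent"


def interpret_d(d):
    """Hopkins scale for the effect size of an average comparison."""
    d = abs(d)
    if d < 0.2:
        return "Trivial"
    if d < 0.6:
        return "Low"
    if d < 1.2:
        return "Moderate"
    if d < 2.0:
        return "Important"
    if d <= 4.0:
        return "Very important"
    return "Extremely important"


def is_valid(index, value):
    """Validity rule: Good/Excellent correlation, or Trivial effect size."""
    if index == "r":
        return interpret_r(value) in ("Good", "Excellent")
    if index == "icc":
        return interpret_icc(value) in ("Good", "Excellent")
    if index == "d":
        return interpret_d(value) == "Trivial"
    raise ValueError("index must be 'r', 'icc' or 'd'")
```

Under these assumptions, an ICC of 0.83 is labelled "Excellent" and counts as valid, whereas an ICC of 0.05 is labelled "Very low" and does not.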
Results

Following the literature search, 862 articles were retrieved from the six databases and 21 articles were included after removal of duplicates, screening of titles, abstracts and full-text analysis of the articles (see Fig. 1) (Additional file 1).

Fig. 1 PRISMA flow chart for systematic review of the criterion validity of ActiGraph for step count and distance

General characteristics of the studies

The total sample included 637 participants with an average age of 30.3 ± 7.5 years (for adults) and 82.7 ± 3.3 years (for older adults) (see Table 2). All the included articles reported the criterion validity of the ActiGraph devices for step counting. Only one article reported the criterion validity of an ActiGraph device for both step counting and distance. Experiments were performed in older adults (n = 4) [1, 10, 16, 50] and in adults (n = 16) [11, 22, 24, 25, 27, 51,52,53,54,55,56,57,58,59,60,61]. In one study, assessment was performed in adults and older adults [26]. Reported walking speeds ranged from 0.43 to 4.43 m.s−1 [1, 10, 16, 22, 24,25,26,27, 50,51,52,53,54, 56, 57, 59,60,61]. Thirteen articles tested walking and running speeds in sessions lasting from 2 to 15 min [22, 24,25,26,27, 51,52,53,54,55,56,57, 59], one study used a 30-min session [60] and one study used an incremental test (i.e., a speed that increased progressively) for two minutes [11]. In other studies, walking speeds were tested over a walking distance of 10 [16], 40 [61] or 100 [1] meters, or for 100 steps [10]. Two studies assessed an ActiGraph device during a full day (11.6 ± 1.5 h) [16] and three days (13 h 30 min per day) [58], respectively. A handheld tally counter and video observation were used as the gold standard in most studies, except in three papers that used the StepWatch monitor as the gold standard [16, 58, 61]. Experiments were performed in an indoor setting in all studies except two, which used an outdoor setting [58, 60].

Table 2 Characteristics of the included studies

Devices of ActiGraph used

A total of five ActiGraph devices were reported in the studies, including the ActiGraph 7164, the ActiGraph GT1M, the ActiGraph wGTX +, the ActiGraph wGT3X − BT and the ActiGraph GT3X +. Among them, the ActiGraph 7164 (n = 3) and GT1M (n = 2) devices are unidirectional and the ActiGraph wGTX + (n = 1), wGT3X − BT (n = 3) and GT3X +/wGT3X + devices are tri-directional. The ActiGraph GT3X +/wGT3X + device was the most commonly used (n = 13). The positioning of the ActiGraph devices differed across studies: hip [1, 10, 11, 16, 22, 24, 25, 27, 50,51,52,53,54,55,56,57, 59,60,61], ankle [16, 59, 61] and wrist [51, 59, 60] (see Table 2). Fourteen studies positioned the ActiGraph devices only on the hip [1, 10, 11, 22, 24, 25, 27, 50, 52,53,54,55,56,57], four studies simultaneously on the hip and wrist [26, 51, 58, 60], two studies on the hip and ankle [16, 61] and one study on the hip, wrist and ankle [59].

Methodological quality

The scores on the MacDermid critical appraisal tool ranged from 50 to 91% with a mean ± SD of 74 ± 9.8% (see Table 3). Eleven articles were classified as high-quality, nine articles as moderate-quality, and one article as low-quality. The results of the COSMIN grid are presented in Table 4 and the scores for criterion validity (box H) ranged from 50 to 100% with a mean of 74.8 ± 18.2%. Eleven articles were classified as high-quality, two articles as moderate-quality and eight articles as low-quality. All articles except one [11] failed to score on the items for sample size and detailed exclusion/inclusion criteria, which partially explains the moderate quality scores on both grids. The pre-consensus inter-rater agreement between reviewers for the total scores of the MacDermid and COSMIN grids was considered good (ICC = 0.87) and excellent (Gwet = 0.85–0.92), respectively.

Table 3 Assessment of methodological quality of studies using the MacDermid grid
Table 4 Assessment of studies examining criterion validity using COSMIN grid

Criterion validity of ActiGraph devices for step counting

Twelve studies used manual step counting as the gold standard [1, 10, 11, 22, 24, 25, 27, 50, 55,56,57, 60], six studies used video observation [26, 51,52,53,54, 59] and three studies used the StepWatch monitor [16, 58, 61] (see Table 2). In terms of validity indices, five studies compared the ActiGraph and reference system measures by determining the Pearson/Spearman correlation coefficient [1, 24, 52, 56, 59] and six studies used an intra-class correlation coefficient [11, 16, 27, 57, 58, 61]. Fourteen studies used different tests of average comparison (confidence interval, standard error of measurement, percentage of difference, percent error, mean measurement bias scores, mixed linear model, mean absolute percentage error (MAPE), Bland–Altman plots, t-tests). The results are reported by walking speed (see Table 5). Results of Cohen’s d for studies with average comparison are presented in Table 6.
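Several of the average-comparison indices named above have standard definitions that can be sketched briefly. The code below is an illustration only and is not taken from any of the included studies: the function names are hypothetical, percent error is taken relative to the reference count, and the Bland–Altman limits of agreement use the common convention of bias ± 1.96 standard deviations of the paired differences.

```python
from statistics import mean, stdev


def percent_errors(device, reference):
    """Signed percent error of each device count relative to the reference."""
    return [100.0 * (d - r) / r for d, r in zip(device, reference)]


def mape(device, reference):
    """Mean absolute percentage error (MAPE)."""
    return mean([abs(e) for e in percent_errors(device, reference)])


def bland_altman(device, reference):
    """Return (bias, lower limit, upper limit) of agreement.

    Limits are bias +/- 1.96 * SD of the paired differences
    (sample standard deviation, which is an assumption of this sketch).
    """
    diffs = [d - r for d, r in zip(device, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

A negative bias or percent error indicates that the device undercounts steps relative to the reference, which is the direction reported for most slow-walking conditions.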

Table 5 Criterion validity indices of ActiGraph types for step counting and distance in healthy adults and older adults
Table 6 Calculation of Cohen’s d

Criterion validity of ActiGraph types for distance

One study estimated distance using the ActiGraph wGT3X + in comparison with a global positioning system (GPS) for a total duration of 30 min in an outdoor setting and reported moderate criterion validity [60]. The ActiGraph wGT3X + was positioned on the hip and wrist. However, only outcomes of the hip-worn ActiGraph were reported, based on two methods for distance estimation (linear mixed models, and estimated speed multiplied by time). The linear mixed models were used to estimate distance from corresponding parameters measured by each activity monitor for each walking bout (GPS distance, hip- and wrist-worn ActiGraph total vector magnitude (VM) raw data, hip- and wrist-worn ActiGraph total VM counts and total steps). VM raw data and VM counts refer to the VM computed from the resampled raw acceleration and from the counts per second, respectively, for a given wearing position [60]. The estimated speed was based on parameters computed for each walking bout (GPS mean speed, hip- and wrist-worn ActiGraph mean VM raw data, hip- and wrist-worn ActiGraph mean VM counts and step cadence, and ankle-worn StepWatch step cadence). A walking bout was defined as a period of time in which steps occurred in successive 30-s intervals. For each method, distance was estimated from VM counts, VM raw data and total steps, where VM is defined by \(\sqrt{{x}^{2}+{y}^{2}+{z}^{2}}\) and x, y, and z represent the raw acceleration or the counts from each axis [60]. The outcomes suggest that using VM counts or VM raw data is more accurate than using steps for both distance estimation methods (linear mixed models and estimated speed multiplied by time) [60].
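The building blocks described in this paragraph can be illustrated with a short sketch. It does not reproduce the models of the cited study [60]: the function names are hypothetical, the minimum bout length of two 30-s intervals is an assumption about what "successive intervals" means, and the speed passed to the last function would in practice come from a fitted model that is not specified here.

```python
import math


def vector_magnitude(x, y, z):
    """Vector magnitude of the three axes (raw acceleration or counts)."""
    return math.sqrt(x * x + y * y + z * z)


def walking_bouts(steps_per_interval, min_intervals=2):
    """Find runs of successive 30-s intervals that contain steps.

    Returns (start, end) index pairs, end exclusive.
    Assumption: a bout needs at least `min_intervals` intervals with steps.
    """
    bouts = []
    start = None
    for i, steps in enumerate(steps_per_interval):
        if steps > 0 and start is None:
            start = i
        elif steps == 0 and start is not None:
            if i - start >= min_intervals:
                bouts.append((start, i))
            start = None
    if start is not None and len(steps_per_interval) - start >= min_intervals:
        bouts.append((start, len(steps_per_interval)))
    return bouts


def distance_from_speed(estimated_speed_m_s, duration_s):
    """Second method in the text: estimated speed multiplied by time."""
    return estimated_speed_m_s * duration_s
```

For example, `walking_bouts([0, 5, 7, 0, 3, 0, 4, 6, 8])` returns `[(1, 3), (6, 9)]` under the two-interval assumption, and a bout of 60 s at an estimated 1.2 m/s corresponds to about 72 m.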

Discussion

The main objective of this systematic review was to determine the criterion validity of ActiGraph devices for step counting and distance estimation in healthy adults and older adults. Twenty-one articles were included in this review and results showed that the ActiGraph GT3X + and wGT3X − BT yielded better criterion validity than the ActiGraph 7164, wGTX + and GT1M for step counting. One study examined the criterion validity of ActiGraph wGT3X + for the estimation of distance travelled in adults.

Studies included in this systematic review evaluated the criterion validity of ActiGraph devices for step counting and distance in adults and older adults. Five different ActiGraph devices were reported, and the ActiGraph GT3X +/wGT3X + was the most frequently reported (13 out of 21 studies) [1, 16, 25, 27, 50, 51, 54,55,56,57,58,59,60,61]. All ActiGraph devices reported in this systematic review included only an accelerometer and were assessed in an indoor setting, except in two studies (outdoor setting) [58, 60]. Furthermore, the assessment time was short (from 2 to 15 min) in most studies, with small errors. For example, Esliger et al. [52] reported an error of 5.3% over four minutes of walking (i.e. five to seven steps per minute). Thus, the results do not reflect daily use of the ActiGraph devices in an outdoor setting in healthy people.

Criterion validity according to the ActiGraph devices

For step counting

Overall, the criterion validity of ActiGraph devices differs according to the type of internal accelerometer (unidirectional or tridirectional) and the analysis algorithm. The two unidirectional ActiGraph devices (the ActiGraph 7164 and the ActiGraph GT1M) showed moderate criterion validity. Indeed, the ActiGraph 7164 was valid (≤ 5.3% error) in two studies [22, 52] and, depending on walking speed, had moderate validity (≤ 13% error) in one study [54]. The ActiGraph GT1M exhibited low to high validity depending on walking speed, both in the study by Abel et al. [24] (0.37 ≤ r ≤ 0.69) and in the study by Feito et al. [25] (− 61% ≤ difference ≤ − 1%). These results are not encouraging for the use of these two unidirectional ActiGraph devices for step counting. Regarding the tridirectional devices, the ActiGraph wGTX + was partially valid in the only study that evaluated it [26], while the validity of the ActiGraph GT3X +/wGT3X + was good in most studies, except in four studies that reported validity ranging from low to high according to walking speed [16, 27, 57, 61]. Indeed, step count validity was low at low walking speeds (≤ 0.9 m/s) and good to excellent at self-selected comfortable walking or running speeds (≤ 4.44 m/s) [16, 50, 57]. The ActiGraph wGT3X − BT yielded high criterion validity at walking speeds from 0.9 m/s to 1.3 m/s in the three studies that assessed it [11, 25, 60].

For distance estimation

Only the hip-worn ActiGraph wGT3X + was used, in one study [60]. Therefore, a comparison of the criterion validity of ActiGraph types is not possible for distance estimation. In this study, two methods were used, based on linear mixed models (method 1) and on estimated speed multiplied by time (method 2), from vector magnitude counts, vector magnitude raw data and steps. Overall, method 2 seems to yield more accurate distance estimates than method 1, regardless of the data used. However, one study is insufficient to draw a conclusion.

Criterion validity depending on filters used

Filters significantly affect the criterion validity of the ActiGraph devices. Indeed, in individuals with low walking speeds (e.g. frail elderly), the use of filters (e.g. the low frequency extension (LFE), with cutoff frequency at 4 Hz, 10 Hz) extends the bandwidth and theoretically increases sensitivity to lower intensity movements [25, 62]. The LFE thus increases the sensitivity of the accelerometer signal to low intensity movements by decreasing the proprietary amplitude threshold [61]. However, the LFE seems not to be relevant for step detection during high intensity movements [61]. Validity of the ActiGraph GT3X + was higher using the LFE (e.g. ICC = 0.83) than with default data processing (e.g. ICC = 0.05), independently of the positioning (hip, ankle), in slow walkers [16]. However, in individuals with high walking speeds, the LFE can lead to an overestimation of actual steps because a greater amount of movement artifacts is counted as steps, specifically for the wrist-worn ActiGraph [63]. The LFE does not seem to improve the accuracy of the ActiGraph wGT3X + for distance estimation in adults walking at a self-selected speed [60].

Criterion validity depending on sampling frequency

Nine studies did not report signal processing, although signal processing can affect outcomes. Among the studies that reported signal processing, the sampling frequency differed, although it always remained within the frequency range provided by the manufacturer. Indeed, the nine studies that assessed the ActiGraph GT3X + /wGT3X + reported three sampling frequencies (30 Hz, 60 Hz and 100 Hz) [1, 27, 51, 55,56,57,58, 60, 61]. Step count validity with a sampling frequency of 100 Hz was low (0.03 ≤ ICC ≤ 0.64) in the study by Riel et al. [57] and moderate (23% error) in the study by Webber et al. [1]. Three of the five studies using a sampling frequency of 30 Hz had step count validity varying from low to high (0.0 ≤ ICC ≤ 0.99 and − 50% ≤ error ≤ − 0.1%) [27, 51, 61]. In the other two studies, criterion validity was good (0.76 ≤ r ≤ 0.99 [56] and 97.8% ≤ detection rate ≤ 99.6% [60]) for step count using a sampling frequency of 30 Hz. Two studies used a sampling frequency of 60 Hz and reported moderate validity of step detection (− 32% ≤ error ≤ 14%) [55, 58]. The results of these nine studies did not indicate which sampling frequency is most appropriate for step counting.

Criterion validity depending on dynamic range

Two studies used the ActiGraph 7164 with different dynamic ranges (0.05–3.2 g and 0.5–1.25 g) and reported different accuracies [22, 52]. Indeed, in the study by Esliger et al. [52], a dynamic range of 0.5–1.25 g seemed to yield better accuracy in detecting steps. Dynamic ranges of 0.06–1.94 g and ± 6 g were each reported in only one study, for the ActiGraph GT1M [24] and GT3X + [1], respectively. No study reported the dynamic range of the ActiGraph wGT3X +. Therefore, it is difficult to assess the impact of dynamic range on the criterion validity of the ActiGraph GT1M, GT3X + and wGT3X +.

Criterion validity according to walking speed

Results showed the impact of walking or running speed on the criterion validity of ActiGraph devices for step counting. Indeed, slow walking did not allow valid step counting with the ActiGraph devices: the devices were not valid for walking speeds below 54 m min−1 (0.9 m/s) [10, 24,25,26,27, 50, 54, 57]. There is probably a speed threshold for each ActiGraph device below which step counting is no longer valid. During slow walking, the measured signal might not be sufficient to reach the proprietary threshold for step detection, since slow walking is generally characterized by low acceleration amplitude. These results are in accordance with data reported in the literature [29, 30, 39, 64]. Indeed, studies have reported low criterion validity for step counting at low walking speed using activity monitors integrating an accelerometer [10, 30, 39].
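The reasoning about amplitude thresholds can be illustrated with a toy example. The step detection algorithm of the ActiGraph is proprietary, so the sketch below is not that algorithm: it is a generic threshold-crossing counter applied to a synthetic sinusoidal signal, and the amplitudes, step rates and the 0.3 g threshold are all made-up values chosen only to show the mechanism.

```python
import math


def synthetic_walk(amplitude_g, step_rate_hz, duration_s, fs_hz=30):
    """Hypothetical acceleration trace: one sine cycle per step."""
    n = int(duration_s * fs_hz)
    return [
        amplitude_g * math.sin(2 * math.pi * step_rate_hz * i / fs_hz)
        for i in range(n)
    ]


def count_threshold_crossings(signal, threshold):
    """Count upward crossings of a fixed amplitude threshold.

    Generic illustration only; not the ActiGraph step algorithm.
    """
    count = 0
    above = False
    for a in signal:
        if a >= threshold and not above:
            count += 1
            above = True
        elif a < threshold:
            above = False
    return count


# Made-up values: comfortable walking vs. slow walking, same threshold.
normal = synthetic_walk(amplitude_g=0.5, step_rate_hz=2.0, duration_s=10)
slow = synthetic_walk(amplitude_g=0.1, step_rate_hz=1.0, duration_s=10)
normal_steps = count_threshold_crossings(normal, threshold=0.3)
slow_steps = count_threshold_crossings(slow, threshold=0.3)
```

With these synthetic values, all 20 cycles of the larger-amplitude signal are counted, whereas none of the 10 cycles of the low-amplitude signal reaches the threshold. Lowering the threshold would recover the slow steps at the cost of counting more artifacts, which is the trade-off discussed for the LFE.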

Criterion validity according to the positioning of ActiGraph devices

The criterion validity of the ActiGraph devices also differs depending on the positioning. All 21 studies positioned the ActiGraph devices on the hip. Four studies placed the ActiGraph devices on the hip and the wrist simultaneously [26, 51, 58, 60], two studies on the hip and the ankle [16, 61] and one study on the hip, wrist and ankle [59]. All studies that quantified the number of steps with wrist-worn ActiGraph devices used an average comparison and reported significant differences from the gold standard. The hip-worn ActiGraph generally showed better validity, depending on walking/running speeds and ActiGraph devices. These results can be explained by the fact that during walking or running, the upper limbs, and mainly the wrist, generate accelerations that can induce false positives or false negatives in step detection. The hip produces fewer random movements, which can reduce step detection biases. A possible reason for the under- or overestimation of steps could be a lack of specificity of signal processing algorithms to differentiate between actual steps and spurious or false positive step detection caused by the bouncing of the accelerometers on the waist belt [52]. The ankle-worn ActiGraph GT3X + and wGT3X − BT seemed to yield more valid step detection at walking speeds from 1.1 m/s to 1.6 m/s in the two studies that assessed different walking speeds (0.2 m/s to 2.7 m/s) [59, 61]. The only study that compared three wearing positions of the ActiGraph wGT3X − BT (hip, wrist and ankle) reported that, for step counting, the hip positioning was the most valid at walking speeds from 1.1 m/s to 2.7 m/s [59]. However, in the same study, the hip-worn ActiGraph wGT3X − BT was the least valid at a walking speed of 0.5 m/s [59]. Both studies that used the ActiGraph GT3X + located on the hip and ankle reported that the ankle-worn ActiGraph GT3X + yielded less error than the hip-worn ActiGraph GT3X + for step counting at walking speeds from 0.2 m/s to over 1.4 m/s [16, 61]. The number of studies and participants is too small to draw conclusions on the impact of ActiGraph positioning on the validity of results. Nonetheless, the criterion validity of ActiGraph seemed to depend on walking speed, positioning and ActiGraph type. The results reported in this systematic review are consistent with the literature on the influence of the positioning of activity monitors on the validity of step counting [65].

Strengths and weaknesses of ActiGraph devices

For step counting

This systematic review shows that the ankle-worn ActiGraph GT3X + is valid for step counting at walking speeds from 0.2 m/s to over 1.4 m/s in an indoor setting. However, only two studies have assessed the ankle-worn ActiGraph GT3X +. Step counting with the hip-worn ActiGraph wGT3X − BT also appeared valid, depending on walking speed (from 1.1 m/s to 2.7 m/s). There thus appears to be a minimum walking speed (0.9 m/s) below which some ActiGraph devices are no longer valid [22, 52]. The ActiGraph GT3X + and wGT3X − BT seem to be the most valid devices for step counting. However, all included studies were conducted in an indoor setting except two [58, 60]. Therefore, the results do not reflect daily use of the ActiGraph devices. In 19 out of 21 studies, the ActiGraph devices were assessed over short durations, with small errors. Such errors can add up to large differences over a 24-h period of use. In addition, the sample sizes of the studies were small, so the results cannot be generalized. Results showed that some ActiGraph devices were not valid at low walking speeds (< 0.9 m/s) for step counting.

For distance estimation

Furthermore, ActiGraph devices provide raw data that need to be analyzed using custom algorithms. The availability of raw data should facilitate the development of algorithms for distance estimation. Only one recent study assessed the criterion validity of the ActiGraph wGT3X + located on the hip for distance estimation, and it reported moderate results (7.4–18.8% error). Further studies should therefore be conducted to estimate distance using different ActiGraph types, ActiGraph locations and walking/running speeds.

Limitations

This systematic review focused only on studies conducted in adults and older adults to reduce the variability of reported data and thus reduce the risk of bias related to variability in walking patterns. However, other systematic reviews should be conducted to identify the psychometric properties of the ActiGraph in pathological populations (e.g., stroke survivors), which have variable walking patterns. In addition, only one study included in this systematic review focused on the criterion validity of the ActiGraph devices for the estimation of distance. This may be due to the activity monitor types and the healthy population defined in our inclusion criteria. It is also important to note a lack of studies on the recent ActiGraph GT9X +, which could offer good validity because its data analysis is based on several sensors. It should be mentioned that the studies included in this systematic review were mostly performed in an indoor setting, except for two studies. However, according to the manufacturer, the main purpose of ActiGraph devices is to collect information in the daily life of individuals (e.g., to evaluate their quality of life or physical activity practice). Several studies did not report the signal processing, sensitivity, dynamic range and analysis algorithm. The signal processing needs to be reported in studies to facilitate comparison of devices. A standardized validation protocol should be established to harmonize validation methods and enable comparison between devices. This protocol should specify the walking or running speeds, the durations and settings of assessment, the signal processing description, the device location, etc.

Conclusion

The main objective of this systematic review was to determine the criterion validity of ActiGraph devices for step counting and distance estimation in healthy adults and older adults. This review revealed a lack of studies (only one study) on the estimation of distance travelled in healthy people. The hip-worn ActiGraph wGT3X + yields moderate criterion validity for distance estimation. Regarding the criterion validity for step count, this systematic review revealed that the ActiGraph GT3X + /wGT3X + and wGT3X − BT provide outcomes that are closer to reference measures than those of earlier ActiGraph devices. Results showed that the ActiGraph GT3X + /wGT3X + and wGT3X − BT have good criterion validity for step counting (under certain conditions of walking speed, positioning and filters used).