Background

Flexible flat foot (also known as pes planus or planovalgus) in children, the appearance of a lowered medial longitudinal arch with or without rearfoot eversion [1], is one of the most frequently reported reasons to seek orthopaedic opinion [2]. Yet, in typically developing children, normative data indicate that ‘flat’ is normal up to eight years of age [3], due to age-appropriate osseous and ligamentous laxity, increased adipose tissue and immature neuromuscular control [4, 5]. Although variable, the ‘flatness’ of this foot posture reduces over the first decade of life [3, 6,7,8,9]. However, some children with a flexible flat foot posture report lower limb pain [10] and have demonstrated reduced lower limb function [11]. Furthermore, adults with flexible flat feet report significantly increased levels of back and lower limb pain [12] and reduced quality of life [13]. The challenge for health professionals is identifying when a child’s foot is, or is not, in keeping with developmental expectations, particularly in relation to foot posture and/or function, in order to reassure, monitor or intervene accordingly [14, 15]. Therefore, the measure used to indicate when a foot posture is outside the expected flatness in children (i.e. the diagnosis of flat foot) needs to be valid, reliable and appropriate for the developing foot posture typically observed.

Flat foot is diagnosed through a variety of measures, including plain film radiographs (e.g. X-ray), static foot posture measures and footprint analysis [16]. Plain film radiographs are considered the reference standard to determine flat foot magnitude; however, this method is costly, involves radiation risk, and is not routinely used in clinical practice [17]. Plain film radiographs, static postures or footprint methods allow flat foot description by analysing different angles or measures and, in many cases, comparing these to known population norms. The prevalence of paediatric flat foot has been reported as low as 0.6% and as high as 77.9% (age range 5 to 14 years and 11 months to 5 years respectively) [18, 19]. Whilst this broad variation may be explained by the changing foot posture as the child develops, there is concern that the measures of flat foot may not differentiate between what is an expected level of ‘flatness’ in children and abnormal presentations [3]. To the best of the authors’ knowledge, there is no comprehensive review of the psychometric properties of flat foot measures as they apply to the paediatric population [16].

The two core elements of psychometric properties are reliability and validity [20]. Reliability relates to the inherent variability of a foot posture measure and the error that is attributable to the rater and the tool used, expressed as the stability of the data when measured by one observer over two or more occasions (i.e. intra-rater reliability) or by two or more observers (i.e. inter-rater reliability) [21]. Validity relates to the extent to which a tool measures what it is intended to measure [21]. Validity of a foot posture measure can be expressed in several ways. For example, criterion-related validity would be the ability of one measure of flat foot to predict results of another measure of flat foot that is assumed to be valid, such as comparing a footprint index to a plain film radiograph as the reference standard [20]. Another example is construct validity, which in broad terms determines if the measure has enough ‘sensitivity’ to detect when the condition exists (e.g. a measure with high sensitivity has a low level of false-negative diagnoses), and ‘specificity’ to detect when the condition does not exist (e.g. a measure with high specificity has a low level of false-positive diagnoses) [22]. To be confident that a diagnosis of flat foot is correct, the measure used needs to be both valid and reliable for the population to which it is applied.
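As a minimal sketch of how sensitivity and specificity are calculated, the Python snippet below compares the classifications from an index measure (for example a footprint index) against a reference standard (for example a plain film radiograph), which is treated as the truth. The function name and the sample data are hypothetical and are not taken from any included study.

```python
def sensitivity_specificity(index_test, reference):
    """Return (sensitivity, specificity) for paired boolean classifications.

    Both arguments are equal-length sequences of booleans, where True means
    'flat foot'. The reference standard is treated as the truth.
    """
    if len(index_test) != len(reference):
        raise ValueError("sequences must be the same length")
    pairs = list(zip(index_test, reference))
    tp = sum(t and r for t, r in pairs)              # true positives
    fn = sum((not t) and r for t, r in pairs)        # false negatives
    tn = sum((not t) and (not r) for t, r in pairs)  # true negatives
    fp = sum(t and (not r) for t, r in pairs)        # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference needs at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # proportion of true flat feet that were detected
    specificity = tn / (tn + fp)  # proportion of non-flat feet correctly cleared
    return sensitivity, specificity


# Hypothetical data for ten feet (illustration only).
reference = [True, True, True, True, False, False, False, False, False, False]
footprint = [True, True, True, False, False, False, False, False, True, True]
sens, spec = sensitivity_specificity(footprint, reference)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this made-up sample, one missed flat foot (a false negative) lowers sensitivity, and two feet wrongly labelled flat (false positives) lower specificity.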

The primary aim of this systematic review was to investigate how paediatric foot posture is measured and how paediatric flat foot posture is defined. The secondary aim was to identify the psychometric properties of the foot posture measures used, to determine if these measures are valid and reliable for this population.

Methodology

Protocol and registration

The systematic review was guided by the PRISMA protocol [23]. The registered protocol is listed on PROSPERO, registration number: CRD42016033237.

Information sources and search strategy

The following databases were searched from inception to January 2017: MEDLINE [Ovid], CINAHL, EMBASE, The Cochrane Library, AMED, SportDiscus, PsycINFO, and Web of Science. The search terms are outlined in Table 1.

Table 1 Search terms for systematic review of the literature on flexible flat foot in paediatrics

Medical subject headings (MeSH) were exploded, combined with relevant keywords and truncated as necessary. Searches were limited to English language studies. Further studies were sought from a review of reference lists, conference proceedings and personal communications with content experts (Fig. 1). In addition, studies referenced within the final included articles that cited psychometric properties of the measures and criteria used to define flat foot were sourced (Fig. 1).

Fig. 1 Flow chart of search strategy

Eligibility criteria

Studies were included if published in peer-reviewed journals, participants were aged ≤18 years and the outcomes included a definition and measure of flat foot. Table 2 displays the full inclusion and exclusion criteria.

Table 2 Inclusion and exclusion criteria

Title, abstract and full-text screening was independently conducted by two investigators (MP, HB/SM) with a third reviewer (CW) consulted in the event of non-agreement (Fig. 1).

Critical appraisal of bias and data extraction

An a priori decision was made to include all studies meeting the criteria regardless of potential risk of bias, and to include all measures of flat foot where validity and reliability measures reached a moderate or above rating (see Data management for rating parameters) [22, 24, 25]. Data extraction was in keeping with the aims of the study and included: study design, participant age range, sample size, ethnicity/country of study, foot posture measure(s), flat foot definition and relevant psychometric data related to QAREL and the purpose-built criteria described below.

The outcomes of interest in validity studies were sensitivity, specificity and correlation with a reference standard (e.g. plain film radiographs). Validity was assessed with a purpose-built criterion (Additional file 1), covering: reported validity of the flat foot measure and definition; age (in years) of the test population; differences in the cited protocol reported and included study protocol; and, a pragmatic determination of whether validity was demonstrated for a paediatric population (yes/no/with caution). For example, a yes was assigned if a paediatric sample was used for validity testing, the study protocol matched the cited protocol and sensitivity / specificity or correlations with reference standard were moderate or above; a no would be assigned if the study population was adult or sensitivity/specificity or correlations with reference standard were below moderate. With caution was assigned if the study population had been paediatric but aspects of sensitivity/specificity or correlation with a reference standard had mixed results (Additional file 1).

The reliability outcome of interest was inter-rater agreement. Inter-rater reliability studies were appraised using the QAREL checklist [26, 27] and a purpose-built assessment (Additional file 1). The 11-item QAREL tool assesses whether: the test was evaluated in a sample of representative subjects; it was performed by raters representative of those who routinely use the measure; raters were blinded to i) the findings of other raters, ii) their own prior findings, iii) the reference standard outcomes, iv) other clinical information, and v) cues that were not part of the procedure; the order of examination was randomised; the time interval between measures was suitable; the protocol was applied appropriately; and the statistical analysis was correctly conducted. Each item was rated as yes, no, unclear or not applicable. The QAREL score is the number of items that received a ‘yes’ rating (Additional file 1). The purpose-built assessment covered five criteria: definition of flat foot used; age (in years) of the test population; differences in protocol reported between the cited and included article; inter-rater reliability measure and outcome; and a pragmatic determination of whether reliability was demonstrated for a paediatric population (yes/no/with caution). The assignment of yes/no/with caution was based on similar outcomes as for the validity ratings (Additional file 1).
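The QAREL scoring rule described above (a count of the items rated ‘yes’) can be sketched in a few lines of Python. The function name and the example ratings are hypothetical; only the 11-item length, the four rating labels and the counting rule come from the text.

```python
QAREL_RATINGS = {"yes", "no", "unclear", "not applicable"}


def qarel_score(item_ratings):
    """Return the QAREL score: the number of the 11 items rated 'yes'."""
    if len(item_ratings) != 11:
        raise ValueError("QAREL has 11 items")
    unknown = set(item_ratings) - QAREL_RATINGS
    if unknown:
        raise ValueError(f"unexpected ratings: {sorted(unknown)}")
    return sum(rating == "yes" for rating in item_ratings)


# Hypothetical ratings for one reliability study (illustration only).
example = ["yes", "yes", "no", "unclear", "not applicable", "yes",
           "yes", "no", "yes", "yes", "unclear"]
score = qarel_score(example)
print(score)
```

Note that under this rule ‘unclear’ and ‘not applicable’ items contribute nothing to the score, so a low score can reflect poor reporting as well as poor conduct.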

Two investigators independently extracted data and assessed articles against the QAREL criteria and purpose-built criteria (HB, MP/CW) with any discrepancies resolved by a fourth reviewer (SM).

Data management

Data were synthesized in table form. Correlations with reference standards and inter-rater reliability outcomes were presented as Intraclass Correlation Coefficients (ICCs) [22], kappa coefficients [27] or sensitivity and specificity data [25]. For consistency, outcomes were rated according to Fig. 2. All other responses were displayed descriptively or awarded a yes/no/with caution response. Outcomes were required to be deemed both valid and reliable to be accepted as appropriate.
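To illustrate what a kappa coefficient expresses, the sketch below computes Cohen's kappa for two raters classifying the same feet. This is one common kappa variant, chosen here as an assumption because the text does not specify which variant the included studies used. The function name and the ratings are hypothetical, and no rating band is applied because the thresholds in Fig. 2 are not reproduced here.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Return Cohen's kappa for two raters' categorical labels on the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(rater_a)
    # Observed agreement: proportion of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's label proportions, summed over labels.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical classifications of ten feet by two raters (illustration only).
rater_a = ["flat", "flat", "flat", "normal", "normal",
           "normal", "normal", "flat", "normal", "normal"]
rater_b = ["flat", "flat", "normal", "normal", "normal",
           "normal", "normal", "flat", "flat", "normal"]
kappa = cohens_kappa(rater_a, rater_b)
print(f"kappa={kappa:.3f}")
```

Here the raters agree on eight of ten feet, but kappa is lower than 0.8 because some of that agreement would be expected by chance alone.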

Fig. 2 Rating parameters applied

Due to the heterogeneity of the included studies, a meta-analysis was not conducted. Instead, a descriptive synthesis of the results was undertaken.

Results

Study selection

The search strategy identified 1101 unique titles (Fig. 1). Following screening, a total of 27 articles were included in the review.

Participants

A total of 15,301 child participants were included within the 27 studies (Table 3). Participants ranged between 3 and 18 years of age. Sample sizes ranged from 22 to 5866 (Table 3). In one study, all participants were male [28]. Four studies separated participants into overweight and normal weight groups for analysis [29,30,31,32]. Ethnicity or country of study was reported in 26 studies, representing 15 different ethnicities or countries (Table 3).

Table 3 Summary of included studies

Study design

The majority of included studies were cohort (n = 13) [30, 33,34,35,36,37,38,39,40,41,42,43,44] or cross-sectional (n = 9) [29, 45,46,47,48,49,50,51,52] in design. Of the other five included articles, three were case control [31, 32, 38], one was a case series [53], and one was a quasi-randomised controlled trial [28].

Primary findings

Foot posture measures and definitions

Across the 27 included studies, 20 foot posture measures were used, involving 40 definitions of flat foot (Table 4). Ten of the 27 studies used multiple measures of flat foot. One study featured a novel method of footprint evaluation [51]. Methodological variations existed across studies, with different parameters and angles assessed following measurement, and different methods for obtaining the footprint/angle and determining flat foot (Table 4, Additional file 2).

Table 4 Rating of reported validity and reliability for foot posture measures and definition of flexible flat foot in paediatric populations

Of the 20 foot posture measures used, six were angles measured on plain film radiographs: calcaneal pitch (or calcaneal inclination), anterior-posterior talocalcaneal (AP talocalcaneal), plantarflexion of talus, lateral talocalcaneal, calcaneal-first metatarsal and talus-first metatarsal angles (Table 4). Nine were footprint indices (Chippaux-Smirak index, Arch index, Clarke’s angle [or Footprint angle, Alpha angle], Staheli arch index, Footprint index, Martirosov’s K index, Footprint evaluation, Instep and Plantar footprint) (Table 4). There were four static foot measures (rearfoot eversion, Arch height index, Foot Posture Index–6 item version [FPI-6] and navicular height) and one plantar pressure measure (Foot Ground Pressure) (Table 4).

The Arch index was the most frequently used measure (n = 7), with the Chippaux-Smirak index and rearfoot eversion also frequently employed (n = 6 each) (Table 4). A further seven measures were used in more than one study: Clarke’s angle (n = 5); calcaneal pitch and Staheli arch index (n = 4 each); and AP talocalcaneal angle, talus-first metatarsal angle, Arch height index and FPI-6 (n = 2 each) (Table 4). Nine alternate assessment measures were each used once across the included studies: plantarflexion of talus; lateral talocalcaneal angle; calcaneal-first metatarsal angle; Footprint index; Martirosov’s K index; Instep; Plantar footprint; navicular height; and Foot Ground Pressure (Table 4).

The most commonly used flat foot definition was an Arch index ≥0.26, used four times across the 27 included studies. In three further studies, an Arch index of >0.26 was used twice and ≥0.28 once. A Chippaux-Smirak index of ≥45 and of >62.70% were each used twice. Other definitions used twice across the included studies were talus-first metatarsal angle, rearfoot eversion 5° and 4°, and a Clarke’s angle of <42° (Table 4).

Thirteen of the included 27 studies did not investigate or report the psychometric properties of the measures used to determine paediatric flat foot [30, 32, 33, 35, 36, 40, 45, 49, 50, 52,53,54,55] (Table 4), leaving 8 of the 20 foot posture measures used within this systematic review without reported validity or reliability outcomes to justify their use. Specifically, these were: the plain film radiograph measures of AP talocalcaneal angle, plantarflexion of talus, lateral talocalcaneal angle and calcaneal-first metatarsal angle; the Instep; Plantar footprint; navicular height; and Foot Ground Pressure (Table 4, Additional file 1).

Quality and appropriateness of reported psychometric properties for a paediatric population

Two studies investigated the validity of the foot posture measures used within their studies [34, 38], five studies [29, 37, 47, 48, 56] justified their choice by citing seven existing studies [57,58,59,60,61,62,63] and one study did both [31]. No foot posture measure was assessed with a ‘yes’ ranking in relation to its validity for a paediatric population (Table 4, Additional file 1). The Chippaux-Smirak index, Clarke’s angle, Staheli arch index and the FPI-6 were each ranked as relevant to a paediatric population ‘with caution’ (Table 4, Additional file 1).

The quality of the reliability testing, in relation to a paediatric population, was also limited. Four studies investigated the reliability of the measure used to determine flat foot within their studies [37, 41, 46, 47], five studies [28, 29, 34, 39, 51] justified their choice by citing seven existing studies [64,65,66,67,68,69,70] and three studies did both [31, 37, 47]. Two cited articles were not available to assess [64, 67]. The Arch index, Chippaux-Smirak index, Staheli arch index, rearfoot eversion, Arch height index and the FPI-6 received a ‘yes’ ranking for reliability in a paediatric population (Table 4, Additional file 1), with only the Chippaux-Smirak index, the Staheli arch index and rearfoot eversion reported as having almost perfect repeatability within this population (Table 4). However, alternative studies investigating the Chippaux-Smirak index, Staheli arch index, FPI-6 and Clarke’s angle were assessed as relevant to a paediatric population ‘with caution’ (Table 4, Additional file 1).

Summary of results

From the 27 studies included, data were extracted for 20 foot posture measures involving 40 definitions of flat foot within a paediatric population (Table 3). Eight of the included 27 articles investigated the reliability or validity of the flat foot measures used, six further articles justified their choice of measure by citing existing psychometric data and 13 articles neither justified nor reported psychometric properties for their measures of choice (Table 4). Seven measures, involving 11 definitions of flat foot, were determined to have reported validity or reliability specific to a paediatric population (Table 4). None of these measures had strong data to support both validity and reliability in paediatric samples, and only three were reported to have moderate or ‘with caution’ validity data and moderate or above reliability data for a paediatric population. Specifically, these three measures were the Chippaux-Smirak index of >63%, ≥59% and ≥40% (for children aged six to nine, three to seven and nine to 16 years respectively), the Staheli arch index of >1.07 and ≥1.28 (for children aged three to six and six to nine years respectively) and the FPI-6 of ≥ +6 (for children aged three to 15 years) (Table 4).

Discussion

There was a modest body of evidence reporting paediatric-specific measures of foot posture. There was no consistently used measure to determine paediatric flexible flat foot in the literature, and the choice of foot posture measure was rarely justified in relation to validity and reliability. Within the scope of this review, only three measures of flexible flat foot had any published data to support validity and reliability of the measure within a paediatric population: the Chippaux-Smirak index, Staheli arch index and the FPI-6. However, each of these measures was deemed to have limitations.

The Staheli arch and Chippaux-Smirak indices, used four and six times respectively across this review, are footprint indices based on the width of the midfoot compared to the width of the rearfoot (Staheli arch) or metatarsals (Chippaux-Smirak), when the foot is in bipedal weight-bearing relaxed stance, expressed as a ratio (Additional file 2). As the child’s arch develops with age, the ratio should decrease accordingly. This is supported by normative data [3]. The definition of flat foot for the Chippaux-Smirak index within this review did decrease with age: from 62.7% in 3 to 6 year olds to ≥40% in 9 to 16 year olds (Table 3). However, the definitions of flat foot for the Staheli arch index did not decrease as expected (e.g. >1.07 in 3 to 6 year olds and ≥1.28 in 6 to 9 year olds, Table 3). This finding is not consistent with existing normative data and suggests these definitions should be used with caution. Furthermore, concerns exist that two-dimensional indices are limited in their ability to assess a three-dimensional construct [71]. It is suggested that categorising foot posture based on footprint data disregards the complexity and multi-planar motion of the foot [3]. This greatly challenges the validity of the measures using this construct. At a minimum, these measures are reportedly influenced by the weight of the participants [72].
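The two width ratios described above can be sketched as follows. This is an illustration under stated assumptions: the function names, the millimetre measurements and the choice to express the Chippaux-Smirak index as a percentage are hypothetical, and the exact footprint landmarks used in each protocol (Additional file 2) are not reproduced. The ≥40% cut-off is the definition reported in this review for children aged 9 to 16 years.

```python
def staheli_arch_index(midfoot_width, heel_width):
    """Ratio of midfoot width to rearfoot (heel) width on the footprint."""
    if midfoot_width < 0 or heel_width <= 0:
        raise ValueError("widths must be positive (midfoot may be zero)")
    return midfoot_width / heel_width


def chippaux_smirak_index(midfoot_width, metatarsal_width):
    """Midfoot width as a percentage of the width across the metatarsals."""
    if midfoot_width < 0 or metatarsal_width <= 0:
        raise ValueError("widths must be positive (midfoot may be zero)")
    return 100.0 * midfoot_width / metatarsal_width


# Hypothetical footprint widths in millimetres (illustration only).
midfoot, heel, metatarsals = 30.0, 50.0, 80.0
sai = staheli_arch_index(midfoot, heel)
csi = chippaux_smirak_index(midfoot, metatarsals)

# Cut-off reported in this review for children aged 9 to 16 years.
CSI_CUTOFF_9_TO_16 = 40.0
is_flat = csi >= CSI_CUTOFF_9_TO_16
print(f"Staheli arch index={sai:.2f}, Chippaux-Smirak index={csi:.1f}%, flat={is_flat}")
```

Because both indices share the midfoot width as numerator, a wider midfoot contact area (a flatter footprint) raises both values, which is why a cut-off that increases with age runs against the expected developmental trend.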

The FPI-6 is a composite tool that assesses multiple components of foot posture, relative to the age of the participant, and presents an overall score ranging from −12 to +12 [73] (Additional file 2). The ‘with caution’ rating assigned to the validity of the FPI-6 was due to the results including an adult population [63]. A flat foot definition of ≥ +6 for a paediatric population is well supported in the literature in terms of normative data [3, 69, 74, 75], and the FPI-6 is considered the only flat foot scale that accommodates differences between normal weight and overweight/obese children [29]. Furthermore, only the FPI-6 was tested with a broad age range (i.e. children aged 5 to 16 years [69, 70]). Interestingly, the FPI-6 was used in only two of the included studies [28, 29], despite being the recommended foot posture measure associated with the GALLOP proforma [76] (an opinion- and evidence-based proforma for assessment of gait and lower limbs in paediatrics).
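A minimal sketch of how an FPI-6 total might be computed and compared with the ≥ +6 definition is given below. It assumes six items, each scored as an integer from −2 to +2, which is consistent with the −12 to +12 range stated above; the function names and the example scores are hypothetical.

```python
FLAT_FOOT_CUTOFF = 6  # the >= +6 definition discussed in this review


def fpi6_total(item_scores):
    """Sum six item scores, each assumed to be an integer from -2 to +2."""
    if len(item_scores) != 6:
        raise ValueError("FPI-6 requires exactly six item scores")
    for score in item_scores:
        if score not in (-2, -1, 0, 1, 2):
            raise ValueError(f"item score out of range: {score}")
    return sum(item_scores)


def is_flat_foot(item_scores, cutoff=FLAT_FOOT_CUTOFF):
    """Return True when the total score meets or exceeds the cut-off."""
    return fpi6_total(item_scores) >= cutoff


# Hypothetical item scores for one foot (illustration only).
scores = [1, 2, 1, 1, 2, 0]
total = fpi6_total(scores)
print(total, is_flat_foot(scores))
```

Because the total combines several observations rather than a single width or angle, one atypical item cannot by itself move a foot across the cut-off.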

The topic of paediatric foot posture remains controversial [39, 77], with little consensus on how this frequently observed foot type should be measured, defined or assessed. Importantly, it is acknowledged that a flat foot posture outside of expected norms may not require management. A clinician’s evaluation of the child, directed by a validated tool such as the paediatric flat foot proforma (p-FFP) [15], can assist in determining when intervention may be required. What this review has highlighted, however, is an issue central to the discourse surrounding this topic: much of the evidence that guides clinician assessment of, and intervention in, paediatric flexible flat foot is potentially based on unsubstantiated measures. It is essential that this is addressed in future research. A valid and reliable diagnosis of flat foot appropriate to the paediatric population is required to i) inform the clinician when the foot posture is not in keeping with expected development, and ii) allow research to be appropriate and clinically applicable.

Considering the difficulties associated with static footprint analysis, researchers and clinicians may need to consider the FPI-6, alternative composite tools (such as the foot mobility magnitude model [78]) or dynamic measurement to better understand paediatric foot structure. Indeed, paediatric-based studies have shown a significant difference between static structure and dynamic foot function [79], which may be of clinical relevance. As there was a paucity of dynamic measures in the included studies, further investigation may be beneficial. This extends also to a lack of understanding of the ability of these measures to detect change over time. For researchers to adequately assess development of, and intervention effects in, paediatric flexible flat foot, measures need to be robust and applicable.

There are a number of key limitations in this study. Only English language studies were included in the search strategy, and the risk of bias of the included studies was not assessed with a specific critical appraisal tool. Many of the included studies did not cite support for their choice of measure or did not cite appropriately. Indeed, many of the studies reporting existing data assumed it was obtained appropriately and was transferable to their study. For example, Villarroya et al. (2009) quoted psychometric data for the Chippaux-Smirak index from the Kanatli, Yetkin and Cila (2001) article, which relates to the validity of the Staheli arch index; and Mathieson et al. (1999) was quoted in Nikolaidou et al. (2006) even though it obtained data from an adult population. Many studies did not describe their methods or population clearly (Table 2), and two texts were unavailable to the authors [64, 67]. Therefore, these results should be interpreted accordingly. This systematic review was also limited by a paucity of literature in relation to foot posture assessment in the paediatric population. Within the limits of this study, even the reference standard measures (e.g. plain film radiographs) had little psychometric data. Although this review had a broad scope, it did not account for studies that looked solely at the psychometric properties of a measure without a definition of pes planus. Therefore, future studies may search for these measures individually. Furthermore, this systematic review process was underpinned by best practice in the conduct of systematic reviews (PRISMA); however, potential publication and language bias should be acknowledged.

Conclusion

A synthesis of available literature reveals that there is no universally accepted criterion for diagnosing abnormal paediatric flat foot, and that psychometric data for the measures and definitions used were limited. Within the limits of this review, only three measures of flexible flat foot had any published data to support validity and reliability of the measure within a paediatric population (Chippaux-Smirak index, Staheli arch index and FPI-6), each with their own limitations. Further research into valid, reliable and clinically relevant foot posture measures, including dynamic measures and the influence of age, gender and body mass on flat foot incidence, specifically for the paediatric population, is required. Furthermore, age-specific cut-off values should be further defined.