Background

The Fried Phenotype Model of Frailty (PMF) [1, 2] is characterized by cumulative decline across multiple physiologic systems resulting in decreased reserve and resistance to stressors, leading to increased risk of adverse outcomes [2]. Five components are used as an index in the 5c-PMF measurement model (5c-PMF refers herein to the measurement model of the PMF): unintentional weight loss, exhaustion, weakness, slowness, and low physical activity. Based on the number of components present, individuals are categorized as robust (no component), prefrail (1–2 components), or frail (≥ 3 components) [2]. The PMF 3-class model hypothesizes [2, 3] that individuals are homogeneous within each frailty class, but heterogeneous between classes.
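The counting rule above can be written out as a short sketch. The function name and the example respondent are hypothetical and serve only to illustrate the robust / prefrail / frail cut-offs stated in the text.

```python
def frailty_class(n_components: int) -> str:
    """Map the number of PMF components present (0-5) to a frailty class."""
    if not 0 <= n_components <= 5:
        raise ValueError("the 5c-PMF count must lie between 0 and 5")
    if n_components == 0:
        return "robust"
    if n_components <= 2:
        return "prefrail"
    return "frail"


# Hypothetical respondent: exhaustion and slowness present, the rest absent.
flags = {
    "weight_loss": False,
    "exhaustion": True,
    "weakness": False,
    "slowness": True,
    "low_activity": False,
}
print(frailty_class(sum(flags.values())))  # prefrail
```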

The 5c-PMF measurement model, based on a model of the cycle of frailty from which syndrome components were identified, is one of the most widely used instruments in clinical practice [4] and can be considered foundational in the biological approach to frailty. Structural validity of the 5c-PMF was studied, and hypotheses and statistical procedures suggested. Compared to many frailty instruments, the PMF has been extensively validated. Most instruments measuring frailty were validated only as risk assessment tools [4,5,6].

In the PMF, frailty is conceptualized as a medical syndrome [2, 3, 7], that is, a set of interwoven components [8] presenting two defining features. First, a syndrome is a manifestation of a phenotype. A population is hypothesized to be heterogeneous with at least one class of individuals presenting a significant number or all of its components, and at least one class that does not. Second, if frailty is a syndrome, frailty components are expected to aggregate within classes of a Latent Class Analysis (LCA) according to a similar gradient [3]. That is, the prevalence of frailty components is expected to increase progressively from robust to frail classes. If they co-occur, that is, if subsets of criteria occur preferentially in some of the classes only, then frailty is the result of distinct biological processes, rather than the expression of a single phenotype. Different syndromes may share subsets of components, but are differentiated from each other by a specific grouping of components.

These two features have been tested using LCA in several studies [3, 9,10,11]; all rejected the null hypothesis of K = 1-class model (homogeneity) and concluded that the 3-class heterogeneity model was the best-fitting model. However, in some of these studies, the 3-class and the 2-class models did not differ statistically using the Bayesian Information Criterion (BIC) and the chi-square goodness of fit test. Lohman et al. [11] rejected the 2-class model based on the Lo-Mendell-Rubin test. In addition, Liu et al. [10] could not reject the 4-class model.

Previous studies investigated frailty classes based on a K = 1 null hypothesis. However, rejecting K = 1-class models is different from accepting K > 1-class models. The hypothesis of similar manifestations of components among frailty classes can be tested by comparing a K > 1-categorical approximation of a continuous process for frailty (homogeneity) against the hypothesis of a “true K > 1-class” model (heterogeneity). In the former case, the classes are no more than a recoding of a continuous variable over K homogeneous categories: classification parameters are equal across classes [12]. None of the previous studies tested this null hypothesis.

Most studies investigating frailty with a priori dichotomized continuous component scores are based on adaptations of the Bandeen-Roche et al. [3] procedures. Prior dichotomization of component scores is justified on the basis that the frailty syndrome classifies individuals in at least two classes: those that present and those that do not present the syndrome [3]. However, dichotomization is a characteristic of the syndrome, not of the components of the syndrome. A priori dichotomization of scores assumes that frailty is a syndrome, before testing the hypothesis of heterogeneity of a population on the components of frailty.

The aim of this study was to investigate the construct validity of frailty as a syndrome, using Factor Mixture Models (FMM) [13] on continuous frailty component measures. Three null hypotheses stemming from the Bandeen-Roche et al. [3] theoretical framework are tested: 1. K > 1-categorical representation of a continuous process; 2. K = 1-class; and 3. frailty components are ordered differently among classes, implying multiple etiological and pathophysiological processes.

Methods

Sampling frame

Participants were from the FRéLE study (Fragilité: une étude longitudinale de ses expressions). Three databases [14,15,16] were used to estimate the distribution of frailty as in the PMF for sample size calculation. Results showed that six equal size strata (men and women in three age groups: 65–74, 75–84, and ≥ 85 years old), each with 270 respondents, were appropriate to identify frailty classes [17].

The Régie de l’assurance maladie du Québec (RAMQ - Québec public and universal health insurance program agency) database was used to select randomly the FRéLE sample (n = 1643). Community-dwelling adults, aged 65 or older, were recruited in 2010 from three areas: metropolitan (Montréal), urban (Sherbrooke), and urban-rural (Victoriaville). Some of the FRéLE baseline questions were selected from the Canadian Community Health Survey (CCHS) [18]. Comparison of the Québec CCHS and FRéLE respondents showed that FRéLE reflects basic socio-economic and health status characteristics of the elderly population across Québec [19].

The sampling frame has been described previously [17, 19]. Three panels were collected over two years. Of the 1643 respondents at baseline (T0), 84.4% participated at T1 and 88.4% at T2. Losses were due to mortality (13% over the data collection period), or to voluntary dropout and inability to locate (13%). Construct validity of the 5c-PMF measurement model was examined with T0 data only. Predictive validity of the frailty classes was estimated with T2 data and mortality over a three-year period after T0.

Measures

The five components of the PMF were assessed (Table 1) with a mix of performance tests and self-reported questions. Each measure is extensively described in Provencher et al. [19]. Self-reported unintentional weight loss of 10% of normal weight, or a loss of 4.5 kg or more, during the previous year was used to assess the 5c-PMF weight loss component. The “vitality” section of the SF-36 [20] was used to assess exhaustion. The Physical Activity Scale for the Elderly (PASE) [21], a brief and valid instrument for measuring physical activity in elderly populations, was used to measure low physical activity. Time to walk, from a stationary position, a distance of 2.44, 3 or 4 m, according to the space available at the participant’s home [22], was used to assess slowness. Finally, weakness was obtained from grip strength measured on a Martin Vigorimeter, using the American Society of Hand Therapists protocol [22].

Table 1 Frailty Components: Cut-points for the CHS and the FRéLE data sets

Frailty components

With the FMM, continuous scores on the 5c-PMF components were used. Twelve respondents were not able to do the gait speed test. This component was considered left-censored [20] and the twelve respondents were included in the analysis. Only 186 respondents lost weight. This component was defined as a Poisson variable with zero inflation [20]. Weight loss was thus represented with two items in the FMM with continuous variables: a dichotomous variable separating those with and those without weight loss, and a continuous variable with weight loss in kg for those losing weight (Table 1, last column).
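The two-item representation of weight loss can be illustrated with a minimal sketch. The function name is hypothetical, and using `None` for the amount when no weight was lost is an assumption of this example, not a description of how the data were coded in the study.

```python
from typing import Optional, Tuple


def encode_weight_loss(kg_lost: float) -> Tuple[int, Optional[float]]:
    """Split reported weight loss into (any_loss, kg_if_any_loss).

    The first item separates respondents with and without weight loss;
    the second carries the amount in kg only for those who lost weight.
    """
    if kg_lost < 0:
        raise ValueError("weight loss in kg cannot be negative")
    if kg_lost == 0:
        return 0, None  # assumption: amount left undefined when no loss
    return 1, kg_lost


print(encode_weight_loss(0.0))  # (0, None)
print(encode_weight_loss(5.5))  # (1, 5.5)
```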

Tests of the ability of the FRéLE data set to reproduce LCA PMF results used a priori dichotomized items obtained with the Bandeen-Roche et al. [3] procedures. Cut points were obtained separately for men and women. Differences in the operational definition of frailty components between the Women’s Health and Aging Studies (WHAS) and the FRéLE studies are found in Table 1. For clarity, only the CHS definitions are reproduced.

Factors associated with frailty

Several variables were used to test predictive validity: age and sex (RAMQ files); education (CCHS questions) [18]; self-reported health status (SF-36) [21]; number of chronic diseases (Functional Comorbidity Index (FCI)) [22]; depression symptoms (Geriatric Depression Scale (GDS)) [23]; cognitive functioning (Montreal Cognitive Assessment (MoCA)) [24]; disability in basic and instrumental activities of daily living (ADL, IADL; Katz et al. [25] and Lawton and Brody [26] scales); and mortality in the three-year period following the first interview, obtained from the Institut de la statistique du Québec.

Statistical analyses

Two of the conditions for the syndromic characterization of frailty are subcases of measurement-invariance conditions: K = 1-class and K > 1-categorical approximation of a continuous process [12]. These conditions are used as null hypotheses, as they impose homogeneity conditions on 5c-PMF measurement model parameters.

Within LCA, the null hypothesis of homogeneity of a population is defined by a 1-class model. However, LCA can be considered a special case within a wider family of models: factor mixture models (FMM). FMM tie a classification model (LCA) to a factor analytical model (FAM). Figure 1 shows a syndromic frailty FMM with continuous measurements. In the LCA (Fig. 1, lower part), FAM parameters are fixed to zero. Likewise, in FAM (Fig. 1, upper part), LCA parameters are excluded. In this case, the FMM parameters used to define the latent classes are null. Thus, the FAM is a model of homogeneity. Parameters generating latent classes are not defined within FAM, and FAM cannot be used to test the validity of a syndrome.

Fig. 1 Factor Mixture Model (FMM). Structural parameters for syndromic frailty model with continuous measurements. Visual presentation of factorial analytical and latent class models

Measurement-invariance conditions operationalize a set of null and non-null models useful in studying syndromes. For practical reasons, Fig. 1 shows only the structural parameters of a syndromic frailty model with continuous measurements and only one factor with factor loadings invariant over classes. The Ikc (c = 1 to 5 frailty components in k = 1, …, K classes) are “intercepts”, or classification parameters. Pk (k = 1, …, K) classes are related to each of the “c” components through the Ikc intercepts. In FMM, part of the variation of the frailty components is used to classify individuals into frailty classes (classification parameters Ikc), and part (factor loadings Fc) is attributed to unmeasured constructs, among which are syndromes that share some of the frailty components. As represented in Fig. 1, factor loadings Fc for each “c” component do not vary between classes. This factor structure can be said to be “reflective” [27, 28]. However, factor loadings in this “reflective” model are not used to classify individuals into frailty classes. Finally, in Fig. 1, residuals “ekc” are obtained on “c” components and one “ef” for the F factor.
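The structure described above can be sketched as a small data-generating simulation, in which each component score is a class intercept plus a loading times a common factor plus a residual. This is only an illustration under stated assumptions: the function name, the standard-normal factor, the normal residuals and all numeric values are hypothetical, and the sketch does not reproduce the Mplus specification used in the study.

```python
import random


def simulate_fmm(n, class_probs, intercepts, loadings, resid_sd, seed=0):
    """Draw n observations from a one-factor mixture model.

    intercepts[k][c] plays the role of Ikc, loadings[c] of Fc (kept
    invariant over classes, as in Fig. 1), and resid_sd[k][c] is the
    standard deviation of the residual ekc.
    """
    rng = random.Random(seed)
    classes = list(range(len(class_probs)))
    data = []
    for _ in range(n):
        k = rng.choices(classes, weights=class_probs)[0]  # latent class Pk
        f = rng.gauss(0.0, 1.0)  # factor score; unit variance is an assumption
        y = [
            intercepts[k][c] + loadings[c] * f + rng.gauss(0.0, resid_sd[k][c])
            for c in range(len(loadings))
        ]
        data.append((k, y))
    return data


# Hypothetical two-class example with five components.
sample = simulate_fmm(
    n=200,
    class_probs=[0.7, 0.3],
    intercepts=[[0.0] * 5, [-1.5] * 5],
    loadings=[0.5] * 5,
    resid_sd=[[1.0] * 5, [1.3] * 5],
)
print(len(sample), len(sample[0][1]))  # 200 5
```

Setting all loadings to zero in this sketch leaves only the class intercepts and residuals, which corresponds to the LCA part of the model.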

Constraints over FMM parameters define seven basic FMM models; four of them correspond to null and three to non-null hypotheses on the syndromic characterization of frailty:

  1. A one-class null model can be defined by two models:

     a. full FMM;

     b. LCA;

  2. Two null-hypothesis models are K > 1-categorical representations of a continuous process. Both imply that classification parameters and factor loadings are equal across classes:

     a. Strong measurement invariance (SoMI);

     b. Strict measurement invariance (SiMI). SiMI adds to SoMI equal residual variances across classes (Supplemental Material A);

  3. Two non-null hypotheses are available. In both cases, classification parameters vary across classes. Both are compatible with the syndromic hypothesis for frailty inasmuch as the K = 1-class model is rejected. Both can be tested with residual variances, equal (ev) or unequal (uv) across classes:

     a. Weak measurement invariance (WMI-ev and WMI-uv);

     b. Null measurement invariance (NMI-ev and NMI-uv). In the NMI, factor loadings are unequal across classes;

  4. LCA includes only the lower part of Fig. 1. LCA has equal (LCA-ev) or unequal (LCA-uv) residual variances. This is a non-null hypothesis as long as K > 1.

Given the non-rejection of LCA, WMI or NMI K > 1-class models, the ordering of the Ikc is compared among classes.
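The seven models listed above can be summarized in a small lookup structure. This is a reader's sketch rather than the authors' specification: the class and field names are hypothetical, the equal loadings assigned to WMI are inferred from the contrast drawn with NMI, and the residual-variance options (ev, uv) are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    name: str
    k_gt_1: bool           # True when the model has more than one class
    has_factor: bool       # False for LCA (only the lower part of Fig. 1)
    intercepts_vary: bool  # classification parameters differ across classes
    loadings_vary: bool    # factor loadings differ across classes

    @property
    def is_null(self) -> bool:
        # Homogeneity holds whenever classification parameters do not
        # differ across classes (a single class, or equal intercepts).
        return not self.intercepts_vary


MODELS = [
    ModelSpec("FMM, K = 1", False, True, False, False),
    ModelSpec("LCA, K = 1", False, False, False, False),
    ModelSpec("SoMI", True, True, False, False),
    ModelSpec("SiMI", True, True, False, False),  # adds equal residual variances
    ModelSpec("WMI", True, True, True, False),    # equal loadings: inferred
    ModelSpec("NMI", True, True, True, True),
    ModelSpec("LCA, K > 1", True, False, True, False),
]

null_models = [m.name for m in MODELS if m.is_null]
non_null_models = [m.name for m in MODELS if not m.is_null]
print(len(null_models), len(non_null_models))  # 4 3
```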

Eight analytical steps

First, the 3-class 5c-PMF was tested on dichotomized components with FRéLE results compared to the Bandeen-Roche et al. results [3] on the WHAS. Second, the number of factorial dimensions in 5c-PMF components was examined with Confirmatory Factor Analysis (CFA) to design the FAM within the FMM. Third, null hypotheses of K = 1-class for frailty were examined for each of the FMM and LCA models using BIC and bootstrap likelihood ratio test (BLRT) statistics. BLRT was estimated by applying the Nylund et al. [29] procedure to 1000 Monte Carlo replications. Also, acceptability of the Monte Carlo parameter estimates was examined with the Muthén & Muthén [30] procedure (Supplemental Material B). The fourth step searched for the minimum acceptable K number of classes in each of the LCA and FMM models. Fifth, the hypotheses of uv were tested against the hypotheses of ev. Sixth, differences of classification parameters between classes, and their ordering, were examined in LCA and FMM models. Seventh, the null hypothesis of K > 1-categorical representation of a continuous process was tested against the selected non-null models (Supplemental Material C). And eighth, predictive validity was investigated in the 5c-PMF and in the final FRéLE model. All analyses were conducted using Mplus (Version 8) [20].
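Model comparison by BIC, used in several of the steps above, can be sketched as follows. The formula is the standard definition of the BIC; the helper names, log-likelihoods and parameter counts are hypothetical and do not come from the study's tables. Only the sample size of 1643 is taken from the text.

```python
import math


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion: -2 logL + p ln(n); lower is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def preferred_by_bic(fits: dict, n_obs: int) -> str:
    """Return the model name with the smallest BIC.

    fits maps a model name to a (log_likelihood, n_params) pair.
    """
    return min(fits, key=lambda name: bic(*fits[name], n_obs))


# Hypothetical fits: the 3-class model gains little likelihood for 6 more parameters.
fits = {"2-class": (-5210.0, 11), "3-class": (-5200.0, 17)}
print(preferred_by_bic(fits, n_obs=1643))  # 2-class
```

The BLRT is not sketched here, since it requires refitting the mixture models on bootstrap samples.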

Results

Do FRéLE and WHAS results differ on LCA with dichotomous representation of frailty components?

Adding the frailty components that are met for each respondent, and grouping the scores in three frailty classes (robust: 0; prefrail: 1–2; frail: 3–5), frequencies of frail respondents in the WHAS, CHS [3] and FRéLE studies (Table 1) are almost the same (11.3 to 12.0%). The prefrail are a much larger proportion of respondents in the CHS (55.2%) than in the WHAS (43.8%) and the FRéLE (38.3%) data sets. FRéLE has the largest proportion of robust respondents.

Results from the LCA on FRéLE and WHAS dichotomous components are presented in Table 2. As expected, BIC for the 2-class models is smaller than BIC for the 1-class models. The BIC differences between the 3-class and 2-class models are positive, indicating a better fit for the latter. Also, with BIC, the hypothesis of no difference is not rejected in the comparison of 3-class with 2-class models for FRéLE and WHAS. However, the BLRT for FRéLE indicated a significant difference between the 3-class and the 2-class model, as in Lohman et al. [11].

Table 2 Latent Class Analysis (LCA): Results from the WHAS and the FRéLE studies

LCA with dichotomized components on the FRéLE sample yielded the same frailty classes as in WHAS, though the WHAS participants were community-dwelling American women aged 70–79 years [3] while the FRéLE sample was drawn from community-living Canadians aged 65–93 years, mainly French-speaking. Also, measures for frailty components differed somewhat in the two studies (Table 1). Using LCA procedures with dichotomized items on the FRéLE sample, the null hypothesis of a homogeneous population (that is, frailty is not a syndrome) was rejected, making the FRéLE sample an acceptable starting point for revisiting the 5c-PMF even though some components were not measured in FRéLE on the same scales as the WHAS or the CHS.

How many factors from the components?

In the second step, confirmatory FAM was run on two-factor and one-factor solutions. The likelihood ratio test (LRT) could not reject the null hypothesis of one dimension at α = 0.05 (χ² = 5.4 with 4 degrees of freedom) (Table not included).
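The reported statistic can be checked with a short calculation. For an even number of degrees of freedom the chi-square upper-tail probability has a closed form, so no statistical library is needed; the function name is hypothetical.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square with even df.

    For df = 2k, P(X > x) = exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(
        half ** j / math.factorial(j) for j in range(df // 2)
    )


p_value = chi2_sf_even_df(5.4, 4)
print(round(p_value, 3))  # 0.249
```

A p-value of about 0.25 is well above 0.05, consistent with the one-dimension hypothesis not being rejected.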

Is the null hypothesis of the 1-class model rejected in FRéLE with LCA and FMM models?

The null hypothesis of 1-class was rejected in both LCA and FMM according to BIC and BLRT (Table 3, lines 2 classes | 1 class in all subtables).

Table 3 Likelihood ratio-based tests for number of classes

How many classes?

BLRT procedures and BIC statistics were used to identify the number of classes in each of the LCA and FMM models (see Supplemental Material D). Models and results for the Monte Carlo estimation procedure are shown in Table 3. All models met the Muthén and Muthén criteria [30]. In models that were not rejected, three to four classes were identified (Table 3).

Are residuals equal over classes?

Within each model, the null hypothesis of equal variance (ev) was tested against unequal variance (uv) (Table 4, Parts A-D). Both BLRT and BIC tests rejected all equal variance models.

Table 4 Selecting the final frailty model

Do frailty components increase in the same direction across the selected LCA, WMI and NMI classes?

In the 5c-PMF 3-class LCA models with dichotomized components, class thresholds within each component were different throughout classes, and were ordered in the same direction (Table 5, Part A). In the WMI-uv and NMI-uv, components were ordered on classes in two sets: 1. exhaustion, physical activity and weight loss as P1 → P2 → P3; 2. gait speed and grip strength as P2 → P1 → P3. The robust class is P3, while individuals with the lowest scores on gait speed and grip strength are located in P2. Individuals with greater weight loss, exhaustion and low physical activity are grouped in P1. The WMI-uv and NMI-uv models are thus excluded from further investigation as components are ordered differently. These two models suggest that the five components are from two syndromes: one indicative of muscle strength (gait speed and grip strength), the other may be an expression of exhaustion captured by three components: the perception of exhaustion (low scores on SF-36 vitality subscales), physical activity (low scores on the PASE), and weight loss. In the LCA-uv model, components are hierarchically ordered as expected.
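The ordering check described above can be sketched as a comparison of class rankings across components. The class means below are hypothetical values chosen only to reproduce the two orderings reported in the text (P1 → P2 → P3 and P2 → P1 → P3); weight loss is omitted from the example, and the function names are illustrative.

```python
def class_ordering(means: dict) -> tuple:
    """Return class labels sorted from lowest to highest mean score."""
    return tuple(sorted(means, key=means.get))


def same_gradient(component_means: dict) -> bool:
    """True when every component ranks the classes in the same order."""
    orderings = {class_ordering(m) for m in component_means.values()}
    return len(orderings) == 1


# Hypothetical class means (higher = better function on every component).
component_means = {
    "vitality": {"P1": 40.0, "P2": 60.0, "P3": 75.0},
    "physical_activity": {"P1": 50.0, "P2": 90.0, "P3": 130.0},
    "gait_speed": {"P1": 0.9, "P2": 0.6, "P3": 1.1},
    "grip_strength": {"P1": 55.0, "P2": 40.0, "P3": 70.0},
}

for name, means in component_means.items():
    print(name, " -> ".join(class_ordering(means)))
print(same_gradient(component_means))  # False
```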

Table 5 Factor loadings, class intercepts, class thresholds and classes means for the Fried, the SiMI and the WMI models

Is the null hypothesis of strong measurement invariance (SoMI) rejected?

The LCA-uv model was tested against the null hypothesis represented by the SoMI model (Table 4, Part E). With both the BIC and BLRT, the SoMI model could not be rejected. That is, the null hypothesis that the population is homogeneous (suggesting that the 5c-PMF is not a measure of frailty as a syndrome) could not be rejected in the final model.

Are SoMI-uv classes associated with expected predictors and consequences of frailty?

Even though SoMI-uv classes yielded factor loadings and class intercepts of equal value within components (Table 5, Part A), they are distributed along a single, ordered latent variable.

Distributions of frailty components in the FRéLE SoMI 4-class, with continuous components, and the 5c-PMF LCA 3-class models, with dichotomized components, are shown in Table 5, Part B. Components at baseline were distributed as expected in the 5c-PMF LCA; their values increased from frail to robust classes in each case. The SoMI model replicated these results, except for involuntary weight loss in kilograms. Also, in all cases, the range of variation of components between classes was greater in the SoMI than in the 5c-PMF LCA. The results were replicated with factors associated with frailty at T2 (Table 5, Part C), except for gender. There were no gender differences between frailty classes in the 5c-PMF LCA. However, the SoMI model showed that males represent 61.4% of the lowest health status class.

The SoMI-uv classes were associated with the five frailty components at a higher level than the 5c-PMF. These classes were also associated with some of the socio-economic and health status variables usually considered in studies of predictive validity for frailty, even though they were generated from a categorization of a continuous latent variable.

Discussion

The aim of this study was to examine the validity of the measurement of the syndrome of frailty based on the five components of the PMF using FMM. Our results show that the hypothesis of homogeneity, indicating the inability to distinguish between frailty classes based on the 5c-PMF, cannot be rejected.

The 5c-PMF is a measure of frailty as a syndrome, if its components identify at least two classes among a population: those with the syndrome, and those without the syndrome. Thus, models used to investigate the 5c-PMF must test parameters generating frailty classes. Statistical models that do not have parameters that generate classes, or do not offer tests on classification parameters, are not appropriate.

The FMM framework includes classification parameters and allows testing three null homogeneity hypotheses: 1. K > 1-categorical representation of a continuous process; 2. K = 1-class model; 3. a model with subsets of components occurring in only some of the classes. All of these hypotheses are defined in terms of classification parameters. Inasmuch as two or more well-separated classes are obtained, each component will also show well-separated distributions within each class. The FMM procedure classifies observed cases according to their scores on each of the components. Also, cut points for each component can be identified. Classification parameter estimates and cut points may be obtained from different samples and their values compared (Supplemental Material E). Thus, the FMM framework can take into account variations of the manifestation of syndromes in different contexts.

In our study, one of the three null hypotheses could not be rejected – the SoMI-uv model. In effect, the estimated classification parameters in this model are equal across classes for each frailty component, the distribution of components within classes is not well-separated, and cut points appear arbitrary (Supplemental Material E). They do not meet the Bandeen-Roche et al. [3] requirement for syndromic frailty which specifies that parameters are different and ordered similarly between each class.

Variations of frailty components in categories of the SoMI-uv model were wider in scope than variations in the PMF’s robust, prefrail and frail classes. Similar results were obtained with factors (measured at T2) associated with frailty classes (measured at T0). Thus, predictive validity of the 4-categorical representation of frailty as a continuous process was higher than the predictive validity of the 5c-PMF. This suggests that a model generating frailty categories by a continuous process (the SoMI-uv model) may be a useful health status construct in population health surveys, even though frailty cannot be considered a syndrome. The 4-categorical representation of frailty is the best categorization of frailty as a continuous variable in the FRéLE study. Other health status constructs, such as self-rated health, though not a syndrome, have proven to be useful in population health studies. However, the validity and reliability of a continuous construct of frailty using the five PMF components need to be examined using recognized psychometric techniques. Reflective and / or formative models [27, 28] may offer useful conceptual schemes and operational procedures to examine the construct of frailty as a continuous variable. This examination was beyond the scope of our study.

There are a number of limitations to this study:

  1. Some of the measurements of components in FRéLE were the same as in Bandeen-Roche et al. [3], while others differed. However, LCA with dichotomized components conducted on the FRéLE sample replicated the results reported in WHAS [3];

  2. Frailty components represent specific points in the frailty biological cycle [1]. On the one hand, they are manifestations of a clinical syndrome [1]. On the other hand, etiological and pathophysiological processes are at the source of clinical manifestations [8]. Given the results of our analysis, frailty based on the five continuous components of the PMF cannot be used to classify individuals in classes of frailty as a syndrome. The refinement of measures that use biological bases for frailty may lead to a stronger theoretical rationale for sound clinical measures [31];

  3. The five-component model used in PMF to measure frailty is a clinical construct. The 4-categorical representation of frailty as a continuous process (the SoMI model) applies only to the clinical characterization of frailty with the 5c-PMF. This conclusion cannot be extended to the validity of the pathophysiological and etiological processes of frailty, as represented in the frailty cycle;

  4. One of the SoMI-uv model classes is small. This class represents individuals with low scores on all PMF components. In a larger sample or in a sample having a lower average health and physical function than the FRéLE cohort, the separation of this group from the other three may become significant enough to break the continuity of the categorical representation of the continuous process found in our study. Thus, replication of this study is needed in international cohorts with a wide range of physical function and survival rates.

Conclusion

Inasmuch as frailty is a syndrome, the frailty measure used in clinical settings or in population health surveys should at the very least differentiate individuals with the syndrome from those without the syndrome [3, 8]. However, the predictive ability of a syndrome depends on a clear understanding of the notion of syndrome, on the modelling of its etiological factors and pathophysiological processes, and on how its clinical components are linked with these processes. We have shown that the association of categories based on a single continuum of frailty (SoMI model categories) with a set of well-known correlates of adverse outcomes in old age was as strong as, or stronger than, with 5c-PMF categories. This is an example of the caution needed in using predictive validity to examine the validity of frailty measurements. Controlling for age, sex, and chronic disease, other studies have shown a specific but weak contribution of frailty to disability [32]. These results are an illustration of the Xue et al. [6] injunction “… to move beyond predictive validity to examine consistency of frailty diagnosis and its implication …”. Without these features, addressing frailty in a clinical setting may not improve patient health [33]. Thus, research may have to focus on frailty as a biological entity, examine its etiological and pathophysiological basis and its consistency as a diagnosis. Without a sound basis, the search for valid and reliable measurement tools for public health and clinical practice may be a frustrating endeavor that produces a collection of measures [34] of detrimental states resulting in detrimental consequences.