Investigation of the Factor Structure and Differential Item Functioning of the Child and Adolescent Mindfulness Measure (CAMM): Analysis of Data from a School-Based Cluster Randomised Controlled Trial

This study used data from a randomised controlled trial of a school-based mindfulness programme in the UK to investigate the structure and performance of the 10-item Child and Adolescent Mindfulness Measure (CAMM). The study included 7924 children and adolescents aged 11 to 14 years. Participants provided CAMM data at pre-intervention, 7 months (post-intervention) and 1 year. Exploratory factor analysis (EFA) of pre-intervention data was undertaken. Multiple indicators multiple causes (MIMIC) models were fitted to pre-intervention responses to investigate differential item functioning across groups defined by gender, year group and ethnicity. Response shift resulting from receiving the mindfulness programme was investigated by fitting MIMIC models to compare item functioning between the intervention and control arms. EFA results indicated that the 2-factor model was a good fit. Eight items were associated with the first factor, while the remaining two items, which specifically addressed avoiding unwanted thoughts and feelings, were associated with the second factor. MIMIC model findings indicated that girls scored lower (ostensibly less mindful) on 4 items than boys that had the same latent level of mindfulness; as a result of receiving the mindfulness programme, participants scored lower on one item (“At school, I walk from class to class without noticing what I’m doing”) after holding latent level of mindfulness constant. Findings indicate that the CAMM has a 2-factor structure in the UK in late childhood and early adolescence. While we did observe some differences in how individual items performed across groups, these differences were small compared to the overall variability in the CAMM scores. Current controlled trials ISRCTN86619085.

The last four decades have seen a growth in research interest in the concept of mindfulness, the development of interventions for increasing mindfulness and the creation of measures for quantifying it. Mindfulness has been defined as "the awareness that emerges through paying attention on purpose, in the present moment, and non-judgmentally to the unfolding of experience moment by moment" (Kabat-Zinn, 2003). Mindfulness-based interventions (MBIs) have been developed to improve mental and physical health outcomes by targeting the regulation of emotional and coping processes. MBI techniques encourage intentionality and a shift of focus to the present moment, and encourage the individual to accept negative and unpleasant experiences without engaging with or actively trying to change them (Perry-Parrish et al., 2016).
MBIs such as mindfulness-based stress reduction (MBSR; Kabat-Zinn, 1982) and mindfulness-based cognitive therapy (MBCT; Teasdale et al., 2000) can improve a broad range of outcomes in adults, including executive functioning, depression, anxiety (Dunning et al., 2019), insomnia (Gross et al., 2011) and irritable bowel syndrome (Zernicke et al., 2013). These MBIs have been adapted for children and adolescents by shortening the mindfulness techniques and using more age-appropriate language and activities (Perry-Parrish et al., 2016). A recent meta-analysis of MBIs delivered in school settings with adolescents aged between 12 and 18 years found statistically significant but small effects on stress, depression and anxiety (Fulambarkar et al., 2023). Similarly, a meta-analysis by Dunning et al. (2022) found that compared to passive control groups, MBIs were effective at improving anxiety, attention, executive function, and negative and social behaviour at post-intervention, but there was no evidence of sustained benefits at follow-up. Although evidence for the success of MBIs in both child and adolescent populations is growing, there is a need for further investigation into their long-term effects (Kalmar et al., 2022).
The availability of reliable and valid instruments for assessing mindfulness level in children and adolescents is a prerequisite for investigating the effects of MBIs on mindfulness skills and identifying the mechanism through which MBIs impact on distal outcomes (Goodman et al., 2017). Several measures have been developed to assess mindfulness in children and adolescents (Goodman et al., 2017; Pallozzi et al., 2017). These include unidimensional measures that provide a global indication of mindfulness (the Child and Adolescent Mindfulness Measure, CAMM, Greco et al., 2011; the Mindful Attention Awareness Scale for Adolescents, MAAS-A, Brown et al., 2011; and the Mindful Attention Awareness Scale for Children, MAAS-C, Lawlor et al., 2014) and multidimensional measures that allow for a more nuanced understanding of the complexity of mindfulness (the Mindful Thinking and Action Scale for Adolescents, MTASA, West, 2008; the Mindfulness Inventory for Children and Adolescents, MICA, Briere, 2011; the Mindfulness Scale for Pre-Teens, Teens, and Adults, MSPTA, Droutman, 2015; the Comprehensive Inventory of Mindfulness Experiences-Adolescents, CHIME-A, Johnson et al., 2017; the Mindful Student Questionnaire, MSQ, Renshaw, 2017; the Five Facet Mindfulness Questionnaire Adolescent-Short Form, FFMQ-A-SF, Cortazar et al., 2020; and the Self-Compassion Scale-Youth Version, SCS-Youth, Neff et al., 2021). With the exception of the MICA, all of these measures have been psychometrically evaluated, including the factor structure, internal consistency and convergent validity (Goodman et al., 2017; Pallozzi et al., 2017).
The CAMM is one of the quicker measures to administer (Goodman et al., 2017) and has the potential advantage over other short measures (MAAS-A and MAAS-C) of being developed specifically for use with children and adolescents, rather than adapted from an existing adult measure. It also has a low reading grade level compared to other measures (Pallozzi et al., 2017). The CAMM is a trait-based self-report measure of "present moment awareness, and non-judgmental, non-avoidant responses to thoughts and feelings" (Greco et al., 2011). It was developed and validated using a US sample of children and adolescents aged 10 to 17 years. It comprises items encompassing the aspects of "acting with awareness" and "accepting without judgement" (Greco et al., 2011). It is one of the most commonly used mindfulness measures (Pallozzi et al., 2017) and has been psychometrically evaluated on non-clinical samples in several countries and using different language versions (Baumann et al., 2022; Chiesi et al., 2017; Cordeiro et al., 2022; Cunha et al., 2013; de Bruin et al., 2014; Dion et al., 2018; García-Rubio et al., 2019; Greco et al., 2011; Guerra et al., 2019; Kuby et al., 2015; Limpo et al., 2022; Mohsenabadi et al., 2020; Prenoveau et al., 2018; Ristallo et al., 2016; Roux et al., 2019; Saggino et al., 2017; Theofanous et al., 2020; Vinas et al., 2015; Wang et al., 2018).
Although based on a multidimensional conceptualisation of mindfulness proposed by developers of the adult Kentucky Inventory of Mindfulness Skills (KIMS; Baer et al., 2004), most of the research on the CAMM, including the initial development and validation study (Greco et al., 2011), has provided evidence of a single-factor structure. Some studies have, however, provided evidence that there are two underlying factors (Cordeiro et al., 2022; de Bruin et al., 2014; Mohsenabadi et al., 2020; Ristallo et al., 2016; Wang et al., 2018). Furthermore, some of the studies that endorsed the single-factor structure found that specific items had weak factor loadings and/or needed to be removed to achieve adequate fit for the factor analysis model (Baumann et al., 2022; Cunha et al., 2013; García-Rubio et al., 2019; Guerra et al., 2019; Limpo et al., 2022; Roux et al., 2019; Saggino et al., 2017; Vinas et al., 2015). The variation in the factor structure of the CAMM across different settings warrants further examination.
As the CAMM is used across different subgroups of children and adolescents (e.g. based on gender, age and ethnic group), it is important that it has the property of measurement invariance; that is, each item carries the same meaning and interpretation when completed by people from different groups (Brown, 2015). Chiesi et al. (2017) found the CAMM to be invariant overall across gender and age in an Italian sample aged 11-18 years. The 8-item Italian version of the CAMM  was invariant between boys and girls. Theofanous et al. (2020), in their Cyprus-based study of Greek-speaking adolescents (mean age 16 years) and young adults (mean age 22 years), found the CAMM to be invariant between those groups and with respect to gender. To our knowledge, no research has been undertaken on the invariance of the CAMM across ethnic groups. Previously, Prenoveau et al. (2018) found a 1-factor structure had adequate internal consistency (Cronbach's alpha (α) = 0.88) when used in a sample mostly comprised of adolescents from minority racial groups (predominantly African American) based in low-income environments, but Wang et al. (2018) found evidence of a 2-factor structure in their study of fifth grade students, three quarters of whom identified as belonging to a racial minority group (including 56% as Hispanic).
In addition to the need for the CAMM to be invariant across subgroups, for the measure to be credible for quantifying real change in mindfulness level, there should be no response shift in the interpretation of the items; in other words, it should be invariant over time. The presence of response shift bias means that the responders' understanding of the construct being measured is different across time points, such as pre- and post-test, and as a result the construct being measured may not be the same or may have different structural components (Goodman et al., 2017). The only study on time invariance of the CAMM found it to be invariant over a 4-month period in a sample of Greek-speaking participants (mean age 16 years) (Theofanous et al., 2020).
Mindfulness-based interventions for improving mental health of children and adolescents are increasingly being evaluated in randomised controlled trials (Dunning et al., 2019). Valid measures are required in such studies to evaluate the impact of the intervention on mindfulness and to evaluate mindfulness as a mediator of the intervention effect on the distal mental health outcomes (Goodman et al., 2017). As well as being time-invariant under natural conditions, the meaning and interpretation of the CAMM needs to be invariant between participants that do and those that do not receive the intervention in order to obtain unbiased estimates of effect (Bartos et al., 2023). It has been noted that changes in the conceptualisation of mindfulness that can result from receiving an intervention can lead to a response shift where the recipients score lower on mindfulness than their counterparts that do not receive mindfulness training, because the increased awareness and understanding of the concept of mindfulness makes them aware of the extent to which they lack the trait (Grossman, 2011). In the context of a trial, a more subtle response shift may result in the benefit of the intervention being underestimated (Bartos et al., 2023). Krägeloh et al. (2018), in their sample of German-speaking adults aged 19 to 73 years and living in Switzerland, investigated response shift on the Comprehensive Inventory of Mindfulness Experiences (CHIME) measure following receipt of a mindfulness-based intervention course. Evidence of response shift was found for only 7 of the 37 CHIME items, and, contrary to what might be expected based on the literature in this area, most of the items that had response shift overstated the improvement in mindfulness level. Bartos et al. (2023), in their study of adult musicians aged 19 to 39 years living in Spain, investigated response shift on the Five Facet Mindfulness Questionnaire (FFMQ) resulting from a mindfulness- and yoga-based intervention.
They found statistically significant evidence of response shift on the total FFMQ score and the Observe subscale of the FFMQ, and used the "then-test" method to correct for the resulting underestimate of the intervention effect. The analyses in the Krägeloh et al. (2018) and Bartos et al. (2023) papers were based on relatively small sample sizes (181 and 31 participants, respectively) and both used a pre-post study design with a single group that received the intervention. To our knowledge, no study has used data from an experimental design with separate groups of intervention and control participants to investigate response shift on a mindfulness measure resulting from receiving a mindfulness-based intervention, and no study has explored this type of response shift in a sample of children and adolescents.
To our knowledge, the factor structure and measurement invariance of the CAMM have not been explored in the UK. A recently completed UK-based trial of a mindfulness training programme delivered in secondary schools provided the opportunity to examine these aspects of the validity of the CAMM in late childhood and early adolescence. The present study conducted an exploratory factor analysis (EFA) to investigate the factor structure of the CAMM. We then fitted multiple indicators multiple causes (MIMIC) models to investigate measurement invariance of the CAMM across groups defined by gender, year group and ethnicity. Finally, we fitted MIMIC models to investigate response shift over time in the CAMM, both in the absence of receiving an intervention and resulting from the delivery of the mindfulness programme.

Participants
The analyses use data from the My Resilience in Adolescence (MYRIAD) study, a cluster randomised controlled trial of the delivery of a school-based mindfulness curriculum to improve mental health outcomes in children and adolescents. In the trial, mainstream secondary schools were deemed eligible if they had a substantive headteacher in position, had not been identified as inadequate in their most recent quality and performance inspection, and had the strategy and structure in place to deliver social emotional learning curricula. After consent was obtained from schools to participate, parents were provided with the option to opt their child out of the trial; the children and adolescents themselves provided assent prior to participation.

Procedure
Schools (clusters) recruited from across the UK were randomly allocated either to deliver the school-based mindfulness curriculum (intervention arm) or to continue delivering only the usual Personal, Social, Health and Economic (PSHE) education (Department for Education, 2011) (control arm). The school-based mindfulness training (SBMT) programme was delivered to teach mindfulness skills in 10 structured lessons, including attentional control and self-regulation of thoughts and behaviours. The participants were encouraged to use these skills in their everyday lives. Resources were provided to support the implementation of the SBMT; these and other aspects of the intervention are described in further detail in several papers (Kuyken et al., 2017; Montero-Marin et al., 2021).
Schools were recruited in two cohorts, with Cohort 1 recruited in the 2016/2017 academic year and Cohort 2 recruited in 2017/2018. Eligible children were recruited from Years 8 and 9 (usually spanning the age range 12 to 14) and registered with classes randomly selected with equal probability from within the schools for participation in the trial. Assessments were administered at baseline (before randomisation), pre-intervention (after randomisation and before intervention delivery), post-intervention (7 months after the pre-intervention assessment) and 1-year post-intervention (12 months after the pre-intervention assessment).

CAMM
The 10-item CAMM was administered as one of the study outcomes at all waves from pre-intervention onwards. Items were developed to assess the extent to which the responders notice thoughts, feelings and sensations; engage in their current activity; and are open to experiencing a full range of thoughts and feelings. The specific items (listed in Table 1) are worded negatively and have a 5-point Likert scale response set. Item responses are reverse-scored (from 0 = less mindful to 4 = more mindful) and summed to give a total possible CAMM score from 0 to 40, with higher scores indicating a higher level of mindfulness.
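The scoring rule described above can be illustrated with a minimal sketch. It assumes that raw responses are coded 0 to 4, with higher raw values meaning stronger agreement with the negatively worded item; the function name `score_camm` and the raw coding are illustrative assumptions rather than part of the trial's analysis code.

```python
from typing import Sequence

N_ITEMS = 10   # the CAMM has 10 items
MAX_RAW = 4    # assumed raw coding: 0..4, higher = more agreement (less mindful)


def score_camm(raw_responses: Sequence[int]) -> int:
    """Reverse-score each item (0 = less mindful, 4 = more mindful) and sum."""
    if len(raw_responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item responses")
    if any(r < 0 or r > MAX_RAW for r in raw_responses):
        raise ValueError("raw responses must lie between 0 and 4")
    # Reverse scoring: a raw 4 becomes 0, a raw 0 becomes 4.
    return sum(MAX_RAW - r for r in raw_responses)
```

Under this coding, a respondent who gives the lowest raw value to every item receives the maximum total of 40, and one who gives the highest raw value to every item receives 0.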

Data Analyses
An EFA of the CAMM measure was conducted to investigate its dimensional structure. MIMIC models were fitted to investigate measurement invariance, specifically differential item functioning (Brown, 2015, pp.273-283). The sample size calculation was based on the primary objective of the MYRIAD trial: to provide 90% power at the (2-sided) 5% level of significance to detect a difference between the trial arms of 0.20 standard deviations on mental health and wellbeing outcomes. The sample size was inflated to allow for non-independence between the responses of participants from the same school (using standard formulae (Eldridge & Kerry, 2012) and assuming an intra-school correlation coefficient of 0.04); multiple testing related to the analysis of three primary outcomes (maintaining an overall type I error rate of 5%); and an anticipated 20% loss to follow-up.
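The logic of this kind of sample size inflation can be sketched as follows. This is an illustration under stated assumptions, not the trial's actual calculation: it uses the normal-approximation formula for comparing two means, the common design effect 1 + (m - 1) × ICC, and a Bonferroni split of the significance level across outcomes. The cluster size of 100 in the usage note is hypothetical, as the source does not state the value used.

```python
import math
from statistics import NormalDist


def n_per_arm_individual(effect_size: float, power: float = 0.90,
                         alpha: float = 0.05) -> float:
    """Per-arm n for a 2-sided comparison of two means (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return 2 * (z_alpha + z_beta) ** 2 / effect_size ** 2


def design_effect(cluster_size: float, icc: float) -> float:
    """Inflation factor for clustering: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def n_per_arm_clustered(effect_size: float, cluster_size: float, icc: float,
                        power: float = 0.90, alpha: float = 0.05,
                        n_tests: int = 1, attrition: float = 0.0) -> int:
    """Inflate for clustering, multiple testing (Bonferroni, assumed) and attrition."""
    n = n_per_arm_individual(effect_size, power, alpha / n_tests)
    n *= design_effect(cluster_size, icc)
    n /= (1 - attrition)
    return math.ceil(n)
```

For example, `n_per_arm_clustered(0.20, cluster_size=100, icc=0.04, n_tests=3, attrition=0.20)` would apply all three inflation steps for a hypothetical 100 participants per school.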
Analyses were undertaken using Stata version 17 (StataCorp, 2021) and Mplus version 8 (Muthén & Muthén, 2017) software. Baseline/pre-intervention child characteristics were summarised using means, standard deviations and ranges for continuous variables and numbers and percentages for categorical variables. Analyses of CAMM data for any given wave only included participants that provided responses to the items at that wave; for longitudinal analyses (across two time points), only participants that provided responses at both waves were included. Non-independence between repeated measures on the same child in longitudinal analyses and between children's responses within the same school (cluster) in all other analyses was accounted for by obtaining robust estimates of standard error using the TYPE=COMPLEX and CLUSTER commands in Mplus. Statistics are generally reported using two decimal places. Otherwise, one decimal place is used to report percentages; three decimal places are used to report unstandardised (raw) and standardised model coefficients and most of the goodness-of-fit statistics; three decimal places are used to report p-values between 0.001 and 0.009 and p-values are reported as "<0.001" when less than 0.001.
Table 1 Items of the CAMM
1. I get upset with myself for having feelings that don't make sense.
2. At school, I walk from class to class without noticing what I'm doing.
3. I keep myself busy so I don't notice my thoughts or feelings.
4. I tell myself that I shouldn't feel the way I'm feeling.
5. I push away thoughts that I don't like.
6. It's hard for me to pay attention to only one thing at a time.
7. I get upset with myself for having certain thoughts.
8. I think about things that have happened in the past instead of thinking about things that are happening right now.
9. I think that some of my feelings are bad and that I shouldn't have them.
10. I stop myself from having feelings that I don't like.

Exploratory Factor Analysis
Given the inconsistency in findings in previous research on the number of salient factors in the CAMM and the lack of previous research in the UK, an exploratory rather than confirmatory factor analysis approach was considered appropriate for the present study. EFA was undertaken of the CAMM items at pre-intervention. The parallel analysis method based on simulated data across 10 replications was used to explore the number of factors that should be extracted in the EFA (Brown, 2015). An oblique (geomin) rotation method was used which allows the derived factors to be correlated. Several goodness of fit indices were used to decide on the optimal number of factors: root mean square error of approximation (RMSEA); Comparative Fit Index (CFI); Tucker-Lewis Index (TLI); and standardised root mean square residual (SRMR). The following value ranges are indicative of good fit: RMSEA close to or below 0.06; CFI and TLI close to 0.95 or greater; and SRMR close to or below 0.08 (Hu & Bentler, 1999).
As the CAMM items are ordinal, the parallel analysis and EFA used the matrix of polychoric correlations between the items. EFA models were fitted using mean and variance-adjusted diagonally robust weighted least squares estimation (WLSMV). Under this approach, the ordinal items are assumed to be categorised versions of traits that have an underlying latent normal distribution and the polychoric correlations quantify the strength of associations between the items on that latent scale. In the EFA, the delta parameterisation method was used where the variance of the latent versions of the items was set to 1. Factor loadings from the analysis quantify the relationship between the latent continuous versions of the items and the factors that explain the correlations between the items.
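The idea behind parallel analysis can be sketched in a few lines. This is a simplified illustration, not the procedure run in Mplus: it uses Pearson rather than polychoric correlations, compares observed eigenvalues with the mean eigenvalues from simulated uncorrelated normal data, and the function name and defaults are assumptions made for the example.

```python
import numpy as np


def parallel_analysis(data: np.ndarray, n_replications: int = 10, seed: int = 0):
    """Return (number of factors, observed eigenvalues, simulated thresholds)."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    simulated = np.empty((n_replications, p))
    for r in range(n_replications):
        noise = rng.standard_normal((n, p))  # same shape, no true factors
        corr = np.corrcoef(noise, rowvar=False)
        simulated[r] = np.sort(np.linalg.eigvalsh(corr))[::-1]
    threshold = simulated.mean(axis=0)
    # Retain leading factors whose eigenvalue exceeds the simulated benchmark.
    n_factors = 0
    for obs, thr in zip(observed, threshold):
        if obs <= thr:
            break
        n_factors += 1
    return n_factors, observed, threshold
```

Factors are retained only while the observed eigenvalue exceeds what would be expected from data with no underlying structure, which is why the method guards against over-extraction.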

MIMIC Models to Investigate Differential Item Functioning
MIMIC models (Brown, 2015, pp.273-283) were fitted to investigate differential item functioning in the observed CAMM items. Specifically, the models were fitted to assess the extent to which differences in the mean level of each item across subgroups reflect differences in the underlying unobserved latent concept of mindfulness measured by the items rather than differences in the way that the subgroups interpret or respond to the items. The subgroups in these analyses were defined by gender, year group and ethnic group, and by whether the participants received the mindfulness intervention. Models were also fitted to assess the extent to which changes in the mean level of each item over time reflect real changes in mindfulness rather than changes in the way that children and adolescents interpret or respond to the items.
The MIMIC model is a structural equation model. The analysis involved fitting a CFA model, based on the number of salient factors indicated in the previously described EFA analyses, extended by incorporating a categorical covariate (defined by demographic subgroup, study wave or trial arm status) as a predictor of both the factors and, if indicated, the CAMM items simultaneously. The pathways in the MIMIC model are illustrated in Fig. 1. Solid lines are used to indicate the effects of the covariate (depicted by the large rectangle) on the factors (depicted by circles); dashed lines are used to indicate the effects of the factors on the items (depicted by small rectangles); and dotted lines are used to indicate the effects of the covariate on the items. The coefficient for each dotted line is the mean difference in the item score between demographic subgroups (or study waves or trial arms) when the underlying score on the associated factor is held constant and, therefore, quantifies differential item functioning between the groups. Put another way, the coefficient between the covariate and a given item indicates the extent to which the mean difference between the subgroups with respect to the item is greater or smaller than might be expected based on the relationship between the covariate and the factor that explains the correlations amongst the items. If direct paths are required between the covariate and the CAMM items, this indicates that some of the effect of the covariate on the item is not mediated via the factor and that, therefore, some of the difference between subgroups with respect to the item is not solely due to differences in the underlying concept (mindfulness) that is measured by the CAMM, indicating differential item functioning between the subgroups or over time.
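The structure described above can be written compactly. The notation below is an illustrative assumption (only Ψ12 appears in Fig. 1), shown for a single binary covariate x and for item j loading on factor k(j):

```latex
\begin{align}
  y_{ij}    &= \nu_j + \lambda_j \,\eta_{i,k(j)} + \beta_j x_i + \varepsilon_{ij}, \\
  \eta_{ik} &= \alpha_k + \gamma_k x_i + \zeta_{ik}, \qquad k = 1, 2, \\
  \operatorname{Cov}(\zeta_{i1}, \zeta_{i2}) &= \psi_{12}.
\end{align}
```

Here γ_k is the effect of the covariate on factor k, λ_j is the loading of item j, and β_j is the direct covariate-to-item effect. Holding the factor score fixed, the expected difference in item j between the two covariate groups is β_j, so β_j = 0 corresponds to no differential item functioning for that item.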
The approach used in the MIMIC modelling is described as follows. First, a CFA model was fitted where the covariate was used as a predictor of the factors only. The modification indices (approximate improvement in the chi-squared goodness of fit statistic for the model) provided in the output were used as the criteria to identify items for which it was indicated that a direct path with the covariate, if freely estimated, would improve the fit of the model (Brown, 2015). The covariate-to-item path that had the largest modification index was added to the model. This process was used to add further paths, one item at a time, until there were no remaining potential covariate-to-item paths for which the modification index was greater than 10 (i.e. no path parameters for which the p-value for testing the null hypothesis of no association is less than 0.001). In the MIMIC models, the items were specified as continuous and the parameters were estimated using robust maximum likelihood. The scale of the latent variables was identified by specifying the marker indicator approach, fixing the metric of the latent variable to be the same as one of the items associated with it. We primarily focus on interpreting unstandardised coefficients from the models which can be interpreted as mean differences between subgroups on the latent variables and the items. Otherwise, where standardised coefficients are reported, this is clearly indicated.
Fig. 1 Pathways in the MIMIC model, showing the effects of the covariate on the factors, the effects of the factors on the items and the effects of the covariate on the items. Ψ12 indicates there is a correlation between the factors; a correlation between the errors for Item 2 and Item 6 is also indicated.
MIMIC models were fitted to investigate differential item functioning in the pre-intervention CAMM responses across demographic subgroups defined by gender (using boys as the reference category), year group (using Year 8 as reference) and ethnic group (using the White group as reference). Subgroup was specified as the covariate, using indicator (dummy) variables in the models to compare the subgroups. The pre-intervention data were used for this analysis because at that wave the responses in both trial arms would not have been impacted by delivery of the mindfulness programme and, therefore, we could utilise the entire sample.
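The stepwise path-addition procedure described above can be sketched as a simple loop. The callback `get_modification_indices` is hypothetical: it stands in for refitting the MIMIC model (done in Mplus in the study) with the given direct paths and returning a modification index for each remaining covariate-to-item path. The helper `chi2_1df_pvalue` uses the standard identity for the upper tail of a chi-squared distribution on one degree of freedom.

```python
import math
from typing import Callable, Dict, List

MI_THRESHOLD = 10.0  # stopping rule stated in the text


def chi2_1df_pvalue(statistic: float) -> float:
    """Upper-tail p-value of a chi-squared statistic on 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


def select_dif_paths(
    get_modification_indices: Callable[[List[str]], Dict[str, float]],
    threshold: float = MI_THRESHOLD,
) -> List[str]:
    """Add covariate-to-item paths one at a time, largest index first."""
    added: List[str] = []
    while True:
        # Hypothetical refit: indices for paths not yet in the model.
        indices = get_modification_indices(added)
        candidates = {item: mi for item, mi in indices.items()
                      if item not in added and mi > threshold}
        if not candidates:
            return added
        added.append(max(candidates, key=candidates.get))
```

On one degree of freedom, a modification index of 10 corresponds to a p-value of roughly 0.0016, which is close to the 0.001 level quoted above.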
In order to investigate whether changes occur over time in the interpretation of the CAMM items in the absence of receiving mindfulness training, MIMIC models were fitted using data from the control arm of the trial to quantify change in interpretation of the items between pre-intervention and each of the post-intervention and 1-year follow-ups. Study wave was specified as the covariate (pre-intervention = 1 versus post-intervention/1-year follow-up = 2), with the pre-intervention wave as the reference category.
Finally, using data across both trial arms, MIMIC models were fitted to investigate the extent to which delivery of the mindfulness programme results in changes in mean response on the CAMM items that cannot be explained by intervention-generated changes in the latent mindfulness factor(s). MIMIC model analyses of each of the pre-intervention, post-intervention and 1-year follow-up waves were undertaken using trial group status (control = 1 versus intervention = 2) as the covariate. The control group was the reference category. No differential item functioning was anticipated for the pre-intervention data as the intervention had not yet been delivered at that wave.

Results
In the MYRIAD trial, 8376 children were recruited from 389 classes in 84 secondary schools in the UK (4144 children, 192 classes and 41 schools in the control arm and 4232 children, 197 classes and 43 schools in the intervention arm). Eleven (13%) schools required improvement based on their grading from OFSTED (Office for Standards in Education, Children's Services and Skills), a government organisation for rating school quality and performance. In only 30 (36%) schools was the percentage of children eligible for free school meals (an indicator of deprivation with higher levels indicating greater deprivation) above the median percentage in the UK (29.4% based on data from the UK Department for Education in 2017), indicating that the sample of schools includes participants that are, overall, less deprived than the wider national population. Schools were recruited from all four major regions of the UK (England, Scotland, Wales and Northern Ireland). The majority of schools were mixed gender (73 (87%)) with the remainder being girls-only schools. Exactly half of the schools had over 1000 pupils.
Of those recruited to the MYRIAD trial: 7924 (94.6%) provided CAMM data at pre-intervention; 7472 (89.2%) provided CAMM data at post-intervention; and 7171 (85.6%) provided CAMM data at the 1-year follow-up. Of the 4144 participants recruited to the control arm of the trial, 3630 (87.6%) provided CAMM data at both pre- and post-intervention and 3409 (82.3%) provided CAMM data at both pre-intervention and 1-year follow-up. The demographic characteristics and the CAMM total score for the participants are summarised in Table 2. The mean (SD; range) age of the 7924 trial participants who provided CAMM data for at least the pre-intervention wave and are included in this paper was 13.12 (0.57; 11.93 to 14.85) years and 54.6% of those included were girls. Of the remaining 452 children who are not included, 40.0% were girls; only 151 of these provided data on age (mean (SD) = 13.02 (0.61)).

Exploratory Factor Analysis of CAMM Items at Pre-intervention
The polychoric correlations between the CAMM items, the percentage of participants that are in each category and the mean and standard deviation of the CAMM items are reported in Table 3. The highest mean score (i.e. most mindful as reverse scored) was for Item 2 (At school, I walk from class to class without noticing what I'm doing) and the lowest was for Items 5 (I push away thoughts that I don't like) and 8 (I think about things that have happened in the past instead of thinking about things that are happening right now). The parallel analysis method indicated that 2 factors should be extracted in the EFA (Fig. 2). The first and second eigenvalues were 4.95 and 1.48, respectively, indicating that, between them, they account for 64.3% of the variation across the CAMM items. The goodness of fit indices for the 1-factor and 2-factor models are reported in Table 4, indicating that the 1-factor model was inadequate and the 2-factor model provided a good fit to the data (RMSEA (90% CI) = 0.073 (0.070 to 0.077); CFI = 0.983; TLI = 0.971; SRMR = 0.032). The factor loadings are reported in Table 5, with loadings above 0.40 highlighted in bold. The first factor loads saliently on the 8 items (#1 to #4 and #6 to #9) that quantify present moment non-judgmental awareness and the second factor loads saliently on Items 5 and 10, indicating that the tendency to push away unwanted thoughts and feelings represents a sub-construct within the CAMM measure. There was a moderate correlation (0.30) between the factors. The reported analysis allowed for non-independence between responses of children in the same school. A sensitivity analysis that instead allowed for non-independence within classrooms provided almost identical factor loadings and similar, but very slightly poorer, model fit (RMSEA (90% CI) = 0.079 (0.076 to 0.083); CFI = 0.980; TLI = 0.965; SRMR = 0.032). The key findings for the pre-intervention wave were consistent with those from analyses of the post-intervention and 1-year follow-up waves, which are presented in Supplementary Information (Table A1; Table A2).
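The 64.3% figure reported above follows from the fact that the eigenvalues of a correlation matrix for p items sum to p, so the share of variation attributable to a set of factors is the sum of their eigenvalues divided by the number of items. A minimal check, with an illustrative function name:

```python
from typing import Sequence


def variance_explained(eigenvalues: Sequence[float], n_items: int) -> float:
    """Proportion of total variation accounted for by the given eigenvalues."""
    # Each standardised item contributes one unit of variance, so the total is n_items.
    return sum(eigenvalues) / n_items


share = variance_explained([4.95, 1.48], n_items=10)
print(f"{share:.1%}")
```

With the two reported eigenvalues and 10 items, this gives (4.95 + 1.48) / 10 = 0.643.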

MIMIC Models Investigating Differential Item Functioning at Pre-intervention Across Demographic Subgroups
The two-factor confirmatory factor analysis model with Items 1, 2, 3, 4, 6, 7, 8 and 9 specified as indicators of the first factor and Items 5 and 10 for the second factor provided a slightly poorer fit to the pre-intervention CAMM data (RMSEA (90% CI) = 0.080 (0.077 to 0.083); CFI = 0.933; TLI = 0.911; SRMR = 0.048) than the corresponding 2-factor exploratory factor analysis solution. This might be expected given that the CFA only freely estimates specified paths rather than all possible paths between the factors and items and, unlike in the EFA, the items are specified as continuous rather than categorical. Examination of the output revealed a large modification index value (302.83) for a correlation between the errors for Items 2 (At school, I walk from class to class without noticing what I'm doing) and 6 (It's hard for me to pay attention to only one thing at a time) that is not accounted for by the first factor. When the CFA model was extended to allow the errors for these items to be correlated, the fit improved (RMSEA (90% CI) = 0.074 (0.071 to 0.077); CFI = 0.944; TLI = 0.923; SRMR = 0.044).  Using this CFA model as a basis, 2-factor MIMIC models were fitted to the data using gender, year group and ethnicity as covariates in separate models. Unstandardised path coefficients between the covariate and each of the 2 latent factors and the CAMM items are reported in Table 6. Coefficients between the covariates and the factors are reported regardless of the p-values for the relationships as these paths are part of the basic MIMIC model, whereas coefficients are only reported for covariate-to-item paths that were statistically significant and, therefore, added to the model. Higher scores on the latent factors and the CAMM items indicate a higher level of mindfulness. Positive coefficients indicate a higher level of mindfulness in comparison to the reference category.
The coefficients for the relationships between gender and the latent variables indicate that on average girls are less mindful than boys with respect to present-moment non-judgmental awareness (Factor 1) but are more mindful in having a lower tendency to push away unwanted thoughts and feelings (Factor 2). The corresponding standardised coefficients indicate girls are 0.315 standardised scores lower and 0.068 standardised scores higher on the respective latent factors. After accounting for the relationship between gender and the latent factors, there was evidence that girls have a lower score (less mindful) and, therefore, show more agreement with CAMM Items 1 (I get upset with myself for having feelings that don't make sense), 4 (I tell myself that I shouldn't feel the way I'm feeling), 7 (I get upset with myself for having certain thoughts) and 8 (I think about things that have happened in the past instead of thinking about things that are happening right now) and a higher score (more mindful) and show less agreement with Item 6 (It's hard for me to pay attention to only one thing at a time) compared to boys who have the same underlying level of mindfulness as quantified by the measure, indicating differential item functioning with respect to these items. Taking the sum of the unstandardised coefficients for the relationships with these five items indicates that girls underreported their mindfulness on the total CAMM score by 0.77 compared to boys. This is equivalent to around a tenth of a standard deviation on the CAMM total score and is, therefore, relatively small. Based on the relationship with the latent factor, Year 9 children were less mindful than Year 8 children with respect to present-moment non-judgmental awareness (Factor 1), but there was little evidence of differential item functioning with respect to the comparison between the year groups.
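The total-score figure for gender can be written out as a short calculation. The sketch below uses only the aggregate values reported in the text; the individual item coefficients are in Table 6 and are not reproduced here. The symbols and the sign convention (girls relative to boys, so a negative value means girls score lower) are assumptions made for illustration.

```latex
% \hat{\beta}_j: unstandardised gender-to-item coefficient for item j (assumed notation).
% Item 6 enters with the opposite sign to Items 1, 4, 7 and 8 and partly offsets them.
\begin{align}
  \Delta_{\text{total}} &= \sum_{j \in \{1, 4, 6, 7, 8\}} \hat{\beta}_j \approx -0.77, \\
  \frac{\lvert \Delta_{\text{total}} \rvert}{\mathrm{SD}_{\text{total}}} &\approx 0.1 .
\end{align}
```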
Regarding ethnicity, the coefficients for the relationships between the ethnic group and the latent variables indicate that on average the mixed group was less mindful than the White group on both dimensions and that the Asian/British Asian and Arab groups were specifically less mindful on the tendency to push away unwanted thoughts and feelings (Factor 2). After accounting for those relationships, only the Asian/British Asian group showed differences to the White group on item response; the former had higher mean scores (more mindful) meaning less agreement with Items 2 (At school, I walk from class to class without noticing what I'm doing) and 6 (It's hard for me to pay attention to only one thing at a time) and a lower mean score (less mindful) meaning more agreement with Item 4 (I tell myself that I shouldn't feel the way I'm feeling); again, as for the gender analysis, individually and collectively, these differences are small compared to the variation in the CAMM total score.

MIMIC Models Investigating Differential Item Functioning Between Pre-intervention and Follow-up in the Control Arm
MIMIC models were fitted to longitudinal data within the control arm only to investigate response shift on the CAMM items between pre-intervention and follow-up in the absence of school-based mindfulness training. Although study wave was related to the first factor indicating that children were less mindful at post-intervention than at pre-intervention regarding present-moment non-judgmental awareness (coefficient (SE) = −0.102 (0.014)), based on the modification indices, there was little evidence of change in the way the items were interpreted between those study waves. There was, however, evidence that the mean score was 0.071 (SE = 0.009) lower (less mindful) on Item 2 (At school, I walk from class to class without noticing what I'm doing), 0.070 (SE = 0.013) higher (more mindful) on Item 5 (I push away thoughts that I don't like) and 0.049 (SE = 0.011) higher (more mindful) on Item 8 (I think about things that have happened in the past instead of thinking about things that are happening right now) at 1-year follow-up compared to pre-intervention after taking account of the relationship between study wave and the two latent mindfulness factors (again, children were less mindful at the 1-year follow-up than at pre-intervention regarding present-moment non-judgmental awareness (coefficient (SE) = −0.060 (0.008))). These differences in item functioning between study waves are, however, very small relative to the standard deviations of the items at pre-intervention and extremely small relative to the standard deviation of the total CAMM score.

MIMIC Models Investigating Differential Item Functioning Between the Intervention and Control Arms Following Delivery of the School-Based Mindfulness Programme
To investigate differential item functioning resulting from delivery of the school-based mindfulness programme, we fitted MIMIC models to each of the pre-intervention, post-intervention and 1-year follow-up waves using trial arm status as the covariate to predict the 2 latent factors and the CAMM items. There was no statistically significant evidence of an effect of trial arm status on the latent factors at any of the three study waves. The result for the pre-intervention wave is expected given that the mindfulness curriculum had not yet been delivered, and for the same reason there was no differential functioning with respect to any of the items at that wave. At the post-intervention and 1-year follow-ups, however, the mean score on Item 2 (At school, I walk from class to class without noticing what I'm doing) was lower (less mindful) for the intervention arm compared to the control arm by 0.211 (SE = 0.034) and 0.172 (SE = 0.031) units, respectively, after taking account of the relationships (all non-statistically significant) between trial arm status and the latent factors. Although this is around a fifth of a standard deviation for the CAMM item, it is only a very small fraction of the standard deviation of the total CAMM score. At the 1-year follow-up, the MIMIC model results indicated that the mean score on Item 4 (I tell myself that I shouldn't feel the way I'm feeling) was slightly higher (more mindful) for the intervention arm compared to the control arm by 0.070 (SE = 0.020) after taking account of the relationship between trial arm status and the latent factors.
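As a rough aid to reading the estimates above, the reported coefficients and standard errors can be turned into approximate Wald z statistics and 95% intervals. The following is a minimal sketch under a normal approximation, not the procedure used in the trial analysis. The helper names (`DifEstimate`, `wald_summary`) are hypothetical, and the signs assume an intervention-minus-control coding, so that "lower in the intervention arm" is entered as a negative value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DifEstimate:
    label: str
    estimate: float  # unstandardised trial-arm-to-item coefficient (sign assumed)
    se: float        # standard error as reported in the text


def wald_summary(estimate: float, se: float, z_crit: float = 1.96) -> tuple[float, float, float]:
    """Return (z, lower, upper) using a normal-approximation Wald interval."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    z = estimate / se
    half_width = z_crit * se
    return z, estimate - half_width, estimate + half_width


# Values taken from the text; negative sign = intervention arm scored lower.
REPORTED = [
    DifEstimate("Item 2, post-intervention", -0.211, 0.034),
    DifEstimate("Item 2, 1-year follow-up", -0.172, 0.031),
    DifEstimate("Item 4, 1-year follow-up", 0.070, 0.020),
]

if __name__ == "__main__":
    for d in REPORTED:
        z, lo, hi = wald_summary(d.estimate, d.se)
        print(f"{d.label}: z = {z:.2f}, 95% CI ({lo:.3f}, {hi:.3f})")
```

None of the three approximate intervals includes zero, which is in line with these paths having been added to the models as statistically significant. Their width is a reminder that the estimates, although precise, describe small shifts.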

Discussion
This study investigated the factor structure of the CAMM measure for quantifying levels of mindfulness in late childhood and early adolescence using data from a UK-wide school-based cluster randomised controlled trial of a mindfulness curriculum for improving mental health. It also investigated differential item functioning in the CAMM measure across demographic groups, over time, and resulting from delivery of the mindfulness curriculum.
The EFA findings indicate that the CAMM is composed of 2 main factors. Eight items are associated with the first factor, which could be said to quantify present-moment non-judgmental awareness; the remaining 2 items (#5 and #10) that are associated with the second factor relate to avoiding unwanted thoughts and feelings. This is the same as findings of previous exploratory factor analyses for children aged 10 to 12 years in The Netherlands (de Bruin et al., 2014) and children and adolescents aged 11 to 18 in Italy (Ristallo et al., 2016). Wang et al. (2018), in their exploratory factor analysis using a US sample of children in fifth grade (aged 10 to 11) from minority racial groups, also found factors representing present-moment non-judgmental awareness and avoiding unwanted thoughts and feelings, with the difference that the latter factor loaded on 5 items (#3, #4 and #9 as well as #5 and #10). Along similar lines, Mohsenabadi et al. (2020), in their study based in Iran, identified a 2-factor solution with the avoiding thoughts and feelings factor loading on Item 9, in addition to Items 5 and 10. It is also notable that Limpo et al. (2022), in their study of fourth grade children in Portugal, omitted Items 5 and 10 for their best model (a 1-factor solution) as they both had low loadings in the EFA. Furthermore, in their item response analysis of the CAMM, Chiesi et al. (2017) found that there was residual correlation between Items 5 and 10 that was not accounted for by the single-factor model. Theofanous et al. (2020) also found that, in the context of the single-factor model, freeing the errors for Items 5 and 10 to be correlated was required to achieve acceptable fit. Item 5 has been found to be psychometrically problematic in several studies that endorsed a single-factor structure for the CAMM (Cunha et al., 2013; García-Rubio et al., 2019; Saggino et al., 2017; Theofanous et al., 2020; Vinas et al., 2015).
The emerging avoiding unwanted thoughts and feelings dimension may result from the very similar wording of Items 5 and 10, as previously noted by Chiesi et al. (2017). If more similar items to represent this aspect were added to the CAMM, they might together form a reliable factor that is useful for research purposes.
In the course of establishing a CFA model to use as the basis for the MIMIC models, it was found that the fit of the 2-factor model was improved by freeing the errors of Items 2 and 6 to be correlated. This is consistent with findings of de Bruin et al. (2014) that, for their Netherlands-based sample of adolescents aged 13 to 16 years, the second factor represented distractibility or difficulty paying attention (Items #2 and #6). These items could be tapping into the extent to which the children are (not) mindful in the school environment. Amongst the studies that endorsed a single-factor structure, several have noted the inadequate psychometric properties of Item 2 (Baumann et al., 2022; García-Rubio et al., 2019; Limpo et al., 2022; Roux et al., 2019; Saggino et al., 2017) and Item 6 (García-Rubio et al., 2019).
In contrast to previous studies that found little evidence of measurement invariance between gender groups (Chiesi et al., 2017; Saggino et al., 2017; Theofanous et al., 2020), MIMIC model analyses in the current paper indicated differences between boys and girls in the way that the CAMM items are interpreted. There was evidence that girls score lower (less mindful) on four items compared to boys who have the same underlying levels of mindfulness. The coefficients collectively suggest that the total CAMM score for girls would be 0.77 units lower than for boys after accounting for mindfulness, although this is only around a tenth of a standard deviation of the measure. There was evidence that children in the Asian/British Asian category score higher (more mindful) on the items (#2 and #6) related to distractibility (i.e. had lower distractibility) than White children who had the same underlying levels of mindfulness. Again, this differential item functioning is relatively small in the context of the variability of those items and the total score. There was little evidence of differences for the other ethnic group categories or differences between Year-8 and Year-9 children.
Findings from fitting the MIMIC model to the control arm participants indicated that, after accounting for the level of mindfulness, they scored lower (less mindful) on Item 2 and higher (more mindful) on Items 5 and 8 at 1-year follow-up than they did at pre-intervention. This result was not consistent with the lack of evidence of differential item functioning between Year-8 and Year-9 children at pre-intervention. Higher scores on Item 5 ("I push away thoughts that I don't like") reflect lower levels of thought suppression as well as increased mindfulness. Gullone et al. (2010) found thought suppression decreased between ages 9 and 15 years, and suggest that reliance on this strategy decreases during adolescence where more adaptive executive and social strategies are learnt to manage difficult emotions. Notwithstanding this possible explanation, the estimates of differential item functioning for these items were very small (less than 0.10) and should not be over-interpreted as substantive.
The MYRIAD trial data provided the opportunity to investigate whether the delivery of a mindfulness programme to children could result in response shift or, specifically, change in the way they interpret the meaning of the CAMM items. The findings of the MIMIC model analysis using trial arm status as the covariate indicate that intervention arm children, on average, scored 0.211 lower (less mindful) at post-intervention and 0.172 lower at 1-year follow-up on CAMM Item 2 (At school, I walk from class to class without noticing what I'm doing) than their control arm counterparts with the same level of mindfulness as quantified by the measure. It is notable that this item specifically refers to mindfulness behaviour in the school context. The MYRIAD trial found a statistically significant but small difference on the CAMM total score such that the mean score in the intervention arm was 0.60 units lower (less mindful) than in the control arm at the post-intervention wave, the large sample size providing the trial with power and precision to find such small effects. Post hoc analyses revealed that the difference between the trial arms on the CAMM total score was largely driven by the school setting-related CAMM item (#2). It might be that the initial impact of the school-based mindfulness programme is to make the children more aware of mindfulness as a concept and that they are not mindful in the school environment. It is important, however, to appreciate that the differential functioning for the item is extremely small relative to the variability in the total CAMM score. That children and adolescents in receipt of a mindfulness curriculum scored lower than those who did not is in keeping with previous observations that adolescents with meditation experience score lower on the CAMM (de Bruin et al., 2014).
Exposure to mindfulness concepts raises awareness of lack of mindfulness and, therefore, a response shift that is unrelated to the level of mindfulness is a natural consequence (Goodman et al., 2017; Grossman, 2011).
In summary, for children aged 11 to 14 in a UK sample, the CAMM measure was characterised by a 2-factor structure with 8 items associated with a factor representing present-moment non-judgmental awareness and the 2 remaining items associated with a factor representing avoiding unwanted thoughts and feelings. The differential item response on the CAMM was trivial when comparing gender groups and ethnic groups and, on this basis, we conclude that the measure can be used to compare true levels of mindfulness with little bias between these groups in the UK population. Finally, the initial impact of programmes for teaching mindfulness to children may be to make them slightly more aware or critical of their lack of mindfulness in the context of the school environment.

Limitations and Future Research
While the study used analysis methods that allowed for nonindependence between the responses of children in the same school (cluster), a potential weakness is that the intermediate level of clustering at the class level was not allowed for as it was not possible to fit three-level models (i.e. children within classes within schools) for the analyses reported here. Sensitivity EFA in which the class rather than the school was specified as the cluster provided essentially the same findings.
Although the study benefitted from using data from a nationally representative sample containing a large number of schools and children, the number of children in some ethnicity groups was small; differential item functioning may not have been easily detected as a result of this. Also, the study was limited to just two adjacent school year groups making it harder to examine differential item functioning with respect to age. Finally, there was only a short follow-up phase to examine natural changes in the interpretation of the CAMM over time.
For children aged 11 to 14 in this UK sample, the CAMM measure was characterised by a 2-factor structure with 8 items loaded on by a factor representing present-moment non-judgmental awareness and the 2 remaining items loaded on by a factor related to avoiding unwanted thoughts and feelings. Research on the factor structure of the CAMM has resulted in inconsistent findings that are at least partly related to the different contexts of those studies, so the current study addressed a gap for the UK population. As the CAMM was developed for use between ages 10 and 17 years (Greco et al., 2011), further research is needed to confirm the factor structure in the UK for people who are older than those included in the current paper.
Girls have a tendency to score lower on some CAMM items compared to boys with the same underlying levels of mindfulness, but these differences, resulting from differential item functioning, are small relative to the variability in the CAMM total score. The differential item functioning was also trivial when comparing ethnic groups. On this basis, we conclude that the CAMM can be used to compare levels of mindfulness in late childhood and early adolescence with little differential interpretation between gender groups and ethnic groups in the UK. Further research should be undertaken on differential item functioning across these groups using data from participants spanning a wider age range than were included in this study.
The current study provides further evidence that the initial impact of programmes for teaching mindfulness to children and adolescents may be to make them slightly more aware or critical of their lack of mindfulness. Future randomised controlled trials that administer a measure of mindfulness both before and after the delivery of school-based mindfulness interventions (Molina Palacios et al., 2023) will provide data that can be used to further test this hypothesis.
Besides psychometric properties, there are other considerations when choosing a mindfulness measure. For example, it has been suggested that trait-based measures that quantify inherent disposition to mindfulness as a stable characteristic are more suited to measuring longer term changes in mindfulness following an intervention, whereas state-based measures that quantify mindfulness practice in a specific moment are better for quantifying short-term effects (Goodman et al., 2017). Most of the available measures for children and adolescents, including the CAMM, are trait-based (Goodman et al., 2017), but to facilitate a more comprehensive evaluation of the impacts on mindfulness, future trials of MBIs should consider including both trait and state measures.