Mindfulness-based meditation practice (MBP) is defined in terms of attentional focus and capacity for acceptance: awareness of the present moment and non-judgmental unfolding of experience (2003). The definition of mindfulness remains contested (Brown and Ryan 2003; Davidson and Kaszniak 2015), as some models and associated assessments include behavioral expressions of mindfulness such as reactivity (Baer et al. 2008, and see review in Carpenter et al. 2019a, b). MBP can also be differentiated from loving kindness meditation and compassion meditation (Hofmann et al. 2011). MBP may serve as a stand-alone intervention to foster wellbeing and fulfilment, or as a technique within cognitive-behavioral therapy (CBT; Dimidjian et al. 2010) and other therapies (Hofmann et al. 2011) to promote emotion regulation (Bullis et al. 2014; Carpenter et al. 2019b; Chambers et al. 2009; Goyal et al. 2014; Grecucci et al. 2015; Mennin et al. 2013). However, the two-factor model centering on the cognitive skills in attention and acceptance has strong support (Britton et al. 2018; Gecht et al. 2014; Gao et al. 2018; Sauer et al. 2012; Tran et al. 2013, 2014) and was adopted for the present study.

Meta-analytic evidence supports MBP as a strategy for the reduction of anxiety, depression, and stress as well as the enhancement of psychological well-being (e.g., Khoury et al. 2013, 2015; Bartlett et al. 2019; Hofmann et al. 2010), including as a learning support in education contexts (Aherne et al. 2016; Conley et al. 2013). Identifying those factors that serve to optimize the benefits of MBP is needed, and one potentially useful avenue is a better understanding of the role of adherence to MBP.

In the context of Mindfulness-Based Stress Reduction (MBSR; 2003) and Mindfulness-Based Cognitive Therapy (MBCT; Segal et al. 2002), a meta-analysis by Parsons et al. (2017) found a small but significant association between participants' self-reported MBP and intervention outcomes (N = 898 participants from 28 studies, r = 0.26). While MBP adherence-outcome relations have been found elsewhere (e.g., Bowen and Kurz 2012; Carmody and Baer 2008; Lau et al. 2006), other reviews and individual studies have not obtained positive MBP adherence-outcome relations (Vettese et al. 2009). It is possible that low statistical power explains at least some of the inconsistent findings (see Kazantzis 2000), but the matter of poor MBP adherence assessment has also been noted (Shapiro et al. 2003).

Studies seeking to clearly establish the MBP adherence-outcome relationship have been hampered by methodological limitations in measuring adherence. These limitations include (1) an exclusive focus on practice quantity, rather than quality, and (2) failure to distinguish between formal and informal practice.

The Freiburg Mindfulness Inventory (FMI; Walach et al. 2006), a commonly used measure, asks respondents to rate how often “I sense my body whether eating, cooking, cleaning or talking.” Likewise, the Mindful Attention Awareness Scale (MAAS; Brown and Ryan 2003) asks respondents to rate how often “I rush through activities without being really attentive to them.” Along with the Five Facet Mindfulness Questionnaire (FFMQ; Carpenter et al. 2019a, b), these commonly used measures are excellent assessments of the product of MBP, the experience of being mindful, rather than of MBP adherence per se.

With few exceptions (e.g., Del Re et al. 2013), studies examining MBP adherence have concentrated on the amount of practice rather than degree of skill acquisition (Vettese et al. 2009). Preliminary findings in Del Re et al. indicate that quality of MBP adherence was associated with intervention outcome, but quantity and quality were assessed asynchronously, which limited the evaluation. The focus on quantity rather than quality of adherence parallels the limitations of homework adherence assessment in CBT (Kazantzis et al. 2016, 2017). The issue of whether MBP produces its effects simply through amount of time spent engaged in practice (i.e., quantity) or through acquired skills in meaningful attentional focus and non-judgmental acceptance (i.e., quality) remains unclear. Therefore, there is a need for an assessment of MBP that delineates quantity and quality of adherence. Such an assessment should also allow for the broad generalization of mindfulness, and therefore for different subtypes of MBP, because individuals may dedicate specific times for MBP (i.e., formal practice) and also engage in MBP through situations in their everyday life (i.e., informal practice).

Hindman et al. (2015) directly compared the relative contributions of formal and informal MBP and found that both were effective in reducing psychological morbidity at post-treatment, but formal MBP produced the greatest benefits. Similarly, Crane et al. (2014) followed 99 adults with major depression during a 7-week MBCT program. Formal MBP was a significant predictor of depressive relapse, although informal practice was not. Two additional studies have provided further evidence for the superior effect of formal MBP (i.e., Carmody and Baer 2008; Hawley et al. 2014). However, it is possible that non-significant MBP adherence-outcome relationships involving informal practice may be due to inherent methodological challenges in measuring informal practice (Crane et al. 2014). Whereas the structured nature of formal MBP makes it relatively straightforward for participants to self-report the frequency and duration of home-based mindfulness practice, informal practices tend to occur irregularly throughout the day and are prone to retrospective memory biases (Schwarz 1999; Baumeister et al. 2007). Therefore, this study aimed to overcome these limitations by providing researchers and clinicians with a valid and reliable tool to comprehensively measure MBP.

The Mindfulness Adherence Questionnaire (MAQ) was designed to assess regular and sustained practice in attentional focus and non-judgmental acceptance (i.e., quantity, quality, subtype of practice). The term “adherence” was adopted to refer to formal and informal MBP (see review Holdsworth et al. 2014).

The present study adopted the two-factor model of MBP comprising attentional focus and capacity for non-judgmental acceptance. In this context, attention involves a deliberate, focused awareness of moment-to-moment internal and external experiences (Mennin et al. 2013; Siegel et al. 2009). In particular, those skilled in MBP require sustained attention for prolonged periods (Parasuraman 1998; Posner and Rothbart 1992), capacity to intentionally switch attention, and inhibit secondary elaborative processing of emotions and cognition (Heeren et al. 2009). MAQ item content assessed all aspects of attentional focus in MBP.

Non-judgmental acceptance reflects attitude towards experiences, encompassing curiosity, non-reactivity, openness, and acceptance (2003; Shapiro and Schwartz 2000). In the context of MBP, acceptance serves to counteract habitual thought processes through tolerance of difficult emotional experiences (Hayes and Feldman 2004), and the suspension of judgement regarding the negative implications of those experiences for self-concept (Keng et al. 2016). MAQ item content assessed acceptance of cognitive, emotional, and physiological discomfort.

Formal MBP generally refers to sitting meditation (Williams and Kabat-Zinn 2011) whether it be brief (minutes) or extended (hours), guided or unguided. This includes guided instruction from a teacher concerning the nature and content of practice, physical posture, and the attitudinal and attentional qualities to employ (Hawley et al. 2014). The locus of attention can be any sensory object, including bodily sensations, breath or sounds. Informal MBP, in contrast, involves bringing mindful awareness to daily activities and facilitating the transfer of skills and attitudes cultivated during formal practice into everyday life (Kabat-Zinn 1990). For example, when performing regular household chores, one is encouraged to mindfully attend to each task in order to fully absorb what is occurring in each moment while maintaining the attitude of curiosity, nonjudgment and acceptance. MAQ item content mapped both formal and informal MBP (see Supplementary Information).

The present study tested the psychometric properties of the MAQ. In terms of evaluating construct validity, it was hypothesized that the MAQ would load onto a nested-factor model with one general factor (Practice) and two specific factors (Formal and Informal) representing the two subtypes of practice (Hypothesis 1). It was also hypothesized that the MAQ would demonstrate adequate internal consistency reliability (Hypothesis 2). In terms of evaluating discriminant validity, it was hypothesized that the MAQ would capture mindfulness adherence as a construct distinct from trait mindfulness, as measured by the FMI (Walach et al. 2006) and MAAS (Brown and Ryan 2003; Hypothesis 3). It was hypothesized that quality of both formal and informal mindfulness practice would be more strongly associated with higher levels of trait mindfulness than quantity of practice (Hypothesis 4). Finally, it was hypothesized that MAQ measurement would be stable over time, as indicated by intraclass correlation coefficient (ICC) values > 0.70 (Polit 2014; Hypothesis 5).



Data from two studies were used to examine separate MAQ psychometric properties. Study 1 used a cross-sectional design to examine internal reliability and construct validity. First-year undergraduate medical students who had just completed the core curricular, 5-week Health Enhancement Program (HEP; see below) at Monash University were selected on a convenience basis. During a lecture the total cohort of 310 students was invited to participate in the study. At the subsequent final HEP tutorial, having consented to participate, a total of 282 students returned completed self-administered, de-identified questionnaires (91% response rate). There were no inclusion/exclusion criteria. Participants were predominantly young adults (Mage = 18.50, SDage = 0.98), female (56.4%), racially/ethnically diverse (51.4% Asian, 33.3% Caucasian, and 5.8% other) and single (83.0%, see Supplementary Information). Study 2 used a longitudinal design to explore stability of MAQ measurement and how scores change over the course of a 4-week mindfulness intervention, described further below. The final sample consisted of 55 voluntary participants who were mostly female (80.0%), had English as their first language (85.5%), had completed undergraduate or postgraduate university study (78.2%), and had a broad age range (Mage = 51.4, SDage = 13.6). Twenty-three (41.8%) participants were current meditators at the beginning of the course and 37 (67.3%) had previously studied mindfulness and/or another type of meditation.


For Study 1, classroom tutors administered study measures to participants at the end of their final tutorial session. This timing was chosen in order to minimize disruption to the program whilst still permitting retrospective assessment of mindfulness adherence. The intervention for Study 1, the Health Enhancement Program (HEP), is described in detail elsewhere (Hassed et al. 2008) and should be distinguished from the active control program of the same name developed by MacCoon et al. (2012). In brief, this 5-week program is part of the core curriculum for undergraduate first-year medical students at Monash University, taking place in the second half of the first semester of their five-year undergraduate medical course. Students were provided eight lectures discussing the evidence base underpinning mind-body medicine, mindfulness and lifestyle factors, including the relations among mental and physical health, neuroscience, and psychoneuroimmunology. Theoretical learning was supported by five 2-h experientially-based, practical tutorials comprising the two arms of the HEP, one being mindfulness-based and the other dedicated to lifestyle management (Hassed et al. 2009). Students were recommended to practice formal mindfulness meditation each week (e.g., starting with five minutes of mindfulness meditation twice daily and brief mindful pauses of anything from 15 s to two minutes as often as needed), as well as informally practicing being mindful in daily life. Emphasis was placed on both attentional and attitudinal aspects of cultivating mindfulness. Each week's practice was followed by discussion of experiences and insights in class at the next tutorial. The content of the HEP is examinable, so it is assumed all students were motivated to understand the basic concepts and underlying science and rationale, although personal application of knowledge and skills was optional. However, previous research has shown that once they understand the underlying science and personal relevance, 90.5% of students report personally practicing and applying mindfulness in their own lives, leading to improved indices of mental health, including anxiety, depression and hostility (Hassed et al. 2009), even during the high-stress pre-exam period.

For Study 2, participants were invited to complete the MAQ at the end of each week of a 4-week online intervention, Mindfulness for Wellbeing and Peak Performance, developed and delivered by Monash University staff on the “FutureLearn” digital education platform. Focusing on mindfulness as a means to enhance wellbeing, reduce stress and improve performance, students worked through approximately three hours of material each week including brief, explanatory course videos, curated articles, guided formal mindfulness exercises, informal mindfulness practices, self-reflection, and discussion with mentors and other students via an optional comments forum. As with the HEP, students were encouraged to practice mindfulness both formally (starting with 5 min of meditation twice a day, plus brief mindful pauses as needed) and informally in daily activities, emphasizing attentional and acceptance aspects.


Mindfulness Adherence Questionnaire (MAQ)

The newly developed 12-item MAQ is a self-report measure of adherence to MBP within the past week and was administered at the completion of the HEP. The first two items measure formal practice in terms of frequency and average duration of practice (in minutes). The remaining 10 items (see Supplementary Information) measure the quality of formal practice (e.g., When meditating, how much of the time were you practicing an accepting attitude toward what you were experiencing?) and informal practice (e.g., In your daily life, how much of the time were you practicing paying attention while working or studying?). However, the MAQ does not measure quantity of informal practice due to the inherent difficulties in doing so, as previously discussed (see Crane et al. 2014). Items are scored on a 7-point Likert-scale ranging from 0 (never) to 6 (always), with greater total subscale scores reflecting higher practice quality for that respective practice subtype.

In a preliminary evaluation of the MAQ (N = 260 in Kassim 2016), confirmatory factor analysis (CFA) revealed a two-factor model distinguishing formal from informal practice, with adequate internal consistency (α = 0.79). The correlation between the two factors (formal and informal practice) was significant (r = 0.59, p < 0.001), indicating a relationship between the two latent variables. In the Kassim report, items loaded significantly onto their respective factors (loadings ranging from 0.32 to 0.74 on the FM scale and from 0.59 to 0.76 on the IFM scale), but the factor loading for item four (λ = 0.32) fell below the recommended level (λ > 0.50), indicating that this item did not fit the model well. Finally, the coefficients of determination of the factor loadings (R2) ranged from 0.10 to 0.58, and with the exception of item four, all R2 values were significant (Fig. 1). On the basis of these preliminary findings the MAQ was deemed suitable for further study.

Fig. 1 Confirmatory factor analysis of the two-factor model of the MAQ of formal (FM) and informal (IFM) practice. Standardized coefficients and measurement errors are shown; all paths are statistically significant (p < .001)

In Study 1, Cronbach’s α for the scale was 0.79, which is acceptable internal consistency (Tavakol and Dennick 2011). In Study 2, participants completed the MAQ Quantity items weekly from weeks 1–4 and the Formal and Informal subscale items weekly from weeks 2–4 of a 4-week mindfulness course. The Cronbach’s α in weeks 2, 3 and 4 for the Formal subscale were 0.67, 0.67 and 0.87 respectively, and for the Informal subscale were 0.91, 0.93 and 0.93 respectively.
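The internal consistency coefficients reported here follow the usual definition of Cronbach's α. As a minimal sketch, assuming complete responses arranged as one row per respondent, the coefficient could be computed as below; the function name and the example responses are hypothetical and are not study data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(responses[0])                       # number of items
    items = list(zip(*responses))               # transpose: one tuple per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical responses from five respondents to three 0-6 items (not study data)
example = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [5, 6, 5],
    [3, 3, 4],
]
alpha = cronbach_alpha(example)
print(round(alpha, 3))
```

Sample (n − 1) variances are used throughout, so the ratio of item variance to total-score variance is unaffected by the choice of denominator.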

Freiburg Mindfulness Inventory (FMI)

The 14-item FMI (Walach et al. 2006) measures trait mindfulness, emphasizing the attentional and attitudinal qualities that comprise the construct. Sample items include: “When I notice an absence of mind, I gently return to the experience of the here and now” and “I am impatient with myself and others” (reverse-scored). Items are rated on a 4-point Likert scale ranging from 1 (rarely) to 4 (almost always), with higher total scores reflecting higher trait mindfulness. The FMI has been found to be reliable in non-clinical populations (α = 0.83; Kohls et al. 2009), and captures mindfulness distinct from other potentially similar constructs such as self-awareness and dissociation (Walach et al. 2006). In Study 1, internal consistency was acceptable (α = 0.80).

Mindful Attention Awareness Scale (MAAS)

The 15-item MAAS (Brown and Ryan 2003) also measures trait mindfulness; however, it considers the construct as consisting solely of an attentional component, unlike other scales (e.g., the FMI) that also capture attitudinal components (see Sauer et al. 2012). This is based on an assumption by Brown and Ryan that acceptance aspects of mindfulness are an aspect of and dependent on the attentional component. Sample items include: “I find it difficult to stay focused on what’s happening in the present” and “I break or spill things because of carelessness, not paying attention, or thinking of something else.” Items are rated on a 6-point Likert-scale ranging from 1 (almost always) to 6 (almost never), with higher mean scores reflecting higher trait mindfulness. While it has been critiqued for being negatively worded (e.g., Höfling et al. 2011), it has been widely used in research and has demonstrated reliability (α = 0.89, MacKillop and Anderson 2007) and validity for use in university populations (Osman et al. 2016). In Study 1, internal consistency was acceptable (α = 0.79).

Data Analysis

Data were analyzed using IBM SPSS version 22 (Armonk, NY: IBM Corp. 2013) and Mplus version 7.4 (Muthén and Muthén 1998–2017). Prior to analysis of Study 1 data, item responses on the MAQ, MAAS, and FMI were examined for accuracy of data entry, missing values, outliers, normality, and multicollinearity. Item responses were within the expected range for each scale, indicating no out-of-range values. In total, 65 out of 11,240 values (0.58%) were missing from the dataset. With the exception of MAQ question eight, all scale items had at least some missing data (range: 1 to 8 values missing). Little’s Missing Completely at Random test (Little 1988) was nonsignificant, χ2(447) = 494.78, p = 0.059, indicating missing values occurred completely at random. Thus, the Expectation–Maximization (EM) multiple imputation procedure was used to replace missing values as it tends to yield unbiased parameter estimates (Schlomer et al. 2010).

In examining univariate outliers, raw scores were converted into standardized scores, with cases exceeding ± 3.29 standard deviations beyond the mean deemed outliers (Tabachnick and Fidell 2013). No univariate outliers were identified using this criterion. The presence of multivariate outliers was determined through the Mahalanobis distance statistic at the p < 0.001 level (Kline 2015). Six cases were identified and removed because they represented “spurious activity”, e.g., endorsing items that simultaneously indicated both very low and very high levels of trait mindfulness (see Cousineau and Chartier 2010). This left 276 cases for the final analytic sample. Values of skewness and kurtosis were small for item responses and did not exceed absolute values of 3 and 10, respectively, indicating univariate normality (Kline 2015). An index of weekly formal practice quantity was derived by multiplying formal practice frequency by average practice duration. This index was log-transformed due to non-normality. Visual inspection of matrix scatterplots revealed that relationships between study variables were of a linear pattern. Pearson correlations were computed, and none reached the multicollinearity threshold of r = 0.90 (range 0.01–0.78; Tabachnick and Fidell 2013). However, Small’s Omnibus Test of Multivariate Normality was significant, χ2(20) = 367.43, p < 0.001. Therefore, to compensate for multivariate non-normality, Satorra and Bentler’s (1994) correction was applied in order to create robust standard errors for the model.
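Two of the screening steps described above can be sketched briefly: the ± 3.29 standardized-score criterion for univariate outliers, and the weekly formal practice quantity index (frequency multiplied by average duration) with its log transformation. The function names and example values are hypothetical. The text does not state how zero practice was handled in the transformation, so the +1 offset used here is an assumption.

```python
import math
from statistics import mean, stdev

Z_CUTOFF = 3.29  # criterion cited in the text (Tabachnick and Fidell 2013)


def univariate_outliers(scores, cutoff=Z_CUTOFF):
    """Return the indices of scores whose absolute z-score exceeds the cutoff."""
    m, s = mean(scores), stdev(scores)
    return [i for i, x in enumerate(scores) if abs((x - m) / s) > cutoff]


def formal_quantity_index(frequency, duration_min):
    """Weekly formal practice quantity: sessions per week x minutes per session."""
    return frequency * duration_min


def log_quantity(index):
    """Natural-log transform; the +1 offset is an assumption so that 0 maps to 0."""
    return math.log(index + 1)


# Hypothetical scores (not study data): nineteen typical values and one extreme
scores = [3] * 10 + [4] * 9 + [40]
print(univariate_outliers(scores))   # index of the extreme case

# Recommended starting practice in the HEP: 5 minutes twice daily = 14 sessions
weekly = formal_quantity_index(14, 5)
print(weekly, round(log_quantity(weekly), 3))
```

With a sample standard deviation, a single case can only exceed |z| = 3.29 when the sample is reasonably large, which is why the illustration uses twenty values.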

Structural Equation Modelling (SEM) was utilized in the evaluation of the MAQ. Firstly, a Confirmatory Factor Analysis (CFA) evaluated the factor structure of the MAQ using Satorra and Bentler’s (1994) method of estimation for non-normally distributed data. In CFA, a theoretically derived factor structure is defined a priori and imposed on the data, and model testing is performed to confirm or refute this structure (Brown 2006). A new model was considered for the present study: a nested-factor model comprising two specific factors (Formal and Informal) plus a general factor denoting overall mindfulness practice (Practice). Unlike a model based on two factors plus a second-order factor, items in a nested-factor model load on both specific factors and the general factor simultaneously. Hence, the general factor represents mindfulness practice generally as a conceptually broad “target” construct that the scale measures, whereas specific factors represent the conceptually narrower subdomains of formal and informal practice (Reise 2012). Given that factors studied within psychology rarely represent unidimensional hierarchical constructs (Widhiarso and Ravand 2014), a nested-factor model has a number of advantages over a second-order factor model (Chen et al. 2006). Chiefly, a nested-factor model considers both the independence as well as interdependence of comprising factors, allowing researchers to (1) analyze the contribution of specific factors that are independent of the general factor, and (2) test whether specific factors predict external variables over and above the general factor.

Goodness-of-fit was evaluated using multiple indices: relative Chi-square (ratio of χ2/df), root mean square error of approximation (RMSEA), standardized root mean square residual (SRMR), comparative fit index (CFI), and Tucker–Lewis index (TLI). Good fit is indicated by a relative Chi-square ratio lower than 3:1 (Kline 2015), RMSEA and SRMR values lower than 0.06 and 0.08, respectively, and values for CFI and TLI exceeding 0.95 (Hu and Bentler 1999). However, it should be noted that the use of strict cut-off values is controversial, with some researchers asserting that such fit indices are poor indicators of model “acceptability” (Barrett 2007). Others argue that although fit indices can be subject to misuse, they are valuable criteria for model theory testing (Hayduk et al. 2007). The aforementioned fit indices were retained for this study, but should be interpreted cautiously.

Convergent validity was evaluated by calculating the average variance extracted (AVE) of both factors, which measures the level of variance captured by these factors in relation to the amount of variance due to measurement error (Fornell and Larcker 1981). Convergent validity was evidenced by the AVE exceeding the recommended value of 0.50, indicating that greater than half of the variance observed in the items is accounted for by their hypothesized factors (Fornell and Larcker 1981). Discriminant validity between the two factors was evaluated using Fornell and Larcker’s (1981) criterion. Adequate discriminant validity exists if the AVE of both factors is greater than the variance shared by both (i.e., the squared correlation coefficient). Furthermore, discriminant validity with measures of trait mindfulness (i.e., FMI and MAAS) was considered supported if the AVE was greater than the squared correlation between factors composing these scales. This shared variance was calculated using canonical-correlation analyses (Hair et al. 2014).
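For standardized loadings, the AVE of a factor reduces to the mean of its squared loadings, because each item's error variance is one minus its squared loading. A minimal sketch of the convergent-validity check follows. The loadings are hypothetical, the function names are illustrative, and how the AVE was obtained for the nested-factor model in this study is not specified in the text.

```python
AVE_THRESHOLD = 0.50  # recommended value (Fornell and Larcker 1981)


def average_variance_extracted(loadings):
    """Mean squared standardized loading of the items on one factor."""
    return sum(loading ** 2 for loading in loadings) / len(loadings)


def meets_convergent_validity(loadings, threshold=AVE_THRESHOLD):
    """True when the factor explains more than half of its items' variance."""
    return average_variance_extracted(loadings) > threshold


# Hypothetical standardized loadings for a four-item factor (not study data)
hypothetical_loadings = [0.70, 0.80, 0.60, 0.75]
ave = average_variance_extracted(hypothetical_loadings)
print(round(ave, 3), meets_convergent_validity(hypothetical_loadings))
```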

Internal consistency reliability was estimated using coefficient omega (ω), derived from Widhiarso and Ravand’s (2014) formula for nested-factor models. Within this study, omega reflects the reliability of a total score formed from the combination of the general Practice factor and its corresponding Formal and Informal specific factors. Omega overcomes limitations of Cronbach’s α as it (1) is analytically capable of partialing out measurement error, (2) does not require the assumption of a tau-equivalent model (i.e., that all factor loadings are equal), and (3) is not unduly affected by the dimensionality of the scale. Dunn et al. (2013) provide further review of the differences between Cronbach’s α and omega. Omega is similarly interpreted on a range from 0 to 1.
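The logic of omega for a nested-factor model can be sketched using the common omega-total expression: the squared sum of general-factor loadings plus the squared sums of loadings on each specific factor, divided by that quantity plus the summed error variances. Whether this matches Widhiarso and Ravand's (2014) formula term for term is an assumption. The sketch also assumes standardized items and orthogonal factors, and the loadings shown are hypothetical.

```python
from collections import defaultdict


def omega_total(items):
    """
    Omega for a nested-factor (bifactor) model.

    items: list of (specific_factor, general_loading, specific_loading),
    one tuple per item, using standardized loadings.
    """
    general_sum = sum(g for _, g, _ in items)
    specific_sums = defaultdict(float)
    for factor, _, s in items:
        specific_sums[factor] += s
    # Residual variance of a standardized item under orthogonal factors
    error = sum(1 - g ** 2 - s ** 2 for _, g, s in items)
    explained = general_sum ** 2 + sum(v ** 2 for v in specific_sums.values())
    return explained / (explained + error)


# Hypothetical loadings for six items (not the MAQ estimates)
hypothetical_items = (
    [("Formal", 0.5, 0.5)] * 3 + [("Informal", 0.6, 0.4)] * 3
)
omega = omega_total(hypothetical_items)
print(round(omega, 3))
```

Because each item's loadings enter individually, the expression does not require the equal-loadings (tau-equivalence) assumption that underlies Cronbach's α.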

For participants reporting no formal practice quantity, a score of zero was assigned to each item response on the MAQ Formal subscale (reflecting the lowest quality of practice). This resulted in an asymmetrical and positively skewed distribution for total scores on this subscale. Therefore, the sample was dichotomized into two groups: participants reporting at least some quantity of formal practice (n = 213; 77%); and participants reporting no quantity of formal practice (n = 63; 23%). Pearson correlations were performed to examine cross-sectional relationships between study variables for the formal practitioners only (n = 213). Although MacCallum et al. (2002) argue that dichotomization of quantitative variables often yields misleading results in psychological research, they suggest it may be justified in circumstances such as these as a useful means of revealing statistical relationships. However, whether this grouping represents a true dichotomy of formal practitioners and non-practitioners is yet to be empirically verified, so results must be interpreted cautiously.

Prior to conducting analyses for Study 2, data were examined and found to have no missing data in relevant variables. The distributions of MAQ Quantity, Formal and Informal subscale scores at each timepoint were approximately normal, and scatterplots of scores over time suggested the relationships were approximately linear. Paired-samples t-tests were conducted to examine differences in MAQ subscale scores between each week of course participation. Pearson correlations were calculated to measure the consistency of change over time. ICCs and their 95% confidence intervals were additionally calculated using two-way random effects models to measure absolute agreement between scores over time.
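A two-way random-effects ICC for absolute agreement can be computed from the mean squares of a participants-by-occasions table. The sketch below gives the single-measure form; the text does not state whether single or average measures were used, so that choice is an assumption, and the function name and data are illustrative rather than study data.

```python
def icc_absolute_agreement(scores):
    """
    Single-measure, two-way random-effects ICC for absolute agreement.

    scores: list of rows, one row per participant, one column per occasion.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(col) / n for col in zip(*scores)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)

    ms_rows = ss_rows / (n - 1)                      # between participants
    ms_cols = ss_cols / (k - 1)                      # between occasions
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Illustrative scores: six participants measured on four occasions
illustrative = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_absolute_agreement(illustrative), 2))  # 0.29
```

Because the between-occasion mean square appears in the denominator, systematic shifts in the mean from one week to the next lower this coefficient even when participants keep their rank order, which is what distinguishes absolute agreement from a Pearson correlation.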


For Study 1, descriptive statistics for each included psychometric scale are presented in Table 1. Descriptive statistics for item responses on the MAQ are presented in Table 2. The results of the CFA indicated adequate fit: relative Chi-square = 2.51 (70.28/28), CFI = 0.960, TLI = 0.936, RMSEA = 0.074 (90% CI [0.053, 0.096]), SRMR = 0.050 (see Supplementary Information for standardized factor loadings for the nested-factor model). The relative Chi-square, CFI, and SRMR met the stated criteria, whereas the TLI and RMSEA fell marginally short of them. All item loadings were statistically significant at the p < 0.001 level, indicating that each item loaded on its respective specific factor in addition to the general factor. For the general factor, magnitude of loadings ranged from 0.23 to 0.34 (M = 0.30, SD = 0.04). For the specific factors, loadings differed considerably between the two subscales: loadings on the Formal specific factor ranged from 0.51 to 0.54 (M = 0.52, SD = 0.02), whereas loadings on the Informal specific factor ranged from 0.27 to 0.29 (M = 0.27, SD = 0.01).
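The reported indices can be compared against the cut-offs given in the Data Analysis section with a short check. This is only a bookkeeping sketch: the dictionary layout and function name are illustrative, and, as noted earlier, strict cut-offs should be interpreted cautiously.

```python
import operator

# Cut-offs as stated in the Data Analysis section
CRITERIA = {
    "chi2/df": (operator.lt, 3.0),   # relative Chi-square lower than 3:1
    "RMSEA": (operator.lt, 0.06),
    "SRMR": (operator.lt, 0.08),
    "CFI": (operator.gt, 0.95),
    "TLI": (operator.gt, 0.95),
}


def evaluate_fit(indices):
    """Map each index name to True if it meets its cut-off, else False."""
    return {
        name: compare(indices[name], cutoff)
        for name, (compare, cutoff) in CRITERIA.items()
    }


# Values reported for the nested-factor model in Study 1
reported = {
    "chi2/df": 70.28 / 28,
    "RMSEA": 0.074,
    "SRMR": 0.050,
    "CFI": 0.960,
    "TLI": 0.936,
}
result = evaluate_fit(reported)
for name, ok in result.items():
    print(f"{name}: {'meets' if ok else 'does not meet'} the cut-off")
```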

Table 1 Descriptive statistics for scores on psychometric scales
Table 2 Descriptive and reliability statistics for the MAQ

Coefficient omega was 0.733 for the nested-factor model, which is good for a newly developed scale (Kline 2015). This value indicated that 73.3% of total score variance was attributed to the combination of the general Practice factor and its corresponding specific Formal and Informal factors. The AVE value for the nested-factor model was calculated to be 0.455, which was just below the acceptability value of 0.50 (Fornell and Larcker 1981). This indicated that slightly more of the variance in MAQ items was due to measurement error than to the intended construct of mindfulness adherence. When each subscale was analyzed separately, the Informal and Formal subscales yielded reliability coefficients of 0.428 and 0.669, respectively. That reliability was substantially lower when the subscales were partialed out and analyzed separately suggests that the reliability of the MAQ is greater than the sum of its parts. Thus, it is imperative to consider both subscales simultaneously.

The canonical-correlation between the Formal and Informal factors was 0.644. This squared value (0.415) was not greater than the AVE value of 0.455, indicating that discriminant validity between these two constructs was supported (Fornell and Larcker 1981). Furthermore, the canonical-correlations between the nested-factor model of the MAQ and measures of trait mindfulness (MAAS and FMI) were 0.387 and 0.429, respectively. Neither of these squared values (0.150 and 0.184, respectively) exceeded the AVE value of 0.455, indicating that discriminant validity between these measures was also supported. Three indices of mindfulness adherence were considered for the following analyses, as measured by the MAQ: formal practice quantity, formal practice quality, and informal practice quality. Both formal and informal practice quality positively and significantly correlated with cross-sectional levels of trait mindfulness, as measured by both the MAAS and FMI (see Table 3). In particular, informal practice quality more strongly correlated with trait mindfulness than formal practice quality. However, formal practice quantity did not significantly correlate with any measure of trait mindfulness, nor with any MAQ subscale.
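The discriminant validity comparisons above amount to squaring each canonical correlation and checking that the result stays below the AVE. A minimal sketch using the values reported in this section follows; the variable and function names are illustrative.

```python
AVE = 0.455  # AVE reported for the nested-factor model

# Canonical correlations reported in the text
correlations = {
    "Formal-Informal": 0.644,
    "MAQ-MAAS": 0.387,
    "MAQ-FMI": 0.429,
}


def fornell_larcker(ave, correlation):
    """Return (shared variance, whether the AVE exceeds it)."""
    shared = correlation ** 2
    return shared, ave > shared


checks = {pair: fornell_larcker(AVE, r) for pair, r in correlations.items()}
for pair, (shared, supported) in checks.items():
    print(f"{pair}: shared variance = {shared:.3f}, supported = {supported}")
```

The margin is narrowest for the Formal and Informal factors (0.415 against 0.455), so that comparison is the one most sensitive to sampling error in the AVE estimate.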

Table 3 Pearson correlations for psychometric scale scores

For Study 2, mean (SD) MAQ subscale scores from four consecutive weeks of measurement are shown in Table 4. Significant changes in MAQ Quantity scores were seen between weeks 1, 2 and 3 but not between weeks 3 and 4. Formal subscale scores increased significantly between weeks 2 and 3 but not between weeks 3 and 4. Significant Informal subscale changes were seen between weeks 2 and 3 and between weeks 3 and 4. As a measure of test–retest reliability, strong positive correlations were evident between consecutive week measures of all MAQ subscales (see Table 4). Absolute agreement was high (ICC = 0.83 to 0.85) for the Quantity subscale from weeks 2 to 4 but lower for weeks 1 to 2 (ICC = 0.44), likely reflecting the increase in meditation quantity after commencing the mindfulness course. ICCs for the Formal subscale ranged from 0.59 to 0.75 and for the Informal subscale from 0.68 to 0.79.

Table 4 How MAQ scores differed over time during participation in a 4-week mindfulness course


The present study evaluated the factor structure and psychometric properties of the MAQ, a newly developed self-report tool to comprehensively measure MBP adherence. In evaluation of the first hypothesis, that the MAQ would fit a nested-factor model distinguishing between formal and informal practice, the CFA indicated adequate fit in the final analytic sample of 276 medical students (H1). Items comprising MAQ subscales consistently loaded on both the general factor (Practice) and its corresponding specific factors (Formal and Informal), suggesting that researchers may consider this model as a viable alternative to other conceptually related yet distinct models (e.g., those with correlated factors without a dominant general factor).

With respect to the second and third hypotheses, the internal consistency reliability of the MAQ was adequate (H2), and discriminant validity analyses revealed that the MAQ captured MBP adherence as a construct distinct from trait mindfulness (H3). As previous measures of state or trait mindfulness do not assess processes during formal and informal MBP, there exists a need to measure practice quality and quantity with an instrument such as the MAQ in order to see how practice quality and quantity correlate with potential outcome measures like mental health, quality of life and well-being. The processes undertaken during the practice of mindfulness may be a distinct but vital factor determining outcomes over and above changes in state or trait mindfulness.

It was hypothesized that quality of both formal and informal mindfulness practice would be more strongly associated with higher levels of trait mindfulness than quantity of practice (H4). Support for the hypothesis was obtained; practice quality was associated with higher levels of trait mindfulness in cross-sectional evaluation. These results corroborate previous research highlighting the positive impact of practice quality on post-intervention outcomes (Del Re et al. 2013; Goldberg et al. 2014). In contrast, practice quantity was not associated with trait mindfulness, consistent with studies citing weak or nonsignificant relationships between quantitative measurements of mindfulness adherence and self-reported outcomes (Ribeiro et al. 2017; Vettese et al. 2009).

In examining MAQ stability over time (H5), the MAQ Quantity, Informal and Formal subscales were shown to be sensitive to change and were strongly positively correlated across consecutive weeks, which suggested consistent measurement of change. While ICC values for consecutive week measurements exceeded 0.70 for most time-points, early change in meditation quantity and later change in meditation quality showed lower absolute agreement.

The present study investigated the utility of the MAQ to measure both formal and informal MBP and to differentiate quantity and quality of such practice. This was achieved by demonstrating in a non-clinical sample that: (1) MBP adherence is a multifaceted construct consisting of two distinct subtypes of practice (formal, informal) rather than a unitary construct; and (2) in terms of measuring and scoring MBP quality, it may be appropriate for researchers, clinicians and other practitioners to consider the joint functioning of a general factor representing overall mindfulness practice (practice) and its corresponding specific factors (formal and informal). Our nested-factor modelling approach is advantageous because, unlike other statistical models, it accounts for the correlations between the formal and informal factors and is therefore more theoretically “correct”, given the well-supported assumption that similar psychological constructs tend to be correlated (Widhiarso and Ravand 2014).

The patterns of factor loadings on the MAQ suggested that deriving subscale scores was empirically justified, which will be useful for researchers and clinicians seeking to assess the separate contributions of formal and informal practice to outcomes. Specifically, the relative magnitude of factor loadings indicated that the formal and informal specific factors accounted for meaningful, additional variance in MAQ items, even after controlling for variance due to the general practice factor. While it would be preferable for researchers to recreate the nested-factor model of the MAQ with sufficiently large sample sizes (e.g., n > 200), the findings nevertheless suggest that, with smaller sample sizes, it may be appropriate to derive subscales by simply summing items, albeit with a corresponding increase in measurement error (Brouwer et al. 2013).

The findings regarding discriminant validity are important because they further elucidate what the MAQ actually measures. If the MAQ shared much of the same variance as measures of trait mindfulness (e.g., the FMI and MAAS), then it could not be assumed that the MAQ is capturing a distinct construct of mindfulness adherence (Singh 1991). Theoretically, mindfulness adherence represents one’s actual practice and application of mindfulness and should therefore be separate from the outcomes of such adherence (i.e., increased trait mindfulness). It was hypothesized that the MAQ would capture mindfulness adherence as a construct distinct from trait mindfulness (as measured by the MAAS and FMI). Findings empirically support this notion. It should be noted that many CFA studies rely upon model fit indices in demonstrating construct validity (Hayduk et al. 2007). The fact that the present study considers other statistical metrics such as discriminant validity is an important methodological strength of this study. In addition to demonstrating the MAQ to have adequate psychometric properties, the present study also investigated the relationship between adherence to mindfulness home practice and levels of self-reported trait mindfulness.

Considering the 91% response rate for Study 1, we are confident that our data are a fair representation of the overall cohort, although caution should be taken regarding the generalizability of these findings to other student or community samples. In explaining the null finding for practice quantity, it may be that merely attempting practice is insufficient in terms of cultivating trait mindfulness. Practice quantity says little about whether one is bringing the appropriate attitudes and consistency necessary for skill acquisition (Ericsson et al. 1993), or about the broad array of mindfulness-specific appraisals that theoretically determine adherence (Kazantzis and L’Abate 2005). For instance, a client who falls asleep or becomes uninterested during meditation or informal practice will presumably not cultivate trait mindfulness (Del Re et al. 2013). Similarly, a client who does not fully understand the rationale, or comprehend the steps involved in formal or informal practice, may be less likely to adhere to recommended practice. Mindfulness experts have instead emphasized the importance of practice quality; Kabat-Zinn (1994), for example, stated that “five minutes of formal practice can be as profound or more so than forty-five minutes…the sincerity of your effort matters far more than elapsed time” (p. 123). More broadly, the psychotherapy literature has demonstrated not only that homework quality and quantity both operate to produce outcomes (Kazantzis et al. 2016), but also that the former contributes to outcomes to a greater extent, presumably for the reasons outlined above (see Neimeyer et al. 2008 for an example of cognitive reappraisal). However, it must be noted that both frequency (approximately twice in the prior week) and average duration (approximately 5 min) of formal practice were much lower than those observed in many other mindfulness studies. In MBSR, for example, clients may perform formal practice for up to 45 min per day (Kabat-Zinn and Chapman-Waldrop 1988).
This is perhaps expected considering there was no self-selection of participants, and motivation may not have been strong for many students with the significant majority not experiencing clinical levels of psychological distress and being required to learn mindfulness as a course requirement. Hence, low formal practice quantity cannot be ruled out as a possible explanation for its non-significant association with levels of trait mindfulness.

From a practical perspective, the MAQ provides mindfulness instructors with a convenient means of monitoring a client’s between-session mindfulness adherence. For example, there are many people who may find formal meditation practice difficult or confronting and yet may benefit significantly from informal practice. Unless mindfulness adherence and quality of practice are measured in a more comprehensive way, this important point may be lost.

While Study 1 robustly demonstrated the internal reliability and construct validity of the MAQ, test–retest reliability and sensitivity to change were examined in Study 2. ICCs between measurements over consecutive weeks approached or exceeded the previously recommended threshold of 0.70 (Polit 2014). Weaker absolute agreement was found between course weeks 1 and 2 for the Quantity subscale (ICC = 0.44) and between weeks 3 and 4 for the Formal subscale, which likely reflects changes in practice quantity and quality influenced by course participation. Correlations between consecutive week measurements ranged from r = 0.57 to 0.85, which also suggests acceptable consistency of change measured using the MAQ. Examination of the extent of change in the Formal and Informal subscales across weeks of mindfulness course participation also revealed different trajectories of change. The quality of Formal meditation practice showed a smaller increase from week 2 to week 3 and no change between weeks 3 and 4. In contrast, Informal mindfulness practice showed increases across all time periods. Whether the different trajectories for formal and informal mindfulness practice were affected by the extent of meditation practice at course commencement could be further explored in future research. However, the finding of different change trajectories for formal and informal mindfulness practice reinforces the importance of separately measuring the quality of formal and informal mindfulness practice (as is done via the MAQ).

Notwithstanding the important contributions this research makes to the mindfulness intervention literature, some limitations must be mentioned. Firstly, the nested-factor model had an AVE value (0.455) slightly below the recommended value of 0.50, indicating that summing items to derive subscale scores is associated with 54.5% measurement error. However, given that the model fit and reliability were adequate, reliance on AVE as a sole measure of convergent validity may not be necessary (Borsboom et al. 2004). Results of the model should nevertheless be viewed as provisional and in need of replication. Secondly, the study used a correlational approach rather than an experimental approach when examining practice-outcome relationships. While it may seem intuitive that mindfulness adherence temporally precedes and causes higher trait mindfulness, the inverse direction is also plausible. For instance, an individual with higher trait mindfulness may be less distractible and more likely to remember to engage in home practice. More nuanced time-varying analyses would be helpful to confirm changes in variables across MBIs (Adolph et al. 2008) and to determine whether increased adherence temporally precedes changes in trait mindfulness. Not only would this approach strengthen causal inferences, but it may also reduce the impact of retrospective response biases (e.g., social desirability and mood effects) that occur when adherence is only measured post-intervention (Hoyt et al. 2006; Kazantzis et al. 2001). Thirdly, this research only considered a single outcome measure (trait mindfulness). One may presume, on the basis of prior literature, that trait mindfulness at least partially mediated other psychological benefits that were not measured (Dobkin and Zhao 2011; Nyklíček and Kuijpers 2008).
Nevertheless, future research may do well to confirm the direct relations between adherence and a broader array of clinical (e.g., depressive and anxious symptoms) and non-clinical (e.g., study or workplace performance) variables. Moreover, follow-up of outcomes weeks and months beyond the end of intervention is recommended. This is especially warranted in light of research indicating that the superior benefits of practice quality become even more salient when assessed at follow-up (Kazantzis et al. 2016).
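
The first limitation above rests on the relationship between AVE and error variance. As a minimal sketch, AVE can be computed as the mean of the squared standardized loadings, with the remainder (1 − AVE) taken as the average share of item variance attributable to measurement error. The loadings below are hypothetical and serve only to show the calculation; only the AVE of 0.455 comes from the text.

```python
def average_variance_extracted(loadings):
    """AVE as the mean of squared standardized factor loadings."""
    return sum(loading ** 2 for loading in loadings) / len(loadings)


# Hypothetical standardized loadings, invented for illustration only
hypothetical_loadings = [0.60, 0.70, 0.65, 0.75]
ave_example = average_variance_extracted(hypothetical_loadings)

# Value reported in the text for the nested-factor model
reported_ave = 0.455
error_share = 1 - reported_ave  # average proportion of item variance due to error

print(f"Example AVE from hypothetical loadings: {ave_example:.3f}")
print(f"Error share implied by reported AVE: {error_share:.1%}")
```

Under this reading, an AVE below 0.50 means that, on average, more item variance is attributable to error than to the underlying factors.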

It could be argued that replacing informal practice scores for low-scoring participants and dichotomizing the MAQ scores in the analyses is problematic because dichotomization of continuous scores discards meaningful variance. Although such dichotomization loses variability, Baneshi and Ar (2011) have demonstrated that dichotomization of a continuous variable can actually increase the sensitivity of the model, thereby improving the prediction of group membership. It is also worth noting that responses to the formal subscale of the MAQ showed an average at the low end of the range, with a large standard deviation (M = 10.87, SD = 6.64). The lack of correlation between formal mindfulness quality and measures of trait mindfulness may be due to the low formal mindfulness quality scores, but further research will be required to confirm this, given the difficulty of interpreting results from the low end of the scale in the current study.

Another factor worthy of consideration is that, although it would have been optimal to administer the MAQ on a weekly basis in Study 1 to measure fluctuations in weekly practice quality and quantity, concerns about the burden of data collection made this unfeasible for students completing the program as a part of core curriculum during a period of high academic load. This increases the potential for response bias (Van Dam et al. 2017). Hence, a second sample was recruited in Study 2, comprising volunteers from a free online mindfulness intervention who opted to complete the MAQ for each week of the course. However, because the Study 2 sample was small, future research should examine weekly adherence in larger samples of self-selected participants engaged in a mindfulness program as a part of a research project, rather than in a non-self-selected group where the research is embedded within a core curricular mindfulness program. Future research should also relate the MAQ subscales to outcome measures typically associated with improvements following mindfulness interventions, in order to determine whether formal, informal, and practice measures independently relate to improved outcomes. Finally, participants in Study 1 comprised a community sample of medical students with presumably little knowledge of the complexity of mindfulness practice. In one sense this is a strength of the current research: the results indicate that the MAQ shows validity in measuring mindfulness in individuals without specialist knowledge of the field. However, some research suggests that the meaning of items comprising self-report mindfulness questionnaires tends to be interpreted differently based on one’s mindfulness experience. In other words, these items possess differential item functioning (DIF; see Grossman and Van Dam 2011; Van Dam et al. 2009). However, whether DIF is present for the construct of mindfulness adherence (as opposed to trait mindfulness) remains unclear.
Demonstration of support for a nested-factor model across a diverse range of clinical and non-clinical populations would help confirm the measurement invariance of the MAQ.