Background

Amidst legalization and regulation of recreational cannabis in Canada in 2018, and legalization of medicinal and/or recreational use in 40 American States, prenatal cannabis use (PCU) is rising [1, 2]. Despite recent studies showing associations between PCU and adverse maternal, infant, and child outcomes, such as pregnancy anemia, preterm birth, small for gestational age, placental abruption, neonatal intensive-care unit (NICU) admission, and intrapartum stillbirth [3,4,5,6], the evidence remains conflicting [7,8,9,10,11,12]. A critical limitation of published studies is a high risk of misclassification bias resulting from a lack of standardized measurement of PCU across adequate domains, including frequency, dose, modes, timing of use in pregnancy, and second-hand smoke and vapour. There is an urgent need for high-quality cannabis-related health research, and pregnant individuals and infants have been identified as priority populations [9, 10, 13]. Improved measurement of PCU in research is a key component to improving the quality of the evidence.

Current PCU measurement options available for research include administrative data collected during routine prenatal care, substance use disorder (SUD) screening tools, non-validated questionnaires, and biological tests. Administrative data is problematic for research use because pregnant people are known to under-report prenatal substance use to physicians, due to fears of judgement and/or being reported to child services [14, 15]. Further, PCU screening is not standardized practice, occurs variably, and is seen as low-priority for the majority of obstetricians [16]. While Canadian studies using administrative data have reported PCU prevalence between 2 and 3% [2,3,4], emerging evidence from an anonymous Canadian survey indicates an 11% prevalence of PCU [17]. In a US study only 36% of women with cannabis-positive urine tests had reported their use to a care provider [18], indicating that the majority of those using cannabis prenatally may be misclassified in administrative data studies. Because those whose infants are at higher risk of PCU-related outcomes may also be less likely to report their PCU due to being younger, and socioeconomically disadvantaged, the current evidence may substantially underestimate the impacts of PCU on infant health outcomes.

While self-administered research questionnaires can garner more accurate reporting of substance (e.g. alcohol) use in pregnancy than screening in clinical settings [19, 20], non-standardized survey questions have limited utility for measurement of PCU, as they can unintentionally convey perceived bias against PCU. They often identify cannabis as an illicit drug and do not differentiate between medicinal and recreational use, which may increase response bias, as the stigma of recreational use in pregnancy is higher, and people may be more willing to disclose cannabis use if they can attest that it is for medicinal purposes [18, 21]. Survey questions are problematic for studying nuanced associations with maternal and infant health outcomes due to inconsistent assessment of frequency and timing of use, including changing patterns through pregnancy, and often lack dose measurement, or use subjective dose-terminology [9, 10, 22,23,24,25,26,27,28]. Further, most lack measurement of potentially important consumption modes aside from smoking (vapourized, edible, topical, second-hand) [22,23,24]. Standardized SUD screening tools aim to detect a diagnosable SUD, and do not measure patterns of PCU throughout pregnancy [29]. Many screen for alcohol misuse alone [30,31,32,33], or combine all drugs into a single category [29], preventing the separate evaluation of cannabis-related health outcomes. Biological (urine/blood/saliva) cannabis-screeners exist, but are limited to detection within 1–5 weeks of use, or up to 30–60 days in exceptional circumstances after high-dose long-term use, depending on individual metabolism and test cut-off levels [34,35,36,37,38]. Given that pregnancy is a 40-week period, the utility of these tests is limited. Biological samples are also resource-intensive and stigmatizing to collect, limiting their utility for prospective research.

Our study developed and validated a novel PCU measurement tool that addresses the limitations of current measurement methods. We used a prospective, patient-oriented approach to identify patient-perceived stigma and reduce perceived sources of response bias, following a six-step, peer-reviewed process [39].

Methods

We recruited pregnant participants who used cannabis prior to or during pregnancy between 08/2019 and 04/2020 for the mixed-methods tool development phase, and recruited an external validation cohort between 04/2022 and 12/2022. We used social media advertising targeted to women aged 18–45 years, residing in Alberta, with listed interests or group memberships related to pregnancy, parenting, and/or cannabis, and posted gender-neutral recruitment ads in an online transgender parent support group. Study recruitment letters were also mailed to patients who visited Alberta Health Services (AHS) clinics for pregnancy-related care in the preceding six months, identified using pregnancy-related codes in the National Ambulatory Care Reporting System (NACRS) (Appendix A). We included participants meeting target criteria who were < 36 weeks’ gestation at intake. Our target development sample size of 150 participants was sufficient to detect a Cronbach’s alpha of ≥ 0.9, with 95% confidence for test–retest reliability on a tool that contains up to 15 items [39], and our external convenience sample of 85 participants was feasible for conducting urine tetrahydrocannabinol (THC) bioassays with available resources.

Step 1 qualitative interviews

We conducted individual in-depth interviews (IDIs) with 8 participants who used cannabis prenatally, and 2 who used previously but not in pregnancy, purposively selected from the full sample (Fig. 1). Two research assistants with qualitative interview training conducted telephone interviews at a time chosen by the participant, about views and experiences with cannabis use in general, and during pregnancy. Prior to interviews, research staff contacted participants twice to discuss study details, including confidentiality, and to establish a trusting relationship by disclosing their own connections to the study topic, emphasizing a non-judgmental approach, and acknowledging that all experiences shared were important. We recorded and transcribed interviews verbatim, and used deductive thematic analysis to extract pre-determined themes of: language around cannabis and its use; perceptions of stigma and judgement, and their relationships to truthful disclosure of use; patterns of use in pregnancy (timing, frequency of use, typical dose); motivations for use; and forms of cannabis used. Two team members experienced in qualitative methods coded salient content that corresponded to the pre-determined themes, and collapsed codes into broader themes using the constant comparison technique, discussion and consensus. Themes were then reported back to the qualitative participants via email for member-checking of their relevance and appropriateness to ensure truth value.

Fig. 1 Study flow diagram. A: Development phase (08/2019 to 04/2020). B: External validation phase (04/2022 to 12/2022)

Step 2 devising items

We devised constructs for the tool and item wording, and drafted the tool based on strengths and shortcomings identified in existing SUD tools and published survey questions (Table 1), and on themes identified from interviews. We eliminated double-barreled questions and ambiguous wording, and ensured a 6th-grade reading level.

Table 1 Measurement domains of existing prenatal cannabis measurement options.

Step 3 cognitive interviewing and bias reduction

Schwartz and Oyserman [42] propose five stages of cognition required to accurately self-report behaviour, each of which is susceptible to bias: 1. question understanding, 2. recalling relevant behaviour, 3. inference & estimation, 4. mapping answer onto response options, and 5. answer editing. Cognitive interviews (CIs) assess the cognitive processing of each item and its response options by a respondent as they read and respond to the tool. To identify points of bias at all five stages of cognition, we conducted individual CIs with an additional sub-sample of participants from the full sample, in which respondents were asked to think aloud, and share impressions, understanding, and reasoning related to each of the five stages of cognition, as we administered the newly developed tool [43]. CI participants were recruited sequentially via social media advertising. We iteratively revised items according to participant feedback prior to each subsequent interview, until no new suggestions for revision were made in two consecutive interviews (after interview 17).

Step 4 content validation

We then formatted the refined items into the CEPT online tool and compared it with our critical appraisal of existing tools to ensure it captured all domains of measurement that are critical to prospective research on cannabis in pregnancy, including timing, multiple modes of consumption, dose per use and frequency of use.

Step 5 convergent and discriminant validation

We then administered the finalized CEPT, along with the SURP-P [44] and 4Ps+ [30] SUD screening tools, via electronic questionnaire to our remaining sample of 150 women. We measured concurrent validity of CEPT responses against detailed cannabis use information revealed during the interviews using Cohen’s weighted kappa. There is strong evidence that a high degree of truth value can be achieved with rigorous qualitative interview techniques [45]. We assessed discriminant validity of CEPT responses against the SURP-P and 4Ps+ tools using Cohen’s kappa. We calculated internal consistency on all CEPT cannabis consumption items using Cronbach’s alpha, acknowledging that the CEPT measures multiple constructs of cannabis exposure (i.e. any use, frequency, timing, dose, mode and reasons), rather than a single construct. However, we anticipated internal consistency among the CEPT items, as a person indicating use should have non-zero responses for dose, mode, frequency and reasons for use. We then re-administered the tool to all development-phase participants (n = 150) 3 months later to assess test–retest reliability using a weighted Cohen’s kappa (Fig. 1a, b).
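As an illustration of the agreement statistic named above, the following is a minimal sketch of Cohen's weighted kappa for two paired sets of ordinal responses (for example, CEPT responses versus interview-derived responses). The function name, the example category labels, and the choice between linear and quadratic disagreement weights are assumptions made for illustration; the weighting scheme used in the study is not stated here.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, categories, weights="linear"):
    """Cohen's weighted kappa for two paired lists of ordinal ratings.

    `categories` lists every possible response in ordinal order.
    `weights` is "linear" or "quadratic" (an illustrative assumption).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be paired and non-empty")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Observed joint counts and each rater's marginal counts.
    joint = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    marg_a = Counter(index[a] for a in rater_a)
    marg_b = Counter(index[b] for b in rater_b)

    def disagreement(i, j):
        # 0 for identical categories, 1 for the two most distant ones.
        d = abs(i - j) / (k - 1) if k > 1 else 0.0
        return d if weights == "linear" else d * d

    observed = sum(disagreement(i, j) * c / n for (i, j), c in joint.items())
    expected = sum(
        disagreement(i, j) * (marg_a[i] / n) * (marg_b[j] / n)
        for i in range(k)
        for j in range(k)
    )
    if expected == 0:
        # Both raters used one identical category; kappa is undefined,
        # and returning 1.0 here is a convention, not a computed value.
        return 1.0
    return 1.0 - observed / expected


# Hypothetical responses, not study data.
levels = ["never", "monthly", "weekly", "daily"]
tool = ["never", "weekly", "daily", "monthly", "weekly"]
interview = ["never", "weekly", "weekly", "monthly", "weekly"]
print(round(weighted_kappa(tool, interview, levels), 3))
```

With a single pair of categories (any use: yes/no), the weighted and unweighted statistics coincide, so the same function covers the binary comparisons.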

Step 6 external validation

In an additional external sample of 84 pregnant participants, we validated CEPT responses against urine bioassay measurements of 11-nor-9-carboxy-Δ9-THC, the most abundant THC metabolite (Fig. 2). Participants provided urine samples in sterile collection containers that were shipped frozen to our laboratory by pre-paid courier for analysis, within 24 h of completing an online questionnaire including the CEPT. We stored samples at −80 °C until analysis. Aliquots of 2 mL were taken from thawed samples, centrifuged, diluted (10×) with ultrapure water, and assayed in duplicate using a 96-strip-well THC Metabolite ELISA Kit (catalogue # 701570, Cayman Chemicals™, United States of America) according to the manufacturer’s protocol, by team members blinded to CEPT results. No freeze–thaw cycles were allowed, and the lowest threshold of THC positivity detectable by the kits with 88% sensitivity (0.072 ng/mL) was used to classify those with PCU versus those without [40].

Fig. 2 Cognitive interviews: bias reduction for the five stages of cognition

Results

Our sample included 254 pregnant women (including those who used cannabis in pregnancy or in the past, and those who had never used cannabis), 170 in the development phase and 84 in the external validation cohort. Despite efforts to recruit gender-diverse participants, none enrolled in the study. Specific sub-samples participated in various steps (Fig. 1). Table 2 summarizes participant characteristics at enrollment. Other sociodemographic characteristics of our sample were similar to the overall maternal population in Canada [46,47,48] (Suppl. Figure 1).

Table 2 Participant characteristics at enrollment

Qualitative interviews

We completed qualitative data collection after 10 interviews, when we reached thematic saturation (no new themes emerged). Summaries of deductive themes and illustrative quotes are presented in Table 3.

Table 3 Deductive themes and illustrative quotes (n = 10)

Interviews informed bias-minimizing language and wording, tool structure, and appropriate response options for frequency, dose and reasons for use. Themes drove the terminology and language used in the tool preamble and questions, guided tool structuring including inclusion of specific items (e.g. reasons for use) and response options, and determined the method of dose measurement. While legalization was perceived to have reduced stigma around cannabis use in general, perceptions of stigma against prenatal use were prevalent and thus important to consider in order to encourage accurate disclosure. Several participants noted that including a response option to disclose cannabis consumption that occurred only prior to pregnancy recognition was crucial, and noted that if this option was not present, they would not report use, even if they had consumed cannabis prior to pregnancy recognition. A challenging aspect of cannabis consumption measurement is identifying dose. IDI results identified a reliable method of categorizing approximate dose per use (i.e. comparing amounts to common objects, like food items or coins). Approximate THC/CBD content can be inferred based on mean THC content of dried cannabis available on the contemporary market (24%) [49], or the labeled concentration of products as reported by participants (Supplementary file 2).
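The inference of approximate THC content from a reported amount of dried cannabis is simple arithmetic, sketched below. The function name and the gram-based input are illustrative assumptions; the 24% default is the mean THC content cited above, and a participant-reported labeled concentration can be passed in its place. The result is the THC content of the product, not an absorbed dose.

```python
def approx_thc_mg(dried_grams, thc_fraction=0.24):
    """Approximate THC content (mg) of a given mass of dried cannabis.

    thc_fraction defaults to the 24% market mean cited in the text;
    pass a labeled concentration (as a fraction) when one is reported.
    """
    if dried_grams < 0:
        raise ValueError("dried_grams must be non-negative")
    if not 0 <= thc_fraction <= 1:
        raise ValueError("thc_fraction must be between 0 and 1")
    return dried_grams * 1000.0 * thc_fraction


# Hypothetical example: half a gram at the default concentration,
# and the same amount at a labeled concentration of 18%.
print(approx_thc_mg(0.5))
print(approx_thc_mg(0.5, thc_fraction=0.18))
```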

Cognitive interviews

We completed cognitive interviews with a separate sub-sample of 17 participants to assess and minimize points of bias through participant-led refinement (Fig. 2). This resulted in 9 sequential iterations of our initial draft tool. Perceived sources of bias at all five stages of cognition were identified, and changes were made based on participant feedback.

Question understanding

Most draft-tool questions were well understood; however, some changes were made to improve clarity.

Recalling relevant behaviour

All participants indicated they were accurately able to recall details of first-hand cannabis consumption, including frequency, trimester of consumption, reasons, modes, and amounts per use. Nearly all participants (93%) indicated they were able to accurately recall the details of second-hand cannabis smoke or vapour exposure, aside from brief outdoor exposures. We amended the second-hand exposure question to include exposure while in the same room as the user.

Inference & estimation

Participants did not express concerns about inference or estimation on items measuring any consumption/exposure, or frequency, timing or reasons for use. Dose questions were adjusted to address perceived ambiguity and aid with estimation (Fig. 2).

Mapping answers onto response options

Several participants noted problems with the initial dose-per-use options: increments for some product types were deemed too large for use in pregnancy, so we refined categories to align with appropriate ranges and increments.

Answer editing

No participants expressed the need to edit responses once the above clarifications and response-option edits had been made. Participants agreed the tool was non-judgemental, appropriate, and acceptable to them in pregnancy, and that it would elicit truthful responses, confirming face and content validity from the participant perspective.

The final CEPT measures weeks of gestation, second-hand exposure, partner use, trimester(s) of use, frequency, reasons, modes of consumption, and dose per use for each mode indicated. Frequency items repeat for each trimester, and dose items for each mode of use indicated (Supplement 2).

Validity and reliability

Concurrent validity was excellent, with agreement between IDI participant CEPT responses and use reported in IDIs, ranging from 80 to 100%, and kappa values ranging from substantial (0.72) to perfect (1.0) [50] (Table 4). The timing of use construct showed the lowest level of agreement, which was expected, given that the second administration of the CEPT was at a later point in pregnancy. Use will be reported in more trimesters as a pregnancy progresses. A greater proportion of participants (40%) reported third-trimester use on the online CEPT, compared with IDIs (30%), which occurred 5–6 weeks prior, as many were not yet in the third trimester at the time of IDI.

Table 4 Concurrent validity of the CEPT vs. in-depth interview (n = 10)

Discriminant validation indicated poor agreement between the CEPT and two pregnancy SUD screening tools (4Ps+ and the SURP-P) [33], with weighted kappa values ranging from -0.31 to 0.36, indicating that the CEPT measures different constructs from those on the existing tools (Table 5).

Table 5 Discriminant validity of CEPT versus SUD screening tools (n = 153)

Reliability testing showed excellent internal consistency (Cronbach’s alpha = 0.92) and substantial to near-perfect kappa values (0.71–0.99) for test–retest reliability (Table 6). Although some patterns of use may be expected to change throughout pregnancy, the strong agreement between early and late pregnancy responses on the CEPT supports that recall of cannabis consumption using this tool is reliable up to delivery.
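For readers less familiar with the internal-consistency statistic reported here, the sketch below computes Cronbach's alpha from a respondents-by-items score matrix using the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The function name and the toy scores are hypothetical, and numeric coding of the CEPT items is assumed.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    n_items = len(rows[0])
    if n_items < 2:
        raise ValueError("need at least two items")
    if any(len(r) != n_items for r in rows):
        raise ValueError("every respondent must answer every item")

    # Sample variance of each item across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(n_items)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(r) for r in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical coded responses (rows = respondents, columns = items).
scores = [
    [0, 0, 0],
    [1, 1, 2],
    [2, 2, 2],
    [3, 2, 3],
]
print(round(cronbach_alpha(scores), 3))
```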

Table 6 Test–retest reliability of the CEPT—3-month interval (n = 153)

CEPT-reported cannabis use was valid against urine-THC bioassay with 100% sensitivity, and 82% specificity, indicating that it has promise as an improved measure of PCU for research purposes (Table 7). All participants with positive urine bioassay disclosed that their last cannabis use was within 1 week of the urine sample being collected, indicating that the time elapsed since last use was the main driver of lower specificity.
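The sensitivity and specificity reported above follow from a 2×2 comparison of self-report against the bioassay, which serves as the reference. A minimal sketch is given below; the function name and the counts in the example are hypothetical and are not the study's Table 7 data.

```python
def sensitivity_specificity(self_report, reference):
    """Return (sensitivity, specificity) of self-report against a reference test.

    Both arguments are equal-length sequences of booleans, one per participant.
    """
    if len(self_report) != len(reference):
        raise ValueError("inputs must be paired")
    pairs = list(zip(self_report, reference))
    tp = sum(1 for s, r in pairs if s and r)          # reported use, assay positive
    fn = sum(1 for s, r in pairs if not s and r)      # denied use, assay positive
    fp = sum(1 for s, r in pairs if s and not r)      # reported use, assay negative
    tn = sum(1 for s, r in pairs if not s and not r)  # denied use, assay negative
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference result")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts: 8 true positives, 2 false negatives,
# 5 false positives, 35 true negatives.
report = [True] * 8 + [False] * 2 + [True] * 5 + [False] * 35
assay = [True] * 10 + [False] * 40
sens, spec = sensitivity_specificity(report, assay)
print(sens, spec)
```

Because a single urine sample only reflects recent use, a "false positive" in this table can be a truthful report of use that fell outside the assay's detection window.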

Table 7 External validation

Discussion

The CEPT addresses the measurement limitations faced by previously published studies of PCU and maternal and infant health, which are highly susceptible to misclassification bias, have inconsistent findings, and are rated moderate at best by the US National Academies of Sciences, Engineering, and Medicine [10, 41]. It offers researchers a measurement option that has initially shown strong validity and reliability, that accounts for frequency, modes, reasons and estimated dose-per-use, and that separately measures CBD and THC. The CEPT measures the frequency of use in each trimester separately to capture changing patterns of PCU. This enables an estimate of the total number of uses throughout pregnancy, based on the number of months, weeks or days in the given trimester (which can be adjusted for gestational age at delivery), which can then be multiplied by the estimated dose per use to generate an estimated total exposure over the pregnancy interval. It also measures frequency of second-hand exposure in each trimester, in addition to partner’s cannabis use. The CEPT thus enables a more complete picture of PCU patterns and a more nuanced estimate of total exposure over pregnancy than currently published studies have been able to capture. The patient-oriented methods we used are a strength; qualitative interviews can reveal aspects of health behaviour that contrast with the researcher’s underlying assumptions, which can interfere with the five stages of cognition and lead to biased responses [39, 42]. Prenatal alcohol use studies indicate that non-disclosure bias for prenatal substance use varies according to participant perceptions, and that question wording and structure informed by patient-oriented designs can improve validity [20, 51]. Further, the language, tone, and perceived intent of the tool are critical to non-biased response. Our qualitative interviews guided us in reducing perceived judgemental or stigmatizing language in our tool.
The cognitive interviews further reduced sources of bias. While we may never be able to completely eliminate PCU reporting bias, our patient-oriented development process was chosen because it is crucial for minimizing perceived stigma, and for ensuring a much lower probability of bias than the methods of measurement used in previous studies, including self-selection for biological samples, which do not allow the participant to explain their reasons for use, nor to judge the researchers’ motivations.
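The total-exposure estimate described in this section (uses per trimester, adjusted for gestational age at delivery, multiplied by dose per use) can be sketched as follows. The function names, the weekly frequency unit, and the trimester boundaries (weeks 1–13, 14–27, and 28 to delivery) are assumptions chosen for illustration rather than the CEPT's own scoring rules.

```python
def trimester_weeks(gestational_weeks_at_delivery):
    """Weeks spent in each trimester, assuming boundaries at 13 and 27 weeks."""
    ga = gestational_weeks_at_delivery
    if ga < 0:
        raise ValueError("gestational age must be non-negative")
    first = min(ga, 13)
    second = max(0, min(ga, 27) - 13)
    third = max(0, ga - 27)
    return (first, second, third)


def estimated_total_exposure(uses_per_week, dose_per_use, gestational_weeks=40):
    """Estimated total exposure over pregnancy, in the units of dose_per_use.

    uses_per_week: three values, one per trimester (hypothetical coding).
    dose_per_use: estimated amount consumed per use.
    """
    if len(uses_per_week) != 3:
        raise ValueError("expected one frequency per trimester")
    weeks = trimester_weeks(gestational_weeks)
    total_uses = sum(f * w for f, w in zip(uses_per_week, weeks))
    return total_uses * dose_per_use


# Hypothetical participant: twice weekly in the first trimester, once weekly
# in the second, none in the third, 0.25 g per use, delivery at 40 weeks.
print(estimated_total_exposure((2, 1, 0), dose_per_use=0.25))
```

Keeping frequency separate for each trimester, rather than averaging over the whole pregnancy, preserves the timing information needed to examine critical windows of exposure.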

Although there remains no feasible gold-standard measure of prenatal cannabis consumption across the entire gestational period, the CEPT represents a useful tool for researchers to augment the quality and expand the scope of longitudinal research into the health outcomes associated with prenatal cannabis exposure. Our results support that it minimizes self-report bias, and its nuanced measurement of multiple dimensions of cannabis consumption may also reduce misclassification of very low exposures, allow for assessment of potential dose–response relationships, and enable the identification of critical windows of fetal exposure in future studies, that were not possible with previous crude measures.

Limitations

The CEPT is designed to measure behaviours over pregnancy, rather than to detect a condition or health state. Where medical screening tools can be validated against diagnostic tests or interview, validating a measure of behaviour is more complex. A limitation of our study is the lack of a true gold-standard measure of PCU for validation, which was financially infeasible for this study, as it requires multiple bioassays of at least weekly serial urine samples throughout gestation. However, we have preliminarily validated CEPT responses against a biological reference standard, showing excellent sensitivity and high specificity. While we could not attain a true biological gold standard in our study, the validation we conducted against single bioassays and in-depth qualitative interviews remains rigorous. Biological levels of THC metabolite cannot be falsified, and the qualitative methods we employed result in high credibility and truth-value for qualitative results [52]. Further, interviews allowed for comparison of binary cannabis use as well as PCU patterns (modes, frequency, timing) that cannot be validated with a biological test. Although our study sample was adequate to detect a Cronbach’s alpha of ≥ 0.9 on a tool with up to 15 items, we acknowledge that our external bioassay validation sample (n = 84) was small, and differences in maternal age, marital status and household income between the development and validation samples were noted. Future validation studies should include larger samples to confirm findings, and should explore whether the estimated dosage measured by the CEPT correlates with quantitative biological THC and CBD metabolite levels. Further psychometric testing of the CEPT is also recommended in future studies. It is also important to note that our tool and the validation conducted are limited to English-speaking individuals, and any translations will require further validation.

Conclusion

PCU and its associated health outcomes have been identified as priorities for research in Canada and the U.S. following cannabis legalization [9]. We recommend the CEPT as a rigorous, feasible, patient-oriented health research tool for measuring PCU. The use of the CEPT as a standardized measure of PCU in future studies can contribute substantial new knowledge about the implications of timing, dose, frequency, and modes of exposure for maternal, fetal, infant and child health, accounting for varying patterns of consumption and the strength and diversity of cannabis products available on the contemporary legal market. The CEPT has the potential to significantly improve measurement accuracy and thus the quality of research in this area, which can in turn inform evidence-based education, prevention and health policy to mitigate potential health risks.