Key Points

The claims-based algorithm for last menstrual period that used ICD-10-CM Z3A codes accurately estimated the documented date of last menstrual period.

The ICD-10-CM claims-based algorithms for spontaneous abortion, pre-eclampsia, premature delivery, and low birthweight performed well, with positive predictive values exceeding 70%.

Algorithms for major congenital malformations, placenta previa, and small for gestational age did not perform well and require further refinement.

1 Introduction

Administrative healthcare databases are increasingly used to evaluate medication safety during pregnancy [1,2,3,4]. These databases include claims submitted by healthcare providers for payment and records of patient encounters within healthcare systems, including pharmacy dispensing, inpatient and outpatient diagnoses, and procedures. As these databases are created primarily for administrative and billing purposes, rather than research, the validation of exposure and outcome variables defined by codes on claims against a gold standard, such as medical records, is essential to ensure the validity of research studies conducted using these data [5,6,7,8].

Algorithms to define pregnancy and infant outcomes based on claims using International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM) codes have been developed and validated, including algorithms for pre-eclampsia [1, 7, 9], preterm birth [5], small for gestational age (SGA) [8, 9], and major congenital malformations (MCMs) [6, 10, 11]. However, previous validation studies of algorithms based on International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM) codes have been limited to the outcomes of spontaneous abortion [12], preterm birth [12], stillbirth [12, 13], and a subset of MCMs [14]. Furthermore, the use of codes for gestational age (Z3A codes) introduced with ICD-10-CM may improve estimation of pregnancy start date, as prior algorithms have been based on the date of the observed pregnancy outcome [15,16,17].

The goal of this exploratory project was to develop and validate ICD-10-CM claims-based algorithms for key variables needed to conduct post-marketing pregnancy safety studies in claims databases. These variables included the estimated date of last menstrual period (LMP), which is necessary for establishing a pregnancy timeline, and multiple pregnancy outcomes that are used as key primary and secondary endpoints. The primary endpoints were MCMs and spontaneous abortion while the secondary endpoints consisted of placenta previa, pre-eclampsia, premature delivery, low birthweight, and SGA. The claims-based algorithms included a simple algorithm, defined as the presence of at least one claim for the outcome, and additional candidate algorithms based on patterns of services received with the goal of identifying a best-performing algorithm for each outcome.

2 Methods

2.1 Data Source

This study used data from the Optum Research Database (ORD), a claims database from a large US health insurer. Medical and pharmacy claims data are available from as early as 1993 for 70 million individuals with both medical and pharmacy benefit coverage. The study population was identified using Optum’s Dynamic Assessment of Pregnancies and Infants (DAPI), a process that applies a set of definitions and algorithms to claims data to identify pregnancies and outcomes and to link data from mothers and infants within the ORD [4, 18]. Due to the size of the ORD, approximately 200,000 new pregnancies are identified each year within the database. Pregnancies are linked to infant(s) using a linkage algorithm that utilizes the infant’s date of birth, the estimated delivery date, and a family member ID. Of pregnancies that result in live births, approximately 85% of mothers can be linked to an infant [19].

2.2 Study Population

Women aged 18–55 years with an estimated LMP date (i.e., pregnancy start date) and pregnancy end date between 01 January 2016 and 31 December 2017 were identified. This time period was chosen because this validation study was conducted as background for surveillance that began in 2018. The population was limited to women who had continuous medical and pharmacy benefit coverage for a minimum of 6 months prior to their estimated LMP date (i.e., the baseline period) through to the end of pregnancy. Within this study population, the infant study population was identified among pregnancies for which the mother and infant data could be linked.

The ORD contains data from health plans that contract for “administrative services only”; access to medical records was not allowed for patients enrolled in these health plans. As this study required medical record review, women and infants who were enrolled in “administrative services only” plans were excluded from the study population, and the study outcomes were identified among those remaining (Fig. 1a, b).

Fig. 1

Panel a: Cohort creation flow diagram for women/pregnancies. DAPI Dynamic Assessment of Pregnancies and Infants, LMP last menstrual period. Footnote a: Have continuous medical and pharmacy benefit coverage for a minimum of 6 months (182 days) prior to and including the estimated LMP through the end of pregnancy. Footnote b: Earliest LMP occurring on or after 01 January 2016; end of pregnancy ending by 31 December 2017. Footnote c: This step determines women for whom Optum can seek medical charts for the pregnancy and outcome, assessed at the pregnancy episode level. Pregnancies among women enrolled in administrative services only plans were excluded because access to medical records was not allowed for patients in these plans. The final study population consisted only of women with pregnancies for whom Optum could seek medical charts. Panel b: Cohort creation flow diagram for infants. Footnote a: See Figure 1a for details of the study population creation. Footnote b: These are “multi-gestation” pregnancies that have livebirth(s) and stillbirth(s) (e.g. twins, one liveborn and one stillborn; quadruplets, some liveborn). Footnote c: Includes stillbirths, ectopic, molar, and abortions (“spontaneous,” “elective,” and “other” per DAPI definitions). Footnote d: Linked pregnancies are pregnancies for which the mother and infant data could be linked. Forty-one linked infants were from pregnancies ending in a non-livebirth, possibly representing misclassification of how these pregnancies ended. Footnote e: This step determines infants for whom Optum can seek medical charts. Infants enrolled in administrative services only plans were excluded because access to medical records was not allowed for patients in these plans. The final study population consisted only of infants for whom Optum could seek medical charts.

Each pregnancy was followed from the day after the estimated LMP date through to the first of the following: 60 days after the date of end of pregnancy, disenrollment from the health plan, or end of the study period. Infants who were linked to their mothers were followed from the estimated date of delivery through to the first of the following: disenrollment from the health plan, or end of the study period.
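The follow-up rule for pregnancies described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's implementation: the function name `pregnancy_follow_up` and its parameters are hypothetical, and the end of the study period is passed in rather than fixed.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def pregnancy_follow_up(lmp: date, pregnancy_end: date,
                        disenrollment: Optional[date],
                        study_end: date) -> Tuple[date, date]:
    """Return (start, end) of follow-up for one pregnancy episode."""
    # Follow-up starts the day after the estimated LMP date.
    start = lmp + timedelta(days=1)
    # Follow-up ends at the first of: 60 days after the end of pregnancy,
    # disenrollment (if any), or the end of the study period.
    candidates = [pregnancy_end + timedelta(days=60), study_end]
    if disenrollment is not None:
        candidates.append(disenrollment)
    return start, min(candidates)
```

Taking the minimum over the candidate dates keeps the censoring logic in one place, so adding another censoring event only requires appending to the list.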

2.3 Protection of Human Subjects

The study protocol was approved by the New England Institutional Review Board and all data access conformed to applicable Health Insurance Portability and Accountability Act policies.

2.4 Estimation of LMP Date

The algorithm to estimate LMP date utilized all available codes indicating weeks of gestation (Z3A.00 to Z3A.42, excluding Z3A.49). For each woman, the number of Z3A codes varied based on natural variability in number and timing of clinical visits. First, for each woman and each observed Z3A code, the LMP date was estimated by subtracting the weeks of gestation based on the Z3A code from the date of service in the claim (e.g., if a Z3A.10 code [10 weeks gestation of pregnancy] was observed on 10 July 2019, 10 weeks was subtracted from the date, resulting in an LMP date of 01 May 2019). These LMP date estimations were repeated for each available Z3A code recorded on a claim during pregnancy, resulting in multiple estimated LMP dates for each woman, which were sorted chronologically. To identify pregnancy episodes, LMP clusters were created by grouping all LMP dates within 6 weeks of each other (from the earliest estimated LMP forward using up to a 6-week window, which was chosen based on previous publications [15, 17]). The LMP date for each pregnancy episode was estimated using two methods: (1) LMP date from the first observed Z3A code (i.e., earliest service date with a Z3A code) within the pregnancy episode, and (2) median LMP date based on all Z3A codes within the pregnancy episode.
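The three steps above (per-claim LMP estimates, clustering into pregnancy episodes, and the two summary estimates) can be sketched as follows. This is a minimal sketch, not the DAPI code: the function name, the input layout of `(service_date, weeks_of_gestation)` pairs, and the handling of a median that falls between two dates are all assumptions made for illustration. The window is interpreted as 6 weeks forward from the earliest estimated LMP in each cluster.

```python
from datetime import date, timedelta
from statistics import median


def estimate_lmp_episodes(z3a_claims, window_weeks=6):
    """z3a_claims: iterable of (service_date, weeks_of_gestation) pairs.

    Returns one dict per pregnancy episode holding the two LMP estimates.
    """
    # Step 1: one LMP estimate per Z3A claim, keeping the service date.
    estimates = sorted(
        (service - timedelta(weeks=weeks), service)
        for service, weeks in z3a_claims
    )
    # Step 2: group estimates that fall within the window, counted forward
    # from the earliest estimated LMP of the current cluster.
    window = timedelta(weeks=window_weeks)
    clusters = []
    for lmp, service in estimates:
        if clusters and lmp - clusters[-1][0][0] <= window:
            clusters[-1].append((lmp, service))
        else:
            clusters.append([(lmp, service)])
    # Step 3: summarize each cluster with the two methods.
    episodes = []
    for cluster in clusters:
        # Method 1: LMP from the claim with the earliest service date.
        first_claim = min(cluster, key=lambda pair: pair[1])
        # Method 2: median of all LMP estimates in the cluster.
        # Assumption: a median between two dates is truncated to the earlier day.
        ordinals = [lmp.toordinal() for lmp, _ in cluster]
        median_lmp = date.fromordinal(int(median(ordinals)))
        episodes.append({"lmp_first_z3a": first_claim[0], "lmp_median": median_lmp})
    return episodes
```

Sorting once and anchoring each cluster at its earliest estimate keeps the grouping deterministic, which matters when several Z3A codes imply slightly different LMP dates for the same pregnancy.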

For pregnancies where Z3A codes were not observed, algorithms informed by published literature and refined following obstetrician-gynecologist input were utilized to estimate the corresponding LMP [15,16,17, 20]. For these pregnancies, the estimated LMP was calculated based on algorithms that assume different lengths of gestation for full term singleton births (39 weeks), multiple births (36 weeks), stillbirths (28 weeks), abortions (10 weeks), trophoblastic diseases (8 weeks), and ectopic pregnancies (8 weeks).
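A minimal sketch of this fallback is shown below. The assumed gestation lengths come from the text; the dictionary keys, the function name `fallback_lmp`, and the choice to subtract from the pregnancy outcome date are illustrative assumptions rather than the published algorithms.

```python
from datetime import date, timedelta

# Assumed gestation lengths in weeks, as listed in the text.
# The outcome labels used as keys are hypothetical.
ASSUMED_GESTATION_WEEKS = {
    "full_term_singleton": 39,
    "multiple_birth": 36,
    "stillbirth": 28,
    "abortion": 10,
    "trophoblastic_disease": 8,
    "ectopic": 8,
}


def fallback_lmp(outcome_type: str, outcome_date: date) -> date:
    """Estimate LMP when no Z3A code is available for the pregnancy."""
    weeks = ASSUMED_GESTATION_WEEKS[outcome_type]
    # Assumption: the assumed gestation is counted back from the outcome date.
    return outcome_date - timedelta(weeks=weeks)
```

For example, under these assumptions an abortion claim dated 11 March 2016 would yield an estimated LMP of 01 January 2016.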

2.5 Identification of Outcomes

Pregnancies and infants were classified according to the presence or absence of an outcome of interest. Outcome groups were not mutually exclusive. To maximize sensitivity, each study outcome was identified using a simple algorithm, defined as the presence of at least one claim for the outcome during follow-up, based on ICD-10-CM diagnostic codes and Current Procedural Terminology (CPT) codes (Table 1, Supplemental Table 1). Spontaneous abortion, placenta previa, pre-eclampsia, and premature delivery were identified at any point during the pregnancy from pregnancy (maternal) claims; MCMs, low birthweight, and SGA were identified following delivery from infant claims.

Table 1 International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM) Diagnosis and CPT® Codes to Estimate LMP Date and Identify Pregnancy and Infant Outcomes

2.6 Medical Record Procurement and Adjudication

From the outcomes identified using the simple algorithm, a subset of 750 charts based on an a priori number for each outcome (300 for MCMs, 200 for spontaneous abortion, and 50 for each of the remaining 5 secondary outcomes) was selected randomly for medical record procurement and adjudication. Among the 300 potential cases of MCMs randomly selected for the validation sample, 92 cases were subsequently removed because they only had diagnosis codes for minor congenital malformations (Supplemental Fig. 1). As MCMs are typically the outcome of interest in pregnancy safety studies, minor malformations were excluded from the algorithm. Following this exclusion, 658 randomly selected outcomes were included in the validation sample.

Medical records for women were reviewed for LMP date, spontaneous abortion, placenta previa, pre-eclampsia, and premature delivery; medical records for infants were reviewed for MCMs, low birthweight, and SGA.

The date of LMP was adjudicated by an epidemiologist. The date of LMP and, when present, gestational age (along with service date) were recorded from each medical record. The final date of LMP was adjudicated based on all information in the record, informed by recommendations from the American College of Obstetricians and Gynecologists [21].

Two geneticists with expertise in teratology (for MCMs) and 2 obstetrician-gynecologists (for all other pregnancy and infant outcomes) reviewed medical records for each potential case. The presence or absence of the diagnosis in the record was independently adjudicated by the 2 clinicians. Consensus was sought in the case of discrepant results. The clinical adjudicators were instructed to use their own clinical judgment and the guidance provided in the Appendix (Supplemental Material). For each outcome, a patient was classified as follows: (1) a definite case if the medical records included dated documentation that met the criteria for an outcome; (2) a probable case if all the criteria were not met, but sufficient information was present; (3) a non-case if the information in the record did not indicate presence of the outcome; and (4) insufficient information if the record did not contain the specific reports and notes needed to make an adjudication decision.

2.7 Statistical Analysis

Baseline characteristics of the source population overall and for the validation sample were examined. The frequency and percentage were calculated for categorical variables and the mean and standard deviation were calculated for continuous variables.

To assess the validity of estimated LMP date, the number of days between the claims-based estimates and the adjudicated LMP date was calculated by subtracting the adjudicated LMP from the estimated LMP. The claims-based estimates included:

  1. LMP date from the first observed Z3A code within the pregnancy episode;

  2. Median LMP date based on all Z3A codes within the pregnancy episode;

  3. LMP date estimated using literature-based algorithms (for the subset of pregnancies where Z3A codes were not observed).
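The agreement measures reported for LMP date (median absolute difference, interquartile range, share of pregnancies within ± 7 days, and share with a later estimate) can be computed along the following lines. This is a sketch under assumptions: the function name `lmp_agreement` is hypothetical, and the quartile method (`inclusive`) is a choice made here that may differ from the SAS default used in the study.

```python
from datetime import date
from statistics import median, quantiles


def lmp_agreement(pairs, tolerance_days=7):
    """pairs: list of (estimated_lmp, adjudicated_lmp) dates (at least two)."""
    # Difference = estimated minus adjudicated; positive means the estimate is later.
    diffs = [(est - adj).days for est, adj in pairs]
    abs_diffs = [abs(d) for d in diffs]
    # Assumption: quartiles by the "inclusive" method of statistics.quantiles.
    q1, _, q3 = quantiles(abs_diffs, n=4, method="inclusive")
    n = len(diffs)
    return {
        "median_abs_diff": median(abs_diffs),
        "iqr": (q1, q3),
        "pct_within_tolerance": 100 * sum(a <= tolerance_days for a in abs_diffs) / n,
        "pct_estimate_later": 100 * sum(d > 0 for d in diffs) / n,
    }
```

Keeping the signed differences alongside the absolute ones makes it possible to report both the size of the error and its direction.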

For each outcome, candidate algorithms that had been specified a priori were evaluated, and performance metrics were calculated to identify the best-performing claims-based algorithm. Additional algorithms were developed by reviewing claims profiles from the subset of chart-confirmed outcomes to identify patterns of claims associated with outcome confirmation, including type of service, clinician specialty, and temporality of claims [11]. The candidate algorithms that were evaluated included the following:

  • Algorithm 1: at least 1 claim (simple algorithm);

  • Algorithm 2: at least 2 claims on separate days;

  • Algorithm 3: 2 claims separated by a specific number of days (e.g., 7, 14, 28 days);

  • Algorithm 4: at least 1 claim from a specific provider specialty(ies) (e.g. hospital, obstetrics and gynecology);

  • Algorithm 5: at least 1 claim from a specific site(s) of care (e.g. inpatient, outpatient visit, professional visit);

  • Algorithm 6: additional algorithm for placenta previa based on timing of claim relative to delivery date.
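Algorithms 1 to 5 in the list above can be expressed as simple predicates over a patient's outcome claims. The sketch below is illustrative only: the `Claim` record and its field names are hypothetical, and "separated by a specific number of days" is interpreted as at least that many days between the earliest and latest claim dates.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Set


@dataclass(frozen=True)
class Claim:
    """Hypothetical claim record carrying an outcome diagnosis code."""
    service_date: date
    provider_specialty: str
    site_of_care: str


def at_least_one_claim(claims: List[Claim]) -> bool:
    # Algorithm 1: simple algorithm.
    return len(claims) >= 1


def two_claims_separate_days(claims: List[Claim]) -> bool:
    # Algorithm 2: claims on at least two distinct service dates.
    return len({c.service_date for c in claims}) >= 2


def two_claims_separated_by(claims: List[Claim], min_gap_days: int) -> bool:
    # Algorithm 3: some pair of claims is at least min_gap_days apart,
    # which holds exactly when the earliest and latest dates are.
    days = sorted({c.service_date for c in claims})
    return len(days) >= 2 and (days[-1] - days[0]).days >= min_gap_days


def claim_from_specialty(claims: List[Claim], specialties: Set[str]) -> bool:
    # Algorithm 4: at least one claim from a listed provider specialty.
    return any(c.provider_specialty in specialties for c in claims)


def claim_from_site(claims: List[Claim], sites: Set[str]) -> bool:
    # Algorithm 5: at least one claim from a listed site of care.
    return any(c.site_of_care in sites for c in claims)
```

Writing each algorithm as a separate predicate makes it straightforward to apply all candidates to the same validation sample and compare their PPVs.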

The positive predictive value (PPV) and corresponding 95% confidence interval (CI) were calculated for the simple algorithm and each of the candidate algorithms for each outcome. The PPV was calculated as the sum of definite and probable cases divided by the number of potential cases reviewed, after excluding those whose charts lacked sufficient information to determine case status. The best-performing algorithm for each outcome was selected based on the PPV and the number of definite or probable cases identified by the algorithm, as a proxy for sensitivity. Sensitivity could not be determined since charts were only sought for claims-identified cases; potential cases without a claims-based diagnosis were not sought.
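The PPV calculation described above can be sketched as follows. The text does not state how the confidence intervals were computed; the sketch assumes a normal-approximation (Wald) interval truncated to the range 0 to 100%, which appears consistent with the intervals reported in the Results but remains an assumption. The function name `ppv_with_ci` is hypothetical.

```python
from math import sqrt


def ppv_with_ci(definite, probable, non_cases, z=1.96):
    """Return (PPV, lower, upper) in percent, rounded to one decimal.

    Charts with insufficient information are excluded before calling.
    """
    n = definite + probable + non_cases
    p = (definite + probable) / n
    # Assumption: Wald interval, truncated so it stays within [0, 1].
    half_width = z * sqrt(p * (1 - p) / n)
    lower = max(0.0, p - half_width)
    upper = min(1.0, p + half_width)
    return tuple(round(100 * v, 1) for v in (p, lower, upper))
```

Under these assumptions, 54 definite cases, 1 probable case, and 70 non-cases give a PPV of 44.0% with an interval of 35.3 to 52.7.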

All analyses were conducted using SAS software, version 9.4 (SAS Institute Inc, Cary, NC).

3 Results

3.1 Study Population

Details of the cohort formation can be found in Fig. 1a, b. There were 53,956 pregnancy episodes among 50,624 women and 31,445 linked infants in the final study population. Descriptive characteristics of the study population are provided in Supplemental Table 2.

From the 53,956 pregnancy episodes, we identified 10,182 (18.9%) spontaneous abortion, 908 (1.7%) placenta previa, 2028 (3.8%) pre-eclampsia, and 1742 (3.2%) premature delivery outcomes using the simple algorithm (Table 2). Among 31,445 infants, 2600 (8.3%) had at least one MCM identified, 1711 (5.4%) were low birthweight, and 1273 (4.1%) were SGA based on the simple algorithm (Table 2).

Table 2 Positive predictive values of claims-based simple algorithms (Algorithm 1) for pregnancy and infant outcomes based on adjudicated medical records

Among the 658 randomly selected outcomes included in the validation sample, 72 cases could not be sent for chart procurement due to a priori provider refusal (i.e., the patients’ providers were on the ‘do not contact’ list). Consequently, medical records were sought for 586 cases, of which 398 (67.9%) were procured and 365 (62.3%) were adjudicated (33 charts with insufficient information were excluded) (Table 2). Descriptive characteristics were similar for women whose charts could be adjudicated and women whose charts could not be adjudicated (Supplemental Table 3).

3.2 Validation of Claims-based Algorithms

3.2.1 Last Menstrual Period

Among the 215 records procured for the validation of pregnancy outcomes, 10 were excluded from the analysis for LMP because, within the claims data, the pregnancy episode overlapped with another episode within the same woman. As such, 205 records were used for LMP validation. Of these, 157 pregnancy episodes had at least one Z3A code. Table 3 compares the estimated median LMP date based on all Z3A codes within the pregnancy episode to the adjudicated LMP date. The median absolute difference in days was 4.0 (IQR: 2.0–10.0) overall, and the median LMP date was within ± 7 days of the adjudicated LMP date among 65.0% of pregnancies. According to pregnancy outcome, the median LMP date was within ± 7 days of the adjudicated LMP date among 34.3% of pregnancies with spontaneous abortion, 89.7% with premature delivery, 95.7% with placenta previa, and 90.6% with pre-eclampsia. The estimated median LMP date was later than the adjudicated LMP date for 126 pregnancies (80.3%).

Table 3 Number of days between estimated median date of LMP based on all Z3A codes in DAPI and adjudicated LMP, overall and according to pregnancy outcome

Results for estimated LMP based on the first observed Z3A code were similar (Supplemental Table 4a). The 48 pregnancies for which a Z3A code was not observed were primarily spontaneous abortions (95.8%), and the median difference in days between estimated and adjudicated LMP was 16.0 (IQR: 8.0–25.0) (Supplemental Table 4b).

3.2.2 Pregnancy Outcomes

For the pregnancy outcomes, 318 medical records were sought and 215 (67.6%) records were procured: 125 (69.8%) spontaneous abortion, 26 (55.3%) placenta previa, 34 (70.8%) pre-eclampsia, and 30 (68.1%) premature delivery (Table 2).

Among the 125 medical records reviewed for spontaneous abortion, 100 (80.0%) were adjudicated as definite or probable cases (Table 4). The PPV for the simple algorithm (Algorithm 1) was 84.7% (95% CI 78.3, 91.2). The additional candidate algorithms also performed well; the highest PPV was observed for Algorithm 3b which required 2 claims separated by at least 14 days (92.6%, 95% CI 82.7, 100).

Table 4 Positive predictive values for candidate claims-based algorithms: spontaneous abortion

The PPVs for the simple algorithm (Algorithm 1) for each of the secondary pregnancy outcomes were: 13.0% (95% CI 0.0, 26.8) for placenta previa, 78.3% (95% CI 61.4, 95.1) for pre-eclampsia, and 92.3% (95% CI 82.1, 100.0) for premature delivery (Table 2). Positive predictive value estimates for all candidate claims-based algorithms developed for the secondary pregnancy outcomes are provided in Supplemental Table 5a–c. For placenta previa, the additional candidate algorithms also had low PPVs (best-performing PPV: 33.3%) (Supplemental Table 5a). Among 26 records for placenta previa, 15 had claims for complete placenta previa (6 with hemorrhage), one had a claim for partial placenta previa, and 10 had claims for low-lying placenta; all 3 confirmed cases had a claim for complete placenta previa with hemorrhage (Supplemental Table 6). For pre-eclampsia and premature delivery, the additional candidate algorithms performed as well as or better than the simple algorithm, with some PPVs reaching 100% (Supplemental Table 5b, c).

3.2.3 Infant Outcomes

Among the infant outcomes, 268 medical records were sought and 183 (68.3%) obtained: 130 (74.7%) MCMs, 27 (56.3%) low birthweight, and 26 (56.5%) SGA (Table 2).

Among the 130 medical records from infants identified as having an MCM by the simple algorithm (Algorithm 1), 54 (41.5%) were classified as definite cases, 1 (0.8%) as a probable case, 70 (53.8%) as non-cases, and 5 (3.8%) as having insufficient information to determine case status (Table 5). Among the 70 non-cases, the adjudicators classified 51 (72.9%) as cases of a minor malformation. The PPV of the simple algorithm (Algorithm 1) was 44.0% (95% CI 35.3, 52.7), while Algorithm 3, which required 2 claims separated by at least 30 days, had a PPV of 67.8% (95% CI 55.9, 79.7). The PPVs for MCMs by organ system identified by the simple algorithm (Algorithm 1) are presented in Supplemental Table 7.

Table 5 Positive predictive values for candidate claims-based algorithms: major congenital malformations

The PPV for the simple algorithm (Algorithm 1) for low birthweight was 96.3% (95% CI 89.2, 100.0) (Table 2) and the performance of additional candidate algorithms was similarly high (Supplemental Table 8a). For SGA, the PPV for the simple algorithm (Algorithm 1) was 34.8% (95% CI 15.3, 54.2) (Table 2); furthermore, the additional candidate algorithms all performed poorly (Supplemental Table 8b).

3.3 Best-performing Algorithms

Table 6 shows the proposed best-performing algorithms for each of the pregnancy and infant outcomes based on PPV ≥ 70.0%, and the number of potential cases and definite cases identified by the algorithm. The simple algorithm (Algorithm 1) performed best for spontaneous abortion, premature delivery, and low birthweight. For pre-eclampsia, the best-performing algorithm was the one that required at least one claim from an inpatient stay (Algorithm 5), which had a PPV of 85.7% (95% CI 70.7, 100.0).

Table 6 Proposed claims-based best-performing algorithms for pregnancy and infant outcomes based on adjudicated medical record

4 Discussion

In this exploratory study, several claims-based algorithms for pregnancy and infant outcomes were developed and validated through medical record review and adjudication. We also developed claims-based algorithms that accurately estimated the date of LMP among pregnancies resulting in a livebirth. The primary outcomes of interest were MCMs and spontaneous abortion, but algorithms for other pregnancy outcomes, infant outcomes, and LMP date were also evaluated. A simple algorithm based on a single claim was used to identify each outcome and a best-performing algorithm was determined based on the performance characteristics of all candidate algorithms. The algorithms performed well for spontaneous abortion, pre-eclampsia, premature delivery, and low birthweight and poorly for MCMs, placenta previa, and SGA.

Last menstrual period date was estimated accurately with Z3A codes, although estimated LMP tended to be a few days later (median: 4.0, IQR: 2.0–10.0) than adjudicated LMP. This is likely because Z3A codes denote completed weeks of gestation; for example, a woman who had a doctor visit at a gestational age of 10 weeks, 3 days would receive a Z3A.10 (10 weeks gestation of pregnancy) code on her claim. Additionally, estimated LMP date was less accurate for pregnancies with a claim for spontaneous abortion. The first specific Z3A code is Z3A.08 (8 weeks gestation of pregnancy), consistent with timing of first prenatal visit [29]. Prior to 8 weeks gestation, Z3A codes are non-specific; for these codes, we assigned 4 weeks gestation. As most miscarriages occur prior to the 12th week of pregnancy [30], a woman with spontaneous abortion may have had only non-specific Z3A codes in her claims or no Z3A codes at all if she had not yet sought clinical care, resulting in less accurate estimation of LMP date. Among pregnancies without a Z3A code in this study, 96% had spontaneous abortion as the outcome, for which we assigned 10 weeks gestation. Nonetheless, the estimated LMP for these pregnancies differed from the adjudicated LMP by approximately 2 weeks (median: 16.0 days, IQR: 8.0–25.0).
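The completed-weeks explanation can be illustrated with a short sketch. It assumes, for illustration only, that the Z3A code on a claim records exactly the number of completed weeks since the true LMP and ignores the non-specific codes used before 8 weeks; the function name `z3a_lmp_offset_days` is hypothetical.

```python
from datetime import date, timedelta


def z3a_lmp_offset_days(true_lmp: date, visit: date) -> int:
    """Days by which a Z3A-based LMP estimate is later than the true LMP."""
    # A Z3A code records completed weeks only, so remaining days are dropped.
    completed_weeks = (visit - true_lmp).days // 7
    estimated_lmp = visit - timedelta(weeks=completed_weeks)
    return (estimated_lmp - true_lmp).days
```

Under this assumption a visit at 10 weeks, 3 days yields an estimate 3 days later than the true LMP, and the offset from a single claim is always between 0 and 6 days.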

The simple algorithm for MCMs had a PPV of 44.0%. Among the 130 cases identified as having an MCM by the simple algorithm, 51 (39.2%) were adjudicated as minor malformations only. A previous study conducted by Carman et al within the ORD using ICD-9-CM codes observed a similar PPV of 47.8% [11] for the simple algorithm, but the PPV for the final algorithm was 80.4%, which is higher than the PPV of 67.8% observed for the best-performing algorithm in this study. In the Carman et al study, the candidate algorithms were developed using a separate, iterative process for each body system category. The body system-specific algorithms were then applied to the infant study population, resulting in an improved overall PPV. Similarly, in a paper that was published after completion of the current study, Kharbanda et al proposed separate algorithms for each organ system when they converted previously validated ICD-9-CM algorithms for MCMs to ICD-10-CM [14]. The algorithm PPVs were 80% or higher for most defects, although they only validated algorithms for seven targeted MCMs. In the current study, we developed and validated algorithms to identify any MCM. We subsequently examined the PPVs for the simple algorithm by body system, but there were small numbers of records adjudicated for several categories. The simple algorithm did not perform well for many of the body systems and additional candidate algorithms were not explored given the small number of records adjudicated within many body system categories. Nonetheless, given the promising results from Carman et al and Kharbanda et al, future work on algorithms for identifying MCMs should be directed towards developing and validating algorithms for additional MCMs, but doing so according to specific MCM categories. Additionally, it is necessary to refine the list of minor malformations for exclusion using available ICD-10-CM references [22,23,24] and clinical input.

For spontaneous abortion, the simple algorithm and candidate algorithms all performed well, with all PPVs approaching 85% or higher. Nonetheless, the PPVs observed in this study were slightly lower than the percent agreement between the claims-based algorithm for spontaneous abortion and physician adjudication of electronic medical records (EMRs) reported by Moll et al (100.0%, 95% CI 93.9, 100.0) [12]. One explanation for the better performance of the algorithm in Moll et al is the restriction of the validation sample to pregnancy episodes with a start date, which was estimated by presence of at least one pregnancy-related code, not including the code for the outcome. This restriction resulted in 22% attrition for the spontaneous abortion outcome. In contrast, we estimated LMP date using outcome-based algorithms [15,16,17, 20] for pregnancies where Z3A codes were not observed, so no pregnancies were excluded due to lack of pregnancy start date. Although this approach may have resulted in identifying some spontaneous abortions that were not true cases, it is less likely that spontaneous abortions that occurred very early in pregnancy, prior to the first prenatal visit, were excluded. Additionally, in the Moll et al study, the claims-based algorithms were validated using the structured components from linked EMRs. In validation studies, however, the gold standard for diagnosis is based on a review of the complete medical record (i.e., structured and unstructured fields) and it is uncertain whether structured fields alone provide the same gold standard.

Although the other pregnancy and infant outcomes were investigated as secondary endpoints in this study, the results obtained will inform the next steps in the development of claims-based algorithms for these outcomes. The low PPV of the placenta previa algorithm may be due in part to revision of the clinical definition for the outcome after adjudication had begun. The simple claims-based algorithm used to identify cases included all ICD-10-CM codes under O44, including codes for low-lying placenta and codes for any trimester of diagnosis. However, placenta previa early in pregnancy often resolves as the uterus enlarges [25]. During adjudication, the definition was restricted to clinically relevant cases, including placenta previa that persisted into the second or third trimester, or cases for which the medical record indicated a cesarean section delivery due to bleeding. To account for this revised definition, an algorithm that required at least one claim within 2 weeks of the delivery date was developed. This algorithm also had a low PPV, potentially due to few charts meeting this definition. Future studies should start with a simple algorithm based on claims for complete placenta previa or previa with hemorrhage from an inpatient stay close to delivery or a claim for a cesarean section.

Algorithms for low birthweight performed well (all PPVs close to 100%) while algorithms for SGA performed poorly. This seeming contradiction likely stems from more standardized definitions of low birthweight. In some infants diagnosed as SGA by treating physicians, the birthweight and gestational age in the chart indicated that the infant was not below the 10th percentile of the growth curves [26]. This may occur when intrauterine growth restriction (IUGR) was indicated during pregnancy; however, IUGR and SGA may not be equivalent [27]. A recent study that validated an ICD-9 claims-based algorithm for SGA reported a higher PPV, but their validation criteria included birthweight below the 20th percentile if accompanied by a diagnosis of SGA or IUGR [9]. Another potential explanation for the poor performance of the SGA algorithm is the inclusion of all ICD-10-CM P05 codes. Although this was done to improve sensitivity, it is possible that only a subset of codes may be relevant for the identification of true cases of SGA.

The PPV for the algorithm for premature delivery was 92%, which is higher than the percent agreement between the claims-based algorithm for preterm live birth and physician adjudication of EMRs reported by Moll et al (62.4%, 95% CI 52.0, 71.7) [12]. Nonetheless, the prevalence of premature delivery based on the algorithm in the current study (3.2%) was lower than the national estimate of 10% [28]. A likely explanation for the low prevalence is that we identified premature delivery from claims in the maternal record only; we did not examine claims for preterm birth in the infant record. Future studies of this outcome should consider using a combination of maternal and infant claims if possible.

This study had several strengths. The use of Optum’s large DAPI population with access to source medical records enabled us to investigate the performance of claims-based algorithms for several outcomes relevant to pregnancy safety, including those that are relatively rare. The number of pregnancies in the study period accrued quickly due to the large population size, providing results rapidly to inform public health and regulatory decision making.

Nonetheless, this study had several limitations. To avoid missing any potential cases, the simple algorithm used to identify outcomes was based on a single diagnosis code, but this does not always reflect presence of disease. The diagnosis may be incorrectly coded, as observed for SGA, or the diagnosis code may reflect rule-out criteria or a minor rather than a major form of the condition, as observed for MCMs. Although more rigorous algorithms were developed, improvement in PPV may have been limited because all records were initially selected based on the simple algorithm. Additionally, in this study, sensitivity could not be calculated because charts were only sampled for claims-identified cases.

While the medical records served as the gold standard for validation, they may be incomplete. For example, 30% of medical records for pre-eclampsia had insufficient information to determine case status, mainly due to missing blood pressure and lab information. Further, a small number of charts was adjudicated for the secondary outcomes. Although this was largely by design as this was an exploratory study, only 62.3% of medical records sought for this study were procured and adjudicated, which is lower than historical procurement rates in the ORD (70–85%) [11]. Studies that seek medical records for pregnancies and infants have inherent challenges compared to other studies. For example, some relevant personally identifiable information needed for requesting charts from providers, such as infant’s first name or social security number, may be missing, which may impact procurement for charts of outcomes identified soon after birth. Oversampling potential cases should be considered when conducting similar studies to help overcome this issue.

For validation of LMP date, the analyses were restricted to pregnancies with a claim for an adverse pregnancy outcome, which may limit generalizability. Nonetheless, the pregnancies with premature delivery, placenta previa, and pre-eclampsia in this study often had multiple Z3A codes observed during the pregnancy which likely improved estimation of LMP. The accuracy of estimated LMP date among uncomplicated pregnancies is likely to be similar due to the high probability that multiple Z3A codes would be observed within a full-term pregnancy.

5 Conclusion

In conclusion, the ICD-10-CM claims-based algorithm for spontaneous abortion performed well and can be used in administrative databases. The algorithms for LMP date and the secondary outcomes pre-eclampsia, premature delivery, and low birthweight also performed well, but it would be beneficial to validate these algorithms in other study populations using a larger number of procured charts to ensure their generalizability. Furthermore, the value of applying the algorithm for premature delivery within both maternal and infant claims should be assessed. ICD-10-CM algorithms for MCMs, placenta previa, and SGA did not perform well; these algorithms are not recommended for use in research studies without further refinement. Future algorithm refinement for MCMs should build upon validation studies that have developed well-performing body system-specific algorithms, while also honing the list of minor malformations for exclusion. Additionally, the possible benefits of utilizing other data sources (e.g., electronic health record, national registries) to study MCMs as an outcome should be considered. For placenta previa, the outcome definition that is clinically relevant for medication safety studies should be determined based on trimester of diagnosis, clinical characteristics (e.g., hemorrhage), and presence of a cesarean section at delivery. Subsequently, the diagnosis and procedure codes included in the claims-based placenta previa algorithm can be adjusted accordingly. Future work on algorithms for SGA may consider alternative definitions that utilize codes for low birthweight when accompanied by gestational age. By building upon the findings from this exploratory study and similar studies, it is likely that improved ICD-10-CM algorithms for MCMs, placenta previa, and SGA can be developed.