Does PGT-A improve assisted reproduction treatment success rates: what can the UK Register data tell us?

Purpose: To show how naïve analyses of aggregated UK ART Register data held by the Human Fertilisation and Embryology Authority to estimate the effects of PGT-A can be severely misleading and to indicate how it may be possible to do a more credible analysis. Given the limitations of the Register, we consider the extent to which such an analysis has the potential to answer questions about the real-world effectiveness of PGT-A.
Methods: We utilise the publicly available Register datasets and construct logistic regression models for live birth events (LBE) which adjust for confounding. We compare all PGT-A cycles to control groups of cycles that could have had PGT-A, excluding cycles that did not progress to having embryos for biopsy.
Results: The primary model gives an odds ratio for LBE of 0.82 (95% CI 0.68–1.00), suggesting PGT-A may be detrimental rather than beneficial. However, due to limitations in the availability of important variables in the public dataset, this cannot be considered a definitive estimate. We outline the steps required to enable a credible analysis of the Register data.
Conclusion: If we compare like-with-like groups, we obtain estimates of the effect of PGT-A that suggest an overall modest reduction in treatment success rates. These are in direct contrast to an invalid comparison of crude success rates. A detailed analysis of a fuller dataset is warranted, but it remains to be demonstrated whether the UK Register data can provide useful estimates of the impact of PGT-A when used as a treatment add-on.
Supplementary Information: The online version contains supplementary material available at 10.1007/s10815-022-02612-y.


Background
Preimplantation genetic testing for aneuploidy (PGT-A) covers a variety of procedures for detecting non-diploid embryos post culture but prior to implantation during an IVF treatment cycle. The rationale is that some aneuploid embryos are non-viable, and therefore selection of non-aneuploid embryos will improve success rates, a notion which has been challenged on several grounds [1] including the presence of mosaicism in the embryos [2] and the potential for an embryo to correct such errors [3,4]. A number of techniques, mainly requiring invasive biopsy of the embryo, have been employed [5], with whole genome sequencing of trophectoderm biopsies from blastocyst-stage embryos being the current standard practice. The risks and benefits of PGT-A have been highly controversial, with strong commercial interests influencing the debate [6][7][8].
There have been a number of randomised controlled trials addressing different scenarios for the use of PGT-A with different PGT-A techniques and different endpoints, many of which have been small and of poor quality [9]. Meta-analyses suggest PGT-A is not of proven effectiveness in terms of live birth rates, and older versions of the technique appear to have been detrimental overall [9]. Three recent trials using next-generation sequencing from blastocyst biopsies with cryopreservation all show small effect sizes favouring the control arms: OR 0.93 (0.69–1.3) [10]; OR 0.91 (0.51–1.63) [11]; OR 0.75 (0.57–1.0) [12], the last being a cumulative live birth outcome. In the UK, the Human Fertilisation and Embryology Authority (HFEA) has reviewed the evidence as part of their "traffic light" evaluation of IVF add-ons and concluded that PGT-A should have a "red" rating: "No evidence to show that it is effective and safe" [13].
As the UK regulator, the HFEA maintains a register of all ART treatments and their outcomes. This includes the use of PGT-A in each treatment cycle. The HFEA retains a full dataset, extracts of which are made available to bona fide researchers on application and subject to strict confidentiality restrictions. They also maintain publicly accessible datasets with a limited, highly anonymised, subset of the data which can be downloaded from their website [14]. Additionally, as a public body, it is subject to the UK Freedom of Information (FoI) legislation and so responds to legitimate and reasonable requests for summary information. Superficially, it may be appealing to utilise these 'real world' practice data to investigate the utility of PGT-A, although it is not clear whether the available data would enable a valid analysis.
Such a FoI request was recently used to obtain crude success rates for PGT-A and non-PGT-A treatment cycles, and these data have been cited in publications, conferences and webinars [15], including a paper in this journal [16], to suggest that, despite the evidence from RCTs, PGT-A is an effective treatment add-on for IVF in clinical practice. The data used consisted solely of aggregated numbers of cycles, embryos transferred and live birth events (LBE) for 6 age bands over a 3-year period (2016-2018). They are reproduced in the supplementary material (Table S1). Such an analysis is fundamentally flawed and potentially seriously misleading because:
1. The use of crude aggregate rates provides no contextual information and no ability to compare like with like. The FoI supplied data are not well defined in terms of inclusions and exclusions.
2. There was no adjustment for confounding except for banding by age group: this is particularly problematic when the treatment is only offered by selected clinics and is dependent on patient choice and ability to pay. Clinics offering PGT-A have potentially different patient demographics, treatment protocols and pathways as well as differing treatment eligibility criteria.
3. A comparison group of all non-PGT-A cycles includes many cycles that would likely not have had PGT-A, as there were insufficient embryos to biopsy. As the Register only records PGT-A treatments that were delivered, the comparator group must consist of treatments that could have had PGT-A if the option were available.
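The third point can be illustrated with a minimal sketch. The counts below are invented purely for illustration and are not taken from the Register or the FoI data; they only show how pooling cycles that could never have had PGT-A into the comparator can reverse the apparent direction of an effect.

```python
# Hypothetical counts chosen only to illustrate the selection problem;
# they are NOT taken from the HFEA Register or the FoI data.
pgta = {"cycles": 100, "births": 30}
non_pgta_eligible = {"cycles": 600, "births": 210}    # had embryos that could have been biopsied
non_pgta_ineligible = {"cycles": 400, "births": 20}   # too few embryos for PGT-A to be an option


def live_birth_rate(births, cycles):
    """Crude live birth event rate per cycle."""
    return births / cycles


def odds_ratio(p_treated, p_control):
    """Odds ratio comparing two event probabilities."""
    return (p_treated / (1 - p_treated)) / (p_control / (1 - p_control))


pgta_rate = live_birth_rate(pgta["births"], pgta["cycles"])

# Comparator 1: every non-PGT-A cycle, eligible or not (the crude comparison).
crude_rate = live_birth_rate(
    non_pgta_eligible["births"] + non_pgta_ineligible["births"],
    non_pgta_eligible["cycles"] + non_pgta_ineligible["cycles"],
)

# Comparator 2: only cycles that could have had PGT-A (like with like).
eligible_rate = live_birth_rate(
    non_pgta_eligible["births"], non_pgta_eligible["cycles"]
)

crude_or = odds_ratio(pgta_rate, crude_rate)
like_for_like_or = odds_ratio(pgta_rate, eligible_rate)

print(f"crude OR: {crude_or:.2f}, like-for-like OR: {like_for_like_or:.2f}")
```

With these invented counts the crude odds ratio is above 1 while the like-for-like odds ratio is below 1, which is the pattern of reversal the argument above warns about.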

Aims
In this paper, we aim to show how naïve analyses of aggregated register data to estimate the effects of PGT-A can be severely misleading. We will utilise the publicly available HFEA Register data to determine if the observational data could in fact be compatible with the results of RCTs despite the serious limitations of these data.
Having looked at the publicly available data, we then aim to indicate how it may be possible to do a more epidemiologically credible analysis of the full HFEA Register data. Given the limitations of the register data, we consider the extent to which such an analysis has the potential to answer questions about the real-world effectiveness of PGT-A.

Dataset
We used the data publicly available for research on the HFEA website [14]; the 2015-2016 and 2017-2018 cohorts were downloaded on 27 April 2022. After minimal reformatting to harmonise variable names and coding, these were merged into a single dataset and the subset of IVF treatments commencing in 2016-2018 extracted.
From this dataset we extracted all IVF cycles. Cycles using donated eggs and PGT-M cycles were excluded. We included only cycles where the recorded intention was to proceed to implantation; this excludes treatments whose sole purpose was to create eggs or embryos for storage or donation.
These inclusion/exclusion criteria approximate to those used in creating the data provided for FoI. They differ somewhat due to differing definitions used by the HFEA for FoI requests and those provided in the public research database.
Additionally, we excluded 17 cycles with missing age, 11 fresh transfers with missing data on IVF or ICSI, 68 that were not identified as either fresh or frozen cycles and 55 recorded as both a fresh and a frozen cycle.

PGT-A cycles
All cycles recorded as using PGT-A were included.

Control cycles
PGT-A requires that the treatment proceeds as far as producing embryos for biopsy, and the Register records embryo biopsies, not the intention to perform such. As PGT-A is used for embryo selection, it usually also requires that there be more than one embryo available. Therefore, PGT-A treatment cycles must be compared to non-PGT-A (control) cycles that could at least potentially have had PGT-A, if it were available. That is, we want a control group of cycles that progressed as far as having viable embryos after embryo culture that could have been biopsied, and we want to exclude cycles that failed to progress to that stage. Unfortunately, such detailed intermediate outcomes are not available in the Register, and we have to use surrogate variables to approximate this restriction. We defined three such groups:
1. From the outcomes available, the cycles that had an embryo transferred and also had an embryo stored (cryopreserved) for future use form a subset which could potentially have had PGT-A (such cycles must have had at least 2 assessable embryos at the time of transfer). We also limit this to fresh cycles, as this represents practice at the time: most treatments involved a primary transfer of 1 or 2 fresh embryos, with frozen transfers being secondary. As the public data do not link fresh and frozen transfers, consideration of the frozen cycles was not possible using these definitions. Thus we defined a primary control group as fresh cycles with an embryo transferred and an embryo stored. This definition has the disadvantage that it excludes patients treated in centres where embryo storage is not available or in private centres where patients cannot, or choose not to, pay for storage.
2. As an alternative, we considered simply limiting the control cycles to those where a reasonable number of embryos were available for selection. As the data available were banded (lowest band pools 1-5 embryos), we selected those with > 5 embryos created. Thus, we define a secondary control group as cycles with > 5 embryos created. Data suggest that between 30 and 50% of embryos created will survive to the blastocyst stage, depending on the quality threshold used to assess a viable blastocyst [17]. Thus, 5 embryos created will, on average, give 2-3 blastocysts for PGT-A selection. This group will therefore exclude some cycles where sufficient blastocysts were created from fewer embryos; however, the nature of the banded data precludes an analysis of cycles with fewer embryos created. It will additionally include a number of cycles which suffered over-stimulation and necessitated a "freeze-all" and delayed transfer.
3. As a sensitivity analysis, we also considered a control group that consisted of cycles with > 5 embryos and an […]

All analyses were conducted in the R statistical environment (v4.1).
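The surrogate definitions of the primary and secondary control groups can be sketched as simple filters. This is a minimal illustration in Python rather than the R used for the analyses, and every field name below is a hypothetical stand-in, not a column name from the public dataset; the filters follow only the prose description above and may omit conditions applied in the actual analysis.

```python
from dataclasses import dataclass


@dataclass
class Cycle:
    # Field names are illustrative assumptions, not the public dataset's columns.
    pgta: bool                # PGT-A recorded for this cycle
    fresh: bool               # fresh (True) or frozen (False) cycle
    embryos_transferred: int
    embryos_stored: int
    created_over_5: bool      # banded count: lowest band pools 1-5 embryos


def is_primary_control(c: Cycle) -> bool:
    """Fresh non-PGT-A cycle with an embryo transferred and an embryo stored."""
    return (
        not c.pgta
        and c.fresh
        and c.embryos_transferred > 0
        and c.embryos_stored > 0
    )


def is_secondary_control(c: Cycle) -> bool:
    """Non-PGT-A cycle with more than 5 embryos created."""
    return not c.pgta and c.created_over_5


# Four made-up cycles, used only to exercise the filters.
cycles = [
    Cycle(pgta=False, fresh=True, embryos_transferred=1, embryos_stored=2, created_over_5=True),
    Cycle(pgta=False, fresh=True, embryos_transferred=1, embryos_stored=0, created_over_5=False),
    Cycle(pgta=False, fresh=False, embryos_transferred=1, embryos_stored=0, created_over_5=False),
    Cycle(pgta=True, fresh=False, embryos_transferred=1, embryos_stored=1, created_over_5=True),
]

primary = [c for c in cycles if is_primary_control(c)]
secondary = [c for c in cycles if is_secondary_control(c)]
```

Keeping each definition as a separate predicate makes it straightforward to swap control groups when refitting the same model, which is how the primary, secondary and sensitivity comparisons relate to one another.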

Primary outcome
The primary outcome in the analyses here is a live birth event (LBE) in the specific treatment cycle. The inclusion criteria restrict this to cycles started with an intention to create a baby and the selection of cases and controls to those cycles that have progressed as far as having embryos considered suitable for transfer.

Outcomes not considered
Ideally, we would have used LBE in the first transfer of a sequence of cycles associated with each egg retrieval, mimicking one of the outcomes in the Cochrane review [9]. Whilst the Register does contain the information necessary to determine this, the publicly released data used here do not.
We would also like to consider cumulative LBE over all the transfers following an egg retrieval. This is potentially derivable from the full Register, but not available in the public dataset. Register data are not yet available beyond 2018, and therefore there is not yet long enough follow-up for cycles commenced in 2018.
Multiple birth outcomes would also form part of a comprehensive evaluation, both multiples per LBE and per cycle.
LBE per embryo transferred is not an appropriate comparison for a treatment option which reduces the number of cycles where embryo transfer takes place.

Models
We fitted logistic regression models for LBE with the following covariates:
1. PGT-A as a main effect plus age band and the covariates listed below. This provides estimates of the overall effect of PGT-A versus the controls.
2. PGT-A, age band and their interaction along with the covariates listed below. This model was parametrised to give estimates of the PGT-A effect in each age band.
Model estimates are presented as OR with 95% CI for the overall effects and 99% CI for the individual age bands to account for multiple testing. A likelihood ratio test comparing the two models was used to test the statistical significance of the interaction term, that is, whether the effect of PGT-A varied with age group.
Control cycle selection expression: Embryos.Created > 5 & Embryos.Transferred > 0 & Fresh
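As a hedged sketch of the two reporting steps described here, the following Python (standard library only, not the R code used for the analyses) converts a log-odds coefficient and its standard error into an odds ratio with a Wald interval, and computes a likelihood ratio test from two model log-likelihoods. The Wald form of the interval is an assumption, as the text does not state how the intervals were derived; the function names and any input values are illustrative.

```python
import math
from statistics import NormalDist


def odds_ratio_ci(beta, se, level=0.95):
    """Odds ratio and Wald confidence interval from a log-odds coefficient."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def chi2_sf(x, df):
    """Upper-tail chi-square probability.

    Uses the power series for the regularised lower incomplete gamma
    function, which is adequate for the moderate statistics seen here.
    """
    if x <= 0:
        return 1.0
    a, half_x = df / 2, x / 2
    term = total = 1.0 / a
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= half_x / (a + n)
        total += term
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def likelihood_ratio_test(loglik_reduced, loglik_full, extra_params):
    """Test statistic and p-value comparing nested models.

    extra_params is the number of additional parameters in the full model;
    for a PGT-A by age-band interaction with k bands this is k - 1.
    """
    stat = 2 * (loglik_full - loglik_reduced)
    return stat, chi2_sf(stat, extra_params)


# Illustrative inputs only: the standard error is hypothetical.
or_95 = odds_ratio_ci(math.log(0.82), 0.10, level=0.95)
or_99 = odds_ratio_ci(math.log(0.82), 0.10, level=0.99)
```

The 99% interval is necessarily wider than the 95% interval for the same coefficient, which is why the age-band estimates appear less precise than the overall effect even before their smaller sample sizes are considered.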

Covariates included
Patient characteristics: age, previous ART cycles (as categorised), previous LB (noting we only have births arising from ART), cause of infertility (5 binary variables). The effect of age is known to be strongly non-linear, and an ideal representation should capture the shape of this relationship rather more accurately than the limited age bands.
Treatment specification: IVF/ICSI, eSET (which should be pre-specified rather than consequential on the availability of embryos), use of hormonal stimulation (as this will affect the uterine environment).
Year: to allow for temporal effects.

Covariates not included
Fresh/frozen: is at least in part related to the use of PGT so should not be included. Number of embryos transferred: is dependent on the embryo selection so should not be included.

Covariates that could not be included
Pre-PGT outcomes: (eggs collected and embryos created) should be included as they are associated with the decision to conduct PGT-A, but are not available in this dataset for frozen cycles, which form most of the PGT-A cohort. Also, the public data aggregate those with 1-5 eggs/embryos and so lack the detail at the low end which is necessary to use this as a covariate.
Embryo stage at transfer: should be included but is not available for most frozen cycles in this dataset; it is potentially available in the Register.
Centre and funding source: PGT-A is offered by relatively few centres to self-funded patients. Thus, the PGT-A-treated patients have potentially very different demographics, treatment protocols and pathways along with differing treatment eligibility criteria. Although not in the public dataset, these data are available in the full Register.
Ethnicity: This is in the Register but not the public dataset and may be incomplete or of poor quality.
Duration of infertility: This is in the Register but not the public dataset and may be incomplete or of poor quality.

Patient and treatment characteristics
These are shown in Table 1 for the PGT-A and control datasets. The control groups comprise ~ 25% of the non-PGT-A cycles; 32% of the non-PGT-A cycles are the secondary frozen cycles; and 39% have few embryos created and are therefore excluded as controls. PGT-A was recorded in a small number of cycles where fewer than 6 embryos were created. Of the PGT-A cycles, 13% were associated with fresh cycles.

PGT-A v non-PGT-A
Comparing PGT-A with all non-PGT-A cycles, the dataset used here replicates the results derived from the FoI dataset (Supplementary Table S2). It is worth noting that the limited covariate adjustment does not make a big difference to the estimates, although as discussed above, many important covariates are not available in the public dataset.

PGT-A v controls
Taking a plausibly appropriate control group of cycles which could have had PGT-A, we see that the treatment effect of PGT-A is markedly different, with overall OR for LBE of 0.82 (0.68–1.00) using the > 1 transferrable embryo controls (Table 2) and 0.80 (0.64–0.99) using the > 5 embryos created controls (Table 3), both suggesting that PGT-A has a negative effect on LBE. The estimates are compatible with the estimates from the recent randomised trials.
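The compatibility claim can be checked informally against the figures quoted in this paper. The sketch below uses only the primary estimate and the three trial estimates cited in the Background; the overlap criterion is an illustrative assumption and not a formal test of heterogeneity, and the trial in [12] reports a cumulative live birth outcome rather than a per-cycle one.

```python
# Estimates quoted in the text: (odds ratio, lower 95% limit, upper 95% limit).
register_primary = (0.82, 0.68, 1.00)
trials = {
    "[10]": (0.93, 0.69, 1.30),
    "[11]": (0.91, 0.51, 1.63),
    "[12]": (0.75, 0.57, 1.00),  # cumulative live birth outcome
}


def intervals_overlap(a, b):
    """True if the confidence intervals of two estimates share any values."""
    return a[1] <= b[2] and b[1] <= a[2]


def point_inside(point, estimate):
    """True if a point estimate lies within another estimate's interval."""
    return estimate[1] <= point <= estimate[2]


compatibility = {
    ref: {
        "ci_overlap": intervals_overlap(register_primary, est),
        "register_or_in_trial_ci": point_inside(register_primary[0], est),
    }
    for ref, est in trials.items()
}

for ref, checks in compatibility.items():
    print(ref, checks)
```

On these quoted figures the Register point estimate of 0.82 lies inside each trial's interval, which is the informal sense in which the observational estimate is compatible with the randomised evidence.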
There is evidence of an age by PGT-A interaction with PGT-A being less disadvantageous in older women. However, there are few of these in the dataset and the selection biases are very strong in this age group where NHS treatment is not available, so this has to be treated very cautiously.
A sensitivity analysis with a more tightly defined control group shows similar results (Supplementary Table S3).

Discussion
The estimates derived here demonstrate clearly that the use of crude aggregated national data which fail to account for confounding and treatment selection effects [16] is highly misleading. The analyses show that the conclusions can be reversed by selecting controls more carefully and making a clinically relevant comparison. We must stress that the analysis presented here is not intended to be definitive, as the public data contain too little information on some important confounders and do not allow consideration of the multiple cycles that comprise a full ART treatment. There is a suggestion (as in some of the trials, e.g. [10]) that the efficacy of PGT-A in the first transfer increases with age, which may be worthy of further study and further targeted RCTs.
Based on our experience with the public dataset, we can begin to define what an epidemiologically sound analysis to address the effectiveness of PGT-A using the HFEA register […]

Table 1 Characteristics of the PGT-A cycles, non-PGT-A cycles and the two control groups