## Abstract

The *Ariel* Space Mission aims to observe a diverse sample of exoplanet atmospheres across a wide wavelength range of 0.5 to 7.8 microns. The observations are organized into four Tiers, with *Tier 1* being a reconnaissance survey. This Tier is designed to achieve a sufficient signal-to-noise ratio (S/N) at low spectral resolution in order to identify featureless spectra or detect key molecular species without necessarily constraining their abundances with high confidence. We introduce a *P*-statistic that uses the abundance posteriors from a spectral retrieval to infer the probability of a molecule’s presence in a given planet’s atmosphere in Tier 1. We find that this method predicts probabilities that correlate well with the input abundances, indicating considerable predictive power when retrieval models have comparable or higher complexity compared to the data. However, we also demonstrate that the *P*-statistic loses representativity when the retrieval model has lower complexity, expressed as the inclusion of fewer than the expected molecules. The reliability and predictive power of the *P*-statistic are assessed on a simulated population of exoplanets with H\(_2\)-He dominated atmospheres, and forecasting biases are studied and found not to adversely affect the classification of the survey.


## 1 Introduction

During the past decade, the number of exoplanet discoveries has increased exponentially, bringing the total number of confirmed exoplanets to more than 5000 by mid-2022. Numerous space missions are contributing to the effort of detecting new exoplanets, such as Kepler [1, 2], TESS [3], CHEOPS [4], PLATO [5], GAIA [6], together with ground instrumentation such as HARPS [7], WASP [8], KELT [9], and OGLE [10]. Over time, the field emphasis has gradually expanded from the determination of bulk planetary parameters to the search for a deeper understanding of the true nature of exoplanets and their formation-evolution histories.

Multiband photometry and spectroscopy of transiting exoplanets are currently the most promising techniques for characterizing the composition and thermodynamics of exoplanet atmospheres [11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30], as they allow us to effectively separate the signal of the planet from that of its host star. Observations in the near- to mid-infrared can probe the neutral atmospheres of exoplanets to study the signal from the rovibrational transitions of molecules [15, 31].

Current instrumentation has enabled this kind of atmospheric characterization only for a few tens of planets orbiting close to their host stars over a limited wavelength range [e.g. 17, 19, 32, 33]. A considerable contribution to exoplanetary science will come from the James Webb Space Telescope (*JWST*), launched in December 2021 [34], and *Ariel*. *JWST* provides broadband spectroscopy in the 0.6–28.5 \(\mu m\) range of the electromagnetic spectrum, sufficient to detect all molecular species [31, 35,36,37,38,39].

### 1.1 Ariel and its Tiers

The Atmospheric Remote-Sensing Infrared Exoplanet Large-survey, *Ariel*, will launch in 2029 as the M4 ESA mission of the Cosmic Vision program [40, Ariel Definition Study Report^{Footnote 1}]. *Ariel* will conduct the first unbiased survey of a statistically significant sample of approximately 1000 transiting exoplanet atmospheres in the 0.5–7.8 \(\mu m\) wavelength range. Three photometers (VISPhot, 0.5–0.6 \(\mu m\); FGS1, 0.6–0.80 \(\mu m\); FGS2, 0.80–1.1 \(\mu m\)) and three spectrometers (NIRSpec, 1.1–1.95 \(\mu m\) and R \(\ge \) 15; AIRS-CH0, 1.95–3.9 \(\mu m\) and R \(\ge \) 100; AIRS-CH1, 3.9–7.8 \(\mu m\) and R \(\ge \) 30), provide simultaneous coverage of the whole spectral band. This broad spectral range encompasses the emission peak of hot and warm exoplanets and the spectral signatures of the main expected atmospheric gases such as H\(_2\)O, CO\(_2\), CH\(_4\), NH\(_3\), HCN, H\(_2\)S, TiO, VO [e.g. 15, 31]. *Ariel* will allow us to comprehensively understand the formation-evolution histories of exoplanets as well as to extend comparative planetology beyond the boundary of the Solar System.

After each observation, the resulting spectrum from each spectrometer is binned during data analysis to optimize the signal-to-noise ratio (S/N). Therefore, by implementing different binning options, the mission will adopt a four-Tier observation strategy, expected to produce spectra with different S/N to optimize the science return. Tier 1 is a shallow reconnaissance survey created to perform transit and eclipse spectroscopy on all targets to address questions for which a large population of objects needs to be observed. Tier 1 spectra have S/N \(\ge \) 7 when raw spectra are binned into a single spectral point in NIRSpec, two in AIRS-CH0, and one in AIRS-CH1, for a total of seven effective photometric data points. A subset of Tier 1 planets will be further observed to reach S/N \(\ge \) 7 at higher spectral resolution in Tier 2 and Tier 3 for detailed chemical and thermodynamic characterization of the atmosphere. Tier 4 is designed for bespoke or phase-curve observations [41].

### 1.2 Detecting molecules in Tier 1 spectra

Among the main goals of Tier 1 observations is to identify planetary spectra that show no molecular absorption features (because of clouds or compact atmospheres) and to select those to be reobserved in higher Tiers for a detailed characterization of their atmospheric composition and thermodynamics. Tier 1 observations, however, have a much richer information content even though the combination of S/N and spectral resolution might not be adequate to constrain chemical abundances with high confidence using retrieval techniques.

Adapting existing data analysis techniques or developing new methodologies can be essential to extract all relevant information from the Tier 1 data set. In a previous study, [42] demonstrated, using color-color diagrams, that Tier 1 observations can be used to infer the presence of molecules in the atmospheres of gaseous exoplanets, independently of planet parameters such as mass, size, and temperature. However, their method has an estimator bias that depends on the magnitude of the instrumental noise; a detailed characterization of instrumental uncertainties is required to remove this bias before the method can be used for quantitative predictions. In this follow-up paper, we develop a new method that is both reliable and unbiased to address the following question: *can we use Tier 1 transmission spectra to identify the presence of a molecule, with an associated calibrated probability?* Such calibrated probabilities can also inform the decision-making process that selects Tier 1 targets for re-observation in *Ariel*’s higher Tiers for detailed characterization.

Section 2 outlines the methodology used in this analysis. Section 2.1 describes our data analysis strategy for detecting a molecule in these spectra. Section 2.2 details our experimental data set, including the planetary population, forward model parameters, atmosphere randomization, and noise estimation. Section 2.3 summarizes the spectral retrievals performed, discussing the optimization algorithm and the priors used. Section 2.5 describes the data analysis tools used to evaluate the probability forecasts of the method. Section 3 details the results obtained in terms of forecast reliability (Section 3.1), predictive power (Section 3.2), and bias of the abundance estimator utilized (Section 3.3). Finally, Section 4 discusses all the results, and Section 5 summarizes the main conclusions of this analysis.

## 2 Methods

Tier 1 transmission spectra contain sufficient information to infer the presence of several atmospheric molecules [42], but Tier 1 observations are in general non-ideal for quantitative spectral retrievals in terms of molecular abundances, as they are only required to achieve an S/N \(\ge 7\) when binned into seven effective photometric data points in the 0.5–7.8 \(\mu m\) wavelength range [41]. Abundance posterior probabilities from retrievals can, however, still be informative, and here we develop a new method to identify the presence of molecules in Tier 1 transmission spectra starting from these posteriors.

### 2.1 Analysis strategy

Given a marginalized posterior distribution of a molecular abundance, we compute an empirical probability, *P*, that the molecule is present in the atmosphere of a planet, with an abundance above some threshold, \(\mathbb {T}_{Ab}\), as:

\[ P = \int _{\mathbb {T}_{Ab}}^{\infty } \mathcal {P}(x)\, dx \qquad (1) \]

where \(\mathcal {P}\) is the marginalized posterior distribution and *x* represents the abundance values. Thus, the predicted *P* depends on the assumed atmospheric model and the selected abundance threshold \(\mathbb {T}_{Ab}\). If the assumed atmospheric model is representative of the observed atmosphere, then a clear correlation (above noise) between *P* and the true abundance in Tier 1 data implies that *P* can be used to identify the most likely spectra that contain a molecule, providing a preliminary classification of planets by their molecular content. Thus, this *P*-statistic can be considered robust [43], even when \(\mathcal {P}(x)\) is too broad to constrain the abundance.
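In practice, a retrieval returns samples of the posterior rather than a closed-form \(\mathcal {P}(x)\), so *P* can be approximated by the (weighted) fraction of samples above the threshold. The following is a minimal sketch under that assumption; the function name `p_statistic`, the use of \(\log _{10}\) abundances, and the example samples are illustrative choices and not part of the retrieval framework used in this work.

```python
import math


def p_statistic(log10_samples, threshold=1e-5, weights=None):
    """Posterior mass above `threshold`, estimated from samples.

    log10_samples: posterior samples of log10(abundance) for one molecule.
    weights: optional sample weights (e.g. from nested sampling);
             equal weights are assumed when omitted.
    """
    if weights is None:
        weights = [1.0] * len(log10_samples)
    if len(weights) != len(log10_samples):
        raise ValueError("samples and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    log10_threshold = math.log10(threshold)
    # Sum the weight of every sample whose abundance exceeds the threshold.
    above = sum(w for x, w in zip(log10_samples, weights) if x > log10_threshold)
    return above / total


# Hypothetical posterior samples of log10(VMR), for illustration only.
samples = [-8.0, -6.0, -4.0, -3.0]
print(p_statistic(samples))  # two of the four samples lie above 1e-5
```

A broad posterior that straddles the threshold yields an intermediate *P*, while a posterior concentrated well above or below it yields *P* close to 1 or 0.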

To test whether this method is sensitive enough, we need to simulate transmission spectra as observed in Tier 1, using an atmospheric model that includes a certain number of molecules. Then, we need to perform a spectral retrieval with the same atmospheric model and compare each input molecular abundance with the predicted *P* corresponding to that molecule. The test is successful if, for an agreed \(\mathbb {T}_{Ab}\), we recover a high *P* for each large input abundance and a low *P* for each small input abundance. To understand how well the method behaves under conditions similar to the *Ariel* reconnaissance survey, we repeat this test on a large and diverse planetary population.

In this study, we employ a simulated population of approximately 300 transmission spectra of H\(_2\)-He gaseous planets, which contain CH\(_4\), H\(_2\)O, and CO\(_2\) trace gases with randomized input abundances. Additionally, we introduce NH\(_3\) with randomized abundances as a nuisance parameter since its spectral features overlap with those of water and other molecules. We utilize NH\(_3\) to test the *P*-statistic’s efficacy and investigate the robustness of its predictions under various assumptions, such as the exclusion of NH\(_3\) from retrievals or the inclusion of additional molecules not present in the population.

We can therefore study whether this method provides reliable predictions under less favorable conditions, when the assumed model is not fully representative of the observed atmosphere. This might provide some insight into how robustly the method can reveal the presence of a molecule in a real observation when the atmosphere is unknown. For this, we add or remove molecules from the retrieval model (hereafter, “fit-composition”) with respect to the simulated composition. Then, we perform different spectral retrievals that use different fit-compositions and compare the predictions obtained from the *P*-statistic with the input abundances.

#### 2.1.1 Model exploration

We consider three cases in our analysis. In the first case (referred to as \(\textrm{R}_0\)), we use an atmospheric model that includes CH\(_4\), H\(_2\)O, CO\(_2\), and NH\(_3\) as trace gases, which matches the composition used in the forward model generation of the population.

In the second case (referred to as \(\textrm{R}_1\)), we consider a fit-composition that includes only CH\(_4\), CO\(_2\), and H\(_2\)O, omitting NH\(_3\). In this case, there is a possibility of inadequate representation of the data because NH\(_3\)’s molecular features could overlap with the observed features of other molecules (hence its adoption as a nuisance), particularly H\(_2\)O [31]. As a result, the retrieved values of *P* may not accurately reflect the input abundances of H\(_2\)O, leading to decreased reliability of the predictions.

In the third case (referred to as \(\textrm{R}_2\)), we expand the fit-composition beyond the input composition by including also CO, HCN, and H\(_2\)S. It should be noted that the spectral features of these additional molecules could also overlap with the observed features of the other molecules. For instance, CO and CO\(_2\) exhibit a spectral overlap around \(4.5\,\mu m\). Hence, even in this case, obtaining reliable predictions of the input composition may not be obvious.

Table 1 provides a summary of the molecules included in the fit-composition for each retrieval. For more detailed information on the retrievals performed, please refer to Section 2.3.

### 2.2 Experimental data set

As a simulated population, we use a planetary population generated using the Alfnoor software [42, 44]. Alfnoor is a wrapper of TauREx 3 [45] and ArielRad [46]. Given a list of candidate targets and a model of the *Ariel* payload, it automatically computes simulated exoplanet spectra as observed in each *Ariel* Tier.

Specifically, we use a subset of the POP-I planetary population of [42]. POP-I consists of 1000 planets from a possible realization of the *Ariel* Mission Reference Sample (MRS) of [41]. That MRS (hereafter, MRS19) comprises known planets in 2019 from NASA’s Exoplanet Archive and TESS forecast discoveries. Here we ignore the TESS forecasts, thus obtaining a sub-population of around 300 planets, which we label POP-Is. Using POP-Is planets ensures that, in principle, we can compare our results with those of [42].

Figure 1 shows that POP-Is comprises a diverse sample of planets, mostly with large radii (\(\gtrsim \) 5 \(\textrm{R}_{\oplus }\)), short orbital periods (\(\le \) 4–5 days), warm to hot equilibrium temperatures (500–2500 K), and stellar hosts spanning a range of K-band magnitudes (8–12 \(m_K\)). Compared to the parameter space sampled by the entire POP-I, this data set has sparser statistics on smaller and longer-period planets around brighter stars.

The detailed properties of POP-I (and therefore POP-Is) are discussed in [42] and briefly summarized here. The forward model parameters are randomized to test diverse planetary atmospheres. The baseline atmosphere is a primordial atmosphere filled with H\(_2\) and He with a solar mixing ratio of He/H\(_2\) = 0.17. The vertical structure of the atmosphere comprises 100 pressure layers, uniformly distributed in log space from 10\(^{-4}\) to 10\(^{6}\) Pa, using the plane-parallel approximation. The equilibrium temperature of each planet is randomized between \(0.7 \times T_{p}\) and \(1.05 \times T_{p}\), where \(T_{p}\) is the equilibrium temperature of the planet listed in MRS19; the atmospheric temperature-pressure profile is isothermal. Constant vertical chemical profiles are added for H\(_2\)O, CO\(_2\), CH\(_4\), and NH\(_3\), with abundances randomized according to a logarithmic uniform distribution spanning 10\(^{-7}\) to 10\(^{-2}\) in Volume Mixing Ratio (VMR). Randomly generated opaque gray clouds are also added with a surface pressure varying from 5\(\times \)10\(^{2}\) to 10\(^{6}\) Pa to simulate cloudless to overcast atmospheres. Table 2 summarizes the randomized parameters of the POP-I forward models.

For each planet, POP-I contains the raw spectrum binned at each *Ariel* Tier resolution (“noiseless spectra”), the associated noise predicted by the *Ariel* radiometric simulator, ArielRad, for each spectral bin, and the number of transit observations expected to reach the Tier-required S/N. To simulate an observation, we scatter the noiseless spectra according to a normal distribution with a standard deviation equal to the noise at each spectral bin. The “observed spectra” data set is built by repeating this process for each planet in POP-Is. As in [42], the Tier 1 data used in this work are binned on the higher resolution Tier 3 spectral grid: R = 20, 100, and 30 in NIRSpec, AIRS-CH0, and AIRS-CH1, respectively. The noise is that of Tier 1, which would yield an S/N > 7 if the data were binned on the Tier 1 spectral grid. Using the finer grid prevents the loss of spectral information that may occur in binning.
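The two randomization steps described above, the log-uniform draw of an input abundance and the Gaussian scatter of a noiseless spectrum, can be illustrated with a short sketch. This is not the Alfnoor implementation; the function names and the three-bin spectrum with its noise values are hypothetical and serve only to show the procedure.

```python
import math
import random


def draw_log_uniform(rng, low=1e-7, high=1e-2):
    """Draw one abundance from a log-uniform distribution on [low, high]."""
    return 10.0 ** rng.uniform(math.log10(low), math.log10(high))


def scatter_spectrum(noiseless, noise, rng):
    """Scatter each spectral bin with a normal deviate of width `noise`."""
    if len(noiseless) != len(noise):
        raise ValueError("spectrum and noise must have the same length")
    return [rng.gauss(depth, sigma) for depth, sigma in zip(noiseless, noise)]


rng = random.Random(42)  # fixed seed so the example is reproducible

# One randomized input abundance (VMR) for a trace gas.
abundance = draw_log_uniform(rng)

# Hypothetical transit depths and per-bin noise, for illustration only.
noiseless = [0.0100, 0.0102, 0.0101]
noise = [1e-4, 1e-4, 1e-4]
observed = scatter_spectrum(noiseless, noise, rng)
```

Repeating `scatter_spectrum` for every planet would produce the analogue of the “observed spectra” data set.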

### 2.3 Retrievals summary

To perform the retrievals, we use the TauREx 3 retrieval framework [45], the same used to generate the raw POP-Is spectra. In the retrieval model, we include opaque gray clouds, pressure-dependent molecular opacities of various trace gases, Rayleigh scattering, and Collision-Induced Absorption (CIA) of H\(_2\)-H\(_2\) and H\(_2\)-He. Table 3 reports a referenced list of CIA and all molecular opacities used in this study.

The free parameters of the retrievals are the radius and mass of the planet, as well as the molecular mixing ratios, as listed in Table 4. We use broad logarithmic uniform priors for the molecular abundances, ranging from 10\(^{-12}\) to 10\(^{-1}\) in VMR. For the mass and radius of the planet, we select uniform priors of 20\(\%\) and 10\(\%\) around the respective values listed in MRS19. The gray cloud pressure levels are not included as free parameters in the retrieval because of their degeneracy with other parameters such as the radius [60].

We set the evidence tolerance to 0.5 and sample the parameter space through 1500 live points using the MultiNest algorithm^{Footnote 2} [61, 62]. We disable the search for multiple modes to obtain a single marginalized posterior distribution of each molecular abundance to insert in Eq. 1.

We then perform the three different retrievals (respectively R\(_0\), R\(_1\), and R\(_2\)) described in Section 2.1 on each POP-Is planet. We use the Atmospheric Detectability Index (ADI) [19] to assign statistical significance to the results of these retrievals. Given the Bayesian evidence of a nominal retrieval model, \(E_{N}\), and of a pure-cloud/no-atmosphere model, \(E_{F}\), the ADI is:

\[ \textrm{ADI} = \begin{cases} \log (E_{N}) - \log (E_{F}), & \text{if } \log (E_{N}) > \log (E_{F}) \\ 0, & \text{otherwise} \end{cases} \qquad (2) \]

ADI is a positively defined metric, equivalent to the log-Bayesian factor [63, 64] when log(\(E_{N}\)) > log(\(E_{F}\)). To compute \(E_{F}\), we perform an additional retrieval for each planet with a flat-line model with the planet radius being the only free parameter.
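As a minimal sketch of this definition, assuming the ADI equals the difference of the two log-evidences when that difference is positive and zero otherwise, the computation reduces to a single clipped subtraction. The function name `adi` and the example log-evidence values are hypothetical.

```python
def adi(log_evidence_nominal, log_evidence_flat):
    """ADI from the log-evidences of the nominal and flat-line retrievals.

    Returns the log-Bayes factor when the nominal model is favored,
    and 0 otherwise, so the metric is never negative.
    """
    return max(log_evidence_nominal - log_evidence_flat, 0.0)


# Hypothetical log-evidences, for illustration only.
print(adi(-100.0, -110.0))  # nominal model favored
print(adi(-110.0, -100.0))  # flat-line model favored, ADI clipped to zero
```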

### 2.4 Abundance threshold

We use the marginalized posteriors to estimate the *P*-statistic with an abundance threshold of \(\mathbb {T}_{Ab} = 10^{-5}\), below which an atmosphere is considered “molecular-poor” according to the definition by [42]. This threshold is 1–2 orders of magnitude higher than the Tier-2 detection limits reported by [44]. The “molecular-poor” condition is met for approximately 40% of the atmospheres due to the randomization boundaries set for each molecule (see Table 2). The ability to detect a molecule depends on factors such as opacities, correlations among molecules, and noise in the measured spectrum. Therefore, \(\mathbb {T}_{Ab}\) can be optimized for each molecule in future work, although we apply the same abundance threshold to all molecules in this pilot study.
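The quoted fraction follows from the log-uniform randomization of the input abundances. Assuming the bounds of 10\(^{-7}\) and 10\(^{-2}\) given in Section 2.2, the probability that a randomized abundance falls below \(\mathbb {T}_{Ab} = 10^{-5}\) is the corresponding fraction of the interval in \(\log _{10}\) space:

\[ \frac{\log _{10}(10^{-5}) - \log _{10}(10^{-7})}{\log _{10}(10^{-2}) - \log _{10}(10^{-7})} = \frac{-5 - (-7)}{-2 - (-7)} = \frac{2}{5} = 40\% \]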

### 2.5 Data analysis tools

The *P*-statistic can be used to reliably classify planets for the presence of a molecule with an abundance above \(\mathbb {T}_{Ab}\) when *P* correlates with the true abundance. The stronger the correlation above noise fluctuations, the larger the predictive power. Because this classification is binary and *P* is defined in the range \([0, 1]\), we can use standard statistical tools such as calibration curves and ROC curves [65, 66] to evaluate the performance of this method in revealing the presence of molecules and in selecting Tier 1 targets for higher Tiers. These curves are routinely utilized by the Machine Learning community^{Footnote 3}, as they present the forecast quality of a binary classifier in a well-designed graphical format.

#### 2.5.1 Calibration curves

A calibration curve [e.g. 66] plots the forecast probability averaged in different bins on the horizontal axis and the fraction of positives, in each bin, on the vertical axis (see Fig. 2 for a generic example). In this work, the fraction of positives is the fraction of POP-Is planets with true abundance larger than \(\mathbb {T}_{Ab}\), and the forecast probability is the corresponding *P*-statistic. Calibration curves provide an immediate visual diagnosis of the quality of binary classifier forecasts and the biases that the forecasts may exhibit.

For well-calibrated predictions, the forecast probability is equal to the fraction of positives, except for deviations consistent with sampling variability. Therefore, the ideal calibration curve follows the 1:1 line. Miscalibrated forecasts can be biased differently depending on whether the calibration curve lies on the left or on the right of the 1:1 line. A curve entirely to the right of the 1:1 line indicates an over-forecasting bias, as the forecasts are consistently too large relative to the fraction of positives, as seen in the calibration curve of Classifier 1 in Fig. 2. On the contrary, the calibration curve of Classifier 2 shows the characteristic signature of under-forecasting, being entirely on the left of the 1:1 line, indicating that the forecasts are consistently too small relative to the fraction of positives. There may also be more subtle deficiencies in forecast performance, such as an under-confident forecast, with over-forecasting biases associated with lower probabilities and under-forecasting biases associated with higher probabilities, as seen in the calibration curve of Classifier 3.

Calibration curves paint a detailed picture of forecast performance, often summarized in a scalar metric known as the Brier Score [B-S, 68], which is defined as the mean square difference between probability forecasts and true class labels (positive or negative); the lower the B-S, the better the predictions are calibrated. From Fig. 2, we see that Classifier 3 achieves the best B-S, although the forecasts are not well calibrated. In general, uncalibrated forecasts can be calibrated using calibration methods such as Platt scaling and Isotonic regression [69,70,71].
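Both quantities are simple to compute from a list of forecasts and true class labels. The sketch below is a plain-Python illustration, assuming equal-width probability bins; the function names, the bin count, and the four example forecasts are hypothetical. Equivalent utilities exist in scikit-learn (`calibration_curve` and `brier_score_loss`).

```python
def brier_score(labels, probs):
    """Mean squared difference between forecasts and 0/1 class labels."""
    if len(labels) != len(probs) or not labels:
        raise ValueError("labels and probs must be non-empty and equal length")
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


def calibration_curve(labels, probs, n_bins=5):
    """Return (mean forecast, fraction of positives) for each non-empty bin."""
    bins = [[0.0, 0, 0] for _ in range(n_bins)]  # [sum of P, positives, count]
    for y, p in zip(labels, probs):
        if not 0.0 <= p <= 1.0:
            raise ValueError("forecast probabilities must lie in [0, 1]")
        i = min(int(p * n_bins), n_bins - 1)  # P = 1 goes in the last bin
        bins[i][0] += p
        bins[i][1] += y
        bins[i][2] += 1
    return [(s / n, pos / n) for s, pos, n in bins if n > 0]


# Hypothetical forecasts: label 1 means true abundance above the threshold.
labels = [0, 0, 1, 1]
probs = [0.1, 0.3, 0.7, 0.9]
print(brier_score(labels, probs))
print(calibration_curve(labels, probs, n_bins=2))
```

In this toy example the high-probability bin contains only positives, so its point lies to the left of the 1:1 line, the signature of under-forecasting described above.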

#### 2.5.2 ROC curves

Given the predicted probabilities of a classifier, and a selected probability threshold \(\mathbb {P}\), the number of True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN), are defined in Table 5.

A binary classifier with high predictive power assigns larger *P* to positive observations (true label “Yes”) and smaller *P* to negative (true label “No”). This maximizes TP and TN, and minimizes FP and FN.

A ROC curve [e.g. 66] is a square diagram that illustrates the predictive power at different values of the probability threshold \(\mathbb {P}\). It plots the False Positive Rate (FPR) on the horizontal axis and the True Positive Rate (TPR) on the vertical axis (see Fig. 3 for a generic example), defined as:

\[ \textrm{FPR} = \frac{\textrm{FP}}{\textrm{FP} + \textrm{TN}}, \qquad \textrm{TPR} = \frac{\textrm{TP}}{\textrm{TP} + \textrm{FN}} \qquad (3) \]

FPR and TPR are commonly known as “false alarm” and “hit” rates. ROC curves are constructed by calculating the TPR and FPR from the number of TP, TN, FP, and FN as \(\mathbb {P}\) decreases from 1 to 0. The ideal classifier minimizes the FPR while maximizing the TPR; thus, its ROC curve is the unit step function. On the other hand, the worst possible classifier is a random classifier with a ROC curve along the 1:1 line. Real-world classifiers have intermediate ROC curves ranked by how close they are to the unit step function. As seen in Fig. 3, Classifier 3 exhibits the highest predictive power, as the corresponding ROC curve arcs everywhere above the ROC curves for Classifiers 1 and 2.

ROC curves portray a detailed picture of predictive power, often summarized in a scalar metric known as the Area Under the Curve (AUC), the fraction of the unit square area subtended by a ROC curve. The higher the AUC, the higher the predictive power. The ideal classifier has \(\text {AUC} = 1.0\); the random one has \(\text {AUC} = 0.5\). From Fig. 3, we see that, as expected, Classifier 3 also achieves the largest AUC.
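The construction of a ROC curve and its AUC can be sketched in a few lines, assuming that a sample is classified as positive when its forecast is at or above the threshold and that the area is integrated with the trapezoidal rule. The function names and the example forecasts are hypothetical.

```python
def roc_curve(labels, probs):
    """Return (fpr, tpr) lists as the threshold decreases from 1 to 0."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    fpr, tpr = [0.0], [0.0]  # threshold above every forecast: no positives
    for threshold in sorted(set(probs), reverse=True):
        tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
        fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
        tpr.append(tp / n_pos)
        fpr.append(fp / n_neg)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2.0
        for i in range(1, len(fpr))
    )


# Hypothetical forecasts with one negative ranked above one positive.
labels = [1, 0, 1, 0]
probs = [0.9, 0.8, 0.7, 0.6]
fpr, tpr = roc_curve(labels, probs)
print(auc(fpr, tpr))
```

A classifier that ranks every positive above every negative traces the unit step function and returns an AUC of 1 with this sketch.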

ROC curves can also be used to select the optimal classification threshold \(\mathbb {P}\), which roughly corresponds to the position on the curve where the TPR cannot be raised without significantly increasing the FPR. For example, as seen in Fig. 3, the optimal \(\mathbb {P}\) for Classifier 3 is around 0.5, where it achieves a TPR of nearly 0.9 at a low FPR of approximately 0.1. Reducing \(\mathbb {P}\) to 0.4 is not advantageous, as it only increases the TPR to approximately 0.95, at the expense of increasing the FPR to almost 0.3.

### 2.6 Using calibration and ROC curves

Using calibration curves and the B-S metric, we can immediately diagnose the forecast quality of the *P*-statistic and its potential biases. Suppose that the forecast probability *P* matches the fraction of planets with input abundances greater than \(\mathbb {T}_{Ab}\) (fraction of positives) in each probability bin. In that case, the prediction of the method is well-calibrated. Moreover, we can compare the forecast quality achieved for different molecules using the B-S metric. If the forecasts are not well calibrated, we can infer which kind of bias affects the predictions of the method by inspecting the shape of the calibration curve. If the forecasts show an over-forecasting bias (as in the example of Classifier 1, Fig. 2) and therefore incorrectly classify a fraction of planets as bearing a molecule, too many Tier 1 planets may be selected for re-observation in higher Tiers, resulting in less optimal scheduling of observations. On the contrary, an under-forecasting bias (as in the example of Classifier 2, Fig. 2) may imply that fewer Tier 1 planets than possible would be scheduled for re-observing in higher Tiers.

Using ROC curves and the AUC metric, the power of the *P*-statistic to predict the presence of molecules can be assessed. The closer the ROC curve approaches the unit step function (AUC \(\simeq 1\), Fig. 3), the higher the predictive power. Moreover, we can directly compare the predictive power achieved for different molecules by analyzing the shape of the corresponding ROC curves and the AUC values.

The shape of the ROC curve provides a way to select the optimal classification threshold, \(\mathbb {P}_{*}\), for the problem under study. For instance, \(\mathbb {P}_{*}\) can be chosen in a trade-off process that maximizes the TPR while keeping the FPR at an acceptable low value.
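One possible way to formalize this trade-off, given here only as an illustrative sketch and not as the selection rule adopted in this work, is to choose the threshold with the highest TPR among those whose FPR does not exceed a ceiling. The ceiling `max_fpr`, the function names, and the example forecasts are hypothetical.

```python
def rates(labels, probs, threshold):
    """Return (FPR, TPR) when samples with P >= threshold are positives."""
    tp = fp = tn = fn = 0
    for y, p in zip(labels, probs):
        predicted_positive = p >= threshold
        if y == 1:
            tp += predicted_positive
            fn += not predicted_positive
        else:
            fp += predicted_positive
            tn += not predicted_positive
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("both classes must be present")
    return fp / (fp + tn), tp / (tp + fn)


def pick_threshold(labels, probs, max_fpr=0.1):
    """Highest-TPR threshold with FPR <= max_fpr, or None if none qualifies."""
    best = None
    for threshold in sorted(set(probs), reverse=True):
        fpr, tpr = rates(labels, probs, threshold)
        if fpr <= max_fpr and (best is None or tpr > best[1]):
            best = (threshold, tpr, fpr)
    return best


# Hypothetical forecasts, for illustration only.
labels = [1, 0, 1, 0]
probs = [0.9, 0.8, 0.7, 0.6]
print(pick_threshold(labels, probs, max_fpr=0.5))
```

Raising `max_fpr` admits lower thresholds and therefore more candidate targets, at the cost of more false alarms.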

This choice can aid the selection of Tier 1 targets for re-observation in a higher Tier: a large FPR would result in a poor allocation of observing time while a low TPR would result in a reduction of observational opportunities. It can also benefit population studies where one might need to track the presence of certain molecules across families of planets and extrasolar systems. These types of studies are outside the scope of this work, but can profit from the methodology developed here.

## 3 Results

As detailed in Section 2.1, we designed a method based on the *P*-statistic to reveal the presence of a molecule in Tier 1 spectra. In the following sections, we use the statistical tools described in Section 2.5 to show the performance of the *P*-statistic in predicting the presence of several molecules in our simulated planetary population. In particular, in Section 3.1, we use calibration curves to assess the reliability of the predictions of the method and related biases, while in Section 3.2, we use ROC curves to assess the predictive power of the method and discuss the optimal classification threshold, \(\mathbb {P}_{*}\). In Section 3.3, we use the median abundance as an estimator of the true abundance and investigate its biases in the low S/N regime to explain the biases observed in the calibration curves.

### 3.1 Detection reliability

#### 3.1.1 Retrieval \(\textrm{R}_0\)

Figure 4 shows the analysis performed to evaluate the reliability of the method when using the abundance posteriors of the retrieval \(\textrm{R}_0\), which uses the same atmospheric composition as the one used in the generation of the simulated atmospheres (see Table 1). The subplots in each column share the same horizontal axis with the predicted probability *P* that a molecule is present with an input abundance, \(Ab_{mol}\), above the selected abundance threshold \(\mathbb {T}_{Ab} = 10^{-5}\) (see Section 2.4). The figure reports the results for CH\(_4\), H\(_2\)O, and CO\(_2\), shown from left to right, respectively.

The top row displays histograms of the *P*-statistic realizations, which exhibit a bimodal distribution. Two peaks are observed in the distribution, with one located at \(P \approx 0.2\) and the other at \(P \approx 0.8\), with the former being more prominent. Additionally, a valley is observed at intermediate values, with \(P \approx 0.5\).

The middle row shows the correlation between the predicted probabilities on the horizontal axis and the input abundances of each molecule on the vertical axis. We take a rough measure of the correlation by calculating the slope of a linear fit to the data points. These slopes are listed in Table 6. The lower right quadrant of these diagrams (\(P \gtrsim 0.5\) and \(Ab_{mol} < 10^{-5}\)) is almost empty of data points, indicating that whenever the method predicts a high *P*, the corresponding input abundance is likely higher than \(\mathbb {T}_{Ab}\). However, not all planets with an input abundance greater than \(\mathbb {T}_{Ab}\) are associated with a high *P*, as the upper left quadrants of these diagrams (\(P\lesssim 0.5\) and \(Ab_{mol} > 10^{-5}\)) are not empty of data points.

The bottom row shows the calibration curves computed for each molecule; each curve is shown with a bootstrap confidence interval calculated using 1000 bootstrap samples. That is, following [72], we randomly remove \(\sim 1/e \approx 37\%\) of the data from each of these samples and replace them by repeating some randomly chosen instances of the ones kept. For each molecule, we calculate the B-S using the \(\texttt {brier\_score\_loss}\) function of \(\texttt {sklearn.metrics}\) [67], with the associated uncertainty estimated from the same bootstrap samples. Table 6 lists the B-S values obtained.
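The bootstrap procedure can be sketched as follows for the B-S. Drawing as many instances as the data set contains, with replacement, leaves out a fraction of about \(1/e\) of the distinct instances on average and repeats some of the others, which matches the description above. The 95% percentile interval, the function names, and the example data are assumptions made for illustration; the text does not specify the interval level.

```python
import random


def brier_score(labels, probs):
    """Mean squared difference between forecasts and 0/1 class labels."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


def bootstrap_brier(labels, probs, n_boot=1000, seed=0):
    """Return (B-S, lower, upper) with a 95% percentile bootstrap interval."""
    rng = random.Random(seed)
    n = len(labels)
    scores = []
    for _ in range(n_boot):
        # Resample indices with replacement to build one bootstrap sample.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(
            brier_score([labels[i] for i in idx], [probs[i] for i in idx])
        )
    scores.sort()
    lower = scores[int(0.025 * n_boot)]
    upper = scores[min(int(0.975 * n_boot), n_boot - 1)]
    return brier_score(labels, probs), lower, upper


# Hypothetical forecasts and labels, for illustration only.
labels = [0, 0, 1, 1] * 5
probs = [0.1, 0.3, 0.7, 0.9] * 5
score, lower, upper = bootstrap_brier(labels, probs)
```

The same resampled indices could be reused to recompute the calibration curve, giving a confidence band around each point.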

The calibration curves show an under-forecasting bias (curve to the left of the 1:1 line; see Section 2.5.1) especially associated with larger forecast probabilities, giving a fraction of positives \(\approx 1.0\) for \(P \gtrsim 0.6\). On the contrary, the probabilities are better calibrated for \(P \lesssim 0.4\). From the B-S values (less accurate forecasts receive higher B-S), we see that CH\(_4\) is the best-scoring molecule, probably due to its strong absorption spectral features.

It is possible that the observed under-forecasting of the calibration curves and the bimodality of the *P*-statistic distribution are both related to the sampling of the parameter space. This is briefly discussed further in Section 4.2.

#### 3.1.2 Retrieval \(\textrm{R}_1\)

Figure 5 shows the same analysis for the retrieval \(\textrm{R}_1\), which includes only CH\(_4\), CO\(_2\), and H\(_2\)O in the fit-composition and excludes NH\(_3\), although this molecule is present in the data set (see Table 1).

Comparing the histograms from the top row of this figure with those obtained for the retrieval \(\textrm{R}_0\) (Fig. 4), we notice a decrease in the forecast frequency at low *P*, especially for CH\(_4\) and H\(_2\)O, with a reduced peak at *P* around 0.2. On the contrary, high values of *P* are more frequent, enhancing the peak at *P* around 0.8: for CH\(_4\), more than \(30 \%\) of the data set receives *P* between 0.8 and 0.9. These are samples with high input abundance.

The plots in the middle row show increased scatter in the data points compared to \(\textrm{R}_0\). In this case, the correlation between *P* and the input abundances decreases; the slopes of the linear fits are reported in Table 6. Planets that receive \(P \gtrsim 0.8\) have high input abundance, \(Ab_{mol} > 10^{-5}\).

The calibration curves for H\(_2\)O and CH\(_4\) in the bottom row are, within the uncertainties, closer to the 1:1 line than for \(\textrm{R}_0\), both for high and low forecast probabilities. Although this might appear closer to the ideal behavior, it could be misleading. The B-S is higher than for \(\textrm{R}_0\), because the mean squared difference between the forecasts and true class labels is larger. This is visualized in the middle plots: for \(Ab_{mol} < 10^{-5}\) (negative true class label), there are many forecast values with \(P > 0.5\). In other words, the correlation between the *P*-statistic and the true input abundances is weaker. In contrast, the entire CO\(_2\) calibration curve shows the signature of under-forecasting. The curve for CO\(_2\) is almost the same as for \(\textrm{R}_0\), likely because the missing NH\(_3\) affects the CO\(_2\) abundance posteriors less. On the other hand, the overlap of NH\(_3\) with both H\(_2\)O and CH\(_4\) makes the model used in the retrieval less suitable to describe the data.

The reduced correlation between probability forecasts and input abundances, as well as the higher B-S values, suggest that excluding NH\(_3\), despite its presence in the data set, leads to less representative abundance posteriors. However, predictions for CO\(_2\) are less affected, possibly because this trace gas has less spectral overlap with NH\(_3\) compared to H\(_2\)O or CH\(_4\).

#### 3.1.3 Retrieval \(\textrm{R}_2\)

The results of the same analysis for the retrieval \(\textrm{R}_2\), which adds CO, HCN, and H\(_2\)S to the fit-composition (see Table 1), are very similar to those of \(\textrm{R}_0\) (see Section 3.1.1). Therefore, we refer the reader to Table 6, which summarizes the results for the correlation between predicted probabilities and input abundances, along with the B-S values, and to Fig. 13 in Section A of the Appendix.

### 3.2 Predictor assessment

#### 3.2.1 Retrieval \(\textrm{R}_0\)

Figure 6 shows the analysis performed to assess the predictive power of the *P*-statistic (ability to maximize TP and TN while minimizing FP and FN) when using the abundance posteriors from the retrieval \(\textrm{R}_0\). The figure reports the results for CH\(_4\), H\(_2\)O, and CO\(_2\), shown in different columns from left to right, respectively.

The upper row shows the calculated ROC curves for each molecule. Each curve is reported with a bootstrap confidence interval calculated using 1000 bootstrap samples, with the same random removal and replacement of the data as discussed in Section 3.1, involving \(1/e \approx 37\%\) of the data. For each molecule, we calculate the AUC using the \(\texttt {roc\_auc\_score}\) method of \(\texttt {sklearn.metrics}\) [67], with the associated uncertainty estimated from the same bootstrap samples. The AUC values thus obtained are collected in Table 7. For all molecules, the ROC curves are close to ideal behavior (curve near the unit step function, see Section 2.5.2), showcasing that the *P*-statistic has significant predictive power. Consequently, the corresponding AUC values are \(> 0.9\), with no considerable variation between molecules, implying similar predictive power.
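As an illustration of what the AUC measures, the sketch below computes it as the probability that a randomly chosen positive receives a larger forecast than a randomly chosen negative, with ties counted as one half. This pairwise form equals the area under the ROC curve. The helper `roc_point` returns one (FPR, TPR) pair, assuming a planet is classified as positive when \(P > \mathbb {P}\). Both function names are hypothetical, and the quadratic-time loop is meant for clarity rather than speed.

```python
def auc_pairwise(labels, forecasts):
    """AUC as the probability that a positive outranks a negative."""
    pos = [p for y, p in zip(labels, forecasts) if y == 1]
    neg = [p for y, p in zip(labels, forecasts) if y == 0]
    # Each (positive, negative) pair scores 1 for a win and 0.5 for a tie.
    wins = sum((pp > pn) + 0.5 * (pp == pn) for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))


def roc_point(labels, forecasts, threshold):
    """Return (FPR, TPR) when forecasts above the threshold are positives."""
    tp = sum(1 for y, p in zip(labels, forecasts) if y == 1 and p > threshold)
    fp = sum(1 for y, p in zip(labels, forecasts) if y == 0 and p > threshold)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return fp / n_neg, tp / n_pos
```

An AUC of 0.5 corresponds to a forecast with no predictive power, while a value of 1 corresponds to the unit step function mentioned above.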

For each molecule, the bottom row shows the number of TP, TN, FP, and FN (see Table 5), used to construct the ROC, versus the probability threshold \(\mathbb {P}\). Also shown are the associated confidence intervals estimated from the same bootstrap samples. These diagrams provide information on how the predictive power of the method changes as \(\mathbb {P}\) varies from 1 to 0 and aid in the selection of the optimal classification threshold \(\mathbb {P}_{*}\) (see Section 2.6).

Given the randomization of trace gas abundances in the forward model (\(10^{-7}\) to \(10^{-2}\) on a uniform logarithmic scale, see Table 2), and the selected abundance threshold (\(\mathbb {T}_{Ab} = 10^{-5}\)), the data set contains \(\sim 60 \%\) positive observations and \(\sim 40 \%\) negative observations. By definition, for \(\mathbb {P} = 1\), the number of positive forecasts, \(\mathrm {N}_{\mathrm{P}} = \mathrm{TP} + \mathrm{FP}\), is zero, and the number of negative forecasts, \(\mathrm {N}_{\mathrm{N}} = \mathrm {TN} + \mathrm {FN}\), is equal to the size of the data set. Therefore, at this probability threshold, \(\textrm{TN} \simeq 40 \%\) and \(\textrm{FN} \simeq 60 \%\). As \(\mathbb {P}\) decreases, \(\mathrm {N}_{\mathrm{P}}\) increases (TP and FP increase), while \(\mathrm {N}_{\mathrm{N}}\) decreases (TN and FN decrease). For \(\mathbb {P} = 0\), \(\mathrm {N}_{\mathrm{N}}\) is zero and \(\mathrm {N}_{\mathrm{P}}\) is equal to the data set size; at this classification threshold, \(\textrm{TP} \simeq 60 \%\) and \(\textrm{FP} \simeq 40 \%\).
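The quoted class balance follows directly from the log-uniform sampling. Writing \(f_{+}\) for the expected fraction of positive observations (a symbol introduced here only for this check), the fraction of the sampled range that lies above the threshold is

\[
f_{+} = \frac{\log_{10}(10^{-2}) - \log_{10}(\mathbb {T}_{Ab})}{\log_{10}(10^{-2}) - \log_{10}(10^{-7})} = \frac{-2 - (-5)}{-2 - (-7)} = \frac{3}{5} = 60\%,
\]

so the remaining \(40\%\) of the observations are expected to be negative.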

In those cases where there are no external constraints on which misclassification is more bearable (FP or FN), the intersection of their curves gives an optimized classification threshold \(\mathbb {P}_{*}\).

From this intersection, we obtain \(\mathbb {P}_{*} \approx 0.3\) for all molecules. For confirmation, we can trace this \(\mathbb {P}_{*}\) on the ROC curves. As expected, it roughly corresponds to the point where we cannot significantly increase TPR without increasing FPR, which is at TPR \( \approx 0.8\). If, instead, we need a more conservative number of FP, we can choose a higher \(\mathbb {P}_{*}\), for example \(\mathbb {P}_{*} = 0.5\), the default classification threshold for a binary classifier.
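The threshold selection described above can be sketched as a scan over \(\mathbb {P}\). The example assumes that a planet is classified as positive when \(P > \mathbb {P}\) and that the crossing is located on a regular grid of thresholds. The function names and the toy data in the usage note are hypothetical.

```python
def confusion_fractions(labels, forecasts, threshold):
    """Fractions of TP, FP, TN, and FN in the data set at one threshold."""
    n = len(labels)
    tp = fp = tn = fn = 0
    for y, p in zip(labels, forecasts):
        if p > threshold:          # positive forecast
            if y == 1:
                tp += 1
            else:
                fp += 1
        else:                      # negative forecast
            if y == 1:
                fn += 1
            else:
                tn += 1
    return {"TP": tp / n, "FP": fp / n, "TN": tn / n, "FN": fn / n}


def crossing_threshold(labels, forecasts, n_steps=101):
    """First grid threshold at which |FP - FN| is smallest."""
    best_gap, best_t = None, None
    for k in range(n_steps):
        t = k / (n_steps - 1)
        c = confusion_fractions(labels, forecasts, t)
        gap = abs(c["FP"] - c["FN"])
        if best_gap is None or gap < best_gap:
            best_gap, best_t = gap, t
    return best_t
```

For a toy data set with 60% positives, `confusion_fractions` reproduces the limiting cases discussed above: all forecasts are negative at a threshold of 1 and all are positive at a threshold of 0.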

A concise way to demonstrate the effectiveness of the *P*-statistic in rejecting misclassifications is by computing the odds TP:FP and TN:FN, estimated from the curves in the bottom row of Fig. 6. Odds relate to the probability that a molecule is correctly identified at the selected \(\mathbb {P}\), with an example shown in Table 7, estimated at \(\mathbb {P}_{*} = 0.5\). The table shows that the *P*-statistic is quite effective in rejecting FP, as they are negligible for all molecules at this threshold. Moreover, TPR at \(\mathbb {P}_{*} = 0.5\) indicates that more than 60% of the positives in the dataset are correctly identified, with TP values of approximately 45%, 35%, and 45% for CH\(_4\), H\(_2\)O, and CO\(_2\), respectively (rounded to the nearest 5% from the odds values listed in the table). However, at this \(\mathbb {P}\), FN increases to approximately 15-25% of the dataset (as seen in the bottom row of Fig. 6 at \(\mathbb {P}_{*} = 0.5\)), resulting in TN:FN odds of less than 3:1.

#### 3.2.2 Retrieval \(\textrm{R}_1\)

Figure 7 shows the same analysis for the retrieval \(\textrm{R}_1\).

Comparing the ROC curves in the top row with those obtained for the retrieval \(\textrm{R}_0\) (see Section 3.2.1), we notice a decrease in the predictive power of the method, measured by a reduction in AUC for CH\(_4\) and H\(_2\)O, as reported in Table 7. On the contrary, the CO\(_2\) ROC achieves the highest AUC, similar to that of \(\textrm{R}_0\), possibly caused by the limited overlap between NH\(_3\) and CO\(_2\), when compared to the case of CH\(_4\) and H\(_2\)O.

The plots in the bottom row show a significant reduction in the performance of the FP curve compared to that achieved for \(\textrm{R}_0\): for CH\(_4\) and H\(_2\)O, it is above \(10 \%\) up to \(\mathbb {P} \simeq 0.6\), instead of \(<1 \%\) at \(\mathbb {P} \simeq 0.5\). The TN curve also shows a decrease in performance: it remains below \(30 \%\) up to \(\mathbb {P} \simeq 0.6\), instead of reaching \(40 \%\) at \(\mathbb {P} \simeq 0.4\) in \(\textrm{R}_0\). Although the TP and FN curves demonstrate relatively better performance, the optimal classification threshold \(\mathbb {P}_{*}\), determined at the intersection of the FP and FN curves, increases to \(\mathbb {P}_{*} \sim 0.65, 0.5, 0.4\) for CH\(_4\), H\(_2\)O, and CO\(_2\), respectively. Tracing these \(\mathbb {P}_{*}\) values on the ROC curves reveals that they correspond to a TPR of approximately 0.8 for all molecules, similar to \(\textrm{R}_0\), but with a significantly worse FPR, as a consequence of the reduced predictive power.

Table 7 reflects this, showing the odds of TP:FP and TN:FN at the same probability threshold \(\mathbb {P}_{*} = 0.5\), which was used for \(\textrm{R}_0\). In this case, the method is less efficient in rejecting FP, despite having TP of approximately 50% and 45% for CH\(_4\) and H\(_2\)O, respectively, resulting in only about 3:1 odds for TP:FP. However, the method is still effective in correctly identifying planets with CO\(_2\), with TP:FP odds of about 9:1. As for TN:FN, the results are similar to \(\textrm{R}_0\), with a slightly better rejection of FN in the case of CH\(_4\) (4:1 instead of 3:1).

#### 3.2.3 Retrieval \(\textrm{R}_2\)

The results from the same analysis for the retrieval \(\textrm{R}_2\) are very similar to \(\textrm{R}_0\)’s (see Section 3.2.1). Therefore, we refer the reader to Table 7 that summarizes the AUC values obtained and the odds TP:FP and TN:FN at the probability threshold \(\mathbb {P}_{*} = 0.5\), and to Fig. 14 in Section A of the Appendix.

### 3.3 Abundance estimates

Tier 1 might not be adequate for reliable abundance retrieval, for which higher *Ariel* Tiers are better suited. Therefore, we study the retrieved Tier 1 abundances to investigate trends in their distribution that may clarify some of the behavior observed in the calibration and ROC curves seen in the previous sections. The abundance estimator used is obtained from the median of the marginalized posterior distribution of the \(\log Ab_{mol}\) with asymmetric error bars estimated from the \(68.3 \%\) confidence level around the median. In particular, we are interested in investigating the regime of input abundances under which this median-based estimator is unbiased.
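The median-based estimator can be sketched as follows, assuming equally weighted posterior samples of \(\log Ab_{mol}\) and an equal-tailed \(68.3 \%\) interval (that is, the 15.85th and 84.15th percentiles). Weighted samples, such as those produced by nested sampling, would need a weighted quantile instead. The function names are hypothetical.

```python
def quantile(sorted_samples, q):
    """Linearly interpolated quantile of an already sorted list."""
    pos = q * (len(sorted_samples) - 1)
    i = int(pos)
    j = min(i + 1, len(sorted_samples) - 1)
    frac = pos - i
    return sorted_samples[i] + frac * (sorted_samples[j] - sorted_samples[i])


def median_with_errors(samples, level=0.683):
    """Return (median, lower error, upper error) of the posterior samples."""
    s = sorted(samples)
    median = quantile(s, 0.5)
    lower = quantile(s, 0.5 - level / 2.0)
    upper = quantile(s, 0.5 + level / 2.0)
    # Asymmetric error bars measured from the median.
    return median, median - lower, upper - median
```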

#### 3.3.1 Retrieval \(\textrm{R}_0\)

Figure 8 reports the analysis performed to investigate potential biases affecting the median of the marginalized posteriors when used as an estimator of the log-abundances. The figure reports the results for CH\(_4\), H\(_2\)O, and CO\(_2\), shown in different columns from left to right, respectively. NH\(_3\) exhibits similar behavior to the other three molecules, but it is not included in the figure in line with the decision to treat it as a nuisance in this study.

Panels in the top row show the input molecular log-abundance vs. the retrieved value with its error bar. A solid black line serves as the ideal trend (1:1 line) for visual reference. The color bar indicates the distances between the input and retrieved log-abundance, expressed in units of the uncertainty \(\sigma \) on \(\log Ab_{mol}\), estimated by averaging the asymmetric error bars. Blue colors denote distances up to \(1 \sigma \); red colors represent distances in the range of \(1 \rightarrow 2 \sigma \). Larger distances are marked with black circles, which serve to diagnose potential trends and biases that may affect the retrieval results. In addition, the symbol size reflects the signal-to-noise ratio (S/N) of each observation as estimated in the AIRS-CH0 spectroscopic channel, providing insight into possible trends between the distance to the input abundance and the S/N condition.

The retrieved abundances exhibit good agreement with the input abundances in the large abundance regime, characterized by limited scatter around the ideal trend and by low retrieved uncertainties. This regime is generally observed for \(Ab_{mol} \gtrsim 10^{-4}\), but starts to break down at \(10^{-5} \lesssim Ab_{mol} \lesssim 10^{-4}\). For \(Ab_{mol} \lesssim 10^{-5}\), the input abundances are rarely retrieved accurately. This analysis can provide insights into the detection limits of CH\(_4\), H\(_2\)O, and CO\(_2\) in *Ariel* Tier 1, which are estimated to be around \(10^{-4}\). These values can be compared with the expected detection limits of the same molecules in *Ariel* Tier 2, which are anticipated to be significantly lower, with previous studies [44] reporting limits between \(10^{-7}\) and \(10^{-6.5}\).

Let the log-abundance S/N be defined as \(\frac{1}{\sigma }\mid \log {Ab_{mol}}\mid \), where \(Ab_{mol}\) is the true value of the molecular abundance. The middle row panels in Fig. 8 show the plot of log-abundance S/N vs. the difference between the retrieved and input log abundances. It can be observed that the distribution of data points is broadly separated into two sub-populations at a S/N of about 5. Data points with high S/N correspond to cases where the input is confidently retrieved and aligned along the 1:1 line in the upper row diagrams, indicating unbiased estimation. On the other hand, data points with low S/N cluster in the bottom left portion of the diagram. In these cases, the median is no longer an unbiased estimator of the true value, as the corresponding data points lie to the left of the 1:1 line in the upper row diagrams. As discussed further in Section 4.2, these cases have posteriors dominated by the prior imposed in the retrieval and are best treated as upper limits.
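For concreteness, the log-abundance S/N can be evaluated as in the short sketch below. It assumes base-10 logarithms and takes \(\sigma \) as the average of the two asymmetric error bars on \(\log Ab_{mol}\), as done for the color scale of the top row. The function name is hypothetical.

```python
import math


def log_abundance_snr(true_abundance, err_minus, err_plus):
    """|log10(Ab)| divided by the mean of the asymmetric error bars (in dex)."""
    sigma = 0.5 * (err_minus + err_plus)
    return abs(math.log10(true_abundance)) / sigma
```

With this definition, an input abundance of \(10^{-5}\) retrieved with error bars of 1 dex sits at an S/N of 5, which is the approximate boundary between the two sub-populations.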

In the bottom row of Fig. 8, the true abundances are shown vs. the difference between the retrieved and true abundances, in units of \(\sigma \). The diagrams provide a visualization of how many samples are 2-, 3-, and 5-\(\sigma \) outliers, allowing verification that the distribution is compatible with the tail of the abundance posteriors. The number of outliers is shown in the text box inserted in the diagrams and (converted into percentages) in Table 8. Assuming that the abundance posteriors are representative of the data, the expected fractions of outliers are \(5\%\), \(0.3\%\), and \(\ll 1 \%\) at 2-, 3-, and 5-\(\sigma \), respectively. We find good agreement between the percentages reported in Table 8 and these values, with minor deviations compatible with the statistical fluctuations of a random variable.
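The expected outlier fractions quoted above are the two-sided tail probabilities of a Gaussian distribution, which can be checked with the standard library. This is only a sanity check under the Gaussian assumption; the function names are hypothetical.

```python
import math


def two_sided_tail(k):
    """Probability that a Gaussian variable lies more than k sigma from its mean."""
    return math.erfc(k / math.sqrt(2.0))


def expected_outliers(n_samples, k):
    """Expected number of k-sigma outliers among n_samples Gaussian draws."""
    return n_samples * two_sided_tail(k)
```

The tail probabilities are about 4.55% at 2 \(\sigma \), 0.27% at 3 \(\sigma \), and \(6 \times 10^{-5}\%\) at 5 \(\sigma \), consistent with the rounded values in the text.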

#### 3.3.2 Retrieval \(\textrm{R}_1\)

Figure 9 shows the same analysis for the retrieval \(\textrm{R}_1\).

The top row shows that, although there is still a correlation between the retrieved and input abundances, it is less significant than for \(\textrm{R}_0\). Furthermore, comparing the retrieved and input abundances yields different regimes for each molecule. However, the main difference from \(\textrm{R}_0\) is the significant number of data points at distances greater than \(2 \sigma \) (marked by black circles), corresponding to 2-\(\sigma \) outliers. In particular, for all molecules, most of these points are located to the right of the ideal trend, indicating the presence of an overestimation bias for the retrieved abundances. These data points are located in the region y \(\gtrsim 5\) and x > 0 in the plots in the middle row. Therefore, in addition to the overestimation bias for the abundances, their retrieved uncertainties are underestimated. Furthermore, the bottom-row diagrams show a larger number of outliers compared to the \(\textrm{R}_0\) case: too many for the posterior to be considered representative. This is a consequence of an atmospheric model which is not representative of the data, biasing the likelihood, the abundance posteriors, and the median estimator of the abundances.

#### 3.3.3 Retrieval \(\textrm{R}_2\)

The results of the same analysis for the retrieval \(\textrm{R}_2\) are very similar to those of \(\textrm{R}_0\), including the number of outliers that are compatible with the expectations for a model that is representative of the data. Therefore, we refer the reader to Table 8, and to Fig. 15 in Section A of the Appendix. Here, we only stress that adding molecules to the fit-composition that are not present in the data set does not appear to significantly bias the abundance posteriors, compared to \(\textrm{R}_0\). This is further discussed in Section 4.2.

## 4 Discussion

In this section, we first discuss the similarities between the results from the retrievals \(\textrm{R}_0\) and \(\textrm{R}_2\), shown in Sections 3.1 and 3.2. Then we apply the ADI metric to compare all retrievals from the point of view of the Bayesian evidence (Section 4.1). Finally, we expand the discussion to the role of the priors in the retrieved abundance posteriors (Section 4.2).

The results of Sections 3.1 and 3.2 show that the predictions of the *P*-statistic for the retrievals \(\textrm{R}_0\) and \(\textrm{R}_2\) are comparable, despite the quite different fit-compositions, while the reliability of the *P*-statistic is lower in the \(\textrm{R}_1\) case. The \(\textrm{R}_0\) model and its parameters are identical to those used to generate the POP-Is population, and \(\textrm{R}_2\) extends the parameter space with new molecules. In \(\textrm{R}_2\), the abundance posteriors for CH\(_4\), H\(_2\)O, and CO\(_2\) do not appear to be significantly affected by the addition of CO, HCN, and H\(_2\)S, even though the spectral signatures of the latter three partially overlap with those of CH\(_4\), H\(_2\)O, and CO\(_2\) [31]. It should be noted that the absence of the three molecules from the simulated atmospheres is correctly revealed in \(\textrm{R}_2\) by their low *P*-statistic, shown in Fig. 10, which takes values smaller than \(40 \%\) for CO, HCN, and H\(_2\)S. The extension of the analysis to include the calibration and ROC curves for these molecules is left to future work.

The analysis, therefore, suggests that the *P*-statistic is robust (that is, it provides reliable results) against retrieval models that are over-representative of the observed atmosphere. However, the *P*-statistic can no longer be considered robust when the retrieval models are under-representative of the observed atmosphere.

In the current study, the threshold abundance used to estimate the *P*-statistic remains constant for all molecules. While it is possible to optimize this threshold for individual molecules, we leave this aspect for future research as discussed in Section 2.4. Lowering the threshold reduces the information provided by the ROC curves. To achieve the optimal point of operation, one must balance the True and False Positive Rates, which is necessary to promote a Tier-1 target to higher Tiers. It is important to note that ROC curves calculated at different threshold levels provide a statistical estimation of the sample’s completeness, enabling the inference of population-wide properties such as the fraction of planets containing certain molecules. While this aspect requires further investigation in future research, it should be noted that the fraction of positives, \(\Sigma \) (planets with true abundance in excess of \(\mathbb {T}_{Ab}\)), is related to the fraction of Tier-1 targets, \(\tilde{\Sigma }\), selected with \(P(>\mathbb {T}_{Ab}) > \mathbb {P}\) by

\[\tilde{\Sigma } = \mathrm{TPR} \, \Sigma + \mathrm{FPR} \, (1 - \Sigma ),\]

where TPR and FPR are evaluated at the probability threshold \(\mathbb {P}\).

The similarities between the \(\textrm{R}_0\) and \(\textrm{R}_2\) models are further discussed in the next section.
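As a hedged illustration of how a population-wide fraction could be recovered from a Tier-1 selection, note that the selected targets are the sum of true and false positives, so that \(\tilde{\Sigma } = \mathrm{TPR} \, \Sigma + \mathrm{FPR} \, (1 - \Sigma )\) follows from the definitions of the two rates. The sketch below inverts this identity for \(\Sigma \). It assumes that TPR and FPR are known at the chosen threshold, for example from a simulated population, and the function names and numerical values are hypothetical.

```python
def selected_fraction(sigma_true, tpr, fpr):
    """Fraction of targets selected, given the true fraction of positives."""
    return tpr * sigma_true + fpr * (1.0 - sigma_true)


def true_fraction(sigma_selected, tpr, fpr):
    """Invert the relation above to estimate the true fraction of positives."""
    if tpr == fpr:
        # The selection carries no information when the two rates coincide.
        raise ValueError("TPR and FPR must differ to invert the relation")
    return (sigma_selected - fpr) / (tpr - fpr)
```

For instance, with a TPR of 0.8 and an FPR of 0.1, a population with 60% true positives yields a selected fraction of 52%, and the inversion returns the original 60%.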

### 4.1 ADI comparison

The ADI metric, described in Section 2.3, is used to assess the statistical significance of a model atmosphere with respect to a featureless spectrum using the log-Bayesian factor. A large ADI suggests that a featureless spectrum is less favored by the data. From the ADI definition, the log-Bayesian factor of two competing models is the difference between their respective ADI.
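The last statement can be made explicit under the assumption that, for a model \(i\) favored over the featureless model, the ADI is the difference of log-evidences, \(\mathrm{ADI}_i = \log E_i - \log E_F\). Here \(E_i\) and \(E_F\) are notation introduced only for this illustration, denoting the Bayesian evidence of model \(i\) and of the featureless model, respectively; the definition adopted in this work is the one given in Section 2.3. Then

\[
\mathrm{ADI}_0 - \mathrm{ADI}_1 = \left( \log E_0 - \log E_F \right) - \left( \log E_1 - \log E_F \right) = \log \frac{E_0}{E_1},
\]

which is the log-Bayesian factor between the two competing models, since the featureless evidence cancels.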

Figure 11 shows the ADI differences between the \(\textrm{R}_0\) model and the two competing models, \(\textrm{R}_1\) and \(\textrm{R}_2\), plotted against NH\(_3\) abundances. A large, positive difference indicates that the competing models are less representative of the data compared to \(\textrm{R}_0\). The median ADI values are approximately 91, 86, and 92 for \(\textrm{R}_0\), \(\textrm{R}_1\), and \(\textrm{R}_2\), respectively, as shown in the text box within Fig. 11. This suggests that a featureless atmospheric model is not favored by the data, and \(\textrm{R}_1\) is the least representative, as expected. This is further supported by the fact that the ADI difference between \(\textrm{R}_0\) and \(\textrm{R}_1\) increases with increasing NH\(_3\) abundance, indicating that higher NH\(_3\) abundances make \(\textrm{R}_1\) less representative compared to \(\textrm{R}_0\), in agreement with the analysis of Section 3. In contrast, the ADI difference between \(\textrm{R}_0\) and \(\textrm{R}_2\) is close to zero, with a scatter described by a standard deviation of approximately 0.5, which is independent of NH\(_3\) abundance. This confirms that \(\textrm{R}_2\) is as representative of the data as \(\textrm{R}_0\), despite describing a wider parameter space.

### 4.2 Priors

In this section, we discuss the impact of the log-uniform priors adopted in the analysis on the results presented. The consequence is a non-Gaussian posterior distribution, for which the mean, mode, and median no longer coincide. In particular, the median is not an unbiased estimator of the true abundance as shown in Fig. 8 for low log-abundance S/N (hereafter, “abundance S/N”). This can be explained in terms of the Bayesian formulation of the posterior, \(\mathcal {P}\), which is proportional to the product of the likelihood, \(\mathcal {L}\), and the prior, \(\Pi \).

Because \(\Pi (\log x)\) is uniform, \(\Pi (x) \propto 1/x\). For large abundance S/N, the likelihood dominates, the posterior is Gaussian (because of the central limit theorem), and the median estimator is unbiased. For low abundance S/N, the prior dominates, \(\mathcal {P}(x) \propto 1/x\), and the median is an estimator of the molecular abundance that is biased towards low abundances. This is shown in Fig. 12. Each panel shows the probability density function (PDF) of the likelihood, prior, and posterior, normalized to 1 at the peak, for three cases where the abundance S/N is 4.0, 5.5, and 7.0, respectively, from the top to the bottom panel, assuming an input abundance of \(10^{-5}\). The posterior is likelihood-dominated when the abundance S/N is 7 and is prior-dominated when the abundance S/N is 4.
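The mechanism can be reproduced with a toy model, which is not the construction used for Fig. 12. The sketch assumes a Gaussian likelihood in the linear abundance \(x\), centered on the input value with a hypothetical width `sigma`, and a prior uniform in \(\log_{10} x\) between hypothetical bounds of \(10^{-12}\) and \(10^{-1}\). Because the prior is flat in \(\log_{10} x\), the posterior density in \(\log_{10} x\) is simply proportional to the likelihood evaluated on a regular logarithmic grid.

```python
import math


def posterior_median_log10(x_true, sigma, log_min=-12.0, log_max=-1.0, n=20001):
    """Median of log10(x) for a Gaussian likelihood in x and a log-uniform prior."""
    du = (log_max - log_min) / (n - 1)
    grid = [log_min + i * du for i in range(n)]
    # Posterior weight on the regular log10(x) grid: likelihood at x = 10**u.
    weights = [math.exp(-0.5 * ((10.0 ** u - x_true) / sigma) ** 2) for u in grid]
    half = 0.5 * sum(weights)
    cumulative = 0.0
    for u, w in zip(grid, weights):
        cumulative += w
        if cumulative >= half:
            return u
    return grid[-1]
```

With an input abundance of \(10^{-5}\), a narrow likelihood (`sigma = 1e-6`) returns a median close to \(-5\), whereas a broad one (`sigma = 1e-5`) returns a median several dex lower, whose exact value depends on the assumed lower bound of the prior.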

Although logarithmic uniform priors are often assumed in spectral retrieval studies, they are certainly not “uninformative priors” [73, 74]. Clearly, using these priors biases the median estimator of the molecular abundance in the low S/N regime, explaining the trends seen in Fig. 8. As a side note, log-priors on molecular abundances could also introduce biases in the derived elemental abundances; the issue therefore has to be investigated carefully in future studies.

The low abundance S/N targets are those that contribute to the leftmost peak in the bimodal distribution of the *P*-statistic (Fig. 4). Further investigation is however needed to fully understand the origin of the *P*-statistic bimodality and its under-forecasting properties.

## 5 Conclusion

The *Ariel* Tier 1 is a shallow reconnaissance survey of a large and diverse sample of approximately 1000 exoplanet atmospheres. It is designed to achieve a signal-to-noise ratio (S/N) greater than 7 when the target exoplanet atmospheric spectra are binned into 7 photometric bands. Tier 1 enables rapid and broad characterization of planets to prioritize re-observations in higher Tiers for detailed chemical and physical characterization. However, Tier 1 may not have sufficient S/N at the spectral resolution required for high-confidence abundance retrieval of chemical species. Nonetheless, it contains a wealth of spectral information that can be extracted to address questions requiring population studies.

In this study, we have introduced a *P*-statistic, which is a function of the data that is sensitive enough to reveal the presence of molecules from transit spectroscopy observations of exoplanet atmospheres and can be used as a binary classifier. The *P*-statistic is estimated from the marginalized retrieval posterior distribution and provides an estimate of the probability that a molecule is present with an abundance exceeding a threshold, which is fixed at \(\mathbb {T}_{Ab} = 10^{-5}\) in this study but can be optimized in future analyses.
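A minimal sketch of such an estimate is given below, assuming that the *P*-statistic is the posterior probability mass above the threshold and that the marginalized posterior is available as samples of \(\log _{10} Ab_{mol}\) with optional weights. The function name and the sample-based estimator are illustrative assumptions; the definition adopted in this work is the one given in Section 2.4.

```python
def p_statistic(log_abundance_samples, weights=None, log_threshold=-5.0):
    """Weighted fraction of posterior samples above the abundance threshold."""
    if weights is None:
        weights = [1.0] * len(log_abundance_samples)
    total = sum(weights)
    above = sum(w for x, w in zip(log_abundance_samples, weights)
                if x > log_threshold)
    return above / total
```

A planet would then be classified as bearing the molecule when the returned value exceeds the chosen probability threshold \(\mathbb {P}\).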

We have tested the performance of the *P*-statistic on a simulated population of gaseous exoplanets, POP-Is, with traces of H\(_2\)O, CH\(_4\), and CO\(_2\) of randomized abundances, in a H\(_2\)-He dominated atmosphere. NH\(_3\) is also included as a disturbance parameter to test the robustness of the *P*-statistic. For this, three models are used in the retrievals: R\(_0\), which is representative of the data; R\(_1\), which is under-representative as it excludes NH\(_3\); and R\(_2\), which is over-representative as it includes additional molecules not considered in the simulated POP-Is.

We find that the *P*-statistic estimated from R\(_0\) posteriors shows a clear, above-noise correlation with the input abundances, allowing us to infer the presence of molecules. The *P*-statistic appears to follow a bimodal distribution, where targets with low abundance S/N are likely contributors to the peak at low *P* values. This is supported by the distribution of the median of the abundance posterior, which is an unbiased estimator of the true value only when the abundance S/N is sufficiently large (typically above 5). The *P*-statistic is affected by an under-forecasting bias, but this is not expected to adversely affect the classification of the planets in the survey as it can be calibrated in principle. This is further evidenced by ROC curves with large AUC, indicating that the *P*-statistic can be used to implement a reliable classifier for the presence of molecules. However, further investigation is needed to fully understand the origin of the *P*-statistic bimodality and its under-forecasting properties.

The results obtained appear not to be affected by the increase in complexity of the assumed atmospheric model, implemented in this study with the R\(_2\) retrieval model, as indicated by similar calibration and ROC curves. Conversely, the predictive power of the *P*-statistic is adversely affected by an under-representative model, as implemented in the R\(_1\) retrieval model. This is evident from a weaker correlation between the *P*-statistic and the input abundances, and from the median of the posterior abundance no longer being a reliable unbiased estimator of the true value, even in the high abundance S/N regime.

Based on our findings, we conclude that the *P*-statistic is a reliable predictor of the presence of molecules within the parameter space explored, as long as the retrieval model matches the complexity of the data. Models that are under-representative can result in poor predictive power, while the investigated over-representative model does not seem to adversely affect classification. Further investigations are needed to test the robustness of the *P*-statistic over a wider parameter space, particularly including a wider set of molecules in both the simulated population and retrievals.

### Software

ArielRad [46], TauREx 3 [45], Alfnoor [42, 44], Astropy [75], h5py [76], Matplotlib [77], Numpy [78].

## Notes

v3.11, Release April 2018

In Python, the package scikit-learn [67] (v1.0) provides the method calibration_curve in sklearn.calibration and the method roc_curve in sklearn.metrics.

## References

Borucki, W.J., Koch, D.G., Basri, G., Batalha, N., Brown, T.M., Bryson, S.T., Caldwell, D., Christensen-Dalsgaard, J., Cochran, W.D., DeVore, E., Dunham, E.W., Gautier, I. Thomas N., Geary, J.C., Gilliland, R., Gould, A., Howell, S.B., Jenkins, J.M., Latham, D.W., Lissauer, J.J., Marcy, G.W., Rowe, J., Sasselov, D., Boss, A., Charbonneau, D., Ciardi, D., Doyle, L., Dupree, A.K., Ford, E.B., Fortney, J., Holman, M.J., Seager, S., Steffen, J.H., Tarter, J., Welsh, W.F., Allen, C., Buchhave, L.A., Christiansen, J.L., Clarke, B.D., Das, S., Désert, J.-M., Endl, M., Fabrycky, D., Fressin, F., Haas, M., Horch, E., Howard, A., Isaacson, H., Kjeldsen, H., Kolodziejczak, J., Kulesa, C., Li, J., Lucas, P.W., Machalek, P., McCarthy, D., MacQueen, P., Meibom, S., Miquel, T., Prsa, A., Quinn, S.N., Quintana, E.V., Ragozzine, D., Sherry, W., Shporer, A., Tenenbaum, P., Torres, G., Twicken, J.D., Van Cleve, J., Walkowicz, L., Witteborn, F.C., Still, M.: Characteristics of Planetary Candidates Observed by Kepler. II. Analysis of the First Four Months of Data. Astrophys. J. **736**(1), 19 (2011). arXiv:1102.0541 [astro-ph.EP]. https://doi.org/10.1088/0004-637X/736/1/19

Batalha, N.M., Rowe, J.F., Bryson, S.T., Barclay, T., Burke, C.J., Caldwell, D.A., Christiansen, J.L., Mullally, F., Thompson, S.E., Brown, T.M., Dupree, A.K., Fabrycky, D.C., Ford, E.B., Fortney, J.J., Gilliland, R.L., Isaacson, H., Latham, D.W., Marcy, G.W., Quinn, S.N., Ragozzine, D., Shporer, A., Borucki, W.J., Ciardi, D.R., Gautier, I. Thomas N., Haas, M.R., Jenkins, J.M., Koch, D.G., Lissauer, J.J., Rapin, W., Basri, G.S., Boss, A.P., Buchhave, L.A., Carter, J.A., Charbonneau, D., Christensen-Dalsgaard, J., Clarke, B.D., Cochran, W.D., Demory, B.-O., Desert, J.-M., Devore, E., Doyle, L.R., Esquerdo, G.A., Everett, M., Fressin, F., Geary, J.C., Girouard, F.R., Gould, A., Hall, J.R., Holman, M.J., Howard, A.W., Howell, S.B., Ibrahim, K.A., Kinemuchi, K., Kjeldsen, H., Klaus, T.C., Li, J., Lucas, P.W., Meibom, S., Morris, R.L., Prša, A., Quintana, E., Sanderfer, D.T., Sasselov, D., Seader, S.E., Smith, J.C., Steffen, J.H., Still, M., Stumpe, M.C., Tarter, J.C., Tenenbaum, P., Torres, G., Twicken, J.D., Uddin, K., Van Cleve, J., Walkowicz, L., Welsh, W.F.: Planetary Candidates Observed by Kepler. III. Analysis of the First 16 Months of Data. Astrophys. J. Suppl. Ser. **204**(2), 24 (2013). arXiv:1202.5852 [astro-ph.EP]. https://doi.org/10.1088/0067-0049/204/2/24

Ricker, G.R., Winn, J.N., Vanderspek, R., Latham, D.W., Bakos, G.Á., Bean, J.L., Berta-Thompson, Z.K., Brown, T.M., Buchhave, L., Butler, N.R., Butler, R.P., Chaplin, W.J., Charbonneau, D., Christensen-Dalsgaard, J., Clampin, M., Deming, D., Doty, J., De Lee, N., Dressing, C., Dunham, E.W., Endl, M., Fressin, F., Ge, J., Henning, T., Holman, M.J., Howard, A.W., Ida, S., Jenkins, J., Jernigan, G., Johnson, J.A., Kaltenegger, L., Kawai, N., Kjeldsen, H., Laughlin, G., Levine, A.M., Lin, D., Lissauer, J.J., MacQueen, P., Marcy, G., McCullough, P.R., Morton, T.D., Narita, N., Paegert, M., Palle, E., Pepe, F., Pepper, J., Quirrenbach, A., Rinehart, S.A., Sasselov, D., Sato, B., Seager, S., Sozzetti, A., Stassun, K.G., Sullivan, P., Szentgyorgyi, A., Torres, G., Udry, S., Villasenor, J.: Transiting Exoplanet Survey Satellite (TESS). In: Oschmann, J. Jacobus M., Clampin, M., Fazio, G.G., MacEwen, H.A. (eds.) Space Telescopes and Instrumentation 2014: Optical, Infrared, and Millimeter Wave. Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, vol. 9143, p. 914320 (2014). https://doi.org/10.1117/12.2063489

Fortier, A., Beck, T., Benz, W., Broeg, C., Cessa, V., Ehrenreich, D., Thomas, N.: CHEOPS: a space telescope for ultra-high precision photometry of exoplanet transits. In: Oschmann, J. Jacobus M., Clampin, M., Fazio, G.G., MacEwen, H.A. (eds.) Space Telescopes and Instrumentation 2014: Optical, Infrared, and Millimeter Wave. Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, vol. 9143, p. 91432 (2014). https://doi.org/10.1117/12.2056687

Rauer, H., Catala, C., Aerts, C., Appourchaux, T., Benz, W., Brandeker, A., Christensen-Dalsgaard, J., Deleuil, M., Gizon, L., Goupil, M.-J., Güdel, M., Janot-Pacheco, E., Mas-Hesse, M., Pagano, I., Piotto, G., Pollacco, D., Santos, \(\dot{\text{C}}\)., Smith, A., Suárez, J.-C., Szabó, R., Udry, S., Adibekyan, V., Alibert, Y., Almenara, J.-M., Amaro-Seoane, P., Eiff, M.A.-v., Asplund, M., Antonello, E., Barnes, S., Baudin, F., Belkacem, K., Bergemann, M., Bihain, G., Birch, A.C., Bonfils, X., Boisse, I., Bonomo, A.S., Borsa, F., Brandão, I.M., Brocato, E., Brun, S., Burleigh, M., Burston, R., Cabrera, J., Cassisi, S., Chaplin, W., Charpinet, S., Chiappini, C., Church, R.P., Csizmadia, S., Cunha, M., Damasso, M., Davies, M.B., Deeg, H.J., Díaz, R.F., Dreizler, S., Dreyer, C., Eggenberger, P., Ehrenreich, D., Eigmüller, P., Erikson, A., Farmer, R., Feltzing, S., de Oliveira Fialho, F., Figueira, P., Forveille, T., Fridlund, M., García, R.A., Giommi, P., Giuffrida, G., Godolt, M., Gomes da Silva, J., Granzer, T., Grenfell, J.L., Grotsch-Noels, A., Günther, E., Haswell, C.A., Hatzes, A.P., Hébrard, G., Hekker, S., Helled, R., Heng, K., Jenkins, J.M., Johansen, A., Khodachenko, M.L., Kislyakova, K.G., Kley, W., Kolb, U., Krivova, N., Kupka, F., Lammer, H., Lanza, A.F., Lebreton, Y., Magrin, D., Marcos-Arenal, P., Marrese, P.M., Marques, J.P., Martins, J., Mathis, S., Mathur, S., Messina, S., Miglio, A., Montalban, J., Montalto, M., Monteiro, M.J.P.F.G., Moradi, H., Moravveji, E., Mordasini, C., Morel, T., Mortier, A., Nascimbeni, V., Nelson, R.P., Nielsen, M.B., Noack, L., Norton, A.J., Ofir, A., Oshagh, M., Ouazzani, R.-M., Pápics, P., Parro, V.C., Petit, P., Plez, B., Poretti, E., Quirrenbach, A., Ragazzoni, R., Raimondo, G., Rainer, M., Reese, D.R., Redmer, R., Reffert, S., Rojas-Ayala, B., Roxburgh, I.W., Salmon, S., Santerne, A., Schneider, J., Schou, J., Schuh, S., Schunker, H., Silva-Valio, A., Silvotti, R., Skillen, I., Snellen, I., Sohl, F., Sousa, S.G., 
Sozzetti, A., Stello, D., Strassmeier, K.G., Švanda, M., Szabó, G.M., Tkachenko, A., Valencia, D., Van Grootel, V., Vauclair, S.D., Ventura, P., Wagner, F.W., Walton, N.A., Weingrill, J., Werner, S.C., Wheatley, P.J., Zwintz, K.: The PLATO 2.0 mission. Experimental Astronomy
**38**(1-2), 249–330 (2014). arXiv:1310.0696 [astro-ph.EP]. https://doi.org/10.1007/s10686-014-9383-4

Gaia Collaboration, Prusti, T., de Bruijne, J.H.J., Brown, A.G.A., Vallenari, A., Babusiaux, C., Bailer-Jones, C.A.L., Bastian, U., Biermann, M., Evans, D.W., Eyer, L., Jansen, F., Jordi, C., Klioner, S.A., Lammers, U., Lindegren, L., Luri, X., Mignard, F., Milligan, D.J., Panem, C., Poinsignon, V., Pourbaix, D., Randich, S., Sarri, G., Sartoretti, P., Siddiqui, H.I., Soubiran, C., Valette, V., van Leeuwen, F., Walton, N.A., Aerts, C., Arenou, F., Cropper, M., Drimmel, R., Høg, E., Katz, D., Lattanzi, M.G., O’Mullane, W., Grebel, E.K., Holland, A.D., Huc, C., Passot, X., Bramante, L., Cacciari, C., Castañeda, J., Chaoul, L., Cheek, N., De Angeli, F., Fabricius, C., Guerra, R., Hernández, J., Jean-Antoine-Piccolo, A., Masana, E., Messineo, R., Mowlavi, N., Nienartowicz, K., Ordóñez-Blanco, D., Panuzzo, P., Portell, J., Richards, P.J., Riello, M., Seabroke, G.M., Tanga, P., Thévenin, F., Torra, J., Els, S.G., Gracia-Abril, G., Comoretto, G., Garcia-Reinaldos, M., Lock, T., Mercier, E., Altmann, M., Andrae, R., Astraatmadja, T.L., Bellas-Velidis, I., Benson, K., Berthier, J., Blomme, R., Busso, G., Carry, B., Cellino, A., Clementini, G., Cowell, S., Creevey, O., Cuypers, J., Davidson, M., De Ridder, J., de Torres, A., Delchambre, L., Dell’Oro, A., Ducourant, C., Frémat, Y., García-Torres, M., Gosset, E., Halbwachs, J.-L., Hambly, N.C., Harrison, D.L., Hauser, M., Hestroffer, D., Hodgkin, S.T., Huckle, H.E., Hutton, A., Jasniewicz, G., Jordan, S., Kontizas, M., Korn, A. J., Lanzafame, A. C., Manteiga, M., Moitinho, A., Muinonen, K., Osinde, J., Pancino, E., Pauwels, T., Petit, J.-M., Recio-Blanco, A., Robin, A.C., Sarro, L.M., Siopis, C., Smith, M., Smith, K.W., Sozzetti, A., Thuillot, W., van Reeven, W., Viala, Y., Abbas, U., Abreu Aramburu, A., Accart, S., Aguado, J.J., Allan, P.M., Allasia, W., Altavilla, G., Álvarez, M.A., Alves, J., Anderson, R. I., Andrei, A. 
H., Anglada Varela, E., Antiche, E., Antoja, T., Antón, S., Arcay, B., Atzei, A., Ayache, L., Bach, N., Baker, S. G., Balaguer-Núñez, L., Barache, C., Barata, C., Barbier, A., Barblan, F., Baroni, M., Barrado y Navascués, D., Barros, M., Barstow, M.A., Becciani, U., Bellazzini, M., Bellei, G., Bello García, A., Belokurov, V., Bendjoya, P., Berihuete, A., Bianchi, L., Bienaymé, O., Billebaud, F., Blagorodnova, N., Blanco-Cuaresma, S., Boch, T., Bombrun, A., Borrachero, R., Bouquillon, S., Bourda, G., Bouy, H., Bragaglia, A., Breddels, M.A., Brouillet, N., Brüsemeister, T., Bucciarelli, B., Budnik, F., Burgess, P., Burgon, R., Burlacu, A., Busonero, D., Buzzi, R., Caffau, E., Cambras, J., Campbell, H., Cancelliere, R., Cantat-Gaudin, T., Carlucci, T., Carrasco, J. M., Castellani, M., Charlot, P., Charnas, J., Charvet, P., Chassat, F., Chiavassa, A., Clotet, M., Cocozza, G., Collins, R.S., Collins, P., Costigan, G., Crifo, F., Cross, N.J.G., Crosta, M., Crowley, C., Dafonte, C., Damerdji, Y., Dapergolas, A., David, P., David, M., De Cat, P., de Felice, F., de Laverny, P., De Luise, F., De March, R., de Martino, D., de Souza, R., Debosscher, J., del Pozo, E., Delbo, M., Delgado, A., Delgado, H.E., di Marco, F., Di Matteo, P., Diakite, S., Distefano, E., Dolding, C., Dos Anjos, S., Drazinos, P., Durán, J., Dzigan, Y., Ecale, E., Edvardsson, B., Enke, H., Erdmann, M., Escolar, D., Espina, M., Evans, N. W., Eynard Bontemps, G., Fabre, C., Fabrizio, M., Faigler, S., Falcão, A. 
J., Farràs Casas, M., Faye, F., Federici, L., Fedorets, G., Fernández-Hernández, J., Fernique, P., Fienga, A., Figueras, F., Filippi, F., Findeisen, K., Fonti, A., Fouesneau, M., Fraile, E., Fraser, M., Fuchs, J., Furnell, R., Gai, M., Galleti, S., Galluccio, L., Garabato, D., García-Sedano, F., Garé, P., Garofalo, A., Garralda, N., Gavras, P., Gerssen, J., Geyer, R., Gilmore, G., Girona, S., Giuffrida, G., Gomes, M., González-Marcos, A., González-Núñez, J., González-Vidal, J.J., Granvik, M., Guerrier, A., Guillout, P., Guiraud, J., Gúrpide, A., Gutiérrez-Sánchez, R., Guy, L. P., Haigron, R., Hatzidimitriou, D., Haywood, M., Heiter, U., Helmi, A., Hobbs, D., Hofmann, W., Holl, B., Holland, G., Hunt, J.A.S., Hypki, A., Icardi, V., Irwin, M., Jevardat de Fombelle, G., Jofré, P., Jonker, P.G., Jorissen, A., Julbe, F., Karampelas, A., Kochoska, A., Kohley, R., Kolenberg, K., Kontizas, E., Koposov, S.E., Kordopatis, G., Koubsky, P., Kowalczyk, A., Krone-Martins, A., Kudryashova, M., Kull, I., Bachchan, R.K., Lacoste-Seris, F., Lanza, A.F., Lavigne, J.-B., Le Poncin-Lafitte, C., Lebreton, Y., Lebzelter, T., Leccia, S., Leclerc, N., Lecoeur-Taibi, I., Lemaitre, V., Lenhardt, H., Leroux, F., Liao, S., Licata, E., Lindstrøm, H.E.P., Lister, T.A., Livanou, E., Lobel, A., Löffler, W., López, M., Lopez-Lozano, A., Lorenz, D., Loureiro, T., MacDonald, I., Magalhães Fernandes, T., Managau, S., Mann, R.G., Mantelet, G., Marchal, O., Marchant, J.M., Marconi, M., Marie, J., Marinoni, S., Marrese, P.M., Marschalkó, G., Marshall, D.J., Martín-Fleitas, J.M., Martino, M., Mary, N., Matijevic, G., Mazeh, T., McMillan, P.J., Messina, S., Mestre, A., Michalik, D., Millar, N.R., Miranda, B.M.H., Molina, D., Molinaro, R., Molinaro, M., Molnár, L., Moniez, M., Montegriffo, P., Monteiro, D., Mor, R., Mora, A., Morbidelli, R., Morel, T., Morgenthaler, S., Morley, T., Morris, D., Mulone, A.F., Muraveva, T., Musella, I., Narbonne, J., Nelemans, G., Nicastro, L., Noval, L., Ordénovic, C., 
Ordieres-Meré, J., Osborne, P., Pagani, C., Pagano, I., Pailler, F., Palacin, H., Palaversa, L., Parsons, P., Paulsen, T., Pecoraro, M., Pedrosa, R., Pentikäinen, H., Pereira, J., Pichon, B., Piersimoni, A. M., Pineau, F.-X., Plachy, E., Plum, G., Poujoulet, E., Prsa, A., Pulone, L., Ragaini, S., Rago, S., Rambaux, N., Ramos-Lerate, M., Ranalli, P., Rauw, G., Read, A., Regibo, S., Renk, F., Reylé, C., Ribeiro, R.A., Rimoldini, L., Ripepi, V., Riva, A., Rixon, G., Roelens, M., Romero-Gómez, M., Rowell, N., Royer, F., Rudolph, A., Ruiz-Dern, L., Sadowski, G., Sagristá Sellés, T., Sahlmann, J., Salgado, J., Salguero, E., Sarasso, M., Savietto, H., Schnorhk, A., Schultheis, M., Sciacca, E., Segol, M., Segovia, J.C., Segransan, D., Serpell, E., Shih, I-C., Smareglia, R., Smart, R.L., Smith, C., Solano, E., Solitro, F., Sordo, R., Soria Nieto, S., Souchay, J., Spagna, A., Spoto, F., Stampa, U., Steele, I.A., Steidelmüller, H., Stephenson, C.A., Stoev, H., Suess, F.F., Süveges, M., Surdej, J., Szabados, L., Szegedi-Elek, E., Tapiador, D., Taris, F., Tauran, G., Taylor, M.B., Teixeira, R., Terrett, D., Tingley, B., Trager, S.C., Turon, C., Ulla, A., Utrilla, E., Valentini, G., van Elteren, A., Van Hemelryck, E., van Leeuwen, M., Varadi, M., Vecchiato, A., Veljanoski, J., Via, T., Vicente, D., Vogt, S., Voss, H., Votruba, V., Voutsinas, S., Walmsley, G., Weiler, M., Weingrill, K., Werner, D., Wevers, T., Whitehead, G., Wyrzykowski, L., Yoldas, A., Zerjal, M., Zucker, S., Zurbach, C., Zwitter, T., Alecu, A., Allen, M., Allende Prieto, C., Amorim, A., Anglada-Escudé, G., Arsenijevic, V., Azaz, S., Balm, P., Beck, M., Bernstein, H.-H., Bigot, L., Bijaoui, A., Blasco, C., Bonfigli, M., Bono, G., Boudreault, S., Bressan, A., Brown, S., Brunet, P.-M., Bunclark, P., Buonanno, R., Butkevich, A.G., Carret, C., Carrion, C., Chemin, L., Chéreau, F., Corcione, L., Darmigny, E., de Boer, K.S., de Teodoro, P., de Zeeuw, P.T., Delle Luche, C., Domingues, C.D., Dubath, P., Fodor, F., 
Frézouls, B., Fries, A., Fustes, D., Fyfe, D., Gallardo, E., Gallegos, J., Gardiol, D., Gebran, M., Gomboc, A., Gómez, A., Grux, E., Gueguen, A., Heyrovsky, A., Hoar, J., Iannicola, G., Isasi Parache, Y., Janotto, A.-M., Joliet, E., Jonckheere, A., Keil, R., Kim, D.-W., Klagyivik, P., Klar, J., Knude, J., Kochukhov, O., Kolka, I., Kos, J., Kutka, A., Lainey, V., LeBouquin, D., Liu, C., Loreggia, D., Makarov, V.V., Marseille, M.G., Martayan, C., Martinez-Rubi, O., Massart, B., Meynadier, F., Mignot, S., Munari, U., Nguyen, A.-T., Nordlander, T., Ocvirk, P., O ’Flaherty, K.S., Olias Sanz, A., Ortiz, P., Osorio, J., Oszkiewicz, D., Ouzounis, A., Palmer, M., Park, P., Pasquato, E., Peltzer, C., Peralta, J., Pérturaud, F., Pieniluoma, T., Pigozzi, E., Poels, J., Prat, G., Prod ’homme, T., Raison, F., Rebordao, J.M., Risquez, D., Rocca-Volmerange, B., Rosen, S., Ruiz-Fuertes, M.I., Russo, F., Sembay, S., Serraller Vizcaino, I., Short, A., Siebert, A., Silva, H., Sinachopoulos, D., Slezak, E., Soffel, M., Sosnowska, D., Straizys, V., ter Linden, M., Terrell, D., Theil, S., Tiede, C., Troisi, L., Tsalmantza, P., Tur, D., Vaccari, M., Vachier, F., Valles, P., Van Hamme, W., Veltz, L., Virtanen, J., Wallut, J.-M., Wichmann, R., Wilkinson, M.I., Ziaeepour, H., Zschocke, S.: The gaia mission. Astron. Astrophys.

**595**, 1 (2016). https://doi.org/10.1051/0004-6361/201629272

Mayor, M., Pepe, F., Queloz, D., Bouchy, F., Rupprecht, G., Lo Curto, G., Avila, G., Benz, W., Bertaux, J.-L., Bonfils, X., Dall, T., Dekker, H., Delabre, B., Eckert, W., Fleury, M., Gilliotte, A., Gojak, D., Guzman, J.C., Kohler, D., Lizon, J.-L., Longinotti, A., Lovis, C., Megevand, D., Pasquini, L., Reyes, J., Sivan, J.-P., Sosnowska, D., Soto, R., Udry, S., van Kesteren, A., Weber, L., Weilenmann, U.: Setting New Standards with HARPS. The Messenger
**114**, 20–24 (2003)

Pollacco, D.L., Skillen, I., Collier Cameron, A., Christian, D.J., Hellier, C., Irwin, J., Lister, T.A., Street, R.A., West, R.G., Anderson, D.R., Clarkson, W.I., Deeg, H., Enoch, B., Evans, A., Fitzsimmons, A., Haswell, C.A., Hodgkin, S., Horne, K., Kane, S.R., Keenan, F.P., Maxted, P.F.L., Norton, A.J., Osborne, J., Parley, N.R., Ryans, R.S.I., Smalley, B., Wheatley, P.J., Wilson, D.M.: The WASP Project and the SuperWASP Cameras. Publ. Astron. Soc. Pac.
**118**(848), 1407–1418 (2006). arXiv:0608454 [astro-ph]. https://doi.org/10.1086/508556

Pepper, J., Pogge, R.W., DePoy, D.L., Marshall, J.L., Stanek, K.Z., Stutz, A.M., Poindexter, S., Siverd, R., O’Brien, T.P., Trueblood, M., Trueblood, P.: The Kilodegree Extremely Little Telescope (KELT): A Small Robotic Telescope for Large-Area Synoptic Surveys. Publ. Astron. Soc. Pac.
**119**(858), 923–935 (2007). arXiv:0704.0460 [astro-ph]. https://doi.org/10.1086/521836

Udalski, A., Szymański, M.K., Szymański, G.: OGLE-IV: Fourth Phase of the Optical Gravitational Lensing Experiment. Acta Astron.
**65**(1), 1–38 (2015). arXiv:1504.05966 [astro-ph.SR]

Seager, S., Sasselov, D.D.: Theoretical Transmission Spectra during Extrasolar Giant Planet Transits. Astrophys. J.
**537**(2), 916–921 (2000). arXiv:astro-ph/9912241 [astro-ph]. https://doi.org/10.1086/309088

Charbonneau, D., Allen, L.E., Megeath, S.T., Torres, G., Alonso, R., Brown, T.M., Gilliland, R.L., Latham, D.W., Mandushev, G., O’Donovan, F.T., Sozzetti, A.: Detection of Thermal Emission from an Extrasolar Planet. Astrophys. J.
**626**(1), 523–529 (2005). arXiv:astro-ph/0503457 [astro-ph]. https://doi.org/10.1086/429991

Tinetti, G., Vidal-Madjar, A., Liang, M., Beaulieu, J.-P., Yung, Y., Carey, S., Barber, R., Tennyson, J., Ribas, I., Allard, N., Ballester, G., Sing, D., Selsis, F.: Water vapour in the atmosphere of a transiting extrasolar planet. Nature
**448**, 169–71 (2007). https://doi.org/10.1038/nature06002

Madhusudhan, N., Lee, K.K.M., Mousis, O.: A Possible Carbon-rich Interior in Super-Earth 55 Cancri e. Astrophys. J. Lett.
**759**(2), 40 (2012). arXiv:1210.2720 [astro-ph.EP]. https://doi.org/10.1088/2041-8205/759/2/L40

Tinetti, G., Encrenaz, T., Coustenis, A.: Spectroscopy of planetary atmospheres in our Galaxy. Astron. Astrophys. Rev.
**21**, 63 (2013). https://doi.org/10.1007/s00159-013-0063-6

Kreidberg, L., Bean, J.L., Désert, J.-M., Benneke, B., Deming, D., Stevenson, K.B., Seager, S., Berta-Thompson, Z., Seifahrt, A., Homeier, D.: Clouds in the atmosphere of the super-Earth exoplanet GJ1214b. Nature
**505**(7481), 69–72 (2014). arXiv:1401.0022 [astro-ph.EP]. https://doi.org/10.1038/nature12888

Sing, D.K., Fortney, J.J., Nikolov, N., Wakeford, H.R., Kataria, T., Evans, T.M., Aigrain, S., Ballester, G.E., Burrows, A.S., Deming, D., Désert, J.-M., Gibson, N.P., Henry, G.W., Huitson, C.M., Knutson, H.A., Lecavelier Des Etangs, A., Pont, F., Showman, A.P., Vidal-Madjar, A., Williamson, M.H., Wilson, P.A.: A continuum from clear to cloudy hot-Jupiter exoplanets without primordial water depletion. Nature
**529**(7584), 59–62 (2016). arXiv:1512.04341 [astro-ph.EP]. https://doi.org/10.1038/nature16068

Line, M.R., Stevenson, K.B., Bean, J., Desert, J.-M., Fortney, J.J., Kreidberg, L., Madhusudhan, N., Showman, A.P., Diamond-Lowe, H.: No Thermal Inversion and a Solar Water Abundance for the Hot Jupiter HD 209458b from HST/WFC3 Spectroscopy. Astron. J.
**152**(6), 203 (2016). arXiv:1605.08810 [astro-ph.EP]. https://doi.org/10.3847/0004-6256/152/6/203

Tsiaras, A., Waldmann, I.P., Zingales, T., Rocchetto, M., Morello, G., Damiano, M., Karpouzas, K., Tinetti, G., McKemmish, L.K., Tennyson, J., Yurchenko, S.N.: A Population Study of Gaseous Exoplanets. Astron. J.
**155**(4), 156 (2018). arXiv:1704.05413 [astro-ph.EP]. https://doi.org/10.3847/1538-3881/aaaf75

Evans, T.M., Sing, D.K., Goyal, J.M., Nikolov, N., Marley, M.S., Zahnle, K., Henry, G.W., Barstow, J.K., Alam, M.K., Sanz-Forcada, J., Kataria, T., Lewis, N.K., Lavvas, P., Ballester, G.E., Ben-Jaffel, L., Blumenthal, S.D., Bourrier, V., Drummond, B., García Muñoz, A., López-Morales, M., Tremblin, P., Ehrenreich, D., Wakeford, H.R., Buchhave, L.A., Lecavelier des Etangs, A., Hébrard, E., Williamson, M.H.: An Optical Transmission Spectrum for the Ultra-hot Jupiter WASP-121b Measured with the Hubble Space Telescope. Astron. J.
**156**(6), 283 (2018). arXiv:1810.10969 [astro-ph.EP]. https://doi.org/10.3847/1538-3881/aaebf

Pinhas, A., Madhusudhan, N., Gandhi, S., MacDonald, R.: H2O abundances and cloud properties in ten hot giant exoplanets. Monthly Notices of the Royal Astronomical Society
**482**(2), 1485–1498 (2018). https://doi.org/10.1093/mnras/sty2544

Welbanks, L., Madhusudhan, N., Allard, N.F., Hubeny, I., Spiegelman, F., Leininger, T.: Mass-metallicity trends in transiting exoplanets from atmospheric abundances of h2o, na, and k. Astrophys. J. Lett.
**887**(1), 20 (2019). https://doi.org/10.3847/2041-8213/ab5a89

Mikal-Evans, T., Sing, D.K., Kataria, T., Wakeford, H.R., Mayne, N.J., Lewis, N.K., Barstow, J.K., Spake, J.J.: Confirmation of water emission in the dayside spectrum of the ultrahot Jupiter WASP-121b. Mon. Not. R. Astron. Soc.
**496**(2), 1638–1644 (2020). arXiv:2005.09631 [astro-ph.EP]. https://doi.org/10.1093/mnras/staa1628

Pluriel, W., Zingales, T., Leconte, J., Parmentier, V.: Strong biases in retrieved atmospheric composition caused by day-night chemical heterogeneities. Astron. Astrophys.
**636**, 66 (2020). arXiv:2003.05943 [astro-ph.EP]. https://doi.org/10.1051/0004-6361/202037678

Edwards, B., Changeat, Q., Baeyens, R., Tsiaras, A., Al-Refaie, A., Taylor, J., Yip, K.H., Bieger, M.F., Blain, D., Gressier, A., Guilluy, G., Jaziri, A.Y., Kiefer, F., Modirrousta-Galian, D., Morvan, M., Mugnai, L.V., Pluriel, W., Poveda, M., Skaf, N., Whiteford, N., Wright, S., Zingales, T., Charnay, B., Drossart, P., Leconte, J., Venot, O., Waldmann, I., Beaulieu, J.-P.: ARES I: WASP-76 b, A Tale of Two HST Spectra. Astron. J.
**160**(1), 8 (2020). https://doi.org/10.3847/1538-3881/AB9225

Skaf, N., Bieger, M.F., Edwards, B., Changeat, Q., Morvan, M., Kiefer, F., Blain, D., Zingales, T., Poveda, M., Al-Refaie, A., Baeyens, R., Gressier, A., Guilluy, G., Jaziri, A.Y., Modirrousta-Galian, D., Mugnai, L.V., Pluriel, W., Whiteford, N., Wright, S., Yip, K.H., Charnay, B., Leconte, J., Drossart, P., Tsiaras, A., Venot, O., Waldmann, I., Beaulieu, J.-P.: ARES. II. Characterizing the Hot Jupiters WASP-127 b, WASP-79 b, and WASP-62b with the Hubble Space Telescope. Astron. J.
**160**(3), 109 (2020). arXiv:2005.09615 [astro-ph.EP]. https://doi.org/10.3847/1538-3881/ab94a3

Pluriel, W., Whiteford, N., Edwards, B., Changeat, Q., Yip, K.H., Baeyens, R., Al-Refaie, A., Bieger, M.F., Blain, D., Gressier, A., Guilluy, G., Jaziri, A.Y., Kiefer, F., Modirrousta-Galian, D., Morvan, M., Mugnai, L.V., Poveda, M., Skaf, N., Zingales, T., Wright, S., Charnay, B., Drossart, P., Leconte, J., Tsiaras, A., Venot, O., Waldmann, I., Beaulieu, J.-P.: ARES. III. Unveiling the Two Faces of KELT-7 b with HST WFC3. Astron. J.
**160**(3), 112 (2020). arXiv:2006.14199 [astro-ph.EP]. https://doi.org/10.3847/1538-3881/aba000

Guilluy, G., Gressier, A., Wright, S., Santerne, A., Jaziri, A.Y., Edwards, B., Changeat, Q., Modirrousta-Galian, D., Skaf, N., Al-Refaie, A., Baeyens, R., Bieger, M.F., Blain, D., Kiefer, F., Morvan, M., Mugnai, L.V., Pluriel, W., Poveda, M., Zingales, T., Whiteford, N., Yip, K.H., Charnay, B., Leconte, J., Drossart, P., Sozzetti, A., Marcq, E., Tsiaras, A., Venot, O., Waldmann, I., Beaulieu, J.-P.: ARES IV: Probing the Atmospheres of the Two Warm Small Planets HD 106315c and HD 3167c with the HST/WFC3 Camera. Astron. J.
**161**(1), 19 (2021). arXiv:2011.03221 [astro-ph.EP]. https://doi.org/10.3847/1538-3881/abc3c8

Mugnai, L.V., Modirrousta-Galian, D., Edwards, B., Changeat, Q., Bouwman, J., Morello, G., Al-Refaie, A., Baeyens, R., Bieger, M.F., Blain, D., Gressier, A., Guilluy, G., Jaziri, Y., Kiefer, F., Morvan, M., Pluriel, W., Poveda, M., Skaf, N., Whiteford, N., Wright, S., Yip, K.H., Zingales, T., Charnay, B., Drossart, P., Leconte, J., Venot, O., Waldmann, I., Beaulieu, J.-P.: ARES. V. No Evidence For Molecular Absorption in the HST WFC3 Spectrum of GJ 1132 b. Astron. J.
**161**(6), 284 (2021). arXiv:2104.01873 [astro-ph.EP]. https://doi.org/10.3847/1538-3881/abf3c3

Changeat, Q.: On spectroscopic phase-curve retrievals: H2 dissociation and thermal inversion in the atmosphere of the ultrahot jupiter wasp103 b. Astron. J.
**163**(3), 106 (2022). https://doi.org/10.3847/1538-3881/ac4475

Encrenaz, T., Tinetti, G., Tessenyi, M., Drossart, P., Hartogh, P., Coustenis, A.: Transit spectroscopy of exoplanets from space: how to optimize the wavelength coverage and spectral resolving power. Exp. Astron.
**40**(2–3), 523–543 (2015). https://doi.org/10.1007/s10686-014-9415-0

Tsiaras, A., Waldmann, I.P., Tinetti, G., Tennyson, J., Yurchenko, S.N.: Water vapour in the atmosphere of the habitable-zone eight-Earth-mass planet K2-18 b. Nat. Astron.
**3** (2019). https://doi.org/10.1038/s41550-019-0878-9

Changeat, Q., Edwards, B., Al-Refaie, A.F., Tsiaras, A., Skinner, J.W., Cho, J.Y.K., Yip, K.H., Anisman, L., Ikoma, M., Bieger, M.F., Venot, O., Shibata, S., Waldmann, I.P., Tinetti, G.: Five key exoplanet questions answered via the analysis of 25 hot-jupiter atmospheres in eclipse. Astrophys. J. Suppl. Ser.
**260**(1), 3 (2022). https://doi.org/10.3847/1538-4365/ac5cc2

Greene, T.P., Line, M.R., Montero, C., Fortney, J.J., Lustig-Yaeger, J., Luther, K.: Characterizing Transiting Exoplanet Atmospheres with JWST. Astrophys. J.
**817**(1), 17 (2016). arXiv:1511.05528 [astro-ph.EP]. https://doi.org/10.3847/0004-637X/817/1/17

Feinstein, A.D., Radica, M., Welbanks, L., Murray, C.A., Ohno, K., Coulombe, L.-P., Espinoza, N., Bean, J.L., Teske, J.K., Benneke, B., Line, M.R., Rustamkulov, Z., Saba, A., Tsiaras, A., Barstow, J.K., Fortney, J.J., Gao, P., Knutson, H.A., MacDonald, R.J., Mikal-Evans, T., Rackham, B.V., Taylor, J., Parmentier, V., Batalha, N.M., Berta-Thompson, Z.K., Carter, A.L., Changeat, Q., Santos, L.A.D., Gibson, N.P., Goyal, J.M., Kreidberg, L., López-Morales, M., Lothringer, J.D., Miguel, Y., Molaverdikhani, K., Moran, S.E., Morello, G., Mukherjee, S., Sing, D.K., Stevenson, K.B., Wakeford, H.R., Ahrer, E.-M., Alam, M.K., Alderson, L., Allen, N.H., Batalha, N.E., Bell, T.J., Blecic, J., Brande, J., Caceres, C., Casewell, S.L., Chubb, K.L., Crossfield, I.J.M., Crouzet, N., Cubillos, P.E., Decin, L., Désert, J.-M., Harrington, J., Heng, K., Henning, T., Iro, N., Kempton, E.M.-R., Kendrew, S., Kirk, J., Krick, J., Lagage, P.-O., Lendl, M., Mancini, L., Mansfield, M., May, E.M., Mayne, N.J., Nikolov, N.K., Palle, E., de la Roche, D.J.M.P.d., Piaulet, C., Powell, D., Redfield, S., Rogers, L.K., Roman, M.T., Roy, P.-A., Nixon, M.C., Schlawin, E., Tan, X., Tremblin, P., Turner, J.D., Venot, O., Waalkes, W.C., Wheatley, P.J., Zhang, X.: Early Release Science of the exoplanet WASP-39b with JWST NIRISS. arXiv (2022). arXiv:2211.10493. https://doi.org/10.48550/ARXIV.2211.10493

Ahrer, E.-M., Stevenson, K.B., Mansfield, M., Moran, S.E., Brande, J., Morello, G., Murray, C.A., Nikolov, N.K., de la Roche, D.J.M.P.d., Schlawin, E., Wheatley, P.J., Zieba, S., Batalha, N.E., Damiano, M., Goyal, J.M., Lendl, M., Lothringer, J.D., Mukherjee, S., Ohno, K., Batalha, N.M., Battley, M.P., Bean, J.L., Beatty, T.G., Benneke, B., Berta-Thompson, Z.K., Carter, A.L., Cubillos, P.E., Daylan, T., Espinoza, N., Gao, P., Gibson, N.P., Gill, S., Harrington, J., Hu, R., Kreidberg, L., Lewis, N.K., Line, M.R., López-Morales, M., Parmentier, V., Powell, D.K., Sing, D.K., Tsai, S.-M., Wakeford, H.R., Welbanks, L., Alam, M.K., Alderson, L., Allen, N.H., Anderson, D.R., Barstow, J.K., Bayliss, D., Bell, T.J., Blecic, J., Bryant, E.M., Burleigh, M.R., Carone, L., Casewell, S.L., Changeat, Q., Chubb, K.L., Crossfield, I.J.M., Crouzet, N., Decin, L., Désert, J.-M., Feinstein, A.D., Flagg, L., Fortney, J.J., Gizis, J.E., Heng, K., Iro, N., Kempton, E.M.-R., Kendrew, S., Kirk, J., Knutson, H.A., Komacek, T.D., Lagage, P.-O., Leconte, J., Lustig-Yaeger, J., MacDonald, R.J., Mancini, L., May, E.M., Mayne, N.J., Miguel, Y., Mikal-Evans, T., Molaverdikhani, K., Palle, E., Piaulet, C., Rackham, B.V., Redfield, S., Rogers, L.K., Roy, P.-A., Rustamkulov, Z., Shkolnik, E.L., Sotzen, K.S., Taylor, J., Tremblin, P., Tucker, G.S., Turner, J.D., de Val-Borro, M., Venot, O., Zhang, X.: Early Release Science of the exoplanet WASP-39b with JWST NIRCam. arXiv (2022). https://doi.org/10.48550/ARXIV.2211.10489. arXiv:2211.10489

Rustamkulov, Z., Sing, D.K., Mukherjee, S., May, E.M., Kirk, J., Schlawin, E., Line, M.R., Piaulet, C., Carter, A.L., Batalha, N.E., Goyal, J.M., López-Morales, M., Lothringer, J.D., MacDonald, R.J., Moran, S.E., Stevenson, K.B., Wakeford, H.R., Espinoza, N., Bean, J.L., Batalha, N.M., Benneke, B., Berta-Thompson, Z.K., Crossfield, I.J.M., Gao, P., Kreidberg, L., Powell, D.K., Cubillos, P.E., Gibson, N.P., Leconte, J., Molaverdikhani, K., Nikolov, N.K., Parmentier, V., Roy, P., Taylor, J., Turner, J.D., Wheatley, P.J., Aggarwal, K., Ahrer, E., Alam, M.K., Alderson, L., Allen, N.H., Banerjee, A., Barat, S., Barrado, D., Barstow, J.K., Bell, T.J., Blecic, J., Brande, J., Casewell, S., Changeat, Q., Chubb, K.L., Crouzet, N., Daylan, T., Decin, L., Désert, J., Mikal-Evans, T., Feinstein, A.D., Flagg, L., Fortney, J.J., Harrington, J., Heng, K., Hong, Y., Hu, R., Iro, N., Kataria, T., Kempton, E.M.-R., Krick, J., Lendl, M., Lillo-Box, J., Louca, A., Lustig-Yaeger, J., Mancini, L., Mansfield, M., Mayne, N.J., Miguel, Y., Morello, G., Ohno, K., Palle, E., de la Roche, D.J.M.P.d., Rackham, B.V., Radica, M., Ramos-Rosado, L., Redfield, S., Rogers, L.K., Shkolnik, E.L., Southworth, J., Teske, J., Tremblin, P., Tucker, G.S., Venot, O., Waalkes, W.C., Welbanks, L., Zhang, X., Zieba, S.: Early Release Science of the exoplanet WASP-39b with JWST NIRSpec PRISM. arXiv (2022). https://doi.org/10.48550/ARXIV.2211.10487. arXiv:2211.10487

Alderson, L., Wakeford, H.R., Alam, M.K., Batalha, N.E., Lothringer, J.D., Redai, J.A., Barat, S., Brande, J., Damiano, M., Daylan, T., Espinoza, N., Flagg, L., Goyal, J.M., Grant, D., Hu, R., Inglis, J., Lee, E.K.H., Mikal-Evans, T., Ramos-Rosado, L., Roy, P.-A., Wallack, N.L., Batalha, N.M., Bean, J.L., Benneke, B., Berta-Thompson, Z.K., Carter, A.L., Changeat, Q., Colón, K.D., Crossfield, I.J.M., Désert, J.-M., Foreman-Mackey, D., Gibson, N.P., Kreidberg, L., Line, M.R., López-Morales, M., Molaverdikhani, K., Moran, S.E., Morello, G., Moses, J.I., Mukherjee, S., Schlawin, E., Sing, D.K., Stevenson, K.B., Taylor, J., Aggarwal, K., Ahrer, E.-M., Allen, N.H., Barstow, J.K., Bell, T.J., Blecic, J., Casewell, S.L., Chubb, K.L., Crouzet, N., Cubillos, P.E., Decin, L., Feinstein, A.D., Fortney, J.J., Harrington, J., Heng, K., Iro, N., Kempton, E.M.-R., Kirk, J., Knutson, H.A., Krick, J., Leconte, J., Lendl, M., MacDonald, R.J., Mancini, L., Mansfield, M., May, E.M., Mayne, N.J., Miguel, Y., Nikolov, N.K., Ohno, K., Palle, E., Parmentier, V., de la Roche, D.J.M.P.d., Piaulet, C., Powell, D., Rackham, B.V., Redfield, S., Rogers, L.K., Rustamkulov, Z., Tan, X., Tremblin, P., Tsai, S.-M., Turner, J.D., de Val-Borro, M., Venot, O., Welbanks, L., Wheatley, P.J., Zhang, X.: Early Release Science of the Exoplanet WASP-39b with JWST NIRSpec G395H. arXiv (2022). https://doi.org/10.48550/ARXIV.2211.10488. arXiv:2211.10488

Tsai, S.-M., Lee, E.K.H., Powell, D., Gao, P., Zhang, X., Moses, J., Hébrard, E., Venot, O., Parmentier, V., Jordan, S., Hu, R., Alam, M.K., Alderson, L., Batalha, N.M., Bean, J.L., Benneke, B., Bierson, C.J., Brady, R.P., Carone, L., Carter, A.L., Chubb, K.L., Inglis, J., Leconte, J., Lopez-Morales, M., Miguel, Y., Molaverdikhani, K., Rustamkulov, Z., Sing, D.K., Stevenson, K.B., Wakeford, H.R., Yang, J., Aggarwal, K., Baeyens, R., Barat, S., Borro, M.d.V., Daylan, T., Fortney, J.J., France, K., Goyal, J.M., Grant, D., Kirk, J., Kreidberg, L., Louca, A., Moran, S.E., Mukherjee, S., Nasedkin, E., Ohno, K., Rackham, B.V., Redfield, S., Taylor, J., Tremblin, P., Visscher, C., Wallack, N.L., Welbanks, L., Youngblood, A., Ahrer, E.-M., Batalha, N.E., Behr, P., Berta-Thompson, Z.K., Blecic, J., Casewell, S.L., Crossfield, I.J.M., Crouzet, N., Cubillos, P.E., Decin, L., Désert, J.-M., Feinstein, A.D., Gibson, N.P., Harrington, J., Heng, K., Henning, T., Kempton, E.M.-R., Krick, J., Lagage, P.-O., Lendl, M., Line, M., Lothringer, J.D., Mansfield, M., Mayne, N.J., Mikal-Evans, T., Palle, E., Schlawin, E., Shorttle, O., Wheatley, P.J., Yurchenko, S.N.: Direct Evidence of Photochemistry in an Exoplanet Atmosphere. arXiv (2022). https://doi.org/10.48550/ARXIV.2211.10490. arXiv:2211.10490

Tinetti, G., Drossart, P., Eccleston, P., Hartogh, P., Heske, A., Leconte, J., Micela, G., Ollivier, M., Pilbratt, G., Puig, L., Turrini, D., Vandenbussche, B., Wolkenberg, P., Beaulieu, J.-P., Buchave, L.A., Ferus, M., Griffin, M., Guedel, M., Justtanont, K., Lagage, P.-O., Machado, P., Malaguti, G., Min, M., Nørgaard-Nielsen, H.U., Rataj, M., Ray, T., Ribas, I., Swain, M., Szabo, R., Werner, S., Barstow, J., Burleigh, M., Cho, J., du Foresto, V.C., Coustenis, A., Decin, L., Encrenaz, T., Galand, M., Gillon, M., Helled, R., Morales, J.C., Muñoz, A.G., Moneti, A., Pagano, I., Pascale, E., Piccioni, G., Pinfield, D., Sarkar, S., Selsis, F., Tennyson, J., Triaud, A., Venot, O., Waldmann, I., Waltham, D., Wright, G., Amiaux, J., Auguères, J.-L., Berthé, M., Bezawada, N., Bishop, G., Bowles, N., Coffey, D., Colomé, J., Crook, M., Crouzet, P.-E., Da Peppo, V., Sanz, I.E., Focardi, M., Frericks, M., Hunt, T., Kohley, R., Middleton, K., Morgante, G., Ottensamer, R., Pace, E., Pearson, C., Stamper, R., Symonds, K., Rengel, M., Renotte, E., Ade, P., Affer, L., Alard, C., Allard, N., Altieri, F., André, Y., Arena, C., Argyriou, I., Aylward, A., Baccani, C., Bakos, G., Banaszkiewicz, M., Barlow, M., Batista, V., Bellucci, G., Benatti, S., Bernardi, P., Bézard, B., Blecka, M., Bolmont, E., Bonfond, B., Bonito, R., Bonomo, A.S., Brucato, J.R., Brun, A.S., Bryson, I., Bujwan, W., Casewell, S., Charnay, B., Pestellini, C.C., Chen, G., Ciaravella, A., Claudi, R., Clédassou, R., Damasso, M., Damiano, M., Danielski, C., Deroo, P., Di Giorgio, A.M., Dominik, C., Doublier, V., Doyle, S., Doyon, R., Drummond, B., Duong, B., Eales, S., Edwards, B., Farina, M., Flaccomio, E., Fletcher, L., Forget, F., Fossey, S., Fränz, M., Fujii, Y., García-Piquer, Á., Gear, W., Geoffray, H., Gérard, J.C., Gesa, L., Gomez, H., Graczyk Rafa land Griffith, C., Grodent, D., Guarcello, M.G., Gustin, J., Hamano, K., Hargrave, P., Hello, Y., Heng, K., Herrero, E., Hornstrup, A., Hubert, B., Ida, S., Ikoma, 
M., Iro, N., Irwin, P., Jarchow, C., Jaubert, J., Jones, H., Julien, Q., Kameda, S., Kerschbaum, F., Kervella, P., Koskinen, T., Krijger, M., Krupp, N., Lafarga, M., Landini, F., Lellouch, E., Leto, G., Luntzer, A., Rank-Lüftinger, T., Maggio, A., Maldonado, J., Maillard, J.-P., Mall, U., Marquette, J.-B., Mathis, S., Maxted, P., Matsuo, T., Medvedev, A., Miguel, Y., Minier, V., Morello, G., Mura, A., Narita, N., Nascimbeni, V., Nguyen Tong, N., Noce, V., Oliva, F., Palle, E., Palmer, P., Pancrazzi, M., Papageorgiou, A., Parmentier, V., Perger, M., Petralia, A., Pezzuto, S., Pierrehumbert, R., Pillitteri, I., Piotto, G., Pisano, G., Prisinzano, L., Radioti, A., Réess, J.-M., Rezac, L., Rocchetto, M., Rosich, A., Sanna, N., Santerne, A., Savini, G., Scandariato, G., Sicardy, B., Sierra, C., Sindoni, G., Skup, K., Snellen, I., Sobiecki, M., Soret, L., Sozzetti, A., Stiepen, A., Strugarek, A., Taylor, J., Taylor, W., Terenzi, L., Tessenyi, M., Tsiaras, A., Tucker, C., Valencia, D., Vasisht, G., Vazan, A., Vilardell, F., Vinatier, S., Viti, S., Waters, R., Wawer, P., Wawrzaszek, A., Whitworth, A., Yung, Y.L., Yurchenko, S.N., Osorio, M.R.Z., Zellem, R., Zingales, T., Zwart, F.: A chemical survey of exoplanets with ARIEL. Exp. Astron.

**46**(1), 135–209 (2018). https://doi.org/10.1007/s10686-018-9598-x

Edwards, B., Mugnai, L., Tinetti, G., Pascale, E., Sarkar, S.: An Updated Study of Potential Targets for Ariel. Astron. J.
**157**(6), 242 (2019). arXiv:1905.04959 [astro-ph.EP]. https://doi.org/10.3847/1538-3881/ab1cb9

Mugnai, L.V., Al-Refaie, A., Bocchieri, A., Changeat, Q., Pascale, E., Tinetti, G.: Alfnoor: Assessing the Information Content of Ariel’s Low-resolution Spectra with Planetary Population Studies. Astron. J.
**162**, 288 (2021). https://doi.org/10.3847/1538-3881/ac2e92

Wall, J.V., Jenkins, C.R.: Practical Statistics for Astronomers (2012)

Changeat, Q., Al-Refaie, A., Mugnai, L.V., Edwards, B., Waldmann, I.P., Pascale, E., Tinetti, G.: Alfnoor: A Retrieval Simulation of the Ariel Target List. Astron. J.
**160**(2), 80 (2020). https://doi.org/10.3847/1538-3881/ab9a53

Al-Refaie, A.F., Changeat, Q., Waldmann, I.P., Tinetti, G.: TauREx 3: A Fast, Dynamic, and Extendable Framework for Retrievals. Astrophys. J.
**917**(1), 37 (2021). https://doi.org/10.3847/1538-4357/ac0252

Mugnai, L.V., Pascale, E., Edwards, B., Papageorgiou, A., Sarkar, S.: ArielRad: the Ariel radiometric model. Exp. Astron.
**50**(2–3), 303–328 (2020). https://doi.org/10.1007/s10686-020-09676-7

Abel, M., Frommhold, L., Li, X., Hunt, K.L.C.: Collision-induced absorption by h2 pairs: From hundreds to thousands of kelvin. The Journal of Physical Chemistry A
**115**(25), 6805–6812 (2011). https://doi.org/10.1021/jp109441f. PMID: 21207941

Fletcher, L.N., Gustafsson, M., Orton, G.S.: Hydrogen Dimers in Giant-planet Infrared Spectra. Astrophys. J. Suppl. Ser.
**235**(1), 24 (2018). arXiv:1712.02813 [astro-ph.EP]. https://doi.org/10.3847/1538-4365/aaa07a

Abel, M., Frommhold, L., Li, X., Hunt, K.L.C.: Infrared absorption by collisional h2-he complexes at temperatures up to 9000 k and frequencies from 0 to 20,000 cm(-1). The Journal of Chemical Physics
**136**(4), 044319 (2012). https://doi.org/10.1063/1.3676405

Barton, E.J., Hill, C., Yurchenko, S.N., Tennyson, J., Dudaryonok, A.S., Lavrentieva, N.N.: Pressure-dependent water absorption cross sections for exoplanets and other atmospheres. J. Quant. Spectrosc. Radiat. Transf.
**187**, 453–460 (2017). arXiv:1610.09008 [astro-ph.EP]. https://doi.org/10.1016/j.jqsrt.2016.10.024Polyansky, O.L., Kyuberis, A.A., Zobov, N.F., Tennyson, J., Yurchenko, S.N., Lodi, L.: ExoMol molecular line lists XXX: a complete high-accuracy line list for water. Mon. Not. R. Astron. Soc.

**480**(2), 2597–2608 (2018). arXiv:1807.04529 [astro-ph.EP]. https://doi.org/10.1093/mnras/sty1877Hill, C., Yurchenko, S.N., Tennyson, J.: Temperature-dependent molecular absorption cross sections for exoplanets and other atmospheres. Icarus

**226**(2), 1673–1677 (2013). https://doi.org/10.1016/j.icarus.2012.07.028Yurchenko, S.N., Tennyson, J.: ExoMol line lists - IV. The rotation-vibration spectrum of methane up to 1500 K. Mon. Not. R. Astron. Soc.

**440**(2), 1649–1661 (2014). arXiv:1401.4852 [astro-ph.EP]. https://doi.org/10.1093/mnras/stu326Rothman, L.S., Gordon, I.E., Barber, R.J., Dothe, H., Gamache, R.R., Goldman, A., Perevalov, V.I., Tashkun, S.A., Tennyson, J.: HITEMP, the high-temperature molecular spectroscopic database. J. Quant. Spectrosc. Radiat. Transf.

**111**, 2139–2150 (2010). https://doi.org/10.1016/j.jqsrt.2010.05.001Yurchenko, S.N., Barber, R.J., Tennyson, J.: A variationally computed line list for hot NH3. Monthly Notices of the Royal Astronomical Society

**413**(3), 1828–1834 (2011). https://doi.org/10.1111/j.1365-2966.2011.18261.xTennyson, J., Yurchenko, S.N.: ExoMol: molecular line lists for exoplanet and other atmospheres. Monthly Notices of the Royal Astronomical Society

**425**(1), 21–33 (2012). https://doi.org/10.1111/j.1365-2966.2012.21440.xLi, G., Gordon, I.E., Rothman, L.S., Tan, Y., Hu, S.-M., Kassi, S., Campargue, A., Medvedev, E.S.: Rovibrational Line Lists for Nine Isotopologues of the CO Molecule in the X \(^{1}\)\(\Sigma \)\(^{+}\) Ground Electronic State. Astrophys. J. Suppl. Ser.

**216**(1), 15 (2015). https://doi.org/10.1088/0067-0049/216/1/15Azzam, A.A.A., Tennyson, J., Yurchenko, S.N., Naumenko, O.V.: ExoMol molecular line lists – XVI. The rotation-vibration spectrum of hot H2S. Monthly Notices of the Royal Astronomical Society

**460**(4), 4063–4074 (2016). https://doi.org/10.1093/mnras/stw1133Barber, R.J., Strange, J.K., Hill, C., Polyansky, O.L., Mellau, G.C., Yurchenko, S.N., Tennyson, J.: ExoMol line lists – III. An improved hot rotation-vibration line list for HCN and HNC. Monthly Notices of the Royal Astronomical Society

**437**(2), 1828–1835 (2013). https://doi.org/10.1093/mnras/stt2011Changeat, Q., Keyte, L., Waldmann, I.P., Tinetti, G.: Impact of Planetary Mass Uncertainties on Exoplanet Atmospheric Retrievals. Astrophys. J.

**896**(2), 107 (2020). arXiv:1908.06305 [astro-ph.EP]. https://doi.org/10.3847/1538-4357/ab8f8bFeroz, F., Hobson, M.P., Bridges, M.: MultiNest: an efficient and robust Bayesian inference tool for cosmology and particle physics. Monthly Notices of the Royal Astronomical Society

**398**(4), 1601–1614 (2009). https://doi.org/10.1111/j.1365-2966.2009.14548.xBuchner, J.: Nested Sampling Methods. arXiv e-prints, 2101–09675 (2021). arXiv:2101.09675 [stat.CO]

Kass, R.E., Raftery, A.E.: Bayes factors. Journal of the American Statistical Association

**90**(430), 773–795 (1995). https://doi.org/10.1080/01621459.1995.10476572Jenkins, C.R., Peacock, J.A.: The power of Bayesian evidence in astronomy. Monthly Notices of the Royal Astronomical Society

**413**(4), 2895–2905 (2011). https://doi.org/10.1111/j.1365-2966.2011.18361.xSanders, F.: On Subjective Probability Forecasting. Journal of Applied Meteorology and Climatology

**2**(2), 191–201 (1963). https://doi.org/10.1175/1520-0450(1963)002<0191:OSPF>2.0.CO;2Wilks, D.S.: Chapter 9 - forecast verification. In: Wilks, D.S. (ed.) Statistical Methods in the Atmospheric Sciences (Fourth Edition), Fourth edition edn., pp. 369–483. Elsevier (2019). https://doi.org/10.1016/B978-0-12-815823-4.00009-2

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research

**12**, 2825–2830 (2011)Brier, G.W.: Verification of forecasts expressed in terms of probability. Monthly Weather Review

**78**(1), 1–3 (1950). https://doi.org/10.1175/1520-0493(1950)078<0001:VOFEIT>2.0.CO;2Platt, J., et al.: Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Advances in large margin classifiers

**10**(3), 61–74 (1999)Zadrozny, B., Elkan, C.: Transforming classifier scores into accurate multiclass probability estimates. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2002). https://doi.org/10.1145/775047.775151

Niculescu-Mizil, A., Caruana, R.: Predicting good probabilities with supervised learning. In: ICML 2005 - Proceedings of the 22nd International Conference on Machine Learning, pp. 625–632 (2005). https://doi.org/10.1145/1102351.1102430

Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.: Numerical Recipes in C. The Art of Scientific Computing, (1992)

Trotta, R.: Bayes in the sky: Bayesian inference and model selection in cosmology. Contemporary Physics

**49**(2), 71–104 (2008). arXiv:0803.4089 [astro-ph]. https://doi.org/10.1080/00107510802066753Oreshenko, M., Lavie, B., Grimm, S.L., Tsai, S.-M., Malik, M., Demory, B.-O., Mordasini, C., Alibert, Y., Benz, W., Quanz, S.P., Trotta, R., Heng, K.: Retrieval analysis of the emission spectrum of WASP-12b: Sensitivity of outcomes to prior assumptions and implications for formation history. Astrophys. J.

**847**(1), 3 (2017). https://doi.org/10.3847/2041-8213/aa8acfAstropy Collaboration, Price-Whelan, A.M., Sipőcz, B.M., Günther, H.M., Lim, P.L., Crawford, S.M., Conseil, S., Shupe, D.L., Craig, M.W., Dencheva, N., Ginsburg, A., VanderPlas, J.T., Bradley, L.D., PérezSuárez, D., de Val-Borro, M., Aldcroft, T.L., Cruz, K.L., Robitaille, T.P., Tollerud, E.J., Ardelean, C., Babej, T., Bach, Y.P., Bachetti, M., Bakanov, A.V., Bamford, S.P., Barentsen, G., Barmby, P., Baumbach, A., Berry, K.L., Biscani, F., Boquien, M., Bostroem, K.A., Bouma, L.G., Brammer, G.B., Bray, E.M., Breytenbach, H., Buddelmeijer, H., Burke, D.J., Calderone, G., Cano Rodríguez, J.L., Cara, M., Cardoso, J.V.M., Cheedella, S., Copin, Y., Corrales, L., Crichton, D., D ’Avella, D., Deil, C., Depagne, É., Dietrich, J.P., Donath, A., Droettboom, M., Earl, N., Erben, T., Fabbro, S., Ferreira, L.A., Finethy, T., Fox, R.T., Garrison, L.H., Gibbons, S.L.J., Goldstein, D.A., Gommers, R., Greco, J.P., Greenfield, P., Groener, A.M., Grollier, F., Hagen, A., Hirst, P., Homeier, D., Horton, A.J., Hosseinzadeh, G., Hu, L., Hunkeler, J.S., Ivezié, Ž., Jain, A., Jenness, T., Kanarek, G., Kendrew, S., Kern, N.S., Kerzendorf, W.E., Khvalko, A., King, J., Kirkby, D., Kulkarni, A.M., Kumar, A., Lee, A., Lenz, D., Littlefair, S.P., Ma, Z., Macleod, D.M., Mastropietro, M., McCully, C., Montagnac, S., Morris, B.M., Mueller, M., Mumford, S.J., Muna, D., Murphy, N.A., Nelson, S., Nguyen, G.H., Ninan, J.P., Nöthe, M., Ogaz, S., Oh, S., Parejko, J.K., Parley, N., Pascual, S., Patil, R., Patil, A.A., Plunkett, A.L., Prochaska, J.X., Rastogi, T., Reddy Janga, V., Sabater, J., Sakurikar, P., Seifert, M., Sherbert, L.E., Sherwood-Taylor, H., Shih, A.Y., Sick, J., Silbiger, M.T., Singanamalla, S., Singer, L.P., Sladen, P.H., Sooley, K.A., Sornarajah, S., Streicher, O., Teuben, P., Thomas, S.W., Tremblay, G.R., Turner, J.E.H., Terrón, V., van Kerkwijk, M.H., de la Vega, A., Watkins, L.L., Weaver, B.A., Whitmore, J.B., Woillez, J., 
Zabalza, V., Astropy Contributors: The Astropy Project: Building an Open-science Project and Status of the v2.0 Core Package. Astron. J.

**156**(3), 123 (2018). arXiv:1801.02634 [astro-ph.IM]. https://doi.org/10.3847/1538-3881/aabc4fCollette, A., Caswell, T.A., Tocknell, J., Kluyver, T., Dale, D., Scopatz, A., Jelenak, A., Valls, V., Kofoed Pedersen, U., Raspaud, M., jakirkham, Parsons, A., jialin, Chan, L., Paramonov, A., Hole, L., Feng, Y., Johnson, S.R., Brucher, M., Teichmann, M., Vaillant, G.A., de Buyl, P., Hinsen, K., Huebl, A., VINCENT, T., Dietz, M., Rathgeber, F., Billington, C., Kieffer, J., Wright, G.: h5py/h5py: 2.10.0. Zenodo (2019). https://doi.org/10.5281/zenodo.3401726

Hunter, J.D.: Matplotlib: A 2D Graphics Environment. Computing in Science and Engineering

**9**(3), 90–95 (2007). https://doi.org/10.3905/jpm.1997.409618Harris, C.R., Millman, K.J., van der Walt, S.J., Gommers, R., Virtanen, P., Cournapeau, D., Wieser, E., Taylor, J., Berg, S., Smith, N.J., Kern, R., Picus, M., Hoyer, S., van Kerkwijk, M.H., Brett, M., Haldane, A., del Río, J.F., Wiebe, M., Peterson, P., Gérard-Marchant, P., Sheppard, K., Reddy, T., Weckesser, W., Abbasi, H., Gohlke, C., Oliphant, T.E.: Array programming with NumPy. Nature

**585**(7825), 357–362 (2020). https://doi.org/10.1038/s41586-020-2649-2

## Funding

Open access funding provided by Università degli Studi di Roma La Sapienza within the CRUI-CARE Agreement. The authors acknowledge that this work has been supported by the ASI grant n. 2021.5.HH.0.

## Author information

### Contributions

Andrea Bocchieri wrote the main manuscript text and prepared all the figures. Lorenzo V. Mugnai provided the forward models for the analysis. All authors provided comments on the analysis. Andrea Bocchieri and Enzo Pascale edited the final manuscript. All authors read and approved the final manuscript.

## Ethics declarations

### Conflict of interest

The authors declare they have no conflict of interest.

## Appendix A Complementary figures

## Rights and permissions

**Open Access** This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

## About this article

### Cite this article

Bocchieri, A., Mugnai, L.V., Pascale, E. *et al.* Detecting molecules in *Ariel* low resolution transmission spectra. *Exp Astron* **56**, 605–644 (2023). https://doi.org/10.1007/s10686-023-09911-x

DOI: https://doi.org/10.1007/s10686-023-09911-x