Scientific motivation and objectives

Driving questions

Exoplanet research is a focal point in modern astrophysics and one of the long-term primary objectives is the investigation of the atmospheric properties of dozens of small and terrestrial exoplanets. This is partially driven by the goal to search for and identify potentially habitable or even inhabited exoplanets. In addition, such a dataset is invaluable for investigating and understanding the diversity of planetary bodies. Exoplanet science is already omnipresent on the roadmaps of all major space agencies. However, none of the currently selected missions, neither in Europe nor in the US, will be able to deliver the above-mentioned comprehensive dataset of terrestrial exoplanet atmospheres as we will further detail below. The same is true for current and future ground-based observatories including the 30–40 m Extremely Large Telescopes (ELTs). Therefore, tackling a prime objective of exoplanet science and understanding how unique or common planets like our Earth are in our galactic neighborhood will require a new, dedicated approach. Now is the right time to start investigating how a statistically relevant number of terrestrial exoplanet atmospheres can be analyzed and discuss how the guiding scientific objectives should be formulated. In fact, thanks to NASA’s Kepler and TESS missions, ESA’s upcoming PLATO mission, and ongoing and future radial velocity (RV) surveys from the ground, by 2030 we will have a robust statistical understanding of the occurrence rate of terrestrial exoplanets and their radius, mass, and period distributions out to the habitable zone around main sequence stars, and we will have identified dozens of exoplanets in the immediate vicinity of the Sun including potentially habitable ones.

The next logical step is to address the following questions:

(Q1) How many exoplanets exhibit (atmospheric) signatures of potential biological activity?

(Q2) What fraction of terrestrial exoplanets provide (surface) conditions so that liquid water and life as we know it could in principle exist?

(Q3) How diverse are (terrestrial) exoplanet atmospheres in their composition across a range of relevant parameters (e.g., planet mass and radius, host star spectral type, orbital period) and how does this compare to theories for planet and atmosphere formation and evolution?

The sequence of these questions is deliberately chosen to go from the most specific (the search for biosignatures) to the most general (atmospheric diversity).

The mid-infrared opportunity

While in-situ and/or fly-by measurements can in principle be carried out for Solar System bodies, this is not possible for exoplanets because of their distance. Instead, we have to rely on remote sensing techniques. These include investigations in reflected light (at optical and near-infrared (NIR) wavelengths), transmitted light (if the planet is transiting in front of its host star), or thermal emission (either through secondary eclipse, phase curve measurements or spatially resolved observations, all done at NIR to mid-infrared (MIR) wavelengths). We will argue in the following that spatially resolved observations in the MIR that aim to detect exoplanet thermal emission spectra are likely the most promising and powerful approach to address Q1-Q3 listed above and are hence the focus of this White Paper. The scientific potential of studying (terrestrial) exoplanets in reflected light is discussed in the complementary White Paper by Snellen et al.

For mature (>Gyr old) planets orbiting in the inner few AU around their host stars the energy budget of their atmospheres is typically dominated by the absorption and re-radiation of stellar energy. The temperature structure of the atmosphere, i.e., the temperature as a function of height or pressure, is a key diagnostic and a driver of chemistry and climate. An emission spectrum encodes information about this temperature structure as well as the re-radiated luminosity of the planet, which – in combination with the observed effective temperature – strongly constrains the planet radius. Taking the Earth as reference, certain atmospheric windows in the MIR may even allow a direct measurement of the surface temperature of a terrestrial exoplanet (e.g., [1]). Furthermore, the MIR wavelength regime offers an unparalleled diagnostic potential to determine the atmospheric composition as multiple major molecules required to explore planetary conditions present strong absorption bands in the MIR. Thermal emission observations are also less influenced by (though not insensitive to) the presence of clouds (e.g., [2]). Mitigating the role of uncertain cloud properties is imperative to our understanding of atmospheric composition. In particular for the detection of biosignature gases and chemical disequilibrium, the MIR is an ideal spectral regime (e.g., [1, 3]). The MIR includes absorption bands from ozone (O3) and methane (CH4) and the presence of both molecules in the Earth’s atmosphere, which unless continuously replenished would quickly react with each other, can only be explained by biological activity. Furthermore, nitrous oxide (N2O), another potential biosignature present in the Earth’s atmosphere, has strong bands in the MIR, but, similar to methane, no equally strong bands at optical or NIR wavelengths. 
Molecular oxygen (O2) can be detected in the optical regime (around 760–765 nm) and not in the MIR, but interpreting oxygen as a potential biomarker requires additional contextual information [4]. The unique richness of the MIR spectral range in general, but in particular in the context of potential biosignature gases, is summarized in Fig. 1.

Fig. 1

Molecules relevant to terrestrial planet characterization between 3 and 20 μm (adapted from Catling et al. [5] and Schwieterman et al. [3]). Potential biosignature gases are listed in the top-half (green background) and other possible constituents in the lower half (yellow background). Strong bands in Earth’s modern spectrum are highlighted in red. In particular the existence of significant CH4 and N2O bands is important to highlight as they have no strong counterparts at optical or NIR wavelengths. We also note the strong CO band around 4.67 μm, which can serve as an “anti-biosignature” gas under certain circumstances

The following three sub-sections provide more quantitative information regarding the driving questions listed in section 1.1 above. Section 1.2.1 deals with the admittedly hypothetical case of observing an Earth-twin and searching for biosignatures (Q1). In section 1.2.2 we discuss how one could statistically address the fundamental question of how common or rare terrestrial exoplanets with Earth-like (surface) conditions are (Q2). Finally, in section 1.2.3 we show how many exoplanets covering a broad range of sizes and orbital periods could be detected by an MIR exoplanet imaging mission in order to investigate atmospheric diversity in the most general sense (Q3).

Atmospheric modeling and retrieval of key parameters for an Earth-twin (Q1)

We carried out a spectral retrieval study (see, e.g., [6], for a recent review on atmospheric retrieval for exoplanets), where Earth’s emission spectrum was modeled over a certain wavelength range assuming a certain spectral resolution (R) and signal-to-noise (SNR) per resolution element (cf. [7, 8]). Taking such a model spectrum, i.e., a simulated observation of an Earth-twin, as input, the retrieval framework allows us to derive posterior distributions for key atmospheric parameters including the pressure-temperature profile, abundances of key atmospheric constituents as well as the radius of the planet. We acknowledge that the choice of an Earth-twin exoplanet is a special case and it seems unlikely that the formation and subsequent evolution processes that led to the Earth’s present-day atmosphere took place also on another nearby exoplanet. However, we argue that any mission that aims to detect potential biosignatures in an exoplanet’s atmosphere should be able to identify such signatures in our Earth’s atmosphere, the only planet we know to harbor life. Hence, taking an Earth-twin as starting point, as is also frequently done in other publications (e.g., [2, 9, 10]), appears justifiable.

Our results from one specific simulation are shown in Fig. 2. Here, we assumed a wavelength range of 3–30 μm, R = 100 and SNR = 20; however, the results would hardly change if the wavelength range were limited to 3–20 μm. The values for R and SNR were chosen to quantitatively compare the results to those derived in a published retrieval study by Feng et al. [12] who investigated the diagnostic potential of reflected light observations of an Earth-twin in the 0.4–1.0 μm range (Footnote 1).

Fig. 2

Results of a retrieval study of an Earth-twin atmosphere observed over a wavelength range of 3–30 μm with R = 100 and SNR = 20. The inlay in the top right corner shows the model in blue, the simulated observational data in black and a black-body curve representing surface emission in red. Below, the corresponding pressure-temperature profile is plotted with the model again in blue, the best-fit retrieved profile in black and the red band indicating the 68% confidence range. The corner plot shows the marginalized distributions for the abundance of major molecules detected in the atmosphere as well as for the radius of the planet, all of which were free parameters in the retrieval (with flat priors). All parameters are well constrained. We note that the planet’s mass was also a free parameter in the retrieval analysis (with a flat prior confined to the range 0.1–10 M⊕). As expected, it could not be accurately estimated from the spectrum alone and complementary RV measurements or an empirically calibrated mass-radius relationship (e.g., [11]) would be needed to provide constraints

Figure 2 clearly demonstrates that if one were able to obtain a high-quality MIR spectrum of an Earth twin, the atmospheric composition, but also surface pressure, surface temperature, and planetary radius could be constrained with high precision and accuracy.

This is further illustrated in Fig. 3 where the best-fit values as well as the 68% confidence range for the retrieved parameters are plotted (red data points) and compared to the ‘ground-truth’, i.e., the input value in the simulated spectrum (black lines). In all cases, the input value is well within the 68% confidence range and this confidence range typically corresponds to a factor of 3 (∼0.5 dex) uncertainty for the molecular abundances. Furthermore, the radius is constrained to within a few percent, the surface pressure to 0.1 dex, and the surface temperature to better than 5 K. We emphasize that the radius was determined solely from the emission spectrum and no additional information from, e.g., transit observations, was required. With such a dataset in hand, assessing the atmospheric conditions – and potentially the surface conditions – of this planet would easily be feasible. The existence of atmospheric biosignatures could be detected with high confidence.

Fig. 3

Comparison of the retrieval results for the simulated thermal emission spectrum shown in Fig. 2 and those found in Feng et al. [12] for reflected light to the input values used for the different parameters in the atmospheric models. The black vertical lines represent the ‘true’ value for each parameter, the yellow areas indicate the ±0.5 dex range for the atmospheric constituents and for the surface pressure, the ±0.3 R⊕ range for the planet radius and the ±10 K range for the surface temperature. The red and blue data points show the location of the retrieved best-fit parameters, for thermal emission and reflected light, respectively, and the error bars indicate their 68% confidence range

In addition to the results from our retrieval study, Fig. 3 also shows the same analysis using the results found by Feng et al. [12] for reflected light (blue data points) instead of thermal emission. The key take-away messages from this comparison are the following:

(a) The atmospheric abundances that can be constrained using thermal emission are retrieved with similar accuracy and precision as those accessible in reflected light.

(b) The biosignature gases CH4 and N2O are only accessible in the thermal emission spectrum as is the anti-biosignature gas CO.

(c) O2 is only accessible in reflected light, but in the MIR emission spectrum O3, which is an atmospheric by-product of O2, serves as a robust proxy for the existence of oxygen.

(d) The constraints on the planetary radius are much stronger in thermal emission (uncertainties are of order 5%) compared to reflected light (uncertainties are of order 30%) as there is a degeneracy between the planet’s albedo and its radius in reflected light.

(e) While surface temperature and pressure can be well constrained with thermal emission spectra, reflected light does not provide immediate information about surface temperature and constraints on pressure are weaker.

This comparison demonstrates the enormous characterization potential contained in the thermal emission spectra of terrestrial exoplanets. Not included in this analysis is the possibility of detecting N2-N2 collision-induced absorption features in the MIR (see Fig. 1; [13]), which could help constrain the absolute abundances of the different components in N2-dominated atmospheres. For reflected light studies at optical wavelengths, surface features such as ‘glint’ from water oceans could be detected under certain conditions (e.g., [14]) and the so-called ‘red-edge’, the specific spectral reflectance of chlorophyll in plants, is also observable, at least in the case of the Earth (e.g., [15]).

Statistical significance and possible null-result (Q2)

Already the detection of a single exoplanet spectrum with clear indications of biologically induced disequilibrium chemistry would be a breakthrough result that warrants special care in interpreting [5]. However, one must be prepared for a null-result and a future mission should be able to address more general – and scientifically equally important – questions related to the population of (terrestrial) exoplanets, their atmospheres and climates. Q2 and Q3 listed above capture such questions and require a sample of exoplanets to be investigated. In particular for Q2 the sample size should be defined in such a way that the null-result, i.e., that none of the planets that are characterized turns out to provide conditions similar to those on Earth, is a major scientific result and robustly quantifies the rareness of “Earth-like” planets. A possible approach is to re-formulate this question in the following hypothesis:

“The fraction of terrestrial exoplanets that reside in the empirical habitable zone around their host star and provide conditions for liquid water to exist is ηhab.”

We emphasize that ηhab is not identical to the commonly used η⊕ parameter that quantifies the fraction of stars that harbor terrestrial exoplanets in their habitable zone and ηhab is also ignorant of the potential existence of biosignatures in the atmospheres. In the context of the hypothesis above we define “terrestrial exoplanets” as planets with a radius Rplanet with 0.5 R⊕ ≤ Rplanet ≤ 1.5 R⊕ and the “empirical habitable zone” as the separation range where the incoming stellar insolation Splanet is 0.35 S⊕ ≤ Splanet ≤ 1.75 S⊕ with S⊕ being the solar constant, i.e., the average insolation received at Earth (cf. [16]). The small end of the radius range is defined by the size of Mars, which is assumed to be the minimum planet size/mass that can retain an atmosphere, and the large end by the transition between rocky and gas dominated planets [11, 17, 18]. The insolation range is defined by the so-called “Early Mars limit” at the outer edge and the “Recent Venus limit” at the inner edge [19]. We acknowledge that the concept of “(exoplanet) habitability” is complex (e.g., [20]) and extends beyond the radius and insolation ranges considered here (e.g., [21, 22]). However, similar to the case of the Earth-twin considered in section 1.2.1, we think that our choices are a reasonable and justifiable starting point that can be modified going forward as our understanding of “habitability” and exoplanet properties further progresses.

For the radius and insolation ranges considered here, two planets in the Solar System, Earth and Mars, qualify. Hence, the fraction of terrestrial exoplanets that reside in the empirical habitable zone in the Solar System and allow for the existence of liquid water is one (Earth) out of two, i.e., ηhab = 50%.
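The selection just described can be written down as a short sketch. The radius and insolation cuts are the ones quoted in the text; the Solar System values (in Earth units) are rounded textbook numbers added here for illustration, and the function and variable names are hypothetical.

```python
# Illustrative sketch only: apply the radius and insolation cuts from the text
# to the inner Solar System planets. Planet values are rounded and are an
# assumption of this example, not taken from the paper.
R_MIN, R_MAX = 0.5, 1.5     # planet radius range in Earth radii
S_MIN, S_MAX = 0.35, 1.75   # insolation range in units of the solar constant


def in_sample(radius, insolation):
    """True if a planet is 'terrestrial' and inside the empirical habitable zone."""
    return R_MIN <= radius <= R_MAX and S_MIN <= insolation <= S_MAX


# name: (radius in Earth radii, insolation in Earth units)
SOLAR_SYSTEM = {
    "Mercury": (0.38, 6.67),
    "Venus": (0.95, 1.91),
    "Earth": (1.00, 1.00),
    "Mars": (0.53, 0.43),
}

qualifying = [name for name, (r, s) in SOLAR_SYSTEM.items() if in_sample(r, s)]

# Of the qualifying planets, only Earth has stable surface liquid water.
eta_hab_solar_system = 1 / len(qualifying)
print(qualifying, eta_hab_solar_system)
```

Venus falls just inside of the inner insolation limit and Mercury fails both cuts, which leaves the two planets named above.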

In Fig. 4 we show the significance that a certain value for ηhab can be rejected in case a certain number of planets (50, 30, 20, or 10) were observed and none of them provides favorable conditions. These numbers are based on Poisson statistics and assume that the occurrence of terrestrial exoplanets orbiting in the habitable zones of different stars can be treated as statistically uncorrelated events. It shows that, if 30 planets are observed, ηhab = 20% and ηhab = 50% can be rejected with ≈3σ and ≈5σ, respectively. For 50 planets, ηhab = 10% and ηhab = 30% can be rejected with the same confidence levels. These results suggest that, in case of a null-result, several tens of planets would be required in order to derive statistically significant limits on the rareness of “Earth-like” planets.
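The quoted significance levels can be reproduced approximately with a few lines of Python. This is a minimal sketch, assuming that the probability of a null-result is the Poisson probability of zero events with expectation N × ηhab and that this probability is converted into a two-sided Gaussian-equivalent significance. The conversion convention and the function name are assumptions of this example; the text does not specify them.

```python
import math
from statistics import NormalDist


def null_result_significance(n_planets, eta_hab):
    """Gaussian-equivalent significance (in sigma) for rejecting eta_hab
    when none of n_planets shows favorable conditions.

    Assumes Poisson statistics (as stated in the text) and a two-sided
    conversion from probability to sigma (an assumption of this sketch).
    """
    p_zero = math.exp(-n_planets * eta_hab)  # Poisson probability of zero events
    return -NormalDist().inv_cdf(p_zero / 2.0)


for n in (10, 20, 30, 50):
    row = [round(null_result_significance(n, eta), 1) for eta in (0.1, 0.2, 0.3, 0.5)]
    print(n, row)
```

Under these assumptions, the combinations N = 30, ηhab = 50% and N = 50, ηhab = 30% give the same significance, because both have the same expectation value of 15 planets.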

Fig. 4

The statistical power of a null-result: in case 50 (red curve), 30 (blue curve), 20 (green curve), or 10 (yellow curve) exoplanets with radii between 0.5 R⊕ and 1.5 R⊕ and receiving between 0.35 and 1.75 times the insolation of the Earth are investigated with high-quality thermal emission spectra and not a single one is found to support conditions that allow for the existence of liquid water, then the null-hypothesis – shown on the x-axis – can be rejected with the significance shown on the y-axis. In the Solar System, one out of two planets within the empirical habitable zone provides (surface) conditions for liquid water to exist; hence, ηhab = 50% for the Solar System

To date we do not yet have a large enough sample of exoplanets detected that fulfill the criteria used in the analysis (cf. section 4.2). However, steady progress is being made by ongoing surveys and missions to increase the number of relevant planets. Alternatively, a mission that can address Q1 and Q2 and the objectives formulated above could be split in two phases: (1) a “search phase”, aiming at quickly detecting a sufficient number of planets in the above-mentioned radius and insolation range; and (2) a “characterization phase”, where a sub-set of the detected planets would be re-observed and investigated with high SNR in sufficient detail. To make the search phase time-efficient, a broad-band photometry mode, e.g., by collapsing the MIR spectra over certain wavelength ranges to increase the SNR, could be applied. As we will detail below, it is important to mention that during the search phase many more planets, with properties outside the parameter range defined above, would be detected “for free”. It turns out that if the occurrence rates of exoplanets (including their radius and period distributions), as found by NASA’s Kepler mission, are applicable to exoplanets orbiting stars in the vicinity of the Sun, several hundred planets may be detected.

Atmospheric diversity and total planet yield (Q3)

Similar to the diversity in planet radii and orbital periods, as revealed by NASA’s Kepler mission, we can expect a great diversity in atmospheric properties of (terrestrial) exoplanets (e.g., [23]). It is hence important to understand how many exoplanets in general, i.e., over a large region in the radius vs. stellar insolation parameter space, could be detected, e.g., during the search phase of an MIR exoplanet imaging mission. The detection of a large sample would enhance the value of the mission and the potential science legacy by enabling the exploration of (unbiased) planetary system architectures and the constraints they put on planet formation theory and atmospheric models. At the same time, the feasibility of detecting tens of “Earth-like” exoplanets as defined above in section 1.2.2 needs to be investigated.

We hence updated Monte Carlo simulations that were first presented in Kammerer & Quanz [24] and Quanz et al. [25] to quantify the exoplanet yield during the search phase mentioned above. The technical specifications used in these simulations are based on earlier concept studies for a space-based MIR interferometry mission [26], but updated with more recent estimates for sensitivity limits similar to those of the MIRI instrument on the James Webb Space Telescope (JWST). For more details on sensitivity and spatial resolution requirements we refer to section 2.2. For the underlying planet population around FGK stars, in terms of planet occurrence rate and radius and period distributions, we used the statistics published by NASA’s working group SAG13 (Footnote 2). These statistics were also used for recent studies for reflected light missions [27]. For M stars the statistics from Dressing & Charbonneau [28] were used. In general, Kepler and other ongoing missions and projects have impressively demonstrated that planetary systems are ubiquitous, including exoplanets close to or in the habitable zone (e.g., [29,30,31,32]). We now know that planets with sizes in the Earth and Super-Earth regime populate nearly every star and that systems consisting of multiple planets are very abundant (e.g., [33]).

The stellar sample we used in our simulations consisted of 320 F, G, K, and M stars all within 20 pc from the Sun [24, 34] (Footnote 3). This is only a small subset of all stars within this distance limit (see Fig. 5) and refining the possible target sample is the subject of ongoing work. One important piece of information that needs to be carefully considered is the number of binary star systems that can in principle be included. Stellar binarity, over a certain range of separations, has been shown to have a negative impact on the occurrence of small planets [35]. A first rough estimate would suggest that no more than 20–30% of the stars shown in Fig. 5 should be eliminated, leaving more than 600 potential targets within only 15 pc.

Fig. 5

All known GKM main-sequence stars within 15 pc from the Sun as a function of their apparent magnitude MG (in the Gaia filter). The left y-axis shows the number per bin in the black histogram, the right axis the cumulative number indicated by the blue line (∼900 objects in total). The axis on the top shows the approximate location of different spectral types relative to G magnitudes. In addition to the stars shown here, there are 21 F-type stars known within 15 pc

For each target star that we consider in our simulations, 5000 planetary systems were created with properties randomly drawn from the distributions mentioned above and randomly oriented orbits and planetary positions thereon. Based on their apparent separation and estimated flux levels (from randomly drawn Bond albedos and assuming black-body emission) we count all planets that are detectable according to the technical specifications (see section 2.2). We assume that the search phase could be carried out simultaneously in broad spectral bands centered at 5.6, 10, and 15 μm (e.g., by collapsing the observed spectra around these wavelength ranges) and that all 320 stars are observed for an equal amount of time (35,000 s). In total this amounts to less than 0.5 years and hence it seems realistic that even if overheads (e.g., slewing from one star to the next) and additional noise sources (e.g., stellar leakage or extra-zodiacal light) are considered, a potential search phase would not exceed 3 years. Figure 6 shows the results: the total number of detectable planets exceeds 400 and these planets cover a broad range of radii (0.5–6.0 R⊕) and stellar insolation levels (≈0.1–1000 S⊕). In fact, as shown in the left panel of Fig. 6, in each of the radius bins covering 0.5–1.25, 1.25–2.0, and 2.0–4.0 R⊕ more than 120 exoplanets should be detectable. Such a database would be an excellent starting point to address Q3 listed above, the diversity of planetary atmospheres. Furthermore, among those > 400 exoplanets, the number of detectable terrestrial planets located in the empirical habitable zone, as defined in section 1.2.2, is ∼30, which is the minimum number of planets needed for a meaningful statistical interpretation of a null-result (cf. Fig. 4). This number can be further increased by optimizing the distribution of observing time across the stellar target list in order to maximize the number of detectable terrestrial planets located in the empirical habitable zone (cf. [36]). Further optimization potential lies in the selection of the stellar target sample.
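The time budget quoted in this paragraph is simple arithmetic and can be checked with a short sketch. The number of stars and the integration time per star are taken from the text; the variable names and the use of a Julian year are assumptions of this example.

```python
# Illustrative arithmetic only: pure on-source integration time of the search phase.
N_STARS = 320                 # number of target stars (from the text)
T_PER_STAR_S = 35_000         # integration time per star in seconds (from the text)
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # Julian year, an assumption of this sketch

total_years = N_STARS * T_PER_STAR_S / SECONDS_PER_YEAR

# Factor by which overheads and extra noise could inflate the integration time
# before a 3-year search phase is exceeded.
margin_factor = 3.0 / total_years

print(f"on-source time: {total_years:.2f} yr, margin to 3 yr: x{margin_factor:.1f}")
```

The pure integration time comes out well below the 0.5 years quoted above, which leaves a margin of several times that amount for overheads within a 3-year search phase.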

Fig. 6

Estimated exoplanet yield in a hypothetical 3-year search phase for a space-based MIR exoplanet imaging mission targeting 320 stars within 20 pc and assuming planet statistics as derived by NASA’s Kepler mission (cf. [24]). Left: Number of expected exoplanet detections per bin in the radius vs. stellar insolation plane. Right: The integrated number of expected exoplanets (423) broken down into sub-sets that are either detected in only one, two, or all three of the assumed bands centered at 5.6, 10, and 15 μm. Planets detected around FGK stars are shown in orange, planets around M stars in red bars

The need for a large space mission

In the previous sections we already alluded to a possible mission scenario that would enable the science described above: an MIR exoplanet mission that would consist of (1) a search phase to detect a large sample of exoplanets in certain broad spectral bands, and (2) a follow-up characterization phase to investigate the (atmospheric) properties of a sub-set of these exoplanets with high-fidelity MIR low-resolution spectroscopy over the 3–20 μm range. In the following we argue that this can only be achieved from space in the context of a large mission.

Why space?

Detecting thermal emission from celestial bodies between 3 and 20 μm wavelength from the ground is severely hampered by the thermal background emission caused by the Earth’s atmosphere (>100 Jy/arcsec² for wavelengths >8 μm on Cerro Paranal, ESO’s site for the Very Large Telescope (VLT); [37]), the telescope’s primary mirror, and any additional non-cryogenic optical component in the light-path. For comparison, the Earth’s emission spectrum peaks around 10 μm (Fig. 2), but seen from a distance of 10 pc our planet emits only ∼0.4 μJy at this wavelength [1]. Consequently, even with the upcoming 30–40 m ELTs, only a handful of terrestrial exoplanets around the very nearest stars will be detectable at 10 μm wavelength within a reasonable amount of observing time [38]. Furthermore, the science described above requires a continuous wavelength coverage to measure the luminosity of the exoplanets and search for various atmospheric key components. In particular the water band at 6.2 μm, and methane (CH4) and nitrous oxide (N2O) at 7.7 μm and 7.8 μm, respectively, are not detectable from the ground. The two latter molecules are, however, important biosignatures (Fig. 1) that are key for addressing Q1. In conclusion, achieving the science goals of this White Paper requires going to space.
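The order of magnitude of the quoted flux can be checked with a simple black-body estimate. This is a rough sketch, not the calculation behind the cited value: it assumes that the Earth radiates at 10 μm like a black body at a surface temperature of 288 K seen as a uniform disc, and it ignores atmospheric absorption. All names are hypothetical.

```python
import math

# Physical constants (SI units)
H = 6.62607015e-34      # Planck constant, J s
C = 2.99792458e8        # speed of light, m/s
K_B = 1.380649e-23      # Boltzmann constant, J/K

R_EARTH_M = 6.371e6     # mean Earth radius, m
PARSEC_M = 3.0857e16    # one parsec, m


def planck_nu(wavelength_m, temperature_k):
    """Black-body spectral radiance B_nu in W m^-2 Hz^-1 sr^-1."""
    nu = C / wavelength_m
    return 2.0 * H * nu**3 / C**2 / math.expm1(H * nu / (K_B * temperature_k))


def blackbody_flux_ujy(wavelength_m, temperature_k, radius_m, distance_m):
    """Flux density in micro-Jansky of a uniform black-body disc."""
    solid_angle = math.pi * (radius_m / distance_m) ** 2   # sr
    flux_si = planck_nu(wavelength_m, temperature_k) * solid_angle
    return flux_si / 1e-26 * 1e6                            # 1 Jy = 1e-26 W m^-2 Hz^-1


# Assumption of this sketch: 288 K black body, observed at 10 micron from 10 pc.
flux_ujy = blackbody_flux_ujy(10e-6, 288.0, R_EARTH_M, 10 * PARSEC_M)
print(f"{flux_ujy:.2f} micro-Jy")
```

Under these assumptions the estimate lands within roughly 10% of the value cited in the text, many orders of magnitude below the atmospheric background quoted for Cerro Paranal.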

Why L-class?

The direct detection of dozens of (small) exoplanets around nearby stars in the MIR wavelength regime requires both high spatial resolution and high sensitivity. An Earth-like planet seen at 10 pc has an apparent separation of only 0.1′′ from a solar-type host star and planets orbiting around M-stars must be even closer to their stars in order to be located in the habitable zone. Even JWST with its 6.5 m primary mirror does not provide sufficient spatial resolution (∼0.45′′ at 10 μm). Furthermore, the low flux levels from terrestrial exoplanets (see above the example of Earth) set some important constraints on the collecting area of the telescope. To our knowledge, the only way to achieve both sufficient spatial resolution and sensitivity is a mid-infrared nulling interferometer. It would consist of several spacecraft (‘collector telescopes’) that together provide enough collecting area, but are sufficiently far separated from each other so that the baselines between them provide the required spatial resolution. The beam combination would be done in a separate spacecraft where light from the central star interferes destructively (‘nulling’) so as to provide sufficient contrast to detect the much fainter signal from the planets (the planet-to-star flux ratio is typically of order 10⁻⁷–10⁻⁶ at 10 μm).
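The resolution argument can be illustrated with a back-of-the-envelope sketch. It assumes the small-angle relation between orbital separation and distance, and it uses θ ≈ λ/D as a rough resolution criterion for a single filled aperture. Both the criterion and the names below are simplifying assumptions of this example; in practice, detecting a planet next to a bright star requires more than marginal resolution.

```python
import math

ARCSEC_IN_RAD = math.pi / (180.0 * 3600.0)   # one arcsecond in radians


def angular_separation_arcsec(semi_major_axis_au, distance_pc):
    """Small-angle separation: 1 AU seen from 1 pc subtends 1 arcsec by definition."""
    return semi_major_axis_au / distance_pc


def aperture_for_resolution_m(wavelength_m, theta_arcsec):
    """Aperture diameter D with lambda / D equal to the requested angle (rough criterion)."""
    return wavelength_m / (theta_arcsec * ARCSEC_IN_RAD)


separation = angular_separation_arcsec(1.0, 10.0)             # Earth-Sun analog at 10 pc
required_diameter = aperture_for_resolution_m(10e-6, separation)

print(f"separation: {separation:.2f} arcsec")
print(f"single aperture needed at 10 micron: {required_diameter:.1f} m")
```

Under this criterion a single telescope would need a diameter of roughly 20 m merely to match the separation of an Earth analog at 10 pc in the MIR, which is one way to see why separated collector telescopes are attractive.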

For the exoplanet yield estimate in section 1.2.3 we assumed a nulling baseline between the collector telescopes of up to ∼170 m. As a starting point for the achievable sensitivity limits, we took those from JWST/MIRI [39], but reduced the overall instrument throughput by a factor of 3.5 (an interferometer will likely have a lower throughput than a single-dish instrument, cf. [24]). This means, however, that implicitly a collecting area similar to the effective aperture size of JWST was assumed. In case of 4 collector telescopes this translates into individual aperture sizes of the order 2.5 m in diameter (note: the primary mirror of the Hubble Space Telescope is 2.4 m and that of the Herschel Space Observatory was 3.5 m). The smaller the combined aperture size of the collector telescopes (or the smaller the number of collector telescopes), the longer the mission search phase to detect a large sample of exoplanets and the longer also the time needed to do high-SNR follow-up observations for in-depth atmospheric characterization. Indeed, decreasing the aperture size of the 4 collector telescopes from 2.5 m to 1.5 m or 0.75 m would reduce the total planet yield during the search phase from 423 (see section 1.2.3) to 281 or 155, respectively; a 2-telescope interferometer with apertures of a few tens of centimeters may only be able to search for planets around a few nearby stars [40]. In these cases, and because high signal-to-noise follow-up observations during the characterization phase would also require significantly more observing time, a large exoplanet sample that allows for a robust statistical analysis, as needed to address Q2 and Q3, would be more difficult to obtain.

This suggests that, very likely, a mission designed to address the science objectives described above can only be implemented in the framework of an ESA L-class mission. However, given the ambitious science goals and unique scientific capabilities of such a mission delivering truly ground-breaking results, other international partners might be interested in a joint effort (see also below).

Timing for a space-based mid-infrared interferometer mission

The idea for a space-based infrared interferometer for exoplanet science is not new. In fact, on both sides of the Atlantic, mission concepts were studied between the late 90s and the mid 2000s (Darwin on ESA’s side and TPF-I on NASA’s side). While perceived as too risky and ahead of their time back then, the landscape for such a mission has completely changed – as we will further detail below – and the timing of ESA’s “Voyage 2050” long-term plan could not be more ideal. One of the most fundamental scientific results from the past years, with immediate relevance for the science discussed here, came from NASA’s Kepler mission, namely that the statistical occurrence rate of small (terrestrial) planets around solar-type and also M-type stars is extremely high, with – on average – more than one planet orbiting each star. This led to a number of new exoplanet missions to be proposed to both ESA and NASA, possibly culminating in a new exoplanet-driven flagship mission on the US side aiming at the direct detection of (small) exoplanets in reflected light (see section 3.2; see also White Paper by Snellen et al.). However, while the recently published “Exoplanet Science Strategy” report from the US National Academies of Sciences (Footnote 4) puts a strong emphasis on future missions detecting planets in reflected light in a first step, mid-infrared interferometry is considered to be key in the long-run. One of the findings states: “Technology development support in the next decade for future characterization concepts such as mid-infrared (MIR) interferometers [...] will be needed to enable strategic exoplanet missions beyond 2040.” Even more important is the following statement: “That said, the common (although often unspoken) belief is that such a nulling, near-infrared (NIR) interferometer would be a necessary follow-up to any reflected light direct imaging mission, as detecting the exoplanet in thermal emission is not only required to measure the temperature of the planet, but is also needed to measure its radius, and so (with an astrometric or radial velocity detection of [...] the mass of the planet) measure its density and thus determine if it is truly terrestrial.” In combination with the results shown in Fig. 3, underlining the unique characterization potential of thermal emission spectra, these statements re-emphasize not only the scientific importance, but also the timeliness of a space-based infrared nulling interferometer in the context of ESA’s “Voyage 2050” long-term plan. We note that it is not necessarily required that a reflected light mission be carried out first, as shown by our yield analysis above. Finding and characterizing dozens of terrestrial exoplanets is feasible with a large MIR interferometer mission alone. It is, however, also clear that combining reflected light and thermal emission data for a given exoplanet expands the characterization potential significantly (see, e.g., section 1.2.1).

Exoplanet science in the 2030s–2040s

Since the first detection of a planet orbiting a main-sequence star other than our Sun [41], the field of exoplanet science has been growing at a breathtaking speed: to date we know more than 5000 exoplanets and exoplanet candidates (Footnote 5). The overwhelming majority of these objects were detected via dedicated long-term surveys using indirect techniques (the RV or the transit technique) from both the ground (e.g., the HARPS survey or the California Planet Survey) and from space (e.g., NASA’s Kepler mission). Thanks to the statistics derived from these surveys, we have a first quantitative understanding of the occurrence rate of different planet types as a function of their radius / mass, orbital period and also spectral type of the host star (e.g., [42,43,44]). In addition, we can put constraints on where the transition occurs between rock-dominated exoplanets and gas/atmosphere-dominated exoplanets (e.g., [11, 17, 18]). The RV and transit techniques were also the techniques that revealed the first rocky exoplanets orbiting within or close to the habitable zone of their (very) low-mass host stars located in the Solar neighborhood [29, 31]. For some exoplanets, transit spectroscopy and/or secondary eclipse measurements (primarily done from space with the Hubble Space Telescope and the Spitzer Space Telescope) provide empirical constraints on their atmospheric composition (e.g., [45, 46]). With a few exceptions (e.g., [47, 48]), up to now these investigations have primarily targeted so-called hot Jupiters, gas-giant planets on orbits with periods of only a few days. In the following we summarize what developments from ground and space we can expect in the coming ∼20 years. The focus will be on developments with immediate relevance for the science proposed here, i.e., the direct detection and (atmospheric) characterization of terrestrial exoplanets using thermal emission spectra.

Expected developments on the ground

RV: Ongoing (large) programs with high-precision, high-resolution spectrographs (e.g., HARPS, HIRES) continue to search for and detect exoplanets over a range of masses and orbital periods. New spectrographs specifically designed to detect small, rocky planets either around nearby, red, low-mass stars (e.g., CARMENES, SPIRou) or even aiming at reaching the detection threshold for an Earth-twin around a solar-type star of ∼0.1 m/s (e.g., ESPRESSO) are now in operation and have the potential to reveal new (small) exoplanets in the vicinity of the Sun (see also section 4.2). More spectrographs (e.g., NIRPS, HARPS3, EXPRES, NEID) with similar capabilities and science goals are currently in development and will support the search for low-mass, rocky planets in the solar neighborhood. Coupling a high-resolution spectrograph with an adaptive optics system on an 8 m telescope may, in principle and under the best conditions, allow for the detection of the nearest exoplanet, Proxima b, in reflected light [49]. Similar science is expected for a few more objects with upcoming high-resolution spectrographs in the era of 30–40 m Extremely Large Telescopes (ELTs) (e.g., [50, 51]).
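The ∼0.1 m/s threshold quoted above can be checked with the standard expression for the stellar reflex velocity on a circular orbit. The following is a minimal sketch, not taken from the White Paper: the function name, the rounded physical constants, and the assumption of an edge-on circular orbit are ours.

```python
import math

# Rounded physical constants (assumed values, SI units)
G = 6.674e-11            # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30         # solar mass, kg
M_EARTH = 5.972e24       # Earth mass, kg
YEAR = 365.25 * 86400.0  # one year, s


def rv_semi_amplitude(m_planet, m_star, period, sin_i=1.0):
    """Stellar RV semi-amplitude (m/s) for a circular orbit.

    K = (2 pi G / P)^(1/3) * m_p sin(i) / (M_star + m_p)^(2/3)
    """
    return ((2.0 * math.pi * G / period) ** (1.0 / 3.0)
            * m_planet * sin_i
            / (m_star + m_planet) ** (2.0 / 3.0))


# Earth twin: one Earth mass, one-year period, solar-mass host, edge-on orbit
k_earth = rv_semi_amplitude(M_EARTH, M_SUN, YEAR)
print(f"Earth twin around a Sun-like star: K = {k_earth:.3f} m/s")
```

Under these assumptions the semi-amplitude comes out slightly below 0.1 m/s, consistent with the detection threshold quoted in the text.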

Transit: The majority of ground-based exoplanet transit searches (e.g., WASP, KELT, MASCARA, HAT, TrES) are focusing on the detection of hot and warm gas giant planets. However, other projects (e.g., NGTS, MEarth, TRAPPIST, SPECULOOS) are designed to look for smaller and terrestrial exoplanets. The latter ones are only detectable around (very) low-mass stars, but can be located within their empirical habitable zone [31]. Given the random orientation of planetary orbits in the plane of the sky, only a small minority of the existing exoplanets can be detected this way; most planets do not transit. Consequently, for a given planet type, transiting planets have statistically a larger distance from the Sun compared to non-transiting planets rendering possible follow-up observations with, e.g., direct imaging techniques, more challenging.
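The geometric argument above can be made quantitative with a short sketch. It assumes circular orbits, neglects the planet radius, and assumes that planets of a given type are distributed uniformly in space, so that the distance to the nearest example scales with the inverse cube root of the number density. The function names and the rounded solar radius are our own choices.

```python
R_SUN_AU = 0.00465  # solar radius expressed in AU (rounded, assumed value)


def transit_probability(r_star_au, a_au):
    """Geometric transit probability for a circular orbit: p ~ R_star / a."""
    return r_star_au / a_au


def nearest_distance_ratio(p_transit):
    """Distance to the nearest transiting planet of a given type, relative to
    the nearest planet of that type overall, for uniform space density."""
    return p_transit ** (-1.0 / 3.0)


p = transit_probability(R_SUN_AU, 1.0)  # Earth-Sun analog
ratio = nearest_distance_ratio(p)
print(f"Transit probability: {p:.2%}")
print(f"Nearest transiting analog is about {ratio:.1f}x farther away")
```

For an Earth-Sun analog the transit probability is below 1%, and the nearest transiting example would be expected roughly six times farther away than the nearest example overall.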

High-contrast imaging: All leading ground-based 8 m class observatories are equipped with high-contrast imaging instruments designed to directly detect massive gas giant planets (>2–3 MJupiter) at large orbital separations (>20 AU) (e.g., [52]). ESO and the Breakthrough Foundation just concluded a first experiment to directly detect thermal emission around 10 μm from small planets around alpha Cen A with an upgraded MIR instrument at the VLT in Chile (Footnote 6). However, even with 8 m telescopes, detecting thermal emission from a true Earth-analog around the nearest stars is prohibitively expensive in terms of observing time, and only with the advent of the ELTs would a few terrestrial exoplanets orbiting very nearby stars come within reach, if they exist [38].

Microlensing: Large ground-based networks of dedicated telescopes continue to identify and monitor microlensing events and give access to a unique part of the exoplanet mass-separation parameter space as, in principle, they can constrain the occurrence rate of planets as a function of their mass (down to Super-Earths) out to separations of around 10 AU (e.g., [53]). There is a strong detection bias towards finding planets around M-stars. The number of detections is still modest, compared to RV and transit searches, and hence the uncertainties are large, but in order to map out the exoplanet population – in a statistical sense – microlensing is indispensable. A drawback is that the majority of the events are located too far away from the Sun for any follow-up observations to be feasible.

Expected developments in space

Both ESA and NASA are preparing to launch a suite of missions dedicated or related to exoplanet science in the coming 10–20 years that will join other already ongoing exoplanet missions (see Figs. 7 and 8). In addition to the dedicated exoplanet missions described below, ESA’s Gaia mission will reveal thousands of exoplanets based on the astrometric motion of their host stars providing a rich dataset for exoplanet population studies and targets for future imaging studies from ground and space [54]. Whether Gaia can reveal Super-Earth exoplanets around some very nearby stars depends on the achievable astrometric accuracy at the end of the mission and remains to be seen. Not listed is ESA’s EUCLID mission, which, while its primary mission is not exoplanet science, will deliver a large catalog of microlensing events, significantly extending the statistical power of this method.

Fig. 7

Adopted space missions related to exoplanet science from ESA. Image credit: ESA; https://sci.esa.int/s/w7gy4Yw (accessed July 4, 2019)

Fig. 8

Space missions related to exoplanet science from NASA including potential future missions that are part of the current Decadal Survey and also an MIR interferometer as discussed here. Image credit: NASA/JPL/Caltech; https://exoplanets.nasa.gov/exep/technology/technology-overview/ (accessed July 4, 2019)

ESA: CHEOPS (Characterizing Exoplanet Satellite; [55]), launched in December 2019, is the first ESA S-class science mission with the goal of measuring the size of known transiting planets with high accuracy and searching for transit signals of well-selected exoplanets initially discovered with the RV technique. In the mid 2020s, PLATO (Planetary Transits and Oscillations of stars; [56]) will follow as the third M-class mission in ESA’s Cosmic Vision Program. Similar to Kepler, albeit targeting brighter stars with higher precision and a longer time baseline, PLATO will uncover hundreds of new Earth-sized exoplanets and provide unprecedented constraints on the occurrence rate of terrestrial planets in the habitable zone of Solar-type stars. Ariel (Atmospheric Remote sensing Infrared Exoplanet Large survey mission; [57]), another M-class exoplanet mission from ESA, will follow in 2028. Ariel will provide transmission and secondary eclipse measurements for hundreds of (mostly transiting) exoplanets at visible and NIR wavelengths, allowing investigations of the atmospheric composition of a large, well-defined and diverse sample of known exoplanets. The vast majority of Ariel’s targets will be warm and hot transiting gas giants and Neptunes. Some Super-Earths may also be within reach, but studying the atmospheres of temperate terrestrial exoplanets similar to Earth is beyond Ariel’s scope.

NASA: The currently operating missions Hubble, Spitzer, and the recently retired Kepler/K2 have revolutionized our understanding of exoplanet abundance and diversity through discovering and characterizing transiting systems. TESS launched successfully in 2018 and will bring another step forward in our understanding of exoplanet occurrence rates, especially around bright, nearby stars. Recently, NASA decided to extend the nominal 2-year mission lifetime by at least another 2 years, significantly increasing TESS’ discovery space (Footnote 7). JWST (planned for a launch in 2021) will include the capability to perform infrared transit and eclipse spectroscopy of exoplanets as well as phase curve measurements (e.g., [58]). The investigation of small planets orbiting close to or within the habitable zone of their host stars will remain very challenging and time-demanding, though (e.g., [59,60,61]). The Roman Space Telescope (previously known as WFIRST) is planned for launch in the mid 2020s with a high-contrast coronagraph instrument (CGI). It will allow the direct detection of a few known giant exoplanets that were discovered by indirect techniques and perform an essential technology demonstration for future missions. A starshade could be launched to rendezvous with Roman, which would enable direct imaging of a few Earth-like exoplanets (pending recommendations of the 2020 Decadal Survey). Furthermore, the primary mission of Roman will, similar to EUCLID, deliver a wealth of microlensing events. If recommended, large missions like LUVOIR (Large UV/Optical/IR Surveyor) and HabEx (Habitable Exoplanet Observatory Mission) will be capable of directly imaging and spectrally characterizing up to a few tens of Earth-like exoplanets in reflected light ([27], see also the White Paper by Snellen et al.). They may search for the spectral signature of gases like water vapor and oxygen (see Fig. 3; [12]).
The Origins Space Telescope (OST) plans to develop the capability to search for biosignature gases in the atmosphere of rocky exoplanets transiting M dwarfs. Any of these latter missions could be capable of discovering the first indications for signs of life on a nearby terrestrial exoplanet. However, they are all part of NASA’s ongoing Decadal Survey and none of the missions is approved yet.

Implications for atmospheric studies of terrestrial exoplanets

All of the currently adopted ground- or space-based projects and missions have exciting and challenging scientific objectives that will deliver important results in various areas of exoplanet research. However, none of them will enable the science proposed in this White Paper. JWST may be able to check for the existence of an atmosphere around a couple of nearby, terrestrial (transiting) exoplanets; a handful of terrestrial exoplanets may also be within reach of the ELTs for basic atmospheric characterization. As summarized on ESA’s web-pages: “With this suite of space telescopes launching within the next decade, we can expect to come closer to finding Earth 2.0” (Footnote 8). However, we will not have the means to find and characterize an Earth 2.0. Only a large and focused space mission offers the potential to do that and, at the same time, will allow us to statistically investigate the expected compositional diversity of terrestrial exoplanet atmospheres.

Challenges ahead

Concepts for a space-based nulling interferometer already existed more than a decade ago, but technical challenges paired with uncertainties related to the scientific yield of such a mission (the occurrence rate of small exoplanets was unknown) led to the cancellation of the projects. Since then, progress has been made in key areas, as we will detail in the following, but additional coordinated efforts will be needed to develop and space-qualify some components and technologies. Pushing the boundaries of what is technically possible always requires a substantial amount of investment, but only then can breakthrough scientific results in (astro-)physics, such as the detection of gravitational waves (e.g., [62]) or the first image of a black hole [63], be achieved. We picked these two examples on purpose because both of them relied on interferometric measurement techniques. We note that ESA’s LISA mission, bound to revolutionize our understanding of the Universe using gravitational waves, will also apply interferometric measurements between free-flying spacecraft.

Technology challenges and recent progress

In recent years the field of high-precision ground-based interferometry has seen significant progress both in Europe and the United States. In particular, Europe has gained strong expertise in the field of fringe sensing, tracking, and stabilization with the operation of the Very Large Telescope Interferometer (VLTI). This maturity contributed to the first direct observation of an exoplanet with long-baseline optical interferometry, which delivered the most precise astrometry and spectrum of any directly imaged planet to date [64]. In parallel, new data reduction and observing techniques have enabled unparalleled interferometric contrasts on US-based nulling interferometers [65, 66].

In addition to these developments, most technologies required to fly a space-based nulling interferometer have reached a Technology Readiness Level (TRL) ≥5, which means that the components have been tested and validated in a relevant environment (see, e.g., [67], for a recent review). In particular, key technologies that were considered immature in 2007, when most Darwin/TPF-I activities stopped, have now been demonstrated on test-benches (e.g., deep nulling beam combination) or will soon be demonstrated in space (e.g., formation flying). Significant efforts related to effective starlight suppression culminated in laboratory demonstrations mainly at the Jet Propulsion Laboratory (JPL) in the US. Work with the “Adaptive Nuller” at room temperature indicated that MIR nulls of 10⁻⁵ are achievable with a bandwidth of 34% and a mean wavelength of 10 μm (left panel in Fig. 9; [68]). The “Planet detection testbed”, developed in parallel, demonstrated the main components of a high-performance four-beam nulling interferometer at a level matching that needed for a space mission [69]. At 10 μm, with 10% bandwidth, it has achieved a null-depth of 8 × 10⁻⁶, and a total starlight suppression of 10⁻⁸ after post-processing; the Earth-Sun contrast at 10 μm is of the order 10⁻⁷.
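To put the testbed numbers quoted above in relation to one another, the following sketch computes two simple ratios. It uses only the three values given in the text; the variable names are ours, and the ratios are order-of-magnitude bookkeeping rather than a signal-to-noise calculation.

```python
# Values quoted in the text for the "Planet detection testbed" at 10 micron
NULL_DEPTH = 8e-6           # raw null depth, 10% bandwidth
TOTAL_SUPPRESSION = 1e-8    # starlight suppression after post-processing
EARTH_SUN_CONTRAST = 1e-7   # order-of-magnitude Earth-Sun contrast at 10 micron

# Additional suppression factor contributed by post-processing
post_processing_gain = NULL_DEPTH / TOTAL_SUPPRESSION

# Planet signal relative to the residual starlight after post-processing
planet_to_residual = EARTH_SUN_CONTRAST / TOTAL_SUPPRESSION

print(f"Post-processing gain: about {post_processing_gain:.0f}x")
print(f"Earth-like signal over residual starlight: about {planet_to_residual:.0f}x")
```

With these inputs, post-processing contributes a factor of several hundred, and the residual starlight ends up about one order of magnitude below the expected signal of an Earth-like planet.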

Fig. 9

Left: 6-h measurement of the null-depth achieved at 10 μm wavelength with the “Adaptive Nuller” at JPL [68]. Right: Artist impression of ESA’s PROBA-3 spacecraft, the first mission to demonstrate autonomous precision formation flying (Image credit: ESA - P. Carril, 2013)

Handling the high degree of autonomy necessary for free-flying or formation flying missions of close-by elements in space is a complex endeavor. This is an active field of research, with continuous progress in estimation and control algorithms, and in ways to internally calibrate the local tie from the metrology to the scientific measurement systems (e.g., [70], and references therein). An additional key milestone for formation flying technology was the space-based demonstration by the PRISMA mission [71]. PRISMA demonstrated a sub-cm positioning accuracy between two spacecraft, mainly limited by the metrology system (GPS and radio frequency ranging). The launch of ESA’s PROBA-3 mission (right panel in Fig. 9), currently scheduled for 2023, will mark the next step in formation flying. Its two satellites will autonomously maintain formation to millimeter and arcsecond precision at distances of 150 m or more, i.e., without relying on guidance from the ground. This separation is of the same order as the one needed for a space-based nulling interferometer (see above) and the formation flying precision even exceeds the requirements (cf. [67]).
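As a quick consistency check of the quoted PROBA-3 figures, the sketch below converts an angular precision of one arcsecond into a lateral offset at the 150 m separation mentioned in the text. The choice of exactly 1 arcsecond and the variable names are illustrative assumptions.

```python
import math

SEPARATION_M = 150.0                     # inter-satellite distance from the text
ARCSEC_RAD = math.radians(1.0 / 3600.0)  # one arcsecond in radians

# Lateral displacement corresponding to a 1 arcsecond pointing offset
lateral_offset_mm = SEPARATION_M * math.tan(ARCSEC_RAD) * 1e3

print(f"1 arcsec at {SEPARATION_M:.0f} m corresponds to {lateral_offset_mm:.2f} mm")
```

At 150 m, one arcsecond corresponds to a sub-millimeter lateral offset, so the millimeter and arcsecond figures describe precision of a comparable scale.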

Required technology development

As mentioned above, the science described here will likely require an L-class mission and cost will eventually be a key driver. Hence, ideally, an ESA-supported technology development program should be set up that supports industrial partners and academic institutions in

(1) identifying synergies with other planned or ongoing missions, leveraging the heritage of past (cryogenic / MIR) missions and investigating new approaches to limit major cost drivers during development and implementation of the new mission, and in

(2) further pushing the readiness and availability of certain required key technologies.

The general goal of cost reduction is to ensure that the total mission budget fits comfortably within the financial envelope of an L-class mission. Given that first mission concepts were already present 15–20 years ago and that significant further progress has been made since then in various areas (see above), new concept studies would not have to start from scratch and could focus on identifying those areas where lessons learned from other missions or new developments and technologies would yield potential cost savings without jeopardizing the mission’s scientific objectives.

In addition, specific areas where additional technology development would be required are the following: To further push starlight suppression technology, the next step would be to reproduce the US experiments mentioned above, but under cryogenic conditions and with flux levels similar to those expected in space. The Laboratory for Astronomical Instrumentation at ETH Zurich (Footnote 9) is currently developing designs for possible experimental setups. This will likely include the successful validation of cryogenic spatial filters that can provide the necessary wave-front control performance from 3 μm to around 20 μm and the implementation of a cryogenic deformable mirror [72, 73]. The use of newly available concepts of single-mode fibers as cryogenic spatial filters, including commercial solutions (e.g., classes of photonic crystal fibers, low-loss hollow core fibers), should be investigated in different ranges of the MIR spectrum because of their improved throughput across the spectral range of interest (e.g., [74, 75]). Furthermore, integrated optics devices for the MIR wavelength regime could significantly reduce the complexity of the instrument if they reach the appropriate performance level. Recent developments seem promising (e.g., [76,77,78]), but more work is needed. We also expect that dedicated developments will be required in the field of MIR detectors, although the JWST legacy will be particularly useful in this context. However, Mercury-Cadmium-Telluride (MCT) detectors also seem to be pushing towards longer wavelengths [79] and it remains to be seen if this technology could reach out to at least ∼17 μm to cover the important CO2 band at 15 μm (Fig. 1).

Astrophysical challenges

A key factor impacting the time required to detect an “Earth-like” exoplanet at MIR wavelengths will be the level of thermal emission arising from small grains in exo-zodiacal dust belts around the target star. This emission would provide extra thermal background noise, possibly leading to an increase in observing time for a given target and, in the extreme case, to an extension of the overall mission search phase. For the vast majority of stars within the immediate vicinity of the Sun (< 15–20 pc) it is not known if they harbor such belts and, if so, what their level of thermal emission in the MIR range is. The NASA-funded HOSTS survey, carried out with the Large Binocular Telescope Interferometer (LBTI), provides some new statistical results on the occurrence rate of exo-zodiacal dust, indicating that the median level for Sun-like stars amounts to 4.5 (+7.5/−1.5) times the level of the zodiacal light in the Solar System [80, 81]. This work has shown that for a majority of systems the dust levels can be expected to be low enough to enable the detection of terrestrial exoplanets through MIR interferometry. Ideally, however, a systematic observing program should be carried out to investigate all potential target stars of a future exoplanet imaging space mission including stars located in the Southern hemisphere that are inaccessible with the LBTI (see, e.g., the Hi-5 project on the VLTI; [82]). In addition to the MIR flux levels of potential dust belts, their orientation / geometry should also be determined, as this prior information can render the exoplanet search more efficient (e.g., [83]).

Furthermore, to minimize the time devoted to an initial search phase for detecting a sufficiently large sample of terrestrial exoplanets for in-depth atmospheric characterization, current efforts to find such planets beforehand, from the ground or from space, need to be continued. The yield estimate shown in Fig. 6 assumes that the planet occurrence rates detected by the Kepler mission are also applicable for exoplanets around stars in the Solar neighborhood. As shown in Fig. 10, close to 20 small exoplanets within 15 pc are already known that lie within the empirical habitable zone. Due to detection biases of current surveys such planets are preferentially found around cooler, i.e., smaller and lower mass, M-stars. Not all of these planets will be rocky (the transition between rocky and envelope-dominated exoplanets, in terms of planet mass, is expected to occur roughly around (2.0 ± 0.7) M⊕; [11]) and some orbit so close to their host star that even an MIR nulling interferometer, with the baselines and resulting spatial resolution as assumed above, could not directly detect them. However, these numbers show that planets orbiting in the empirical habitable zone appear to be ubiquitous in the Solar neighborhood. We remind the reader that there are of order 900 GKM main-sequence stars within 15 pc from the Sun (see Fig. 5), many of which have not yet been searched for (terrestrial) exoplanets. Recent estimates for η⊕, i.e., the fraction of stars with terrestrial planets located within the habitable zone, amount to roughly 0.2–0.3 for solar type stars and possibly even higher values for M-stars (e.g., [16]). New, dedicated RV projects specifically targeting exoplanets close to and in the habitable zone around nearby G- and K-type stars (such as, e.g., EXPRES (Footnote 10) or HARPS3 (Footnote 11)) would be perfectly complementary to ongoing projects focusing on low-mass stars. Combined they will not only provide targets for a future exoplanet imaging space mission, but also crucial constraints on the planets’ masses.
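The numbers in the preceding paragraph allow a rough back-of-the-envelope estimate of how many stars within 15 pc may host a terrestrial planet in the habitable zone. The sketch below is a deliberate simplification and not the yield analysis of Fig. 6: it applies the solar-type range of 0.2–0.3 to all of the roughly 900 GKM stars (the text suggests M-stars may have higher values) and ignores whether a planet would actually be detectable. The function and variable names are ours.

```python
N_STARS_15PC = 900      # order-of-magnitude count of GKM stars within 15 pc
ETA_RANGE = (0.2, 0.3)  # habitable-zone terrestrial planet fraction, solar-type stars


def expected_hosts(n_stars, eta):
    """Expected number of stars hosting a habitable-zone terrestrial planet."""
    return n_stars * eta


low, high = (round(expected_hosts(N_STARS_15PC, eta)) for eta in ETA_RANGE)
print(f"Roughly {low} to {high} host stars within 15 pc under these assumptions")
```

Even under these crude assumptions the expected number of host stars is far larger than the close to 20 planets currently known, which illustrates why continued search efforts are worthwhile.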

Fig. 10

All known exoplanets and additional unpublished candidates from the CARMENES project within 15 pc from the Sun and with minimum masses < 3 M⊕ (left panel) and < 10 M⊕ (right panel) located within the empirical habitable zone of their host stars (boundaries indicated by the red and blue line; see section 1.2.2 for details). The x-axis shows the stellar insolation received by the exoplanets normalized to Earth’s insolation. The y-axis shows the effective temperature of the host stars. Blue circles are planets detected via transit observations, black triangles are planets detected via RV measurements, and the unpublished RV candidates are shown as red triangles. The locations of Venus, Earth and Mars are shown as black circles

Conclusions

In this White Paper we have argued that a large space-based mission designed to detect and investigate thermal emission spectra of terrestrial exoplanets in the MIR wavelength range provides unique scientific potential in order to (1) search for signatures in their atmospheres that indicate biological activity, (2) assess their ability to provide conditions for life as we know it, and (3) investigate their expected atmospheric diversity. Such a mission would surpass the capabilities of other approaches. While NASA might be focusing on large missions that aim to detect terrestrial planets in reflected light, ESA has the opportunity to take leadership and spearhead the development of a large MIR exoplanet mission within the scope of the “Voyage 2050” long-term plan establishing Europe at the forefront of exoplanet science for decades to come. Given the ambitious science goals of such a mission, additional international partners might be interested in participating and contributing to a roadmap that, in the long run, leads to a successful implementation. A new, dedicated development program funded by ESA to help reduce development and implementation cost and further push some of the required key technologies would be a first important step in this direction.

Ultimately, a large MIR exoplanet imaging mission will be needed to help answer one of mankind’s most fundamental questions: “How unique is our Earth?”