1 The next steps in extragalactic astrophysics

The era of extragalactic astrophysics began in earnest around the time of the Great Debate between Shapley and Curtis in 1920 [136]. The debate focused on the nature of a number of intriguing “nebulae”, at a time when the consensus rested on a Universe in which the Milky Way was the main constituent, a scenario that harks back to the model of our Galaxy laid out by Herschel [59]. The discoveries during the 1930s, pioneered by Slipher, Hubble, and Humason, resulted in the concept of island Universes, where each “stellar system”, a galaxy, constitutes a fundamental building block tracing the largest scales in the Cosmos. It has been nearly a century since this Great Debate, and our understanding of extragalactic astrophysics has come a long way.

Developments in telescopes, instrumentation and analysis techniques have allowed us to decipher the intricacies of galaxy formation. At present, the established paradigm rests on a dark matter dominated cosmic web within which a comparatively small mass fraction consists of ordinary matter (“baryons”), mostly in the form of stars, gas and dust. The first stage of galaxy formation is driven by the (linear) growth of the dark matter density fluctuations imprinted during the earliest phases of cosmic evolution. Stable dark matter structures, termed halos, collapse and virialise, constituting the basic units in this scenario. At the same time, gas accumulates in the central regions of these halos, leading to cooling and star formation. The general aspects of this complex process can be explained within the current framework (see, e.g. [123]), resulting in an overall very successful theory that matches the observations. However, many of the key processes are only roughly understood, most notably the “baryon physics” that transforms the smooth distribution of gas at early times into the galaxies we see today. This complex problem requires large, targeted data sets probing the most important phases of galaxy formation and evolution. This proposal addresses the next steps that the astrophysics community will follow in the near future to understand structure formation. High-quality spectroscopic observations of galaxies are required to probe these important phases. We motivate below the need for a large, space-based, ultra-deep survey of galaxy spectra in the near-infrared, and present the technological challenges that must be addressed.

The extremely weak fluxes of the targets, combined with the need to work at near-infrared wavelengths, imply that such a task must be pursued from space, free of the noise from atmospheric emission and absorption. Moreover, the need to simultaneously observe many sources spectroscopically, from an unmanned, unserviceable mission, defines arguably one of the toughest challenges in space science. Such a task is optimally suited for the 2035-2050 period envisioned by ESA within the Voyage 2050 call. We emphasize that this science case complements the successful track record of ESA in this field, with missions such as Herschel (tracing the evolution of dust in galaxies), Gaia (tracing the gravitational potential of our Galaxy), as well as the cosmology-orientated missions, Planck and Euclid. We will show below how the fundamental science case of galaxy formation and evolution requires a future space-based observatory, beyond the capabilities of the upcoming JWST or large 30-40m ground-based telescopes such as ESO’s ELT.

2 The evolution of galaxies at the peak of activity

2.1 Star formation across cosmic time

The observational evidence reveals that the overall level of star formation in nearby galaxies is comparatively low with respect to earlier epochs. Figure 1, derived from various observational tracers of star formation, shows a characteristic peak in the cosmic star formation activity in galaxies between redshifts 1 and 3, roughly corresponding to a cosmic time between 2 and 6 Gyr after the Big Bang (or a lookback time between 8 and 12 Gyr). Such a trend can be expected as the gas from the initial stages is gradually locked into stars and, subsequently, remnants. This trend encodes complex information regarding the efficiency of star formation, the mechanisms of gas infall and outflows, the ejection of gas from evolved phases of stellar evolution and the bottom-up hierarchy of structure formation.
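The correspondence between redshift and cosmic time quoted above can be checked with a minimal sketch. It assumes a flat matter-plus-Λ cosmology with illustrative parameters (H0 = 70 km/s/Mpc, Ωm = 0.3) that are not taken from this text, neglects radiation, and uses the standard closed-form expression for the age of such a universe. The function names are hypothetical.

```python
import math

# Assumed, illustrative flat LCDM parameters (not quoted in the text).
H0_KM_S_MPC = 70.0
OMEGA_M = 0.3
OMEGA_L = 1.0 - OMEGA_M  # flatness assumed

# Hubble time 1/H0 expressed in Gyr.
KM_PER_MPC = 3.0856775814913673e19
SEC_PER_GYR = 3.15576e16
HUBBLE_TIME_GYR = KM_PER_MPC / SEC_PER_GYR / H0_KM_S_MPC


def cosmic_age_gyr(z):
    """Age of a flat matter + Lambda universe at redshift z (radiation neglected)."""
    x = math.sqrt(OMEGA_L / OMEGA_M) * (1.0 + z) ** -1.5
    return 2.0 / (3.0 * math.sqrt(OMEGA_L)) * HUBBLE_TIME_GYR * math.asinh(x)


def lookback_time_gyr(z):
    """Time elapsed between redshift z and today."""
    return cosmic_age_gyr(0.0) - cosmic_age_gyr(z)


for z in (1.0, 3.0):
    print(f"z={z:.0f}: age {cosmic_age_gyr(z):.1f} Gyr, "
          f"lookback {lookback_time_gyr(z):.1f} Gyr")
```

Under these assumptions the ages at z = 1 and z = 3 come out at roughly 6 and 2 Gyr, in line with the range given in the text; other reasonable parameter choices shift these values by a few per cent.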

Fig. 1
Cosmic star formation history: This diagram shows the redshift evolution of the star formation rate density. Note that detailed spectroscopic optical galaxy surveys exist only out to z \(\lesssim \) 1, whereas the epochs of maximum star formation (z\(\sim \)1–3), and the first stages of formation (z\(\gtrsim \)7) are poorly understood (from [60]; see also [84])

In addition, the z\(\sim \)1–3 redshift window corresponds to the peak of AGN activity [113] and of the merger rate [118]. Moreover, it is the epoch when the dark matter halos hosting massive galaxies allow for cold accretion via cosmic streams (see Section 2.5). Decoding this complex puzzle requires a detailed study of the different phases of evolution. At present we only have complete galaxy samples amounting to \(\sim \)1 million high quality spectra at low redshift (z\(\lesssim \)0.2, e.g. SDSS, [157]), along with samples of spectra at intermediate redshift (z\(\lesssim \)1.5), e.g. VIPERS [56], VVDS [72], zCOSMOS [75], GAMA [39], BOSS [32] or LEGA-C [141]. Future spectroscopic surveys will also probe similar redshift ranges within the optical spectral window – e.g., WAVES [40]; WEAVE [31]; DESI [38], MSE [89]. In the NIR, ESO’s VLT/MOONS [26] will constitute the state-of-the-art ground-based survey, but the expected S/N will not be high enough for studies comparable to those performed on SDSS spectra at z\(\lesssim \)0.2. We note that many of the spectroscopic surveys (past and future) are mostly designed as a “redshift machine” (i.e. optimised for cosmology, using galaxies simply as “test particles”), and the S/N of the data in the continuum is too low for any of the science presented here to be successfully delivered. None of the current and future observing facilities, whether ground- or space-based, will be capable of creating the equivalent of the spectroscopic SDSS catalogue at these redshifts.

2.2 Bimodality and galaxy assembly

On a stellar mass vs colour (or age) diagram, galaxies populate two well defined regions: the red sequence and the blue cloud (see, e.g., [68, 133]). Galaxies on the red sequence are mostly massive, passively-evolving systems with little or no ongoing star formation. Although the red sequence extends over a wide range in stellar mass, the most massive galaxies tend to be on the red sequence, with a preferential early-type morphology. In contrast, blue cloud galaxies have substantial ongoing star formation, and extend towards the low-mass end. A third component is also defined, the green valley [90], between these two. However, the identification of this region as a transition stage between the blue cloud and the red sequence is far from trivial [3, 121]. There are many studies tracing the redshift evolution of galaxies in these regions (e.g. [11, 62, 94]), revealing a downsizing trend, so that the bulk of star formation (i.e. the “weight” of the blue cloud) shifts from the most massive galaxies at high redshift, to lower mass systems in the present epoch. This simple diagram allows us to present a simplified version of star formation in galaxies, including the usual bottom-up hierarchy that begins with small star-forming systems, leading to more massive galaxies through in situ star formation and mergers, both with (“wet”) and without (“dry”) additional star formation.

Models such as those proposed by [42] (see Fig. 2) allow us to express graphically the complex processes involved. However, the problem with these analyses is how to properly characterize the formation stage of a galaxy by a simple descriptor such as galaxy colour. More detailed analyses have been presented of the colour-mass diagram, combining photometry and spectroscopy in relatively nearby samples (e.g. [120]) showing interesting processes that relate the various sources of feedback (see Section 2.3 below).

Fig. 2
Schematics of galaxy evolution from the blue cloud to the red sequence. Three different scenarios are considered, as labelled, with the black arrows representing evolution through wet mergers and quenching, and white arrows symbolising stellar mass growth through dry mergers (adapted from [42])

However, such studies are complicated by the fact that the underlying stellar populations span a wide range of ages and chemical composition, and the star formation processes do not involve a substantial fraction of the baryonic mass of the galaxy. Therefore, it is necessary to extend these studies, including high quality spectroscopic data, to explore the evolution on the colour-stellar mass diagram with galaxies targeted during the peak of galaxy formation. At these redshifts (z\(\sim \)1-3), we will be dealing with the most important stages of formation.

2.3 The role of star formation and AGN

The bimodality plot (Fig. 2) illustrates the key processes underlying galaxy evolution. Most importantly, the presence of a large population of passive galaxies on the red sequence, without an equivalent counterpart of massive galaxies on the blue cloud, requires physical mechanisms by which star formation is quenched. As the fuel for star formation is cold gas, quenching of any type must resort to reducing this component, either by heating, photoionisation or mechanical removal of the cold phase.

Various theoretical models have been explored over the past decades, most notably based on the expulsion of gas from supernovae-driven winds (stellar feedback, e.g. [33]) or from a central supermassive black hole (AGN feedback, e.g. [124]).

A comparison of the observed stellar mass function of galaxies and N-body simulations of dark matter halos (see Fig. 3) suggests at least two distinct mechanisms to expel gas from galaxies, one dominant at the low-mass end, and the other one controlling the high-mass end. Since the efficiency of stellar winds is expected to increase in weaker gravitational potentials, one would assume stellar feedback is responsible for the low-mass trend. Similarly, the increasing efficiency of AGN feedback with black hole mass would produce the trend at the high-mass end. Furthermore, the strong correlation between bulge mass (or velocity dispersion) and the mass of the central supermassive black hole (Fig. 4, see, e.g. [70, 119]) gives further support to the role of AGN activity in shaping galaxy formation. However, this picture is too simplistic, requiring a better understanding of the physics. Detailed analyses of winds driven by nearby starbursting galaxies present a complex scenario that is not properly described by the latest numerical codes of galaxy formation [58]. The prevalence of outflows increases towards the younger phases of galaxy formation. Therefore, detailed studies over complete samples during the critical phases of galaxy evolution are needed to understand feedback in detail.

Fig. 3
Correlation between the stellar-to-halo mass ratio and halo mass. Even at the peak of the curve (\(\sim \)3%) the stellar mass is significantly lower than the cosmic baryon to dark matter ratio, revealing an inefficient process of star formation. Furthermore, the decrease of this fraction towards both the high- and low-mass end reveals the complexity of feedback mechanisms (from [9])
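The inefficiency noted in this caption can be made concrete with a rough sketch. It assumes Planck-like density parameters (Ωb ≈ 0.049, Ωm ≈ 0.315), which are illustrative values not quoted in this text, and compares the \(\sim \)3% peak stellar-to-halo mass ratio with the cosmic baryon budget. All names are hypothetical.

```python
# Assumed, Planck-like density parameters (illustrative; not from the text).
OMEGA_B = 0.049   # baryons
OMEGA_M = 0.315   # total matter (baryons + dark matter)

# Peak stellar-to-halo mass ratio, ~3% as stated in the caption.
PEAK_STELLAR_TO_HALO = 0.03


def baryon_fraction(omega_b=OMEGA_B, omega_m=OMEGA_M):
    """Cosmic baryon mass as a fraction of all matter."""
    return omega_b / omega_m


def baryon_to_dark_matter(omega_b=OMEGA_B, omega_m=OMEGA_M):
    """Cosmic baryon mass relative to dark matter only."""
    return omega_b / (omega_m - omega_b)


def conversion_efficiency(stellar_to_halo=PEAK_STELLAR_TO_HALO):
    """Fraction of a halo's nominal baryon budget that ends up in stars."""
    return stellar_to_halo / baryon_fraction()


print(f"baryon fraction        : {baryon_fraction():.3f}")
print(f"baryon / dark matter   : {baryon_to_dark_matter():.3f}")
print(f"peak conversion effic. : {conversion_efficiency():.2f}")
```

With these assumed values, only about a fifth of the available baryons are converted into stars even at the peak of the curve, and the fraction is lower at both mass extremes.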

Fig. 4
Correlation between black hole mass and velocity dispersion in local galaxies, from direct measurements of the SMBH mass (from [57])

2.4 Galaxy growth through mergers

One of the main channels by which galaxies form is the merger process, whereby separate galaxies combine to form a new system. Merging is a significant channel of galaxy formation, and needs to be measured with high precision if we are to understand how galaxy formation proceeds. Closer to home, the complex structure of the stellar populations found around the Milky Way, its vicinity and the nearby Andromeda galaxy reflects the contribution of mergers to galaxy growth (e.g. [45, 64]). Whilst mergers are arguably not the way in which galaxies obtain the majority of their mass, this process is still likely the main one for triggering AGN and black hole formation and accounts for 25-50% of the formation of massive galaxies since z = 3 [101]. Thus, a detailed quantitative assessment of galaxy merger rates is a critical step that has not yet been fully carried out, due to the lack of complete spectroscopic samples. Furthermore, there are inconsistencies among the results obtained so far and a disagreement with theory, showing that more work and better data are needed in this area.

Firstly, the exact role of mergers in galaxy formation is not clear, with conflicting results, particularly at higher redshifts (z > 1). The merger fraction at z \(\gtrsim \) 1 is likely high, with a merger rate of \(\sim \)0.5–1 mergers Gyr\(^{-1}\) (e.g. [15, 132]). Many merger rates at high redshift z > 1 are measured with galaxy structures, or based on samples of galaxies in kinematic or photometric redshift pairs. However, our best estimates of the merger rate differ from theory by up to an order of magnitude (see Fig. 5 contrasting observational results with the latest, state-of-the-art simulations by the Illustris collaboration). Moreover, we do not have robust estimates about the role of minor mergers in galaxy formation – recovering these will require very deep spectroscopic observations.

Fig. 5
Redshift evolution of the merger rate, \({\mathcal R}(z)\) [92]. The observational constraints, shown as points with different symbols, are in stark contrast with respect to state-of-the-art predictions from the Illustris numerical simulations of galaxy formation (dashed lines, labelled at different stellar masses [145])

The best way to measure the merger rate at high redshift is through spectroscopic pairs, which requires both position and accurate radial velocity information (e.g. [79]). However, the most up-to-date studies have only used 12 pairs at z > 2 to measure this important quantity [131], yielding a merger fraction with rather large errors (\(19.4^{+9}_{-6}\)%) due to small number statistics.

A near-infrared spectroscopic survey of distant galaxies at z > 1 will give us the information we need to address this issue in detail. A survey with a high completeness level over the z = 1–3 range will give us a surface density over 10 times higher than previous surveys at 1 < z < 3 such as DEEP2, VVDS, and UDSz. To address this type of science, the survey strategy needs to incorporate the option of including such targets in the mask layout (if the method is to proceed with a reconfigurable focal plane, see Section 6.1). Given the density of targets at the redshifts of interest, the merger fraction will be measured to an accuracy an order of magnitude better than what is currently known at these redshifts.
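To illustrate why small-number statistics dominate the current uncertainty, the following sketch treats the number of spectroscopic pairs as a pure Poisson count, so that the relative error scales as \(1/\sqrt {N}\). This is a simplifying assumption made here for illustration: it ignores uncertainty in the parent sample size, incompleteness and cosmic variance, and is not the error treatment of the cited study. The function names are hypothetical.

```python
import math


def relative_poisson_error(n_pairs):
    """Relative 1-sigma uncertainty of a Poisson count of n_pairs."""
    return 1.0 / math.sqrt(n_pairs)


def pairs_needed(n_current, improvement):
    """Pair count needed to shrink the relative error by `improvement`.

    Because the error scales as 1/sqrt(N), the count must grow by
    improvement squared.
    """
    return math.ceil(n_current * improvement ** 2)


current_pairs = 12  # number of z > 2 pairs quoted in the text
print(f"relative error with {current_pairs} pairs: "
      f"{relative_poisson_error(current_pairs):.0%}")
print(f"pairs for a 10x smaller error: {pairs_needed(current_pairs, 10)}")
```

Under this idealised scaling, 12 pairs give a relative error of about 30%, broadly comparable to the quoted uncertainty, and an order-of-magnitude gain in accuracy calls for roughly a hundred times more pairs.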

This is necessary to ultimately pin down the amount of mass assembled through merging, as well as to determine the role of merging on the triggering and quenching of star formation, and on central AGN activity. For reference, in the most massive systems with M\(_{*} > 10^{10}\textit {M}_{\odot }\) it will be possible to measure merger fraction ratios of up to 1:30 down to a stellar mass limit of M\(_{*} =10^{9.5}\textit {M}_{\odot }\), such that we can study, for the first time, the role of minor mergers in these processes.

2.5 The role of cold accretion

The evolution of the gaseous component – and its subsequent transformation into stars – is arguably one of the most complicated problems in extragalactic astrophysics. Hydrodynamical processes driving the gas flows, and feedback from star formation, AGN activity or dynamical evolution of the baryon-dominated central regions of halos lead to a significant mismatch between the mass assembly history of dark matter halos, and the star formation histories of galaxies embedded in these halos. In fact, Fig. 3 illustrates this mismatch.

One key observable of the difference between dark matter growth and galaxy growth is the presence of massive galaxies at early times (e.g. [24, 51, 80, 107]). A naive mapping of dark matter growth into stellar mass growth leads to late star formation in massive galaxies, as found in the first, pioneering computer simulations of galaxy formation (e.g. [67]). The presence of massive galaxies (stellar mass \(\gtrsim 10^{11}\textit {M}_{\odot }\)) with quiescent populations at redshifts z\(\sim \)2-3 (e.g. [47]) implies that the commonly adopted process of star formation through shock-heating of gas after the virialization of the halo, followed by cooling [111], cannot be the main growth channel in these systems.

We find ourselves in a similar quandary with strong AGN activity at very high redshift, z \(\gtrsim \) 6 – reflecting the presence of very massive black holes within the first billion years of cosmic time [43]. In addition to the traditional hot-mode growth, cold gas can also flow towards the centres of halos, following the filamentary structure of the dark matter distribution, efficiently feeding the central sites of star formation at early times [35]. Moreover, this process is found to operate in the most massive systems at early times [34]. Numerical simulations suggest that clump migration and angular momentum transfer provide an additional mechanism leading to the creation of massive stellar cores at early times [23].

However, observational constraints on the role of cold accretion are few, and no conclusive evidence has been found to date. A large spectroscopic galaxy survey probing the peak of evolution would allow us to study the hot- and cold-mode growth channels of star formation and black hole growth, and the connection with redshift and environment. A detailed analysis of the shape of targeted spectral lines will allow us to detect and quantify gas inflows, but a large volume of data is necessary given the small covering factor of accretion flows [44]. The high S/N of this survey will make studies of individual galaxies (not stacked spectra) possible. As of today, state-of-the-art samples comprise \(\sim \)100 spectra with just enough S/N to study bright emission lines (see, e.g. [53] with VLT/KMOS; or [66] with Keck/MOSFIRE). These studies give promising results about the presence of this important process of galaxy growth. Note that studies in the Ly-α region (i.e. concerning the cosmic dawn survey, Section 3) can also be used to obtain constraints on gas inflows [155].

2.6 Size evolution

An additional conundrum raised by the study of massive galaxies at high redshift is the issue of size evolution. The comoving number density of massive (\(\gtrsim 10^{11}\textit {M}_{\odot }\)) galaxies has been found not to decrease very strongly with redshift (z\(\lesssim 2\)), with respect to the predictions from simple models of galaxy formation that mostly link galaxies to the evolution of the dark matter halos (e.g. [27, 46]).

This would reflect an early formation of this type of galaxy, whereby the bulk of the stellar mass is in place by redshift z\(\sim \)2–3. However, the sizes of these galaxies at z\(\gtrsim \)1–2 are significantly smaller than their low-redshift counterparts (e.g. [30, 137]). A large volume of publications has been devoted to proposing mechanisms that could explain this puzzle, including gas outflows as a mechanism to alter the gravitational potential, “puffing-up” the dense central region.

However, the (old) stellar populations typically found in massive galaxies do not allow for significant quantities of recent star formation, or cold gas flows to explain this size evolution [138], suggesting instead a growth process through gas-free (dry) merging. This merging can proceed dramatically – through a small number of major mergers [69], where the merging progenitors have similar mass – or through a more extended and smooth process of minor merging [95]. In addition, one should consider whether these evolved compact cores end up as massive (and extended) early-type galaxies in high density regions [110], or as massive bulges of disc galaxies [36].

Figure 6 shows a diagram of how this may work, from an analysis of massive galaxies in CANDELS [7], with an interesting evolution from massive compact systems with a strong star formation rate, towards the quiescent galaxies we see today, involving secular processes, galaxy mergers and star formation quenching. Establishing such connections requires a large volume of galaxy spectra at the peak of galaxy formation activity. All these studies are based on relatively small samples (\(\lesssim 10^{3}\)) with mostly high-quality photometry (from HST) but comparatively poor spectroscopic data. Accurate characterization of the stellar population content of these galaxies will enable us to robustly constrain the processes by which galaxies grow.

Fig. 6
Mechanism proposed [7] to explain the size evolution of massive galaxies. The grey contour shows the galaxy distribution at low redshift. Two main growth channels are proposed, involving a mixture of processes such as merging, star formation quenching or secular processes. Large, high quality spectroscopic data at these redshifts will allow us to test in detail these proposals

2.7 Reaching out: the role of environment

The environment where galaxies reside plays a significant role in shaping their observed properties and thus their evolution. It essentially deprives them of their hot and cold gas reservoirs, thus quenching their star formation activity, and also can literally disrupt them by removing their stars [105]. The observed properties of galaxies in the local Universe have provided us with a wealth of evidence towards environmental processes, whose time scales and amplitudes are unfortunately known only at a qualitative level. A robust quantitative estimate of the dependence of such parameters on environment and redshift largely remains an open problem.

The Sloan Digital Sky Survey (SDSS) has been the very first survey to perform an unprecedented and statistically significant census of the photometric and spectroscopic properties of z\(\simeq \)0 galaxies at optical wavelengths. It has permitted us to detail star formation activity in galaxies across several orders of magnitude, with respect to galaxy stellar mass, environment and infall time at z\(\simeq \)0. We are now aware that the number of quenched galaxies – not forming new stars any longer – rises with their stellar mass at a fixed kind of environment, and with environment magnitude (from small galaxy groups to large clusters) at fixed stellar mass [103, 140, 147, 148].

We have also learnt from SDSS that the bulk of stars in a galaxy is progressively older i) as its infall time increases (i.e. galaxies accreted onto their present-day host environment early on are now older than those accreted more recently); ii) as its environment, at fixed infall time, becomes more massive, indicating that the environment mass enhances the efficiency of those physical processes able to quench star formation in galaxies [104, 106, 125]. In addition, recently accreted cluster galaxies appear to be older than equally-massive field galaxies, an observational result that has been attributed to group-preprocessing: the star-formation quenching of these recent infallers started already while they were still living in smaller groups, that later merged with clusters (see Fig. 7, [106]). Such group-preprocessing has extensively been advocated by semi-analytic models of galaxy formation and evolution in order to explain the large number of quenched galaxies observed in clusters [37, 149].

Fig. 7
The stellar age of galaxies less massive than \(10^{10}\textit {M}_{\odot } h^{-2}\) is shown as a function of infall time. Galaxies are colour-coded regarding halo mass: red circles, orange squares and yellow triangles identify galaxies in clusters, rich groups and low-mass groups, respectively. The grey stripe indicates the stellar age of equally-massive galaxies in the field (from [106])

The observational evidence described above highlights the importance of knowing the accretion epoch of a galaxy if we want to understand the role of environment. Unfortunately, we cannot use observations of z\(\simeq \)0 galaxies to accurately derive their infall epochs when they became exposed to environmental effects for the first time. To determine such an important moment in the evolution of galaxies we need to quantify and study environment at different redshifts; this is what a deep-wide NIR spectroscopic galaxy survey will enable us to do, by tracing the assembly history of environments with cosmic time, providing us with a direct measurement of the redshift of infall of galaxies as a function of their stellar mass. Moreover, the lensing and X-ray information from Euclid and eRosita, respectively, combined with the accurate spectroscopic information produced by Chronos will probe the dependence of the star formation histories on the dark matter halos. While the data from Euclid and eRosita will mainly target the assembly of massive environments, thus introducing a significant bias towards star-forming galaxies, Chronos will broaden the study to smaller environments and consequently will avoid the selection bias of the Euclid sample.

When and in which environments did the quenching of the star formation activity of galaxies start? How fast did it proceed? The quantitative and direct replies to these inquiries are provided by our measurements of star formation rates, star formation histories and chemical enrichment of galaxies of different stellar mass, in different environments at different epochs, from z\(\sim \)1–3 to z = 0. Only these observables allow us to directly estimate the typical time scales of star formation in galaxies, and to achieve a model-independent value of the time scales over which galaxy groups and clusters switched galaxy star formation off, and produced the observed present-day galaxy populations.

With increasing redshift these measurements shift to infrared wavelengths and become challenging even for modern ground-based telescopes. Ground-based measurements allow for only a partial characterization of the properties of galaxies at z > 0.5, for which we can mostly measure emission lines (thus star formation rates) since their absorption lines (used as age and metallicity indicators) become less and less accessible. The data gathered so far on galaxies at 0.3 < z < 0.8 indicate that the fraction of quenched galaxies is larger in galaxy groups than in the field, but definitely lower than the fraction of quenched galaxies in groups at z\(\simeq \)0 [81, 151]. At intermediate redshifts, the fraction of star forming galaxies diminishes from 70-100% in the field to 10-20% in the more massive galaxy clusters [109]. However, the star formation rates of group galaxies do not significantly differ from those in the field; only star forming galaxies in clusters show star formation rates a factor of 2 lower than in the field at fixed stellar mass [81, 109, 146].

At the highest redshifts probed for environment, 0.8 < z < 1, the more massive galaxy groups and clusters are mostly populated by quenched galaxies and both exhibit a 30% fraction of post-starburst galaxies (i.e. with a recently truncated star formation activity [6]). In particular, the fraction of post-starburst galaxies in clusters exceeds that in the field by a factor of 3. Cluster and field galaxies still able to form new stars share instead similar star formation rates. On the basis of these results, it has been argued [93] that, at z\(\sim \)1, either the quenching of star formation due to the secular evolution of galaxies is faster and more efficient than the quenching induced by galaxy environment, or both mechanisms occur together with the same time scale. Which mechanism prevails and over which time scale? At present, we do not know.

To further progress on this issue, we require a facility such as Chronos to observe a complete stellar-mass limited sample of environments at z ≥ 1–3, and to derive the star formation histories of their galaxies with an unprecedented accuracy. Chronos observations will thus deliver the fading time scales of star formation of galaxies of different stellar mass residing in groups and clusters. This is not simply an incremental step in our knowledge of environment-driven galaxy evolution. This is the still missing, fundamental quantitative change from the simple head-count of quenched or star-forming galaxies to the measurement of physical properties of galaxies in environments at cosmic noon.

3 First galaxies and the epoch of reionization

3.1 Leaving the dark ages

Cosmic reionization is a landmark event in the history of the Universe. It marks the end of the “Dark Ages”, when the first stars and galaxies formed, and when the intergalactic gas was heated to tens of thousands of Kelvin from much colder temperatures. This global transition, during the first billion years of cosmic history, had far-reaching effects on the formation of early cosmological structures and left deep impressions on subsequent galaxy and star formation, some of which persist to the present day.

The study of this epoch is thus a key frontier in completing our understanding of cosmic history, and is currently at the forefront of astrophysical research (e.g. [116]). Nevertheless, despite the considerable progress in both observations and theory (e.g. see reviews by [41] and [77]) all that is really established about this crucial era is that Hydrogen reionization was completed by redshift z\(\sim \)6 (as evidenced by high-redshift quasar spectra; [43]) and probably commenced around z\(\sim \)12 (as suggested by the Planck polarisation measurements, which favour a ‘mean’ redshift of z\(_{\text {re}} = 8.8^{+1.7}_{-1.4}\); [108]). However, within these bounds the reionization history is essentially unknown. New data are required to construct a consistent picture of reionization and early galaxy formation/growth (see Fig. 8).

Fig. 8
Measures of the neutrality \(1-{{Q}_{{{\mathrm {H}}_{\text {II}}}}}\) of the intergalactic medium as a function of redshift. Shown are the observational constraints, along with model predictions of the evolving IGM neutral fraction (in red). The bottom panel shows the IGM neutral fraction near the end of the reionization epoch, where the model fails to capture the complexity of the reionization process (from [116])

Understanding reionization is therefore a key science goal for a number of current and near-future large observational projects. In particular, it is a key science driver for the new generation of major low-frequency radio projects (e.g. LOFAR, MWA and SKA) which aim to map out the cosmic evolution of the neutral atomic Hydrogen via 21-cm emission and absorption. However, such radio surveys cannot tell us about the sources of the ionizing flux, and in any case radio observations at these high redshifts are overwhelmingly difficult, due to the faintness of the emission and the very strong foregrounds. It is thus essential that radio surveys of the neutral gas are complemented by near-infrared surveys which can both map out the growth of ionized regions, and provide a complete census of the ionizing sources.

A genuine multi-wavelength approach is required, and cross-correlations between different types of observations will be necessary both to ascertain that the detected signals are genuine signatures of reionization, and to obtain a more complete understanding of the reionization process. It has thus become increasingly clear that a wide-area, sensitive, spectroscopic near-infrared survey of the z = 6–12 Universe is required to obtain a proper understanding of the reionization process and early galaxy and black-hole formation. Such a survey cannot be undertaken from the ground (due to Earth’s atmosphere), nor with JWST (inadequate field-of-view), nor Euclid or WFIRST, renamed the Roman Space Telescope (inadequate sensitivity with slitless spectra). Only a mission such as Chronos can undertake such a survey and simultaneously address the three key, interrelated science goals which we summarize below. Moreover, detailed studies of z > 6 galaxies in the Ly-α region will complement the information provided at longer wavelengths by ALMA (e.g. [21]).

3.2 The clustering of Ly-α emitters as a probe of reionization

Cosmological simulations of reionization predict that the highly-clustered, high-redshift sources of Lyman-continuum photons will lead to an inhomogeneous distribution of ionized regions. The reionization process is expected to proceed inside-out, starting from the high-density peaks where the galaxies form. Thus, as demonstrated by the state-of-the-art simulations shown in Fig. 9, reionization is predicted to be highly patchy in nature. This prediction is already gaining observational support from the latest large-area surveys for Ly-α emitters at z\(\sim \)6.5, where it has been found that, depending on luminosity, their number density varies by a factor of 2–10 between different \(\frac {1}{4}\) deg\(^{2}\) fields [96, 100]. It is thus clear that surveys over many square degrees are required to gain a representative view of the Universe at z > 6. Crucially, with such a survey, the differential evolution and clustering of Lyman-break galaxies and Ly-α emitting galaxies can be properly measured for the first time, offering a key signature of the reionization process.

Fig. 9

The geometry of the epoch of reionization, as illustrated by a slice through a (165 Mpc)\(^{3}\) simulation volume at z = 9. Shown are the density (green/yellow), ionized fraction (red/orange), and ionizing sources (dark dots) [63]. The necessity of a deep, near-infrared spectroscopic survey covering many square degrees is clear

High-redshift galaxies can be selected on the basis of either their redshifted Lyman break (the sudden drop in emission from an otherwise blue galaxy, due to inter-galactic absorption at wavelengths \(\lambda _{\text {rest}} < 1216\) Å), or their redshifted Ly-α emission. The former class of objects are termed Lyman-Break Galaxies (LBGs) while the latter are termed Ly-α Emitters (LAEs). In principle, LAEs are simply the subset of LBGs with detectable Ly-α emission, but the current sensitivity limitations of broad-band near-infrared imaging over large areas have meant that narrow-band imaging has been successfully used to yield samples of lower-mass galaxies which are not usually identified as LBGs (e.g. [99]). Nevertheless, as demonstrated by spectroscopic follow-up of complete samples of bright LBGs (e.g. [122, 128, 142]), the fraction of LBGs which are LAEs as a function of redshift, mass, and environment is a potentially very powerful diagnostic of both the nature of the first galaxies, and the physical process of reionization.
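As a simple illustration of why both selection methods push into the near-infrared, the following Python sketch applies \(\lambda _{\text {obs}} = \lambda _{\text {rest}}(1+z)\) to the 1216 Å feature quoted above. The function name and the sampled redshifts are illustrative choices, not part of the proposal.

```python
LYA_REST_ANGSTROM = 1216.0  # rest-frame Ly-alpha wavelength quoted in the text


def observed_wavelength_micron(z, rest_angstrom=LYA_REST_ANGSTROM):
    """Observed wavelength in microns of a rest-frame feature at redshift z."""
    return rest_angstrom * (1.0 + z) * 1.0e-4  # 1 Angstrom = 1e-4 micron


if __name__ == "__main__":
    # Illustrative redshifts spanning the z = 6-12 range targeted by the survey
    for z in (6.0, 8.0, 10.0, 12.0):
        lam = observed_wavelength_micron(z)
        print(f"z = {z:4.1f}: Ly-alpha observed at {lam:.2f} micron")
```

Over z = 6–12 the break and the line move from about 0.85 μm to about 1.58 μm, which is why the selection and the spectroscopic follow-up both require near-infrared coverage.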

With the unique combination of deep, wide-area near-infrared imaging, provided by surveys such as Euclid and the Roman Space Telescope, and deep, complete follow-up near-infrared spectroscopy, made possible with Chronos, we propose to fully exploit the enormous potential of this approach.

The essential idea of using Chronos to constrain reionization is as follows: while the Ly-α luminosity of LAEs is affected both by the intrinsic galaxy properties, and by the HI content (and hence reionization), the luminosity of LBGs (which is measured in the continuum) depends only on the intrinsic galaxy properties. Thus, a deep, wide-area, complete survey for LBGs at z\(\sim \)6–12 with accurate redshifts secured by Chronos will deliver a definitive measurement of the evolving luminosity function and clustering of the emerging young galaxy population, while the analysis of the follow-up spectroscopy will enable us to determine which LBGs reside in sufficiently large ionized bubbles for them to also be observed as LAEs. In order to prevent strong damping wing absorption of Ly-α photons, a galaxy must carve out a bubble of radius \(R_{\mathrm {I}}\) of 500–1000 physical kpc at z\(\sim \)8. According to reionization history predictions from cosmological simulations, consistent with the various reionization constraints, the HI fraction at this redshift is around \(\chi \sim 0.5\)–0.7. \(R_{\mathrm {I}}\) for a typical galaxy with a star-formation rate of \(\dot {\mathrm {M}}_{*} = 1\) \(\textit {M}_{\odot }\) yr\(^{-1}\) is expected to be considerably smaller (though it depends on poorly established values of the ionizing photon escape fraction; cf. [112]). Thus, such galaxies will be only marginally detectable in the Ly-α line if they are isolated. In practice, some of these galaxies will be highly clustered and therefore will help each other in building an HII region which is large enough to clear the surrounding HI and make it transparent to Ly-α photons.

This argument emphasizes the importance of clustering studies of LAEs, for which the proposed survey is optimally designed. A key aim is to compute in great detail the two-point correlation function of LAEs and its redshift evolution. For the reasons outlined above, reionization is expected to increase the measured clustering of emitters and the angular features of the enhancement would be essentially impossible to attribute to anything other than reionization.

In fact, under some scenarios, the apparent clustering of LAEs can be well in excess of the intrinsic clustering of halos in the concordance cosmology. Observing such enhanced clustering would confirm the prediction that the HII regions during reionization are large [83]. As required to meet our primary science goals, the Chronos surveys will result in by far the largest and most representative catalogues of LBGs and LAEs ever assembled at z > 6. Detailed predictions for the number of LBGs as extrapolated from existing ground-based and HST imaging surveys are deferred to the next subsection. However, here we note that the line sensitivity of the 100 deg\(^{2}\) spectroscopic survey will enable the identification of LAEs with a Ly-α luminosity \(\geq 10^{42.4}\) erg s\(^{-1}\), while over the smaller ultra-deep 10 deg\(^{2}\) survey this line-luminosity limit will extend to \(\geq 10^{41.6}\) erg s\(^{-1}\). Crucially this will extend the Ly-α detectability of LBGs at z\(\sim \)8, with brightness J\(\sim \)27AB, down to “typical” equivalent widths of \(\sim \)15 Å [28, 122, 128, 142].

The total number of LAEs in the combined surveys (100 + 10 deg\(^{2}\)) will obviously depend on some of the key unknowns that Chronos is designed to measure, in particular the fraction of LBGs which display detectable Ly-α emission as a function of redshift, mass and environment. However, if the observed LAE fraction of bright LBGs at z\(\sim \)7 is taken as a guide, the proposed surveys will uncover \(\sim \)10,000 LAEs at z > 6.5.

3.3 The emerging galaxy population at z > 7, and the supply of reionizing photons

The proposed survey will provide a detailed spectroscopic characterization of an unprecedentedly large sample of LBGs and LAEs. Crucially, as well as being assembled over representative cosmological volumes of the Universe at z\(\sim \)6–12, these samples will provide excellent sampling of the brighter end of the galaxy UV luminosity function at early epochs. As demonstrated by work on the galaxy luminosity function at z\(\sim \)7–9 [82], an accurate determination of the faint-end slope of the luminosity function (crucial for understanding reionization) is in fact currently limited by uncertainty in the characteristic luminosity \(L^{*}\) and normalization \(\Phi ^{*}\). Consequently, a large, robust, spectroscopically-confirmed sample of brighter LBGs over this crucial epoch is required to yield definitive measurements of the evolving luminosity functions of LBGs and LAEs.

Leaving aside the uncertainties in the numbers of LAEs discussed above, we can establish a reasonable expectation of the number of photometrically-selected LBGs which will be available within the timescales expected for such a mission. For example, scaling from existing HST and ground-based studies, the “Deep” component of the Euclid survey (reaching J\(\sim \)26AB at 5σ over \(\sim \)40 deg\(^{2}\)), is expected to yield \(\sim \)6000 LBGs in the redshift range 6.5< z < 7.5 with J < 26AB (selected as “z-drops”), \(\sim \)1200 at 7.5< z < 8.5 (“Y-drops”), and several hundred at z > 8.5 (“J-drops”) [16, 17, 82].

Therefore, the planned spectroscopic follow-up over 10 deg\(^{2}\) will be able to target (at least) \(\sim \)1500 LBGs in the redshift range 6.5< z < 7.5, \(\sim \)300 in the redshift bin 7.5< z < 8.5, and an as yet to be determined number of candidate LBGs at 8.5< z < 9.5. The proposed depth and density of the Chronos near-infrared spectroscopy will allow detection of Ly-α line emission from these galaxies down to a 5σ flux limit of \(10^{-18}\) erg cm\(^{-2}\) s\(^{-1}\), enabling rejection of any low-redshift interlopers, determination of the LAE fraction down to equivalent widths of \(\sim \)10 Å, and accurate spectroscopic redshifts for the LAE subset.
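To relate the quoted line-flux limit to a line luminosity, one can use \(L = 4\pi D_{L}^{2} F\), with \(D_{L}\) the luminosity distance. The Python sketch below is only a rough consistency check, not the sensitivity calculation behind the proposal: it assumes a flat ΛCDM cosmology with \(H_{0} = 70\) km s\(^{-1}\) Mpc\(^{-1}\) and \(\Omega _{m} = 0.3\) (values not taken from the text), neglects radiation, and uses hypothetical function names.

```python
import math

C_KM_S = 299792.458    # speed of light [km/s]
MPC_IN_CM = 3.0857e24  # one megaparsec in centimetres


def comoving_distance_mpc(z, h0=70.0, omega_m=0.3, steps=2000):
    """Line-of-sight comoving distance [Mpc] in flat LCDM (radiation neglected).

    h0 and omega_m are assumed values, not quoted in the proposal.
    """
    omega_l = 1.0 - omega_m
    dz = z / steps
    total = 0.0
    for i in range(steps):
        zm = (i + 0.5) * dz  # midpoint rule for the integral of dz / E(z)
        total += dz / math.sqrt(omega_m * (1.0 + zm) ** 3 + omega_l)
    return (C_KM_S / h0) * total


def line_luminosity(flux_cgs, z):
    """Line luminosity [erg/s] for an observed line flux [erg/cm^2/s] at redshift z."""
    d_l_cm = (1.0 + z) * comoving_distance_mpc(z) * MPC_IN_CM  # luminosity distance
    return 4.0 * math.pi * d_l_cm ** 2 * flux_cgs


if __name__ == "__main__":
    for z in (6.5, 7.0, 8.0):
        lum = line_luminosity(1.0e-18, z)
        print(f"z = {z}: log10(L / erg s^-1) = {math.log10(lum):.2f}")
```

Under these assumed parameters, a flux of \(10^{-18}\) erg cm\(^{-2}\) s\(^{-1}\) at z\(\sim \)7 corresponds to a line luminosity of a few times \(10^{41}\) erg s\(^{-1}\).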

3.4 The contribution of AGN to reionization & the early growth of black holes

SDSS has revolutionised studies of quasars at the highest redshifts, and provided the first evidence that the epoch of reionization was coming to an end around z\(\sim \)6 [8]. As with the studies of galaxies discussed above, pushing to higher redshifts is impossible with optical surveys, regardless of depth, because the Gunn-Peterson trough occupies all optical bands at z > 6.5. Pushing these studies further in redshift therefore requires deep, wide-field surveys in the near-infrared.

The wide-area, ground-based VISTA near-infrared public surveys such as VIKING and the VISTA Hemisphere Survey are slowly beginning to uncover a few bright quasars at z\(\sim \)7 (e.g. [91]). Evidence combining X-ray and near-IR data suggests that faint quasars at z\(\sim \)6 may be more common than previously thought, and might contribute significantly to reionization [54, 85]. It is expected that Euclid and the Roman Space Telescope will be able to provide a good determination of the bright end of the QSO luminosity function at z > 6. However, the shape of the QSO luminosity function at these redshifts can only be studied with detailed near-infrared spectroscopy over a significant survey area. This is the only direct way to properly determine the contribution of accreting black holes to the reionization of the Universe and constrain the density of black holes within the first Gyr after the Big Bang; the combination of depth and area proposed in this NIR survey provides the ideal way in which to measure the evolving luminosity function of quasars at 6.5< z < 10.

4 Precision cosmology

The Euclid mission will revolutionize cosmology; however, the ultimate precision of Euclid will be limited by our understanding of galaxy evolution on small scales (\(\lesssim \)1 Mpc) due to baryonic feedback mechanisms. For example, AGN feedback is expected to have a sizeable (\(\sim \)20%) effect on the amplitude of the matter power spectrum (e.g. [139], among many other studies). Without calibration data on small scales from large complete spectroscopic samples, Euclid will be required to either marginalize over such effects, remove them from the analyses using filter techniques, or model them using a phenomenological ansatz such as the halo model.

Understanding galaxy evolution will therefore enable precision cosmology to be extended beyond the Euclid baseline to smaller scales, allowing for an increased sensitivity to modified gravity models, and up to a tenfold improvement in dark energy constraints over those from Euclid alone. As examples of beyond-Euclid cosmology enabled by small-scale information, we list the following:

  • Neutrino Physics. Massive neutrinos impact the matter power spectrum on both linear and non-linear scales. In particular, information on the neutrino hierarchy is amplified on small scales [65].

  • Warm Dark Matter. The temperature and particle mass of dark matter are still unknown. In fact, models in which dark matter has a small temperature are still allowed by the data. If dark matter is warm, then any signature of its effects will be seen on small scales, e.g. in the stellar mass function.

  • Modified Gravity. The accelerated expansion could be a symptom of our gravity model, general relativity, being incorrect. Models that change general relativity can have a scale-dependence, and chameleon mechanisms can act on relatively small scales [2].

Figure 10 shows the sensitivity of three beyond-Euclid cosmological models to small-scale information. The deep redshift range would also constrain early-dark energy models, complementing the Euclid cosmology objectives using techniques such as those in [86], applied to SDSS data.

Fig. 10

A sample of new physical effects that can be tested using small-scale information. Shown is the ratio of the unaffected power spectrum to that with the new physical effects, as a function of scale, at redshift zero. The solid line (from [87]) shows the impact of a 1.25 keV warm dark matter particle. The dot-dashed line shows the impact of massive neutrinos with a total mass of 0.4 eV [160], and the dashed line shows a Hu-Sawicki modified gravity model with an amplitude deviation in the Lagrangian of \(10^{-6}\) [4]

5 Scientific requirements

5.1 Introduction

The study of galaxy formation and evolution involves a large range of measurement concepts. A deep spectroscopic galaxy survey – combined with high resolution NIR imaging from Euclid and the Roman Space Telescope – provides the optimal dataset. Note, however, that gathering high-quality spectroscopic data is an inherently more complex task than imaging. A spectral resolution \(R\equiv \lambda /{\Delta }\lambda \sim 1500-3000\) is needed both for accurate velocity dispersion measurements, and to beat the degeneracies present in spectral features. This limit is mainly set by the typical stellar velocity dispersions found in galaxies (50–300 km s\(^{-1}\)), and by the need to adequately resolve targeted emission lines and absorption features. Figure 11 quantifies the magnitude limit within the targeted redshift range. Ideally, an H = 26AB limit, in the continuum, would provide complete samples down to a stellar mass of M\(_{*}\gtrsim 10^{9}\textit {M}_{\odot }\) across the peak of galaxy formation activity (z\(\sim \)1–3). Note that at higher redshifts, the analysis will rely on emission lines, although it will be possible to work in the continuum of the most massive galaxies (M\(_{*}\gtrsim 10^{9.5}\textit {M}_{\odot }\) at z\(\sim 6\)).
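The link between the quoted resolving power and the velocity dispersions of galaxies can be illustrated with a short Python sketch. It assumes that \({\Delta }\lambda \) is the FWHM of a Gaussian line-spread function, so that the instrumental dispersion is \(\sigma _{\text {inst}} = c/(2.355\,R)\); this convention and the function name are assumptions made for illustration, not specifications from the proposal.

```python
import math

C_KM_S = 299792.458  # speed of light [km/s]
# FWHM = 2 sqrt(2 ln 2) sigma (about 2.355 sigma) for a Gaussian profile
FWHM_TO_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))


def instrumental_sigma_km_s(resolving_power):
    """Instrumental velocity dispersion [km/s] for R = lambda / FWHM (Gaussian LSF assumed)."""
    return C_KM_S / resolving_power / FWHM_TO_SIGMA


if __name__ == "__main__":
    for r in (1500, 2000, 3000):
        print(f"R = {r}: sigma_inst = {instrumental_sigma_km_s(r):.1f} km/s")
```

Under this assumption, R = 1500–3000 corresponds to an instrumental dispersion of roughly 85–42 km s\(^{-1}\), comparable to the lower end of the 50–300 km s\(^{-1}\) range of stellar velocity dispersions.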

Fig. 11

Stellar mass of a range of stellar populations with respect to redshift, with apparent magnitude H = 24AB and H = 26AB (from the synthetic models of [18]). The shaded regions extend from old stellar populations (formed at \(z_{\text {FOR}} = 10\)) to a younger galaxy (age 50 Myr). Real galaxies will mostly sit within the shaded regions

Regarding the issue of target selection for spectroscopy, H = 26AB is the sensitivity limit expected for the deep fields with Euclid/NISP [71], and the Roman Space Telescope/WFI will provide photometry slightly deeper than this [127].

Note how challenging these measurements are: at the faint end, an H = 26AB distant galaxy produces a flux of \(\sim \)3 photons per second in a perfect, unobstructed 3 m diameter telescope through the WFC3/F160W passband. Furthermore, the same collecting area yields \(\sim \)30 photons per hour, per spectral resolution element, in the continuum of a spectrum at R = 2000. The sky brightness at the best ground-based sites reaches \(\mu _{\text {H,AB}}^{\text {Sky}}\sim \)19.5 mag arcsec\(^{-2}\) [129], and the zodiacal background can be as high as \(\mu _{\text {H,AB}}^{\text {Zodi}}\sim \)21.5 mag arcsec\(^{-2}\) in the same spectral region (see Footnote 1). At these limiting magnitudes, any successful project must be based in space, and requires very long integration times, pointing towards the darkest regions away from the galactic plane and the ecliptic. For reference, the best spectroscopic samples of galaxies at z\(\sim \)2–3 with state-of-the-art, ground-based facilities (e.g. VLT/X-SHOOTER) reach K\(\lesssim \)21.5AB, and have noisy continua (e.g. [88]).
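The photon rates quoted above can be reproduced to order of magnitude from the definition of the AB system. The Python sketch below assumes a source with flat \(f_{\nu }\), a perfect unobstructed circular aperture, and a narrow top-hat band; the fractional width adopted for WFC3/F160W (about 0.17) is an approximate, assumed value, and the function name is hypothetical.

```python
import math

H_PLANCK = 6.62607015e-27   # Planck constant [erg s]
AB_ZERO_POINT = 3631.0e-23  # AB zero point: 3631 Jy in erg s^-1 cm^-2 Hz^-1


def photon_rate(mag_ab, diameter_m, frac_bandwidth):
    """Photons per second collected from a source of AB magnitude mag_ab.

    Assumes a flat f_nu spectrum, a perfect unobstructed circular aperture,
    and a narrow top-hat band of fractional width dlambda / lambda.
    """
    f_nu = AB_ZERO_POINT * 10.0 ** (-0.4 * mag_ab)        # erg s^-1 cm^-2 Hz^-1
    area_cm2 = math.pi * (100.0 * diameter_m / 2.0) ** 2  # collecting area [cm^2]
    # Photon flux per Hz is f_nu / (h nu); integrating over a narrow band,
    # dnu / nu = dlambda / lambda, so the rate is (f_nu / h) * (dlambda / lambda).
    return f_nu / H_PLANCK * frac_bandwidth * area_cm2


F160W_FRAC_WIDTH = 0.17  # assumed approximate dlambda / lambda of WFC3/F160W

# Broad-band rate [photons/s] for H = 26 AB in a 3 m telescope
broadband = photon_rate(26.0, 3.0, F160W_FRAC_WIDTH)
# Rate per resolution element at R = 2000 [photons/hour]
per_element = photon_rate(26.0, 3.0, 1.0 / 2000.0) * 3600.0

if __name__ == "__main__":
    print(f"broad band: {broadband:.1f} photons/s")
    print(f"per resolution element: {per_element:.0f} photons/hour")
```

With these assumptions the sketch gives a few photons per second in the broad band and a few tens of photons per hour per resolution element, in line with the figures quoted in the text.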

In a presentation for the future ESA L2/L3 science cases [48], we argued that no ground-based facility, including future telescopes such as ELT or TMT, will be capable of providing a clean spectrum over a wide spectral window, as needed to trace in detail the continuum associated with the stellar populations of galaxies at the peak of formation.

5.2 Tentative mission concept

The proposed science case will require a large-aperture survey telescope in space (between 3 m and 6 m diameter), ideally at L2, although bolder options in the future may consider a lunar platform (allowing for service missions, and providing added value to a future human exploration programme to the Moon). The survey will entail long total integration times per field, over the 100 ks mark – requiring fine pointing accuracy. Such a survey would be, by far, the deepest ever taken. The baseline concept [48] was equivalent to taking one Hubble Ultra Deep Field every fortnight for five years. Such characteristics place Chronos as an L-type mission, ideally including cross-collaborations with international space agencies outside the ESA domain. A smaller, M-type, mission could be envisioned for technology development, targeting the most luminous galaxies in the two cosmic intervals under study.

5.3 Why target one million spectra?

The aim of the survey is to provide a legacy database of high quality galaxy spectra, sampling both the peak (z\(\sim \)1–3) as well as the first phases (z \(\gtrsim \) 6) of galaxy formation. In contrast with cosmology-driven missions – such as Euclid or Planck – that have a unique figure of merit for the constraint of a reduced set of cosmological parameters, Chronos will be a “general-purpose” survey. Regarding sample size, we use as a reference the best spectroscopic dataset of galaxy spectra at z\(\lesssim \)0.2, namely the “classic” Sloan Digital Sky Survey (SDSS), comprising approximately 1 million optical spectra of galaxies brighter than r\(\sim \)18 AB (e.g., [157]). The problems facing galaxy formation studies are not as “clean” as, for instance, finding w in a dark energy-dominated cosmology, or water vapour in an exoplanet. Galaxy formation is a highly complex field, involving a large set of physical mechanisms. Such complex questions need large datasets to be able to probe in detail variations of the observables with respect to properties such as the stellar mass, size or morphology of the galaxy under consideration, the mass of its host halo, the potential nuclear activity (ongoing or recent), the presence of infall/outflows, or tidal interactions from nearby interlopers. Therefore, it is necessary to probe this multi-parameter space in sufficient depth to understand in detail the role of the mechanisms driving galaxy formation. This is where a large multiplex mission such as Chronos exceeds the specifications of future large facilities such as JWST or ELT. Although JWST/NIRSpec will obviously have the capability of observing deep NIR spectra of distant galaxies, its small field of view, lower multiplex and oversubscription – across a wide range of disciplines – will allow such a powerful telescope to gather, at most, \(\sim \)1,000 galaxy spectra at similar spectral resolution, within the remit of this science case [114].
Undoubtedly, it will help tackle the science drivers listed above. However, such a small sample will always raise the question of whether the observed sources are representative. Furthermore, if one wants to explore the effect of one of the parameters/observables listed above, it will be necessary to divide the sample accordingly. As an example, studies of environment-related processes done at lower redshift with SDSS or GAMA work with samples between 10 and 100 times larger than the potential output of JWST. Chronos should be considered a successor to JWST and ELT-class facilities in galaxy formation studies.

5.4 Why the proposed spectral mode?

Table 1 shows the overall properties of the proposed survey. Choosing a wider wavelength coverage would enable us to target additional spectral features. In principle, it would be possible to extend the coverage to K band. Note that the targeted spectral coverage is suitable for the analysis of the feature-rich region around the 4000 Å break in the “cosmic-noon” sample. Those features will be good enough to determine accurate kinematics, stellar population properties (age and chemical composition) and gas parameters. Extending the data, e.g. to 3 μm, would add Hα at the highest redshifts of the “cosmic noon” sample (z\(\sim \)4), which will obviously increase the science return, as the Hα region will allow us to improve on estimates of star formation rates, or characterize in more detail the ionization state of the gas. In the “cosmic dawn” sample, the key region is the Lyα interval, fully covered at the required redshifts by the proposed wavelength range, so extended spectral coverage is of limited benefit for this sample.

Table 1 Summary of the Chronos survey specifications

However, we emphasize that a significant increase in wavelength coverage at fixed spectral resolution could make the spectrograph prohibitively expensive in a high multiplex system such as Chronos. The main science drivers cannot be fulfilled at lower spectral resolution, and a lower multiplex will make the proposed survey size unfeasible within a 5-year mission concept, so a wider spectral coverage should not be the major direction to improve on this concept. Note also that Chronos will operate with ultra-faint sources. At longer wavelengths, the thermal background of the telescope will require additional, costly solutions to keep the background at acceptable levels.

5.5 Comparison with current and future projects

Chronos will play a unique role in the landscape of future near-infrared spectroscopic surveys. The MOONS multi-fiber spectrograph at the ESO VLT will have a multiplex of about 1000 fibers over a field of view (FoV) of 500 arcmin\(^{2}\), and will cover 0.6–1.8 μm at R\(\sim \)5000 [26]. The Subaru Prime Focus Spectrograph (PFS) will use up to 2400 fibers over 1.3 deg\(^{2}\), and will cover 0.38–1.26 μm at R\(\sim \)4300 in the near-IR (λ > 0.94 μm) [130]. The Maunakea Spectroscopic Explorer (MSE) is planned as a dedicated 10m-class telescope with a high multiplex (2,000–3,000) spectrograph that will work at low and high spectral resolution [89]. However, the data taken with these promising facilities will be inevitably affected by the atmosphere (opaque spectral windows, telluric absorption lines, OH emission line forest, high sky background) which will severely limit the sensitivity, the quality of the spectra and the capability to observe the continuum of faint objects. Having both continuum and absorption information in galaxy spectra over a wide spectral window allows us to break the degeneracies that entangle the properties of the stellar populations. In the case of space-based facilities, JWST will have a very small survey efficiency due to its small FoV and is therefore expected to play a complementary role in the detailed study of small samples of objects. Euclid and the Roman Space Telescope will survey very wide sky areas (15,000 deg\(^{2}\) and 5,000 deg\(^{2}\), respectively) in the near-infrared (\(\sim \)1–2 μm), but the spectroscopy will be slitless and with low resolution (R < 500). This makes Euclid and the Roman Space Telescope powerful missions for redshift surveys based on fairly bright emission lines, but less suitable for detailed spectroscopic studies.
For these reasons, Chronos will play a unique and unprecedented role thanks to its uninterrupted and wide near-IR spectral range (rest-frame optical for z > 1 objects), extremely high sensitivity due to the low background, capability to detect the continuum down to H = 24-26AB, high S/N ratio suitable to perform astrophysical and evolutionary studies, very high multiplexing, wide sky coverage and large (SDSS-like) samples of objects.

5.6 Star formation history of galaxies

The stellar component in a galaxy is made up of a complex mixture of ages and chemical composition, reflecting its past formation history. For instance, galaxies that underwent recent episodes of star formation will include a young stellar component, characterised by strong Balmer absorption (e.g. [150]); an efficient process of gas and metal outflows will be reflected in the chemical composition, targeted through metallicity-sensitive spectral indices such as Mgb and 〈Fe〉 (e.g. [135]); abundance variations between different chemical elements, such as [Mg/Fe], map the efficiency of star formation (e.g. [134]); variations in the stellar initial mass function (a fundamental component of any galaxy formation model) can be constrained through the analysis of gravity-sensitive indices (e.g. [49, 61]).

The stellar component of a galaxy encodes a fossil record of its evolution. In contrast, the gaseous component gives a snapshot of the “ongoing” processes. The analysis of the unresolved stellar populations in distant galaxies is tackled through targeted line strengths and spectral fitting, by comparing high-quality spectroscopic data with the latest stellar population synthesis models (e.g. [143, 144]). Such methods have been very successful at understanding the formation history of low redshift galaxies by use of spectra from the Sloan Digital Sky Survey (e.g. [52]). Similar studies at high redshift are fraught with the difficulties of dealing with very faint sources, in an observer frame (NIR) where the complex and highly variable airglow and telluric absorption make ground-based observations tremendously challenging. Figure 12 shows a test with synthetic spectra of the S/N level required to constrain stellar population parameters from a set of line strengths. For a 0.1 dex (statistical) accuracy in log(Age) or \(\log (Z/Z_{\odot })\), typical values of S/N of \(\sim \)10–20 per resolution element are required in the continuum. This is a challenging target for galaxies at z\(\sim 2-3\), given the faint flux levels in the continuum shown in Fig. 11.

Fig. 12

The red lines are model predictions from [18] for two age-sensitive (left) and two metallicity-sensitive (right) line strengths for a galaxy with velocity dispersion σ = 200 km s\(^{-1}\), as a function of age and metallicity, respectively (the bottom axes show the age and metallicity ranges). The orange lines are the estimated measurements, along with a 1σ error bar, given as a function of S/N (shown in the top axes). The simulated data correspond to a population at solar metallicity and age 3 Gyr, marked with vertical dashed blue lines, along with a ± 0.1 dex interval in grey

5.7 The role of AGN

Studies of the past star formation histories of galaxies (Section 5.6) need to be compared with diagnostics of AGN activity, to understand the connection between galaxy growth and that of the central SMBH. Such studies are based on emission line diagrams (e.g. [5]) that trace the ionisation state of the interstellar medium. The requirements with regard to S/N and spectral resolution are similar to the limits imposed by the analysis of stellar populations, although we note that emission line constraints will, in general, be less stringent than those in the continuum. At high enough S/N, it may be possible to separate the central component (dominated by the AGN) from the bulk of the galaxy. As a reference, a 0.1 arcsec resolution element maps into a projected physical distance of 0.8–1 kpc at z\(\sim \)1–3.

5.8 Environment and merger history of galaxies

Large spectroscopic redshift surveys are needed to characterize the environment of galaxies in detail (e.g. [117, 156]). A mass-limited complete survey will allow us to probe the merging history of galaxies, either from the study of dynamically close pairs (e.g. [50, 79]) or through morphological studies (e.g. [78]).

Although deep NIR imaging surveys will be available at the time of a potential L4 mission, there will not be a comprehensive counterpart of spectroscopic observations, except for reduced sets of galaxies (\(\lesssim \)1,000) observed by JWST, or E-ELT-like telescopes from the ground. In order to beat cosmic variance it is necessary to obtain spectroscopic redshifts covering large enough volumes. As a rough estimate, we use the state-of-the-art Sloan Digital Sky Survey as reference. The original low-redshift dataset, limited to r < 17.7AB, can be considered “complete” out to redshift z\(\lesssim \)0.2, covering a comoving volume of \(5.5 \times 10^{-5}\) Gpc\(^{3}\) per square degree. At the peak of galaxy formation activity, z\(\sim 1-3\), the equivalent volume is 0.02 Gpc\(^{3}\) per square degree. Since the SDSS footprint extends over \(\sim 10^{4}\) deg\(^{2}\) on the sky, a similar comoving volume will be probed by Chronos if covering 30 deg\(^{2}\). Although a detailed analysis is beyond the scope of this proposal, it may be advisable to opt for a tiered survey, from shallower samples (H < 24AB) over 100 deg\(^{2}\) to deeper regions, covering \(\sim 5-10\) deg\(^{2}\) at H < 26AB.
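The volume comparison above can be checked with a few lines of Python. This is a minimal sketch that assumes a flat ΛCDM cosmology with \(H_{0} = 70\) km s\(^{-1}\) Mpc\(^{-1}\) and \(\Omega _{m} = 0.3\) (assumed values, not stated in the text), neglects radiation, and uses illustrative function names.

```python
import math

C_KM_S = 299792.458  # speed of light [km/s]
FULL_SKY_DEG2 = 4.0 * math.pi * (180.0 / math.pi) ** 2  # about 41253 deg^2


def comoving_distance_gpc(z, h0=70.0, omega_m=0.3, steps=2000):
    """Line-of-sight comoving distance [Gpc] in flat LCDM (radiation neglected).

    h0 and omega_m are assumed values, not quoted in the proposal.
    """
    omega_l = 1.0 - omega_m
    dz = z / steps
    total = 0.0
    for i in range(steps):
        zm = (i + 0.5) * dz  # midpoint rule for the integral of dz / E(z)
        total += dz / math.sqrt(omega_m * (1.0 + zm) ** 3 + omega_l)
    return (C_KM_S / h0) * total / 1000.0


def volume_per_deg2(z_min, z_max):
    """Comoving volume [Gpc^3] per square degree between z_min and z_max."""
    r_min = comoving_distance_gpc(z_min)
    r_max = comoving_distance_gpc(z_max)
    shell = 4.0 / 3.0 * math.pi * (r_max ** 3 - r_min ** 3)  # full-sky shell
    return shell / FULL_SKY_DEG2


if __name__ == "__main__":
    v_sdss = volume_per_deg2(0.0, 0.2)
    v_noon = volume_per_deg2(1.0, 3.0)
    print(f"z < 0.2 : {v_sdss:.2e} Gpc^3 per deg^2")
    print(f"z = 1-3 : {v_noon:.3f} Gpc^3 per deg^2")
    print(f"SDSS-like (1e4 deg^2): {1.0e4 * v_sdss:.2f} Gpc^3")
    print(f"Chronos (30 deg^2)   : {30.0 * v_noon:.2f} Gpc^3")
```

With these assumed parameters, the low-redshift value comes out close to the \(5.5 \times 10^{-5}\) Gpc\(^{3}\) per square degree quoted above, the z = 1–3 shell is of order 0.02 Gpc\(^{3}\) per square degree, and the two total volumes agree to within a factor of order unity.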

5.9 Gas and stellar kinematics and chemistry

The emission and absorption line positions and shapes are a valuable tool to study the kinematics and chemical composition of the stellar and gaseous components. Through high volumes of high S/N data with high enough spectral resolution, it will be possible to trace stellar kinematics and the mechanisms of gas outflows and stellar feedback. Moreover, information such as the velocity dispersion or the spin parameter can be used to constrain the properties of the dark matter halos hosting galaxies at z\(\sim \)1-3 (e.g. [19, 154]). More detailed analyses can be gathered by integral field units, where the spectra of different regions of the galaxy are extracted separately. Such instruments have facilitated detailed analyses of the stellar and gaseous components in nearby (e.g. ATLAS3D [22]) and distant (e.g. KMOS3D [152]) galaxies. Due to the faintness of the sources and the need for a high multiplex system covering a wide field of view, we would, in principle, decide against an IFU-based instrument, although this issue would be an important one to tackle during the definition phase (see Section 6). Also note that at the redshifts probed, the (spatial) resolving power is rather limited, expecting a resolution – measured as a physical projected distance – around 1 kpc at z\(\sim \)1–3. The high S/N requirements of the previous cases align with this one, but on the issue of kinematics, a slightly higher spectral resolution may be desired. Note also that at high resolution, high S/N spectra may be used to disentangle different components – such as bulge and disc [98].

5.10 Observations of galaxies in the high redshift Universe

Deep NIR spectroscopy from space is the only way to confirm the continuum break at 1216Å in the high redshift Universe (z\( \gtrsim \)6). Ground-based instruments only detect these objects when they have strong Ly-α lines in clean regions of the night sky spectrum. This line can be scattered by neutral intergalactic gas, and is likely to be weaker at such redshifts. The goal of achieving enough S/N in the continuum is important to properly characterize the properties of the underlying stellar populations, something that could be done with individual galaxies in the deep survey, and with stacked subsamples in the wide survey.

6 Measurement concept

6.1 Reconfigurable focal planes for space applications

JWST will be the first astronomy mission to have a true multi-object spectroscopic capability via the micro-shutter arrays in the NIRSpec instrument, which can observe up to 100 sources simultaneously over a field-of-view around \(3^{\prime }\times 3^{\prime }\) [74]. Scaling this technology to the field sizes and multiplex advantage required for the next generation of space-based spectroscopic survey instruments is not straightforward, however, and will likely require a new approach. There are currently three technologies which show promise in this area.

  1. Digital Micromirror Arrays (DMDs): Digital micromirror technology was developed in the 1990s by Texas Instruments for use in light projection systems (see Fig. 13). The current state-of-the-art is 2k × 1k devices with 13 μm pixels, but larger format devices (up to 16 million pixels) are under development (cf. 62,000 micro-shutters in JWST NIRSpec). These devices are also well matched in pixel size to the focal planes of small to medium size telescopes. DMDs were first proposed for the ESA M-class SPACE mission concept [25] which later evolved into the Euclid mission. The primary technological challenges in exploiting DMDs in space are: (i) developing radiation-hard electronics to drive the DMDs (the MEMS technology used in the mirrors themselves is not susceptible to damage except by extreme micrometeorite events), (ii) demonstrating reliable operation at cryogenic temperatures as required for observations at near-infrared wavelengths, (iii) modifying the visible-light windows on commercial devices to allow extended operation into the near-infrared, (iv) improving the contrast/scattered light for bright objects. Preliminary work has been undertaken during studies for SPACE/Euclid and elsewhere [158, 159] but further work is required to raise the TRL.

  2. Reconfigurable Slits: A near-infrared multi-object spectroscopy target selection system which has been successfully deployed on the ground-based MOSFIRE instrument at Keck is the Configurable Slit Unit (CSU) [126]. This is a form of micro-mechanical system which employs voice-coil actuated “inch-worm” motors to position up to 46 slitlets (each 5 mm long) in a \(4^{\prime }\times 4^{\prime }\) field. This technology has been proven to be reliable in cryogenic operation (at 120 K) over several years. Whilst the 1-D motion of the slits in the focal plane reduces target acquisition efficiency somewhat, the contiguous slits allow for improved sky-subtraction compared to devices (like the DMD) where the local sky is obtained via separate apertures. The primary technology challenges of adopting this technology for space applications would be the substantial miniaturization required and increasing the multiplex gain by 1 or 2 orders of magnitude, possibly using a piezo-electric drive system.

  3. Liquid Crystal Masks: Liquid crystal (LC) masks are widely used as spatial light modulators in a number of laboratory applications. Whilst fundamentally relying on the ability of polarizing crystals to transmit or block linearly polarized light, they can be made to work more efficiently on unpolarized light using polymer dispersed liquid crystals (PDLCs). By combining an optically active material with an appropriate electrode structure, reconfigurable masks can be obtained which only transmit light in a specific spatial pattern [153]. Devices in formats up to 1024×768 are commercially available with up to 36 μm pixels. The primary technological challenges would be: (i) operation efficiency (switching times) at low temperatures due to the properties of the LC medium, (ii) contrast limits (many commercial devices have contrast ratios < 100:1), (iii) limitations on bandwidth due to the chromatic properties of LCs, (iv) limitations due to non-orthogonal illumination.

Fig. 13

Photomicrograph of tilted DMD micromirrors. The neighbouring mirrors have been removed to reveal the substructure (courtesy ASME/Texas Instruments)

In addition to the above reconfigurable focal plane solutions, which segment the focal plane spatially according to preselected target positions, an alternative approach is to select targets from a fixed grid of sub-areas across the focal plane (one target per sub-area) using a “beam-steering” approach. Many beam-steering schemes rely on similar underlying technologies to those discussed above (i.e. MEMS and/or variable prisms) and should be explored in the context of specific mission requirements. They are particularly well suited to selecting targets for spatially-resolved (“integral field”) spectroscopic studies.

6.2 Large format integral field units

When the target densities are high enough, a complementary approach to massively multiplexed spectroscopy is to use some form of integral field spectroscopy (IFS), which delivers a full spectral datacube for a contiguous region of sky. The IFS approach also opens up a large serendipity space since no imaging surveys are required to pre-select targets.

Integral field units have been widely used on ground-based telescopes (e.g. [1]) and a small-format device (30 × 30 spatial pixels) will be launched on JWST as one of the observing modes for the NIRSpec spectrograph [12]. Integral field systems can be realized using a number of techniques but the favoured approach for space infrared systems is the diamond-machined image slicer [76] which can take advantage of monolithic manufacturing methods and a robust thermal design approach. Technology developments would be required to develop wide-field integral field systems for space applications, but the generic approach using a “field-splitter” front-end optic to feed multiple sub-systems is well-understood from ground-based instruments (e.g. [102]). Mass, power and data rate budgets remain to be explored but will be common to all wide-field spectroscopic facilities.

6.3 NIR detector technology

A successful outcome of a survey such as Chronos also rests on highly efficient NIR detector technology, with minimal noise and well-understood systematics. The survey operates in a very low photon-count regime, where it is essential to control the noise sources and to understand in detail the response of the detector. For instance, cross-talk and persistence are substantial problems that can hinder the observations, and need to be characterised in exquisite detail.

The best available technology for this science case involves HgCdTe-based detectors, where the spectral range can be optimised by the choice of the ratio of Hg to Cd, which tunes the band gap between 0.1 and 1.5 eV. As of today, US-based companies can provide 4k×4k HgCdTe arrays with high enough TRL for a space mission (e.g. Teledyne, [10]). However, given the long timescales expected for a potential mission, and aligned with ESA’s investment in NIR detector technology development [97], it would be desirable to involve European groups (such as CEA-LETI, Selex ES or Caeleste) in the development of ultra-sensitive NIR detectors and the associated electronics. New HgCdTe-based technology with avalanche photodiodes, developed by Selex, appears quite promising for astrophysics applications [55].
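The link between band gap and spectral range can be made concrete with the standard photon-energy relation λ_c = hc / E_g. The sketch below converts the band-gap range quoted above into cutoff wavelengths, and inverts the relation for a 2 μm cutoff. Function names are illustrative, and the mapping from Hg:Cd ratio to band gap (which also depends on temperature) is deliberately not modelled.

```python
HC_EV_MICRON = 1.239841984  # h*c expressed in eV * micron


def cutoff_micron(band_gap_ev):
    """Longest wavelength (micron) a photon can have and still bridge the gap."""
    return HC_EV_MICRON / band_gap_ev


def band_gap_for_cutoff(cutoff_um):
    """Band gap (eV) that places the detector cutoff at cutoff_um micron."""
    return HC_EV_MICRON / cutoff_um


if __name__ == "__main__":
    for gap in (0.1, 1.5):  # band-gap range quoted in the text
        print(f"E_g = {gap:.1f} eV -> cutoff near {cutoff_micron(gap):.2f} micron")
    print(f"2.0 micron cutoff -> E_g near {band_gap_for_cutoff(2.0):.2f} eV")
```

In practice a cutoff somewhat beyond the longest science wavelength may be preferred; that margin is a design choice not addressed here.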

6.4 Photonics-based approach

An alternative approach to the traditional spectrograph design is to adopt a photonics-based instrument, creating the equivalent version of an integrated circuit in electronics. Astrophotonics has produced several revolutionary technologies that are changing the way we think of conventional astronomical instrumentation. In particular, the invention of the photonic lantern allows us to reformat the input to any instrument into a diffraction-limited output [73]. In principle, any spectrograph operating at any resolving power can be designed to fit within a shoebox [13, 14]. These authors refer to this as the photonic integrated multimode microspectrograph (PIMMS) concept and it has been demonstrated at the telescope and in space (see Fig. 14), also taking advantage of the CubeSat architecture [20]. Suitable optical designs are presented in [115]. Presently, the main limitation is that the ideal detector has yet to be realized, although discussions are ongoing with detector companies. This technology is ideally suited to optical and infrared spectroscopy, and may overcome the technological challenges faced by conventional spectrograph designs within ESA’s Voyage 2035-2050 long-term plan.

Fig. 14

As seen above, the PIMMS concept has been demonstrated at the telescope [29], on a balloon (2012) and onboard the Inspire CubeSat (2017–18) flown by the University of Sydney [20]

7 Summary

Responding to ESA’s call regarding the Voyage 2050 long-term plan to define the future space missions that will address the astrophysics science questions during the 2035-2050 period, we present here the motivation for an all-sky, survey-mode, ultra-deep NIR spectroscopic mission from space, to tackle the most important open questions in extragalactic astrophysics. The observed-frame NIR spectral range allows us to probe the two most important phases of galaxy evolution, namely the formation of the first galaxies (at z\(\gtrsim 6\); cosmic dawn, around rest-frame UV Lyman-α), and the peak of galaxy formation activity (at redshift z\(\sim \)1–3; cosmic noon, in the rest-frame optical). The faintness of these distant sources and the need for a large number of spectra (to produce robust statistical constraints on the mechanisms of galaxy evolution) pose one of the strongest challenges faced by any astrophysics mission to date. To put the mission in context, the survey requires the equivalent of gathering one Hubble Ultra-Deep Field every fortnight for five years. In order to probe the full range of models of galaxy evolution, we need to split the sample with respect to a sizeable set of parameters such as stellar mass, velocity dispersion, size, nebular activity, or environment. This requires a large sample, comparable to the \(\sim \)1 million spectra from the SDSS classic survey, a goal that is unachievable with state-of-the-art facilities such as JWST or ground-based ELTs.
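Two of the figures in this paragraph lend themselves to a quick back-of-the-envelope check: the number of ultra-deep fields implied by a fortnightly cadence over five years, and how thinly one million spectra are spread once the sample is split along several parameters. In the sketch below the five parameter names come from the text, whereas the number of bins per parameter and the assumption of equal occupancy per cell are purely illustrative.

```python
import math

DAYS_PER_YEAR = 365.25
FORTNIGHT_DAYS = 14


def ultra_deep_fields(years=5):
    """Number of fortnight-long ultra-deep fields that fit in the given span."""
    return years * DAYS_PER_YEAR / FORTNIGHT_DAYS


def spectra_per_cell(n_spectra, bins_per_parameter):
    """Mean spectra per cell if every cell were equally populated (assumption)."""
    n_cells = math.prod(bins_per_parameter.values())
    return n_cells, n_spectra // n_cells


if __name__ == "__main__":
    print(f"about {ultra_deep_fields():.0f} ultra-deep fields in five years")

    # Hypothetical binning: four bins along each parameter named in the text.
    bins = {
        "stellar mass": 4,
        "velocity dispersion": 4,
        "size": 4,
        "nebular activity": 4,
        "environment": 4,
    }
    cells, per_cell = spectra_per_cell(1_000_000, bins)
    print(f"{cells} cells, roughly {per_cell} spectra per cell")
```

Real samples populate such a grid very unevenly, so the sparsest cells would hold far fewer spectra than this mean, which is one way to see why a sample of order one million is requested.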

We note that cosmology-driven redshift surveys impose much weaker constraints on the S/N in the continuum, as they only use galaxies as “test particles”, thus only requiring a redshift measurement. This project focuses on the galaxies themselves. The proposed database is needed to solve the key open questions in galaxy formation. The purpose of this proposal is to start a comprehensive study of such a demanding survey, focusing on the challenging technical aspects involving an ultra-deep (H\(\sim \)24-26 AB), high multiplex (\( \gtrsim \)5,000), NIR (0.8–2 μm) space-based spectrograph, at optimal spectral resolution for galaxy formation studies (R\(\sim \)2,000), and with a large field of view (\( \gtrsim \)0.2 deg\(^{2}\)).
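The headline specifications above translate into a few derived quantities that help in comparing instrument concepts. The sketch below takes the lower bounds quoted in the text at face value and assumes a constant resolving power R = λ/Δλ across the band, in which case the number of resolution elements is R ln(λ_max/λ_min). The function names are illustrative.

```python
import math

C_KM_S = 299_792.458      # speed of light in km/s

MULTIPLEX = 5_000         # simultaneous spectra (lower bound from the text)
FOV_DEG2 = 0.2            # field of view in square degrees (lower bound)
BAND_MICRON = (0.8, 2.0)  # spectral coverage
RESOLVING_POWER = 2_000   # R = lambda / delta_lambda, assumed constant


def targets_per_arcmin2(multiplex=MULTIPLEX, fov_deg2=FOV_DEG2):
    """Mean surface density of targets observed per pointing."""
    return multiplex / (fov_deg2 * 3600.0)  # 1 deg^2 = 3600 arcmin^2


def resolution_elements(band=BAND_MICRON, resolving_power=RESOLVING_POWER):
    """Number of resolution elements across the band at constant R."""
    lo, hi = band
    return resolving_power * math.log(hi / lo)


def velocity_resolution_km_s(resolving_power=RESOLVING_POWER):
    """Velocity width of one resolution element."""
    return C_KM_S / resolving_power


if __name__ == "__main__":
    print(f"{targets_per_arcmin2():.1f} targets per arcmin^2")
    print(f"{resolution_elements():.0f} resolution elements per spectrum")
    print(f"{velocity_resolution_km_s():.0f} km/s per resolution element")
```

These are order-of-magnitude guides only; detector sampling, slit losses and any wavelength dependence of R are not included.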