1 Scientific rationale

Massive stars are stellar-size objects of broad astrophysical impact. Born with M > 8 M⊙, they live fast and die spectacularly, making an excellent source of fast chemical enrichment in their host galaxy. During a long fraction of their evolution they experience very high effective temperatures (Teff ≥ 20000 K, up to 200000 K in some stages) that result in an extreme ionizing UV-radiation field. The same UV radiation powers supersonic winds that inject large amounts of kinetic energy into the interstellar medium (ISM) and create ionized bubbles and complex HII structures (e.g. 27, review). The deaths of massive stars are counted among the most disruptive events ever registered: type Ib,c,II supernovae (SNe), pair-instability SNe, super-luminous supernovae (SLSNe), kilonovae, and γ-ray bursts (GRBs). The surviving end products, neutron stars and stellar-size black holes, are sites of extreme physics.

Massive star feedback enters small and large-scale processes spanning the age of the Universe, including the formation of subsequent generations of stars and planets, and the chemodynamical evolution of galaxies. Because the cosmic chemical complexity is ever-growing after the Big Bang, simulating these phenomena in past systems and interpreting the available observations demand robust models for the atmospheres, evolution, and feedback of massive stars at ever-decreasing metallicity. Moreover, the metal-poor ISM is more porous to UV wavelengths and it is expected that the UV radiation of extremely metal-poor massive stars has an impact over comparatively larger areas [25]. Ultimately we need verified models for nearly metal-free very massive stars that can be extrapolated to describe the first stars of the Universe.

The massive stars of the Small Magellanic Cloud (SMC) constitute the current standard of the metal-poor regime, with a battery of observations from ground- and space-based telescopes providing empirical evidence and constraints to theory (e.g. 21, 36, 83, 94, 99, 106, 142). All this is integrated into population synthesis codes used to interpret observations of star-forming galaxies along cosmic history.

However, the 1/5 Z⊙ metallicity of the SMC is not representative of the extremely metal-poor regime, nor the average metallicity of the Universe past redshift z = 1 [89]. The theoretical framework for lower metallicities does exist [33, 93, 95, 126] and predicts substantial differences in the evolutionary pathways with impact upon the time-integrated feedback and end products. We will elaborate further on this point, but we highlight now that one of the proposed mechanisms to reproduce the first gravitational wave ever detected involves the binary evolution of two metal-poor massive stars [1, 91].

This paper deals with extremely metal-poor, sub-SMC metallicity massive stars. In the following pages we will summarize the state-of-the-art on the topic, the exciting new scenarios that we may expect from the theoretical predictions, how far we have reached with current observatories, and prospects for future missions in the planning.

1.1 Formation of massive stars in metal-poor environments

How massive stars form remains a matter of intense investigation. Our understanding of this topic has significant gaps ranging from the formation of individual stars, to how the upper initial mass function (IMF) is populated, and whether there is a dependency on environment. Two principal issues make the formation process markedly different from their lower-mass siblings: the star-forming clumps must be prevented from breaking into smaller pieces, and radiation pressure from the forming star may halt accretion.

Two main theories of star formation have emerged: competitive accretion (radiation pressure is overcome by forming massive stars in the gravitational well of the whole cluster, favoring the possibility of mergers, 11) and monolithic collapse (radiation is liberated via a jet, 74). At solar metallicity, they both struggle to form stars more massive than 20-40 M⊙ [61, 127, 151]. New promising simulations may raise this figure to \(\sim \)100 M⊙ [76, 81] in better accordance with the number of known ≥ 60 M⊙ stars in Milky Way clusters (e.g. Westerlund 1: [24], Carina: [121], Cygnus OB2: [8] or the Galactic Center: [97]) although still far from the \(\gtrsim \)150 M⊙ value registered in the Large Magellanic Cloud (LMC) (9, 26, see below).

Models also struggle to reproduce the large number of short period binaries that are observed in massive star populations [6, 70, 108, 109]. Fragmentation of the massive accretion disc [72] is a promising way to match the high-degree of multiplicity observed in the Milky Way [4, 90, 110] but the binaries thus formed are too wide, and an additional hardening mechanism is required [111].

The feasibility of different scenarios of massive star formation eludes observational confrontation since it is hard to catch forming massive stars in the act. The most massive young stellar objects (MYSOs) have 20-30 M⊙ [98], but at this stage a highly embedded hot core is detected where it is complicated to disentangle the contribution of the accretion structure and the ionized gas component with either imaging or spectroscopy [114]. A few interesting links exist, such as IRAS 13481-6124, an 18 M⊙ MYSO for which VLTI-AMBER detected a 20 M⊙ surrounding disc [73]. Nonetheless, this system would only make \(\sim \)40 M⊙ at maximum efficiency. The number of candidate merger events [123] or merger products [133] is also small.

The situation should be alleviated in environments of decreasing metallicity, since the paucity of metals would both prevent gas from cooling and breaking up into smaller pieces, and make pre-stellar radiation-driven outflows weaker [140]. The former argument is fundamental to support the widely-accepted concept that the first, metal-free stars of the Universe were very massive. In fact, the record-holding \(\gtrsim \)150 M⊙ massive stars have been found in 30 Doradus at the heart of the Tarantula Nebula in the LMC [26], and have 0.4 Z⊙ metallicity. Evidence of over 100 M⊙ stars has also been found in the integrated light of unresolved, metal-poor starbursts [120, 146].

The Local Group and nearby dwarf irregular galaxies (dIrr) enable us to investigate whether the upper mass limit is set to higher values as metallicity decreases in the range of 1/7–1/30 O⊙ (see Section 2). Some of these galaxies host spectacular HII shells equivalent in size to the 30 Doradus region (Fig. 1), but no analog to the LMC’s monster stars has been found yet. The most massive star reported has an initial mass of 60 M⊙, and only a handful of them have masses in the 40-60 M⊙ range [47].

Fig. 1

Where are the very massive stars of the Local Group? Left and middle panels: The R136a cluster at the heart of the Tarantula nebula, in the LMC, hosts the most massive stars known in the local Universe (Fig. 1 from “The R136 star cluster hosts several stars whose individual masses greatly exceed the accepted 150 M⊙ stellar mass limit”, 26, adapted). Right: The Local Group 1/10 Z⊙ galaxy Sextans A hosts HII shells equivalent in size, but no star more massive than 60 M⊙ has been detected

It may be argued that other factors may outweigh the paucity of metals and lead to smaller final masses, such as the mass of the reservoir of H2, the local gas density or the star-formation rate [60], expecting higher mass stars in denser, more active regions. This hypothesis was also challenged by the spectroscopic detection of OB-stars at the low stellar and gas density outskirts of Sextans A, two of them being the youngest and most massive stars known so far in this galaxy [49]. Previous indications of young massive stars in low gas-density environments existed, like the extended UV-disc galaxies [53], but O stars identified spectroscopically enabled the unequivocal association of UV emission at the outskirts with young massive stars (Fig. 2).

Fig. 2

Signatures of star-formation in low gas-density environments. Left: NGC 5236 and other extended UV-disc galaxies exhibit UV emission (hence on-going star-formation) up to 4 times beyond their optical radius. RGB composite made with FUV (blue) and NUV (green) GALEX channels (from 12, adapted). Right: The youngest, most massive stars ever reported in Sextans A (circles, 49) are located in the outskirts of the galaxy, where the density of neutral hydrogen [67] is comparatively low. The HI emission is laid over a V-band image of Sextans A

The density and distribution of molecular gas would be key to understand star formation in these environments, but this piece of the puzzle is missing. Direct observations of cold H2 are unfeasible at most of the sub-SMC metallicity Local Group dIrr galaxies (0.715 - 1.3 Mpc) and detecting CO is extremely challenging, because of the low metal content. CO has been detected only in a few of them [35, 117], overlapping the highest concentrations of stars and UV emission, but the low-density outskirts of the galaxies have not been targeted. The mass of molecular gas can also be estimated from the dust content but Herschel missed the inconspicuous regions at \(\sim \)1 Mpc (see e.g. selection by 116). In addition, results rely heavily on the adopted gas to dust ratios which in turn depend on metallicity with a large scatter [52]. At the moment, there is no reliable inference of the distribution of molecular gas in the dwarf irregular galaxies of the Local Group.

A tantalizing alternative is that star formation could proceed directly from neutral gas. Simulations have shown that low-density, metal-poor neutral gas can reach sufficiently low temperatures to begin collapsing without forming H2 molecules [75], breaking the link between star formation and molecular gas. Interestingly, there is a strong spatial correlation between HI and the location of OB stars and associations in the dwarf irregular galaxies of the Local Group [45]. Is it possible that cloud fragmentation and star formation follow different mechanisms in dense environments hosting molecular clouds, and in sparse, neutral-gas dominated regions? This could be a natural explanation for the occurrence of SLSNe in the outskirts of galaxies [86]. If this concept were demonstrated, the simulations of the evolution of galaxies would need to be revisited to check the significance of the stellar mass formed in low gas-density regions and in the outskirts of spirals.

Hence, the latest results not only highlight our poor understanding of massive star formation but also open new questions. The joint study of resolved massive stars and detailed maps of atomic and molecular gas will help to unravel the relative role played by H2 and HI in star formation, whether it changes with galactic site and metallicity, and whether it translates into different mechanisms that populate the IMF and the distributions of initial rotational velocity and of binaries distinctly. Ideally, untargeted, unbiased spectroscopic censuses of resolved massive star populations would enable reconstructing these distributions that are so important to understand star formation, establishing the local upper mass limit in particular, and checking for any dependence on metallicity and gas content. However, observations of the required spectral quality are out of reach to current technology because: 1) main sequence O stars at > 1 Mpc are at the sensitivity limit of optical spectrographs installed at 10m ground-based telescopes; 2) the programme should include near-IR observations to overcome internal extinction (significant in dIrr galaxies, 49) and to reach MYSOs, but both are out of reach to current IR spectrographs; and 3) the densest concentrations of gas and stars should be inspected to look for very massive stars but these regions are hardly resolved by ground-based spectroscopic facilities even using adaptive optics. Nonetheless if such a phenomenal database was possible it would enable studying on-going star formation with unprecedented detail, and re-calibrating star formation indicators once all the stellar mass content (including extincted stars that were initially unregistered) was properly accounted for.

1.2 Evolution, explosions, and feedback

The close interaction between massive stars and the Universe began with the first generation of stars. Primordial star formation simulations and evidence from extremely metal-poor halo stars strongly suggest that a fraction of them were sufficiently massive and hot as to commence the re-ionization of the Universe [16, 43, 64]. Ever since, signatures of their copious ionizing flux can be seen in highly-ionized UV emission lines (CIV 1548,1551, OIII] 1661,1666, [CIII] 1907 + CIII] 1909) [115], indirectly in Lyman-break galaxies (LBGs), and in a few interesting cases in the shape of Lyα emission galaxies (LAEs). They allow us to detect galaxies and probe the cosmic star formation rate out to redshift \(z \sim \) 10 (see e.g. introduction by 145).

Understanding massive stars with extremely low metal content is the missing piece of information to interpret star-forming galaxies in their many flavors, i.e. LAEs, LBGs, ULIRGs, and Blue compact dwarfs. Their physical properties (Teff, luminosity Lbol, mass loss rate \(\dot {M}\)) parameterized along their evolutionary stages, will enter population synthesis and radiative transfer codes such as Starburst99 [84], BPASS [34], and CLOUDY [40], to interpret the integrated light from massive star populations [147]. Armed with these tools to study the interplay between massive stars and hosts, we can answer outstanding questions such as the average ionizing photon escape fraction of galaxies, a crucial parameter to establish the end of the re-ionization epoch [41, 136].

Massive stars are born as O- or early B-dwarfs (Teff ≥ 20000 K), or extreme WNh stars when very massive. After H-burning the star undergoes a sequence of evolutionary stages that strongly varies with the initial stellar mass. The zoo of post-main sequence stages includes O and B supergiants, red supergiants (RSGs), luminous blue variables (LBV), yellow hypergiants (YHG) and Wolf-Rayet stars (WR), and covers an extreme temperature interval ranging from the \(\sim \)4000 K of RSGs [28] to the \(\sim \)200000 K of the most extreme oxygen WRs [130]. Evolutionary models must link these stages, drawing paths that depend on metallicity, steady/episodic mass loss (Section 1.3), rotational velocity and mass exchange in binary systems [82, 144].

Evolutionary tracks that treat rotation and mass loss have been extensively calculated for the Milky Way, LMC (1/2 Z⊙), SMC (1/5 Z⊙) [17, 32], I Zw18 (1/50 Z⊙) [59, 126], and Population III stars [31, 92, 150]. Significant changes are expected in the evolution of metal-poor massive stars, some of them with tremendous impact on ionizing fluxes. The most notable example is the incidence of chemically homogeneous evolution (CHE), in which fresh He produced in the core is brought to the surface by rotation-induced mixing. A 1/5 Z⊙, Mini = 25 M⊙ SMC star will usually reach the RSG stage, but if the initial rotational velocity (\(v \sin i\)) is extremely high it will evolve into a CHE-induced WR-like stage with \(T_{\text{eff}} \sim 100000\) K [17]. This effect is magnified at lower metallicities, where very massive stars can either evolve into an envelope-inflated RSG, or stay compact in the regime of high effective temperatures and become a Transparent Wind Ultraviolet INtense star (TWUIN; 78, 126). TWUINs double the HI ionizing luminosity and quadruple the HeII ionizing luminosity with respect to lower \(v \sin i\) counterparts, and could be responsible for the extreme HeII emission detected in I Zw18 and the \(z \sim \) 6.5 galaxy CR7, currently attributed to population III stars [69, 124].

Aside from the initial mass, rotation, and metallicity, multiplicity is yet another critical ingredient that can dramatically impact the evolution of massive stars. Double stars with a short enough orbital period (\(\sim \) few years) will exchange mass and angular momentum with their companions, thus deeply altering the future evolution of the stars involved [104], end-of-life events and left-over compact products (see below). The shortest-period systems may merge and develop extreme properties in terms of rotation [30] or magnetic field [113], that we are just beginning to unravel. In systems that do not merge, the primaries (initially more massive) will be stripped of their envelope and may become WR stars or OB subdwarfs [54, 55], which are very hot and generate significant UV excess. The secondaries will gain mass and angular momentum, which may trigger shear instabilities, enhance internal mixing and send the stars on a chemically homogeneous evolution pathway. At the moment, we lack any information on the frequency and period distribution of metal-poor massive binaries beyond the Magellanic Clouds, since only a handful of systems are known, all of them in the galaxy IC 1613 (e.g. 10). Nonetheless, multiplicity has a critical impact on the global properties of massive star populations, since it modifies the perceived mass function and upper mass limit [112] and enhances/hardens the UV flux and the amount of ionizing radiation produced [54].

Another fundamental aspect to quantify feedback from massive stars is the end of their evolution. There is a plethora of very energetic events associated to the death of massive stars: core-collapse supernovae -SNe- (types Ib, Ic, II, IIL, IIn, IIP, IIb), pair instability SNe, super-luminous SNe (SLSNe), electron-capture SNe, hypernovae, kilonovae and GRBs. Evolutionary models can predict the ending mechanism and leftover products of single and binary systems [105, 148, 149], but observations have proven decisive to constrain and inform theory. For instance, pre-explosion images provided the first evidence that RSGs and LBVs can explode as SNe, and pre-explosion spectra of the progenitor of SN1987A showed that it was a blue supergiant, contrary to the canonical model at that time (see e.g. 58, 119). Likewise, the preference of long-GRBs and SLSNe for metal-poor galaxies [22, 86] is a clue to the specific evolution of metal-poor massive stars. Armed with a theoretically sound, observationally constrained map of progenitors [44, 119, 134] where the variation with metallicity is understood, the most energetic long-GRBs and SLSNe can be used to probe the high redshift Universe, constrain star-formation rates [102], or even detect the signatures of the first stars [16].

The LIGO and Virgo experiments have revolutionized our view of massive star death, with already 10 in-spiraling double black hole systems detected during the first two observing runs [3] and an even more prolific O3 run. Numbers will soon enable statistics on the distribution of black hole (BH) and neutron star (NS) masses, which will put the predicted scenarios for the fate of massive stars to the test. Nonetheless, LIGO and Virgo have already accomplished paradigm-shifting results. The detection of GW170817, associated with a merging double neutron star and kilonova [2], linked short-GRBs with massive stars [132]. The very first gravitational wave system detected, GW150914, challenged all we knew about the formation of black-hole systems and put evolution of massive stars in binary systems in the spotlight. With 36 M⊙ and 29 M⊙ masses, the two BHs that merged were significantly larger than any stellar-mass BHs known back then (\(\sim \)5-15 M⊙, 19), and those that could be formed from stellar evolution at solar metallicity (\(\sim \)20 M⊙, 125). This system has inspired the development of new scenarios, such as the CHE evolution of two metal-poor massive stars within their Roche Lobes avoiding mass exchange and the common-envelope phase [91].

The expected weak winds of metal-poor massive stars (Section 1.3) provide a natural means to reach the final stages of evolution with larger stellar masses, thus increasing the size of the ensuing BH [7]. Alternatively, implementing the quenching of mass loss produced by a surface dipolar magnetic field can also allow the star to maintain a higher mass during its evolution and eventually form heavier stellar-mass black holes (> 25 M⊙, 103). The same mechanism would enable solar-like metallicity massive stars to form pair-instability supernovae [51]. Magnetic fields play a yet to be fully characterized role in the lives of massive stars, especially in metal-poor galaxies, although the fraction of magnetic OB-stars is relatively small [141].

Constraining massive star evolution is a multi-dimensional problem. The high incidence of massive stars in multiple systems, and the fraction that will interact with their companions [108], enriches the problem exponentially. The way to proceed is to assemble large, multi-epoch spectroscopic datasets of large samples of massive stars to fully cover the parameter space, constrain their physical properties with the most advanced stellar atmosphere models, obtain distributions of \(v \sin i\) and of the properties of binary systems, and contrast against the predictions of single and binary evolutionary models. Only a vast spectroscopic programme can lead to the reconstruction of the single- and binary-evolutionary pathways of massive stars. Such ensembles have been built over the years in the Milky Way [118], and most recently in the Magellanic Clouds [21, 38, 106]. However, only a handful of massive stars have been confirmed by spectroscopy in galaxies with poorer metal content than the Small Magellanic Cloud (see 47 and Section 2). At this stage no signature of CHE has been reported in these galaxies, and very few massive binaries are known.

1.3 The winds of extremely metal-poor massive stars

Stellar winds are the mechanism by which the evolution of massive stars is strongly conditioned by metallicity. Massive stars experience very high effective temperatures (Teff > 20000 K) during a large fraction of their evolution (Section 1.2). In these stages the extreme UV radiation field exchanges energy and momentum with metal ions in the stellar atmosphere, resulting in an outflow known as radiation-driven wind (RDW, 20, 85).

RDWs are particularly significant in OB-type stars and WRs. The ensuing removal of mass, with rates of the order of \(\dot{M} \sim 10^{-8} - 10^{-4}\, M_{\odot}\,\text{yr}^{-1}\) [80], may be high enough to peel off the outer stellar layers (being responsible for the different flavors of WR stars as successive nuclear-burning products are exposed), but also to alter the conditions at the stellar core and the rate of nuclear reactions. It is because of RDWs, which inherit a strong dependence on metal content from their driving mechanism, that two massive stars born with the same initial mass but different metallicity can follow distinct evolutionary pathways [23] and result in different end-products (Section 1.2). RSGs, the cool evolutionary stages of massive stars, also experience outflows but the driving mechanism is different and mass loss rates are apparently independent of metallicity [57].

RDWs are weaker as metallicity decreases, with models predicting \(\dot{M} \propto Z^{\sim 0.85}\) for OB-stars [138] and nitrogen-rich WRs [139]. The theoretical relation has been verified observationally down to the metallicity of the SMC [96]. The winds of more metal-poor hot stars require a special formalism [79] that should consider the shift of driving ions from Fe to CNO at Z ≤ 1/10 Z⊙ [77], implying that the wind strength may vary as processed material is brought to the surface by internal mixing. The expectation is that at Z < 1/100 Z⊙ winds are very weak unless the star is very luminous, and consequently would have very little impact on the evolution of the star.
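To give a feel for the size of this effect, the short sketch below evaluates the quoted power law relative to the solar-metallicity value. It is only an illustration: the exponent 0.85 is taken from the text, the function name `relative_mass_loss` and the sample metallicities are hypothetical choices, and applying a single power law below SMC metallicity is an extrapolation that the text itself cautions against.

```python
# Illustrative sketch: relative mass-loss rate for a pure power law
# Mdot ∝ Z^0.85 (exponent quoted in the text). Using the same exponent
# below SMC metallicity is an assumption, not a verified prescription.

def relative_mass_loss(z_over_zsun: float, exponent: float = 0.85) -> float:
    """Return Mdot(Z) / Mdot(Z_sun) for a power law in metallicity."""
    if z_over_zsun <= 0:
        raise ValueError("metallicity must be positive")
    return z_over_zsun ** exponent


if __name__ == "__main__":
    # Sample metallicities (fractions of solar) chosen for illustration.
    for z in (1 / 5, 1 / 10, 1 / 30):
        ratio = relative_mass_loss(z)
        print(f"Z = {z:.3f} Z_sun -> Mdot/Mdot(Z_sun) = {ratio:.3f}")
```

Under these assumptions a star at 1/10 of the solar metallicity would lose mass at roughly one seventh of the solar-metallicity rate, all other parameters being equal.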

Theory was finally confronted with observations with the arrival of multi-object spectrographs at 8-10m telescopes. The first efforts focused on IC 1613 (715 kpc), the closest star-forming Local Group galaxy whose \(\sim \)1/7 O⊙ nebular abundance marked a significant decrease in present-day metallicity from the SMC. They soon were followed by studies in the \(\sim \)1 Mpc distant galaxies NGC3109 and WLM, where similar nebular abundances had been measured (see 46 for references). The results were unanticipated: the finding of an LBV with strong optical P Cygni profiles [62], an extreme oxygen WR (WO, 128), and the optical analysis of O stars [63, 129] indicated that winds were stronger than predicted by theory at that metallicity.

The presence of Wolf-Rayet stars – the descendants of the most massive O stars and likely progenitors of Type Ib/c SNe and GRBs – in low-metallicity galaxies is a strong indication that more mass is lost during the evolution of massive stars than is presently accounted for. Current single-star evolutionary models cannot explain the existence of WR stars in the SMC, let alone the fully-stripped WO star in IC 1613. While recent empirical results indicate only a mild dependence of the winds of extreme WC and WO stars on metallicity (\(\dot{M} \propto Z^{0.25}\), 131, much weaker than nitrogen-rich WRs) the question remains how these stars have shed their entire hydrogen envelope in previous evolutionary stages. An interesting possibility, that brings up again the important role of multiplicity in the life of massive stars, is mass exchange in binary interactions [55, 56].

UV spectroscopy by the Hubble Space Telescope (HST) played a crucial role deciphering the strong wind problem (Fig. 3). The detailed analysis of UV spectral lines, more sensitive to the wind than the optical range, yielded lower mass loss rates for O stars [13]. UV spectroscopy also showed that IC 1613’s content of iron was similar to or even larger than the \(\sim 1/5\,\text{Fe}_{\odot}\) content of the SMC (46 and Fig. 4), superseding the 1/7 Fe⊙ value scaled from oxygen. Similar SMC-like Fe-abundances were also reported for WLM and NGC3109 [13, 65]. While this finding sets a reminder that metallicity cannot be scaled from oxygen abundances since the [α/Fe] ratio reflects the chemical evolution of the host galaxy, it also alleviates the discrepancy since the expected mass loss rate is larger at the updated iron content [138].

Fig. 3

The momentum carried by the wind depends on stellar luminosity and metallicity (solid lines). The optical studies of 1/7 O⊙ stars suggested that their winds were as strong as LMC analogs (squares). Terminal velocities from the UV revised these values downwards (stars). A full UV analysis resulted in wind momenta well under the theoretical prediction (triangles). We are far from a reliable prescription of the mass lost to RDWs by extremely metal-poor stars. From [47], reprinted with permission

Fig. 4

The UV spectral morphology reflects variations of stellar metallicity. HST-COS/HST-STIS UV spectra of stars with similar spectral type (hence Teff, Lbol) in different Local Group galaxies. The pseudo-continuum at 1350–1500Å, dominated by FeV lines (green ticks), indicates a sequence of decreasing Fe content from top to bottom. The wind profiles of NV and CIV decrease correspondingly. From [47], reprinted with permission

New efforts are being directed to the Sextans A galaxy that has nebular abundances as low as 1/10-1/15 O⊙ [71] and similarly low stellar 1/10 Fe⊙ abundances (47, 68, see also Fig. 4). The first spectroscopic surveys have reported 16 OB stars [18], but being located 1.3 Mpc away only a handful can be observed in the UV [47, 100]. Two other extremely metal-poor star-forming galaxies with resolved stellar populations have been surveyed for O stars, both with positive results: SagDIG (1/20 Z⊙, 1.1 Mpc, 48) and Leo P (1/30 Z⊙, 1.6 Mpc, 39). However, the combination of distance and foreground extinction severely hampers optical observations of O stars in these galaxies and UV spectroscopy is basically unfeasible. The overall sample size is insufficient and the sub-SMC metallicity regime of RDWs remains largely unexplored.

The uncertain metallicity dependence of RDWs adds to a hotly debated question: what is the total mass lost throughout the stellar lifetime, and what is the main driving mechanism? Besides RDWs, pulsation- and rotation-driven outflows, evolution and/or mass exchange in binary systems, and eruptions such as those experienced by Eta Car [122] may lead to considerable amounts of mass loss. In fact, the concept that super-Eddington stars such as Eta Car may experience continuum-driven winds, provides an interesting metallicity-independent mass loss mechanism [135]. These processes are very poorly understood compared to RDW even at solar metallicity, let alone among sub-SMC stars. At the moment we simply lack any evidence to assess what is the dominant mass loss mechanism ruling the life of extremely metal-poor massive stars.

2 A metallicity ladder to look back in time

Massive stars are ubiquitous throughout cosmic history ever since the first, roughly metal-free, very massive stars. Their ionizing and kinetic energy production is critical to many astrophysical processes that can be counted back to the onset of the re-ionization epoch. Each generation inherits the chemical composition of the cloud where it forms, implying the existence of extremely metal-poor massive stars in the infant Universe, but also in pristine galaxies where star formation was only activated recently, or that lost their metals to galactic outflows. Understanding the physical properties of massive stars with extremely low metal content is therefore crucial to realistically compute feedback in a significant number of environments spread through the history of the Universe.

To summarize Section 1, the important questions that need answering are

  • Are the physics and evolution of extremely metal-poor massive stars substantially different from Solar metallicity analogs? If so, what is the impact in terms of ionizing flux, yields, and feedback?

  • Can current models be extrapolated to infer the physical properties of the first stars (initial M, Teff, Lbol, and \(\dot {M}\))? Is it possible to detect their end-of-life events?

  • Does the distribution of stellar initial masses depend on metallicity? Can extremely massive stars be expected in the infant Universe?

  • What kind of death-events can be expected from extremely metal-poor massive stars, and can any of them be detected up to very high redshifts?

  • What are the evolutionary channels that lead to binary stellar-mass black holes and gravitational wave sources?

The answer to these questions relies on exceptional-quality optical and UV spectra of a representative sample of massive stars with sub-SMC metallicity. Armed with the tools for quantitative spectroscopy that teams around the world have been perfecting for decades now, accurate stellar properties (Teff, M, Lbol, and wind properties) can be derived. These will allow us to draw the evolutionary pathways of massive stars, study the IMF in metal-poor environments and provide more realistic forecasts of feedback. Fortunately for this quest, metallicity increases monotonically with time but not isotropically, and some systems exist that are more metal poor than the average present-day chemical composition of the Universe. Unfortunately, the SMC is currently both a metallicity and distance frontier, and a sizable leap down in metallicity requires reaching distances of at least 1 Mpc (the outer Local Group and surroundings).

Very promising galaxies with 1/10 Z⊙ (Sextans A, 1.3 Mpc away), 1/20 Z⊙ (SagDIG, 1.1 Mpc) and 1/30 Z⊙ (Leo P, 1.6 Mpc) are subject to close scrutiny with VLT, and the 10m Keck and GTC telescopes [14, 18, 37, 39, 48, 49]. They form a sequence of decreasing metal content that will enable understanding and parameterizing the properties of low-metallicity massive stars (Fig. 5). A crucial – yet ambitious – landmark is the 1/32 Z⊙ blue compact dwarf I Zw18 (18.9 Mpc, 5, 137). In this galaxy, very massive (300 M⊙) or alternatively metal-free 150 M⊙ stars have been suggested as possible ionizing sources producing the extraordinarily strong observed HeII 4686 nebular line [69]. I Zw18 thus represents the best chance to reach primordial-like massive stars, and will enable studies of massive star populations in very metal-poor extreme starbursts (very enlightening when compared with all our compiled knowledge on 30 Doradus, 38).

Fig. 5 Roadmap to the early Universe. Selected Local Group and nearby star-forming dwarf galaxies provide a ladder of decreasing metallicity that will allow us to study the physics of extremely metal-poor massive stars, and ultimately to extrapolate the properties of the first, metal-free stars. Adapted from [50] with permission.

However, the world’s largest ground-based telescopes only reach the brightest, un-reddened massive stars in \(\sim \)1 Mpc galaxies after long integration times, and even for these the spectral quality is sometimes too poor to yield stellar parameters from quantitative analysis. Spatial resolution is also an issue: breaking down the population of I Zw18 at optical wavelengths is beyond the capabilities of even the future European Extremely Large Telescope (ELT). Observations of stellar winds are even more handicapped, since the intrinsically strong UV emission is dulled by extinction and a severe sensitivity limit is set by the relatively small mirror of HST, the only observatory offering UV spectroscopy. The result is a biased and sorely incomplete view of sub-SMC massive stars. The reality is that we have hit the limit of current observational facilities.

3 Technical proposal

The questions raised by Sections 1 and 2 can be distilled into the following specific points:

  • Is the IMF universal? What is the upper mass limit? Does it increase with decreasing metallicity?

  • What kind of outflows do extremely metal-poor massive stars experience?

  • How do their physical parameters (Teff, Lbol, and \(\dot {M}\)) vary along evolution? What is the frequency of CHE?

  • What is the frequency and period-distribution of binary stars in extremely metal-poor environments? Do they have a significant impact on feedback? Can they populate the mass distribution of double BHs and NSs inferred from gravitational wave events?

The following subsections discuss the technical capabilities needed to tackle these points, followed by a brief discussion of the instrumentation and telescope diameter needed to meet them.

3.1 Technological needs for a breakthrough

Answering the questions stated in this White Paper requires observations of a large sample of massive stars in sub-SMC metallicity galaxies in all their flavors (OB-type, WR, LBV, YHG, and RSG), with sufficiently good quality to enable detailed and precise spectroscopic analyses. This section focuses on the OB stars, the most challenging to observe with current facilities. The technical specifications set by OB stars will also enable observations of WRs and LBVs. The VLT and the upcoming ELT offer good prospects for YHGs and RSGs [15, 29, 101].

Homogeneous studies of high-quality optical and UV datasets, such as the IACOB [118] and VFTS [38] Europe-led projects, have provided invaluable insight into blue massive stars of the Milky Way and the Magellanic Clouds (see also 21, 106). Notably, the ULLYSES programme is devoting more than 200 HST orbits to ensure proper UV spectroscopic coverage of the SMC [100]. These and other on-going efforts are consolidating our knowledge of massive stars at the present day, and lay the groundwork for the kind of in-depth studies needed to provide quantitative results on extremely metal-poor massive stars.

The engine for analysis is ready, yet the observations are unfeasible with present-day instrumentation. The key technical enabling requirements are:

  • Spatial resolution of the order of 0.01\(^{\prime \prime }\) at UV and blue-optical wavelengths. This value can resolve stellar populations out to the distance of I Zw18 (18.9 Mpc), disentangle 30 Doradus-like clusters throughout the Local Group (≤ 1.5 Mpc), and break up the 30 Doradus inner core, R136a, out to 750 kpc (Fig. 6 top left). Coupled with follow-up spectroscopy, this will provide unprecedented constraints on the IMF of dense clusters and starbursts.

    Fig. 6 The potential of current and future instrumental facilities to study OB-type stars at landmark galaxies: LMC (\(\sim \)0.05 Mpc), SMC (\(\sim \)0.06 Mpc), IC 1613 (\(\sim \)0.75 Mpc), Sextans A/Leo P/SagDIG (\(\lesssim \)1.5 Mpc), the Sculptor filament and Centaurus group (\(\lesssim \)4 Mpc), and I Zw18 (18.9 Mpc). Top left (a): Power to resolve tight stellar populations. In the first column the squares mark the diameter of R136a, the compact cluster at the core of 30 Doradus that hosts \(\sim \)150 \(M_{\odot }\) stars (4\(^{\prime \prime }\)). The rhombus and the triangle mark typical 30 Doradus and inner R136a inter-star distances (0.5\(^{\prime \prime }\) and 0.1\(^{\prime \prime }\) respectively). The figure then illustrates the angular separation of similar structures at farther galaxies. The horizontal lines mark the diffraction limit of space facilities at 4000 Å. The expected performance of ELT at blue-optical wavelengths is also included as reference. Top right (b): Flux limits for UV spectroscopy with R\(\sim \)2000. Expected UV fluxes of O stars (rhombuses) and B supergiants (triangles) if stars were hosted by different galaxies. These numbers have been scaled from IC 1613 observations where O stars and B supergiants registered very similar fluxes (\(3 \times 10^{-15}\) and \(1 \times 10^{-14}\) erg cm\(^{-2}\) s\(^{-1}\) Å\(^{-1}\) at 1500 Å respectively, [46]) reflecting the trade-off between spectral sub-types and extinction at the time of target selection. The horizontal line marks the limiting flux that can be observed with HST-COS-G140L (R\(\sim \)2000) in 6 orbits with sufficient SNR to enable analysis (SNR ≥ 20, \(3 \times 10^{-16}\) erg cm\(^{-2}\) s\(^{-1}\) Å\(^{-1}\)). Flux limits for both LUVOIR architectures were estimated scaling this value by mirror size, assuming no throughput improvement. Bottom: V-magnitude limits for optical spectroscopy with R\(\sim \)8000 (c, left) and R\(\sim \)1000 (d, right). The rhombuses and the triangles provide the V-magnitudes enclosing O stars hosted by different galaxies (\(M_{V} \in \left [-3.5,-6.5 \right ]\), [143]). The horizontal lines represent the magnitude reached by different facilities in 12 hours of observing time, scaled by the mirror size only, and assuming no throughput improvement. The references are the VLT-FLAMES observations of O stars in 30 Doradus [38] for the R\(\sim \)8000 panel, and GTC observations of Sextans A [49] for the R\(\sim \)1000 panel. An extra magnitude has been added to space facilities to simulate the lack of atmospheric absorption and extremely low sky-brightness.

  • Large collecting power to increase sensitivity in the whole UV-optical-IR range, enabling spectroscopy of distant massive stars and close but extincted objects. Optimal limiting values for different set-ups (see below) are: V\(\sim \)21 for optical R\(\sim \)8000 spectroscopy of O stars in the Local Group, V\(\sim \)25 for optical R\(\sim \)1000 spectroscopy in I Zw18, and \(F_{1500\,\text{\AA }} = 1 \times 10^{-17}\) erg cm\(^{-2}\) s\(^{-1}\) Å\(^{-1}\) for UV spectroscopy of O stars in this galaxy (Fig. 6). Accessing moderately reddened OB stars enables studying the processes of star formation in metal-poor galaxies and its relation with gas density. Strong synergies with a far infrared, SPICA-like mission [107] are foreseen.

  • Medium resolution (R = λ/Δλ \(\sim \)8000) multi-object optical and near-IR spectroscopy to constrain stellar parameters of massive stars and define evolutionary channels. This configuration would mimic the VLT-FLAMES optical survey of 30 Doradus, which produced the most accurate characterization of LMC massive stars [38]. The optical range (\(\sim \)4000-5500 Å and the Hα region) contains the best characterized diagnostic lines constraining Teff, \(\log \) g, and element abundances, while near-IR spectroscopy is reserved for the most reddened population since OB stars are intrinsically faint in this range. The James Webb Space Telescope (JWST) will produce first exploratory studies in the IR, but both its collecting area and spectral resolution are insufficient (ϕ \(\sim \)6.5 m; R\(_{\text{NIRSpec}}\) = 2700 whereas at least R\(\sim \)4000 is needed, 66). An efficient optical/near-IR multi-object spectrograph would facilitate multi-epoch observations, which are critical to characterize spectroscopic binaries.

  • Ultraviolet spectroscopy with multi-object modes, so that the long exposure times needed to reach distant galaxies or reddened OB stars are offset by multiplexing. The resolving power must be R ≥ 2000 to confirm the presence of winds and to separate the interstellar components from wind troughs when the profiles are weak. Observations with higher spectral resolution will enable additional constraints on mass loss rates and the velocity field.

UV observations alone would require a space observatory, but the sensitivity and spatial resolution requirements additionally call for a 10m-class telescope in space. This is illustrated in Fig. 6, which compares these metrics for current and future facilities: HST (ϕ = 2.4 m diameter), the ground-based telescopes VLT (ϕ \(\sim \)8 m) and GTC (ϕ \(\sim \)10.2 m), the European ELT (ϕ \(\sim \)40 m), and two designs for a future mission that will be described in Section 3.2: LUVOIR-A (ϕ \(\sim \)15 m) and LUVOIR-B (ϕ \(\sim \)8 m).
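As a rough cross-check of the spatial-resolution requirement, the sketch below computes the diffraction limit at 4000 Å for several mirror diameters and rescales the 30 Doradus angular separations quoted in the caption of Fig. 6 to larger distances. The Rayleigh criterion (1.22 λ/D) and all function names are assumptions made for illustration; the text does not state which criterion was used for the figure.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi  # about 206265 arcsec per radian


def diffraction_limit_arcsec(diameter_m, wavelength_angstrom=4000.0):
    """Rayleigh criterion 1.22 * lambda / D in arcseconds (assumed criterion)."""
    wavelength_m = wavelength_angstrom * 1.0e-10
    return 1.22 * wavelength_m / diameter_m * ARCSEC_PER_RAD


def scaled_separation_arcsec(sep_at_ref_arcsec, ref_distance_mpc, distance_mpc):
    """Small-angle scaling: angular separation is inversely proportional to distance."""
    return sep_at_ref_arcsec * ref_distance_mpc / distance_mpc


if __name__ == "__main__":
    # Mirror diameters as quoted in the text (metres).
    for name, diameter in (("HST", 2.4), ("LUVOIR-B", 8.0), ("LUVOIR-A", 15.0)):
        print(f"{name}: {diffraction_limit_arcsec(diameter):.4f} arcsec at 4000 A")

    # Inner R136a inter-star distance: 0.1 arcsec at the LMC (~0.05 Mpc).
    for distance in (0.75, 1.5, 4.0, 18.9):
        sep = scaled_separation_arcsec(0.1, 0.05, distance)
        print(f"0.1 arcsec at the LMC -> {sep:.5f} arcsec at {distance} Mpc")
```

Under these assumptions the limit is roughly 0.042 arcsec for a 2.4 m mirror, 0.013 arcsec for 8 m and 0.0067 arcsec for 15 m, while the 0.1 arcsec inner R136a separation shrinks to about 0.0067 arcsec at 0.75 Mpc, which is in line with the 750 kpc reach quoted in the requirements.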

While ELT’s impressive collecting power will be crucial to follow up specific targets in the IR, the telescope is not suitable for large-scale studies of extremely metal-poor massive stars in the visible. Only HARMONI among first-light instrumentation provides partial coverage in the optical range, missing important diagnostic lines in the uncovered 4000-4700 Å interval. Even if optical coverage is considered for second-generation instruments, adaptive optics will struggle to provide diffraction-limited observations in the optical-blue over a \(\geq 1^{\prime } \times 1^{\prime }\) field of view. Only a large-mirror telescope in space unites both requirements of sensitivity and outstanding spatial resolution in the optical range.

3.2 The LUVOIR observatory

A 10m-class telescope in space operating in the UV-optical-NIR ranges qualifies as an L-size mission, although there is a possibility that would greatly reduce the costs. One of the mission concepts NASA is considering for its next flagship mission meets the size and sensitivity requirements laid out in Section 3.1. We propose that ESA joins as a partner.

The Large UV/Optical/IR Surveyor [87, 88] is a proposed multi-wavelength, large mirror telescope operating at L2 that truly captures the heritage of HST as a broad scope observatory. The study team is considering two architectures with different mirror size, LUVOIR-A (ϕ \(\sim \)15 m) and LUVOIR-B (ϕ \(\sim \)8 m). Both concepts are equipped with the LUVOIR Ultraviolet Multi Object Spectrograph (LUMOS), designed to provide high-throughput multi-object spectroscopy at UV wavelengths. Multiple resolution modes will be available, with resolving power in the range R = 500–65000. Multiplexing will be achieved by a grid of 6 micro-shutter arrays, with 480 × 840 shutters each, following the design used for JWST-NIRSpec. The multi-object capabilities of LUMOS, coupled with on-going improvements in UV detectors, will revolutionize the field by enabling the first extensive characterization of the outflows of massive stars beyond the SMC.
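Resolving power maps onto velocity resolution through Δv = c/R, which is a convenient way to compare the R ≥ 2000 requirement of Section 3.1 with the LUMOS range quoted above. The following is a minimal sketch of that conversion; the function names are hypothetical and the result is the width of one resolution element only, with no modelling of line profiles.

```python
C_KM_S = 299_792.458  # speed of light in km/s


def velocity_resolution_km_s(resolving_power):
    """Velocity width of one resolution element, delta_v = c / R, in km/s."""
    if resolving_power <= 0:
        raise ValueError("resolving power must be positive")
    return C_KM_S / resolving_power


def wavelength_resolution_angstrom(resolving_power, wavelength_angstrom):
    """Wavelength width of one resolution element, delta_lambda = lambda / R."""
    if resolving_power <= 0:
        raise ValueError("resolving power must be positive")
    return wavelength_angstrom / resolving_power


if __name__ == "__main__":
    # Resolving powers quoted in the text for the UV and optical modes.
    for r in (500, 2000, 5000, 8000, 65000):
        dv = velocity_resolution_km_s(r)
        dl = wavelength_resolution_angstrom(r, 1500.0)
        print(f"R = {r:>6}: delta_v = {dv:7.1f} km/s, delta_lambda(1500 A) = {dl:.3f} A")
```

For R = 2000 this gives a resolution element of about 150 km/s (0.75 Å at 1500 Å), and about 4.6 km/s at the upper end of the LUMOS range.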

The selection of the most ambitious design will enable UV spectroscopy of individual stars in I Zw18 (Fig. 6 top right). The current specifications allow LUMOS-A to obtain good quality spectra of I Zw18 O stars in about 11.5 hours (SNR = 20 at 1500 Å, R\(\sim \)5000, 42). Both LUMOS-A and LUMOS-B will comfortably reach out to distances of a few Mpc, opening great discovery opportunities in the Sculptor, Centaurus, and M81 Groups. LUMOS ensures a proper characterization of RDWs and mass loss rates in extremely metal-poor environments.
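The scaling behind the UV panel of Fig. 6 can be sketched in a few lines: the HST-COS limiting flux is scaled by collecting area, and the IC 1613 reference fluxes are scaled by the inverse square of the distance. This is only an illustration of that scaling, assuming equal throughput, equal exposure time and no change in extinction; the function names are hypothetical, and the sketch is not meant to reproduce the 11.5 hour estimate, which rests on the actual LUMOS specifications.

```python
def limiting_flux_scaled(ref_limit, ref_diameter_m, diameter_m):
    """Scale a limiting flux by mirror area (assumes equal throughput and exposure)."""
    return ref_limit * (ref_diameter_m / diameter_m) ** 2


def flux_at_distance(ref_flux, ref_distance_mpc, distance_mpc):
    """Inverse-square scaling of an observed flux (ignores extinction differences)."""
    return ref_flux * (ref_distance_mpc / distance_mpc) ** 2


if __name__ == "__main__":
    hst_limit = 3e-16  # erg cm^-2 s^-1 A^-1, HST-COS G140L limit quoted in the caption
    for name, diameter in (("LUVOIR-B", 8.0), ("LUVOIR-A", 15.0)):
        limit = limiting_flux_scaled(hst_limit, 2.4, diameter)
        print(f"{name} limit: {limit:.2e} erg cm^-2 s^-1 A^-1")

    # IC 1613 (0.75 Mpc) reference fluxes at 1500 A, moved to I Zw18 (18.9 Mpc).
    for label, flux in (("O star", 3e-15), ("B supergiant", 1e-14)):
        scaled = flux_at_distance(flux, 0.75, 18.9)
        print(f"{label} at 18.9 Mpc: {scaled:.2e} erg cm^-2 s^-1 A^-1")
```

With these inputs the area-scaled limits are about 2.7 × 10^-17 (8 m) and 7.7 × 10^-18 (15 m) erg cm^-2 s^-1 Å^-1, and the reference fluxes moved to 18.9 Mpc are about 4.7 × 10^-18 and 1.6 × 10^-17 in the same units.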

LUVOIR-A will resolve individual stars in the sparse regions of I Zw18 in the optical and the UV (Fig. 6 top left). Both A and B architectures will be able to dissect 30 Doradus-like clusters – except for the densest cluster core – out to 4 Mpc.

The true power of LUVOIR, however, resides in the combination of sensitivity and outstanding spatial resolution over the extent of the field of view, regardless of the wavelength range. ELT cannot compete with the expected performance of LUVOIR in this respect. Coupled with follow-up spectroscopy in the optical range, these phenomenal capabilities will enable the definitive characterization of extremely metal-poor massive stars together with unprecedented insight into the IMF of the host galaxies.

In principle, LUVOIR-A will have the required sensitivity to obtain R\(\sim \)8000 optical spectra of V\(\sim \)21 O type stars at 1.5 Mpc in about 12 hours (Fig. 6 bottom left). The analysis of such a dataset can provide accurate Teff, \(\log \) g, and abundances. Farther galaxies require a lower resolution R\(\sim \)1000 mode, enough for first estimates of stellar parameters. LUVOIR-A could then comfortably reach V\(\sim \)22.5 O stars at 4 Mpc in 12 hours (Fig. 6 bottom right). Reaching I Zw18 would mean a leap of 2.5 mags that translates into a factor of 10 longer exposure times. Such observations are feasible, but strongly advocate for multi-object capabilities.
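The magnitudes and the factor of 10 quoted above follow from the distance modulus and the magnitude-flux relation. The sketch below works through both, assuming no extinction and a source-photon-limited regime in which exposure time scales inversely with flux; these assumptions and the function names are introduced here for illustration and are not taken from the text.

```python
import math


def apparent_magnitude(abs_mag, distance_mpc):
    """m = M + 5 log10(d / 10 pc), with no extinction term."""
    distance_pc = distance_mpc * 1.0e6
    return abs_mag + 5.0 * math.log10(distance_pc / 10.0)


def flux_ratio(delta_mag):
    """Flux ratio corresponding to a magnitude difference delta_mag."""
    return 10.0 ** (0.4 * delta_mag)


def exposure_time_factor(delta_mag):
    """Exposure-time increase at fixed SNR, assuming source-photon-limited noise (t ~ 1/F)."""
    return flux_ratio(delta_mag)


if __name__ == "__main__":
    # O-star absolute magnitude range quoted in the caption of Fig. 6.
    for distance in (1.5, 4.0, 18.9):
        bright = apparent_magnitude(-6.5, distance)
        faint = apparent_magnitude(-3.5, distance)
        print(f"{distance:5.1f} Mpc: V from {bright:.1f} to {faint:.1f}")

    print(f"2.5 mag fainter -> {exposure_time_factor(2.5):.1f}x longer exposure")
```

With these assumptions an M_V = -6.5 star at 18.9 Mpc has V of about 24.9, and a 2.5 mag step corresponds to a flux ratio of 10, hence ten times longer exposures at fixed SNR in the source-limited case.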

We note that the instruments currently studied for LUVOIR do not include a multi-object optical spectrograph working at intermediate and high spectral resolution. This will be fundamental to many scientific cases beyond those outlined in this White Paper, providing spectroscopic follow-up of a broad range of sources observed by the exceptional imaging from LUVOIR. The French-led POLLUX study includes coverage of visible wavelengths but at higher resolutions, R > 100000, with the option of spectropolarimetry (see 88). We propose that ESA fills this niche by building an optical spectrograph, thus becoming a full partner of the LUVOIR observatory. The basic requirements for such an instrument are summarized in Table 1.

Table 1 Level-zero technical specifications for an optical spectrograph onboard LUVOIR

3.2.1 Technology challenges

Packaging and deployment

LUVOIR will build on lessons learnt by the JWST on this technological aspect, although its larger mirror size will be an additional challenge in terms of rocket size and mass. Active optics and mirror alignment after deployment are also critical technological elements.

Devices for multi-object spectroscopy in space

LUMOS will use a grid of micro-shutter arrays, heritage of JWST-NIRSpec. In-flight performance will demonstrate the maturity of this technology, but it may be worthwhile to test alternatives offering a better trade-off between number of targets and spectral coverage.

Improved UV coatings and detectors

NASA is investigating new, enhanced LiF coatings and improved, large-field microchannel plate detectors. Both elements still lack flight qualification [42].

4 Conclusions

Our partial understanding of extremely metal-poor massive stars jeopardizes the interpretation of SNe and long-GRBs, star-forming galaxies throughout cosmic history, and the re-ionization epoch. Teams around the world are working to provide a quantitative characterization of these objects and realistic feedback prescriptions that can be ingested by other disciplines in astrophysics. Current efforts, focusing on nearby galaxies of the Local Group and vicinity, are pushing current facilities beyond their limits.

In order to make sizable progress in this field, the wavelength coverage, sensitivity and spatial resolution of a 10m-class telescope in space are needed. The LUVOIR observatory, one of four Decadal Survey Mission Concept Studies initiated in January 2016, can potentially fulfill our technical requirements. We propose that ESA joins NASA in the construction of LUVOIR, building on the past and current synergies that continue making HST an extraordinarily successful telescope. Moreover, ESA can play a fundamental role in this quest by providing an optical spectrograph that would be an essential addition to LUVOIR’s suite of instruments.