Neutron star mergers and how to study them

Abstract

Neutron star mergers are the canonical multimessenger events: they have been observed through photons for half a century, gravitational waves since 2017, and are likely to be sources of neutrinos and cosmic rays. Studies of these events enable unique insights into astrophysics, particles in the ultrarelativistic regime, the heavy element enrichment history through cosmic time, cosmology, dense matter, and fundamental physics. Uncovering this science requires vast observational resources, unparalleled coordination, and advancements in theory and simulation, which are constrained by our current understanding of nuclear, atomic, and astroparticle physics. This review begins with a summary of our current knowledge of these events, the expected observational signatures, and estimated detection rates for the next decade. I then present the key observations necessary to advance our understanding of these sources, followed by the broad science this enables. I close with a discussion on the necessary future capabilities to fully utilize these enigmatic sources to understand our universe.

Introduction

Two Neutron Stars (NSs) in the galaxy NGC 4993 merged, emitting two messengers that traveled together from the age of dinosaurs through the age of civilization. As the messengers neared Sirius the Fermi Space Telescope was launched; after they passed Alpha Centauri the Advanced Gravitational wave (GW) interferometers were turned on for the first time. On August 17th, 2017 the messengers arrived at Earth (Abbott et al. 2017b): the GWs observed as GW170817 (Abbott et al. 2017c) by the Advanced Laser Interferometer Gravitational-Wave Observatory (LIGO; Aasi et al. 2015) and Advanced Virgo (Acernese et al. 2015) and the gamma rays as GRB 170817A (Goldstein et al. 2017b; Savchenko et al. 2017) by Fermi (Meegan et al. 2009) and INTEGRAL (von Kienlin et al. 2003). This joint detection prompted the greatest follow-up observation campaign in the history of transient astrophysics (Abbott et al. 2017d), which yielded six independent detections of AT2017gfo (Coulter et al. 2017; Valenti et al. 2017; Tanvir et al. 2017; Lipunov et al. 2017; Soares-Santos et al. 2017; Arcavi et al. 2017), the theoretically predicted radioactively-powered kilonova, whose precise location enabled the identification of “off-axis” afterglow emission (Troja et al. 2017b; Margutti et al. 2017; Haggard et al. 2017) that remained detectable more than two years later.

These discoveries culminated in a suite of papers published only two months after the first detection, with contributions from thousands of astronomers and astrophysicists, ushering in the new era of GW multimessenger astrophysics. For decades, the scientific promise of these sources has been known, and the first event certainly met expectations with, on average, more than three papers written per day over the first two years.

There have been only three convincing multimessenger detections of individual astrophysical sources: neutrinos and photons from the core-collapse supernova SN 1987A (Hirata et al. 1987), gravitational waves and photons from a binary neutron star merger (this event; Abbott et al. 2017d), and likely neutrinos and photons from a flaring blazar (Aartsen et al. 2018). The modern era of time domain, multimessenger astrophysics will hopefully result in multiple detections of multiple source classes with multiple messengers. Binary Neutron Star (BNS) and Neutron Star–Black Hole (NSBH) mergers, collectively referred to here as NS mergers, will be important astrophysical multimessenger sources for the foreseeable future.

Several papers and reviews on the astrophysics of NS mergers have been written, both before and after GW170817. Several papers have been written on science beyond astrophysics enabled by observations of these events. When available, we reference manuscripts that contain more detailed discussions. This review collates and advances this information into a coherent summary, to ensure the information carried by messengers from NS mergers, already long into their journey to Earth, will be captured and utilized to understand our Universe. Our view of these mergers will depend on the ground- and space-based assets available to observe them and our strategies and scientific gains are placed in the context of our current outlook on these future capabilities.

In Sect. 2 we give a broad overview of our current understanding of NS mergers and how we observe them. This section contains rough detection rate predictions through the next decade. In Sect. 3 we discuss the astrophysical inferences on NS mergers that are important for several additional scientific studies and those that are not otherwise discussed. The later science sections are separated into the broad topics: Short gamma-ray bursts and ultrarelativistic jets (Sect. 4), Kilonovae and the origin of heavy elements (Sect. 5), Standard sirens and cosmology (Sect. 6), Dense matter (Sect. 7), and Fundamental physics (Sect. 8). The individual science sections are, as much as possible, self-contained. Based on the science sections, Sect. 9 makes recommendations for future capabilities. This discusses both current and funded missions, and identifies where gaps may occur.

Given the broad scope of this paper, particular attention is given to avoid or carefully define field-specific terminology and to use language that should prevent confusion for readers of various backgrounds. We use the astrophysical definition of “gamma-rays”, referring to all photons with energies \(\gtrsim 100\,\text{keV}\). We will directly state when we are discussing gamma-rays that originate from nuclear processes. We assume, unless otherwise stated, that our general understanding of science is correct, e.g., that BNS mergers and (some) NSBH mergers are the progenitors of most Short Gamma-Ray Bursts (SGRBs) and all kilonovae, or that the relative propagation of gravity and light is zero. We assume a standard \(\varLambda \)CDM cosmology, with \({H}_0=67.4\,\text{km}\,/\text{s}\,/\text{Mpc}\) and \(\varOmega _m=0.315\), from Planck Collaboration (2020). Canonical NSs are those with masses of \(1.4\,M_{\odot }\); canonical Black Holes (BHs) refer to those with masses of \(10\,M_{\odot }\). All rates are reported for a calendar year and refer to the prediction of the true rate (i.e. they do not account for Poisson variation). Variables and constants have their usual definition, e.g., c is the speed of light, G the gravitational constant, M represents masses, etc. The subscript \(\odot \) denotes solar units. When referring to stars in a binary, both massive and compact, the heavier star is always referred to as the primary and is denoted by a subscript 1 and the lighter star is referred to as the secondary with a subscript 2, to match convention. Heavy elements here refers to those beyond iron.

Neutron star mergers

NSs contain the densest known matter in the Universe; BHs are the only known denser objects. Binary star systems emit GWs, causing them to slowly inspiral as they lose energy. Tightly bound BNS and NSBH systems can lose energy fast enough to merge within the age of the Universe. The merging of the two objects can significantly disrupt the NS, releasing large amounts of matter and energy that can power the observed Electromagnetic (EM) and predicted neutrino signatures.

In Sect. 2.1 we provide a succinct overview of our current understanding of how these systems form, their behavior shortly before, during, and after merger, and potential longer-term signatures. We discuss the intrinsic event rates in Sect. 2.2, followed by subsections on the canonical signals, their individual detection rates, and what we learn from these observations. Interspersed are subsections on the necessary steps for combining information: Sect. 2.5 details the conditions required for robust statistical association, Sect. 2.6 joint detection rates for independent detections, and Sect. 2.7 methods for follow-up searches. Section 2.10 briefly discusses additional signatures that are expected and prospects of detection. We summarize our predicted future detection rates in Sect. 2.11.

Overview

Information on NS mergers can be gleaned from observations of these systems from eons before coalescence to long after merger. This section contains an overview of the lives of these systems; each subsection discusses a stage of their evolution and contains references for further detail. For an in-depth review of the expected EM signatures from NS mergers see the opening figure of Fernández and Metzger (2016), which we borrow as Fig. 1. We do not here give an overview on the history of our understanding of these events as we are unlikely to exceed existing literature; for a brief general history we refer the reader to the introduction of Abbott et al. (2017d).

Fig. 1. An overview of the expected GW and EM signatures from minutes before until years after merger, as discussed in Sects. 2.1.2, 2.1.3, 2.1.4 and 2.1.5. The bottom represents what occurs as a function of time with the corresponding observational signature on top. Image reproduced with permission from Fernández and Metzger (2016), copyright by Annual Reviews.

System formation

The formation and evolution of stellar systems is a broad topic in astrophysics. We are focused on the science enabled with NS mergers. The events of interest are then BNS and NSBH systems that will form and merge within the age of the Universe. For relevant reviews see Sadowski et al. (2008) and Faber and Rasio (2012). Before discussing how such systems can form, we show the time until merger as a function of orbital separation radius R for two compact objects inspiraling only through GW emission, which is

$$\begin{aligned} t_{\mathrm{merge}}(R)&= \frac{5}{256} \frac{c^5}{G^3} \frac{R^4}{(M_1 M_2)(M_1+M_2)}\nonumber \\&\approx 54\,\text{Myr} \Bigg (\frac{1}{q(1+q)}\Bigg ) \Bigg (\frac{R}{R_{\odot} }\Bigg )^4 \Bigg (\frac{1.4\,M_{\odot} }{M_1}\Bigg )^3 \end{aligned}$$
(1)

where \(M_1\) and \(M_2\) are the individual masses and \(q=M_2/M_1\) is the mass ratio. This equation, and others in this section, assume quasi-circular orbits as compact object systems circularize quickly compared to their total inspiral time (Faber and Rasio 2012).

A star with mass between \(\sim 8\) and \(50\,M_{\odot} \) will end its life in a Core-Collapse Supernova (CCSN) explosion. Stars on the lower end of this mass range will result in a NS and those on the high end will result in a BH (see da Silva Schneider et al. 2020, and references therein for details). Such heavy stars become supergiants near the end of their lives with sizes \(R \gtrsim 30\,R_{\odot }\). When two of these stars form already bound together, as a field binary, they can result in compact object binaries once both have undergone a supernova. For canonical BNS systems with initial separations larger than the size of the progenitor supergiant the GW-only inspiral time will be more than a thousand times the age of the Universe.

For canonical BNS systems to merge within one current age of the Universe, inspiraling only through GW radiation, they must have initial separation of \(\lesssim 5\,R_{\odot} \). This requires a common envelope stage, where either the two massive stars are not distinct or the primary forms a compact object before being enveloped by the secondary during its supergiant phase. This greatly accelerates the inspiral and results in tighter initial separation of the two compact objects.
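As a rough numerical check of Eq. (1) and of the separations quoted above, the following Python sketch evaluates the GW-only inspiral time and inverts it for the largest separation that merges within a given time. The constants are rounded CGS values, the 13.8 Gyr age of the Universe is an assumed round number, and the function names are illustrative and not taken from any published code.

```python
# Rounded physical constants in CGS units (assumptions of this sketch).
G = 6.674e-8            # gravitational constant [cm^3 g^-1 s^-2]
C = 2.998e10            # speed of light [cm/s]
M_SUN = 1.989e33        # solar mass [g]
R_SUN = 6.957e10        # solar radius [cm]
YEAR = 3.156e7          # seconds per year
T_UNIVERSE_YR = 13.8e9  # assumed age of the Universe [yr]


def merger_time_yr(sep_rsun, m1_msun=1.4, m2_msun=1.4):
    """GW-only inspiral time of Eq. (1) for a quasi-circular binary, in years."""
    r = sep_rsun * R_SUN
    m1 = m1_msun * M_SUN
    m2 = m2_msun * M_SUN
    t_sec = (5.0 / 256.0) * C**5 / G**3 * r**4 / (m1 * m2 * (m1 + m2))
    return t_sec / YEAR


def max_separation_rsun(t_yr, m1_msun=1.4, m2_msun=1.4):
    """Invert Eq. (1): largest separation (in R_sun) that merges within t_yr."""
    # t scales as R^4, so R = (t / t(1 R_sun))^(1/4).
    return (t_yr / merger_time_yr(1.0, m1_msun, m2_msun)) ** 0.25


if __name__ == "__main__":
    print("t_merge at 1 R_sun [Myr]:", merger_time_yr(1.0) / 1e6)
    print("t_merge at 30 R_sun / age of Universe:",
          merger_time_yr(30.0) / T_UNIVERSE_YR)
    print("max separation for a merger within the age of the Universe [R_sun]:",
          max_separation_rsun(T_UNIVERSE_YR))
```

For a canonical BNS system this gives an inspiral time of roughly 27 Myr at \(1\,R_{\odot }\), consistent with the \(54\,\text{Myr}\) prefactor multiplied by \(1/[q(1+q)]=1/2\), and a limiting separation just under \(5\,R_{\odot }\).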

If the primary compact object is a NS the second is most often also a NS. This likely forms a BNS system, but could form a NSBH system if the primary accretes sufficient mass to collapse into a BH during the common envelope phase. If the primary collapses directly to a BH the system becomes a NSBH system if the secondary is light enough to form a NS, otherwise it is a Binary Black Hole (BBH) system.

The prior discussion focused on what is thought to be the standard formation channel for BNS and NSBH systems whose mergers we can observe. It is also believed that a smaller number of systems can be formed dynamically, where two compact objects form separately but become gravitationally bound when they travel close enough to each other. NSs and BHs in globular clusters will tend to gravitate towards the center due to dynamical friction, leading to both a higher likelihood of dynamical capture and an accelerated inspiral aided by three-body interactions with other objects. This could contribute \(\sim \) 10% of merger events (e.g., Belczynski et al. 2002). There may be rare head-on collisions that would behave quite differently. These are beyond the scope of this paper, but investigations of their relative importance can be studied from the information relevant for Sect. 3.5.

Inspiral

After the BNS or NSBH system is formed, the two compact objects will lose energy to GWs, causing the two compact objects to inspiral towards one another. Long before merger this emission is weak and the orbital evolution is slow. Close to merger time the energy released greatly increases and the orbital evolution accelerates. We discuss these two cases and how we can best observe them separately.

Observations of the inspiral long before merger are best performed using EM observations of galactic BNS systems. An overview of the known galactic BNS systems and their observed parameters is available in Tauris et al. (2017). These BNS systems have inspiral times from \(\sim \) 85 Myr to greater than a Hubble time. There is no known galactic NSBH system.

The discovery of the Hulse-Taylor binary system (Hulse and Taylor 1975) enabled precise measures of the orbital decay of a compact binary system for the first time. Years of careful observation enabled a determination of the properties of the stars and the first proof of GW radiation (Taylor and Weisberg 1982).

These systems spend only a tiny fraction of their lives in the late inspiral phase, which is roughly hours to minutes before merger. We are unlikely to observe a NS system at this phase within the Milky Way, and are thus left to detecting extragalactic events. BNS and NSBH systems beyond the local group will likely be undetectable in photons during the early inspiral stage. Within the last \(\sim \) 100 s before merger it is possible that precursor EM emission could be detectable for some nearby events. The strongest observational evidence is the claim of precursor activity preceding the main episode of prompt SGRB emission (Troja et al. 2010); however, this question remains unsettled. There are theoretical models that predict precursor emission in gamma-rays, X-rays, and radio, with typical luminosities \(\sim 10^{42}\) to \(10^{47}\,\text{erg/s}\). These are discussed in Sects. 2.10 and 4.7.

GW observations of stellar mass compact object inspirals provide a new method to study these systems at this stage. Because of their extremely dense nature, compact binary inspirals are among the strongest sources of GWs. As they approach merger time, where the orbital radius is similar to the size of the NSs themselves, the luminosity of this signal increases and the emitted GW frequency enters the band of the ground-based interferometers. Shortly thereafter the objects enter the merger stage.

Merger

The loss of energy to GW radiation shrinks the orbital separation, increases the orbital frequency (with \(f_{\mathrm{GW}}=2f_{\mathrm{orb}}\) as the dominant GW emission is quadrupolar) and strengthens the GW emission. This frequency evolution results in the well-known Compact Binary Coalescence (CBC) chirp signal. The peak GW luminosity approaches \(10^{56}\,\text{erg/s}\) around merger time (e.g., Abbott et al. 2019b; Zappa et al. 2018). In the surrounding \(\sim \) seconds the NS can be so disrupted that it releases matter which can power ultrarelativistic polar jets (Sect. 2.1.4) and mildly relativistic quasi-isotropic outflows (Sect. 2.1.5) that produce the known EM and likely neutrino counterparts.
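To illustrate how the GW frequency rises as the orbit shrinks, the sketch below combines \(f_{\mathrm{GW}}=2f_{\mathrm{orb}}\) with Kepler's third law. The Newtonian point-mass orbit is an assumption of this example and becomes inaccurate close to merger, so the numbers are order-of-magnitude guides only; the function name is illustrative.

```python
import math

# Rounded SI constants (assumptions of this sketch).
G = 6.674e-11     # gravitational constant [m^3 kg^-1 s^-2]
M_SUN = 1.989e30  # solar mass [kg]


def gw_frequency_hz(separation_km, m1_msun=1.4, m2_msun=1.4):
    """Dominant (quadrupolar) GW frequency, f_GW = 2 f_orb, for a circular
    Keplerian orbit of two point masses at the given separation."""
    total_mass = (m1_msun + m2_msun) * M_SUN
    r = separation_km * 1.0e3  # km -> m
    f_orb = math.sqrt(G * total_mass / r**3) / (2.0 * math.pi)
    return 2.0 * f_orb


if __name__ == "__main__":
    for sep_km in (1000.0, 300.0, 100.0, 50.0):
        print(sep_km, "km ->", gw_frequency_hz(sep_km), "Hz")
```

Because \(f_{\mathrm{GW}} \propto R^{-3/2}\), shrinking the separation by a factor of four raises the frequency by a factor of eight, which is the origin of the rapid upward sweep of the chirp.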

There are several potential contributions to the matter freed from the NS. We follow the discussions from Margalit and Metzger (2019), Kawaguchi et al. (2020), Metzger (2020). Dynamical ejecta is released within milliseconds of the merger. The deformation of the NS late in the inspiral and efficient angular momentum transport from the remnant can release matter through tidal tails that can become spiral arms, which eject matter predominantly in the equatorial region. Shock-heating occurs at the interface of two NSs, squeezing out matter through quasi-radial oscillations at the interface region, which can dominate the polar region due to the lower densities in this region and solid angle spin effects.

Additional matter is ejected starting after the dynamical timescale and continuing for up to \(\sim \) 10 s after merger and is referred to as post-merger or wind ejecta. Disk winds can occur due to several physical processes. Magnetic fields can drive fast outflows with much of the ejection occurring within the first \(\sim \) 1 s (Siegel and Metzger 2017; Fernández et al. 2018). Longer term ejection after \(\sim \) 1 s can occur when viscous heating and nuclear recombination dominate over neutrino cooling (Metzger et al. 2008a, 2009). There can also be significant contributions from a remnant NS which can power neutrino winds, magnetically driven outflows, and even strip material from the surface of the remnant itself (e.g., Dessart et al. 2008; Fernández and Metzger 2016).

The unbound material, or ejecta, is characterized by total mass, average velocity, and electron fraction \(Y_e \equiv n_p/(n_n +n_p)\) where \(n_n\) and \(n_p\) are the number densities for neutrons and protons, respectively. More detailed treatments consider additional behavior, such as the spatial and density distributions. Winds from the central engine can alter these properties, broadening the spatial distributions, accelerating and heating the outflows, providing additional matter, and altering the electron fraction through neutrino irradiation via the charged-current interactions

$$\begin{aligned}&p + e^- \leftrightarrow n + \nu _e,\nonumber \\&n + e^+ \leftrightarrow p + \bar{\nu _e}. \end{aligned}$$
(2)

Given the much larger initial fraction of neutrons to protons, these interactions will drive \(Y_e\) to higher values until equilibrium is achieved. These thermal neutrinos originate from the accretion disk or, when a remnant NS is present, are created in pair interactions near its surface

$$\begin{aligned} e^+ + e^- \leftrightarrow \nu + \bar{\nu }. \end{aligned}$$
(3)
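The definition of \(Y_e\) and the direction in which neutrino irradiation pushes it can be made concrete with a toy calculation. The sketch below is a bookkeeping illustration only: the number densities and the converted fraction are arbitrary illustrative values, and the helper names are hypothetical.

```python
def electron_fraction(n_n, n_p):
    """Y_e = n_p / (n_n + n_p) for neutron and proton number densities."""
    if n_n < 0 or n_p < 0 or (n_n + n_p) == 0:
        raise ValueError("number densities must be non-negative and not both zero")
    return n_p / (n_n + n_p)


def convert_neutrons(n_n, n_p, fraction):
    """Turn a given fraction of the neutrons into protons, the net effect of
    the n -> p direction of reactions (2); baryon number is conserved."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    converted = fraction * n_n
    return n_n - converted, n_p + converted


if __name__ == "__main__":
    n_n, n_p = 9.0, 1.0  # arbitrary units, neutron-rich starting composition
    print("initial Y_e:", electron_fraction(n_n, n_p))
    n_n, n_p = convert_neutrons(n_n, n_p, 0.25)
    print("Y_e after converting 25% of neutrons:", electron_fraction(n_n, n_p))
```

Starting from neutron-rich matter, any net conversion of neutrons into protons raises \(Y_e\), which is the trend described above for irradiated ejecta.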

We expect enormous variation between NS mergers. BNS and NSBH mergers should be quite different. Each of these can be further divided into sub-classes, which are discussed in detail below. Within these sub-classes we expect additional variety depending on the intrinsic parameters of the system.

NSBH mergers can be split into two classes. The delineation depends on whether \(r_{\mathrm{tidal}}\), the orbital separation at which the NS disrupts, is less than or greater than \(r_{\mathrm{ISCO}}\), the Innermost Stable Circular Orbit (ISCO) of the BH (Foucart 2012; Foucart et al. 2018). For a non-spinning BH \(r_{\mathrm{ISCO}}=6GM/c^2\). The spin of the BH alters this distance, approaching \(r_{\mathrm{ISCO}}=9GM/c^2\) for maximal retrograde spin and approaching the event horizon for maximal prograde spin. The NS disruption occurs when tidal acceleration due to the inspiral exceeds the self-gravity of the NS, and depends on the properties of the NS, including the NS Equation of State (EOS) (Sect. 7.2). Disruption is favored for low mass BHs, for BHs with high prograde spin, and for large NSs. When no disruption occurs we refer to these as Heavy NSBH mergers; when disruption does occur we refer to them as Light NSBH mergers as they have lower mass and should produce bright EM radiation.

  • Heavy NSBH Mergers

    Heavy NSBH mergers swallow the NS whole. They will produce significant GW emission during inspiral and coalescence, with BH ringdown frequencies up to \(\sim \) 1–2 kHz (Pannarale et al. 2015). Note the frequencies discussed here are the expected maximum values in a given NS merger type, not the ISCO frequencies. This is likely to be the only observable signal for these events.

  • Light NSBH Mergers

    NSBH mergers with tidal disruption can release a sizable fraction of the total NS before it enters the BH. The GW emission from these events is, in general, weaker than the heavy NSBH cases due to the lower mass. They will tend to reach higher frequencies, \(\sim \) 3–4 kHz (Pannarale et al. 2015), owing to the generally smaller BH size.

    Light NSBH mergers are more exciting for traditional (that is, EM) and neutrino astronomers. Disruption of the NS releases ejecta in the equatorial plane due to tidal effects. This dynamical ejecta moves outward at \(\sim \) 0.2–0.3c, roughly corresponding to the orbital velocity at \(r_{\mathrm{tidal}}\), and is incredibly neutron-rich with \(Y_e \lesssim 0.1\) (Kiuchi et al. 2015; Foucart et al. 2014). The bound material stretches around the BH into an accretion disk with a total mass up to \(\sim 0.1\,M_{\odot} \). The disk is initially maintained as neutrino cooling dominates other effects, with peak luminosities approaching \(\sim 10^{53}\,\text{erg/s}\) (e.g., Just et al. 2016). The main disk ejection phase can release tens of percent of the total disk mass at \(\sim 0.1c\); while this material initially also has \(Y_e \lesssim 0.1\), neutrino irradiation can significantly raise the electron fraction of polar ejecta due to geometric exposure effects to the disk torus and lower densities in this region (e.g., Fernández et al. 2018).
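The ISCO radii quoted above for the NSBH classification (\(6GM/c^2\) without spin, \(9GM/c^2\) for maximal retrograde spin, and the event horizon for maximal prograde spin) follow from the standard Kerr expression of Bardeen et al. (1972). That closed-form expression is not given in this review and is supplied here as an illustration; the function name and sign convention (positive spin for prograde orbits) are choices of this sketch.

```python
import math


def isco_radius(spin):
    """Kerr ISCO radius in units of GM/c^2 (Bardeen et al. 1972).

    spin is the dimensionless BH spin in [-1, 1]; positive values mean the
    orbit is prograde, negative values retrograde.
    """
    if not -1.0 <= spin <= 1.0:
        raise ValueError("dimensionless spin must lie in [-1, 1]")
    a = abs(spin)
    z1 = 1.0 + (1.0 - a * a) ** (1.0 / 3.0) * (
        (1.0 + a) ** (1.0 / 3.0) + (1.0 - a) ** (1.0 / 3.0)
    )
    z2 = math.sqrt(3.0 * a * a + z1 * z1)
    # Clamp at zero to guard against round-off when the spin is tiny.
    root = math.sqrt(max(0.0, (3.0 - z1) * (3.0 + z1 + 2.0 * z2)))
    return 3.0 + z2 - root if spin >= 0.0 else 3.0 + z2 + root


if __name__ == "__main__":
    for chi in (-1.0, -0.5, 0.0, 0.5, 0.9, 1.0):
        print("spin", chi, "-> r_ISCO =", isco_radius(chi), "GM/c^2")
```

The ISCO moves inward monotonically as the prograde spin increases, which is why high prograde spin favors \(r_{\mathrm{tidal}} > r_{\mathrm{ISCO}}\) and therefore disruption of the NS.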

The structure of NSs is determined by the counterbalance of the combination of degeneracy pressure and nuclear forces against gravity. NSs have a maximum mass, beyond which they will collapse to a BH; however, when there are additional mechanisms supporting the star against gravitational collapse this mass threshold can be temporarily altered. The heaviest NSs that do not immediately collapse to a BH are supported against collapse by internal differential rotation, and are referred to as HyperMassive Neutron Stars (HMNSs; Baumgarte et al. 1999). Slightly lighter NSs can be supported against collapse by uniform rotation, referred to as Supramassive Neutron Stars (SMNSs). NSs that do not require additional support mechanisms are referred to as Stable NSs.

BNS mergers can be broadly split into four possible outcomes. Cases with the heaviest progenitor NSs are expected to promptly collapse to a BH in \(\lesssim 10\,\text{ms}\). Slightly lighter progenitors should result in a short-lived HMNS remnant with typical lifetimes of \(\lesssim 1\,\text{s}\) due to efficient energy losses to internal torques (Shibata and Taniguchi 2006; Sekiguchi et al. 2011). At lower masses the remnant object can survive as a SMNS with inefficient energy losses through magnetic dipole and quadrupolar GW radiation. Shortly after merger the (meta)stable NS is expected to have strong magnetic fields, which results in lifetimes as short as hundreds or thousands of seconds (Ravi and Lasky 2014). Finally, it may be possible for two low-mass progenitor NSs to combine into a Stable NS. We separate the following paragraphs to discuss our current understanding of these events from the most to least massive cases. Here the Stable NS and SMNS cases are combined as their lifetimes greatly exceed the merger and ejecta timescales, making these events very similar at this stage.

  • Prompt Collapse

    With sufficiently heavy NSs the system will collapse to a BH within milliseconds. These will be the loudest BNS mergers during inspiral due to their higher masses. In this case the GW frequencies reach \(\sim \) 6–7 kHz (e.g., Shibata and Taniguchi 2006; Clark et al. 2014), the highest achieved for any NS mergers. The inspiral is followed by BH ringdown, which has much weaker GW emission.

    Near merger, angular momentum transport stretches the NSs, forming tidal tails in the equatorial plane. Equal-mass binaries have been shown to release dynamical ejecta with a low electron fraction \(Y_e \lesssim 0.1\) with mass \(10^{-4}\)–\(10^{-3}\,M_{\odot} \) and outward velocity \(\sim 0.3c\) (Hotokezaka et al. 2013; Just et al. 2015). Asymmetric mass ratios have been shown to achieve \(5\times 10^{-3}\,M_{\odot} \) (Kiuchi et al. 2019). This is far lower total ejecta than the Light NSBH merger case as NSs are larger than similar mass BHs. The other main dynamical ejecta mechanism in BNS mergers is negligible for this case as it is immediately swallowed by BH formation.

    The tidal tails stretch until they form an accretion disk which can range from \(10^{-4}-10^{-2}\,M_{\odot} \), depending on the NS EOS (e.g., Shibata and Taniguchi 2006; Hotokezaka et al. 2013; Just et al. 2015; Ruiz and Shapiro 2017). Magnetically-driven outflows and thermally-driven winds can both release up to 20% of the disk mass.

  • Hypermassive Neutron Star Remnant

    BNS mergers that result in HMNS remnants will have similar inspirals as the prompt collapse case, though a bit quieter. During the HMNS phase the internal differential rotation releases GWs about as loud as the peak emission at coalescence, which occurs at \(\sim \) 2–4 kHz (Zhuge et al. 1994; Shibata and Uryū 2000; Hotokezaka et al. 2013; Maione et al. 2017). When the HMNS collapses there is BH ringdown emission.

    The tidal ejecta for these mergers (Hotokezaka et al. 2013; Bauswein et al. 2013b) behave differently than the previously discussed cases. For disks around a BH the material accretes in the equatorial region. For a NS remnant the presence of a hard surface causes the in-falling matter to envelop the surface, resulting in additional material in the polar regions (Metzger and Fernández 2014). The unbound tidal ejecta for BNS mergers with a HMNS remnant will expand outwards at \(\sim 0.15{-}0.25c\). These are also the heaviest mergers that will have significant dynamical ejecta from the shock interface between the two NSs; this ejecta will dominate in the polar regions due to solid angle effects and the lower densities in this region. If the HMNS lives for \(\gtrsim \)50 ms the neutrino luminosity can strip \(\sim 10^{-3}\,M_{\odot} \) of material from the surface of the remnant itself (Dessart et al. 2008; Fernández and Metzger 2016).

    During these ejection processes the HMNS has formed and is of sufficient temperature (few MeV) to produce significant amounts of \(e^+e^-\) pairs at its surface. The total MeV neutrino emission can be \(10^{53}\,\text{erg/s}\) with contributions from both the disk and the temporary NS (e.g., Sekiguchi et al. 2011). The tidal tail ejecta is sufficiently massive, dense, and distant that its electron fraction is largely unchanged (\(Y_e \approx 0.1{-}0.2\)). However, the polar material is closer, has lower densities, and a greater geometric exposure to the disk allowing the combined neutrino irradiation to significantly alter the electron fraction of the dynamical material in this region (\(Y_e \approx 0.3{-}0.4\); Wanajo et al. 2014).

    Given the larger amount of disruption and the lower overall velocity of the disrupted material, HMNS remnants have larger disk masses than the prompt collapse case. The HMNS collapses in under a second during the disk wind phase. So long as the HMNS lives, the neutrino luminosities will cause an increase in the amount of ejected material and monotonically increase the electron fraction. From Metzger and Fernández (2014), the amount of disk wind ejecta can exceed the dynamical ejecta; if the HMNS lives for 100 (300) ms the effects of the HMNS can eject up to \(\sim \) 10% (\(\sim \) 30%) of the total disk mass into the equatorial region and \(\sim \) 5% (\(\sim \) 10%) into the polar region. For disk wind ejecta the equatorial material will be distributed between \(Y_e \approx 0.1{-}0.5\) and the polar material will be \(Y_e \gtrsim 0.3\), and move outwards at up to \(\sim 0.1c\).

    The combination of the dynamical and post-merger ejecta and their alteration due to the HMNS surface and winds summarizes into a reasonably simple picture. The dynamical ejecta leaves first being lanthanide-rich in the equatorial region and relatively lanthanide-free in the polar region, with a roughly comparable contribution from each component. Behind this is the ejecta from the disk winds which follows a similar spatial distribution of lanthanide-fraction. This combines to the representative Fig. 7 of Metzger (2020) and our similar representation in Fig. 9.

  • Stable and Supramassive Neutron Star Remnants

    SMNS remnants survive for longer than the ejection phase (e.g., Ravi and Lasky 2014), meaning they are quite similar to Stable NS remnants during merger and ejection. The GW emission is similar to the HMNS case: the emission is slightly weaker during inspiral and transitions to significant GW release due to internal differential rotation, but would then be followed by secular GW radiation (e.g., Foucart et al. 2016) at twice the rotational frequency for some time. The longevity of this last phase of GW emission is not well constrained, but when the SMNS collapses there will be weak BH ringdown emission. The neutrino flux is similar to the HMNS case, but the total irradiation would be significantly greater as the cooling time for the full NS is longer than the lifetime of HMNSs.

    The initial ejecta is similar to the HMNS case, but the longer life of the NS provides additional ejecta and wind to the system. This results in greater total ejecta material moving at somewhat larger velocities and the polar dynamical and disk wind ejecta achieving electron fractions approaching the equilibrium value (e.g., Sekiguchi et al. 2011).

    The neutrino heating likely causes ejection of the majority of the total disk mass (Metzger and Fernández 2014). These systems can potentially approach an ejection up to \(0.1\,M_{\odot} \) (e.g., Coughlin et al. 2018; Margalit and Metzger 2019), with the disk wind ejecta dominating over dynamical ejecta, though large uncertainty remains. Stripping of material from the NS surface due to the neutrino-driven wind from the hot NS remnant can be more important here than in the HMNS case (e.g., Dessart et al. 2008).

    Lastly, the spin-down energy from these remnants should provide massive continued energy injection into the system. This is reviewed in detail in Metzger (2020).

Our understanding of what occurs during BNS and NSBH mergers comes from detailed simulations accounting for several incredibly complicated, coupled, non-linear effects. Despite the lengthy description in the preceding paragraphs, we have omitted several in-depth investigations into the effects of varying individual parameters, such as eccentricity, mass ratio, total mass, spins, the NS EOS, etc. The outcome of these variations is not immediately obvious. For a thorough review of these effects we refer to Fernández and Metzger (2016) and Metzger (2020). The large uncertainty range in the previously described parameters includes both the intrinsic effects of variation of these parameters and differences in the simulations, which vary their approximations.

However, some general effects are robust. For NSBH mergers there is larger mass ejection for lower mass BHs with higher values of spin. For BNS mergers there is a positive correlation for the total ejecta mass and electron fraction with the lifetime of the NS. Combining information from population synthesis models, numerical modeling, and the current constraints on the maximum mass of a NS we generally expect to eventually observe all of these cases. The exception might be a BNS merger with a Stable NS, which may or may not be possible, depending on if the lightest NSs are less than half the maximum NS mass (Sect. 7.1).

Jets

The disrupted but still bound material accretes onto the remnant object. In at least some cases, this produces a highly collimated, ultrarelativistic jet that results in a SGRB, as confirmed with GW170817 and GRB 170817A. As much of this process is still poorly understood we here pull the phenomenological arguments from Fernández and Metzger (2016).

These jets have enormous kinetic energies and produce some of the most luminous EM events in existence, with each approaching \(10^{50}\,\text{erg}\) (Fong et al. 2015). These are powered by the accretion disks (Oechslin and Janka 2006), with \(10^{-4}{-}0.3\,M_{\odot} \) available according to simulations (the range includes extreme conditions but neglects heavy NSBH mergers with no released matter). The pure conversion of a typical value of \(0.1\,M_{\odot} \) into energy gives \(0.1\,M_{\odot} c^2 \approx 10^{53}\,\text{erg}\), which is sufficient to power a SGRB with reasonable overall efficiencies.
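The energy budget above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming standard CGS constants and taking the \(10^{50}\,\text{erg}\) jet energy and \(0.1\,M_{\odot} \) disk mass quoted in the text; the function names are illustrative only.

```python
# Rest-mass energy of an accretion disk, in CGS units.
M_SUN_G = 1.989e33   # solar mass in grams (standard value)
C_CM_S = 2.998e10    # speed of light in cm/s (standard value)


def rest_mass_energy_erg(mass_msun):
    """Return E = M c^2 in erg for a mass given in solar masses."""
    return mass_msun * M_SUN_G * C_CM_S**2


def required_efficiency(jet_energy_erg, disk_mass_msun):
    """Fraction of the disk rest-mass energy needed to supply the jet energy."""
    return jet_energy_erg / rest_mass_energy_erg(disk_mass_msun)


# Values quoted in the text: a 0.1 Msun disk and a ~1e50 erg jet.
e_disk = rest_mass_energy_erg(0.1)
eff = required_efficiency(1e50, 0.1)
print(f"E(0.1 Msun) = {e_disk:.2e} erg")
print(f"efficiency needed for a 1e50 erg jet = {eff:.1e}")
```

Under these assumptions the required overall efficiency is well below one percent, which is the sense in which the reservoir is sufficient.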

How this energy reservoir is converted into the jet is somewhat unsettled (Sect. 4.3). However, it is agreed that an enormous amount of energy, predominantly from the accreting matter, is deposited in the relatively empty polar regions near the surface of the compact object, which launches an ultrarelativistic fireball away from the central engine. This outflow is collimated into a jet by the material encroaching on the polar region, e.g., the thick accretion disk (or torus) and by the magnetic fields emanating from the system. The emission from the collimated ultrarelativistic jet is only detectable for observers within the jet opening angle, \(\theta _j\), due to Doppler beaming limiting the visibility region to \(1/\varGamma \), where \(\varGamma \) is the bulk Lorentz factor with typical value \(\sim 100\). The statements here are detailed and referenced in Sect. 4.

If there is significant baryonic matter in this region it is expected to sap the available energy and prevent jet launch (Sect. 4.2). If a jet launches and there is ejecta above the launch site in the polar region the jet must propagate through to successfully break-out; otherwise it could, in principle, be choked. The collimation and the jet interaction with polar material imparts structure onto the jet itself (Sect. 4.4).

Jets that successfully break out move outwards at nearly c. At \(\sim 10^{12}{-}10^{13}\,\text{cm}\) the jet reaches the photospheric radius where light can escape for the first time (Beloborodov and Mészáros 2017). At around the same distance the jet may release the prompt SGRB emission due to the occurrence of internal shocks (though there are alternative models with much larger distances, see Sect. 4.6). The emission is characterized by a total duration of \(\sim 0.01{-}5\,\text{s}\), predominantly in the \(\sim 10\,\text{keV}\) to \(\sim 10\,\text{MeV}\) range, with peak isotropic luminosities \(\sim 10^{51}\,\text{erg}\,\text{s}^{-1}\) (e.g., von Kienlin et al. 2020; Abbott et al. 2017b).

After the prompt SGRB, the ultrarelativistic jet continues to speed away from the central engine, with a total kinetic energy \(\sim 10^{50}\,\text{erg}\), and interacts with the surrounding circumburst material with typical densities \(\sim 10^{-4}{-}0.1\,\text{cm}^{-3}\) (Fong et al. 2015). As the jet interacts its bulk Lorentz factor decreases, the observable angle grows, and it emits synchrotron radiation across nearly the entire EM spectrum, which has been detected from radio to GeV energies (e.g., Ackermann et al. 2010; Fong et al. 2015). This emission is referred to as the Gamma-ray burst (GRB) afterglow.

In Sect. 4.7 we discuss other high energy signatures potentially related to the ultrarelativistic jet. For now it is sufficient to note that observations strongly suggest late-time energy injection into the system from the central engine, which likely has implications for other observable signatures.

Quasi-isotropic outflows

The unbound matter from the system evolves far differently than the bound material that powers the ultrarelativistic jet. This ejecta is neutron-rich, contains roughly \(10^{-3}{-}10^{-1}\,M_{\odot} \), and moves outward at \(\sim 0.1{-}0.3 c\). The rest of this section borrows heavily from Metzger and Fernández (2014), Metzger et al. (2014), Fernández and Metzger (2016), Tanaka (2016) and Metzger (2020). The merger process significantly raises the temperature of the NS(s). As the ejecta expands and releases energy as thermal neutrinos it rapidly cools, entering relatively slow homologous expansion in only \(\sim 10{-}100\,\text{ms}\).

At \(\lesssim 10^{10}\,\text{K}\) free nucleons combine into \(\alpha \) particles. At \(\lesssim 5 \times 10^{9}\,\text{K}\) the \(\alpha \)-process forms seed nuclei with \(A \sim 90{-}120\) and \(Z \sim 35\) (Woosley and Hoffman 1992). The neutron-to-seed ratio results in rapid neutron captures at rates exceeding the \(\beta \) decay of the seeds, rapidly synthesizing the heaviest elements. This is the so-called r-process, responsible for half the heavy elements (here meaning beyond iron) in the universe. This continues until the nuclei reach \(A \gtrsim 250\) where fission splits the atoms in two, which are subsequently pushed to higher atomic mass in a process referred to as fission recycling. This generically returns peaks near the closed shell numbers \(A = 82, 130, 196\), observed in the solar system elemental abundances. A few seconds have passed.

The heavy nuclei are undergoing heavy radioactive decay, producing copious amounts of neutrinos (\(\sim 0.1{-}10\,\text{MeV}\)), nuclear gamma-rays (dozens of keV to a few MeV), and elements that approach the line of stability over time (e.g., Hotokezaka et al. 2016b). At early times the overwhelming majority of released energy escapes as neutrinos because the ejecta material is dense and opaque for photons (see Fig. 4, discussion, and references in Metzger 2020). In base kilonova models, the earliest photons that can escape are the nuclear gamma-rays, beginning on the order of a few hours. Neutrinos escape with \(\sim \) 30–40% of the energy; gamma-rays carry 20–50% of the total energy. This significantly lowers the remaining energy in the system before it reaches peak luminosity (e.g., Barnes et al. 2016; Hotokezaka et al. 2016b).

The main frequency range of interest for EM observations of kilonovae is Ultraviolet, Optical, and Infrared (UVOIR). The opacity in this energy range is driven by atomic transitions of bound electrons to another bound energy state. The open f shell of the lanthanides (\(Z = 58-72\)) has angular momentum quantum number \(l=3\), giving \(g=2(2l+1)=14\) valence electron states, in which n electrons can be arranged in \(C = g!/[n!(g-n)!]\) possible configurations. Bound-bound transitions scale as \(C^2\), resulting in millions of transition lines in the UVOIR range. As the ejecta is expanding with a significant velocity gradient (e.g., Bauswein et al. 2013b) all of these lines are Doppler broadened. This blankets the entire range, preventing this light from escaping at early times.
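The counting argument for the open f shell can be made concrete. This is a minimal sketch, assuming only the formulas quoted above (\(g = 2(2l+1)\), the binomial count of configurations, and the \(C^2\) scaling of bound-bound transitions); the function name is illustrative and no atomic selection rules are applied.

```python
from math import comb

L_F_SHELL = 3                   # angular momentum quantum number l of an f shell
g = 2 * (2 * L_F_SHELL + 1)     # number of valence electron states, g = 2(2l + 1)


def configurations(n, g_states=g):
    """Number of ways to place n electrons in g_states states, g!/[n!(g-n)!]."""
    return comb(g_states, n)


# Tabulate C and the C^2 scaling of bound-bound transitions for each occupancy.
for n in range(1, g):
    c = configurations(n)
    print(f"n = {n:2d}  C = {c:5d}  C^2 = {c**2:9d}")
```

For a half-filled shell (\(n = 7\)) this gives \(C = 3432\) and \(C^2 \approx 1.2 \times 10^{7}\), consistent with the statement that the line count reaches the millions.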

As time continues the ejecta loses energy to neutrinos and gamma-rays, cools as it expands, the radioactive heating rate slows, and it transitions to lower densities until eventually the UVOIR photons can escape, resulting in a quasi-thermal transient known as a kilonova. The energy deposition rate of most forms of radioactivity of interest here decays as a power law with index \(-1.1\) to \(-1.4\) (see Metzger 2020, and references therein). In the hours to days post-merger this maintains high temperatures in the ejecta, with values \(\sim 10^3{-}10^4\,\text{K}\). Ejecta with relatively high initial electron fraction \(Y_e \gtrsim 0.3\) will produce mostly lanthanide-free material which will result in a blue kilonova with peak luminosity on the \(\sim \) 1 day timescale (e.g., Metzger et al. 2010). Ejecta with low electron fraction \(Y_e \lesssim 0.3\) will produce lanthanide-rich material (and potentially actinides) that will produce a red kilonova with a peak luminosity timescale of \(\sim \) 1 week (e.g., Barnes and Kasen 2013).

The prior paragraphs in this section discuss the base-kilonova model, but there may be significant additional signals or alteration of these observables from the quasi-isotropic outflows. These include the radioactive decay of neutrons that are not captured into nuclei, the effects of jet interactions on the previously ejected polar material, and late-time energy injection from the central engine. These are summarized in Sect. 3.4, which references detailed works covering each.

Aftermath

After the energy ejection ends and the kilonova cools and fades, the quasi-isotropic ejecta will continue moving outwards. Over the next few months and years the event will transition to the nebular phase. Once it reaches the deceleration radius, where it has swept up a comparable amount of mass from the surrounding environment, the ejecta will transition to a Sedov–Taylor blast wave that releases synchrotron radiation in the radio bands (Nakar and Piran 2011; Piran et al. 2013; Hotokezaka and Piran 2015), analogously described as a kilonova afterglow.

Over decades, centuries, and millennia it forms a Kilonova remnant (KNR). These are bound by a shock wave at the interaction of the merger ejecta and surrounding material, providing a transition edge. They may be similar to supernova remnants but have lower total kinetic energies and will tend to occur in regions with lower surrounding material (due to occurring outside of their host galaxies). Even long after merger they will be radioactive, with emission dominated by isotopes with half-lives of similar order to the age of the remnant (Wu et al. 2019; Korobkin et al. 2020). Longer still, the kinetic energy will eventually be used up and the shock-front will dissipate. Ejecta that is bound to the host galaxy will eventually return and become part of the diffuse galactic material where long-term mixing distributes the heaviest elements throughout the galaxy (Wu et al. 2019). Some will eventually join new planets and stars, and a bit may eventually be dug out of the ground by advanced life. Heavy elements unbound from the host galaxy will be lonely for a reasonable part of eternity.

Intrinsic event rates

The rates of compact object mergers are of interest to several fields. The true value sets how quickly we can achieve specific scientific outcomes, and will determine the necessary devotion of observational resources and prioritization on telescopes with shared time. Estimates have arisen through several means, with predicted rates spanning several orders of magnitude. The most direct measurement comes from GW observations, calculated from the number of detections in a known spacetime volume. These are the basis for our assumed rates, and the large existing uncertainty should rapidly shrink in the next few years. The local volumetric rates assumed in this paper are explained below and summarized in Table 1.

The latest reported local volumetric rate measurements from LIGO/Virgo come from the discovery paper on GW190425, the second GW-detected BNS merger (Abbott et al. 2020a). The full 90% range reported for BNS mergers is 250–2810 \(\text{Gpc}^{-3}\,\text{yr}^{-1}\). This value is the union of two measurements, one considering a uniform mass prior between 1 and \(2\,M_{\odot} \) for each NS in a BNS merger and the second adding the sum of the rates of events like GW170817 to those like GW190425. The median value is approximately \(1000\,\text{Gpc}^{-3}\,\text{yr}^{-1}\). Following the initial release of this paper, which occurred before the publication on GW190425, and to enable ease of scaling as these reported rates are updated, we chose to use the BNS local volumetric rate of \(R=1000_{-800}^{+2000}\) (200–3000) \(\text{Gpc}^{-3}\,\text{yr}^{-1}\).

The rates of NSBH mergers are known with less precision. Abbott et al. (2019b) bound the local upper limit of NSBH mergers as a function of BH mass. Since we do not know the distribution of BH mass in NSBH merger systems we take the least constraining value of \(<610\,\text{Gpc}^{-3}\,\text{yr}^{-1}\), which is for \(M_{\mathrm{BH}}=5\,M_{\odot} \). The lower and mid-range value come from the merger rates expectations paper prior to the initialization of Advanced LIGO (Abadie et al. 2010), where the high rate is similar to the constraints reported above.

The LIGO Scientific Collaboration and Virgo Collaboration (LVC) has also reported the discovery of a CBC with a high mass ratio, GW190814 (Abbott et al. 2020c). Owing to the strength of the signal and the large mass asymmetry this allowed for a precise determination of the individual masses, with the secondary being between 2.50 and \(2.67\,M_{\odot} \). This is potentially the first NSBH merger identified, but is more likely to be a BBH merger. We do not inform our NSBH rates with this event. We may expect a directly measured value once a GW-detected event is unambiguously classified as an NSBH merger.

For comparison, we report the inferred volumetric local BBH merger rates with a mass function that is self-consistent with the observed BBH mergers from O1 and O2 (Abbott et al. 2019a). This gives a range of 24.4–111.7 \(\text{Gpc}^{-3}\,\text{yr}^{-1}\) with a central value of 54.4 \(\text{Gpc}^{-3}\,\text{yr}^{-1}\). This corresponds to roughly a factor of four uncertainty. This range is far narrower due to the larger number of detected BBH systems. As the number of detected NS mergers increases the precision of the local rate measurements will similarly improve.

Table 1 The local volumetric merger rates for BNS, NSBH, and BBH mergers

The rates of NS mergers vary through cosmic time. Under the standard formation channel, they should track the stellar formation rate modulo their inspiral times. The peak rate of SGRBs occurred at a redshift of \(\sim 0.5{-}0.8\) (e.g., Berger et al. 2013) before declining to the current rate. This is a useful proxy to estimate the largest average inspiral range due to the Malmquist bias in detecting SGRBs. The furthest known SGRBs occurred at a redshift of \(>2\) and few are expected beyond a redshift of \(\sim \) 5. We do not explicitly account for intrinsic source evolution for our detection rates in this manuscript. The rates of NS mergers do not evolve significantly over the distances at which we can detect these events through GWs, neutrinos, or as kilonovae for at least a decade. Source evolution does matter for SGRB observations, both prompt and afterglow, but our rates for those events are determined from empirical observations and thus source evolution is accounted for intrinsically.

We lastly close with the rates of rare events that may provide unique understanding of these mergers. Particularly nearby events will be able to be characterized to vastly greater detail; as such, we report the nearest event we may expect on fiducial timescales. Assuming the usual number density of Milky Way (MW)-like galaxies of \(\sim 0.01\,\text{Mpc}^{-3}\) (e.g., Hotokezaka et al. 2018), we show the rates per Milky Way-like galaxy per million years, and how many millennia we may expect between events in the Milky Way itself.

From Table 1 we can draw a few immediate conclusions. BNS mergers are locally more common than BBH mergers and likely more common than NSBH mergers. We may expect a BNS merger to occur within \(\sim \) 30 Mpc about once a decade. Events within \(\sim \) 20 Mpc are rare, occurring about as often as an average human lifetime. We should expect a BNS merger in the Milky Way about every 10 millennia.
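The conversions from a volumetric rate to a nearby-event rate and to a per-galaxy rate are simple enough to sketch. The block below is a rough illustration, assuming the median BNS rate of \(1000\,\text{Gpc}^{-3}\,\text{yr}^{-1}\) and the galaxy number density of \(0.01\,\text{Mpc}^{-3}\) quoted in the text, a Euclidean volume, and no source evolution; the function names are hypothetical.

```python
from math import pi

R_BNS = 1000.0   # local BNS rate in Gpc^-3 yr^-1 (median value adopted in the text)
N_MW = 0.01      # number density of Milky Way-like galaxies in Mpc^-3 (from the text)


def events_per_year_within(distance_mpc, rate=R_BNS):
    """Expected events per year inside a Euclidean sphere of the given radius."""
    volume_gpc3 = 4.0 / 3.0 * pi * (distance_mpc / 1000.0) ** 3
    return rate * volume_gpc3


def years_between_events_per_galaxy(rate=R_BNS, density=N_MW):
    """Mean waiting time in years between events in one Milky Way-like galaxy."""
    rate_per_mpc3 = rate / 1e9          # convert Gpc^-3 to Mpc^-3
    return density / rate_per_mpc3


wait_30mpc = 1.0 / events_per_year_within(30.0)
wait_mw = years_between_events_per_galaxy()
print(f"mean wait for an event within 30 Mpc: {wait_30mpc:.1f} yr")
print(f"mean wait per Milky Way-like galaxy: {wait_mw:.0f} yr")
```

Because both quantities scale linearly with the assumed rate, the quoted 200–3000 \(\text{Gpc}^{-3}\,\text{yr}^{-1}\) range maps directly onto the corresponding range of waiting times.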

Strongly lensed events are prized astrophysical occurrences. They provide both complementary and unique tests in cosmology (Refsdal 1964; Linder 2011; Blandford and Narayan 1992) and fundamental physics (Biesiada and Piórkowska 2009; Collett and Bacon 2017; Minazzoli 2019), and unique studies of transient events (e.g., Goobar et al. 2017; Perna and Keeton 2009). The detection and successful identification of a strongly lensed NS merger would be momentous, which is discussed in more detail in Sect. 6.2 and a few subsections of Sect. 8. The intrinsic rates of strongly lensed NS mergers are likely to be low but likely non-zero (e.g., Biesiada et al. 2014, after accounting for new rates estimates). These rates could be increased in the future by targeting known strongly lensed systems (see Collett 2015, for these prospects), analogous to the current galaxy targeting approach to EM follow-up of GW-detected NS mergers.

Gravitational waves

GWs are detected by measuring their effect on spacetime itself as the strain \(h = \varDelta L/L\), the fractional change of a length L, where \(\varDelta L\) is the change in that length (Abbott et al. 2009). At the reasonably nearby distance of \(\sim \) 100 Mpc (Sect. 2.2) the strain at Earth for a canonical BNS merger is \(\sim 10^{-21}\). Detection then requires the most sensitive ruler ever built. Weak GWs can be described by the ordinary plane wave solution. In General Relativity (GR) GWs have only two independent polarization modes (Will 2014). They can be distinguished by a \(\pi /4\) rotation in the plane perpendicular to the direction of motion, and are, by convention, referred to as the plus and cross polarization modes. The strains from these modes are \(h_+\) and \(h_{\times} \), respectively.

Following Schutz (2011), the antenna response function can be written in terms of the two GR polarization modes as

$$\begin{aligned} h(t) = F_+(\theta , \phi , \psi )h_+(t) + F_{\times} (\theta , \phi , \psi )h_{\times} (t) \end{aligned}$$
(4)

where \(\theta \) and \(\phi \) are spherical coordinates relative to detector normal, and \(\psi \) the polarization angle for the merger relative to this same coordinate system. \(F_+\) and \(F_{\times} \) are the interferometer response to the two polarization modes

$$\begin{aligned} F_+&= \frac{1}{2} (1+\cos ^2{\theta }) \cos {2 \phi } \cos {2 \psi } - \cos {\theta } \sin {2 \phi } \sin {2 \psi } \nonumber \\ F_{\times}&= \frac{1}{2} (1+\cos ^2{\theta }) \cos {2 \phi } \sin {2 \psi } + \cos {\theta } \sin {2 \phi } \cos {2 \psi }. \end{aligned}$$
(5)

The antenna power pattern, which the Signal-to-Noise Ratio (SNR) is proportional to, is

$$\begin{aligned} P(\theta ,\phi )&= F_+(\theta ,\phi ,\psi )^2 + F_{\times} (\theta ,\phi ,\psi )^2\nonumber \\&=\frac{1}{4} (1+\cos ^2{\theta })^2 \cos ^2{2 \phi } + \cos ^2{\theta } \sin ^2{2\phi } \end{aligned}$$
(6)

GW emission is omnidirectional but not isotropic. For CBCs we can define the radiated power as a function of inclination angle \(\iota \), which goes from 0 to 180 degrees because orientation matters for GW observations (as opposed to the 0 to 90 degree convention used for most EM observations). This relation can be represented as \(F_{\mathrm{rad}}\), referred to as the binary radiation pattern, and is defined as

$$\begin{aligned} F_{\mathrm{rad}}(\iota ) = \frac{1}{8}\big (1 + 6 \cos ^2(\iota ) + \cos ^4(\iota )\big ). \end{aligned}$$
(7)

It is equivalent to the \(\phi \)-average of the interferometer antenna pattern. It is strongest along the total angular momentum axis (\(\iota = 0, 180\)) and weakest in the orbital plane (\(\iota = 90\)).
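The relation between the antenna power pattern and the binary radiation pattern can be checked numerically. The sketch below is illustrative only: it writes the cross response with \(\sin 2\psi \) in its first term, the form for which the power pattern is independent of \(\psi \) as in Eq. 6, and compares the \(\phi \)-average of the power pattern with Eq. 7 evaluated at the same angle. Function names and the sampling density are assumptions.

```python
from math import cos, sin, pi


def f_plus(theta, phi, psi):
    """Interferometer response to the plus polarization (angles in radians)."""
    return (0.5 * (1 + cos(theta) ** 2) * cos(2 * phi) * cos(2 * psi)
            - cos(theta) * sin(2 * phi) * sin(2 * psi))


def f_cross(theta, phi, psi):
    """Interferometer response to the cross polarization (angles in radians)."""
    return (0.5 * (1 + cos(theta) ** 2) * cos(2 * phi) * sin(2 * psi)
            + cos(theta) * sin(2 * phi) * cos(2 * psi))


def power(theta, phi, psi=0.0):
    """Antenna power pattern F_+^2 + F_x^2; the psi dependence cancels."""
    return f_plus(theta, phi, psi) ** 2 + f_cross(theta, phi, psi) ** 2


def f_rad(iota):
    """Binary radiation pattern of Eq. 7."""
    c = cos(iota)
    return (1 + 6 * c ** 2 + c ** 4) / 8


def phi_average(theta, n=3600):
    """Average of the power pattern over phi using n equally spaced samples."""
    return sum(power(theta, 2 * pi * k / n) for k in range(n)) / n


angle = 0.7  # arbitrary test angle in radians
print(phi_average(angle), f_rad(angle))
print(f_rad(0.0), f_rad(pi / 2))
```

The two printed values on the first line agree to numerical precision, and the second line shows the factor of eight between face-on and edge-on systems.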

The sensitivity of individual ground-based interferometers is usually quoted in terms of detection distances for canonical BNS mergers (e.g., Abbott et al. 2018a). The detection horizon is the maximum detection distance, which occurs for face-on events (\(\iota \approx \)0 or 180, where the rotation axis is oriented towards Earth) that are directly overhead (or under). Converting the total sensitive volume to a spherical equivalent gives a radius referred to as the detection range, which is the usual figure of merit for (single) ground-based interferometer sensitivity. The horizon is 2.26 times the range (e.g., Abadie et al. 2012b).

NS mergers are identified in GW strain data through CBC searches, where CBC refers to BNS, NSBH, and BBH mergers for ground-based interferometers, which are found by looking for signals that match waveforms from a template bank of GW inspirals (e.g., Usman et al. 2016; Messick et al. 2017). Because the signals of interest are so weak and background noise is significant, a GW detection generally requires two or more interferometers to jointly trigger on an event. The interferometers are separated by thousands of kilometers, which results in generally uncorrelated background, giving a massive increase in search sensitivity. Signal significance has historically been quantified through the use of a False Alarm Rate (FAR), measuring how often an event with a given value of the ranking statistic occurs in background (e.g., Abbott et al. 2016a, b, 2017c). Recently, the development of \(P_{\mathrm{astro}}\), the probability that an event is astrophysical in origin, has provided additional information, conveying the chance a given event has an astrophysical origin based on an assumed volumetric event rate against the rate of detector noise in that region of parameter space. This is a more powerful method that should result in increased detection rates, but its effect on detection rates has not been quantified.

Interferometers directly measure amplitude, which falls as 1/d (e.g., Aasi et al. 2015), rather than the \(1/d^2\) flux measured by most astrophysical instruments. That is, an increase in sensitivity gives a proportional increase in detection distance, rather than the typical square-root dependence. For signal-dominated events this corresponds to a cubic increase in detection rates, rather than the typical 3/2 power.

The direct detection of GWs has recently been achieved through kilometer-scale modified Michelson interferometers (Abbott et al. 2016b). We first discuss the US-based observatories. At design sensitivity the Advanced LIGO interferometers are expected to achieve a BNS range of 175 Mpc (Barsotti et al. 2018) by \(\sim \) 2020. The NSF has funded the Advanced LIGO+ upgrade, which has a target BNS range of 330 Mpc (Zucker et al. 2016).

Beyond A+, there are proposed concepts. The LIGO Voyager upgrade would push the existing interferometers close to their theoretical maximum sensitivity, and we use a representative BNS range of 1 Gpc (McClelland et al. 2014). Lastly, third generation interferometers (e.g., Abbott et al. 2017a; Punturo et al. 2010) will detect these events throughout the universe. Converting from values in Reitze et al. (2019), the early stage Cosmic Explorer (\(\sim \) 2035) would have a BNS range of \(\sim \) 12 Gpc and the late-stage version (\(\sim \) 2045) \(\sim \) 60 Gpc. We take \(\sim \) 10 Gpc as a representative value.

The LIGO interferometers are only part of the ground-based GW detection network. The active GW detectors are the two Advanced LIGO interferometers and the Advanced Virgo (Acernese et al. 2015) interferometer. LIGO and Virgo work together as the LVC. They are to be joined by the Kamioka Gravitational Wave Detector (KAGRA) interferometer (Aso et al. 2013) in late 2019 and eventually by LIGO-India which would enter at the A+ version (Iyer et al. 2011). These interferometer sites are generally referred to by letters, H for LIGO-Hanford, L for LIGO-Livingston, V for Virgo, K for KAGRA, and I for LIGO-India. A summary of the currently expected ground-based GW network sensitivity and planned observing runs through \(\sim \) 2026 is shown in Fig. 2. The plan updates will be available online.

Fig. 2 The planned ground-based GW network observing runs. O1, O2, and about half of O3 have already completed. During O4 the interferometers should approach their Advanced design sensitivity. From 2025+ several interferometers will be upgraded to their advanced configuration

In Table 2 we report reasonable and conservative detection rates for NS mergers for the four representative sensitivities. Our base estimate accounts for only two, coaligned interferometers, equivalent to the HL configuration for at least the next decade. This enables easy calculation of a particularly conservative estimate. We also provide a broader network estimate as a function of time based on the network figures of merit in Schutz (2011, which are not directly comparable given the differing interferometer sensitivities) and simulations in Abbott et al. (2016c). All estimates assume individual interferometer livetime fractions of 70%, corresponding to 50% livetime for the HL(-like) configuration(s).

The Advanced and A+ rates are calculated with the intrinsic rates from Table 1 and their sensitivity volume. Source evolution at these distances is unimportant and neglected. The NSBH rates assume they are detected \(\sim \) 2 times further, corresponding to a reasonably light BH (giving conservative estimates) which should produce EM emission. The Voyager and Gen 3 rates assume no source evolution, which is a conservative estimate. The Gen 3 rates further only consider events within a redshift of 0.5, providing a very conservative limit. These ranges are 90% confidence, giving lower limits at 95% confidence.
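The structure of such a rate estimate, a volumetric rate multiplied by a sensitive volume and a network livetime, can be sketched as follows. This is a simplified illustration and is not intended to reproduce the entries of Table 2: it assumes a Euclidean volume set by the quoted BNS range, no source evolution or redshift corrections, the median BNS rate of \(1000\,\text{Gpc}^{-3}\,\text{yr}^{-1}\), and independent 70% livetimes for two interferometers. The function names are hypothetical.

```python
from math import pi


def sensitive_volume_gpc3(range_mpc):
    """Euclidean volume in Gpc^3 of a sphere whose radius is the detection range."""
    return 4.0 / 3.0 * pi * (range_mpc / 1000.0) ** 3


def detections_per_year(rate_gpc3_yr, range_mpc, single_livetime=0.7, n_required=2):
    """Expected detections per year when n_required interferometers must be on.

    Livetimes are treated as independent, so the joint livetime is the product.
    """
    network_livetime = single_livetime ** n_required
    return rate_gpc3_yr * sensitive_volume_gpc3(range_mpc) * network_livetime


advanced = detections_per_year(1000.0, 175.0)   # Advanced LIGO design range
a_plus = detections_per_year(1000.0, 330.0)     # A+ target range
print(f"Advanced design: {advanced:.1f} per year")
print(f"A+:              {a_plus:.1f} per year")
print(f"ratio:           {a_plus / advanced:.2f}")
```

The ratio of the two results equals the cube of the ratio of the ranges, which is the cubic scaling of detection rate with sensitivity noted earlier.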

Table 2 The expected interferometer sensitivities for the current Advanced interferometers at design sensitivity, the Advanced+ upgrade, the Voyager upgrade, and representative values for third generation interferometers

Beyond just detecting them, characterization of NS mergers is an additional priority for design requirements. The high end frequency is set by the wish to directly observe the merger events themselves. From Sect. 2.1.3 the highest expected maximum frequency is for the BNS prompt collapse case reaching \(\sim \) 6–7 kHz. Sufficiently capturing this range should also enable sensitive searches for NS modes above the primary frequency in the BNS (meta)stable remnant cases (see Ackley et al. 2020, and references therein).

Pushing to lower frequencies has a number of benefits, such as providing vastly improved parameter estimation precision due to a far greater SNR for a given event. A canonical BNS (NSBH) merger emitting GWs at 0.1 Hz will merge in about a decade (a year) (e.g., Graham et al. 2017). For NS mergers that will merge within an instrument lifetime this provides a reasonable lower frequency goal. This range is also ideal for the best-case GW localizations, as we will show. Thus, absent funding or technical considerations, the best range to study these events is \(\sim 0.1\,\text{Hz}\) to \(\sim \) 10 kHz. The rough frequency range for the four ground-based GW interferometer sensitivity examples is given in Table 2. For the next decade we are largely limited to the \(\sim \) 10–1000 Hz regime. Achieving higher frequencies may be possible, but pushing lower than 5 Hz on the ground is nearly impossible.
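The time remaining before merger at a given GW frequency can be estimated from the leading-order quadrupole formula for a circular binary, \(\tau = (5/256)\,(G\mathcal{M}_c/c^3)^{-5/3}(\pi f)^{-8/3}\). The sketch below uses this standard approximation, which neglects higher-order post-Newtonian, spin, and tidal corrections. The component masses (1.4 + 1.4 \(M_{\odot} \) for the BNS, 1.4 + 10 \(M_{\odot} \) for the NSBH) are illustrative assumptions rather than values taken from the text.

```python
from math import pi

G_MSUN_OVER_C3 = 4.925491e-6   # G * Msun / c^3 in seconds (standard value)
YEAR_S = 3.15576e7             # Julian year in seconds


def chirp_mass(m1, m2):
    """Chirp mass in the same units as the component masses."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2


def time_to_merger_s(f_gw_hz, m1_msun, m2_msun):
    """Leading-order time to merger in seconds for a circular binary."""
    t_chirp = G_MSUN_OVER_C3 * chirp_mass(m1_msun, m2_msun)
    return (5.0 / 256.0) * t_chirp ** (-5.0 / 3.0) * (pi * f_gw_hz) ** (-8.0 / 3.0)


bns_yr = time_to_merger_s(0.1, 1.4, 1.4) / YEAR_S     # assumed BNS masses
nsbh_yr = time_to_merger_s(0.1, 10.0, 1.4) / YEAR_S   # assumed NSBH masses
bns_5hz_hr = time_to_merger_s(5.0, 1.4, 1.4) / 3600.0
print(f"BNS at 0.1 Hz:  {bns_yr:.1f} yr to merger")
print(f"NSBH at 0.1 Hz: {nsbh_yr:.1f} yr to merger")
print(f"BNS at 5 Hz:    {bns_5hz_hr:.1f} hr to merger")
```

The steep \(f^{-8/3}\) dependence is why a modest reduction of the low-frequency cutoff lengthens the observable inspiral so strongly.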

Generic GW observations of CBCs measure more than a dozen parameters. The extrinsic system parameters include the location (\(\theta \), \(\phi \), and the luminosity distance \(d_L\)), inclination (\(\iota \)), polarization angle (\(\psi \)), eccentricity (e), coalescence phase (\(\phi _0\)), and merger time \(t_{\mathrm{GW}}\). The intrinsic parameters include the mass and spin components of each pre-merger object (\(m_1\), \(m_2\); \(\overrightarrow{S}_1\), \(\overrightarrow{S}_2\)). Most of these parameters have strong correlations (often referred to as degeneracies). One example is the amplitude dependence on both \(\iota \) and \(d_L\), contributing to greater uncertainty on both measures (Schutz 2002). For NS mergers matter effects accelerate the late inspiral, which can be captured by the tidal deformability parameter (\(\varLambda \)).

Eccentricity is generally expected to be zero for these systems, as circularization happens on a shorter interval than the expected inspiral time to merger (Peters and Mathews 1963; Faber and Rasio 2012). The polarization can be constrained for events detected by interferometers that are not coaligned, based on the SNRs and antenna response as a function of position. These detections will tend to have more precisely measured inclinations, as the parameters are correlated. The merger time and coalescence phase are precisely measured for NS mergers given the long inspirals (e.g., Abbott et al. 2017b). Tidal deformability is determined by the (non-)detection of accelerated inspirals due to matter effects, and for NSBH mergers, by determining the frequency at which tidal disruption occurs, which tends to happen at high frequencies where we currently have insufficient sensitivity.

The remaining GW-determined parameters are mass and spin. The masses are determined from the chirp mass

$$\begin{aligned} \mathcal{M}_c = \frac{(M_1 M_2)^{3/5}}{(M_1 + M_2)^{1/5}}, \end{aligned}$$
(8)

where \(M_1\) and \(M_2\) are the masses of the primary and secondary, and the mass ratio \(q = M_2/M_1\), which is by definition \(q \le 1\). For NS mergers the chirp mass measurement is extremely precise as the GW observation covers thousands of cycles, giving a precise measure of the frequency evolution of the inspiral. The mass ratio effect on the inspiral is perfectly correlated to first order with one of the spin parameters, requiring high SNR near merger to be well constrained. q will be poorly constrained for BNS mergers so long as the merger occurs out of band of the GW interferometers (Abbott et al. 2019b), except for particularly loud events. The spin components are usually written in terms of dimensionless spin \({\chi } \equiv c \mathbf{S}/(GM^2)\).
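Equation 8 and the mass ratio together determine the component masses, which the following minimal sketch makes explicit. The inversion \(M_1 = \mathcal{M}_c\,(1+q)^{1/5} q^{-3/5}\), \(M_2 = q M_1\) follows algebraically from Eq. 8; the example masses and function names are illustrative assumptions.

```python
def chirp_mass(m1, m2):
    """Chirp mass of Eq. 8, in the same units as the component masses."""
    return (m1 * m2) ** (3.0 / 5.0) / (m1 + m2) ** (1.0 / 5.0)


def component_masses(m_chirp, q):
    """Recover (M1, M2) from the chirp mass and mass ratio q = M2/M1 <= 1."""
    if not 0.0 < q <= 1.0:
        raise ValueError("mass ratio must satisfy 0 < q <= 1")
    m1 = m_chirp * (1.0 + q) ** (1.0 / 5.0) / q ** (3.0 / 5.0)
    return m1, q * m1


# Illustrative equal-mass BNS with 1.4 Msun components.
mc_equal = chirp_mass(1.4, 1.4)
print(f"chirp mass for 1.4 + 1.4 Msun: {mc_equal:.4f} Msun")

# The same chirp mass with a different assumed mass ratio.
m1, m2 = component_masses(mc_equal, 0.7)
print(f"q = 0.7 gives M1 = {m1:.3f}, M2 = {m2:.3f} Msun")
```

A well-measured chirp mass with a poorly constrained q therefore corresponds to a one-parameter family of component masses, which is why the individual masses are much less certain than \(\mathcal{M}_c\).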

A unique aspect of GW observations is knowledge of the distance to the source. Both the strain amplitude h and \(\dot{f}_{\mathrm{GW}}\) depend on \(\mathcal{M}_c\), defined in Eq. 8, enabling a determination of the luminosity distance to the source (Schutz 1986, 2002). For ground-based interferometers the typical distance uncertainty is tens of percent (e.g., Chen et al. 2017), with improved uncertainty for higher SNR events. Given the distance-inclination correlation, the constraint can be improved when external inclination information is provided (e.g., Guidorzi et al. 2017).

The earliest detectable signals from NS mergers are GWs. As such, they play an important role not only in the detection and characterization of these events, but also in providing localization information for searches with other instruments. Current ground-based GW interferometers can measure BNS merger times to sub-ms accuracy. As they are separated by thousands of kilometers and GWs travel at the speed of light (Abbott et al. 2017b) we can combine pairs of detections into narrow timing annuli on the sky. The narrowness is determined by \(\delta t/d_I\) where \(d_I\) is the distance between contributing instruments. The precise timing for BNS mergers (\(\lesssim \)ms) enables narrow annuli, despite the (comparatively) short baselines between interferometers.
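The dependence of the annulus width on \(\delta t/d_I\) can be illustrated with a first-order estimate. For a source at angle \(\alpha \) from the baseline joining two detectors the arrival time difference is \((d_I/c)\cos \alpha \), so a timing uncertainty \(\delta t\) maps to an angular width of roughly \(c\,\delta t/(d_I |\sin \alpha |)\). In the sketch below the 3000 km baseline and 0.1 ms timing uncertainty are illustrative assumptions, and the function name is hypothetical.

```python
from math import sin, radians, degrees

C_KM_S = 299792.458   # speed of light in km/s


def annulus_width_deg(timing_unc_s, baseline_km, angle_from_baseline_deg=90.0):
    """First-order angular width in degrees of a two-detector timing annulus."""
    s = abs(sin(radians(angle_from_baseline_deg)))
    if s == 0.0:
        raise ValueError("first-order estimate diverges along the baseline")
    return degrees(C_KM_S * timing_unc_s / (baseline_km * s))


baseline_km = 3000.0   # assumed detector separation of a few thousand kilometers
delta_t_s = 1e-4       # assumed sub-ms timing uncertainty

light_travel_ms = 1e3 * baseline_km / C_KM_S
width = annulus_width_deg(delta_t_s, baseline_km)
print(f"light travel time across the baseline: {light_travel_ms:.1f} ms")
print(f"annulus width for a source broadside to the baseline: {width:.2f} deg")
```

The annulus is narrowest for sources broadside to the baseline and widens toward the baseline direction, and the width shrinks in proportion to either better timing or a longer baseline.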

For two interferometer detections the typical 90% confidence region is a few hundred square degrees, with large variation in each case (e.g., Singer et al. 2014). For three interferometer detections this decreases to a median of a few tens of square degrees. Additional interferometers improve this accuracy (e.g., Abbott et al. 2018a). Table 3 shows the absolute and cumulative livetimes for a number of active interferometers from a network of a given size. Extremely loud single interferometer events can be reported without independent confirmation; in this case the localization will match the antenna pattern of that interferometer, giving a 90% confidence region of order half the sky. When one interferometer is significantly more sensitive than another the joint detection rate will decrease and two interferometer localizations will be the antenna pattern of the more sensitive instrument, slightly modified by the other, with a 90% confidence region covering several thousand square degrees, as shown by GW190425 (LVC 2019).

Table 3 The first column varies the number of interferometers contributing to a given observing run

Because inspirals can be detected before merger, GW detections can be reported before merger, i.e., act as early warning systems. Knowing the event time in advance can be beneficial for several reasons, such as pointing wide-field telescopes, switching observational modes, increasing temporal resolution, etc., but perhaps the greatest potential outcome would be the pointing of EM telescopes to observe the source at merger time, which would uncover vastly greater understanding of these sources. The localizations available before merger using the method discussed above will give typical accuracies of about a thousand square degrees a minute before merger (e.g., Cannon et al. 2012) because the timing is not precisely measured until just before merger. Loud events could have improved, but still poor, localizations.

There are additional mechanisms for constraining source position from GW observations, relying on the motion of the interferometer. Ground-based interferometers are bound to the surface of Earth and their antenna patterns sweep over the sky as Earth rotates through the day. For signals that are \(\sim \) hours long this change causes time-dependent exposure that depends primarily on the source position, refining the location. The most recently listed frequency range for Cosmic Explorer, the U.S. third generation proposal, reaches 5 Hz on the low end (Reitze et al. 2019), so it would begin to observe BNS systems only about an hour before merger. Therefore, even with third generation interferometers we will not be able to rely on these additional localization methods and will likely be limited to accuracies of order \(\sim \) 100 square degrees a minute before merger. For comparison, 30 s is among the fastest repoint times (from reception of alert to observation) currently available in time domain astronomy.
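The time a BNS signal spends in band above a given low-frequency cutoff can be estimated with the leading-order (quadrupole) chirp time. The sketch below assumes two \(1.4\,M_\odot \) NSs and ignores redshift and higher-order corrections; the function names are illustrative. Under these assumptions a 5 Hz cutoff gives between one and two hours of signal, and a 10 Hz cutoff gives roughly a quarter of an hour.

```python
import math

G_MSUN_OVER_C3_S = 4.925491e-6  # G * M_sun / c^3, in seconds


def chirp_mass(m1, m2):
    """Chirp mass in solar masses."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2


def time_to_merger_s(f_gw_hz, m1=1.4, m2=1.4):
    """Leading-order time to merger from GW frequency f_gw_hz (masses in M_sun)."""
    tau_mc = G_MSUN_OVER_C3_S * chirp_mass(m1, m2)
    return (5.0 / 256.0) * tau_mc ** (-5.0 / 3.0) * (math.pi * f_gw_hz) ** (-8.0 / 3.0)


for f_low in (5.0, 10.0, 30.0):
    print(f"f_low = {f_low:4.1f} Hz -> {time_to_merger_s(f_low) / 60.0:7.1f} min before merger")
```

The steep \(f^{-8/3}\) dependence is why modest gains in low-frequency sensitivity translate into much longer warning times.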

Space-based interferometers will localize primarily through measuring Doppler shifts as their orbits move them towards or away from the source (e.g., Cutler 1998). The longer integration time can give higher SNR, providing more precisely determined distances. This is the dominant localization method for the funded satellite constellation mission Laser Interferometer Space Antenna (LISA), which would have an Earth-like orbit around the Sun and would cover the \(\sim \) mHz frequency range. LISA may detect BNS and NSBH systems, but only long before they merge.

There are proposed mid-range interferometers, referring to instruments that cover frequencies between LISA and the ground-based network (e.g., Dimopoulos et al. 2008; Kawamura et al. 2011; Canuel et al. 2018; Mueller et al. 2019; Kuns et al. 2020). Such devices would measure BNS systems years before merger and are likely the only way to achieve good pre-merger localizations. The details vary, but even conservative instruments and predictions give sub-degree accuracy for at least a few systems per year. These would enable broadband EM observations of NS mergers during coalescence through the first few hours. There is no funded mission in this range, precluding launch within a decade, but we discuss them as they would enable unique science with NS mergers inaccessible through other means.

Prompt gamma-ray bursts

The easiest method to detect NS mergers is through their prompt SGRB emission. The GRB monitors have detected more than a thousand SGRBs, which is (currently) three orders of magnitude more than GW detections of NS mergers, two more than claimed kilonovae, and one more than SGRB afterglows. These events emit primarily in the \(\sim 10\,\text{keV}\)–10 MeV energy range, which is only observable from space. There are two classes of GRBs, short and long, separated in the prompt phase by a duration threshold. These classes have different origins, as proven by follow-up observations. Long Gamma-Ray Bursts (LGRBs) originate from a specific type of core-collapse supernova; SGRBs originate from BNS mergers and likely NSBH mergers. Short and long colloquially refer to these separate classes, despite the fact that the duration distributions overlap.

The most prolific active detector of SGRBs is the Fermi Gamma-ray Burst Monitor (GBM) (Meegan et al. 2009), which identifies more SGRBs than all other active missions combined. It is this instrument we will use to baseline our rates. GBM consists of two types of scintillators to cover \(\sim 10\,\text{keV}\)–10 MeV. The duration threshold where events are equally likely to belong to the short or long distributions for Fermi GBM is 5 s (Bhat et al. 2016). From the combined fit to the short and long log-normal distributions, the weight of each distribution is 20% and 80%, respectively. This gives a Fermi GBM SGRB detection rate of 48 SGRBs/yr. The low-energy detectors are oriented to observe different portions of the sky and, to first order, have a cosine response from detector normal. Localization is done by deconvolving the observed counts in each detector with the response of the instrument as a function of energy and constraining the sky region where the event is consistent with a point source origin. The median GBM SGRB localization, including systematic error, has a 90% containment region of \(\sim 500\,\text{deg}^2\). That is, the typical localization accuracy is a few hundred square degrees, comparable to the two-interferometer GW localizations, but the GBM regions are quasi-circular blobs rather than narrow arcs.

The Swift Burst Alert Telescope (BAT) consists of an array of gamma-ray scintillators below a partial coding mask, which imparts shadows in a unique pattern (Barthelmy et al. 2005). This detector setup trades effective area for localization accuracy, detecting \(\sim \) 8–9 SGRBs/yr with localizations to 3’ accuracy (e.g., Lien et al. 2016). Swift has two narrow-field telescopes, the X-ray telescope (XRT) and Ultraviolet/Optical Telescope (UVOT), which are repointed to the BAT localizations for bursts within their field of regard. The XRT recovery fraction of BAT SGRBs is 75%, and is 85% of those it observes promptly. This enables localization accuracy to a few arcseconds. This is sufficient for follow-up with nearly any telescope, and was the prime mission for Swift. The BAT is sensitive over 15–150 keV, preventing it from performing broadband spectral studies of SGRBs.

There are two other instrument types that can promptly detect SGRBs. The Large Area Telescope (LAT) is the primary instrument on-board the Fermi satellite and is a pair-conversion telescope that observes from \(\sim \) 100 MeV–100 GeV (Atwood et al. 2009). It detects \(\sim \) 2 SGRBs/yr, though some of these are afterglow-only detections (Ajello et al. 2019). Compton telescopes are phenomenal SGRB detectors that detect photons within the \(\sim \) 100 keV–10 MeV energy range, with great sensitivity, wide fields of view, and localization accuracy of order a degree. They can provide a large sample of SGRBs with localizations sufficient for follow-up with wide-field instruments.

Beyond autonomous localizations by individual satellites, the Interplanetary Network (IPN) pioneered using the finite speed of light to constrain events with timing annuli on the sky (see Hurley et al. 2011, and references therein). GRB temporal evolution is fit by empirical functions and their intrinsic variability is limited to \(\gtrsim \)50 ms. That is, to achieve annuli similarly narrow to the GW network localizations we require baselines longer than can be achieved in Low Earth Orbit (LEO). By placing gamma-ray detectors on spacecraft bound for other planets the baseline increases by orders of magnitude, enabling very bright events to be localized to arcminute accuracy. The limitation of the IPN is the high data downlink latency, generally too long for the purposes of following SGRB afterglow and early kilonova observations. The other issue is the lack of gamma-ray detectors on recent planetary spacecraft, threatening an end to massive baselines for the IPN.

The KONUS-Wind instrument has broadband energy coverage comparable to GBM and no autonomous localization capability, but sits at the Sun-Earth \(L_1\) point (Aptekar et al. 1995). The Anti-Coincidence Shield of the SPectrometer onboard the INTErnational Gamma-Ray Astrophysics Laboratory (INTEGRAL SPI-ACS) is sensitive to \(\gtrsim 100\,\text{keV}\) with no energy or spatial information, but INTEGRAL has a highly elliptical orbit that brings it up to half a light second from Earth (von Kienlin et al. 2003). With the LEO GRB monitors they form the backbone of the modern IPN, with sufficient distances from Earth and detection rates to regularly constrain the localizations of GRBs to sub-degree accuracy.
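The dependence of IPN annulus width on baseline can be sketched with the same timing argument used for the GW network. The example below takes the \(\sim \)50 ms variability limit as a crude stand-in for the achievable timing uncertainty, which is an assumption made here for illustration (cross-correlation of bright bursts can do better), and uses rounded light-travel times for representative baselines.

```python
import math


def annulus_width_deg(timing_err_s, baseline_light_s):
    """Approximate annulus half-width for a source perpendicular to the baseline.

    Returns None when the timing error exceeds the light-travel time across
    the baseline, i.e. when timing gives no useful constraint.
    """
    ratio = timing_err_s / baseline_light_s
    if ratio >= 1.0:
        return None
    return math.degrees(ratio)


# Rounded, illustrative light-travel times in seconds (assumed values).
BASELINES_LIGHT_S = {
    "leo_pair": 0.046,   # two LEO satellites on opposite sides of Earth
    "integral": 0.5,     # LEO to INTEGRAL near apogee
    "sun_earth_l1": 5.0, # LEO to a spacecraft at L1 (~1.5 million km)
    "one_au": 499.0,     # LEO to a planetary spacecraft ~1 au away
}

widths = {name: annulus_width_deg(0.05, d) for name, d in BASELINES_LIGHT_S.items()}
for name, width in widths.items():
    print(name, "no constraint" if width is None else f"{width:.4f} deg")
```

With these assumptions a LEO-only pair gives no constraint, the L1 baseline reaches sub-degree widths, and an interplanetary baseline reaches sub-arcminute widths, in line with the qualitative statements above.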

Once a burst is identified it is characterized by its temporal and spectral properties. The GRB time is often set to the trigger time, though this definition varies for a given instrument. The onset time of GRB emission can be refined when necessary by fitting a field-specific pulse function and defining the start time as when some fraction of the peak height (e.g., 5% of the maximum) is reached. The duration of a burst is determined through the \(T_{90}\) measure, the time between the points at which 5% and 95% of the total fluence have been accumulated, which gives a first assignment as short or long. Out of this analysis comes an estimate of the peak photon and energy flux, and total energy fluence for the event. Spectral analysis of GRBs is performed with the forward-folding technique, where an empirical functional form is convolved with the detector responses and compared with the data. The usual forms are a basic power law, a smoothly broken power law, or a power law with an exponential cutoff. These functions are not selected with any theoretical motivation. Spectral analysis is often done in a time-integrated manner, which averages out the spectral evolution of the event. Generally a power law fit indicates a burst that is too weak to constrain spectral curvature. When this curvature is constrained it is parameterized as \(E_{\mathrm{peak}}\), the energy where most of the power is radiated.
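As an illustration of the \(T_{90}\) definition, the sketch below computes it from a binned, background-subtracted light curve by interpolating the cumulative fluence. It is a simplified stand-in for the instrument pipelines (no background fitting, no uncertainties), and the function names and the toy light curve are assumptions made for this example.

```python
def crossing_time(edges, cumulative, level):
    """Linearly interpolate the time at which the cumulative fluence reaches `level`."""
    previous = 0.0
    for i, current in enumerate(cumulative):
        if current >= level:
            frac = (level - previous) / (current - previous)
            return edges[i] + frac * (edges[i + 1] - edges[i])
        previous = current
    return edges[-1]


def t90(edges, net_counts):
    """Return (T90, t05, t95) for background-subtracted counts per time bin.

    `edges` holds the bin edges, so len(edges) == len(net_counts) + 1.
    Assumes the total net fluence is positive.
    """
    cumulative, running = [], 0.0
    for counts in net_counts:
        running += counts
        cumulative.append(running)
    total = cumulative[-1]
    t05 = crossing_time(edges, cumulative, 0.05 * total)
    t95 = crossing_time(edges, cumulative, 0.95 * total)
    return t95 - t05, t05, t95


# Toy light curve: constant emission over 10 one-second bins.
toy_edges = [float(i) for i in range(11)]
toy_counts = [10.0] * 10
duration, start, stop = t90(toy_edges, toy_counts)
print(duration, start, stop)
```

For the flat toy light curve the 5% and 95% crossings fall half a second inside each end, so the measured duration is nine tenths of the emission interval, as expected from the definition.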

When the distance to the source is known (Sect. 3.5) the observed flux and fluence can be converted into the isotropic-equivalent energetics, \(L_{\mathrm{iso}}\) and \(E_{\mathrm{iso}}\) for the peak luminosity and total energy released, respectively. These are calculated by assuming the observed brightness is constant over a spherical shell with radius \(D_L\) to the source, and are reported in the bolometric range 1 keV–10 MeV, after accounting for cosmological redshift through the k-correction factor (Bloom et al. 2001). These values can be refined to jet-corrected energetics if the half-jet opening angle is determined through observations of the afterglow (Fong et al. 2015).
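The conversion to isotropic-equivalent energetics can be sketched as follows, using the common convention \(E_{\mathrm{iso}} = 4\pi D_L^2 S k/(1+z)\) and \(L_{\mathrm{iso}} = 4\pi D_L^2 F k\), with \(S\) the fluence, \(F\) the peak energy flux, and \(k\) the k-correction. The beaming correction assumes a two-sided top-hat jet. The input values are hypothetical, and conventions differ between papers, so this is an illustration rather than the exact procedure of the cited works.

```python
import math

MPC_CM = 3.0857e24  # centimetres per megaparsec


def e_iso_erg(fluence_erg_cm2, d_l_mpc, z, k_corr=1.0):
    """Isotropic-equivalent energy from the observed fluence."""
    d_cm = d_l_mpc * MPC_CM
    return 4.0 * math.pi * d_cm ** 2 * fluence_erg_cm2 * k_corr / (1.0 + z)


def l_iso_erg_s(peak_flux_erg_cm2_s, d_l_mpc, k_corr=1.0):
    """Isotropic-equivalent peak luminosity from the observed peak energy flux."""
    d_cm = d_l_mpc * MPC_CM
    return 4.0 * math.pi * d_cm ** 2 * peak_flux_erg_cm2_s * k_corr


def jet_corrected(isotropic_value, half_opening_deg):
    """Scale by the beaming fraction 1 - cos(theta_j) of a two-sided top-hat jet."""
    return isotropic_value * (1.0 - math.cos(math.radians(half_opening_deg)))


# Hypothetical burst: fluence 1e-6 erg/cm^2 at D_L = 1000 Mpc, z = 0.2.
e_iso = e_iso_erg(1e-6, 1000.0, 0.2)
e_jet = jet_corrected(e_iso, 16.0)
print(f"E_iso = {e_iso:.2e} erg, jet-corrected = {e_jet:.2e} erg")
```

For a \(16^\circ \) half-opening angle the beaming fraction is about 4%, so the jet-corrected energy is more than an order of magnitude below the isotropic-equivalent value.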

These are the basic parameters in wide use within the field. There are additional analyses that can be quite useful. For example, fitting multiple spectral functions simultaneously has provided evidence for additional components (e.g., Guiriec et al. 2011; Tak et al. 2019) and for a potential spectro-temporal signature indicative of nearby BNS mergers (Burns et al. 2018).

Fig. 3

The SGRB rate as a function of sensitivity. Orange is the histogram of observed 64 ms peak flux in the 50–300 keV energy range for GBM SGRBs over an 11-year period. The 64 ms duration is chosen to encompass most SGRBs (i.e., the majority of bursts are longer than this timescale) and 50–300 keV is the dominant triggering range for GBM. The grey line is the cumulative logN-logP yearly detection rate. GBM has an average exposure of \(\sim \) 60% (conservatively ignoring sky regions GBM observes with poor sensitivity), which is scaled to give the all-sky detection rate of SGRBs above GBM’s on-board trigger sensitivity in black. We fit a power-law to this curve for events above \(7\,\text{ph/s/cm}^2\) as this should be a reasonably complete sample. The fit has an index of \(-1.3\).

Lastly, we discuss how the detection rate of SGRBs varies with sensitivity, as shown in Fig. 3. The result is an estimation of the all-sky SGRB rate above the on-board trigger threshold for GBM of \(\sim \) 80/yr and an extrapolation to higher sensitivity by a logN-logP power-law with an index of \(-1.3\), varying by \(\sim \) 0.1 depending on where the fit threshold is applied. That is, instruments with 2 (10) times GBM sensitivity correspond to a detection rate multiplier of 2.5 (20). Given that sensitivity scales as the square root of effective area, to maximize detection rates with a fixed amount of scintillators one should prioritize all-sky coverage over depth in a given direction, though depth is preferred for characterization of individual events.
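The scaling argument can be checked with a few lines of arithmetic. The sketch below applies the fitted index of 1.3 (in magnitude) to a sensitivity gain, and then compares spending a doubled detector area on depth (sensitivity grows as the square root of area) against spending it on sky coverage (rate grows linearly until the sky is fully covered). The function names are illustrative, and the comparison ignores details such as detector orientation and background.

```python
LOGN_LOGP_INDEX = 1.3  # magnitude of the fitted cumulative logN-logP index


def rate_multiplier(sensitivity_gain, index=LOGN_LOGP_INDEX):
    """Detection-rate multiplier for an instrument `sensitivity_gain` times more sensitive."""
    return sensitivity_gain ** index


def rate_gain_from_area(area_factor, index=LOGN_LOGP_INDEX):
    """Rate gain when extra area is stacked in one direction (sensitivity ~ sqrt(area))."""
    return rate_multiplier(area_factor ** 0.5, index)


print(f"2x sensitivity  -> {rate_multiplier(2.0):.2f}x rate")
print(f"10x sensitivity -> {rate_multiplier(10.0):.1f}x rate")
print(f"2x area as depth    -> {rate_gain_from_area(2.0):.2f}x rate")
print("2x area as coverage -> up to 2x rate (until the full sky is covered)")
```

Because the index divided by two is below one, doubling the area in a fixed direction yields less than a doubling of the rate, which is the basis for preferring coverage over depth when the goal is the number of detections.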

The SGRB detection rates discussed in the previous paragraph were for on-board triggers, which are kept basic to ensure sample purity, to minimize the use of limited bandwidth, and because of the limitations of flight computers. The initial data downlinked after a trigger is limited. Most GRB monitors also provide continuous data, which is generally binned with somewhat coarse temporal or energy resolution owing to bandwidth considerations. Fermi GBM is able to downlink continuous Time-Tagged Event (TTE) data, which enables deep searches for additional SGRBs. There is a blind untargeted search for SGRB candidates that reports the results publicly with a few hours delay, limited by the data downlink latency. The targeted search of GBM data (Blackburn et al. 2015; Goldstein et al. 2016; Kocevski et al. 2018) is the most sensitive SGRB search ever developed. Based on the maximal detection distance for GRB 170817A with the targeted search against the detection limit of the on-board trigger (Goldstein et al. 2017a), the inefficiencies of the on-board trigger due to non-uniform sky coverage, and the logN-logP relation, the GBM targeted search should be capable of recovering a few times as many SGRBs as the on-board trigger, or a few per week.

Statistical association and joint searches

Multimessenger science is incredible. It requires detections in multiple messengers and the robust statistical association of those signals. The association step is often neglected or ignored entirely. As such, we focus on this problem before proceeding to other detections of NS mergers. Much work has been done in this endeavor during the past several years, with varied focus and applicability. For example, Ashton et al. (2018) developed a general Bayesian framework to associate signals based on commonly measured parameters. For our purposes it is sufficient to use a representative frequentist method using the three dominant parameters that provide association significance: temporal and spatial information, and the rarity of the event itself.

We first discuss time. The rate of GW-detected NS mergers will remain at less than one per day for the better part of a decade. The rate of NS mergers detected as SGRBs will remain similarly low. The time offset between these two signals is expected to be only seconds long. For example, the chance coincidence of a GBM triggered SGRB occurring within a few seconds of a GW detection of an NS merger is \(\sim \text{few}\times 10^{-6}\). Then, with the inclusion of spatial information, even with the independent localizations spanning hundreds of square degrees, the association easily surpasses \(5\sigma \) (see Abbott et al. 2017b; discussions in Ashton et al. 2018). A pure sample is readily maintained even for large numbers.
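The temporal chance coincidence quoted above follows from the SGRB rate and the width of the coincidence window. The sketch below uses the GBM on-board rate of 48 SGRBs/yr from earlier in this section and an assumed 2 s window (the text only specifies "a few seconds"); it treats the expected number of unrelated events as the probability, which is valid when that number is much less than one.

```python
from statistics import NormalDist

SECONDS_PER_YEAR = 365.25 * 86400.0


def chance_coincidence(rate_per_yr, window_s):
    """Expected number of unrelated events inside the window (~ probability when << 1)."""
    return rate_per_yr / SECONDS_PER_YEAR * window_s


def one_sided_sigma(p_value):
    """Gaussian-equivalent significance of a one-sided p-value."""
    return NormalDist().inv_cdf(1.0 - p_value)


# 48 SGRBs/yr from the text; the 2 s window is an assumed, illustrative choice.
p_time = chance_coincidence(rate_per_yr=48.0, window_s=2.0)
sigma_time = one_sided_sigma(p_time)
print(f"temporal chance coincidence: {p_time:.1e} (~{sigma_time:.1f} sigma, time only)")
```

Time alone falls somewhat short of \(5\sigma \) under these assumptions, which illustrates why the spatial overlap of the two localizations is needed to push the association past that threshold.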

Spatial information can be even more powerful. For much of observational astronomy localization alone is sufficient to associate multiwavelength signals because the uncertainty on the localization from radio to X-ray can be a trillionth of the sky, which enables easy association of steady sources. These are so precise that association significance is generally not calculated. We use the nominal Swift operations as our example here. Swift has a GRB rate (both long and short) of \(\sim \) 100/year, and these are localized to 3’ accuracy with the BAT. Swift autonomously repoints to the majority of these events within about a minute. Fading X-ray signals above the limit of the ROSAT All-Sky Survey (Voges et al. 2000) within the BAT localization are effectively always the GRB afterglow.

Among the hidden issues exposed by GW170817 is the association of kilonova signals with a GW event. For GW170817 the last non-detection with sufficient limits was the DLT40 observation 21 days before merger time (Yang et al. 2017). With our median BNS merger rate and the \(380\,\text{Mpc}^3\) volume from the final GW constraint (Abbott et al. 2017c), \(P_{\mathrm{chance}} \approx (380\,\text{Mpc}^3) \times (1000\,\text{Gpc}^{-3}\text{yr}^{-1}) \times 21 \text{days} \approx 10^{-5}\), which is a reasonably robust association.

To examine a worst-case scenario we can imagine a similar EM detection in the follow-up of GW190425, which has a distance estimate of \(156 \pm 41\,\text{Mpc}\) and a 90% confidence region covering \(7461\,\text{deg}^2\) (LVC 2017d). Then, \(P_{\mathrm{chance}} \approx 0.5\), a rather questionable association. As the GW interferometers improve their reach, events will tend to have similar fractional uncertainty on their distance determination, which corresponds to far larger total localization volumes. Take a middle example with a typical localization region of \(500\,\text{deg}^2\), distance \(200 \pm 50\,\text{Mpc}\), and a last (constraining) non-detection a week before, then \(P_{\mathrm{chance}} \approx 1\%\). So, even if we know the event is a kilonova, we may not be able to robustly associate it. This effect is even more important when relatively pure samples are strongly preferred (e.g., standard siren cosmology). This issue can be solved by increasing either the spatial association significance (through better GW or GRB localizations) or the temporal association significance. The latter can be accomplished in two ways. More recent non-detections help, but may require sensitivity to \(\sim \) 23–24 Mag (Cowperthwaite and Berger 2015). Alternatively, one can determine the start time to \(\sim \) 1 day accuracy either by directly constraining the rise or through inferring the age of the kilonova for well-sampled events.
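The \(P_{\mathrm{chance}}\) estimates above are the product of a localization volume, the BNS merger rate, and the time since the last constraining non-detection. The sketch below reproduces the arithmetic. The localization volume for the sky-area cases is approximated as the patch of a spherical shell between \(d - \sigma \) and \(d + \sigma \), which is an assumption made here; the resulting values shift by factors of order unity depending on how the distance interval is defined, and the 21 day window for the GW190425-like case is likewise assumed.

```python
import math

FULL_SKY_DEG2 = 4.0 * math.pi * (180.0 / math.pi) ** 2  # ~41253 deg^2


def shell_volume_mpc3(area_deg2, d_mpc, sigma_mpc):
    """Volume of a sky patch between d - sigma and d + sigma (assumed convention)."""
    lo, hi = max(d_mpc - sigma_mpc, 0.0), d_mpc + sigma_mpc
    return (area_deg2 / FULL_SKY_DEG2) * (4.0 / 3.0) * math.pi * (hi ** 3 - lo ** 3)


def p_chance(volume_mpc3, days_since_nondetection, rate_gpc3_yr=1000.0):
    """Expected number of unrelated BNS mergers: volume x rate x time window."""
    return volume_mpc3 * 1e-9 * rate_gpc3_yr * days_since_nondetection / 365.25


p_gw170817 = p_chance(380.0, 21.0)
p_middle = p_chance(shell_volume_mpc3(500.0, 200.0, 50.0), 7.0)
p_gw190425 = p_chance(shell_volume_mpc3(7461.0, 156.0, 41.0), 21.0)

print(f"GW170817-like:  {p_gw170817:.1e}")
print(f"middle example: {p_middle:.3f}")
print(f"GW190425-like:  {p_gw190425:.2f}")
```

The three cases span more than four orders of magnitude, with the GW190425-like case of order a few tenths, so the robustness of a kilonova association is set almost entirely by the localization volume and the cadence of prior observations.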

Joint searches for NS mergers can be more powerful than individual searches by elevating the significance of a true signal and suppressing background. Most work in joint searches for NS mergers has focused on GW-GRB searches. Owing to the rarity of GRBs and the \(\sim \) seconds intrinsic time offset, current joint searches can improve the GW detection distance by 20–25% (Williamson et al. 2014), which corresponds to nearly doubling the search volume. Further, for at least the next few years we will have a significant amount of time where only a single GW interferometer is active (Table 3, Fig. 2). SGRBs are so rare that association with a single interferometer trigger could confirm the event. This improves the effective livetime of the GW network for GW-GRB searches.
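As a quick check of the volume statement, assuming a Euclidean and uniformly populated local volume (a sketch appropriate at these distances, not a result from the cited work), the search volume \(V\) scales with the cube of the detection distance \(d\):

$$\begin{aligned} \frac{V'}{V} = \left( \frac{d'}{d}\right) ^3, \qquad 1.20^3 \approx 1.73, \qquad 1.25^3 \approx 1.95 \end{aligned}$$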

In addition to increasing the number of multimessenger detections of NS mergers, joint GW-GRB searches also provide improved localization constraints by combining the two independent, morphologically different localizations. We demonstrate with GW170817. The first localization reported by the LVC was the GBM localization (LVC 2017a). This was because Virgo data was not immediately available, a massive glitch occurred contemporaneously in LIGO-Livingston (LVC 2017c), and the GBM localization was more constraining than the single interferometer antenna pattern from LIGO-Hanford. The first GW network localization (HLV) was reported 5 hours after event time, with a 90% containment region covering \(31\,\text{deg}^2\) (LVC 2017b). If we take the HL localization region and combine it with the independent GBM localization, the 90% confidence region covers \(60\,\text{deg}^2\). These combined localizations also improve the estimate of the distance to the host galaxy. This information was available much earlier than the Virgo information, but was not reported publicly.

Even with the poor localization accuracy of Fermi GBM, the different morphologies of the typical GBM and GW confidence regions enable greatly improved joint localizations. GBM will tend to reduce the 90% confidence regions for single interferometer events by \(\sim \) 90%, for double interferometer localizations by \(\sim \) 80%, but will tend to not improve localizations from three or more interferometers (Burns 2017). Should a joint GW-GRB detection occur with Swift, the BAT (or XRT) localization would be sufficient for immediate follow-up. IPN localizations will be between the two, but with much longer reporting latency (hours-days instead of a minute).
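The benefit of combining morphologically different localizations can be illustrated with a toy example. For independent measurements on the same equal-area pixel grid, the joint map is the normalized pixel-wise product of the two maps. The sketch below uses a hypothetical 10 by 10 grid with an arc-like map and a blob-like map; real maps are HEALPix arrays that must first be brought to a common resolution, and none of the numbers here correspond to an actual event.

```python
def combine_maps(prob_a, prob_b):
    """Normalized pixel-wise product of two independent maps on the same equal-area grid."""
    joint = [a * b for a, b in zip(prob_a, prob_b)]
    total = sum(joint)
    return [p / total for p in joint]


def credible_pixels(prob, level=0.9):
    """Number of highest-probability pixels needed to enclose `level` of the probability."""
    running = 0.0
    for n, p in enumerate(sorted(prob, reverse=True), start=1):
        running += p
        if running >= level:
            return n
    return len(prob)


def normalize(values):
    total = sum(values)
    return [v / total for v in values]


SIDE = 10  # toy grid, pixel index = row * SIDE + col
# Arc-like map: one narrow column. Blob-like map: a band of three rows.
arc = normalize([1.0 if (i % SIDE) == 4 else 0.001 for i in range(SIDE * SIDE)])
blob = normalize([1.0 if 3 <= (i // SIDE) <= 5 else 0.001 for i in range(SIDE * SIDE)])
joint = combine_maps(arc, blob)

n_arc, n_blob, n_joint = credible_pixels(arc), credible_pixels(blob), credible_pixels(joint)
print(n_arc, n_blob, n_joint)
```

The joint 90% region collapses to the small overlap of the arc and the blob, even though neither input map is well constrained on its own.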

The other promising joint searches are GW-neutrino and neutrino-GRB searches, for cases where the neutrino emission is nearly immediate (e.g., Van Elewyck et al. 2009), though the prospects for neutrino detections of NS mergers are pessimistic or uncertain. Some work has been done on prospects for elevating sub-threshold GW detections through association with a kilonova or afterglow. Lynch et al. (2018) find that to double the number of true GW events the FAR threshold would have to increase by five orders of magnitude. They advocate for LVC reporting thresholds to be determined by \(P_{\mathrm{astro}}\), which we support. However, weak events have to overcome the likelihood that the GW event is not real for confirmation (Ashton et al. 2018). For example, the LVC initial classification for S190718y is 98% terrestrial (noise) and 2% BNS (LVC 2017e), lowering the significance of a claimed joint detection by more than an order of magnitude. With the previously established difficulty in associating kilonovae with GW detections, it seems that performing follow-up searches of sub-threshold GW signals is not a good use of observational resources. Then, for joint searches, the most promising prospect is the identification of a kilonova or afterglow by an optical (or other) survey in its normal operating mode, which is then associated with a GW or SGRB trigger. Such joint searches should be developed and automated.

Because of the importance of this section we summarize the results:

  • Robust associations are necessary to enable multimessenger astronomy, and are not possible for all events.

  • Spatial constraints from the discovery instruments are critical for robust statistical association.

  • Temporal constraints for follow-up instruments are critical for robust statistical association. This can either be through a constraint on rise-time or previous non-detection from wide-field surveys.

  • Follow-up observations of sub-threshold GW signals are ill-advised, but automatically associating signals found in independent surveys should be done.

Fig. 4

The observed inclination angle distributions for NS mergers detected through GWs and prompt SGRB observations. The GW solution comes from Schutz (2011) and the SGRB from slight modification (to handle solid angle) from observational results in Fong et al. (2015). We use the astrophysical convention of \(0 \le \iota \le 90\), ignoring handedness relative to Earth. Against the rather naive assumption of a solid-angle distribution, roughly 1 in 8 GW-detected NS mergers that produce jets will have those jets oriented towards Earth

Joint GW-GRB detection rates

Prior to GW170817 it was considered somewhat unlikely, though possible, for a joint GW-GRB detection to occur with the Advanced network of interferometers. This belief persisted due to several misconceptions or misunderstandings. We briefly describe these and their resolution:

  • Inclination Biases SGRBs have an observed half-jet opening angle distribution of \(16^\circ \pm 10^\circ \) (Fong et al. 2015), which does not include GRB 170817A. Then, from solid angle effects only a few percent of successful SGRB jets will be oriented towards Earth. Therefore, the assumption was that only a few percent of GW-detected NS mergers would have an associated SGRB (or less, if not all NS mergers produce successful jets).

    The emission of GWs is omnidirectional but not isotropic. It is strongest when the system is face on. Convolving this with solid angle gives an observed inclination angle probability distribution for GW-detected NS mergers of

    $$\begin{aligned} \rho _{\mathrm{GW-detected}}(\iota ) = 0.002656\Big (1+6\cos ^2(\iota )+\cos ^4(\iota )\Big )^{3/2}\sin (\iota ) \end{aligned}$$
    (9)

    (Schutz 2011). Note that we have altered the distribution to be in terms of degrees (not radians) and removed directionality from \(\iota \) (GW measures of inclination go from 0 to 180 degrees but EM studies of NS mergers generally only go to 90 degrees).

    The effect of this is shown in Fig. 4. The GW distribution comes from Eq. (9). The SGRB distribution is a Gaussian convolved with solid angle that roughly recreates the observed distribution compiled in Fong et al. (2015), accounting for the intrinsic vs observed differences. The outcome is that roughly 1 in 8 GW-detected NS mergers that produce SGRBs will have Earth within the jet angle.

  • The Minimum Luminosity of SGRBs A typical cosmological SGRB with \(L_{\mathrm{iso}} \approx 10^{52}\,\text{erg/s}\), if placed within the GW detection volume, would have an observed flux \(\sim 10^4\) times the typical value. Such a burst has not been observed in half a century of observations.

    The implicit assumption is that SGRBs have a minimum luminosity, which was widely assumed (see e.g., Wanderman and Piran 2015, references therein, and works that cite it). There was also an implicit assumption that SGRBs arise from top-hat jets, where the jet has uniform properties within its cone, which largely explained observations until GRB 170817A. Structured jets, where there is variation within the jet cone, have now been considered (see Sect. 4.4). For these models the intrinsic luminosity function of SGRBs refers to the peak luminosity of the jet, generally corresponding to the face-on value. Then, for the same jet, the isotropic-equivalent luminosity as viewed from Earth depends on the inclination angle. Prior to GRB 170817A, there were papers that avoided this implicit assumption, such as Ghirlanda et al. (2016), who predicted joint detection rates without requiring an imposed minimum luminosity.

    Evans et al. (2015) was the first paper to consider that we may not identify nearby SGRBs based on flux measurements if they are “systematically less luminous than those detected to date”. Burns et al. (2016) investigated the observed brightness of SGRBs as a function of redshift and found no relation, empirically showing that we likely had not observed the bottom of the luminosity function, and suggested that subluminous SGRBs exist. From the knowledge gained from GRB 170817A these subluminous bursts would arise from nearby off-axis events.

  • The limited GW Detection Distance and the Redshift Distribution of SGRBs There were no known SGRBs within the Advanced interferometer design BNS range of 200 Mpc. Neglecting the full GW network fails to account for the true spacetime volume observed, as shown in Fig. 7. Joint GW-GRB detections will have a restricted inclination angle, giving a sky-averaged GW-GRB BNS range 1.5 times greater than the GW-only range. Further, with joint searches we can increase the detection distance by \(\sim \) 25% (Sect. 2.5).

    The GW-GRB detection distances and the observed redshift distribution of SGRBs are shown in Fig. 5 with relevant information in Fig. 7. This suggests a few percent of SGRBs are within the joint detection horizon, corresponding to a few events per year with current sensitivities.
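The roughly 1 in 8 figure from the inclination discussion above can be checked by integrating Eq. (9) numerically. The sketch below uses a simple midpoint rule and, as a simplification of the Gaussian jet model used for Fig. 4, a single sharp \(16^\circ \) half-opening angle; it also evaluates the pure solid-angle expectation for comparison.

```python
import math


def rho_gw_detected(iota_deg):
    """Eq. (9): inclination density per degree for GW-detected mergers, 0 <= iota <= 90."""
    iota = math.radians(iota_deg)
    c = math.cos(iota)
    return 0.002656 * (1.0 + 6.0 * c ** 2 + c ** 4) ** 1.5 * math.sin(iota)


def integrate(f, lo, hi, steps=9000):
    """Composite midpoint rule."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


def gw_fraction_within(theta_deg):
    """Fraction of GW-detected mergers with inclination below theta_deg."""
    return integrate(rho_gw_detected, 0.0, theta_deg) / integrate(rho_gw_detected, 0.0, 90.0)


def solid_angle_fraction_within(theta_deg):
    """Fraction of randomly oriented systems with inclination below theta_deg."""
    return 1.0 - math.cos(math.radians(theta_deg))


norm = integrate(rho_gw_detected, 0.0, 90.0)
frac_gw = gw_fraction_within(16.0)
frac_iso = solid_angle_fraction_within(16.0)
print(f"normalization: {norm:.3f}")
print(f"GW-detected within 16 deg: {frac_gw:.3f} (about 1 in {1.0 / frac_gw:.0f})")
print(f"solid angle only:          {frac_iso:.3f}")
```

The GW selection effect raises the fraction of on-axis systems by about a factor of three relative to the solid-angle expectation, which is the core of the inclination-bias resolution.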

Fig. 5

Prior observations of NS mergers. The grey shaded region is the cumulative redshift distribution observed for SGRBs, bounded by the pessimistic and optimistic samples from Abbott et al. (2017b). Blue squares and triangles are the claimed kilonovae and cases with constraining upper limits, respectively. The top axis marks the approximate KN170817 magnitude as a function of distance, based on an assumed 17.5 Mag (within half a Mag of most bands; Villar et al. 2017) and neglecting redshift effects. Overlaid are the joint GW-GRB detection horizons.

Combining this information, Burns (2017), published before GW170817, stated that we should expect joint detections with the Advanced network at design sensitivity, and potentially before. With GW170817 and GRB 170817A we confirmed that nearby bursts exist, that subluminous SGRBs exist, and that joint detections should be expected with existing instruments. As a result, in predicting future joint detection rates we use the same underlying principles.

Another issue, which remains unsolved and is not considered in the prior paragraph, is the fraction of observed SGRBs from NSBH mergers. NSBH mergers are heavier and can be detected in GWs over roughly an order of magnitude greater volume (for those expected to produce SGRBs). That is, even a low fraction of detected SGRBs originating from NSBH mergers would result in a sizable fraction of GW-GRB detections from NSBH mergers (as compared to joint detections from BNS mergers). The fractional contribution from each progenitor can then significantly alter the expected joint rates.

However, this requires a very important caveat. Since GW170817, several papers have been published that estimate future joint detection rates from the intrinsic BNS merger rate, a half-jet opening angle (typically \(\sim 16^\circ \) from Fong et al. 2015), the assumption that all BNS mergers produce SGRBs, and a 100% recovery efficiency for the EM instrument. This last assumption is fundamentally flawed. As a sanity check, applying this calculation to GBM overestimates the expected joint detection rate by a factor of several. It is necessary to account for the low recovery fraction of weak SGRBs like GRB 170817A, which can only be detected out to limited distances.

Table 4 The key parameters for joint GW-GRB detections

For the joint rate estimates we use existing literature to determine a reasonable range of the fraction of SGRBs that will also be detected in GWs, which has the benefit of avoiding the uncertainty on the fraction of NS mergers that produce SGRBs. These rates consider the detections of off-axis events, being built on literature that considers this either explicitly or implicitly. To start, we assume only a two-interferometer network with a 50% network livetime (70% each) and that all SGRBs originate from BNS mergers. For the Advanced network at design sensitivity we assume that 0.8–4.5% of SGRBs are detected in GWs. This is consistent with limits on the fraction of nearby SGRBs from comparing their localizations against galaxy catalogs (Mandhai et al. 2018) and on the inverse fraction of GW detections with associated SGRB detections (Song et al. 2019; Beniamini et al. 2019). These values come from the methods described in Abbott et al. (2017b, 2019f), as well as the simulations from Howell et al. (2019) and Mogushi et al. (2019). For the A+ network we take 2–10%, based on a \(\sim \) 2.5\(\times \) scaling relative to the Advanced network from Howell et al. (2019). For Voyager we assume 10–20% as a representative recovery fraction based on the observed SGRB redshift distribution (Fig. 5). The Gen 3 interferometers have a joint BNS range beyond the furthest SGRB ever detected; therefore, we assume they recover all events when the network is live.

To calculate an absolute base rate we scale these fractions by the rate of GBM on-board triggers. We note that this is a particularly conservative estimate. It ignores single interferometer GW triggers that are confirmed by an associated SGRB trigger (\(\sim \) 80% increase for a two interferometer network), the effects of adding interferometers to the network (\(\sim \) 2–3\(\times \) for a five interferometer network, with slightly asymmetric sensitivities, due to higher network livetime and more uniform coverage), the increase in recovered SGRBs (a factor of a few, see Sect. 2.4), and the contributions from the rest of the active GRB monitors (\(\sim \) 30–40% more than the GBM on-board trigger rate). These effects are not fully independent (e.g., a five interferometer network will have negligible single interferometer livetime). As a conservative estimate of the effects of these additional detections we provide the final column in Table 4, which doubles the rate of GBM+HL triggers. For Advanced LIGO at design sensitivity we should expect a few joint detections per year. With A+ this should happen several times per year.
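The base-rate arithmetic described above can be sketched as follows. The fractions and the GBM on-board rate of 48 SGRBs/yr are taken from the text; treating the two interferometers as independent with 70% livetime each, and applying the factor of two as a flat multiplier, are simplifying assumptions of this sketch. It is not intended to reproduce Table 4 entry by entry.

```python
GBM_ONBOARD_SGRB_PER_YR = 48.0  # GBM on-board SGRB trigger rate from the text
DUTY_CYCLE = 0.7                # assumed independent livetime of each interferometer

# Livetime bookkeeping for a two-interferometer network.
both_live = DUTY_CYCLE ** 2
exactly_one_live = 2.0 * DUTY_CYCLE * (1.0 - DUTY_CYCLE)
single_ifo_gain = exactly_one_live / both_live  # extra livetime if single-IFO events count

# Fraction of SGRBs also detected in GWs, per network, as quoted in the text.
GW_FRACTION = {
    "Advanced (design)": (0.008, 0.045),
    "A+": (0.02, 0.10),
    "Voyager": (0.10, 0.20),
}


def joint_rate_range(fraction_range, sgrb_rate=GBM_ONBOARD_SGRB_PER_YR, boost=1.0):
    """Joint GW-GRB detections per year for a (low, high) fraction range."""
    lo, hi = fraction_range
    return lo * sgrb_rate * boost, hi * sgrb_rate * boost


print(f"two-IFO livetime: {both_live:.2f}, single-IFO gain: {single_ifo_gain:.0%}")
for network, fractions in GW_FRACTION.items():
    base = joint_rate_range(fractions)
    doubled = joint_rate_range(fractions, boost=2.0)
    print(f"{network:18s} base {base[0]:.1f}-{base[1]:.1f}/yr, doubled {doubled[0]:.1f}-{doubled[1]:.1f}/yr")
```

The livetime bookkeeping recovers both the 50% two-interferometer livetime and the roughly 80% gain from single-interferometer triggers quoted above.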

We also provide an estimate for Swift-BAT+HL joint detection rates by scaling the GBM+HL values. This is reasonable because they have similar detection thresholds. However, this is a lower limit. By reordering the observation list to bias the BAT Field of View (FoV) to overlap with the LIGO sensitivity maximum the joint detection rates can be increased by several tens of percent. Scaling to instruments with different sensitivities requires accounting for the bias of brighter events being more likely to occur in the nearby universe.

Follow-up searches

As of the time of this writing, no NS merger has ever been discovered without a prompt SGRB or GW detection. This is not particularly surprising. Using optical as an example, only a few LGRB afterglows have been detected without an associated prompt trigger. Detections of SGRBs are rarer than LGRBs and have systematically fainter afterglows. Similarly, there are thousands of known supernovae identified through optical surveys, but they are orders of magnitude brighter and more common than kilonovae.

As such, the dominant mode for finding SGRB afterglows, kilonovae, and the other expected EM transients from NS mergers will be through follow-up observations of prompt SGRB and GW triggers. This is true at least until the era of the Large Synoptic Survey Telescope (LSST). These follow-up observations can be performed in a few different ways. The most common method is follow-up of Swift-BAT SGRBs, which yields afterglow detections approximately every other month (generally by XRT).

As previously discussed, GW detections of NS mergers provide localizations of tens to hundreds, and sometimes thousands, of square degrees. They also provide an estimate of the distance to the event, with typical uncertainty of tens of percent. These 3D localizations are distributed as HEALPix maps (Gorski et al. 2005) through the Gamma-ray Coordinates Network (GCN), with the distance reported as a function of position (Singer and Price 2016). These localization regions are massive, and difficult to follow up with the vast majority of telescopes. However, for the initial GW era detections will tend to be in the nearby universe (\(\lesssim 200\,\text{Mpc}\)), where galaxy catalogs are reasonably complete. That is, narrow-field telescopes can prioritize the position of known galaxies within the GW-identified search volume, a technique referred to as galaxy targeting (e.g., Kanner et al. 2012; Gehrels et al. 2016).

The other solution to this problem is to build sensitive telescopes with a large FOV. When a localization is reported these facilities tile the large error region and rapidly cover the observable containment region to a depth sufficient for a reasonable recovery fraction. This technique can also apply to GRB localizations. Such optical facilities identify enormous numbers of transients that have to be down-selected to a small subset of events of interest. A great demonstration of this technique is the Zwicky Transient Facility (ZTF) follow-up of GW190425, covering \(\sim \) 10% of the sky on successive nights, in two bands, identifying more than 300,000 candidate transients, and quickly down-selecting to 15 events of interest (Coughlin et al. 2019a).

In estimating follow-up detection rates we should not expect to recover those events that occur near the Sun. The space-based observing constraint is within \(\sim 45^\circ \) of the Sun for many narrow-field space-based telescopes (e.g., Swift, Hubble, Chandra). The ground-based limitation is generally a few hours of RA from the Sun, for a comparable exclusion zone size. An exception to this is for events detectable long enough for the Sun to move across the sky, requiring \(\sim \) months of detectability. We neglect this here, only considering events identified in the first \(\sim \) week. Either case rules out about 15% of the sky. We may also not be able to recover SGRB afterglows and kilonovae if they occur within about \(5^\circ \) of the galactic plane because of extinction and the very high rate of transients at lower energies. Therefore, follow-up observations could be capable of recovering up to 80% of GW or GRB triggers.
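The sky fractions quoted above follow from simple spherical geometry, as the following Python sketch shows. It treats the Sun exclusion zone as a circular cap of radius \(45^\circ \) and the galactic plane exclusion as a band of half-width \(5^\circ \). Combining the two as independent losses is our simplifying assumption, and the function names are introduced only for this illustration.

```python
import math

def cap_fraction(radius_deg):
    """Fraction of the sphere within radius_deg of a given direction."""
    return (1.0 - math.cos(math.radians(radius_deg))) / 2.0

def band_fraction(half_width_deg):
    """Fraction of the sphere within half_width_deg of a great circle."""
    return math.sin(math.radians(half_width_deg))

sun_loss = cap_fraction(45.0)      # Sun exclusion zone
plane_loss = band_fraction(5.0)    # galactic plane exclusion

# Assumption: treat the two losses as independent, which ignores the
# seasonal variation in how much the Sun zone overlaps the plane.
recoverable = (1.0 - sun_loss) * (1.0 - plane_loss)

print(f"Sun exclusion: {sun_loss:.3f}")
print(f"Galactic plane exclusion: {plane_loss:.3f}")
print(f"Recoverable fraction: {recoverable:.3f}")
```

The cap covers about 15% of the sky and the band about 9%, leaving a recoverable fraction a little under 80%.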

We briefly remark on the possibility of separating afterglow and kilonova observations. SGRB afterglow can be bright and dominate kilonova emission, or faint and undetectable below a given kilonova. From observations it appears afterglow will dominate in \(\sim \) 25% of cases (Gompertz et al. 2018). When they are of comparable strength, or the observations are sufficient, the different spectral signatures of these components and their temporal evolution should enable disentanglement. Further, afterglow will tend to fade away long before the dominant emission of red kilonova.

Gamma-ray burst afterglows

Swift identified the first SGRB afterglow and has provided a sample of about 100. These detections and broadband EM observations from radio to GeV have shown afterglow is well described by synchrotron radiation. This radiation spans the EM spectrum and is described as power laws with three breaks: the self-absorption break \(\nu _a\), the minimum Lorentz factor break \(\nu _m\), and the synchrotron cooling break \(\nu _c\) (Sari et al. 1998).

As summarized in Berger (2014), broadband observations and closure relations enable determination of these break energies, and their temporal evolution allows determination of several parameters. These include the kinetic energy of the blastwave \(E_k\), the half-jet opening angle \(\theta _j\) (historically calculated assuming a top-hat jet), the density in the circumburst region n (on \(\sim \) parsec scales), the power law index of the electron distribution in the jet, and a few microphysical parameters. In response to GRB 170817A excluding the base top-hat jet models, closure relations for structured jet models have been derived (Ryan et al. 2020). Afterglow detection also enables arcsecond localizations and thus distance determination (see Sect. 3.5), which allows for the calculation of \(E_{\mathrm{iso}}\) and \(L_{\mathrm{iso}}\) of the prompt emission, and the half-jet opening angle allows for the jet-corrected values of these parameters and \(E_k\).
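To make the jet correction concrete, the sketch below applies the standard beaming factor for a two-sided top-hat jet, \(1-\cos \theta _j\), to an isotropic-equivalent energy. The top-hat geometry is the historical assumption noted above and does not hold for structured jets; the input values and function names are hypothetical and chosen only for illustration.

```python
import math

def beaming_factor(theta_j_deg):
    """Fraction of the sky covered by a two-sided top-hat jet
    with half-opening angle theta_j_deg."""
    return 1.0 - math.cos(math.radians(theta_j_deg))

def jet_corrected_energy(e_iso_erg, theta_j_deg):
    """Scale an isotropic-equivalent energy to the true jet energy."""
    return e_iso_erg * beaming_factor(theta_j_deg)

# Hypothetical example values, not measurements from the text.
e_iso = 1.0e51        # erg
theta_j = 10.0        # degrees
print(f"Beaming factor: {beaming_factor(theta_j):.4f}")
print(f"Jet-corrected energy: {jet_corrected_energy(e_iso, theta_j):.2e} erg")
```

For small angles the beaming factor approaches \(\theta _j^2/2\) with \(\theta _j\) in radians, so modest uncertainties in the opening angle translate into large uncertainties in the jet-corrected energetics.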

The rate of SGRB afterglow detections is well understood for Swift bursts. With the rate of SGRB detections by BAT and the fraction detected in XRT, there are \(\sim \) 6–7 X-ray detections of SGRBs/yr. The XRT sample of GRB afterglows is shown in Fig. 6. The recovery fraction at other wavelengths is poor. The summary in Fong et al. (2015) covers observations of 103 SGRBs; X-rays have a 74% recovery fraction, optical and near-infrared (NIR) 34%, and radio 7%. Note that these pessimistic recovery fractions are for narrow-field telescopes, which are effectively always more sensitive than wide-field telescopes covering the same energy range.

Fig. 6

The Swift XRT afterglow sample. LGRBs are in orange and SGRBs in red, showing that SGRB afterglows are systematically dimmer by \(\sim \) 1–2 orders of magnitude. XRT has an 85% recovery fraction for SGRBs it observes in the first 100 s. The black markers are the nearest SGRBs with known redshift. The upper limits (triangles) are for GRBs 170817A and 150101B. The lines are for GRBs 061201, 080905A, and 100628A. Like the prompt SGRB emission, they are not brighter at Earth than the full sample

The temporal decay of afterglow is steeper than the sensitivity gain most telescopes get for longer observation times. The sooner an observation begins after the event time the higher the likelihood of recovery, which was the main technical driver for Swift. Alternatively, vastly more sensitive telescopes can be pointed at later times and still recover these signals, such as Chandra detections days later.

Beyond the typical cosmological SGRB afterglows, off-axis afterglows were thought to be promising EM counterparts to GW detections. From Metzger and Berger (2012), and references therein, when top-hat jets interact with the surrounding material they slow and broaden. Over long enough timescales this emission can become observable to wider angles than the prompt SGRB emission, but can still be bright enough to be detected from nearby events. GW170817 and GRB 170817A proved that afterglow can be detected significantly off-axis, but it also showed that off-axis afterglows may not be promising EM counterparts unless the precise source localization is known through other means (i.e., identification of the kilonova). Fermi-GBM, an all-sky monitor that is secondary on its own spacecraft, could detect GRB 170817A nearly as far as the narrow-field X-ray Great Observatory Chandra. Indeed, without the kilonova determination of the source position the afterglow of GRB 170817A would not have been identified.

For the previously discussed reasons, searches for blind discovery of SGRB afterglow using current wide-field monitors are unlikely to be successful. This is unlikely to change at least until LSST operation. The most likely follow-up technique to succeed is then the galaxy targeting technique, as it enables follow-up with more sensitive telescopes; however, this is limited to well-localized and nearby events. The instrument most likely to identify an SGRB afterglow following a GW detection is the Swift-XRT, as it is the only fast response X-ray instrument.

Estimating the number of SGRB afterglow detections following NS mergers is difficult because we do not understand their structure and therefore their brightness distribution. We will lose some events due to Sun constraints, transient contamination from the Milky Way, or relative sensitivity issues, which we estimate as \(\sim \) 25% based on the Swift XRT recovery fraction of BAT bursts. However, we may also recover some events undetectable by GBM due to Earth occultation or livetime considerations. These two effects are likely of similar order. Therefore, we roughly estimate the rates by assuming they have similar recovery fractions as the prompt GBM on-board triggers.

Kilonovae

The first widely discussed claim of a kilonova detection came from follow-up observations of the Swift SGRB 130603B (Tanvir et al. 2013). There are a handful of other claims of kilonova signals in follow-up of Swift GRBs (e.g., Perley et al. 2009; Yang et al. 2015; Jin et al. 2016). Inferred color and luminosity distributions for the claimed events are summarized in Gompertz et al. (2018) and Ascenzi et al. (2019). However, the only well-studied kilonova is KN170817. This event likely had a HMNS remnant (see Sect. 3.2), suggesting the brightness was near the middle of the possibilities (with SMNS and Stable NS being brighter and prompt collapse fainter). However, the early emission was on the bright end of expectations and the exact reason remains a matter of debate (see discussions and references in Arcavi 2018, Metzger et al. 2018, but see Kawaguchi et al. 2020).

If we assume that this unexpected bright behavior is due to our lack of understanding of these sources, rather than being a rare occurrence, we can use it as a representative kilonova, which we do in this paper. Villar et al. (2017) compiled a large sample of the UVOIR observations of KN170817. At the distance of \(\sim \) 40 Mpc the Ultraviolet (UV) emission peaked at \(\sim \) 19th Mag (though it may have peaked before the first observations), blue bands at \(\sim \) 18th Mag, with red and infrared approaching almost \(\sim \) 17th Mag. With a limiting Mag of \(\sim \) 26, within the reach of existing sensitive telescopes, around 30–40% of Swift SGRBs occur close enough for a KN170817-like event to be detected and studied. The majority of Swift SGRBs do not have follow-up at these sensitivities. This is in part because the primary goal of Swift follow-up was afterglow studies, and SGRB afterglows usually fade before the onset of kilonova emission. With the devotion of sufficient observational resources \(\sim \) 1–2 kilonovae per year can be identified by following up Swift SGRBs, though we note that many of the nearby bursts have claims of kilonovae or interesting upper limits as shown in Fig. 5.

KN170817 was independently identified in the follow-up of GW170817 through both the wide-field tiling and galaxy targeting techniques (e.g., Coulter et al. 2017; Soares-Santos et al. 2017; Valenti et al. 2017; Arcavi et al. 2017; Tanvir et al. 2017; Lipunov et al. 2017). Both methods will continue to be useful for future events, with the best technique depending on a given event. For events that are nearby (where galaxy catalogs are relatively complete) and reasonably well-localized, galaxy targeting will be quite beneficial, with methods that account for galaxy incompleteness being particularly powerful (Evans et al. 2016). For events that are nearby and poorly localized (e.g., several hundreds of square degrees or more), or events that are further away, the wide-field tiling technique will be dominant, provided the telescopes are sufficiently sensitive. There is no active wide-field UV monitor. The band with the current best wide-field telescopes for identifying kilonovae is the optical, where instruments like ZTF (Bellm 2014) can tile a large fraction of the sky to \(\sim \) 21st–22nd Mag in one or two filters in a single night, as demonstrated by the (current) worst-case event (Coughlin et al. 2019a). However, even these depths may be insufficient to recover the majority of kilonovae following GW detections (Sagués Carracedo et al. 2020).

Reliably predicting the detection rates of kilonovae in follow-up of GW-detected NS mergers may be a fool's errand. The values depend on the volumetric rate of NS mergers (each with more than an order of magnitude uncertainty), predictions on the sensitivity of the GW network years in advance (that is, an attempt to predict how some of the most sensitive machines ever built will change), the color and luminosity distribution of kilonovae themselves (and how the intrinsic system parameters affect this, with only a single well-studied event to base our knowledge on), and would have to account for dozens of follow-up instruments scattered over the surface of Earth and teams with different observational strategies.

Here we bound the rate. To calculate the number of kilonovae detected through follow-up of GW-detected BNS mergers we start with the rate of such events within 200 Mpc. This distance is estimated using KN170817 as a baseline: with observations achieving a sensitivity of \(\sim \) 21st Mag, we can recover KN170817 out to the Advanced design range of LIGO and Virgo. At 22nd Mag this reaches to \(\sim \) 250 Mpc. This is roughly the sensitivity of ZTF (depending on the observation time) which has a \(47\,\text{deg}^2\) FOV, covers the g, r, and i filters (effectively, green, red, and infrared), and observes the northern sky. For our estimate of kilonova detection rates we assume that we can achieve ZTF-like depths in the majority of optical filters over the observable night sky, which is a reasonable assumption given active and potential upcoming comparable facilities (e.g., Diehl et al. 2012; Bloemen et al. 2015).
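The detection distances quoted above can be approximated with the distance modulus, as in the following Python sketch. It scales a reference peak magnitude at a reference distance to a given limiting magnitude. The reference of 18th Mag at 40 Mpc is taken from the blue-band peak of KN170817 quoted earlier; neglecting extinction, K-corrections, and cosmological effects is our simplification, and the function name is hypothetical.

```python
def max_distance_mpc(limiting_mag, ref_mag=18.0, ref_distance_mpc=40.0):
    """Distance at which a source of ref_mag at ref_distance_mpc
    fades to limiting_mag (Euclidean inverse-square scaling only)."""
    return ref_distance_mpc * 10.0 ** ((limiting_mag - ref_mag) / 5.0)

for limit in (21.0, 22.0):
    print(f"Limiting Mag {limit:.0f}: "
          f"KN170817-like peak detectable to {max_distance_mpc(limit):.0f} Mpc")
```

Under these assumptions a limiting magnitude of 21 corresponds to roughly 160 Mpc and a limiting magnitude of 22 to roughly 250 Mpc. Recovering the event at peak is not the same as identifying and characterizing it, so these distances should be read as approximate upper bounds.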

We do not attempt to estimate the gain from wide-field telescope sensitivity (e.g., LSST) as the rate of optical transients becomes too great for this simple method to be accurate. Galaxy-targeting campaigns or smaller field of view telescopes that are more sensitive (e.g., DECam) generally require 3 or more interferometer localizations to succeed. This will not be common for events beyond 200 Mpc in the Advanced era, but will be in the A+ era where our provided numbers are conservative. This is shown in Fig. 7.

The first estimation is the GW-recovery fraction of these events, which is shown in Fig. 7. We multiply the GW detection efficiency with differential volume to determine the distance distribution for GW-detected NS mergers. From this, we can also calculate the recovery fraction of a network for BNS mergers within 200 Mpc, roughly corresponding to the discovery distance for kilonova until LSST. This value is calculated by taking the time a network will spend with specific detector combinations and multiplying by the recovery fraction of the second-best live interferometer. We assume 70% livetime for each individual interferometer and treat Virgo and KAGRA as roughly equivalent (taking the higher recovery fraction). For the Advanced era this suggests the network will recover \(\sim \) 30% of BNS mergers within 200 Mpc and about 75% in the A+ HLVKI era. These assumptions neglect the fact that most detectors are not copointed, but this is somewhat counteracted by the additional sensitivity of three and four detector livetimes.
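The network recovery calculation described above can be sketched in Python as follows. For every combination of live detectors, the sketch weights the recovery fraction of the second-best live interferometer by the probability of that combination, assuming independent 70% livetimes. The per-detector recovery fractions used in the example are hypothetical placeholders rather than the values behind Fig. 7, and the function name is introduced only for this illustration.

```python
from itertools import product

def network_recovery(recovery_by_detector, duty_cycle=0.7):
    """Livetime-weighted recovery fraction of a GW network.

    recovery_by_detector maps a detector name to the fraction of BNS
    mergers within the volume of interest that it alone would recover.
    A detection is assumed to need at least two live detectors and is
    limited by the second-best live one.
    """
    names = list(recovery_by_detector)
    total = 0.0
    for state in product([True, False], repeat=len(names)):
        live = [recovery_by_detector[n] for n, on in zip(names, state) if on]
        if len(live) < 2:
            continue
        prob = 1.0
        for on in state:
            prob *= duty_cycle if on else 1.0 - duty_cycle
        second_best = sorted(live, reverse=True)[1]
        total += prob * second_best
    return total

# Hypothetical recovery fractions, for illustration only.
example = {"H": 0.6, "L": 0.6, "V": 0.2, "K": 0.2}
print(f"Network recovery fraction: {network_recovery(example):.2f}")
```

Because the second-best live interferometer sets the recovery, adding less sensitive detectors mainly helps during times when one of the more sensitive instruments is down.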

Fig. 7

The GW detection efficiency and distance distributions for GW-detected NS mergers by the Advanced and A+ networks. These are constructed with the projection parameter from Finn and Chernoff (1993), as used in the literature (e.g., Howell et al. 2019), and the tables from Dominik et al. (2015). The left panels are for the Advanced interferometer era and the right for the A+ era. The top panels show the GW detection efficiency for canonical BNS mergers as a function of distance. The middle and lower panels scale this by the differential volume to show the cumulative and differential distance distributions for GW-detected NS mergers. The assumed distances for the different interferometers are the median value for the interferometers (Advanced: LIGO-175 Mpc, Virgo-105 Mpc, KAGRA-77.5 Mpc; A+: LIGO-330 Mpc, Virgo-205 Mpc, KAGRA-130 Mpc) from Abbott et al. (2018a)

Multiplying this fraction by the 5% and 95% bounds on the local volumetric rate of BNS mergers gives the expected rates as a function of distance. To estimate the rate of kilonova detections following GW detections of BNS mergers we account for the 20% loss of events that occur close to the Sun or in the galactic plane, where follow-up observations are either impossible or likely to be too contaminated to reliably identify the counterpart, as discussed in Sect. 2.7. We calculate reasonable values for pessimistic and optimistic scenarios, as well as a mid-range estimate. For the representative estimate we assume 70% will be like AT2017gfo or brighter (the remaining 30% being assumed to be prompt collapse and too faint to detect), and for the high-end estimate we assume 100% (assuming prompt collapse events are rare). These values come from Margalit and Metzger (2019), which is conservative compared to predicted remnant object fractions from other estimates (e.g., Lü et al. 2015).

For the low-end estimate we remove the assumption of kilonova brightness being predominantly determined by the progenitor, e.g., due to properties of the merger or inclination effects, which also removes the assumed mass distributions for BNS mergers. Gompertz et al. (2018) investigate kilonova brightness based on SGRB follow-up. They find that three kilonova candidates would be brighter than KN170817 and that four events have non-detections with upper limits sufficient to rule out a KN170817-like event. We here assume 25% of kilonovae would be as bright as KN170817, corresponding to 2 of the 3 candidates being real detections. We caution that this may still prove to be optimistic.

These calculations give a representative estimate of 5.6 GW-kilonova detections per year with the Advanced network, with pessimistic and optimistic scenarios estimating between 0.4 and 24 per year. For the A+ network this is 14/yr in the representative case, and between 1.0 and 60/yr in the other scenarios. For Voyager and Gen 3 we adopt a lower limit of detections of at least once a month, corresponding to the recovery of BNS mergers (assuming the 95% intrinsic lower limit) within 300 Mpc (with the previously mentioned losses). Should wide-field telescopes sufficiently advance in sensitivity, or should LSST prioritize the follow-up of GW-detected NS mergers, this rate could greatly increase. The rate of kilonovae detected in follow-up of NSBH mergers is likely to be low in comparison, due to the generally greater distances and emission peaking in infrared (where wide-field telescopes are much less sensitive), though the intrinsic rates are broadly unknown.
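The chain of factors behind these estimates can be written compactly, as in the Python sketch below. It multiplies a volumetric BNS merger rate by the volume within 200 Mpc, the GW network recovery fraction, the 80% follow-up accessibility, and the assumed fraction of sufficiently bright kilonovae. The rate density of 1000 Gpc\(^{-3}\) yr\(^{-1}\) is an illustrative placeholder rather than a value quoted in this section, the Euclidean volume is our simplification, and the function name is hypothetical.

```python
import math

def kilonova_rate_per_year(rate_density_gpc3_yr, gw_recovery, bright_fraction,
                           horizon_mpc=200.0, followup_loss=0.2):
    """Expected GW-kilonova detections per year.

    rate_density_gpc3_yr : BNS merger rate density (Gpc^-3 yr^-1)
    gw_recovery          : fraction of mergers within the horizon seen in GWs
    bright_fraction      : fraction of kilonovae bright enough to detect
    followup_loss        : fraction lost to the Sun and galactic plane
    """
    volume_gpc3 = 4.0 / 3.0 * math.pi * (horizon_mpc / 1000.0) ** 3
    return (rate_density_gpc3_yr * volume_gpc3 * gw_recovery
            * (1.0 - followup_loss) * bright_fraction)

# Placeholder rate density; recovery fractions are the ~30% (Advanced)
# and ~75% (A+) values given earlier for mergers within 200 Mpc.
placeholder_rate = 1000.0
advanced = kilonova_rate_per_year(placeholder_rate, 0.30, 0.70)
a_plus = kilonova_rate_per_year(placeholder_rate, 0.75, 0.70)
print(f"Advanced: {advanced:.1f} per year, A+: {a_plus:.1f} per year")
```

With this placeholder the sketch returns about 5.6 and 14 per year for the two networks. The result is linear in every input, so the order-of-magnitude uncertainty on the merger rate carries directly into the detection rate.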

These estimates neglect inclination effects on recovery fraction. As KN170817 was thought to be oriented for maximal brightness this may suggest the rates are somewhat optimistic. However, this is counteracted by the observed inclination distribution of GW-detected NS mergers. This is discussed in Sects. 2.6 and 2.11.

Earlier detections are necessary for characterization of the kilonova and for robust statistical association to the GW (or GRB) signal. The earliest light expected from these events is in UV. The only active mission that does UV discovery searches is Swift, which relies on the galaxy targeting technique. Otherwise, observations in b and g filters within about a day (for blue kilonova), and r and i filters on timescales of a week (for red kilonova) are likely the discovery bands (Cowperthwaite and Berger 2015). However, separation of kilonovae from other optical transients must rely on color information, and we likely need detection in multiple bands for discovery. Once the source position is known, either through identification of afterglow or kilonova, broadband study of the kilonova begins. Telescopes covering these wavelengths are abundant and can make use of both follow-up techniques; however, NIR wide-field telescopes are significantly less sensitive than optical ones.

UVOIR observations from the earliest detection until they fade from detectability (in each wavelength) enable us to infer properties of the ejected material. The ejecta mass, velocity, and opacity (or lanthanide fraction, depending on the formulation) can be determined from the broadband evolution of the quasi-thermal signature. This relies on an underlying assumed kilonova model. This is discussed in detail in Sect. 5.

Other signatures

GW inspirals, prompt SGRBs, afterglow, and kilonova are the primary signals for detecting and characterizing these events. This section briefly summarizes several other possible signals expected on observational or theoretical grounds. Detecting any of these signatures would provide incredible insight into the physics of NS mergers. The discussion here is limited to observational requirements with a base scientific motivation, with more detailed discussion in later sections.

MeV neutrinos

As discussed, BNS mergers can have neutrino luminosities a few times greater than CCSNe. The Supernova Early Warning System (SNEWS) was developed to cross-correlate short-duration signal excesses from multiple \(\sim \) MeV neutrino telescopes to identify and localize nearby CCSNe and alert the astronomical community before the first light (from shock break-out) is detectable (Antonioli et al. 2004). It should also work for NS mergers, where the very short intrinsic time offset from a GW trigger can enable sensitive joint searches.

To discuss potential detection rates we focus on Hyper-Kamiokande, which is a 0.5 Megaton detector under construction in Japan (Abe et al. 2018). It follows the Nobel Prize winning detectors Kamiokande and Super-Kamiokande, will increase our neutrino detection rate of CCSNe by an order of magnitude, and provides a potential path forward from the Standard Model. Unfortunately, it will probably not inform our understanding of NS mergers as they can only be detected to \(\sim \) 15 Mpc. The closest BNS merger every century should occur at roughly \(13_{-4}^{+9}\,\text{Mpc}\), suggesting that during a decade-long run of Hyper-Kamiokande there is a \(\lesssim \)10% chance of detecting a BNS merger.
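The quoted chance can be reproduced approximately with Poisson statistics, as in the sketch below. It assumes that 13 Mpc is the median distance of the closest merger in a century, that mergers are uniformly distributed in Euclidean volume, and that the detection horizon is a sharp 15 Mpc. The median interpretation is our assumption, and the function name is hypothetical.

```python
import math

def detection_probability(horizon_mpc, run_years,
                          closest_mpc=13.0, closest_years=100.0):
    """Probability of at least one merger within horizon_mpc in run_years.

    Assumes closest_mpc is the median closest-event distance over
    closest_years, i.e. the expected count inside that volume and
    time span is ln(2).
    """
    expected_ref = math.log(2.0)
    expected = (expected_ref
                * (horizon_mpc / closest_mpc) ** 3
                * (run_years / closest_years))
    return 1.0 - math.exp(-expected)

p = detection_probability(horizon_mpc=15.0, run_years=10.0)
print(f"Chance of a detectable BNS merger in a decade: {p:.2f}")
```

Under these assumptions the chance is about 10%. The cubic dependence on distance means the quoted uncertainty on the closest-event distance changes this value by a factor of several in either direction.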

Other observed non-thermal signatures

Observations of SGRBs have uncovered several additional non-thermal signatures. These signatures provide unique insight into these events, the possibilities and implications of which are discussed in Sect. 4.7. The main peak in prompt emission is sometimes observed with preceding emission referred to as precursor activity and sometimes with extended emission that can last up to \(\sim \) 100 s. These are reliably identified with the prompt GRB monitors. Gamma-ray precursors may require pre-trigger data with high temporal resolution (if the trigger is due to the main emission), and are generally expected to be softer, requiring energy coverage near \(\sim \) 10–100 keV. There may also be precursor emission at other energies. Clear identification of extended emission requires well-behaved backgrounds after trigger, and this emission is generally seen at \(\lesssim \)100 keV.

SGRB afterglow emission has large variation in addition to the base temporal decay. The Swift-XRT sample of SGRB afterglows shows X-ray flares and plateau activity in excess of the base temporal decay. These appear to be signatures of late-time energy injection into the jet. They require prompt X-ray observations, generally concluding within 10,000 s of trigger time.

High-energy neutrinos

We may also expect high-energy (\(\sim \) TeV-EeV) neutrino emission from NS mergers. The most sensitive instrument at these energies is the gigaton-class IceCube detector. The prompt and extended emission of SGRBs and the extra components seen in some SGRB afterglow may produce significant amounts of neutrinos (e.g., Kimura et al. 2017, and references therein). These signals are favorable for joint detections given the short time offset and rough localization capability of IceCube. Extended emission appears to be the most favorable signature, but only occurs for a fraction of NS mergers. In light of the neutrino search around GW170817 (Albert et al. 2017) approaching interesting limits and the relatively new consideration of the SGRB jet interaction with polar kilonova ejecta, new theoretical studies have been performed that suggest we may be able to detect SGRBs in high energy neutrinos (Kimura et al. 2018). This generally requires a GW-GRB event within \(\sim \) 50 Mpc and occurring in the northern hemisphere, where IceCube is far more sensitive. Such an event occurs about once per decade.

Murase et al. (2009) opened the possibility of observing \(\sim \) EeV neutrinos over days to weeks after merger from proton acceleration by a new, long-lived NS remnant with a high magnetic field, referred to as a magnetar. Fang and Metzger (2017) applied this to BNS mergers and their model was tested in Albert et al. (2017), which suggests we are 2 orders of magnitude away from interesting limits. This high energy neutrino signature is unrelated to the presence of a jet. The understanding gained through the multimessenger observations of GW170817 has led to reevaluation of potential coincident detections (e.g., Kimura et al. 2017) and additional mechanisms for high energy neutrino production, such as choked jet scenarios (e.g., Kimura et al. 2018). Precise predictions of detection rates are difficult, but detections are generally expected to be rare.

Very-high energy electromagnetic detections

The term gamma-rays refers to about half of the electromagnetic spectrum. The primary energy range of SGRBs (\(\sim \) keV–MeV energies) is soft gamma-rays. The mid-energy range is covered by the Fermi-LAT. In its first decade of observation it has detected 186 GRBs, 155 of which are with its normal data (\(\gtrsim \)100 MeV). The seed information for LAT GRB searches is usually GBM triggers, with about 30% of GBM detections observed within the nominal LAT FOV, giving a LAT recovery efficiency of \(\sim \) 25%. Of that 25%, 30% (2%) is seen above 5 GeV (50 GeV) (Ajello et al. 2019). Notably, of those with measured redshift 80% (12%) have source-frame photons above 5 GeV (100 GeV). These detections appear to be a mixture of prompt and afterglow emission, which can occur during the prompt phase even for SGRBs.

Beyond the reach of Fermi are Very High Energy (VHE) gamma-rays, roughly defined as \(\gtrsim 100\,\text{GeV}\), that are observed by ground-based facilities utilizing Cherenkov radiation. Detections at these energies are expected observationally from extrapolation of the LAT power-law measurements and theoretically, e.g., from synchrotron self-Compton afterglow emission. There are two classes of VHE telescopes. Water Cherenkov telescopes, like the High-Altitude Water Cherenkov Array (HAWC; Wood 2016), observe a large fraction of the sky instantaneously (day or night). Imaging Atmospheric Cherenkov Telescopes (IACTs) perform pointed observations, though by most definitions they are wide-field telescopes (\(\sim \text{few deg}^2\) FOV); they are far more sensitive but can only observe at night.

The first report of a VHE detection of a GRB occurred earlier this year, with the Major Atmospheric Gamma Imaging Cherenkov Telescopes (MAGIC) detection of LGRB 190114C (Mirzoyan et al. 2019). The LAT observations of this burst are impressive, but within the observed distribution. This suggests that the MAGIC observation resulted in detection because it was the first early VHE observation of a very bright afterglow. It is sufficiently bright that it could have been detected by HAWC in the sensitive region of its FOV. There are also two reports from H.E.S.S. of VHE detection of afterglow from the LGRBs 180720B and 190829A (Velasco 2019; de Naurois 2019). This suggests a detection rate of a few LGRBs per decade with existing telescopes, which is consistent with extrapolation from the LAT rates.

To estimate the detection rate of SGRBs with VHE telescopes we can scale the LGRB rate by the ratio of SGRBs to LGRBs. The LGRB-to-SGRB ratio for GBM is 4:1. The same ratio for the LAT is 10:1. This is not surprising as a large portion of the LAT detections are from only afterglow emission (which is fainter for SGRBs). Then, an optimistic VHE detection rate of NS mergers with existing instrumentation is \(\sim \) 1/decade. The planned Cherenkov Telescope Array (CTA) is an IACT that is roughly an order of magnitude more sensitive than its predecessors. Then, we may expect a VHE detection of an SGRB every few years. However, we emphasize this is a very rough estimate.
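The scaling above amounts to dividing the LGRB VHE detection rate by the LGRB-to-SGRB ratio, as in the following minimal Python sketch. The value of three LGRB detections per decade is our stand-in for "a few" and is not a measured rate; the function name is hypothetical.

```python
def sgrb_vhe_rate_per_decade(lgrb_rate_per_decade, lgrb_to_sgrb_ratio):
    """Scale an LGRB detection rate to SGRBs using an LGRB:SGRB ratio."""
    return lgrb_rate_per_decade / lgrb_to_sgrb_ratio

assumed_lgrb_rate = 3.0   # stand-in for "a few" per decade (assumption)
optimistic = sgrb_vhe_rate_per_decade(assumed_lgrb_rate, 4.0)    # GBM ratio
pessimistic = sgrb_vhe_rate_per_decade(assumed_lgrb_rate, 10.0)  # LAT ratio
print(f"SGRB VHE detections per decade: {pessimistic:.2f} to {optimistic:.2f}")
```

The GBM ratio gives a value approaching one per decade, in line with the optimistic rate quoted above, while the LAT ratio gives a value a few times lower.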

Neutron precursors to kilonova and additional energy injection

One of the surprises of KN170817 that remains unsolved is the origin of the early bright UV/blue emission. This topic is discussed in Sect. 3.4. The possibilities include the decay of free neutrons, shock-heated contributions from jet interactions with polar ejecta, and additional heating supplied through a temporary magnetar, among others. In all cases these require observations in UV and blue optical wavelengths as early as \(\sim \) 1–2 hours after merger.

Late-time radio emission

The quasi-isotropic ejecta will emit late-time radio emission as it interacts with the circumburst material (Nakar and Piran 2011). Their estimate of the detectability distances for a representative set of sensitive radio telescopes reaches a few hundred Mpc. This signal should therefore be detectable, but we note the assumed densities are higher than most of the observed distribution following SGRBs (Fong et al. 2015). We emphasize that this cannot be the only counterpart to a GW detection for it to be reliably associated, given the massive delay time preventing robust association.

Gamma-ray detections of prompt kilonova and kilonova remnants

Kilonovae are nuclear-powered transients. Our observational understanding of the properties of the ejecta material comes from indirect, model-dependent inferences. We could directly measure the nuclear yield by detecting the nuclear gamma-rays that are emitted from \(\sim \) tens of keV to a few MeV with a flat spectrum across this range due to Doppler broadening of many lines (Hotokezaka et al. 2016b; Korobkin et al. 2020). No existing telescope can detect this emission unless the event occurs within the Local Group. The current design of the most sensitive proposed instruments (e.g., McEnery et al. 2019) could detect these signals up to \(\sim \) 15 Mpc, comparable to the prospects for MeV neutrino detections of these events.

However, another option has recently been identified. Based on fiducial BNS merger rates and kilonova ejecta properties both Korobkin et al. (2020) and Wu et al. (2019) discuss the possibility of identifying kilonova remnants (KNRs) in the Milky Way. They make different assumptions but come to the same conclusion that detecting KNRs in the galaxy may be within reach with next-generation nuclear astrophysics missions. The use of these observations is discussed in Sect. 5.2.

Wu et al. (2019) also consider potential diffuse emission from ancient NS mergers that have fully diffused with the Milky Way. The spatial distribution would likely differ from usual galactic distributions given the natal kicks to these systems, but detection prospects are hopeless for decades.

Detections summary

Given the breadth of this section we provide a short summary tying the observations together. NS mergers may produce observable signatures in all astrophysical messengers across wide ranges in energy and time, as shown in Fig. 8. In Tables 5 and 6 we summarize the rate estimates from this full section. See the text for a full understanding of the assumptions underlying each number.

Fig. 8

The observing timescales when detectable emission is known or expected from NS mergers. Because of our greater history (and therefore understanding) of EM observations, we divide this messenger into bands. Intervals where signals were detected for GW170817 are outlined with black boxes. The full color regions are for times with known observations of other SGRBs. The shaded regions cover times where we expect to detect signals in the future

We provide a short summary here for convenience. We assume a base intrinsic BNS merger rate, neglecting any contribution from NSBH mergers. This is used to calculate the GW detections where each network assumes only two co-aligned interferometers (corresponding to the two US-based LIGO interferometers for the next several years). Advanced refers to the current design sensitivity, A+ is the funded upgrade, with Voyager and Gen 3 referring to the proposed future interferometers.

The prompt SGRB and SGRB afterglow rates are based on empirical observations. The joint rates assume a fixed fractional recovery of SGRBs by GW interferometers of a given sensitivity. The Swift BAT joint detection rate comes from scaling the Fermi-GBM values by their relative SGRB rates. Note that the GW and GW-GRB rates for GBM and BAT are for two-interferometer GW networks and are therefore a lower bound. See Sect. 2.6 for a broader explanation.

The kilonova rates are very broad bounds, which account for a more complete GW network than the GW or GRB rates shown in this table. The low end is bounded by a base recovery fraction of the low end of the GW detection rates, and the high end by assuming recovery of the majority of the intrinsic event rate within a fixed distance. The detections of kilonovae following SGRBs assume KN170817-like events and the fraction of SGRBs with measured redshift falling within the maximum detection distance for an assumed sensitivity.

Table 5 A summary of the expected individual detection rates of NS mergers in their canonical signals
Table 6 A summary of the expected joint detection rates of NS mergers in their canonical signals

In broad strokes, all the canonical signals from NS mergers are brighter when observed from a polar position than an equatorial one. In Sect. 2.6 we discuss the effects of inclination bias on joint GW-GRB detection rates, where SGRBs are only visible when Earth is within the jet and GWs are stronger along the total angular momentum axis. Observed kilonova brightness also depends on the inclination angle (e.g., Kasen et al. 2017). If polar ejecta is faster moving than the equatorial ejecta then its brightness is fairly constant regardless of the observer angle. If it is slower then its emission is obscured when viewed from an equatorial region (e.g., Kawaguchi et al. 2020). Equatorial ejecta is brighter when viewed on-axis due to viewing a larger cross section. These conclusions hold for most putative signatures as well (e.g., MeV neutrinos from a thick disk). Overall this may be viewed as a beneficial selection effect for multimessenger astronomy and will result in a larger sample of particularly well characterized events, but will induce biases that must be handled carefully for some science (e.g., standard siren cosmology).

Astrophysical inferences

From the observable parameters for individual events, we may make a number of additional inferences and draw new information from combined information. Section 3.1 discusses the observations that allow identification of NS mergers and classification into BNS and NSBH mergers; and Sect. 3.2 discusses how to determine the immediate remnant object formed in BNS mergers. The potential contribution to the origin of the observed time delay between the GW and GRB emission is discussed in Sect. 3.3. The origin of the early bright UV/blue emission in KN170817, and potential contributions to future events, is discussed in Sect. 3.4. Lastly, how to determine where these events occur, both in spatial position and redshift, and the inferences this information allows with respect to formation channels, stellar formation and evolution, and redshift determination for individual events is discussed in Sect. 3.5.

Progenitor classification and the existence of neutron star–black hole systems

There is no known NSBH system. These systems are thought to be formed through the same field binary formation channel as BNS systems (which we know exist), where instead the primary remnant is either born a BH or becomes one through accretion during the common envelope phase. Determining the astrophysical rates and intrinsic properties of these systems has important implications for the science that can be done with NSBH mergers.

As discussed in Sect. 2.1.3 some NSBH mergers are not expected to have EM signals. Based on current population synthesis models for intrinsic system parameters, the inferred BH spins from LIGO/Virgo observations, and our understanding of which systems will release NS material to power the EM transients it seems likely that EM-dark NSBH mergers exist and that EM-bright mergers could exist (e.g., Foucart 2020). Once we have observed them, they provide a separate handle on stellar evolution (Sect. 3.5), may enable a precise determination of NS radius in a NS merger (Sect. 7.1.4), and may allow for some more stringent measures of fundamental physics (e.g., speed of gravity) with a given network sensitivity (Sect. 8). As they can be detected through GWs to greater distances and are phenomenologically different, they would require different EM capabilities to understand.

Classifying events as BNS or NSBH mergers is critical to ensure pure samples and understanding how these events differ. GW detections of CBC provide information on the progenitor masses. Events with the primary constrained to be under the maximum mass of a NS can be assumed to be BNS systems. Events with the secondary constrained to be over this value can be classified as BBH mergers. This value is currently not known (see Sect. 7.2) but is almost certainly between 2 and 3\(M_{\odot} \). Systems with one mass below this value and one above can be classified as NSBH mergers.
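The mass-based rule above can be written as a short decision function. This is a minimal sketch under stated assumptions, not a pipeline used by any collaboration: the function name `classify_by_mass` is hypothetical, it works on point estimates rather than posteriors, and the default boundary `m_ns_max = 2.5` is an arbitrary placeholder inside the 2 to 3 \(M_{\odot} \) range quoted in the text.

```python
def classify_by_mass(m_primary, m_secondary, m_ns_max=2.5):
    """Classify a compact binary from point-estimate component masses.

    Masses are in solar masses. m_ns_max is an ASSUMED maximum NS mass;
    the true value is unknown (the text only brackets it between 2 and 3).
    """
    # Make sure the heavier object is treated as the primary.
    if m_secondary > m_primary:
        m_primary, m_secondary = m_secondary, m_primary

    if m_primary < m_ns_max:
        return "BNS"   # both objects below the assumed maximum NS mass
    if m_secondary > m_ns_max:
        return "BBH"   # both objects above the assumed maximum NS mass
    return "NSBH"      # one object on each side of the boundary


if __name__ == "__main__":
    # Illustrative masses only, not measurements of any real event.
    for masses in [(1.4, 1.3), (5.0, 1.4), (30.0, 25.0)]:
        print(masses, classify_by_mass(*masses))
```

In practice the inputs would be posterior distributions rather than single numbers, so a realistic version would return class probabilities instead of a single label.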

These classifications assume that there are no exotic stars in this mass range and that there is a clear separation between NS and BH masses. For low-mass systems we will tend to precisely measure the chirp mass but poorly measure the mass ratio (unless the event is particularly nearby/loud), so we may expect a significant fraction of events to have inferred individual mass posteriors that cross this boundary. This mass range is particularly difficult to precisely constrain for most events as was shown in Littenberg et al. (2015) who investigate the possibility of probing the existence of the first mass gap of compact objects, i.e. the lack of known NS or BH between \(\sim \) 2 and \(\sim \) 5 \(M_{\odot} \). Assuming this gap exists would make GW classification easier, but this is a strong assumption to make.
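To see why a well-measured chirp mass with a poorly measured mass ratio can straddle the NS/BH boundary, the sketch below converts chirp mass and mass ratio into component masses using the standard definition \(\mathcal{M} = (m_1 m_2)^{3/5}/(m_1+m_2)^{1/5}\) with \(q = m_2/m_1 \le 1\). The function name, the chirp mass of 1.8 \(M_{\odot} \), and the 2.5 \(M_{\odot} \) boundary are illustrative assumptions, not values from the text.

```python
def component_masses(chirp_mass, q):
    """Return (m1, m2) with m1 >= m2 for a given chirp mass and q = m2/m1.

    Follows from M_c = (m1*m2)**(3/5) / (m1+m2)**(1/5) with m2 = q*m1,
    which gives m1 = M_c * (1+q)**(1/5) / q**(3/5).
    """
    if not 0.0 < q <= 1.0:
        raise ValueError("mass ratio q must satisfy 0 < q <= 1")
    m1 = chirp_mass * (1.0 + q) ** 0.2 / q ** 0.6
    m2 = q * m1
    return m1, m2


if __name__ == "__main__":
    chirp_mass = 1.8   # solar masses; hypothetical, assumed precisely measured
    m_ns_max = 2.5     # solar masses; ASSUMED boundary, the true value is unknown
    # Scan a wide, poorly constrained range of mass ratios.
    for q in (1.0, 0.8, 0.6, 0.4, 0.3):
        m1, m2 = component_masses(chirp_mass, q)
        side = "above" if m1 > m_ns_max else "below"
        print(f"q={q:.1f}: m1={m1:.2f}, m2={m2:.2f} (primary {side} boundary)")
```

For this illustrative chirp mass, the primary sits below the assumed boundary for equal masses and above it for the most asymmetric ratios, so the same chirp mass is compatible with either a BNS or an NSBH label.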

Further, the first GW detections of NS mergers require a higher standard of proof for strong classification claims. GW observations can conclusively distinguish between progenitors by finding or ruling out matter effects on the inspiral, characterized by the tidal deformability parameter \(\varLambda \). Constraining this value to be non-zero would exclude a BBH merger and classify the event as a NS merger. Determining between BNS and NSBH merger would then rely on the mass constraints of the primary.

The difficulty of GW measurement of tidal deformability with the current high-frequency sensitivity is demonstrated by GW170817: despite it being one of the loudest events detected thus far, and despite utilizing the precise position from the kilonova detection, the final LVC results cannot rule out a BBH merger origin from GW observations alone (Abbott et al. 2020b). In fact, the LVC discovery paper for GW170817 comments that the GW observations alone do not classify the event as a BNS merger, relying on the information provided by the EM counterparts to make the firm claim (Abbott et al. 2017c), which additionally relied upon the assumption that BHs do not exist in this mass range (Hinderer et al. 2019; Coughlin and Dietrich 2019). For NSBH mergers the inspiral can be dominated by the heavier BH and appear similar to a BBH merger (e.g., Foucart 2020). GW-only classification will remain ambiguous for a large fraction of these events until the interferometers achieve sensitivity at higher frequencies.

In the O3 observing run LIGO and Virgo reported the GW trigger GW190814 (Abbott et al. 2020c) which demonstrates many of these difficulties. The precisely determined secondary mass requires the object to be either the heaviest known NS or the lightest known BH, but it cannot be assigned to either class as the boundary between the two is unknown. The precise secondary mass measurement was enabled by a large mass asymmetry and the loud signal, but no evidence for matter effects was observed.

MeV neutrino observations provide another potential direct determinant of the presence of a NS, or even determination of a BNS progenitor if it observes the (meta-)stable NS remnant, but these detections will be very rare for at least a decade (Sect. 2.10).

Given these difficulties, multimessenger detections provide a solution. If there is an associated SGRB we can immediately infer the presence of at least one NS. If the inferred BH mass is sufficiently heavy then the GW-GRB observations can classify the event as a NSBH merger. Otherwise, they can only conclusively state the system is not a BBH merger. This information may be useful in real-time to prioritize follow-up observations once we are in an era where GW-detections of NS mergers are a regular occurrence. There have been searches for quasi-periodic oscillations in prompt SGRB emission (Dichiara et al. 2013), which may occur in NSBH mergers if the spin-axis of the BH was misaligned with the orbital angular momentum axis (Stone et al. 2013). However, it is unknown if the accretion disk will align with the BH equator and precession of the jet may or may not occur (Liska et al. 2017, 2019) in NSBH mergers.

Kilonova observations will provide the strongest indirect evidence for system classification. The predictions for the inferred ejecta mass, average velocity, and electron fraction differ for NSBH mergers and BNS mergers. Delineation between the progenitors and remnants will have to rely on combinations of ejecta mass, velocities, kilonova color, and multimessenger determination of inclination (Barbieri et al. 2019, 2020). A self-consistent picture combining GW-determined masses with SGRB and kilonova observations will strengthen such claims.

The immediate remnant object in binary neutron star mergers

In NSBH mergers the remnant object will always be a BH because one already exists. In BNS mergers we have the previously discussed (Sect. 2.1.3) four cases: Stable NS, SMNS, HMNS, and prompt collapse to a BH. Determining which mergers produce which immediate remnant objects is key to understanding NS mergers themselves and informs NS EOS studies, our understanding of the central engines of ultrarelativistic jets, the heavy element yield distribution, and biases in standard siren cosmology. Figure 9 summarizes the expected differences, collating information from several sections (2.1.2, 2.1.3, 2.1.4, 2.1.5), and is relied upon throughout the paper. While the text and figure represent generally robust expectations based on the current understanding of these cases, they will inevitably be updated as future multimessenger detections occur and simulations improve. Some current limitations are discussed in Sect. 5.

Fig. 9

The key expected signatures for the different classes of NS mergers. Left to right corresponds to increasing mass: BNS mergers classed into a Stable NS, SMNS, HMNS, or prompt collapse scenarios, then EM-bright NSBH mergers and lastly EM-dark NSBH mergers. The differing prompt SGRB and kilonovae signatures are shown for each scenario, providing a potential method to distinguish them. Dashed lines indicate the assignment of this signature to a specific scenario is not yet certain, or that the signature is theoretically expected but not yet confirmed observationally. The geometric representations are approximate and intended only as guidelines

Directly classifying remnants can likely only be done with GW or neutrino signals. With neutrino observations we could infer a NS remnant because the \(\sim \) MeV neutrino flux would be in excess of that from the accretion disk. EeV neutrinos should be emitted at late times around long-lived magnetars. Neutrino detections are unlikely to occur with upcoming neutrino telescopes.

The GW merger frequency and strain evolution could reliably differentiate between most of the four cases. For prompt collapse we expect BH ringdown at \(\sim \) 6–7 kHz and an immediate drop in amplitude. For (meta)stable NS remnants the merger would occur at \(\sim \) 2–4 kHz and significant GW emission would remain after merger. In the HMNS case this emission would cut off in \(\lesssim \)1 s as the object collapses to a BH. For the Stable NS and SMNS cases the amplitude of the GW emission would decrease as the remnant transitions to the isotropic rotation phase, where secular GWs may be released at twice the rotation frequency, which will slowly decrease with time. Distinguishing between the Stable NS and SMNS classes with GW observations is unlikely.
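The qualitative GW criteria above can be summarized as a toy decision rule. The sketch below is only an illustration of that logic under simplifying assumptions: the function name and its two inputs (a characteristic merger or ringdown frequency and the duration of post-merger emission) are hypothetical, the approximate ranges from the text are treated as hard thresholds, and no real analysis would reduce a post-merger signal to two numbers.

```python
def classify_remnant_from_gw(merger_freq_khz, postmerger_duration_s):
    """Toy remnant classification from two idealized GW observables.

    merger_freq_khz: characteristic merger/ringdown frequency in kHz.
    postmerger_duration_s: duration of significant post-merger GW emission
        in seconds (0 means an immediate drop in amplitude).
    The thresholds are the approximate values quoted in the text, used here
    as sharp cuts purely for illustration.
    """
    if postmerger_duration_s <= 0.0:
        # Immediate amplitude drop with ringdown near 6-7 kHz.
        if 6.0 <= merger_freq_khz <= 7.0:
            return "prompt collapse"
        return "ambiguous"

    # Remaining cases require a (meta)stable NS merging near 2-4 kHz.
    if not 2.0 <= merger_freq_khz <= 4.0:
        return "ambiguous"
    if postmerger_duration_s <= 1.0:
        return "HMNS"  # emission cuts off within about a second
    # GW observations alone are unlikely to separate these two classes.
    return "Stable NS or SMNS"


if __name__ == "__main__":
    # Illustrative inputs only.
    print(classify_remnant_from_gw(6.5, 0.0))
    print(classify_remnant_from_gw(3.0, 0.2))
    print(classify_remnant_from_gw(3.0, 50.0))
```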

With the planned high-frequency sensitivity for the Advanced interferometers it may be possible to detect GW emission from a HMNS remnant at 10–25 Mpc, where a merger within 25 Mpc is of order a once-a-decade event (Clark et al. 2014). A direct GW determination requires improved GW sensitivity up to at least \(\sim \) 4 kHz. The A+ upgrade (and similar upgrades) are currently not aiming to be sensitive beyond \(\sim \) 1 kHz. Therefore, we are unlikely to have direct determination of the immediate remnant object within a decade, and indirect determination using EM observations is the only viable option until then. Fortunately, there are expectations for significant EM signal variation between remnant classes, guided by theory and simulation.

Below we summarize how the kilonova, SGRBs, and other EM signatures are expected to vary depending on the immediate merger remnant. Because these rely on model-dependent predictions on the behavior of matter in extreme regimes and the scientific results we wish to claim have incredibly important implications, we require a self-consistent understanding to emerge from these distinct predictions and the GW determined masses.

Kilonovae will be the most common EM counterpart, and they should vary significantly between remnant classes. The understanding of the expected differences has come about over the past decade of improvements in simulation and theoretical understanding. The subject was broached with regards to disk winds in Metzger and Fernández (2014), refined for general ejecta type in Metzger (2020), and well described in Margalit and Metzger (2019) and Kawaguchi et al. (2020). We summarize those arguments here and plot representative early spectra to show how this can be done. We emphasize that these are representative cases and variation on observed emission within a specific remnant class is expected to be significant depending on orientation effects, the mass, mass ratio, spins, etc.

However, the underlying differences are robust. In general, the longer the remnant NS lives the more total ejecta will be unbound and it will be systematically bluer. Given enough time, the tidal tails become spiral arms that collide with the dominant NS mass and are released. In the HMNS case ejection at the shock-interface terminates during collapse, but it can continue in the lower-mass cases. The disk-wind ejecta increases with lifetime as a higher fraction of its total mass is unbound. The massive neutrino luminosities alter the electron fraction for much of this ejecta. These are all generic outcomes.

These observations will also help resolve the origin of the early UV emission. The origin of the early bright UV/blue emission in KN170817 is generally debated. As discussed in detail in Sect. 3.4 the resolution to this question should not affect the relative differences between the cases discussed above, as the UV brightness expected for different models generally scales with NS remnant lifetime. One potential exception is if magnetars cannot power SGRBs, meaning we would only expect jet interactions in the HMNS case.

Fig. 10

Representative early spectra for the Stable NS and SMNS, HMNS, and prompt collapse cases for events at 100 Mpc. We here assume KN170817 originated from a HMNS remnant and represent this case with the finely tuned model from Kasen et al. (2017). The spectra for the prompt collapse and Stable NS/SMNS cases are generated using the toy kilonova model described in Metzger (2020), using the code to generate the lightcurves in Villar et al. (2017), and were generated by P. Cowperthwaite (private communication). The Stable NS/SMNS case was generated assuming ejecta mass with the properties \(M_{ej}^\mathrm{Blue} = 0.1\,M_{\odot} \), \(v_{ej}^\mathrm{Blue} = 0.3c\), \(\kappa ^\mathrm{Blue}=0.1\,\mathrm{cm}^{2}/\mathrm{g}\) and \(M_{ej}^\mathrm{Red} = 0.005\,M_{\odot} \), \(v_{ej}^\mathrm{Red} = 0.25c\), \(\kappa ^\mathrm{Red}=10\,\mathrm{cm}^{2}/\mathrm{g}\). The prompt collapse case has \(M_{ej}^\mathrm{Red} = 0.005\,M_{\odot} \), \(v_{ej}^\mathrm{Red} = 0.25c\), \(\kappa ^\mathrm{Red}=10\,\mathrm{cm}^{2}/\mathrm{g}\), which neglects a potential subdominant blue component

Kawaguchi et al. (2020) focus on timescales between about a day and a week post-merger and conclude that the peak timescale and luminosity of the infrared emission may enable delineation between the remnant classes. In Fig. 10 we show early spectra for the different cases using representative parameters from Sect. 2.1.3. The early UV/blue emission should very easily distinguish prompt collapse from other scenarios for any observation in the first day or so. The fast evolution of the peak in the HMNS case can be distinguished from the Stable NS/SMNS case, as the latter should brighten over time. This method is advantageous as the initial classification can be done relatively soon after merger, allowing for follow-up prioritization and more precise inferences based on the more complete dataset.

SGRB observations will provide complementary information on the remnant object, and may provide a key signature to discern between a fully Stable NS and a SMNS remnant. It is debated if magnetars can power SGRBs (discussed in Sect. 4.2).

If magnetars cannot power ultrarelativistic outflows, we would only ever observe SGRB emission from mergers that undergo prompt collapse or have a HMNS, where in the latter case the jet will not launch until the NS has collapsed. Stated another way, there should never be a SGRB observed in a SMNS or Stable NS remnant case. We would also observe the non-thermal plateau emission in the HMNS and prompt collapse cases, which would require work to identify its origin.

If magnetars can power ultrarelativistic jets then SGRB observations still provide distinguishing characteristics. The non-thermal signatures of extended emission following the prompt peak and X-ray plateaus in the afterglow suggest late-time energy injection into the jet and led to the development of magnetar central engine theory (this is discussed in Sect. 4.7). Then, we should expect to detect these signatures only in the Stable NS and SMNS remnant cases, and will not observe them in the HMNS or prompt collapse cases. A sharp drop in X-ray flux at the end of the plateau is thought to occur when the NS collapses, providing an observational signature between a Stable NS and SMNS remnant. The X-ray plateaus have been modeled by late-time fallback accretion, but we should be able to distinguish this from a magnetar central engine (see discussions and references in Sect. 4.7).

The time delay from GW to GRB emission is another key piece of distinguishing information, which may provide another way to disentangle what occurs in these sources. From Zhang (2019), the time delay could be up to \(\sim \) 10 s in cases with a magnetar central engine. This is roughly the timescale for the hot NS to cool enough to stop driving baryons from the surface, enabling a clean enough environment for the jet to launch. In other cases the time delay should not exceed a few seconds.

In addition to the potential plateau signature, there are other methods of distinguishing between the Stable NS and SMNS cases. Long-lived remnants will result in a significantly brightened kilonova signature, which could distinguish the cases as Stable NS remnants will be even brighter than SMNS remnants (e.g., Yu et al. 2013; Metzger and Piro 2014; Metzger et al. 2018). There may be an increase in the radio emission from the quasi-isotropic outflow interactions with the circumburst material \(\sim \) years after merger (Metzger and Piro 2014; Fong et al. 2016). There may be differences in the \(\sim \) EeV neutrino emission weeks after merger (e.g., Gao et al. 2013; Murase et al. 2018).

To summarize, with current planned instruments a direct determination of the remnant object for all but the most fortuitous mergers is unlikely for a decade. Until then, we can rely on broadband EM observations to characterize these events. Should a self-consistent picture emerge between the observed kilonova, GRBs, and other signature behavior with the inferred masses from GW observations then we can reliably infer the remnant object outcome indirectly. As much of the science from NS mergers relies on remnant object classification and it likely has significant effects on the observed signatures, determination of the merger remnant for a sample of events is a key goal of observations of NS mergers.

The time delay from merger to prompt gamma-ray burst emission

The total observed time offset for two astrophysical messengers is

$$\begin{aligned} \varDelta t_{\mathrm{observed}}=\varDelta t_{\mathrm{intrinsic}}(1+z)+\varDelta t_{\mathrm{propagation}}, \end{aligned}$$
(10)

with \(\varDelta t_{\mathrm{intrinsic}}\) the intrinsic time delay, which is affected by cosmological time dilation \((1+z)\), and \(\varDelta t_{\mathrm{propagation}}\) the arrival delays induced during propagation of the messengers from source to observer. Much of the science in this paper relies on \(\varDelta t_{\mathrm{GRB-GW}}\), the observed time offset between the coalescence time as measured by GW observations and the onset of the prompt gamma-ray emission. Separating the individual contributions to this term could enable us to determine or better constrain the lifetimes of HMNSs, the speed of gravity, and the emission mechanism of SGRBs, to name a few.
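Rearranging Eq. 10 gives \(\varDelta t_{\mathrm{propagation}} = \varDelta t_{\mathrm{observed}} - \varDelta t_{\mathrm{intrinsic}}(1+z)\), so an assumed interval for the intrinsic delay maps directly onto an interval for the propagation term. The sketch below illustrates this bookkeeping; the function name is hypothetical and the numerical inputs are illustrative placeholders, not measured values.

```python
def propagation_delay_bounds(dt_observed, z, dt_intrinsic_min, dt_intrinsic_max):
    """Bounds on dt_propagation implied by Eq. 10 (all times in seconds).

    dt_propagation = dt_observed - dt_intrinsic * (1 + z)
    The intrinsic delay is only known to lie in an assumed interval, so the
    propagation term is bounded rather than determined.
    """
    if dt_intrinsic_min > dt_intrinsic_max:
        raise ValueError("dt_intrinsic_min must not exceed dt_intrinsic_max")
    dilation = 1.0 + z
    # The largest intrinsic delay gives the smallest propagation delay.
    lower = dt_observed - dt_intrinsic_max * dilation
    upper = dt_observed - dt_intrinsic_min * dilation
    return lower, upper


if __name__ == "__main__":
    # Illustrative inputs only (hypothetical event, assumed intrinsic interval).
    lower, upper = propagation_delay_bounds(
        dt_observed=1.7, z=0.01, dt_intrinsic_min=0.0, dt_intrinsic_max=4.0
    )
    print(f"{lower:.2f} s <= dt_propagation <= {upper:.2f} s")
```

A negative lower bound corresponds to the gamma rays arriving earlier than they would without propagation effects, which is why two-sided constraints are needed.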

We will show that possible propagation effects for GW to SGRB reduce to violations of fundamental physics. So far these all appear to be zero, which simplifies separation of the individual terms. Should the propagation terms be non-zero we can separate them from intrinsic delays, as the cosmological redshift effects on the latter should be negligible for the foreseeable future. Alternatively, if they are hard to disentangle we may require future GW interferometers to detect NS mergers to distances where the redshift will become the dominant term. However, the relevant fundamental physics currently seems rather well supported. We discuss the intrinsic and propagation delays for GW to SGRB emission separately, term by term, to show how we can distinguish the relative contributions of each term, or separately constrain their maximal effects.

For BNS mergers we can write (assuming the standard GRB BH central engine, relativistic jet, internal shock scenario):

$$\begin{aligned} \varDelta t_{\mathrm{intrinsic}} = \varDelta t_{\mathrm{collapse}} + \varDelta t_{\mathrm{formation}} + \varDelta t_{\mathrm{breakout}} + \varDelta t_{\varGamma }. \end{aligned}$$
(11)

\(\varDelta t_{\mathrm{collapse}}\) is the time from coalescence to the formation of the BH; \(\varDelta t_{\mathrm{collapse}}\approx 0\) if the event undergoes prompt collapse, else \(\varDelta t_{\mathrm{collapse}} \lesssim 1\,\text{s}\) in the HMNS case. \(\varDelta t_\mathrm{formation}\) is the time until jet formation once the BH has formed, which is expected to be \(\lesssim \)1 s (limited by the cooling time in the neutrino powered jet scenario and the accretion timescale in the magnetically powered case; see Sect. 4.3). If there is previously ejected material in the polar region then the newly formed jet must breakout, where \(\varDelta t_\mathrm{breakout} \approx 1\,\text{s}\) following known closure relations (from Nakar et al. 2012 as applied to SGRBs in Abbott et al. 2017b). Lastly, the jet must propagate outwards until the prompt SGRB emission. The various SGRB emission mechanisms (Sect. 4.6) usually require at least a few minutes of propagation, but with typical bulk Lorentz factors of \(\varGamma \approx 100\) the jet effectively matches the speed of the GWs and the observed time delay is short (generally of order the duration of the burst).

Equation 11 can be modified for different NS merger cases. Another formulation more useful for GRB-specific studies is described in Zhang (2019); we do not use this here for clarity with our immediate remnant object discussion. If magnetars can power SGRBs then \(\varDelta t_{\mathrm{collapse}}\) should be removed from the discussion. In cases with little polar ejecta (prompt collapse BNS and NSBH) \(\varDelta t_\mathrm{breakout}\) is negligible. These differences should enable us to disentangle the relative importance of these individual terms. For example, NSBH mergers have \(\varDelta t_{\mathrm{collapse}} = \varDelta t_\mathrm{breakout} = 0\). Should we observe a particularly short SGRB then \(\varDelta t_{\varGamma }\) is small and \(\varDelta t_{\mathrm{intrinsic}} = \varDelta t_\mathrm{formation}\).
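The case-by-case modifications of Eq. 11 can be tabulated as a simple delay budget. The sketch below sums rough upper bounds for each term as read off the text (\(\lesssim \)1 s for collapse in the HMNS case, \(\lesssim \)1 s for jet formation, \(\approx \)1 s for breakout where polar ejecta is present, and zero for terms that do not apply). The dictionary, the function name, and the treatment of these approximate values as hard numbers are assumptions for illustration; \(\varDelta t_{\varGamma }\) is passed in by the caller since it is of order the burst duration.

```python
# Approximate upper bounds in seconds for the first three terms of Eq. 11.
# Values are read off the surrounding text and used as hard numbers here
# purely for illustration.
INTRINSIC_TERMS = {
    "BNS HMNS":            {"collapse": 1.0, "formation": 1.0, "breakout": 1.0},
    "BNS prompt collapse": {"collapse": 0.0, "formation": 1.0, "breakout": 0.0},
    "NSBH":                {"collapse": 0.0, "formation": 1.0, "breakout": 0.0},
}


def intrinsic_delay_upper_bound(case, dt_gamma):
    """Sum the Eq. 11 terms for one merger case.

    case: key of INTRINSIC_TERMS.
    dt_gamma: contribution from jet propagation to the emission site,
        of order the burst duration (seconds), supplied by the caller.
    """
    terms = INTRINSIC_TERMS[case]
    return terms["collapse"] + terms["formation"] + terms["breakout"] + dt_gamma


if __name__ == "__main__":
    # Illustrative burst duration of 0.5 s for every case.
    for case in INTRINSIC_TERMS:
        bound = intrinsic_delay_upper_bound(case, dt_gamma=0.5)
        print(f"{case}: dt_intrinsic <~ {bound:.1f} s")
```

Comparing observed delays across cases with different vanishing terms is what allows the individual contributions to be disentangled, as the text describes for NSBH mergers with particularly short bursts.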

One can write a very general equation to capture the total possible propagation effects:

$$\begin{aligned} \varDelta t_\mathrm{propagation} = \varDelta t_{\varDelta v} + \varDelta t_\mathrm{LIV} + \varDelta t_\mathrm{WEP} + \varDelta t_\mathrm{massive} + \varDelta t_\mathrm{dispersion} + \varDelta t_\mathrm{deflection} + \varDelta t_\mathrm{other}, \end{aligned}$$
(12)

where each term captures the relative delay induced during propagation by a different effect: \(\varDelta t_{\varDelta v}\) represents different intrinsic velocities, \(\varDelta t_\mathrm{LIV}\) Lorentz Invariance Violation (LIV), \(\varDelta t_\mathrm{WEP}\) relative Shapiro delay, \(\varDelta t_\mathrm{massive}\) the velocities of massive particles with a given energy according to Special Relativity (SR), \(\varDelta t_\mathrm{dispersion}\) dispersion, \(\varDelta t_\mathrm{deflection}\) the delay induced by magnetic deflection of charged particles, and \(\varDelta t_\mathrm{other}\) other effects or the unknown. Note that some of these terms are subsets of others; they are separated in this manner for pedagogical purposes, but see Sect. 8 for a full explanation.

For GWs and SGRBs we can neglect several of these terms. That is, \(\varDelta t_\mathrm{deflection}=0\) because (inter)galactic plasma and magnetic fields do not affect \(\sim \) MeV gamma-rays nor GWs. The gamma-rays have \(\varDelta t_\mathrm{dispersion}=0\), but the GWs may not. We assume \(\varDelta t_\mathrm{other}=0\) for simplicity. This then leaves

$$\begin{aligned} \varDelta t_\mathrm{propagation} = \varDelta t_{\varDelta v} + \varDelta t_\mathrm{LIV} + \varDelta t_\mathrm{WEP} + \varDelta t_\mathrm{massive} + \varDelta t_\mathrm{dispersion}. \end{aligned}$$
(13)

These terms correspond to specific violations of fundamental physics. \(\varDelta t_{\varDelta v}\) is the induced propagation delay for \(v_{\mathrm{GW}} \ne c\). \(\varDelta t_\mathrm{LIV}\) is for different LIV by gravity and light; \(\varDelta t_\mathrm{WEP}\) is the same except for the Weak Equivalence Principle (WEP). \(\varDelta t_\mathrm{massive}\) is the delay induced for a graviton with non-zero mass, and \(\varDelta t_\mathrm{dispersion}\) captures other potential forms of GW dispersion. Each of these terms and the scientific importance of determining them is discussed in subsections in Sect. 8.

By convention, limits on individual fundamental physics terms are set by assuming the other contributions are 0. Should any of these terms be non-zero a sample of events will be required to determine relative contributions, which is possible because the separate possible propagation terms are most strongly dependent on different parameters.

The precision of the tests of fundamental physics (Sect. 8) that rely on the GW-GRB time offset is determined by how accurately we can model the intrinsic time offset for the event of interest, the redshift, and the observed time offset. We can remove the cosmological time dilation of \(\varDelta t_{\mathrm{intrinsic}}\) if we know z, or calculate z from a known distance and an assumed cosmology. Redshift is likely negligible during the Advanced interferometer era. For the A+ era it may begin to be important, and could become the dominant effect for third generation interferometers.

Fig. 11

The observed time delay from GW170817 to GRB 170817A. The top panel is the 50–300 keV lightcurve from Fermi GBM and the bottom is a time-frequency map from combining the LIGO observations. Figure is from NASA Goddard which is modified from Fig. 2 in Abbott et al. (2017b)

The total intrinsic time delay is expected to be a few seconds (e.g., Zhang 2019), and potentially up to 10 s in extreme scenarios. Separately constraining the different contributions to the intrinsic time delay unveils great insight into these events (e.g., Li et al. 2016; Abbott et al. 2017b; Zhang 2019). The more precisely we can determine the intrinsic time delay the greater our constraints on fundamental physics. Before GW170817 the LVC prior on this time offset was \([0, +4]\,\text{s}\) with a 1 second addition on either side for safety (e.g., to account for light travel time from distant spacecraft or differing GRB triggering methodologies; see Abbott et al. 2017b and references therein). The time offset from GW170817 to GRB 170817A fell right in the middle of this range, as shown in Fig. 11.

When this redshift is accounted for, and allowing for two-sided constraints, we write \(\varDelta t_\mathrm{intrinsic,z}^{\pm } = \varDelta t_{\mathrm{intrinsic}}^{\pm }(1+z)\). Throughout this paper we assume \(\delta t_{\mathrm{intrinsic,z}} = 2\) s for individual events, giving 1 s uncertainty for two-sided constraints. This is only twice the precision of the prior set by the LVC based only on theory prior to GW170817 (Abbott et al. 2017g). This assumption also makes the results easily scalable, should this precision be unachievable for some events, given that each side of a two-sided constraint is set to 1 s precision. This assumption is used in Sect. 8.

The origin of early ultraviolet emission

The observations of KN170817 were broadly consistent with a two-component kilonova: bright blue emission that peaks on the order of a day which fades to redder emission that peaks on the order of a week before fading out of detectability. It was brighter and bluer than expected, including a somewhat surprising UV detection half a day post-merger (Evans et al. 2017). The origin of this emission is debated. Metzger et al. (2018) and Arcavi (2018) discuss most of the theoretical explanations that have been invoked, their successes and limitations, and how future early UV/blue observations can resolve this question. We summarize the options below but refer to these papers for more detail.

The most basic explanation is a kilonova origin for the emission (Villar et al. 2017). This is potentially feasible but has some difficulties, which led to the discussion of other models (e.g., Metzger et al. 2018; Arcavi 2018). Before proceeding we point out a peculiar outcome of these theoretical models: in all cases we may expect the brightest early UV/blue emission to occur for BNS mergers with Stable NS, SMNS, and HMNS cases (in expected brightness listed in decreasing order) with little to no early bright UV/blue emission from prompt collapse or NSBH mergers. This may complicate delineation between these models, but has the benefit that the phenomenological description on indirectly differentiating between BNS merger remnants (Sect. 3.2) is unaffected by the true origin of the early bright UV/blue emission in KN170817.

Jet-interaction effects were invoked by a few teams (e.g., Evans et al. 2017; Kasliwal et al. 2017). In these models the jet is launched after material already exists in the polar region. This can release a portion of very high velocity radioactive ejecta which allows the light to escape much earlier and can provide an additional source of energy through shock-heating. If magnetars can power SGRBs (Sect. 4.2) then we expect jet interaction effects in the Stable NS, SMNS, and HMNS cases with the amount of polar material related to the lifetime of the NS. If magnetars cannot power ultrarelativistic outflows then we only expect jet interactions in the HMNS case. The BNS prompt collapse and NSBH mergers should not have significant material in the polar regions at jet-launch time and should thus have much dimmer blue emission compared to the other cases. However, this model struggles to reproduce the observations of KN170817 as it would require jet kinetic energies beyond anything previously seen (Metzger et al. 2018) but may impart observable signatures in future events.

Metzger et al. (2018) argue the emission can be explained by a neutrino-heated, magnetically accelerated wind from a short-lived magnetar, resulting in mildly-relativistic outflows. Again brightness scales with remnant NS lifetime. Observing a more traditional SGRB or directly inferring relativistic motion through radio interferometry observations or detection of high energy photons can clearly distinguish between these possibilities.

Lastly, free neutrons decay as \(n^0 \rightarrow p^+ + W^- \rightarrow p^+ + e^- + \bar{\nu }_e\) with a half-life of \(\sim \) 10 min. If the fastest moving initial neutrons escape capture they lead the ejecta material, allowing the majority of their decay energy to escape while the photospheric temperatures are still high. This can provide a very blue signature that peaks on the timescale of \(\sim \) hours, before kilonova emission, with a comparable peak luminosity (Metzger et al. 2014). This model is applied to KN170817 in Metzger et al. (2018).

Both Arcavi (2018) and Metzger et al. (2018) point out that early UV and optical observations should be able to distinguish between these potential contributions as their temporal and spectral evolution differ. GRB observations will provide additional information on distinguishing jet interaction effects from the other models. Early UV emission may arise from combinations of these potential contributions.

Host galaxy, redshift, and where neutron star mergers occur

Understanding where these events occur determines their source evolution and thus the (volumetric and detection) rates of these events through cosmic time, informs on their formation channels, and provides information on stellar evolution through constraints on rare evolutionary pathways. The best current observational evidence to answer these questions comes from observations of SGRB afterglows, which provided the strongest evidence tying these events to NS mergers prior to GRB 170817A; for an overview see Fong et al. (2015). For a study on how GRB 170817A compares to these observed distributions see Fong et al. (2017). For a review on the formation channels of BNS systems see Lorimer (2008); for reviews on compact object binaries see Kalogera et al. (2007) and Postnov and Yungelson (2014). The standard formation channel for NS mergers is described in Sect. 2.1.1. See Belczynski et al. (2002) for a discussion on other possible formation channels.

There are two methods to determine the redshift of GRBs. The first is from direct measurement of redshift from the afterglow itself. This is common for LGRBs but has only occurred twice for SGRBs, due to the lower overall brightness. The second method is through statistical association to a host galaxy, and then determining the redshift of that galaxy (e.g., Fong et al. 2015, and references therein).

The natal kicks during supernova explosion send a large fraction of NS mergers outside of their host galaxies. More SGRBs are observed outside of the half-light radius of the inferred host galaxy than within, with typical physical offset \(\sim \) 10 kpc and the largest inferred of 75 kpc (see Fong and Berger 2013, and references therein). The assignment of a host galaxy is relatively robust when it is within the half-light radius, and becomes more difficult as the offset increases. The assignment is probabilistic, counting the likelihood of a chance alignment of the source with likely host galaxies. Note that we observe the 2D projection of the 3D offset, e.g., for an event 10 kpc from the host galaxy we can observe it anywhere from 10 kpc offset to directly aligned, depending on our viewing geometry. There is no way to directly separate this effect.

The assignment of a SGRB (or kilonova) to a host galaxy requires localizations with \(\sim \) arcsecond accuracy to be robust. Swift XRT localizations are not sufficient, as the chance alignment of galaxies within typical error regions is non-negligible. There are some SGRBs where no robust host galaxy assignment can be done as there are no potential hosts very nearby in 2D angular offset (\(\lesssim 1'\)), despite deep observational searches. These hostless SGRBs have bright galaxies somewhat nearby in 2D offset (\(\sim \) few arcminutes) in excess of random chance, suggesting at least some belong to these galaxies (Tunnicliffe et al. 2013). This creates an observational bias against associating some particularly nearby SGRBs with their true host galaxy, which is shown in Fig. 12. That is, for a fixed intrinsic offset the maximum observed 2D offset (the vector from host to source being perpendicular to that of host to Earth) can vary by more than an order of magnitude over the observed distance range for SGRBs. This directly corresponds to the host association probability. There is also the obvious bias of more difficult host galaxy detection for distant events.

Fig. 12

Both arcseconds per kpc and kpc per arcsecond as a function of redshift. This figure shows how we require arcsecond precision for distant events to distinguish host from source, and how we may fail to associate nearby events, as the probability of association depends on the observed 2D offset for a fixed distance. The black dashed line is the distance to NGC 4993; the grey line is the distance to the furthest claimed redshift for a SGRB.

The figure also demonstrates that the largest inferred intrinsic offset of 75 kpc at the distance of GW170817 would have a \(6'\) offset from the host galaxy. With EM-only observations we could not associate the source to host in this circumstance. Distance determination through GW observations will alleviate these issues and will resolve some systematic problems with redshift determination of SGRBs (and NS mergers). These observations require sensitive spectrometers, such as the X-shooter instrument on the VLT.
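The quoted angular offset can be checked with a short small-angle calculation. This is a minimal sketch, assuming a Euclidean geometry (adequate only for nearby events) and a distance to NGC 4993 of about 40 Mpc; that distance and all function names are assumptions introduced here for illustration.

```python
import math

ARCMIN_PER_RAD = 180.0 * 60.0 / math.pi
ARCSEC_PER_RAD = ARCMIN_PER_RAD * 60.0


def max_angular_offset_arcmin(offset_kpc, distance_mpc):
    """Largest projected offset in arcmin for a given physical offset.

    This maximum is reached when the host-to-source vector is perpendicular
    to the line of sight. Small-angle, Euclidean approximation (z << 1).
    """
    theta_rad = offset_kpc / (distance_mpc * 1000.0)
    return theta_rad * ARCMIN_PER_RAD


def kpc_per_arcsec(distance_mpc):
    """Physical scale subtended by one arcsecond (same approximation)."""
    return distance_mpc * 1000.0 / ARCSEC_PER_RAD


D_NGC4993_MPC = 40.0  # assumed distance, not taken from the text

offset_arcmin = max_angular_offset_arcmin(75.0, D_NGC4993_MPC)
scale = kpc_per_arcsec(D_NGC4993_MPC)
print(f"75 kpc at {D_NGC4993_MPC:.0f} Mpc -> {offset_arcmin:.1f} arcmin")
print(f"scale: {scale:.3f} kpc per arcsec")
```

At cosmological distances the same physical offset must be converted with the angular diameter distance, which this sketch does not attempt.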

The cosmic rate evolution of NS mergers is not well known. The peak cosmic star formation rate occurred at a redshift of \(\sim \) 1.5–3.0 (e.g., Hopkins and Beacom 2006, or Madau and Dickinson 2014a for a review). We expect the peak rate of NS mergers to track the peak star formation rate modulo the average inspiral time (the lifetimes of massive stars that result in Compact Objects (COs) are negligible). SGRBs will provide the only constraints on the source evolution of NS mergers for at least the next decade. The observed median redshift for SGRBs is inferred to be \(\langle z \rangle_{\rm SGRB} \approx 0.5-0.8\) (Berger 2014), which corresponds to an average inspiral time of \(\lesssim \)5 Gyr; this is a lower limit on the average redshift due to the Malmquist bias and the detection threshold of BAT.
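The link between the median SGRB redshift and an average inspiral time of a few Gyr can be illustrated by differencing lookback times. The sketch below assumes a flat \(\varLambda \)CDM cosmology with \(H_0 = 70\) km/s/Mpc and \(\varOmega _m = 0.3\), and takes \(z = 2\) as a representative star formation peak; these parameter choices and the function name are assumptions for illustration only.

```python
import math


def lookback_time_gyr(z, h0=70.0, omega_m=0.3, steps=10000):
    """Lookback time in Gyr for flat LambdaCDM (radiation neglected).

    Integrates dz / ((1 + z) E(z)) with the trapezoidal rule, where
    E(z) = sqrt(omega_m (1 + z)^3 + 1 - omega_m).
    """
    hubble_time_gyr = 977.8 / h0  # 1/H0 in Gyr for H0 in km/s/Mpc

    def integrand(zp):
        e_z = math.sqrt(omega_m * (1.0 + zp) ** 3 + (1.0 - omega_m))
        return 1.0 / ((1.0 + zp) * e_z)

    dz = z / steps
    total = 0.5 * (integrand(0.0) + integrand(z))
    for i in range(1, steps):
        total += integrand(i * dz)
    return hubble_time_gyr * total * dz


Z_SFR_PEAK = 2.0  # assumed representative value within the quoted 1.5-3.0 range

for z_sgrb in (0.5, 0.8):
    delay = lookback_time_gyr(Z_SFR_PEAK) - lookback_time_gyr(z_sgrb)
    print(f"z_SGRB = {z_sgrb}: delay since star formation peak ~ {delay:.1f} Gyr")
```

This treats the whole population as forming at a single epoch, so it is only an order-of-magnitude consistency check on the quoted inspiral time, not a substitute for a full delay-time distribution.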

Population synthesis studies are being provided with much improved data from EM observations to test against (e.g., Brown et al. 2018; Bellm 2014; Ivezić et al. 2019). This will be commensurate with GW observations from LIGO/Virgo and in the future by additional ground-based interferometers and LISA. GW and EM detections of BNS and NSBH mergers will provide some unique information (e.g., Belczynski et al. 2008) to the overall understanding of how stars form and evolve (e.g., Abbott et al. 2017f). The formation channels can be tested from these studies and by greater understanding of the source evolution of these events. However, results on the inspiral time distribution may require \(\sim \) 100 nearby events (Safarzadeh et al. 2019). A detailed description of the input physics and how it is constrained with GW observations of BNS, NSBH, and BBH mergers is given in Kruckow et al. (2018).

One example is the primary mass gap, i.e., the idea that there is a gap between the heaviest NSs and the lightest BHs (Belczynski et al. 2012). This is born from the lack of observed compact objects between 2 and \(5\,M_{\odot}\). With the NS merger determination of the maximum mass of a NS (Sect. 7.1.1) we can set a strict boundary threshold. Then, if the mass gap does not exist we would expect to eventually detect a loud CBC merger with a BH posterior contained within the putative mass gap. If it does exist then this would never occur. This would have a number of important implications for the CCSNe mechanism (Belczynski et al. 2012), for stellar evolution, and potentially for robust CBC classification from GW-only observations.

This understanding has implications for future detection rates of these events. The SGRB detection rate is empirically determined and unaffected, but the GW detection rates for Voyager or third generation ground-based interferometers are altered significantly by the evolution of the rates of these events. Further, this has implications for other outstanding questions. Perhaps the best example is the origin of heavy elements, which depends on the source evolution as the modern abundances are determined by the time-integrated history of their creation rate (Sect. 5).

Short gamma-ray bursts and ultrarelativistic jets

During the Cold War the Vela satellites were launched to monitor Earth for gamma-ray signatures of nuclear detonations in the atmosphere to enforce the Partial Test Ban Treaty between the United States and the Soviet Union. The detection of GRBs in 1967 was initially slightly concerning, before timing annuli placed their origin outside the solar system, enabling their declassification (Klebesadel et al. 1973). This was the beginning of the study of GRBs, and therefore the beginning of our observational study of NS mergers.

The first three decades of GRB study were limited to observations of the prompt phase. The high peak energies and short durations suggested very energetic phenomena as their source (e.g., Strong et al. 1974; Mazets et al. 1981). A cosmological origin for these events would require energetics well beyond anything previously known, which strongly suggested a galactic origin. In Euclidean space the cumulative number of homogeneously distributed sources above a peak flux \(P\) scales as \(P^{-3/2}\). Deviation from this power law would require a source distribution where space is non-Euclidean, i.e., to cosmological distances (Meegan et al. 1992; Mao and Paczynski 1992). Sources with a galactic origin have an anisotropic source distribution concentrated in the galactic plane. Data favored an isotropic, inhomogeneous distribution requiring a cosmological origin, with the Burst and Transient Source Experiment (BATSE) on board the Compton Gamma-Ray Observatory (CGRO) the first to hit discovery significance (Briggs et al. 1996).
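The \(-3/2\) slope follows from a short argument, sketched here under the simplifying assumption (not made explicit in the text) of standard-candle sources with a common peak luminosity \(L\) and uniform number density \(n\). The symbols \(L\), \(n\), \(d\), \(d_{\max }\), and \(N(>P)\) are introduced only for this illustration.

$$\begin{aligned} P = \frac{L}{4\pi d^2} \;\Rightarrow \; d_{\max }(P) = \left( \frac{L}{4\pi P}\right) ^{1/2}, \qquad N(>P) = \frac{4\pi }{3}\, n\, d_{\max }^3 = \frac{4\pi n}{3} \left( \frac{L}{4\pi P}\right) ^{3/2} \propto P^{-3/2} \end{aligned}$$

Because every luminosity contributes the same power law, integrating over a luminosity function leaves the slope unchanged. A flattening at low \(P\) therefore indicates fewer faint sources than a homogeneous Euclidean distribution would provide.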

Another key result from this era was the discovery of two classes, short and long, separated by their prompt duration and spectral hardness (Dezalay et al. 1991; Kouveliotou et al. 1993). This separation has been confirmed by broadband afterglow studies. LGRBs arise from host galaxies, and regions within those hosts, with high rates of star formation (e.g., Fruchter et al. 1999; Le Floc’h et al. 2003; Christensen et al. 2004; Fruchter et al. 2006). Nearby LGRBs are usually followed by a CCSN (Galama et al. 1998; Cano et al. 2017), giving direct evidence that these events are powered by a subset of CCSNe referred to as collapsars (Woosley and Bloom 2006). SGRBs track older host environments (e.g., Leibler and Berger 2010; Fong et al. 2013), often occur outside of their host galaxies (e.g., Church et al. 2011; Fong and Berger 2013), and arise from hosts with varied properties (e.g., Gehrels et al. 2005; Fong et al. 2013), all of which match expectations from NS mergers (e.g., Belczynski et al. 2006). As previously mentioned, the association of GW170817 and GRB 170817A directly confirmed that at least some SGRBs arise from BNS mergers. For a review on these properties see Berger (2014). For a summary of the multiwavelength studies from the first decade of the Swift mission, which largely confirmed these predictions, see Fong et al. (2015).

Fig. 13

The measured isotropic-equivalent energetics for GBM GRBs with measured redshift. \(L_{\mathrm{iso}}\) is the peak 64 ms luminosity; \(E_{\mathrm{iso}}\) is the total energetics measured over the burst duration. GRB 170817A is both the closest and the faintest by large margins. This figure is modified from Fig. 4 in Abbott et al. (2017b)

With base values, the observed fluence at Earth from cosmologically distant GRBs would require intrinsic isotropic-equivalent energetics of \(\gtrsim 10^{50}\,\text{erg}\). The prompt emission of GRBs has small intrinsic variability timescales, with the most extreme values being sub-millisecond (e.g., Bhat et al. 1992). Structure at this timescale constrains the size of the central engine, in the non-relativistic case, to \(R \approx c\delta t \lesssim 300\,\text{km}\), requiring a compact central engine. This amount of energy being emitted from such a small volume would result in an enormous opacity for \(\gtrsim \)MeV photons due to pair creation, i.e., \(\gamma + \gamma \rightarrow e^+ + e^-\). The resulting spectrum then must be thermal, incompatible with observations. Paczynski (1986) and Goodman (1986) stepped through these issues and determined that GRBs from cosmological distances require bulk relativistic motion. As calculated through various means, typical values of the bulk Lorentz factor \(\varGamma \) are \(\sim \) 100 (e.g., Fenimore et al. 1993; Baring and Harding 1997; Lithwick and Sari 2001; Hascoët et al. 2012).
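The size limit quoted above is a direct causality argument and can be reproduced in a few lines. This is a minimal sketch of the non-relativistic estimate \(R \approx c\delta t\) only; the function name and the sample timescales are illustrative choices, not values from the text beyond the millisecond case.

```python
C_KM_PER_S = 299_792.458  # speed of light in km/s


def max_engine_size_km(variability_s):
    """Non-relativistic causality limit R ~ c * dt on the emitting region, in km."""
    return C_KM_PER_S * variability_s


# Illustrative variability timescales in seconds (hypothetical inputs).
for dt in (1e-3, 1e-2, 1e-1):
    print(f"dt = {dt:g} s -> R of order {max_engine_size_km(dt):.0f} km")
```

Bulk relativistic motion relaxes this limit, which is one route by which the compactness problem described above is resolved.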

It was hypothesized that the fast outflows from GRBs would interact with the surrounding matter, emitting synchrotron radiation at lower energies (Paczynski and Rhoads 1993). The first detections of GRB afterglow by BeppoSAX confirmed the cosmological origin by localizing events to distant host galaxies (e.g., Van Paradijs et al. 1997; Reichart 1998). Tying specific bursts to specific distances enables a direct determination of the intrinsic isotropic energetics, with some approaching a few times \(10^{54}\,\text{erg}\) (Fig. 13), which is an energy equivalent to the total mass of the Sun after all of the relevant efficiency factors have been accounted for. This strongly suggested that the emission was not isotropic.

It is now known that the bulk relativistic outflow from GRBs is not isotropic, but is collimated into jets. The isotropic-equivalent energetics are corrected by the factor \(1-\cos (\theta _j)\) where \(\theta _j\) is the half-jet opening angle. With representative values of \(\theta _j=1-10\,\text{deg}\) this reduces the required energetics by a factor of \(\sim 10^2\)–\(10^4\) (e.g., Sari et al. 1999; Frail et al. 2001; Panaitescu and Kumar 2001; Racusin et al. 2009; Cenko et al. 2010). Observational evidence in favor of relativistic jets as the origin of GRBs includes constraints on the angular size from the detection of radio scintillation (Goodman 1997; Frail et al. 1997) and direct measurement of superluminal motion of compact emitting regions in both long and short GRBs (Taylor et al. 2004; Mooley et al. 2018).
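The size of the collimation correction can be verified numerically. The sketch below evaluates the factor \(1-\cos (\theta _j)\) for the two representative opening angles and applies it to an illustrative isotropic-equivalent energy; the function names and the example value of \(2\times 10^{54}\) erg are assumptions for illustration.

```python
import math

M_SUN_G = 1.989e33   # solar mass in grams
C_CM_S = 2.998e10    # speed of light in cm/s
M_SUN_C2_ERG = M_SUN_G * C_CM_S ** 2  # solar rest-mass energy in erg


def beaming_fraction(theta_j_deg):
    """Collimation factor 1 - cos(theta_j) for a half-jet opening angle in degrees."""
    return 1.0 - math.cos(math.radians(theta_j_deg))


def collimation_corrected_energy(e_iso_erg, theta_j_deg):
    """True energy release implied by an isotropic-equivalent energy."""
    return e_iso_erg * beaming_fraction(theta_j_deg)


E_ISO_EXAMPLE = 2e54  # erg, illustrative value only

for theta in (1.0, 10.0):
    f = beaming_fraction(theta)
    e_true = collimation_corrected_energy(E_ISO_EXAMPLE, theta)
    print(f"theta_j = {theta:4.1f} deg: reduction factor ~ {1.0 / f:.0f}, "
          f"E_true ~ {e_true / M_SUN_C2_ERG:.1e} M_sun c^2")
```

The reduction factors land near the two ends of the quoted \(10^2\) to \(10^4\) range, with narrower jets giving the larger correction.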

We can determine the collimation angle by measuring the jet-break with afterglow studies (Rhoads 1997). The afterglow undergoes early temporal decay that is somewhat counteracted by the increase in the observable region due to the change in Doppler beaming, \(1/\varGamma \), as \(\varGamma \) decreases due to jet interaction with the circumburst material. Once the beaming angle encompasses the entire jet the temporal decay steepens to the intrinsic value. This signature was first observed in LGRBs in the late 1990s (Kulkarni et al. 1999; Fruchter et al. 1999; Harrison et al. 1999) while jet-break measurements for SGRBs required the Swift era, where a sample now suggests a typical half-jet opening angle for SGRBs of \(\sim 16\pm 10\,\deg \) (Fong et al. 2015, and references therein). These jet-break measurements relied on a top-hat model, which is consistent with observations for these GRBs.

The preceding paragraphs discuss the understanding of GRBs that is generally agreed upon. However, there are many important questions related to GRBs that remain unresolved. For a thorough and quantitative discussion on GRBs in general see Kumar and Zhang (2015) for a review article and Zhang (2018) for a book. These publications discuss a number of key outstanding questions relevant to understanding GRBs. Multimessenger studies of NS mergers, especially those with detected SGRB emission in the prompt or afterglow phase, may provide new pieces of information to answer these questions. Section 4.1 focuses on how GW-GRB studies can probe the types and relative contributions of progenitors of GRBs. Section 4.2 discusses the possibility of magnetar central engines for ultrarelativistic jets. Determining the possible formation mechanisms for these jets is discussed in Sect. 4.3. Their propagation and structure are explored in Sect. 4.4. Their role in the production of other high energy particles is discussed in Sect. 4.5. How GW observations may help us understand the prompt emission mechanism is discussed in Sect. 4.6. Lastly, Sect. 4.7 discusses the other non-thermal signatures seen in GRBs and how we can uncover their origin.

The progenitors of gamma-ray bursts

As discussed, the circumstantial but convincing evidence enabled by the Swift mission tied most SGRBs to a NS merger origin. There are two key areas where GW observations and multimessenger studies may improve our understanding. The first is the direct knowledge of the progenitor system for events detected both in GWs and as GRBs, as demonstrated by the association of GW170817 to GRB 170817A and the classification as a BNS merger (Abbott et al. 2017b, c). With a larger population of confidently classified events we can constrain the fraction of SGRBs that arise from BNS mergers and those from NSBH mergers, providing a method to determine if their GRB properties differ.

Second, future GRBs that are determined via EM observations to originate within the BNS merger sensitive volume of the GW network, but that have no associated GW detection, can be conclusively classified as arising from other sources. There are some events that have some properties of the short class and some of the long class, such as GRB 060614 (e.g., Zhang et al. 2007), where a sufficiently sensitive GW network will confirm or reject a merger origin, providing some insight into these ambiguous events.

Further, some fraction of SGRBs may arise from a different origin. Magnetars are NSs with magnetic fields of order \(\sim 10^{15}\,\text{G}\). Some of them produce soft gamma-ray repeater flares, which are \(\sim 10\,\text{ms}\) long and generally softer than SGRBs (e.g., Lazzati et al. 2005). These magnetars sometimes produce a magnetar giant flare, with three having been observed in the Milky Way and its satellite galaxies (e.g., Mazets et al. 1979; Hurley et al. 1999; Palmer et al. 2005). As noted in Hurley et al. (2005), observing such flares that originate in other galaxies, out to a few tens of Mpc, would result in temporal and spectral properties largely consistent with cosmological GRBs. From basic rates estimates it follows that a small fraction of SGRBs, between \(\sim \) 1 and 10%, are from extragalactic giant flares (Ofek 2007; Svinkin et al. 2015).

There are two previously published SGRBs that are strongly suggested to be extragalactic magnetar giant flares (e.g., Frederiks et al. 2007; Mazets et al. 2008) and an initial report of a third case (LVC 2020). Non-detections by the LVC constrain a NS merger origin for the first two to be beyond the likely nearby host galaxy, reducing the options to a giant flare from that galaxy, a SGRB from that galaxy with an unknown origin, or chance alignment of a cosmologically distant SGRB (Abadie et al. 2012a; Abbott et al. 2008). These measurements were interesting with the previous generation of LIGO. As joint observations proceed in the coming years with far more sensitive GW interferometers we can more clearly separate SGRBs arising from different progenitors.

The central engines of short gamma-ray bursts

As summarized in Kumar and Zhang (2015), viable central engines of GRBs must be able to launch a jet with enormous luminosities, the jet must be relatively clean of baryons to enable ultrarelativistic speeds, and it likely needs to be intermittent to recreate the observed variability timescales and likely able to reactivate to power the later X-ray flares. Based on these criteria, a hyper-accreting stellar-mass BH is generally accepted as a viable option (e.g., Woosley 1993; Popham et al. 1999; Lee et al. 2000; Lei et al. 2017). Magnetar central engines have also been invoked (Usov 1992; Thompson 1994; Zhang and Mészáros 2001) and appear to easily explain observational signatures observed in tens of percent of SGRBs that require late-time energy injection into the system (Sect. 4.7).

So far, simulations suggest that magnetar central engines may fail to meet the second criterion: if the (meta)stable NS remnant lives for \(\gtrsim 50\,\text{ms}\) the neutrino luminosity strips \(\sim 10^{-3}\,M_{\odot} \) of material from the surface of the remnant itself (Dessart et al. 2008; Fernández and Metzger 2016). Even with \(10^{52}\,\text{erg}\) (\(\approx 0.1\,M_{\odot} c^2\), the rough total mass of the accretion disk) to power the jet, this small amount of baryonic material could only be accelerated to \(\varGamma \approx 10\), an order of magnitude below the typical values expected for SGRBs (e.g., Lee and Ramirez-Ruiz 2007; Murguia-Berthier et al. 2014). However, we note that the observational signatures requiring ultrarelativistic outflows are sparser for SGRBs than for LGRBs, as demonstrated by Ghirlanda et al. (2018) providing a measurement for 1 SGRB compared to 67 LGRBs. While this baryon loading has not been resolved theoretically, there are potential paths forward (see e.g., discussions in Metzger 2020).
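The baryon-loading argument can be illustrated with the simplest possible energy budget, in which all available energy becomes bulk kinetic energy of the entrained baryons, \(\varGamma \approx 1 + E/(Mc^2)\). This is only an order-of-magnitude sketch: it ignores efficiency factors and the details of the cited simulations, and the function name is an assumption.

```python
M_SUN_G = 1.989e33  # solar mass in grams
C_CM_S = 2.998e10   # speed of light in cm/s


def max_lorentz_factor(energy_erg, baryon_mass_msun):
    """Upper limit on the bulk Lorentz factor if all energy goes into the baryons."""
    rest_energy_erg = baryon_mass_msun * M_SUN_G * C_CM_S ** 2
    return 1.0 + energy_erg / rest_energy_erg


# Values quoted in the text: 1e52 erg and ~1e-3 solar masses of stripped material.
gamma_max = max_lorentz_factor(1e52, 1e-3)
print(f"Gamma_max ~ {gamma_max:.1f}")
```

Under these assumptions the limit comes out at several, i.e., of the same order as the quoted \(\varGamma \approx 10\) and well below the \(\sim \) 100 typically inferred for GRBs.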

If magnetars can power ultrarelativistic jets then SGRBs may be generated in the low-mass BNS merger cases so long as the remnant object forms a magnetar (Giacomazzo and Perna 2013; Giacomazzo et al. 2015), and the kilonova signatures in the Stable NS and SMNS cases could be altered by jet interactions with the polar material and by the enormous energy deposition into the system during spin-down (e.g., Yu et al. 2013; Metzger and Piro 2014; Metzger 2020). Magnetar central engines have also been studied for a subset of LGRBs (e.g., Bucciantini et al. 2008; Ioka et al. 2016); however, the baryon content issues for collapsars are different than for mergers, and it is possible that magnetars may only be viable central engines in the latter case.

If these magnetars cannot power ultrarelativistic jets, then only higher-mass BNS mergers produce SGRBs, we would only expect potential jet interactions in the HMNS case, and it may suggest magnetars cannot power LGRBs. Resolving this question is related to confirming the origin of the additional non-thermal emission (Sect. 4.7) as originating from magnetar spin-down energy or fall-back accretion. It would also alter the expected EM signatures of the remnant object (Sect. 3.2), and it has implications for the inferred properties of the ejecta from kilonova observations and therefore their production of the heavy elements (Sect. 5).

Either way we can use this information to classify BNS remnant cases, but in different ways (Sect. 3.2). Observing a SGRB, with the non-thermal plateau emission, from a BNS merger confidently classified as a Stable NS or SMNS merger is suggestive of a magnetar central engine. Given the various possible explanations for the plateau emission we will require several detections with confident classification to prove magnetars can power ultrarelativistic jets (Sect. 3.2).

Otherwise, we will never observe SGRBs for these events. Approximately 5% of GW-detected NS mergers in the Advanced era will have jets oriented towards Earth with SGRB emission detectable with current (or funded) missions (Song et al. 2019). With the unknown fraction of mergers that result in long-lived magnetars, and the unknown viable viewing angles for the plateau emission, we likely require several tens of GW detections of NS mergers to confidently rule out this possibility. This will likely be resolved in the A+ era.

Ultrarelativistic jet formation

A related question to the central engines of SGRBs is how the jet itself is launched. With the required bulk Lorentz factors and total energetics seen in GRBs, the jet formation condition requires an enormous energy deposition into environments nearly devoid of baryonic matter, as previously discussed. From Sects. 2.1.3 and 3.2, some NS mergers have relatively empty polar regions (referenced to the total angular momentum axis) providing a natural jet launching site. The viable jet-launch mechanisms depend on the central engine, intimately tying this question to the previous section. As summarized in Kumar and Zhang (2015), there are three mechanisms thought to be viable for BH central engines. We will describe the two most widely discussed options. We refer the reader to that review for more details on all three cases.

One mechanism is neutrino-antineutrino annihilation (Ruffert and Janka 1998), whereby enormous neutrino luminosities allow the interaction \(\nu + \bar{\nu } \rightarrow e^+ + e^-\) to occur with moderate efficiency, driving a relativistically expanding fireball away from the central engine (e.g., Katz and Canel 1996). The origin of these neutrinos would generally be the thermal emission from the disk, which is geometrically exposed to both polar regions. In cases with, providing

$$\begin{aligned} \dot{E}_{\nu \bar{\nu }} = 1.1 \times 10^{52}\, \mathrm{erg/s} \left( \frac{M}{M_{\odot} }\right) ^{-3/2} \left( \frac{{\dot{M}}}{M_{\odot} /\mathrm{s}}\right) ^{9/4} \end{aligned}$$
(14)

with \(\dot{E}\) the annihilation power, M the BH mass, and \(\dot{M}\) the accretion rate (Zalamea and Beloborodov 2011; Lei et al. 2013).
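To give a feel for the scaling in Eq. (14), the sketch below evaluates it exactly as written in the text. The function name and the example inputs (a \(3\,M_{\odot} \) BH accreting at \(0.1\,M_{\odot} \)/s) are assumptions for illustration; any validity range in accretion rate or additional spin dependence discussed in the cited works is not enforced here.

```python
def nu_nubar_power_erg_s(m_bh_msun, mdot_msun_per_s):
    """Annihilation power of Eq. (14) in erg/s.

    m_bh_msun: BH mass in solar masses.
    mdot_msun_per_s: accretion rate in solar masses per second.
    """
    return 1.1e52 * m_bh_msun ** (-1.5) * mdot_msun_per_s ** 2.25


# Hypothetical example inputs, chosen only to show the steep accretion-rate scaling.
for mdot in (0.01, 0.1, 1.0):
    power = nu_nubar_power_erg_s(3.0, mdot)
    print(f"M = 3 M_sun, Mdot = {mdot:5.2f} M_sun/s -> {power:.2e} erg/s")
```

The \(\dot{M}^{9/4}\) dependence means a factor of ten in accretion rate changes the available power by more than two orders of magnitude.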

The other commonly discussed option is the Blandford–Znajek mechanism which can extract the rotational power of the BH from the magnetic field of the disk (Blandford and Znajek 1977). These Poynting flux jets appear capable of recreating GRBs observations with representative power

$$\begin{aligned} \dot{E}_{BZ} = 1.7 \times 10^{50} \mathrm{erg/s} a^2_{\star } \left( \frac{M}{M_{\odot} }\right) ^2 B^2_{15} F(a_{\star }) \end{aligned}$$
(15)

with \(a_{\star} \) the dimensionless spin parameter \(Jc/GM^2\), where J is the angular momentum of the BH, and \(F(a_{\star} )\) a spin-dependent function that is often approximated (Blandford and Znajek 1977; Mészáros and Rees 1997; Lee et al. 2000; Lei et al. 2013). Note that in this case the neutrino-antineutrino annihilation energy can still be provided to the jet.
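Eq. (15) can be evaluated in the same spirit. The text does not give \(F(a_{\star} )\) explicitly, so the sketch below assumes the closed form commonly used in the literature, \(F = [(1+q^2)/q^2][(q+1/q)\arctan q - 1]\) with \(q = a_{\star} /(1+\sqrt{1-a_{\star} ^2})\); this form, the reading of \(B_{15}\) as the field strength in units of \(10^{15}\) G, and all function names are assumptions. A different spin function can be passed in.

```python
import math


def bz_spin_function(a_star):
    """Assumed closed form of F(a*); tends to 2/3 at low spin and pi - 2 at a* = 1."""
    if not 0.0 < a_star <= 1.0:
        raise ValueError("a_star must lie in (0, 1]")
    q = a_star / (1.0 + math.sqrt(1.0 - a_star ** 2))
    return (1.0 + q * q) / (q * q) * ((q + 1.0 / q) * math.atan(q) - 1.0)


def bz_power_erg_s(a_star, m_bh_msun, b_15, spin_function=bz_spin_function):
    """Blandford-Znajek power of Eq. (15) in erg/s.

    b_15 is taken to be the magnetic field in units of 1e15 G (assumed notation).
    """
    return 1.7e50 * a_star ** 2 * m_bh_msun ** 2 * b_15 ** 2 * spin_function(a_star)


# Hypothetical example: a 3 solar-mass BH with a 1e15 G field at several spins.
for spin in (0.3, 0.7, 1.0):
    print(f"a* = {spin:.1f} -> {bz_power_erg_s(spin, 3.0, 1.0):.2e} erg/s")
```

With GW-measured remnant masses and spins, the mass and spin inputs here would be measured rather than assumed, leaving the field strength as the main free parameter.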

Lei et al. (2013) investigate the capability of these two jet-launch mechanisms to reproduce the bulk Lorentz factor and observed intrinsic energetics seen in GRBs by considering the effects of baryon loading. They find that both mechanisms can produce highly energetic bursts with values spanning orders of magnitude, but that the neutrino-antineutrino case generally results in bulk Lorentz factors lower than have been observed. The high magnetic fields required for the Blandford–Znajek case act as a barrier preventing protons from entering the jet (Li 2000), resulting in a jet with fewer baryons. Given the much larger mass of baryons compared with electrons, the higher the baryon content the lower the velocity of the jet (for a given amount of energy).

If magnetars are to be GRB central engines then the jet launch mechanism is related to the enormous large-scale magnetic field (e.g., Usov 1992; Metzger et al. 2008b). Following Metzger et al. (2011) and Kumar and Zhang (2015), the initially hot magnetar drives baryons from the surface, preventing the launch of an ultrarelativistic jet. Once it cools and the baryonic wind stops, the rapid spindown generates magnetic energy via a dynamo mechanism, launching the ultrarelativistic outflow. The total available energy for this case is related to the spin energy of the magnetar, \(\sim 2\times 10^{52}\,\text{erg}\), which does not appear violated in the GRBs with plateau emission suggestive of a magnetar origin (Lü and Zhang 2014).

Multimessenger observations of GWs and SGRBs provide new information to investigate the viable jet launching mechanisms. In the magnetar case one would expect a longer time delay from the GW-inferred merger time to the onset of GRB emission (Zhang 2019). Remnant classification (Sect. 3.2) may provide conclusive evidence proving the viability of the magnetar mechanism. Delineating between the leading mechanisms to power a GRB with a BH will benefit from (future) direct GW measures of the final BH mass and inferred spin, allowing the input of measured instead of assumed values in the above equations. Considering additional information from kilonova and afterglow observations may enable tighter constraints on other parameters, which in the future may favor one method over the other for individual bursts (e.g., Salafia and Giacomazzo 2020).

Propagation and structure

Forming a jet also requires some method of collimation. In SGRBs this can be done by matter surrounding the launch site originating from the expansion of the equatorial ejecta or the dynamical ejecta already in the polar regions (e.g., Mochkovitch et al. 1993; Aloy et al. 2005; Nagakura et al. 2014). The observed half-jet opening angle is \(\sim 16^\circ \pm 10^\circ \) (Fong et al. 2015), with a range of observed values from \(\sim 3^\circ \) to \(\gtrsim 25^\circ \). Given solid angle effects the median observed value is wider than the true value.
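The solid angle effect can be illustrated with a toy population. Wider jets cover more of the sky, so the chance that a given jet points at the observer scales with the same \(1-\cos (\theta _j)\) factor used for the collimation correction. The sketch below assumes a purely hypothetical intrinsic population spread evenly between \(3^\circ \) and \(25^\circ \) and ignores all other selection effects; the function names are illustrative.

```python
import math
import statistics


def on_axis_probability(theta_j_deg):
    """Relative chance that a jet of half-opening angle theta_j points at Earth."""
    return 1.0 - math.cos(math.radians(theta_j_deg))


def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half the total."""
    pairs = sorted(zip(values, weights))
    half = 0.5 * sum(weights)
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value
    return pairs[-1][0]


# Hypothetical intrinsic population: one jet at each integer angle from 3 to 25 deg.
intrinsic_deg = list(range(3, 26))
weights = [on_axis_probability(theta) for theta in intrinsic_deg]

intrinsic_median = statistics.median(intrinsic_deg)
observed_median = weighted_median(intrinsic_deg, weights)
print(f"intrinsic median: {intrinsic_median} deg")
print(f"median among jets pointed at Earth: {observed_median} deg")
```

In this toy case the median of the on-axis subsample is shifted to wider angles than the intrinsic median, which is the sense of the bias described above; the size of the shift depends entirely on the assumed intrinsic distribution.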

The top-hat jet model refers to a conical emitting region with uniform parameters as a function of angle (Rhoads 1997). Top-hat jets have historically been used to model GRBs because they involve (comparatively) simple math and were capable of reproducing (most) observations. However, structured jets of various forms where the properties vary within the jet opening angle have been considered (Mészáros et al. 1998; Rossi et al. 2002; Zhang and Mészáros 2002; Granot and Kumar 2003; Kumar and Granot 2003; Perna et al. 2003; Panaitescu 2005a) and have been applied to some particularly well observed bursts (e.g., Berger et al. 2003; Starling et al. 2005; Racusin et al. 2008).

GRB 170817A had a few unusual properties. Its isotropic-equivalent energetics were several orders of magnitude lower than those of the known sample (see Fig. 13), there was no detection of the X-ray afterglow in the earliest observation at \(\sim \) 0.5 days post-merger (Evans et al. 2017), and it was first detected in X-rays nine days later by Chandra (Troja et al. 2017b). Some key observations since this time are discussed below. Several models were invoked to explain these characteristics, which can be classified into three options:

  • Top-hat jet: GW170817 was so close that we were able to use radio interferometry observations to demonstrate superluminal motion of the main emitting region, confirming compact bulk relativistic motion and thus a successful jet (e.g., Mooley et al. 2018; Ghirlanda et al. 2019). However, in top-hat jets the afterglow fades in time as a power-law, so the lack of detection in X-rays at first observation rules out a top-hat jet origin (e.g., Troja et al. 2017b; Margutti et al. 2017; Fong et al. 2019).

  • Structured jet: A structured jet origin is the leading explanation remaining for GRB 170817A (e.g., Margutti et al. 2018; Alexander et al. 2018; Hajela et al. 2019b; Nynka et al. 2018; Lazzati et al. 2018; Fong et al. 2019; Troja et al. 2019). In this scenario GRB 170817A is usually referred to as off-axis, implying the most luminous section of the GRB was oriented away from Earth. The long term monitoring of GRB 170817A has allowed for broadband characterization of the temporal evolution from X-ray to radio over timescales of years. This has shown a slow temporal rise to a smooth peak followed by the usual decay rate seen from on-axis jets, as expected once the full jet has slowed enough to be fully visible (see Hajela et al. 2019a, and references therein).

  • Cocoon emission refers to the hot envelope that develops around a jet propagating through dense media (Nakar et al. 2012). Some groups invoked a fully choked jet resulting in cocoon (and shock breakout) emission to simultaneously explain the low luminosity of GRB 170817A, the early afterglow behavior, and the early UV emission of the kilonova (e.g., Kasliwal et al. 2017; Gottlieb et al. 2018). Successfully choking a jet generally requires large amounts of material in the path of the jet (i.e., the polar region) to absorb the large kinetic energies and prevent successful propagation of the jet. As such, choked jets in SGRBs were not widely considered before GRB 170817A given the generally low expected densities in that region, as discussed in Sect. 2.1.3. This has been confirmed by simulations of jet dynamics in NS mergers performed after GRB 170817A (Duffell et al. 2018). While the prompt emission of GRB 170817A is consistent with cocoon closure relations, it would require chance coincidence for this event to occur at the correct distance to produce a burst within all of the normal gamma-ray parameters as measured at Earth (Goldstein et al. 2017a; Abbott et al. 2017b). Such emission is expected to produce spectra with peak energies much below those seen in time-resolved analysis of this burst (Lazzati et al. 2017; Veres et al. 2018). The late-time afterglow emission favors a structured jet origin and disfavors a cocoon origin, as discussed and referenced in the previous bullet. The radio observations of the bulk relativistic motion of the compact emitting region also favor a structured jet (Ghirlanda et al. 2019). Further, powering the early UV emission would require a jet with kinetic energies beyond the previously known sample (Metzger et al. 2018). Together these results strongly suggest that a fully choked jet is incompatible with the broadband observations of GRB 170817A.
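The superluminal motion invoked in the list above is a standard special-relativistic projection effect: a blob moving with speed \(\beta c\) at angle \(\theta \) to the line of sight has apparent transverse speed \(\beta _{\mathrm{app}} = \beta \sin \theta / (1 - \beta \cos \theta )\), which peaks at \(\beta \varGamma \) when \(\cos \theta = \beta \). A minimal sketch follows; the function names and the chosen \(\beta \) and angles are illustrative assumptions, not values fitted to GW170817.

```python
import math

def apparent_beta(beta, theta_deg):
    """Apparent transverse speed (units of c) for true speed beta at viewing angle theta."""
    theta = math.radians(theta_deg)
    return beta * math.sin(theta) / (1.0 - beta * math.cos(theta))

def max_apparent_beta(beta):
    """Maximum apparent speed, reached at cos(theta) = beta; equals beta * Gamma."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return beta * gamma

# Illustrative only: a mildly to highly relativistic emitter viewed 20 degrees off its direction of motion.
for beta in (0.5, 0.9, 0.99):
    print(f"beta = {beta}: beta_app(20 deg) = {apparent_beta(beta, 20.0):.2f}, "
          f"max = {max_apparent_beta(beta):.2f}")
```

Any measured \(\beta _{\mathrm{app}} > 1\) therefore requires \(\beta \varGamma > 1\), i.e., bulk relativistic motion of the emitting region.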

A major result of GRB 170817A is the exclusion of the top-hat jet model for this burst. While the choked jet cocoon origin for GRB 170817A now appears unlikely, it may be viable for future events (Gottlieb et al. 2018) and in cases with prompt detection can be tested by the cocoon closure relations in Nakar et al. (2012) to check for consistency (e.g., Abbott et al. 2017b; Burns et al. 2018). This test can be performed within hours of the merger time, allowing for informed follow-up observations. These studies can confirm the viability of GRBs originating from shock breakout or exclude this option shortly after event time and may be particularly interesting when tied to investigations of the merger remnant given the different expectations for material in the polar region (Sect. 3.2).

Jets can be collimated by a density gradient in the polar region, preventing particles from expanding too far from the polar region (e.g., Mochkovitch et al. 1993; Aloy et al. 2005; Nagakura et al. 2014). Magnetic fields can accelerate charged particles in a preferential direction, where ordered poloidal fields can also contribute to jet collimation (e.g., Rezzolla et al. 2011). These interactions impart structure onto the jet where dependencies on the amount and distribution of polar material and the jet launch time are important (e.g., Xie et al. 2018; Geng et al. 2019; Kathirgamaraju et al. 2019; Gill et al. 2019). GW observations provide new information to investigate this question and additional constraints to be met with future models and simulations. The first is a measure of inclination, \(\iota \), where variations in the jet should alter the observed prompt GRB properties, such as the observed energetics, and combined study may elucidate their structure or, at least, constrain the properties of assumed functional forms (e.g., Mogushi et al. 2019; Williams et al. 2018; Song et al. 2019; Beniamini et al. 2019; Biscoveanu et al. 2020). These studies require detections with non-aligned GW interferometers and sensitive sky coverage in gamma-rays, as stringent non-detections are also informative. Combining this information with the jet-opening angle determined from afterglow observations will be particularly powerful.
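To illustrate what "assumed functional forms" means in practice, the sketch below compares three angular energy profiles that are commonly used as toy models: a top-hat, a Gaussian, and a power-law jet. The function names, the core angle, and the power-law index are assumptions chosen for illustration; the profiles give energy per solid angle along the line of sight and deliberately ignore relativistic beaming of emission from the core, which a real analysis must include.

```python
import math

def tophat(theta, theta_c, e0=1.0):
    """Uniform energy per solid angle inside the core, zero outside."""
    return e0 if theta <= theta_c else 0.0

def gaussian(theta, theta_c, e0=1.0):
    """Gaussian fall-off with characteristic core angle theta_c."""
    return e0 * math.exp(-0.5 * (theta / theta_c) ** 2)

def power_law(theta, theta_c, e0=1.0, k=2.0):
    """Flat core with a power-law wing of (assumed) index k."""
    return e0 if theta <= theta_c else e0 * (theta / theta_c) ** (-k)

THETA_C = math.radians(5.0)  # hypothetical core half-opening angle
for iota_deg in (0.0, 10.0, 20.0, 30.0):
    iota = math.radians(iota_deg)
    print(f"iota = {iota_deg:4.1f} deg: tophat = {tophat(iota, THETA_C):.1e}, "
          f"gaussian = {gaussian(iota, THETA_C):.1e}, power law = {power_law(iota, THETA_C):.1e}")
```

The profiles diverge by orders of magnitude a few core angles off-axis, which is why a GW-measured inclination paired with gamma-ray energetics (or stringent upper limits) can discriminate between them.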

The second benefit of GW detections for these purposes is immediate identification of particularly nearby SGRBs; GRB 170817A is so close that it has been observed \(\sim \) 100 times longer than prior SGRB afterglows. Among the key parameters to study GRB structure is the late-time temporal decay of the afterglow, which can distinguish between jetted and quasi-spherical outflows (e.g., Fong et al. 2019). Top-hat jets are predicted to have achromatic jet breaks, but chromatic jet breaks, which are often observed, may allow for inferences on the structure of these outflows (e.g., Panaitescu 2005b), though this explanation is not unique (e.g., Fox et al. 2003; Curran et al. 2007).

GW detections also constrain the jet launch time from studies of the GW-GRB time delay (e.g., Xie et al. 2018; Gottlieb et al. 2018; Geng et al. 2019). Jets that are launched earlier will encounter less polar material, potentially providing less collimation, and would be more likely to break out. Jets that launch later may be more collimated, given thick disk expansion or additional dynamical polar ejecta. If they launch too late they could potentially be choked and fail. Studies seeking to understand the delay time are strongly tied to understanding the remnant object in the case of a BNS merger because of the expected variations in the amount of ejecta, the distribution of that ejecta, and the expected time to launch the jet (Sect. 3.2).

So far studies of the structure of GRB 170817A and SGRBs in general have focused on either the prompt or afterglow emission separately. This is for the perfectly understandable reason that it is difficult to address the two together, but a successful, general structured jet model will have to simultaneously explain all observables, including historic constraints. For example, it would need to be capable of recreating the inferred \(\varGamma \gtrsim 1000\) observed for GRB 090510 based on Fermi LAT observations of this event (Ackermann et al. 2010), and would also have to reasonably reproduce the observed SGRB redshift distribution, GRB 170817A, and the observed intrinsic energetics distribution (Beniamini and Nakar 2018).

Gamma-ray burst jet composition and ultra high energy cosmic rays

Cosmic rays were first identified more than a century ago, through Victor Hess’s high altitude balloon flight (Hess 1912). This was before the formulation of GR or the postulated existence of the neutrino. These particles carried new information from the Universe to Earth and led to the creation of a new field of study. One of the greatest outstanding questions in astrophysics is the origin of Ultra-High Energy Cosmic Rays (UHECRs), i.e., cosmic rays with energies in excess of 1 EeV. For reviews see Nagano and Watson (2000) or Sokolsky (2018).

When protons are accelerated to high energies in dense environments they generically undergo photohadronic processes, e.g., \(p + \gamma \rightarrow \varDelta ^+ \rightarrow n + \pi ^+\) (e.g., Rachen and Mészáros 1998). These can be followed by leptonic decays \(\pi ^+ \rightarrow \mu ^+ + \nu _\mu \) and \(\mu ^+ \rightarrow e^+ + \nu _e + \bar{\nu }_\mu \), which tie the predicted energies of gamma-rays, neutrinos, and cosmic rays produced in the same interactions; the total observed fluxes of these messengers are comparable, which is suggestive of a common origin (see e.g., Halzen and Hooper 2002, and references therein). Among the problems in determining the origin of UHECRs is the deflection of charged particles by the (inter)galactic magnetic fields and large gyro radii, causing both a propagation delay and altering the arrival direction. In principle we can reconstruct the source direction for a particle with known properties (e.g., mass, energy), but this relies on our imperfect understanding of the (inter)galactic magnetic fields, obscuring the origin even in the best case. It is for this reason that searches for gamma-rays and neutrinos from a common source, with appropriate relative energies, have been used to probe the origin of UHECRs.
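The scale of the magnetic deflection can be estimated from the gyroradius of an ultrarelativistic nucleus, \(r_L = E / (Z e B c)\). The sketch below is a rough illustration: the function name is hypothetical, and the field strength of \(1\,\mu \text{G}\) is an assumed order-of-magnitude value for a galactic-scale field, not a measurement quoted in this review.

```python
E_CHARGE_C = 1.602176634e-19      # elementary charge, C
C_M_S = 2.99792458e8              # speed of light, m/s
KPC_M = 3.0856775814913673e19     # one kiloparsec, m

def larmor_radius_kpc(energy_eev, b_microgauss, z=1):
    """Gyroradius (kpc) of an ultrarelativistic nucleus of charge z in a uniform field."""
    energy_joule = energy_eev * 1.0e18 * E_CHARGE_C
    b_tesla = b_microgauss * 1.0e-10
    return energy_joule / (z * E_CHARGE_C * b_tesla * C_M_S) / KPC_M

# Assumed 1 microgauss field; protons (z=1) versus iron nuclei (z=26).
for energy in (1.0, 10.0, 100.0):
    print(f"E = {energy:5.1f} EeV: proton r_L = {larmor_radius_kpc(energy, 1.0):7.2f} kpc, "
          f"iron r_L = {larmor_radius_kpc(energy, 1.0, z=26):6.2f} kpc")
```

When \(r_L\) is not much larger than the coherence scale of the field along the path, the arrival direction no longer points back to the source, and the result depends strongly on the (often unknown) charge of the primary.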

Given the ultrarelativistic nature of GRBs and the enormous energetics involved, it is natural to assume they will accelerate some amount of protons to high energies, with simulations showing some level of baryon loading even in the Poynting flux case (e.g., Lei et al. 2013). This led to the suggestion that they may be responsible for UHECRs (Vietri 1995; Waxman 1995) and the idea that a large scale neutrino detector could be used to investigate their potential common origin (Waxman and Bahcall 1997). The short intrinsic timescales and external trigger information would make association (after detection) relatively easy.

While IceCube has indeed found an astrophysical flux of high energy neutrinos (Aartsen et al. 2014), deep searches have never robustly associated these signals with GRBs (Abbasi et al. 2011; Aartsen et al. 2015). This is a somewhat puzzling finding, as it suggests a very low baryon loading in GRB jets, despite the general expectation that baryons are present above the jet-launching site and should be accelerated. It could be that these protons are accelerated to high velocities, but the prompt emission radius is significantly larger than in the internal shock scenario, so that photohadronic interactions become less likely due to the lower densities and neutrino production is suppressed (Zhang and Kumar 2013). These non-detections led to suggestions that choked LGRBs, where the jet fails to break out through the massive star, may be significant sources of neutrinos (e.g., Mészáros and Waxman 2001; Senno et al. 2016).

LGRBs are generally more favorable for these studies than SGRBs, as their higher total energetics should produce a higher neutrino flux and their greater total matter above the jet launch site should result in a higher proportion of choked jets. However, the detection of GW170817 and GRB 170817A resulted in renewed interest in SGRBs as neutrino sources. First, among the issues of choked LGRBs is that they are EM-dark (or at least, far fainter than successful jets). If there are NS mergers with choked jets we can identify nearby events through GW detections, which will provide a time and location for joint sub-threshold searches (Kimura et al. 2018) as well as inform on the expected EM counterparts and their behavior. Second, the inferred structure of SGRB jets suggests a higher likelihood of neutrino detection for nearby events identified by GW detections (e.g., Ahlers and Halser 2019).

It is not known for certain which GRB observations are the most likely to produce detectable neutrinos; however, because neutrino telescopes are all-sky monitors we will have observations of nearly all events. Ideally future studies will be able to detect neutrinos from these events and allow us to study baryon presence in the jet. Alternatively more stringent limits may show that the launch of a relativistic outflow in the presence of baryons is not a sufficient condition for the production of UHECRs or that UHECR production may not require significant neutrino production if the emission radius is large enough (Zhang and Kumar 2013), which may have implications for multimessenger searches for the origin of UHECRs from other sources.

Fig. 14 A simplified picture of the emission from GRBs. Thermal emission is possible once the jet has passed the photospheric radius. Internal dissipation of the jet releases the prompt GRB signal, shown here with the internal shocks model. Then, the onset of afterglow emission occurs when the external shock develops as the jet interacts with the surrounding media. This figure is courtesy of Dan Kocevski (private communication)

The prompt emission mechanism(s) of gamma-ray bursts

GRB jets are often discussed as an ultrarelativistically expanding fireball (e.g., Piran 1999; Yost et al. 2003; Willingale et al. 2007). A basic representation of the emission stages is shown in Fig. 14. The energy density is truly enormous, preventing gamma-rays from escaping until the jet reaches the photospheric radius, where opacity becomes low enough to allow light to escape from within the jet for the first time, at \(\sim 10^{11}\)–\(10^{12}\,\text{cm}\) (Beloborodov 2010; Kumar and Zhang 2015). Inhomogeneities from the central engine result in shells that propagate outwards with differing bulk Lorentz factors. Fast-moving shells catch slow-moving shells that were emitted at earlier times at \(\sim 10^{12}\)–\(10^{13}\,\text{cm}\), releasing the main prompt GRB emission through internal shocks (Rees and Meszaros 1994). Lastly, the jet propagates outwards until the interaction with the local environment creates the afterglow emission via synchrotron radiation (Kobayashi and Zhang 2007).
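The internal shock radius quoted above follows from a standard order-of-magnitude argument: two shells launched a time \(\delta t\) apart with comparable Lorentz factors \(\varGamma \) collide at \(R_{\mathrm{IS}} \sim \varGamma ^2 c\, \delta t\). The sketch below evaluates this estimate; the function name and the particular \(\varGamma \) and \(\delta t\) values are illustrative assumptions, and order-unity prefactors differ between treatments.

```python
C_CM_S = 2.99792458e10  # speed of light, cm/s

def internal_shock_radius_cm(gamma, dt_s):
    """Order-of-magnitude collision radius for shells separated by dt_s in launch time."""
    return gamma ** 2 * C_CM_S * dt_s

# Illustrative (assumed) Lorentz factors and variability timescales.
for gamma, dt_s in ((100.0, 1.0e-3), (100.0, 1.0e-2), (300.0, 1.0e-3)):
    radius = internal_shock_radius_cm(gamma, dt_s)
    print(f"Gamma = {gamma:5.0f}, dt = {dt_s * 1e3:4.1f} ms -> R_IS ~ {radius:.1e} cm")
```

The strong \(\varGamma ^2\) dependence means that the observed millisecond variability of SGRBs maps onto a wide range of emission radii, which is part of why the emission site remains debated.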

Except, maybe not. There are those that argue the dominant emission of GRBs is photospheric in origin (reviewed in Beloborodov and Mészáros 2017). Or that a Poynting flux jet can release the prompt signal once turbulence and magnetic reconnection hit a critical point, at a distance \(\sim 10^{16}\,\text{cm}\) from the central engine (Zhang and Yan 2010), which has implications for GRBs as the origin of UHECRs (Sect. 4.5).

Observations have provided insight, but no full resolution. The broader energy coverage of Fermi has enabled the study of more complex spectral models. For example, Guiriec et al. (2010) fit the prompt emission of three bright SGRBs with multiple components, including a thermal component, the main non-thermal component, and an extra power law, which has been seen in additional bursts (e.g., Tak et al. 2019). These components could originate from the three stages (photospheric, internal dissipation, external shock) which can have temporally overlapping signals given the enormous bulk Lorentz factors involved. Alternatively, some explain similar features through synchrotron radiation (e.g., Ravasio et al. 2018). The detector response of gamma-ray scintillators is non-linear, requiring a forward-folding spectral analysis method that still (usually) relies on empirical functions rather than theoretically motivated ones, which significantly complicates these studies.
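As a rough illustration of forward folding with an empirical function, the sketch below evaluates the widely used Band function and pushes it through a toy response matrix to predict count rates per detector channel. Everything numerical here is a labeled assumption: the response matrix, energy grid, and spectral parameters are invented for illustration and do not represent any real instrument; real analyses use instrument-supplied responses, finer grids, and a fit statistic.

```python
import math

def band(energy_kev, amp, alpha, beta, e_peak_kev, e_piv_kev=100.0):
    """Band function photon flux (photons / cm^2 / s / keV); needs alpha > beta and alpha > -2."""
    e_break = (alpha - beta) * e_peak_kev / (2.0 + alpha)
    if energy_kev < e_break:
        return (amp * (energy_kev / e_piv_kev) ** alpha
                * math.exp(-energy_kev * (2.0 + alpha) / e_peak_kev))
    return (amp * (e_break / e_piv_kev) ** (alpha - beta)
            * math.exp(beta - alpha) * (energy_kev / e_piv_kev) ** beta)

def forward_fold(response, photon_edges_kev, model):
    """Predicted counts/s per channel: response[i][j] (cm^2) maps photon bin j to channel i."""
    lows, highs = photon_edges_kev[:-1], photon_edges_kev[1:]
    # Photons / cm^2 / s in each input bin, using the geometric bin centre.
    photons = [model(math.sqrt(lo * hi)) * (hi - lo) for lo, hi in zip(lows, highs)]
    return [sum(r * p for r, p in zip(row, photons)) for row in response]

# Hypothetical 3x3 response: off-diagonal terms mimic high-energy photons
# depositing only part of their energy and landing in lower channels.
TOY_EDGES_KEV = [10.0, 100.0, 1000.0, 10000.0]
TOY_RESPONSE = [
    [80.0, 20.0, 5.0],
    [0.0, 60.0, 15.0],
    [0.0, 0.0, 30.0],
]

def toy_model(energy_kev):
    """Band spectrum with assumed, illustrative parameters."""
    return band(energy_kev, amp=0.01, alpha=-0.5, beta=-2.3, e_peak_kev=500.0)

rates = forward_fold(TOY_RESPONSE, TOY_EDGES_KEV, toy_model)
print([f"{rate:.3g}" for rate in rates])
```

Because the response is not invertible in practice, the model parameters are varied until the folded prediction matches the observed counts, so the inferred spectrum is only as physical as the assumed function.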

There are two capabilities that are providing new insight into the prompt GRB emission mechanism. Polarization probes the existence of large-scale magnetic fields, where significant detection of high polarization implies Poynting flux jets (Toma et al. 2009). Population analyses have only recently become available, as these require Compton telescope observations of particularly bright bursts, given the probabilistic scattering angle. Results are not yet conclusive, given the varied results (e.g., Lyutikov et al. 2003; Yonetoku et al. 2012; Chattopadhyay et al. 2019; Zhang et al. 2019; Burgess et al. 2019). Continued advancement in these studies is a promising method to understand the prompt emission mechanism of GRBs. We note that the lower fluence of SGRBs implies their polarization will be measured an order of magnitude less often than LGRBs, but, under the general assumption that GRBs have the same emission mechanism(s), results from LGRBs are likely to be informative.

The other new parameter is the time offset from the GW to GRB emission. These were explored for GW170817 and GRB 170817A in Abbott et al. (2017b) and followed by several wonderful analyses (e.g., Granot et al. 2017; Shoemaker and Murase 2018; Zhang et al. 2018a), as well as those that sought to test or distinguish between leading models (e.g., Meng et al. 2018) or alternative scenarios (e.g., Kasliwal et al. 2017). The separate intrinsic time delay parameters each provide unique information on these events (Zhang et al. 2019). With a large enough sample we can independently constrain the separate parameters, providing tighter constraints, e.g., on the jet launch time and the size of the emitting region at emission time. Tying specific bursts to a known central engine type (Sect. 4.2) or potentially constrained to a dominant jet formation mechanism (Sect. 4.3) will provide additional insights into the viable models.

These studies then require polarization measurements of GRBs, which will be difficult given there is no active Compton telescope. We also need broadband characterization of the prompt SGRB emission in joint GW-GRB detections. Currently, only KONUS-Wind and Fermi-GBM cover the necessary range (\(\sim \) 10 keV–10 MeV). Several proposed SmallSats cover only a restricted energy range (\(\sim \) 50 keV–2 MeV), largely due to mass limitations (e.g., Racusin et al. 2017; Grove et al. 2019). To constrain the time-resolved \(E_{peak}\) in a majority of SGRBs we require sensitivity to several MeV.

There have been a few detections of the prompt phase of GRBs by telescopes at lower energies (e.g., Guiriec et al. 2016; Troja et al. 2017a). Broadband characterization of the prompt emission, beyond the energy range of the GRB monitors, would be phenomenally informative for prompt emission mechanisms (see discussion in Kumar and Zhang 2015), so long as their contribution can be separated from an external shock component. This would require either telescopes with massive fields of view, or sufficient early warning from GW detectors.

The origin of other non-thermal signatures

Discussed below are observed or predicted signatures that are likely to be tied to the central engine activity. These include flares and plateaus in the prompt and early afterglow emission, which are separate from the dominant components. Determining if these events exist and their origin can enable greater understanding of NS mergers, as discussed below.

Short gamma-ray burst precursors

Precursors generally refer to short emission episodes that occur 100 s or less before the main GRB episode. Troja et al. (2010) analyzed Swift data to identify precursor signals, claimed confirmation of these pulses in other instruments, and argued that \(\sim \) 10% of SGRBs have precursor activity. Other analyses suggest a lower fraction of potential SGRB precursors in other instruments (e.g., Zhu 2015; Burns 2017; Minaev and Pozanenko 2017; Li et al. 2018). A similar fraction of SGRBs have secondary pulses that succeed the main pulse. There is no analysis showing SGRB precursors are spectrally distinct from the main emission. As discussed below, the majority of SGRBs occur at distances beyond where we would theoretically expect to detect precursors. Therefore, it appears feasible that previously observed precursors are just lower-flux SGRB pulses. None were observed before GRB 170817A to constraining limits (Abbott et al. 2017b; Li et al. 2018).

There are theoretical models (mentioned below) that predict precursor emission in gamma-rays, X-rays, and radio, with typical luminosities (\(\sim 10^{42}-10^{47}\,\text{erg s}^{-1}\)) and potentially UHECR production. Signals at these luminosities would only be detectable by all-sky monitors if the events are particularly nearby, precluding these models as the origin of some claimed precursors (e.g., the precursor for GRB 090510, which occurred at a redshift of 0.9; Ackermann et al. 2010). Isotropic precursor emission may be expected in these wavelengths from magnetospheric interactions (Hansen and Lyutikov 2001; Metzger and Zivancev 2016; Wang et al. 2018), disruption of the NS crust could produce a short gamma-ray flash (Tsang et al. 2012), or emission from the crust can power an EM chirp (Schnittman et al. 2018). These could give unique constraints on the magnetic fields of the progenitors or on the NS EOS (Sect. 7.2). While these signatures would be emitted before merger time, radio precursors may arrive at Earth after merger being delayed by dispersion.
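The statement that such precursors are only detectable for nearby events can be checked with the inverse-square law, \(F = L / (4 \pi d^2)\), inverted for the maximum distance at a given flux threshold. In the sketch below the function names and the flux limit of \(10^{-7}\,\text{erg s}^{-1}\,\text{cm}^{-2}\) are hypothetical placeholders rather than the sensitivity of a specific monitor, and the Euclidean formula ignores redshift and bandpass corrections, which is adequate only for the local volume.

```python
import math

MPC_CM = 3.0856775814913673e24  # one megaparsec, cm

def flux_cgs(luminosity_erg_s, distance_mpc):
    """Isotropic-equivalent flux (erg / s / cm^2) at a Euclidean distance."""
    d_cm = distance_mpc * MPC_CM
    return luminosity_erg_s / (4.0 * math.pi * d_cm ** 2)

def horizon_mpc(luminosity_erg_s, flux_limit_cgs):
    """Largest distance (Mpc) at which the source stays above the flux limit."""
    return math.sqrt(luminosity_erg_s / (4.0 * math.pi * flux_limit_cgs)) / MPC_CM

FLUX_LIMIT = 1.0e-7  # assumed all-sky monitor threshold, erg / s / cm^2
for lum in (1.0e42, 1.0e45, 1.0e47):  # luminosity range quoted in the text
    print(f"L = {lum:.0e} erg/s -> horizon ~ {horizon_mpc(lum, FLUX_LIMIT):.3g} Mpc")
```

Under these assumptions even the brightest predicted precursors fall below threshold well inside cosmological distances, consistent with the argument that a precursor at redshift 0.9 cannot be explained by these models.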

GW observations will enable a resolution to this question. First, they select nearby events where the expected precursor brightness from theory may be detectable by existing or future GRB instruments. In some models the precursor emission is more isotropic than the jet, and does not necessarily require an associated prompt SGRB. Second, they provide the merger time. This will unambiguously determine if the observed SGRB precursors (relative to the main EM peak) occur before or after the GW merger time, more directly tying precursors to the theoretically-motivated regime or classifying them as prompt SGRB pulses.

Extended emission and X-ray plateaus

Extended emission describes an observed behavior of longer, lower flux tails following the main peak of some SGRBs. While the main peak of SGRBs is \(\lesssim 5\,\text{s}\), the extended emission can persist for up to \(\sim \) 100 s with the two components having comparable total fluence. This signature was first identified in BATSE data (Lazzati et al. 2001; Connaughton 2002) and has been found in BAT data (Norris and Bonnell 2006). BAT allows for the exclusion of extended emission down to stringent flux limits, and suggests it occurs in \(\gtrsim \)15% of SGRBs, but is not ubiquitous (Lien et al. 2016). Extended emission has rapid variability, tying it to late-time energy injection from the central engine. It could be powered by the spin-down energy of a fast-rotating magnetar which can naturally explain the relatively flat emission over the times of interest, corresponding to the Stable NS or SMNS remnant cases (e.g., Dai and Lu 1998; Gao and Fan 2006; Metzger et al. 2008b; Bucciantini et al. 2011; Fan et al. 2013; Lü et al. 2015). Matching observations may require significant energy losses to GW emission, which would be beneficial for future direct GW detections of long-lived remnants.
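The "relatively flat emission" expected from a magnetar can be seen in the standard vacuum-dipole spin-down law, \(L(t) = L_0 / (1 + t/\tau )^2\), with \(L_0 = B^2 R^6 \varOmega ^4 / (6 c^3)\) and \(\tau = E_{\mathrm{rot}} / L_0\). The sketch below is an idealized estimate: the function names, field strength, spin period, radius, and moment of inertia are assumed fiducial values, and it neglects GW losses, fallback, and the efficiency of converting spin-down power into radiation.

```python
import math

C_CM_S = 2.99792458e10  # speed of light, cm/s

def spin_down(b_gauss, period_s, radius_cm=1.0e6, inertia_g_cm2=1.0e45):
    """Return (initial dipole luminosity in erg/s, spin-down timescale in s)."""
    omega = 2.0 * math.pi / period_s
    e_rot = 0.5 * inertia_g_cm2 * omega ** 2
    l0 = b_gauss ** 2 * radius_cm ** 6 * omega ** 4 / (6.0 * C_CM_S ** 3)
    return l0, e_rot / l0

def luminosity(t_s, l0, tau_s):
    """Dipole spin-down luminosity: flat for t << tau, falling as t^-2 for t >> tau."""
    return l0 / (1.0 + t_s / tau_s) ** 2

# Assumed fiducial magnetar: 1e15 G polar field, 1 ms initial spin period.
L0, TAU = spin_down(1.0e15, 1.0e-3)
print(f"L0 ~ {L0:.2e} erg/s, tau ~ {TAU:.0f} s")
for t in (1.0, 10.0, 100.0, 1.0e3, 1.0e4):
    print(f"t = {t:7.0f} s -> L/L0 = {luminosity(t, L0, TAU) / L0:.3f}")
```

In this picture the plateau duration is set by \(\tau \propto B^{-2} P^{2}\), so fitting the observed plateau length and luminosity constrains the field and spin of the putative remnant.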

A somewhat similar plateau signature has been observed on top of the temporal decay of the X-ray afterglow in some SGRBs (e.g., Rowlinson et al. 2010) and in LGRBs; evidence for it may exist in up to half of SGRB afterglows (Rowlinson et al. 2013). These signatures can also be reasonably explained by a magnetar central engine (e.g., Gompertz et al. 2013). It may be possible to detect similar signatures from proto-magnetar winds outside of the observable prompt GRB line of sight (Sun et al. 2017). There are potentially two such detections already (Xue et al. 2019; Sun et al. 2019).

However, there are other models that can result in plateau emission. In the fall-back accretion scenario material is launched with some velocity away from the remnant, but remains gravitationally bound (Rosswog 2007; Kisaka and Ioka 2015). The variability seen can then arise from interactions of this material during fallback (e.g., Coughlin et al. 2020). Other models have been considered, such as a two-component jet model (e.g., Barkov and Pozanenko 2011; Matsumoto et al. 2020). Another explanation that arose with the increased consideration of structured jets following GRB 170817A is high latitude emission creating the observed plateaus (Oganesyan et al. 2020; Ascenzi et al. 2020). For each of these models there are additional predictions that will allow for exclusion in some cases, pending sufficient broadband follow-up detections.

Multimessenger observations could provide an unambiguous resolution to the origin of these non-thermal signatures. If magnetars are the origin then we should only expect these signatures following Stable NS and SMNS cases, corresponding to low-mass GW inspirals and bright blue kilonovae (Sect. 3.2). If they are observed in other cases, and incompatible with a late-time fall-back origin, then we must search for a different origin. It is also possible that there could be multiple causes for the observed plateau emission, which would require a larger number of multimessenger observations to fully understand.

X-ray flares in the afterglow

X-ray flares above the afterglow have also been observed, which differ from plateaus by having a distinct rise and fall (Burrows et al. 2007). Long-lived remnants with high magnetic fields could potentially explain this emission as well (e.g., Dai et al. 2006; Gao and Fan 2006); however, these signatures are more often explained via late-time fall-back accretion (e.g., Fan et al. 2005; Rosswog 2007; Kocevski et al. 2007). Time-resolved multiwavelength observations should be able to distinguish between these models (e.g., Lamb et al. 2019).

There are predicted differences between the progenitor systems, with NSBH mergers having up to an order of magnitude more fall-back material than in BNS mergers (Rosswog 2007). There should also be differences based on the properties of these systems, likely corresponding to the amount of tidal ejecta and being related to the mass ratio of the system. GW measurement of these intrinsic parameters and the multimessenger classification of progenitor system and BNS remnant type should confirm if observations follow expectations and determine if the X-ray flares are indeed caused by late-time fallback accretion.

Synchrotron self Compton

It is generally agreed that the radio to gamma-ray afterglow emission is synchrotron radiation from the external shock (Sari et al. 1998). From the conditions in GRB jets we generically expect Synchrotron Self Compton (SSC) emission. The first public claim of VHE detection of a GRB was for GRB 190114C (Mirzoyan et al. 2019), which has been modeled with an SSC origin (e.g., Fraija et al. 2019; Derishev and Piran 2019; Wang et al. 2019). However, no analysis published so far has performed robust multi-instrument spectral analysis showing a statistical preference for an SSC origin against a base synchrotron explanation, which may also fit the data. Regardless of this specific burst, the detection of SSC emission in GRBs would give phenomenal constraints on several microphysical parameters which would inform a wide range of GRB studies. These would require sensitive VHE observations as close to the onset of prompt emission as possible. These constraints will likely be most sensitive for LGRB observations, but the GW identification of nearby SGRBs and the upcoming CTA provide a promising combination to seek SSC emission from a NS merger. The ideal scenario would be distributed CTA coverage of the highest probability region from a GW early warning localization.

Kilonovae and the origin of heavy elements

The origin of the elements is among the most basic questions in existence. As discussed in Sect. 6, Hydrogen, Helium, and Lithium were produced during Big Bang nucleosynthesis. Despite 13.8 Gyr of the production of all other elements, these are still the most common by an overwhelming margin. Some of these atoms coalesced into the first stars. Stellar fusion combines the light elements into heavier elements through well understood nuclear reactions. In massive stars these reactions progress to heavier elements until iron, beyond which fusion becomes endothermic. Eventually the star will explode and release copious amounts of elements from carbon through the \(\sim \) fifth row of the periodic table. Boron, Beryllium, and nearby elements are created mostly from cosmic ray spallation.

The heavy elements, those beyond iron, are created by slow and rapid neutron capture processes. The s-process (s for slow) occurs mostly in asymptotic giant branch stars where, over thousands of years, neutrons can be captured into iron seeds from prior supernovae and create heavier elements (Johnson 2019). Here beta-decay is more rapid than the neutron capture. The reverse is true in the r-process (r for rapid), responsible for the heaviest elements including most of the lanthanides and all of the actinides (Burbidge et al. 1957; Cameron 1957), which generally requires material with particularly high neutron density and a low electron fraction. The heaviest (stable) elements must have more neutrons than protons to overcome the massive Coulomb repulsion or else they will radioactively decay to lighter elements. For a recent review on the origin of the heaviest elements see Cowan et al. (2019).

In all the universe, the highest neutron density occurs in NSs. It seems reasonable to investigate the violent births and deaths of NSs as potential r-process generation sites. For a long time the leading candidates for r-process element production were CCSNe (e.g., Meyer et al. 1992; Woosley et al. 1994). However, as simulations improved they showed the large neutrino irradiation of the material shifting the electron fraction to higher values, preventing the formation of significant amounts of lanthanides and actinides (e.g., Martínez-Pinedo et al. 2012; Roberts et al. 2012; Wanajo 2013). There are more complicated scenarios that could potentially resolve these issues. For a very nice summary of the current understanding of r-process sites, particularly with respect to common and rare CCSNe, we refer to the Supplementary Methods in Siegel et al. (2019).

The necessary enrichment rate to reproduce the amount of heavy elements (\(A > 140\)) in the Milky Way, as inferred from the solar system abundances, is \(\sim 2 \times 10^{-7}\,M_{\odot} /\text{yr}\) (Qian 2000). With a fiducial rate of CCSNe per Milky Way-like galaxy of 2.84 per century (Li et al. 2011), or 0.0284/yr, if CCSNe do produce r-process elements the lanthanide yield of individual events must be low, giving an effectively constant enrichment of the heavy elements. There is observational evidence that tends to argue against such a scenario as the dominant r-process site. The first comes from observations of \({}^{244}\)Pu in the ocean floor at two orders of magnitude below the expected value from constant r-process enrichment, favoring a rare process (Wallner et al. 2015). Such a measurement relies on using the radionuclide as a natural clock. A second key piece of evidence was the detection of heavy neutron-capture elements in several stars in the dwarf spheroidal galaxy Reticulum II with abundances two orders of magnitude higher than in other such galaxies, again arguing against (relatively) common low-yield events (Ji et al. 2016), with inferred total production capable of reproducing the total r-process production of the Milky Way, suggesting a common origin (Beniamini et al. 2016). The actinide abundances in the early solar system also favor a rare origin (Côté et al. 2019; Bartos and Marka 2019).

Ripping apart NSs promises a neutron-dense, low electron fraction environment. Lattimer and Schramm (1974) were the first to suggest NSBH mergers as r-process sites, followed by Symbalisty and Schramm (1982) suggesting BNS mergers. Freiburghaus et al. (1999) demonstrated the first simulations showing NS mergers could roughly reproduce the observed relative elemental abundances, a result which has been confirmed as simulations have improved. With the apparent r-process production problems in CCSNe and observations favoring rare, high-yield sites, NS mergers became prime candidates for the dominant r-process sites owing to their much lower rate (\(\sim \) 1 per 10,000 years) in the Milky Way. For a review with a historical discussion on the r-process origin and the role of NS mergers see Metzger (2020).
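As a rough illustration of why a lower event rate implies a higher yield per event, the required average yield is the Galactic enrichment rate divided by the event rate. The sketch below uses only the values quoted above. The function name and the simplifying assumption that a single source class supplies all of the \(A > 140\) material are ours.

```python
# Back-of-the-envelope: average heavy-element (A > 140) yield per event needed
# for one source class to supply the whole Milky Way enrichment rate.
# Assumption (ours): a single class of events dominates the production.

ENRICHMENT_RATE = 2e-7  # Msun/yr, required Galactic enrichment (Qian 2000)


def required_yield_per_event(events_per_year):
    """Average yield per event (Msun) needed to sustain ENRICHMENT_RATE."""
    return ENRICHMENT_RATE / events_per_year


ccsn_rate = 2.84 / 100.0      # per yr, fiducial CCSN rate (Li et al. 2011)
merger_rate = 1.0 / 10_000.0  # per yr, approximate NS merger rate quoted above

ccsn_yield = required_yield_per_event(ccsn_rate)
merger_yield = required_yield_per_event(merger_rate)

print(f"CCSNe:      {ccsn_yield:.1e} Msun per event")
print(f"NS mergers: {merger_yield:.1e} Msun per event")
```

Under these assumptions the per-event yields differ by the ratio of the two rates, a factor of a few hundred, which is the sense in which NS mergers are rare, high-yield candidates.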

The identification of KN170817 following GW170817 with the broadly expected behavior for a kilonova was the first firm detection of r-process nucleosynthesis. With the inferred ejecta mass from KN170817 and the GW-determined local NS merger rate it appears that NS mergers can be the dominant r-process sites, though large uncertainties remain (Côté et al. 2018). It appeared then, that we had a reasonably consistent understanding of the origin of heavy elements from theory, simulation, and observation.

Then, Siegel et al. (2019) decided to complicate things, by using knowledge gained from KN170817 to re-energize an old suggestion (e.g., Pruet et al. 2004). We briefly summarize their arguments. The observed properties of KN170817 suggest the dominant ejection method came from accretion disk outflows, from a total disk mass \(\sim 0.1\,M_{\odot} \). LGRBs originate from collapsars, which are fast-rotating massive stars that undergo core-collapse, and are powered by accretion disks with characteristic mass \(\sim 3\,M_{\odot} \). In short, the thick disk can maintain an electron fraction (in cases with a BH central engine) sufficiently low to produce actinides. Despite collapsars being rarer than NS mergers (by a factor of a few, based on the inferred LGRB and SGRB rates) the higher yields (more than an order of magnitude more) of collapsars suggest they have been the dominant r-process production sites over the lifetime of the Universe. The viability of this explanation based on current observational evidence is the subject of on-going work (e.g., Macias and Ramirez-Ruiz 2019; van de Voort et al. 2020).
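The argument amounts to comparing rate times yield for the two classes. A minimal sketch follows, assuming a rate ratio of 3 as a stand-in for "a factor of a few" and using the ratio of the quoted disk masses as a proxy for the yield ratio. Both choices are illustrative assumptions, not values taken from Siegel et al. (2019).

```python
# Relative r-process contribution of two source classes ~ (rate ratio) x (yield ratio).
# Assumptions (ours): ejected r-process mass scales with disk mass, and
# "a factor of a few" rarer is taken to be a factor of 3.

ns_disk_mass = 0.1         # Msun, disk mass inferred for KN170817
collapsar_disk_mass = 3.0  # Msun, characteristic collapsar disk mass

yield_ratio = collapsar_disk_mass / ns_disk_mass  # collapsar / NS merger
rate_ratio = 1.0 / 3.0                            # collapsar / NS merger (assumed)

relative_contribution = rate_ratio * yield_ratio
print(f"collapsar / NS merger contribution ~ {relative_contribution:.0f}")
```

With these inputs the higher yield outweighs the lower rate by roughly an order of magnitude, and the result scales linearly with whichever rate ratio is adopted.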

As noted in Siegel et al. (2019), collapsars would be consistent with the observational evidence that supports rare, high-yield production sites. They predict an infrared signature somewhat similar in evolution to a red kilonova (though from a much larger ejecta mass) that would follow LGRBs and could be detectable by sufficiently sensitive infrared telescopes such as the James Webb Space Telescope (JWST). Should this signature be observationally identified then delineating between the relative importance of collapsars and NS mergers will require a more precise yield measurement for each class, their distributions, as well as their relative rates through cosmic time. More generally, there are other suggested rare types of supernovae that would produce high yields of the heavy elements, also discussed in Siegel et al. (2019). We use collapsars as the representative case but note most tests of the two options apply to the larger case of rare supernovae.

In Sect. 5.1 we discuss the nucleosynthetic yield of NS mergers, both relative and absolute abundances, and prospects for improving our understanding through simulation and observation of kilonovae. Section 5.2 discusses prospects for determining the current lanthanide and actinide enrichment of our own galaxy. Section 5.3 ties these observations to the source evolution of the potential r-process sites and determination of the dominant sites as a function of time through cosmic history.

Heavy element production in candidate r-process sites

There are at least three important observational constraints that r-process sites must explain: they must be able to reproduce the relative and absolute heavy element abundances, and they need to be able to explain the varied r-process enrichment in stars. This latter constraint was relied upon to narrow the candidate sites to NS mergers and rare types of supernova. The relative values can be inferred from the observed solar system abundances, predicated on the assumption that we do not live in an unusual place. Lower electron fractions may not reproduce the low-mass heavy elements (e.g., iron to lanthanides), and higher electron fractions \(Y_e \gtrsim 0.3\) cannot reproduce lanthanides and actinides. Wanajo et al. (2014) first demonstrated with high fidelity simulations that NS merger ejecta composed of varying electron fraction successfully reproduce the full range of r-process elements. However, despite the optimal dataset of KN170817 there were suggestions, but no unambiguous observational proof of production of the heaviest elements (corresponding to the third r-process abundance peak) and an answer may require capabilities that do not yet exist (Kasliwal et al. 2019).

From the arguments suggesting collapsars as potential r-process sites and as checked with initial simulation, collapsars show similar capability to reproduce the observed abundances (Siegel et al. 2019). As stated, most simulations of standard CCSNe scenarios appear unable to reproduce the observed relative abundance pattern. It is a reasonable assumption that the dominant r-process production sites should produce these elements with the relative abundances that are observed in the solar system. We assume this is true in ensemble (i.e., for the average production from these events, but not necessarily for every individual event) when discussing absolute production.

A great deal of simulation work has been performed to tie the observed UVOIR behavior to the ejecta properties (e.g., Barnes and Kasen 2013; Barnes et al. 2016; Tanaka 2016; Metzger 2020, and references therein). With prior kilonova candidates (e.g., Perley et al. 2009; Tanvir et al. 2013; Berger et al. 2013; Gompertz et al. 2018; Ascenzi et al. 2019) the data was insufficient to reliably constrain elemental production of individual NS mergers (with some published claimed kilonova signatures relying on a single data point), especially after accounting for the Malmquist bias towards detecting brighter events (and thus inferring higher average yield per event than the true value).

In the first detection of a kilonova following a GW detection the observers hit the limit of precision of existing models. Villar et al. (2017) collated the UVOIR data reported by various groups for KN170817; the results are shown in Fig. 15. Like most authors they identify a red and a blue component, but they favor the addition of a third component with opacity in between the other two.

Fig. 15. The combined UVOIR lightcurves for KN170817. The data comes from several groups. The three-component fit using the toy model from Metzger (2020) is overlaid with solid lines. Image reproduced with permission from Villar et al. (2017), copyright by AAS

We list some of the complications with inferring ejecta properties from the current kilonova models and methods to improve these uncertainties. This is not a criticism of these works. They combined several complicated processes into software frameworks that run sets of efficient simulations to predict the signatures of kilonovae before one was ever observed and studied in detail, sometimes hitting the limits of human knowledge itself. The discussion here is meant to show where progress will have to be made in the next few years to determine the true nuclear production in these events.

In order for kilonova models to be easily utilized to infer ejecta properties from observations they need to provide a range of considered parameter values for comparison. We discuss only two examples out of several options. The kilonova models used in Villar et al. (2017) are constructed from the toy model presented in Metzger (2020), allowing for a broad range of considered ejecta parameters. Kasen et al. (2017) generated a set of models covering a reasonable parameter space using full radiative transport, reproducing KN170817 with a specially tailored model, presenting a range of specific models that data can be compared to. They are broadly similar in behavior, but some important differences remain, e.g., the predicted early UV flux. Coughlin et al. (2018) generated an effective method to interpolate between the available grid models from Kasen et al. (2017), providing an important step towards tying observations to simulations.

Laboratory astrophysics is critical. Early work on tying ejecta parameters to lightcurves predicted a bright blue kilonova with a peak timescale of about a day by assuming iron-like opacities (Metzger et al. 2010). Using more realistic opacities for ejecta with lanthanides and actinides results in values orders of magnitude higher, which prevents the quasithermal emission from escaping for longer times, resulting in a redder kilonova with lower peak emission on the timescale of a week (Kasen et al. 2013; Barnes and Kasen 2013; Tanaka and Hotokezaka 2013). The dominant contribution to the UVOIR opacities is the bound-bound transitions of the lanthanides and actinides. As discussed in Sect. 2.1.5, the opacities in these papers are calculated from reasonable approximations because we lack the atomic orbital information for these heavy elements which determine the bound-bound transitions. Over the past few years we have improved laboratory and computational determination of these values, work which is critical to improving our estimates of ejecta properties in kilonovae. However, we still do not have key information on individual atoms and much uncertainty remains on how to calculate the ensemble opacities (see discussions in Metzger 2020, and references therein).
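The sensitivity of the peak timescale to opacity can be illustrated with the commonly used one-zone diffusion estimate, \(t_{\text{peak}} \approx \sqrt{3 M \kappa / (4\pi \beta v c)}\), in which the peak time scales as \(\kappa^{1/2}\). The sketch below is a toy calculation and not a substitute for the radiative transport models cited above. The ejecta mass, velocity, opacity values, and the density-profile factor \(\beta \approx 3\) are illustrative assumptions.

```python
import math

# Diffusion estimate of the kilonova peak time,
#   t_peak ~ sqrt(3 M kappa / (4 pi beta v c)),
# a one-zone approximation (beta ~ 3 is an assumed density-profile factor).
# The ejecta mass, velocity, and opacities below are illustrative, not fitted values.

M_SUN_G = 1.989e33   # solar mass in grams
C_CM_S = 2.998e10    # speed of light in cm/s
DAY_S = 86400.0      # seconds per day


def peak_time_days(mass_msun, velocity_c, kappa_cm2_g, beta=3.0):
    """Approximate time of peak light in days for a one-zone ejecta shell."""
    mass_g = mass_msun * M_SUN_G
    velocity = velocity_c * C_CM_S
    t_peak_s = math.sqrt(3.0 * mass_g * kappa_cm2_g
                         / (4.0 * math.pi * beta * velocity * C_CM_S))
    return t_peak_s / DAY_S


# Same ejecta (0.01 Msun at 0.1c), three assumed opacities in cm^2/g
for label, kappa in [("iron-like", 0.1), ("intermediate", 1.0), ("lanthanide-rich", 10.0)]:
    t_peak = peak_time_days(0.01, 0.1, kappa)
    print(f"{label:16s} kappa = {kappa:5.1f}  t_peak ~ {t_peak:.1f} d")
```

In this estimate a hundredfold increase in opacity lengthens the peak time by a factor of ten, which is the qualitative shift from a roughly day-long to a roughly week-long transient described above.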

Similarly, our current understanding of nuclear physics with regards to the heaviest elements, particularly those far from the region of stability, also limits the accuracy of kilonova lightcurve models. For example, Barnes et al. (2016) investigate a few nuclear mass models to check abundance yields which produce variations in the relative elemental abundances, particularly in the actinides, as well as the fraction of total radioactive energy carried by different decay species as a function of time. The relative \(\alpha \), \(\beta \), and fission decay differences between nuclear models determine the amount of energy deposited into these products, including neutrinos which can escape very quickly, and gamma-rays which can escape before the peak luminosity time. This alters the thermalization efficiency, i.e. how much energy is converted into heat rather than lost, of the radioactivity as a function of time, which affects the lightcurves and thus our inferences of the total ejecta mass (Hotokezaka et al. 2016b; Barnes et al. 2016). Fortunately, upcoming atom smashers, particularly the Facility for Rare Isotope Beams (Balantekin et al. 2014), which has astrophysics as a core science goal, will help improve our understanding of the heaviest elements over the next several years.

The simulations themselves make different assumptions and contain different approximations. They vary the assumed velocity gradients of the ejecta, neutron capture fraction, neutrino treatments, radiative transport schemes, nuclear model, opacities, thermalization efficiencies, magnetic fields, entropies, grid formulations and resolution, NS EOS, etc. (e.g., Tanaka 2016; Wollaeger et al. 2018; Kawaguchi et al. 2020; Metzger 2020, and references therein). Over the years papers have been published to resolve the importance of these different assumptions, which has led to significant improvements in the accuracy of the models and, in general, trends within models based on different input parameters. Examples include that longer-lasting remnants in BNS mergers result in more and bluer ejecta, that increasing lanthanide and actinide fraction results in redder kilonovae, and that the same kilonova can appear with different color based on the inclination angle. However, uncertainty remains with respect to absolute behavior.

As a particular example we consider the magnetically-driven disk winds. Siegel and Metzger (2017) investigated these outflows using 3D GRMHD simulations with approximate neutrino transport, running for \(\sim \) 0.4 s. Extrapolating beyond the end time they conclude that these outflows could eject similar amounts of matter as the viscously driven outflows. Fernández et al. (2018) ran simulations for several seconds, providing direct evidence for those conclusions and additionally suggesting this ejecta could produce a kilonova precursor signal. Miller et al. (2019a) consider full 3D GRMHD simulations with full neutrino transport which significantly altered the electron fraction of the ejected material, suggesting these outflows could power a blue kilonova.

With each increase in fidelity the conclusions were strengthened or even altered, and this may be expected to continue for some time. As an important example, each still assumes idealized initial conditions of the magnetic field. MHD instabilities can significantly amplify magnetic fields (e.g., Balbus and Hawley 1991) and their topology is not necessarily simple. Kiuchi et al. (2014) and Kiuchi et al. (2015) study the magnetic fields that develop in BNS and NSBH mergers respectively, showing strong and complicated fields in both cases, but with different topology. Using the information from careful merger simulations as initial conditions for studies focusing on post-merger effects, as in Nouri et al. (2018), may provide more accurate results. These uncertainties currently limit our ability to infer ejecta parameters from kilonova lightcurves.

Most kilonova models have assumed spherical symmetry for simplicity, but accounting for more realistic spatial distribution results in inclination effects on the observed lightcurves for the same event (e.g., Kasen et al. 2017; Wollaeger et al. 2018). In measuring the ejecta properties of the components in KN170817 nearly every group assumed spherical symmetry for each contributing ejecta region, which is not necessarily a good assumption (e.g., Metzger 2020). If a red kilonova emitting region is between a blue kilonova emitting region and the observer the blue emission will be blocked by the bound-bound opacity of the intervening material (Kasen et al. 2017). Even if the view is unobstructed the spatial distribution can alter the inferred ejecta properties. Indeed, accounting for the expected equatorial distribution for the lanthanide-rich material and the polar distribution for the lanthanide-free material for KN170817 suggests a lower overall yield, removing some of the tension with kilonova simulations (Kawaguchi et al. 2020). This is also considered in Bulla (2019) with the additional consideration of the polarization which may provide key additional information.

These are further complicated by the intrinsic variations in mergers themselves. The progenitor system and different immediate remnant cases have vastly different ejecta morphology, velocity, opacity, neutrino irradiation, etc. Within each case the mass ratios, spins, and other intrinsic parameters also cause variation in the observational signature. These are further complicated by potential additional sources of energy and heat into the kilonova, like late-time fallback accretion onto the remnant object (see Sect. 4.7). Astrophysical observations of a large population of varied NS mergers will be particularly helpful in understanding these effects.

Assuming our general understanding is correct, NSBH mergers could release no matter or up to \(0.1\,M_{\odot} \) of lanthanide and actinide-rich ejecta. BNS mergers that undergo prompt collapse will produce similar elements, but in lower abundances. Though, these cases could produce the lighter elements if fast magnetically-driven disk outflows occur. HMNS could release the full range of beyond-iron elements with higher mass elements from the tidal and disk wind and lower mass elements in the polar ejecta, perhaps up to \(\sim 0.05\,M_{\odot} \) based on KN170817. Stable NS and SMNS remnants can release \(0.1\,M_{\odot} \) of the lower mass beyond-iron elements, but only a smaller portion of lanthanides and actinides. To understand the enrichment of heavy elements from NS mergers we will likely need to determine the distribution of yield for these different cases, as well as how often these cases occur.

Fig. 16. The periodic table showing the heavy elements produced in NS mergers. The color shading is a simplified representation of the wavelengths that probe production of that element, with violet representing UV and near-UV, light blue representing optical and some NIR, and red showing NIR and IR. Figure from Judy Racusin (private communication)

In order to both precisely test existing models and to accurately infer the ejecta properties for a given event, UVOIR observations of GW-detected NS mergers are absolutely critical. Figure 16 shows a basic representation of the elemental yield probed by the different wavelengths. UV observations will help understand the unusual excess seen in KN170817 (Sect. 3.4) which will separate out the contributions of radioactive heating from other potential sources and enable more accurate inferred mass yields. The discovery of the arcsecond position of EM counterparts will almost certainly be dominated by optical observations. Infrared observations uniquely probe the contributions of lanthanide and actinide-rich ejecta, and provide the latest observations of these events. A full understanding of these sources requires the broadband observations from early to late times, noting that limited band observations can be consistent with multiple parameter combinations. GW detections provide information on the intrinsic parameters which, with the multimessenger determination of the merger remnant (Sect. 3.2), will enable a broad understanding of these sources. The inclination information will be particularly helpful in understanding inclination effects.

Given the complicated nature of these events and our models to understand them, direct determination of nucleosynthetic yield would be helpful. Nuclear gamma-rays can escape beginning a few hours after merger and can carry tens of percent of the total energy of the system (e.g., Hotokezaka et al. 2016b). The emission would be concentrated from a few dozen keV to a few MeV, bright for a few days, and be a relatively flat spectrum due to Doppler broadening. Such a detection would provide another handle on the ejecta properties that is not dependent on a number of assumptions that the UVOIR determination is. However, this is beyond the capability of existing instruments, and likely beyond the capability of proposed instruments unless we are lucky (Timmes et al. 2019).
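The flattening of the spectrum follows from the ejecta speed: each nuclear line is smeared over a band set by the Doppler factor, and many overlapping smeared lines blend into a continuum. A minimal sketch, assuming a hypothetical 1 MeV rest-frame line and expansion speeds of 0.1 to 0.3c, both chosen here purely for illustration:

```python
import math


def doppler_energy_range(e_rest_kev, beta):
    """Observed energy range (keV) of a line from ejecta expanding at beta = v/c.

    Uses the relativistic Doppler factor for material moving directly away
    from (redshifted) and directly toward (blueshifted) the observer.
    """
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    e_red = e_rest_kev / (gamma * (1.0 + beta))
    e_blue = e_rest_kev / (gamma * (1.0 - beta))
    return e_red, e_blue


E_REST_KEV = 1000.0  # hypothetical 1 MeV line, not a specific isotope

for beta in (0.1, 0.2, 0.3):  # assumed ejecta speeds in units of c
    e_red, e_blue = doppler_energy_range(E_REST_KEV, beta)
    print(f"v = {beta:.1f}c: line spread over {e_red:.0f} to {e_blue:.0f} keV")
```

The spread is of order \(2v/c\) times the line energy, so at these speeds a single MeV line covers hundreds of keV.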

Alternatively, one could potentially measure yields of individual elements. This can be direct spectroscopic measurements of individual absorption lines with sensitive IR telescopes weeks after merger when the ejecta has sufficiently slowed to minimize Doppler broadening. Late-time temporal decay in the infrared may be dominated by the decay of individual (or a few) isotopes. These prospects are reviewed in Metzger (2020), who suggests the approaches are promising, though some uncertainty remains.

Observationally measuring or constraining the lanthanide production in collapsars appears phenomenologically similar to that of NS mergers. Siegel et al. (2019) argue a late-time infrared signature following LGRB detections would arise if they are significant r-process sites. Then, similar modeling to tie the observed light curves to the ejecta properties is required. This may be the only observable signature as the Milky Way is generally too metal-rich for collapsars to occur, preventing study of nearby LGRB remnants.

On-going heavy element nucleosynthesis in the Milky Way

Combining yields from individual events with the GW-determined volumetric NS merger rate measures the local heavy element production from these events. This rate currently has an order of magnitude uncertainty in the 90% range, which should rapidly shrink over the next few years. With the inferred ejecta for KN170817 and the merger rates in the Milky Way from Table 1, BNS mergers alone can robustly create the r-process elements in the Milky Way at the rate required to be the dominant site of r-process.

However, as we begin to constrain the yield distribution of BNS mergers, better constrain the local rate of NS mergers, and determine the relative contribution of NSBH mergers, we will have to consider additional effects. From Tunnicliffe et al. (2013) about 30% of SGRBs are hostless, implying no nearby potential galaxy to deep observational limits. From Fong et al. (2015) some of the SGRBs with reliable hosts also appear to be significantly outside of the galaxy itself. This implies that a few tens of percent of NS mergers are nearly or totally unbound from their host galaxy. Then, the nucleosynthetic yield of these mergers will not contribute to the observed abundances in their galaxies, and we should expect a similar effect for BNS and NSBH systems born in the Milky Way. This consideration does not apply to either CCSNe or collapsars which should track the stellar mass within the galaxy.
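The effect of unbound mergers on the inferred enrichment can be sketched with a single retention factor. The example below reuses the Galactic enrichment rate (\(\sim 2 \times 10^{-7}\,M_{\odot}/\text{yr}\)) and approximate NS merger rate (\(\sim \) 1 per 10,000 years) quoted earlier in this section. The function names, the grid of unbound fractions, and the assumption that unbound mergers contribute nothing to Galactic abundances are ours.

```python
# Effect of host-unbound mergers on the yield needed to sustain Galactic enrichment.
# Assumption (ours): ejecta from unbound mergers do not enrich the galaxy at all.

TARGET_RATE = 2e-7   # Msun/yr, required Galactic enrichment quoted earlier
MERGER_RATE = 1e-4   # per yr, approximate Milky Way NS merger rate quoted earlier


def retained_enrichment(yield_msun, unbound_fraction, rate_per_yr=MERGER_RATE):
    """Enrichment rate (Msun/yr) that stays in the galaxy."""
    return rate_per_yr * yield_msun * (1.0 - unbound_fraction)


def required_yield(unbound_fraction, rate_per_yr=MERGER_RATE, target=TARGET_RATE):
    """Average yield per merger (Msun) needed to reach the target enrichment."""
    return target / (rate_per_yr * (1.0 - unbound_fraction))


for fraction in (0.0, 0.1, 0.3, 0.5):  # illustrative unbound fractions
    print(f"unbound fraction {fraction:.1f}: "
          f"required yield ~ {required_yield(fraction):.2e} Msun per merger")
```

The required yield grows as \(1/(1 - f)\), so an unbound fraction of a few tens of percent raises it by a comparable percentage, a modest correction compared with the current uncertainty on the merger rate.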

The use of radionuclides can uncover recent nucleosynthesis in our own galaxy. That is, explosive nucleosynthesis results in radioactive isotopes. With nuclear reaction networks we can calculate the expected isotopic ratios of some key elements as a function of time for various initial relative abundances. These natural clocks allow constraints on past explosions in the Milky Way. These studies usually rely on recent supernovae (SNe), supernova remnants, or observations of diffuse radioactive emission. Wu et al. (2019) consider diffuse emission from NS mergers, suggesting they are well beyond the capability even of any proposed telescope.

Searches for KNR in the galaxy have been proposed (Wu et al. 2019; Korobkin et al. 2020), suggesting detections are possible with proposed \(\sim \) MeV gamma-ray telescopes. The detection of \(^{126}\)Sn lines would identify a past r-process production site, likely limited to events occurring in the last \(\sim \) Myr. Detection of additional lines would enable constraints on the age of the remnant and the relative production of actinides.

Distinguishing between the potential r-process sites could be done through spatial information and yield determination. If the events occur outside of the galactic plane it will favor a BNS/NSBH merger origin; otherwise a rare CCSNe origin. Events with low initial yields will favor basic CCSNe. Events with incredible yields (\(\sim 1\,M_{\odot} \)) would favor a collapsar origin but we do not expect to identify these in the Milky Way. Events with yields \(\sim 10^{-2}{-}0.1\,M_{\odot} \) would favor a NS merger origin. Delineating between BNS and NSBH merger remnants may be difficult unless multiple lines are detected. In general, the inferred actinide fraction will be informative, with NSBH mergers generally requiring a high value. Most BNS cases do not. The exception is the prompt collapse scenario which may be difficult to distinguish from an NSBH merger with low (\(\sim 0.01\,M_{\odot} \)) initial ejection. Being able to reliably determine what the origin of the r-process site is would require a MeV telescope with line sensitivities a factor of a few better than the current advanced proposals.
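The qualitative criteria above can be written as a small decision helper. This is only a sketch: the text gives representative yields, not sharp boundaries, so the thresholds separating the regimes and the function name are hypothetical choices of ours. Location and yield are reported separately because they can point to different origins.

```python
def site_indicators(yield_msun, in_galactic_plane):
    """Return which r-process site the location and the yield each favor.

    The boundaries 5e-3 and 0.5 Msun are illustrative assumptions; the text
    only quotes ~1e-2 to 0.1 Msun for NS mergers and ~1 Msun for collapsars.
    """
    location_favors = "rare CCSN" if in_galactic_plane else "BNS/NSBH merger"

    if yield_msun >= 0.5:        # "incredible" yields of order 1 Msun
        yield_favors = "collapsar"
    elif yield_msun >= 5e-3:     # roughly the 1e-2 to 0.1 Msun range
        yield_favors = "NS merger"
    else:                        # low initial yield
        yield_favors = "basic CCSN"

    return {"location_favors": location_favors, "yield_favors": yield_favors}


# Hypothetical remnant: 0.03 Msun of r-process material found out of the plane
print(site_indicators(0.03, in_galactic_plane=False))
```

A real analysis would also fold in the actinide fraction and the number of detected lines, which this sketch ignores.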

The other method of direct isotopic determination is through careful cosmic ray studies. Binns et al. (2019) argue that uncovering the relative isotopic abundances of the actinides and comparison of their ratios would constrain the rarity of the currently dominant r-process sites, similar to the constraints of observing \(^{244}\)Pu on the sea floor. They discuss this specifically delineating between base CCSNe and BNS mergers.

The heavy element enrichment history of the Universe

The prior subsection discusses how to resolve the dominant r-process site in the current time. This answer may differ from the site that has produced most of the lanthanides and actinides that now exist. That is, current elemental abundances in the solar system are the cumulative effect of all prior r-process events in the Milky Way. We know that the rates of BNS mergers were higher in the past than they are today (e.g., Berger 2014). The peak rates for CCSNe occur earlier, and the rates of collapsars earlier still. Then, the relative contributions of each potential source vary through the history of the Universe.

The best understood source evolution of these potential sites is CCSNe. Stars that undergo core collapse are massive and have short lifetimes, measured in tens of millions of years, or less than 1% the age of the universe. Their creation should largely track the cosmic star formation history which peaked at roughly \(z \approx 1.9\) when the universe was \(\sim \) 3.5 Gyr old (see e.g., Madau and Dickinson 2014b; Hopkins and Beacom 2006). The e-folding scale is \(\sim \) 3.9 Gyr, suggesting half the stellar mass was created before \(z \approx 1.3\). These are effectively the source evolution of CCSNe, with the normalization determined by the current local rate.
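The quoted peak redshift can be reproduced from a fitting function for the cosmic star formation rate density. The sketch below assumes the widely quoted best-fit form from Madau and Dickinson (2014), \(\psi (z) = 0.015\,(1+z)^{2.7} / \{1 + [(1+z)/2.9]^{5.6}\}\) in \(M_{\odot}\,\text{yr}^{-1}\,\text{Mpc}^{-3}\). The coefficients are given as we recall them and should be checked against that paper before any quantitative use.

```python
def cosmic_sfr_density(z):
    """Assumed Madau & Dickinson (2014) fit, in Msun / yr / Mpc^3."""
    return 0.015 * (1.0 + z) ** 2.7 / (1.0 + ((1.0 + z) / 2.9) ** 5.6)


# Locate the maximum on a fine redshift grid, z = 0 ... 8 in steps of 0.001
redshifts = [i * 0.001 for i in range(8001)]
z_peak = max(redshifts, key=cosmic_sfr_density)

print(f"SFR density peaks near z = {z_peak:.2f}")
print(f"peak / present-day ratio = "
      f"{cosmic_sfr_density(z_peak) / cosmic_sfr_density(0.0):.1f}")
```

Under the assumption that CCSNe promptly follow star formation, the same curve, rescaled to the local CCSN rate, serves as their source evolution.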

In the early universe the source evolution of collapsars should track that of CCSNe (and thus the stellar formation evolution). However, overall, collapsars do not track the environments of CCSNe (Fruchter et al. 2006). It is an empirical fact that collapsars strongly prefer low metallicity environments. Given the increase in average metallicity as the universe ages due to elemental enrichment from supernovae (and other processes), the peak collapsar rate should occur earlier than the peak Star Formation Rate (SFR). This has been confirmed observationally, suggesting a peak rate before \(z \approx 2{-}3\) (e.g., Langer and Norman 2006; Wanderman and Piran 2010).

As previously discussed the formation of BNS and NSBH mergers likely follows the SFR evolution, as they are thought to originate in field binaries of stars that undergo CCSNe, but they have long inspirals that delay the merger times. The observed peak rate is around (or greater than) \(z \approx 0.5{-}0.8\) (e.g., Berger 2014), or when the universe was about half its current age.
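The conversion between redshift and cosmic age used in these statements can be checked with the analytic age of a flat \(\Lambda \)CDM universe, \(t(z) = \frac{2}{3 H_0 \sqrt{\Omega_\Lambda}}\,\text{asinh}\left[ \sqrt{\Omega_\Lambda / \Omega_m}\,(1+z)^{-3/2}\right] \). The sketch below assumes \(H_0 = 70\) km/s/Mpc and \(\Omega_m = 0.3\) and neglects radiation. These parameter values are illustrative and are not necessarily those adopted in the cited works.

```python
import math


def age_gyr(z, h0=70.0, omega_m=0.3):
    """Age of a flat LambdaCDM universe at redshift z in Gyr (radiation neglected).

    h0 is in km/s/Mpc; the default cosmological parameters are assumed values.
    """
    omega_l = 1.0 - omega_m
    hubble_time_gyr = 977.8 / h0  # 1/H0 in Gyr when H0 is in km/s/Mpc
    x = math.sqrt(omega_l / omega_m) * (1.0 + z) ** -1.5
    return (2.0 / (3.0 * math.sqrt(omega_l))) * hubble_time_gyr * math.asinh(x)


age_now = age_gyr(0.0)
for z in (0.5, 0.8, 1.9):  # NS merger peak range and SFR peak quoted in the text
    age = age_gyr(z)
    print(f"z = {z:.1f}: age ~ {age:.1f} Gyr ({age / age_now:.0%} of the present age)")
```

For production work a dedicated cosmology package would be preferable. The closed form is used here only to keep the example self-contained.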

Then, we can discuss the relative importance of these sites through cosmic time. If collapsars are important r-process producers they are almost certainly the dominant sites for the first several billion years of the Universe. Heavy elements before a redshift of \(\sim \) 3 are likely attributable to these sources. If CCSNe are r-process sites then they are likely most important around the times of peak SFR, potentially still being sub-dominant during that time. NS mergers of either type are likely to be important in the latest half of the universe and currently the dominant sites, with BNS and NSBH mergers having different yields per event and likely different source evolution.

These studies will have to be done in concert with studies of ancient elemental enrichment (e.g., Macias and Ramirez-Ruiz 2019; Johnson et al. 2019), and as we improve our determination of the SFR. These are key questions in astrophysics and we can rely on continued investment in these areas. We should seek to determine the SGRB source evolution through follow-up observations of prompt signals detected by more sensitive telescopes as a proxy for NS merger source evolution. Identical instruments can provide the same for LGRBs as a proxy for collapsar evolution. We support the use of JWST to seek the infrared lanthanide signature in follow-up of LGRBs.

Standard sirens and cosmology

Cosmology is the study of the Universe on the grandest scales, using observations of the past to understand how it began, how it evolved to its present state, and how it will end. For much of recorded history humanity largely believed in a Geocentric Universe. Copernicus moved us to Heliocentrism through mathematical description. This world view stood until the onset of observational cosmology, little more than a century ago.

Standard candles are EM sources with known intrinsic luminosities, which enable us to determine their distance from the observed brightness and known \(1/d^{2}\) behavior. Cepheid variables were the first known standard candles with luminosities described by the Leavitt Law (Leavitt 1908; Leavitt and Pickering 1912). Harlow Shapley switched us to Galactocentrism when he used Cepheids to infer the distance to the galactic center (Shapley 1918). Soon after, Edwin Hubble used Cepheids to identify other galaxies in the local group as island universes (of Kant’s imagination) distinct from the Milky Way (Hubble 1925, 1929b), moving us to Acentrism, and then used them to prove the Universe was expanding in 1929 (Hubble 1929a). George Lemaître found evidence and provided a theoretical explanation for Hubble’s results in 1927 (Lemaître