The effect of mission duration on LISA science objectives

The science objectives of the LISA mission have been defined under the implicit assumption of a 4-year continuous data stream. Based on the performance of LISA Pathfinder, it is now expected that LISA will have a duty cycle of ≈ 0.75, which would reduce the effective span of usable data to 3 years. This paper reports the results of a study by the LISA Science Group, which was charged with assessing the additional science return of increasing the mission lifetime. We explore various observational scenarios to assess the impact of mission duration on the main science objectives of the mission. We find that the science investigations most affected by mission duration concern the search for seed black holes at cosmic dawn, as well as the study of stellar-origin black holes and of their formation channels via multi-band and multi-messenger observations. We conclude that an extension to 6 years of mission operations is recommended.


Introduction
The Laser Interferometer Space Antenna (LISA) [1] is a space-borne gravitational wave (GW) observatory selected to be ESA's third large-class mission, addressing the science theme of the Gravitational Universe [2]. It consists of three spacecraft trailing the Earth around the Sun in a triangular configuration, with a mutual separation between spacecraft pairs of about 2.5 million kilometres. The laser beams connecting the three satellites are combined via time delay interferometry (TDI) [3] to construct the equivalent of a pair of Michelson interferometers. Thanks to its long armlength, LISA will be most sensitive in the millihertz frequency regime, which is anticipated to be the richest in terms of astrophysical (and possibly cosmological) GW sources, including coalescing massive black hole binaries (MBHBs) across the Universe, millions of binaries of compact objects within our Milky Way, and stochastic GW backgrounds (SGWBs) produced in the early Universe (see Refs. [2,4] and references therein).
The science objectives (SOs) and science investigations (SIs) of the LISA mission have been defined under the implicit assumption of a 4-year continuous stream of data, implying that during mission operations, the downtime of the detector is negligible compared to the effective time of data taking. If we define T_elapsed to be the time of mission operation (from first light to final shutdown) and T_data to be the total time of effective data taking, then one can define a duty cycle D = T_data / T_elapsed ≤ 1. The LISA proposal assumed a duty cycle D > 0.95 [1]. Based on the performance of LISA Pathfinder (which started scientific operations on March 8, 2016 and took data for almost sixteen months), it is now expected that LISA will have a duty cycle D ≈ 0.75, which, for a 4-year mission, reduces the effective span of usable data to 3 years.
As we move towards mission adoption by ESA, it is necessary to define a mission design that will fulfill the SOs spelled out in the LISA Science Requirements Document (SciRD) [5]. In particular, it is of paramount importance to consider the actual conditions of data taking and processing, including a realistic duty cycle. In this study we answer the following questions: are the SOs formulated assuming a 4-year continuous data stream still achieved with a duty cycle D = 0.75? If they are not, can we achieve them through an extension of the mission duration with the same duty cycle D = 0.75? Under the assumption of a duty cycle significantly smaller than D = 1, some confusion can arise in the definition of mission duration. Therefore, we start by clarifying the conventions adopted in this study:
- T_elapsed denotes the nominal mission duration, i.e. the time elapsed from when LISA is first turned on until it is turned off for the last time. The LISA SciRD [5] assumed T_elapsed = 4 years.
- T_data denotes the actual length of the usable data stream. If we have a duty cycle D, then T_data = D × T_elapsed. The current best estimate is T_data = 3 years, given the estimated D = 0.75.
- T_signal is the typical lifetime of a specific signal in band. Depending on whether this is longer or shorter than T_elapsed, sources are affected by mission duration in different ways.
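The relation between the three quantities defined above can be made concrete with a minimal sketch. The function name `usable_data_years` is hypothetical and introduced here only for illustration; the numbers are those quoted in the text.

```python
def usable_data_years(t_elapsed_years: float, duty_cycle: float) -> float:
    """Return T_data = D * T_elapsed (in years) for a duty cycle 0 < D <= 1."""
    if not 0.0 < duty_cycle <= 1.0:
        raise ValueError("duty cycle must lie in (0, 1]")
    return duty_cycle * t_elapsed_years


# Nominal 4-year mission and a 6-year extension, both with D = 0.75.
for t_elapsed in (4.0, 6.0):
    print(f"T_elapsed = {t_elapsed} yr -> T_data = {usable_data_years(t_elapsed, 0.75)} yr")
```

With D = 0.75, a 4-year mission yields 3 years of usable data and a 6-year mission yields 4.5 years, which are the two values of T_data compared throughout the paper.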
According to the above definitions, the LISA proposal SciRD assumed D = 1, corresponding to T_elapsed = T_data = 4 years. In this paper we investigate the potential science impact of increasing the current lifetime of the LISA mission by considering a set of continuous-data scenarios (such as T4C and T6C, used below). These scenarios can be thought of as containing a single long gap in the data lasting (1 − D) × T_elapsed, occurring either before or after a continuous stretch of data taking.
Besides these continuous-data scenarios, we will also consider scenarios where the (1 − D) × T_elapsed downtime is distributed in short-duration gaps. Assuming that the time T between gaps has a probability distribution p(T) = r exp(−rT), such that the expected time between gaps is ⟨T⟩ = ∫ dT T p(T) = 1/r, we can define several gapped scenarios depending on the rate r:
- T4G5: Data for 4 years with gaps of length 5 days such that 25% of the data is lost (i.e. total data stream duration 3 years), with the time between gaps T following a distribution with r = 1/(15 days);
- T6G5: Data for 6 years with gaps of length 5 days such that 25% of the data is lost (i.e. total data stream duration 4.5 years), with the time between gaps distributed with r = 1/(15 days);
- T4G1: Data for 4 years with gaps of length 1 day such that 25% of the data is lost (i.e. total data stream duration 3 years), with the time between gaps distributed with r = 1/(3 days);
- T6G1: Data for 6 years with gaps of length 1 day such that 25% of the data is lost (i.e. total data stream duration 4.5 years), with the time between gaps distributed with r = 1/(3 days).
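The gapped scenarios listed above can be checked with a short simulation. This is a minimal sketch, not the procedure used in the study: it assumes that each data stretch is drawn from p(T) = r exp(−rT) and is followed by a gap of fixed length, so that the long-run duty cycle is ⟨T⟩ / (⟨T⟩ + T_gap). The function names are hypothetical.

```python
import random


def expected_duty_cycle(gap_days: float, mean_uptime_days: float) -> float:
    """Long-run fraction of time with data: <T> / (<T> + T_gap)."""
    return mean_uptime_days / (mean_uptime_days + gap_days)


def simulate_duty_cycle(t_elapsed_days: float, gap_days: float,
                        mean_uptime_days: float, seed: int = 0) -> float:
    """Alternate exponentially distributed data stretches with fixed-length gaps."""
    rng = random.Random(seed)
    rate = 1.0 / mean_uptime_days      # r in p(T) = r exp(-r T)
    t = 0.0                            # time since first light
    data = 0.0                         # accumulated usable data
    while t < t_elapsed_days:
        # Truncate the last stretch at the end of the mission.
        stretch = min(rng.expovariate(rate), t_elapsed_days - t)
        data += stretch
        t += stretch + gap_days        # every stretch is followed by one gap
    return data / t_elapsed_days


# 5-day gaps with <T> = 15 days, and 1-day gaps with <T> = 3 days.
for gap, uptime in ((5.0, 15.0), (1.0, 3.0)):
    print(gap, uptime, expected_duty_cycle(gap, uptime),
          simulate_duty_cycle(4 * 365.25, gap, uptime))
```

Both gap lengths give an expected duty cycle of 0.75. Over a single 4-year realization the simulated value fluctuates around this number, more so for 5-day gaps because fewer stretches are drawn.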
Since the main scope of the study is to assess how a duty cycle D = 0.75 due to the presence of random gaps affects LISA's capabilities to reach its SOs, we have primarily focused on the comparison between Cases T4G5, T4G1, T6G5, and T6G1 and the LISA-proposal assumption of 4 years of continuous data (SciRD). The paper is organized as follows. The SOs identified in the SciRD document are divided into three main science investigation domains: astrophysics, cosmology, and fundamental physics. Within astrophysics, we further separate SOs according to the relevant GW sources, and we investigate separately MBHBs (Sect. 2); stellar-mass compact objects, both in the Milky Way and at cosmological distances (Sect. 3); and extreme mass-ratio inspirals (EMRIs; Sect. 4). For cosmology, we consider separately the SOs defining LISA's potential to perform standard sirens-based cosmography (Sect. 5) and those related to the detection of putative SGWBs of cosmological origin (Sect. 6). In fundamental physics, we investigate separately LISA's capabilities to constrain dark matter (Sect. 7), test general relativity (Sect. 8), and explore the nature of black holes (Sect. 9). We summarize our main findings in Sect. 10. A detailed mapping of SOs and SIs to the sections of this paper can be found in the summary Table 4 in Sect. 10.
We caution that our simulations are not always homogeneous across SOs. For some signals (e.g. strictly monochromatic or stochastic), to first order, the important quantity to be considered is T_data, regardless of the duty cycle. Therefore, in the absence of tools for analyzing data with gaps, we sometimes consider continuous streams of length T_data. These details are specified case-by-case in each section below. Moreover, when gaps are included in the calculations, they are assumed to be lost chunks of the data stream that only affect the source signal-to-noise ratio (SNR) calculations. In reality, gaps will also modify the properties of the noise, which can in turn further affect detection statistics and parameter reconstruction of specific sources. More detailed parameter estimation studies (adopting, e.g., the data analysis techniques developed in Refs. [6,7]) are beyond the scope of this paper.

Formation, evolution, and electromagnetic counterparts of massive black hole mergers
In this section we consider the impact of the mission lifetime on SOs related to the formation, evolution, and electromagnetic (EM) counterparts of MBHBs. We first examine the effect of the mission lifetime (T_elapsed) and then focus on the impact of gaps of different length given a duty cycle D = 0.75. Our results will be formulated in terms of three timescales: T_signal, T_elapsed, and T_data. Most MBHBs stay in the LISA band for a period of time (weeks, at most months) much shorter than LISA's lifetime, hence T_signal ≪ T_elapsed. This means that the number of observed sources scales linearly with T_elapsed. It is therefore important to investigate the effect of gaps of different lengths on the resulting number of detections and compare it to a scenario with a continuous data stream. We thus focus on comparing the SciRD, T4G1, and T4G5 scenarios, with the understanding that results scale linearly for longer mission durations.
We run the light seed (hereafter popIII, since the seeds originate from Population III stars) and heavy seed models used in Ref. [8]. The two models describe the co-evolution of MBHBs with their host galaxies assuming that MBH progenitors are either light (∼ 100 M⊙; popIII remnants) or heavy (10^5 M⊙) seeds forming at redshifts 15 < z < 20. In both models, MBHBs are driven to coalescence via interactions with stars, gas, and/or a third black hole, and the evolution of their orbital eccentricity is followed self-consistently (see Ref. [8] for details).
Using these fiducial models, in which binary merger timescales (of the order of millions to billions of years) depend on the host galaxy properties, we first assess the impact of gaps on the overall number of detections. We thus generate a Monte Carlo sample of 100 years of MBHB mergers and consider either continuous observations or data with 1-day or 5-day gaps resulting in D = 0.75. To assess the global impact of gaps, we divide this set into 25 chunks of 4 years each and compute the number and SNR distribution of detected systems for the cases SciRD, T4G1, and T4G5. We assume SNR = 8 as a detection threshold.
The results reported in Fig. 1 show that the impact of gaps depends on the nature of MBH seeds. In the heavy seeds case, compared to the SciRD scenario, there is no loss of detections (> 99% detections) in the T4G1 scenario, whereas in the T4G5 scenario 95% of the systems are still detected. Gaps have a stronger impact in the popIII case where, compared to SciRD, 88% and 85% of the sources are still detected in the T4G1 and T4G5 scenarios, respectively. The first thing to notice is that those fractions are always larger than the 75% duty cycle. This is because MBHBs stay in band for weeks or more, as shown by the SNR accumulation depicted in Fig. 2 (from Ref. [9]) for systems of total mass 3 × 10^5 M⊙, 3 × 10^6 M⊙, and 10^7 M⊙ at z = 1. Random gaps of a few days will remove portions of the signal, but in the vast majority of cases there will still be enough SNR build-up to guarantee detection. This is especially true if gaps are short and sources have high SNR, which is the case for heavy seeds and T4G1. The longer the gaps and the lower the typical source SNR, the higher the chances that sources end up below the detection threshold. This is why gaps are more detrimental if they last 5 days and in the popIII scenario.

Fig. 2 (caption): In each panel we show the median and the 68% and 95% confidence regions for a sample of 10^4 simulated binaries with the indicated total mass and otherwise randomized parameters (sky location, inclination, polarization, etc.). The mass ratios are randomly drawn in the range [0.1, 1]. (Adapted from Ref. [9])
Although introducing a duty cycle has a sub-linear effect on the overall number of detections, specific types of sources might be more severely affected, jeopardizing some of the LISA mission goals. In the following, we focus on the opposite ends of the MBHB spectrum, namely low-mass seeds at high redshift and low-redshift massive systems. Again, we fix T_elapsed = 4 years and compare configurations SciRD, T4G5, and T4G1.
The number of observed high-redshift (z > 10), low-mass (M < 10^3 M⊙) systems is severely impacted by the presence of gaps reducing the duty cycle to D = 0.75. This is due to a combination of features that are unique to those systems: they are often close to the SNR observability threshold (SNR = 8, for MBHBs), and their signals last less than T_elapsed but, at the same time, longer than the gap duration T_gap. Therefore, gaps affect nearly all of these sources, and including gaps in the data causes many of them to drop below the SNR threshold. More specifically, in the SciRD case we expect ≈ 25 observable sources with M < 10^3 M⊙ in the popIII scenario. This number drops to 10 when we consider configurations T4G5 and T4G1, as shown in the left panel of Fig. 3. These results are qualitatively consistent with the findings of Ref. [10], specifically their Light Seed noSN models, which are similar to the one used here, and the unscheduled gaps scenario with 3-day gaps. For this configuration, Ref. [10] finds that the number of observed sources is reduced by ∼ 50% relative to the case without gaps. However, Ref. [10] used a more pessimistic gap scenario than the one considered here, which led to an effective duty cycle of D ≈ 0.65, compared with D ≈ 0.75 in our case.
To quantify uncertainties due to model assumptions, we carry out a similar investigation for alternative (more pessimistic) popIII seed models including supernova feedback and other effects that dramatically reduce the number of potential LISA sources (see Ref. [11] for details). We find that the number of detected low-mass (M < 10^3 M⊙) systems drops from ≈ 10 in the SciRD case to 6 in the T4G5 and T4G1 scenarios. It is therefore clear that including a 75% duty cycle in a four-year mission operation baseline is severely detrimental to the observation of seed black holes.
At the other end of the MBHB spectrum, several relatively massive (M > 10^5 M⊙), nearby (z < 2) sources might experience a significant SNR drop due to gaps, as shown in the top- and bottom-right panels of Fig. 3. About 30% of these sources experience SNR drops by more than a factor of 10. This is more severe for 5-day gaps, in which the merger-ringdown phase of loud signals can be lost entirely. This is emphasized in Fig. 2: especially for massive systems, the SNR is accumulated in a relatively short period at the end of the binary's lifetime, which can be as short as a few days. If the detection threshold is SNR = 8, then 1-day gaps should not affect the detection of any of these systems, whereas 5-day gaps would hinder the detection of some of the more massive binaries with mass above ∼ 10^7 M⊙. The sources in the figure are at z = 1, and increasing the source redshift will inevitably shorten the effective SNR accumulation timescale, exacerbating this potential issue. In practice, this also means that, effectively, a 6-year mission with 1-day gaps (T6G1) is almost equivalent to a 6-year mission with 100% duty cycle and no gaps (i.e. T_elapsed = T_data = 6 years), except for a reduced SNR. However, a drop in SNR also carries a penalty, as it implies a proportional deterioration in parameter estimation and (most importantly) sky localization, which might have consequences when searching for EM counterparts.
We also expect that gaps will lead to selection effects in terms of certain spin configurations. We did not quantify this bias, but we can make some qualitative considerations. The spin-orbit coupling in spinning black hole binaries can delay (hasten) the onset of the plunge phase when the spins are aligned (antialigned) with the orbital angular momentum. This is often called the orbital hang-up effect [12], and it is more pronounced for highly spinning binaries. Therefore, gaps will introduce an observational selection effect: highly spinning binaries with aligned spins will be more likely to be detected relative to other configurations with shorter lifetimes (antialigned, non-spinning, etc.). Highly spinning binaries with aligned spins are also more luminous in GWs, so the two effects would presumably be compounded. This selection effect is expected to be more severe for longer gaps.
Finally, besides considering randomly distributed gaps, which are scheduled or happen without external input, we also propose the following scenario for consideration. Assume a long-lived GW event has already been discovered a month prior to a MBHB merger. The SNR is initially too low for the source to be localized on the sky, but at some point well in advance of the merger (e.g., weeks earlier) the merger time can be predicted with an accuracy of a day or so. Within this final day it can become possible to localize the source, issue alerts, and enable precursor EM observations, or observations of the merger itself. This detection can be unaffected by gaps if LISA has the capability to adaptively reschedule gaps when they coincide with the final day of a merger that can be predicted sufficiently in advance. This could significantly mitigate, or eliminate, the deleterious impact of gaps on precursor observations.

These findings have important implications for SO2 ("Trace the origin, growth and merger history of massive black holes across cosmic ages"), and in particular SI2.1 ("Search for seed black holes at cosmic dawn") and SI2.3 ("Observation of EM counterparts to unveil the astrophysical environment around merging binaries") of the LISA mission:
- With respect to SI2.1, the loss of M < 10^3 M⊙ sources at z > 10 caused by gaps is substantial. For the popIII model investigated, the detection rate of such sources is reduced from ≈ 5 yr^−1 for continuous observation streams (T_elapsed = T_data) to ≈ 2 yr^−1 in the case of observations with gaps and a duty cycle of 75%. Numbers can be as low as ≈ 1 yr^−1 for more pessimistic scenarios. It is therefore clear that configurations T4G5 and T4G1 imply a significant loss of detections compared to the SciRD LISA baseline. The only way to mitigate the effect of gaps is by extending the mission duration. Therefore, in order to collect a large enough sample of such sources to ascertain the origin of seed MBHs, an extension to a 6-year mission requirement (i.e. cases T6G5 and T6G1) is warranted.
- With respect to SI2.3, the detection rate of M > 10^5 M⊙ sources at z < 2 is of the order of 2 yr^−1 in the investigated models. Because of gaps, about 30% of them will suffer a significant loss of SNR compared to continuous collection of data throughout the mission lifetime, making parameter estimation and, particularly, sky localization problematic. In light of these considerations and in order to maximize the multi-messenger potential of MBHBs, an extension to a 6-year mission requirement is warranted.
Conversely, gaps have a minor impact on SI2.2 ("Study the growth mechanism of MBHs before the epoch of reionization") and SI2.4 ("Test the existence of intermediate-mass black holes"), as they do not pose a critical risk to the detection of the sources relevant for achieving those scientific goals.

Stellar-mass compact objects
In this section we will study the impact of mission duration on resolved and unresolved stellar-mass sources (Sect. 3.1) and on the observability of stellar-origin black holes (SOBHs) similar to those detected by the LIGO Scientific & Virgo Collaboration (Sect. 3.2).

Stellar-mass sources
Maximizing the number of detectable binaries is important to reduce the level of the confusion noise, which further improves the detectability and measurement accuracy of extra-Galactic sources at those same frequencies. This is true even of transients which might occur during the first years of observations, as the improved understanding of the Galactic foreground can be applied retroactively when reanalyzing data from early in the mission.

Resolved sources
Most of the resolved Galactic and extra-Galactic sources at low frequency will be nearly monochromatic, with evolution times much greater than both T_data and T_elapsed. Thus, gaps will not have strong effects on the majority of the resolved Galactic sources. However, in cases where the frequency evolution occurs on timescales similar to those of the duty cycle, e.g., SOBHs (see Sect. 3.2), gaps can reduce the fidelity of the parameter estimation of these sources.

Fig. 4 (caption): Left: the number of detectable ultra-compact binaries (UCBs) scales between √T_data and T_data due to the combined effects of the increased SNR and frequency resolution. Right: the number of detected binaries with measurable ḟ (used for breaking the degeneracy between chirp mass and luminosity distance, and for identifying interacting binaries) scales more dramatically with elapsed time T_elapsed, because it enters the GW phase as T_elapsed^2.

The Galactic binary signals qualitatively scale as follows. For an isolated binary the SNR scales as ∝ √T_data regardless of duty cycle, when not considering losses of data due to windowing or TDI interpolation kernels. Therefore, longer observations are better, but the growth slows down as the observing time increases: the number of resolved Galactic binaries will increase much more quickly between years 1 and 2 of observing than between years 5 and 6. However, in the confusion-dominated regime, the ability to distinguish resolvable binaries from the foreground depends on improved frequency resolution, which scales as ∝ 1/T_data. As a result, the number of detectable binaries grows more rapidly than the simple SNR scaling predicts. The actual number of detections lands somewhere between √T_data and T_data (see Fig. 4, left panel). Detailed studies of the Galactic binary population, and the dynamics of individual binaries, depend on measuring the time derivatives of the orbital period.
These time derivatives introduce stronger time dependence, but importantly, it is the elapsed time that matters most. The first time derivative of the frequency, ḟ, is used to distinguish between systems that are likely evolving primarily due to GW emission vs. astrophysical interactions (e.g., mass transfer [13][14][15]). In cases where the orbital evolution is dominated by GW emission, ḟ can break degeneracies in the GW amplitude to determine the sources' chirp mass and luminosity distance. Refs. [16,17] show that the characterization of ḟ with mission durations of 4 and 8 years leads to an increase from ∼ 1100 to ∼ 2800 double white dwarfs (DWDs) and 4 to 10 binary black holes (BBHs) with measured masses.
The ḟ contribution to the GW phase scales as T_elapsed^2, thus at fixed T_data the science requirements for Galactic binaries benefit from lower duty cycles (see Fig. 4, right panel). The second derivative of frequency, f̈, depends even more dramatically on observing time, scaling as T_elapsed^3. The second derivative of the orbital period encodes further details about dynamics (e.g., tidal interactions between binaries [18]) and gives an independent measure of chirp mass as a consistency test in the case of assumed GR-dominated period evolution. Systems with measurable f̈ will be comparatively rare, with O(10) sources providing constraints to better than ∼ 20% after T_elapsed ∼ 8 years. While a longer observing time from a longer mission duration will yield more resolved sources, in the case where duty cycles are being considered, maximizing T_elapsed is more impactful to SI1.1 ("Elucidate the formation and evolution of Galactic binaries by measuring their period, spatial and mass distributions") and SI1.2 ("Enable joint gravitational and electromagnetic observations of Galactic binaries to study the interplay between gravitational radiation and tidal dissipation in interacting stellar systems") than maximizing T_data alone.
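The T_elapsed^2 and T_elapsed^3 scalings quoted above can be read off a Taylor expansion of the GW phase of a slowly evolving binary. The following is a minimal sketch, assuming a quasi-monochromatic source; the symbols Φ, f_0 and ΔΦ are introduced here for illustration, with f_0 and the derivatives evaluated at the start of observations.

```latex
\Phi(t) \simeq 2\pi \left( f_0\, t + \frac{1}{2}\, \dot f\, t^{2} + \frac{1}{6}\, \ddot f\, t^{3} \right),
\qquad
\Delta\Phi_{\dot f} = \pi\, \dot f\, T_{\rm elapsed}^{2},
\qquad
\Delta\Phi_{\ddot f} = \frac{\pi}{3}\, \ddot f\, T_{\rm elapsed}^{3} .
```

The accumulated phase offsets depend on the time between the first and last data samples rather than on the amount of data collected, which is why T_elapsed, and not T_data, controls the measurability of the frequency derivatives.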

Unresolved foreground
The unresolved foreground confusion noise can be characterized by the empirical model of Ref. [19] (Eq. (2)), in which f is the frequency, f_1 and f_2 are the break frequencies, f_knee is the knee frequency, A is the overall amplitude, and α is a smoothing parameter. This reduced empirical model was adopted after performing the analysis described above in this section, on the same catalog of sources, but considering different durations of the mission. Based on simulated LISA TDI time series data with total observation duration of T_data,max = 10 years, and estimated confusion noise for different fractions of T_data,max, the parameters f_1 and f_knee of Eq. (2) are related to the observation duration T_data through relations whose parameters a_1, a_k, b_1, and b_k depend on the SNR threshold for detectability of Galactic binaries. One of the most relevant characteristics of this unresolved foreground is f_knee, which roughly indicates the boundary between the stochastic and resolvable parts of the signal and scales as f_knee ∼ T_data^−0.4, a rather mild function of the observation time. However, the reduction in the stochastic foreground has an important impact on the SNR of other sources.
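For orientation, the functional form commonly quoted for this empirical fit can be sketched with the symbols defined above. The expressions below are an assumption of this sketch, written from the parameter list in the text, and should be checked against Ref. [19] before use; S_gal is a label introduced here for the foreground power spectral density.

```latex
S_{\rm gal}(f) = \frac{A}{2}\, f^{-7/3}\, e^{-(f/f_1)^{\alpha}}
\left[ 1 + \tanh\!\left( \frac{f_{\rm knee} - f}{f_2} \right) \right],
\qquad
\log_{10} f_1 = a_1 \log_{10} T_{\rm data} + b_1,
\qquad
\log_{10} f_{\rm knee} = a_k \log_{10} T_{\rm data} + b_k .
```

Under this form, f_knee ∝ T_data^{a_k}, so the quoted scaling f_knee ∼ T_data^−0.4 corresponds to a_k ≈ −0.4.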
SOBHs generally have observable signal durations such that T_signal > T_elapsed. This makes the assessment of the impact of mission duration less straightforward compared to, e.g., MBHBs. The signal duration T_signal is also much longer than the duration of typical gaps; thus, to first order, gaps will simply cause the SNR of the source to diminish by D^(1/2). To simulate the impact of data with gaps, we therefore artificially reduce the amplitude of the GW signal by D^(1/2), where D = 0.75; because of this, configuration T4C is essentially equivalent to T4G1/T4G5, and configuration T6C is equivalent to T6G1/T6G5.
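The first-order D^(1/2) rescaling described above can be illustrated with a minimal sketch. The function names are hypothetical, and the calculation only applies to signals that are much longer than the gaps, as assumed in the text.

```python
import math


def snr_with_gaps(snr_continuous: float, duty_cycle: float) -> float:
    """First-order SNR of a long-lived source when a fraction (1 - D) of data is lost."""
    return snr_continuous * math.sqrt(duty_cycle)


def required_continuous_snr(threshold: float, duty_cycle: float) -> float:
    """Continuous-data SNR a source needs in order to stay above threshold with gaps."""
    return threshold / math.sqrt(duty_cycle)


D = 0.75
print(snr_with_gaps(10.0, D))            # SNR of a source that would have SNR = 10
print(required_continuous_snr(8.0, D))   # SNR needed to remain above SNR = 8
```

For D = 0.75 the SNR is reduced by about 13%, so only sources whose continuous-data SNR exceeds roughly 9.2 remain above the SNR = 8 threshold.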
To investigate the effects of changes in mission duration, a SOBH population was simulated with a comoving merger rate density of 35 Gpc^−3 yr^−1, with masses distributed flat in log space and a maximum mass cut for the primary BH of M_1 = 50 M⊙. We show the results of 1000 realizations of LISA observations for two scenarios (continuous data or data with gaps) in Fig. 5. We find that the number of SOBHs that can be identified with SNR > 8 increases from an average of 10 for 3 years of continuous data to an average of 19 for 4.5 years of data. This corresponds to a scaling N ∝ T_data^(3/2). The number of SOBHs observed by LISA depends on T_data rather than T_elapsed. In practice, 4.5 years of continuous observations yield the same number of detections as 6 years of observations with 75% duty cycle, since the gap durations in both the T6G1 and T6G5 scenarios are much shorter than T_signal. The number of potential multiband sources observable by ground-based detectors within 10 years of LISA observation also roughly doubles when increasing T_data by 50% in scenarios T6C/T6G1/T6G5, going from ≈ 1.5 to ≈ 3, again assuming SNR > 8. By increasing T_data from 3 to 4.5 years, the chance of a simulated Universe realization yielding zero multiband sources with SNR > 8 (f_bad, shown at the bottom of Fig. 5) decreases from ≈ 20% to ≈ 5%.
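The numbers quoted above can be cross-checked with a back-of-the-envelope calculation. This is a minimal sketch and not the simulation used in the study: the power-law index 3/2 is taken from the text, while the use of Poisson statistics for the probability of zero multiband sources is an assumption made here for illustration. The function names are hypothetical.

```python
import math


def scaled_detections(n_ref: float, t_ref: float, t_new: float,
                      index: float = 1.5) -> float:
    """Scale a detection count as N ∝ T_data**index from a reference value."""
    return n_ref * (t_new / t_ref) ** index


def prob_zero_poisson(mean_count: float) -> float:
    """Probability of observing no events for a Poisson mean (assumed statistics)."""
    return math.exp(-mean_count)


# 10 detections in 3 yr of data, extrapolated to 4.5 yr of data.
print(scaled_detections(10.0, 3.0, 4.5))

# Chance of zero multiband sources for mean counts of 1.5 and 3.
print(prob_zero_poisson(1.5), prob_zero_poisson(3.0))
```

The extrapolated count is about 18, close to the simulated average of 19, and the Poisson estimates of about 22% and 5% are close to the quoted values of f_bad ≈ 20% and ≈ 5%.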
These findings have an impact on SO4 ("Understand the astrophysics of stellar origin black holes"), both SI4.1 ("Study the close environment of SOBHs by enabling multi-band and multi-messenger observations at the time of coalescence") and SI4.2 ("Disentangle SOBH binary formation channels") of the LISA proposal. The possibility of observing extra-Galactic SOBHs with LISA was realized following the detection of GW150914. Early investigations suggested that LISA might observe up to several hundred such sources, with a few tens of them qualifying as multiband sources [20]. Subsequent downward revisions of SOBH merger rates, together with the relaxation of the LISA high-frequency sensitivity requirement, severely affected the expected numbers of SOBHs, jeopardizing the achievement of the SOs listed in the LISA proposal.
A 4-year mission with a 75% duty cycle (T4C, T4G1, T4G5) will observe on average between 1 and 2 multiband sources with SNR > 8, with a 20% chance of observing none, thus completely failing the SI4.1 science objective. Extending the mission requirement to 6 years (T6C, T6G1, T6G5) will result in a rough doubling of multiband sources, reducing the risk of getting zero such sources to ≈ 5%. Disentangling competing SOBH formation channels based on eccentricity measurements for science objective SI4.2 requires a sizable number of detections. For example, based on calculations from Ref. [21], the ≈ 10 detections expected for T_data = 3 years (T4C, T4G1, T4G5) will not even allow us to distinguish between the main field and cluster formation scenarios at a 2σ level. Already with ≈ 20 observations, allowed by T_data = 4.5 years (T6C, T6G1, T6G5), the discriminating power will increase to > 3σ.
The detection numbers reported above are ultimately very sensitive to the intrinsic SOBH rate and to the maximum BH mass allowed by the pair instability gap. In particular, the existence of SOBHs with M > 50M would significantly increase the number of LISA detections. The SOBH landscape will become clearer with the release of the complete catalog of LIGO-Virgo O3 data. Given our current knowledge, extending the mission duration requirement to 6 years might be crucial to achieve SO4 of the LISA proposal.

Detecting SOBHs from O1/O2 LIGO-Virgo catalogs
For concreteness, we consider the three loudest BBH systems in the LISA band from the O1/O2 LIGO-Virgo catalog [22]: GW150914, GW170104 and GW170823. For each of these three systems we find the best (for LISA) sky position and polarization. We estimate the SNR distribution based on posterior samples from the Gravitational Wave Open Science Center [23], assuming that the system merges 10 years after the start of the observation. By considering an observation time T_data and a 100% duty cycle, we find the SNR values summarized in Table 1. In addition, given the distribution of SNR, we give the probability (in percent) of the source being above the detection threshold (SNR > 8). As an example, for GW150914 optimally positioned on the sky, we find a best SNR of 12.34 (for 6 years of observation), a mean SNR of 7.21 (based on the parameter uncertainties inferred by the LIGO-Virgo Collaboration), and a probability of having SNR > 8 after 6 years of observation of ≈ 25%.

We now consider how parameter estimation for the three systems above is affected by the observation time. We vary the merger time between 7 years and 20 years from the start of LISA observation. Because these results are obtained using a Fisher matrix analysis, small fluctuations due to the numerical evaluation of derivatives and the inversion of badly conditioned matrices are possible.
For each source, in Fig. 6 we show the SNR, the relative error on the chirp mass M_c, and the absolute errors on the symmetric mass ratio η and on the well-measured effective inspiral spin combination χ_+ = (m_1 χ_1 + m_2 χ_2)/(m_1 + m_2), where χ_1 is the spin component of the primary aligned with the orbital angular momentum, and χ_2 is the corresponding quantity for the secondary. The chirp mass is always well determined, while the mass ratio and spins are well determined only for systems which are not far from merging. With 4 years of observation we can hardly constrain the mass ratio and spins, whereas with 6 years of observation we can constrain the parameters of chirping systems. The black dashed line corresponds to our (optimistic) detection threshold SNR = 8. These results are obtained assuming a 100% duty cycle; a smaller duty cycle reduces the SNR roughly as the square root of the duty cycle, so all reported errors will increase in the same proportion.
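The mass and spin combinations used above can be computed from the component parameters as in the following minimal sketch. The expressions for the chirp mass and symmetric mass ratio are the standard definitions, and χ_+ follows the formula given in the text. The function names and the input values are hypothetical illustrations, not catalog values.

```python
def chirp_mass(m1: float, m2: float) -> float:
    """Chirp mass M_c = (m1 m2)^(3/5) / (m1 + m2)^(1/5), in the units of m1, m2."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2


def symmetric_mass_ratio(m1: float, m2: float) -> float:
    """Symmetric mass ratio eta = m1 m2 / (m1 + m2)^2, at most 0.25."""
    return m1 * m2 / (m1 + m2) ** 2


def chi_plus(m1: float, m2: float, chi1: float, chi2: float) -> float:
    """Mass-weighted aligned-spin combination chi_+ = (m1 chi1 + m2 chi2) / (m1 + m2)."""
    return (m1 * chi1 + m2 * chi2) / (m1 + m2)


# Illustrative (hypothetical) binary: 35 + 30 solar masses with small aligned spins.
m1, m2, chi1, chi2 = 35.0, 30.0, 0.2, -0.1
print(chirp_mass(m1, m2), symmetric_mass_ratio(m1, m2), chi_plus(m1, m2, chi1, chi2))
```

For equal masses the symmetric mass ratio takes its maximum value of 0.25, and χ_+ reduces to the average of the two aligned spin components.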
The above results have a direct impact on the detectability of GW150914-like systems, as defined by science requirement SI4.1, and on the estimation of binary parameters for disentangling competing SOBH formation channels, as defined by science requirement SI4.2. They support the recommendation that an extension of the mission lifetime to 6 years (T6C, T6G1, T6G5) is desirable.
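As a rough illustration of the duty-cycle scaling mentioned above, the following sketch rescales the GW150914 SNR values quoted in the text from a 100% duty cycle to a duty cycle of 0.75. It assumes only that the SNR scales as the square root of the duty cycle; the function name `rescale_snr` is introduced here for illustration and is not part of the original analysis.

```python
import math

# SNR values quoted in the text for GW150914 (6 years of data, 100% duty cycle).
snr_best = 12.34
snr_mean = 7.21


def rescale_snr(snr_full, duty_cycle):
    """Approximate SNR at a reduced duty cycle, assuming SNR ∝ sqrt(duty cycle)."""
    return snr_full * math.sqrt(duty_cycle)


duty_cycle = 0.75  # duty cycle considered in this study
print(f"best SNR at D = {duty_cycle}: {rescale_snr(snr_best, duty_cycle):.2f}")
print(f"mean SNR at D = {duty_cycle}: {rescale_snr(snr_mean, duty_cycle):.2f}")
print(f"error inflation factor: {1.0 / math.sqrt(duty_cycle):.2f}")
```

Under this assumption the best SNR drops to about 10.7 and the mean SNR to about 6.2, while parameter errors grow by roughly 15%. The actual loss depends on where the missing data fall relative to the merger.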

Extreme- and intermediate-mass ratio inspirals: detection, characterization, population
Extreme mass-ratio inspirals (EMRIs) consist of a stellar-mass compact object inspiralling into a MBH. The mass ratio is typically expected to be $\sim 10^{-4}$-$10^{-6}$, meaning that the system completes many orbits while emitting GWs in LISA's frequency band.
Tracking the orbital evolution hence enables precision measurements of the system's properties and a characterization of the spacetime of the MBH. For this reason EMRIs are important for understanding the astrophysics of MBHs and their environments and for testing the Kerr nature of black holes. More extreme mass-ratio systems, such as those composed of a substellar-mass brown dwarf and a massive black hole, are known as extremely large mass-ratio inspirals (XMRIs). These evolve even more slowly than EMRIs, changing negligibly over the lifetime of the LISA mission. Less extreme mass-ratio systems, such as either an intermediate-mass black hole and a MBH, or a stellar-mass compact object and an intermediate-mass black hole, are known as intermediate mass-ratio inspirals (IMRIs). These evolve more quickly than EMRIs, and are more comparable to MBHBs or SOBHBs. We concentrate here on canonical EMRIs. Changes to observing time, mission duration and gaps can affect the measured SNR (Sect. 4.1), make it more difficult to track the phase (Sect. 4.2), and affect the total phase across the observations (Sect. 4.3). These effects can change the number of detections and the precision to which we can perform measurements.

Changes in SNR
EMRIs are long-lived signals that accumulate their SNR over the observable lifetime of the inspiral. The number of detectable events increases faster than linearly with observing time $T_{\rm data}$. This is because, while the number of EMRIs merging grows linearly with time, we also integrate for longer, meaning that quieter signals can accumulate sufficient SNR to become detectable. The presence of gaps will decrease the SNR: to first order, the presence of a gap is effectively equivalent to changing the mission lifetime. The final parts of the signal are the loudest, so gaps during these times have the greatest cost.
Fig. 7 Number of EMRIs observed with SNR > 8 as a function of $T_{\rm data}$ for two representative models from [24]. The plot shows that $T_{\rm data}$ sets the number of detections, regardless of the presence of gaps, and that the number of detections is roughly $\propto T_{\rm data}^{3/2}$
To support these statements, we ran representative models from Ref. [24] with the same assumptions made for SOBHs in Sect. 3. Results are shown in Fig. 7. Similarly to the SOBH case, the number of observations is set by $T_{\rm data}$, regardless of the presence of gaps, and we find $N \propto T_{\rm data}^{3/2}$. Maximizing the potential for detection is extremely important if EMRIs are rare. This could be the case if tightly bound low-mass objects like brown dwarfs around MBHs are common [25,26]. These XMRI systems would not be detectable at cosmological distances, but they could disrupt the evolution of EMRIs, leading to scattering of the EMRI compact object before it enters the LISA band.

Missing phase
As EMRIs are long-lasting and slowly evolving signals, we should be able to track the GW phase across interruptions, enabling us still to perform matched filtering to dig the signals out of the noise. Complications arise if there is a more sudden and distinctive change in phase during a gap.
A significant change in the phase evolution could happen if the EMRI passes through a transient resonance. These can occur due to radiation reaction in completely isolated systems (self-force resonance [27]), or the tidal perturbation from a small third body (tidal resonance [28]). Transient resonances are common, but only a few should have a noticeable impact [29][30][31]. While missing the observation of a transient resonance would mean that we would not have the data at the time of the phase jump, this need not be a significant problem for detection or parameter estimation.
Even though the change in phase is extremely sensitive to the orbital parameters on resonance, templates that account for resonances will still allow coherent filtering of the pre- and post-resonance data. This could be done in a fully modeled and self-consistent way [32,33], or through the addition of phenomenological resonance parameters [34]. An alternative approach is semicoherent analysis, which could enable the phase jump to be reconstructed without the use of resonance models.

Extra phase
Extending the mission lifetime $T_{\rm elapsed}$ means that there is potentially a greater observable phase change across the observing window. Assuming that the evolution can be tracked across the entire mission (even if semicoherent methods are used for initial detection, it may be possible to perform a coherent follow-up analysis), we can measure the total phase evolution, tracking its change with time even if there are gaps.
The extended baseline gives greater sensitivity to quantities which affect the phase. This means greater measurement precision for parameters at a given SNR, which is essential for meaningful tests of relativity and the Kerr solution if the number of observed EMRIs is low. Measurements of environmental effects may also benefit from this extra observation time, as the phase change can increase superlinearly: for EMRIs in accretion disks, the scaling may be between $\sim T_{\rm elapsed}^2$ and $\sim T_{\rm elapsed}^4$ [35].
Overall, since EMRIs are long-lived signals, data gaps are unlikely to cause a significant loss in scientific performance for astrophysics, provided that waveforms and analysis algorithms are developed to account for gaps. However, the presence of gaps will reduce the overall observing time, which could have an impact on the measurement precision. Long gaps might also discard valuable information about transient effects such as resonances, or potential high-frequency effects such as quasinormal bursts [36,37]. An increase in mission lifetime enables observation of a greater change in phase, and hence more precise measurements at a given SNR, assuming that the phase can be tracked coherently across the entire duration.
In summary, although LISA's SO3 ("Probe the dynamics of dense nuclear clusters using EMRIs") can likely be achieved by a 4-year mission, several aspects of EMRI observations have superlinear scaling, indicating a clear preference for an extension of the mission lifetime requirement to 6 years.

Estimation of cosmological parameters
We report here on the impact of data stream duration with and without the presence of gaps on SI6.1 ("Measure the dimensionless Hubble parameter by means of GW observations only") and SI6.2 ("Constrain cosmological parameters through joint GW and EM observations").

Measurement of the Hubble parameter with EMRIs
In the SciRD, SI6.1 concerns the capability of LISA to constrain the Hubble parameter today, $H_0$, by using SOBHBs and EMRIs as luminosity distance indicators, together with a statistical technique to identify the redshift based on the cross-correlation of the GW measurement with galaxy catalogs. Preliminary results using only EMRIs as distance indicators suggest that with 4 years of continuous data it is possible to constrain $H_0$ to about 1.7% at 1σ (cf. Fig. 8). The analysis also considers 10 years of continuous data, finding in that case the 1σ uncertainty [...] Interpolating between these two results with a scaling of the relative error proportional to $1/\sqrt{T_{\rm data}}$, one would obtain that a 5-year mission with D = 0.75, corresponding to 3.75 years of continuous data stream, is necessary to fulfill SI6.1, i.e. to provide a measurement of $H_0$ to better than 2% at 1σ.

Measurement of the cosmological parameters with MBHBs
We now turn to SI6.2, which refers to the capability of LISA to constrain cosmological parameters using MBHB as luminosity distance indicators, together with EM counterparts to determine the redshift.
For this analysis, we adopt the methodology developed in Ref. [39]. The technique to identify the counterpart can be either direct observation of the host galaxy (in particular, we modeled detection with the LSST), or connecting the GW source with a transient occurring at the moment of the MBHB merger, e.g., a radio jet. In this last case, we have implemented sky localization with the SKA and redshift identification from the host galaxy with the ELT [39]. We analyzed three astrophysical models for the formation of the MBH, two with high-mass seeds (Q3d, which provides the lowest number of sources, and Q3nd, which provides the highest one) and one with low-mass seeds (popIII, giving an intermediate number of sources) [39].
We analyzed the following duration scenarios, all with D = 0.75: continuous data stream of 4 years, 5 years and 6 years (T4C, T5C and T6C); 4 years data stream with 1-day and 5-day gaps (T4G1, T4G5); and 6 years data stream with 1-day and 5-day gaps (T6G1, T6G5). Figure 9 shows the distribution of standard sirens as a function of redshift for the different duration scenarios. The majority of standard sirens resides in the redshift range 1 < z < 3 (the more optimistic astrophysical model Q3nd also presents a significant number of sources at z < 1). The number of standard sirens scales linearly with the data stream duration, and the scenario providing the highest number of standard sirens is T6G5. In scenarios with gaps, it is less likely to completely miss a source, while shorter and more frequent gaps lead to the highest SNR loss.
Fig. 9 Number of standard sirens as a function of redshift for the 7 mission duration/gaps scenarios, from left to right in the low-mass seed MBHB formation channel (popIII), and in the two high-mass seed ones (Q3d and Q3nd)
In Fig. 10 and Table 2 we present the 1σ relative uncertainties on $h$ and $\Omega_m$, where $h = H_0/(100\,{\rm km\,s^{-1}\,Mpc^{-1}})$ and $\Omega_m$ is the relative fraction of (dark) matter energy density today, for all 3 MBHB astrophysical formation channels and all data stream duration scenarios. The uncertainties naturally scale inversely with the square root of the number of standard sirens: therefore, the best-case scenario is the one with a 6-year data stream and 5-day gaps. As a figure of merit, we require an error on $H_0$ below 4% for at least two formation channels. This is met by two of the duration scenarios: 6 years data stream with 1-day and 5-day gaps, T6G1 and T6G5. The error on $H_0$ strongly depends on the MBHB formation channel. In the best case (Q3nd, featuring high-mass seeds with no delay in the binary formation) it is always smaller than 3.5%, while in the worst case (popIII, featuring low-mass seeds) it can grow to as much as 65% for the T4C mission configuration.
As a consequence of the lack of a full parameter-estimation analysis including merger and ringdown on the MBHB catalogs considered here [39], the results presented above are based on the estimation of the MBHB event parameters performed accounting for the inspiral phase only (cut 5 hours before merger). This approach underestimates the number of available standard sirens, and consequently also the instrument performance. On the other hand, in the absence of catalogs produced with the SciRD sensitivity in the frequency range $10^{-4}$ Hz < f < 0.1 Hz, we have used those produced with SciRD, but extended down to $10^{-5}$ Hz. This might overestimate the number of detected standard sirens, although measurements at low frequency are not expected to strongly affect the present analysis (cf. the low-frequency study [40]).

Characterization of stochastic backgrounds
A stochastic GW background (SGWB) can be characterized by its power spectrum as a function of frequency, by the angular variation of its intensity [41,42], and possibly by its polarization.
The SNR for the measurement of an isotropic SGWB scales as $\sqrt{T_{\rm data}}$ under the assumption of stationary signal and noise [43]. Therefore, the presence of randomly distributed 1-day or 5-day gaps influences the signal detection capability only insofar as it influences the total duration of the data stream. We thus analyze only the three scenarios without gaps: continuous data for 3 years (Case T4C), continuous data for 3.75 years (Case T5C), and continuous data for 4.5 years (Case T6C). We perform two kinds of studies (cf. the low-frequency study [40]).
The first one, presented in Sect. 6.1, concerns the generic power-law signal and the specific signals defined in SI7.1 ("Characterise the astrophysical stochastic GW background") and SI7.2 ("Measure, or set upper limits on, the spectral shape of the cosmological stochastic GW background") of the SciRD [5], which read [...] where $\theta(f)$ is the Heaviside step function and $f_{1,2,3,4,5,6} = \{0.1, 0.8, 2, 15, 20, 100\}$ mHz. In Eq. (4), $n_T$ is the primordial spectral index; this case is sufficiently general to describe a spectrum arising from inflation, scaling sources like cosmic strings, or the tail of a broken power law as arising from a first-order phase transition. The spectrum given in Eq. (5) represents an astrophysical foreground of inspiraling binaries, characterized by the $f^{2/3}$ spectrum [44]. Finally, to probe a broken power-law SGWB from the early universe, Eq. (6) is a statement of the requisite sensitivity to achieve the target science goals [1]: it represents the minimal sensitivity requirement to detect either the infrared tail $f^3$ (if the peak is above 0.1 Hz), or the ultraviolet tail $1/f$ (if the peak is below 0.1 mHz), of a broken power-law signal from bubble collisions during a first-order phase transition [49]. This particular source has been chosen as a representative example.
The second study, presented in Sect. 6.2, considers the signals caused by two possible SGWB sources operating in the early universe.
Both studies show that changing the overall mission duration from 4 years to 5 years or 6 years (i.e. 3 years, 3.75 years and 4.5 years of continuous data stream) provides an insignificant detection improvement. In particular, SI7.1 and SI7.2 can be fulfilled in all three duration scenarios.
LISA is also sensitive to the angular variation of the SGWB intensity, as it has different sensitivity to different regions of the sky while orbiting around the Sun. The SNR for the detection of an SGWB anisotropy scales proportionally to $\sqrt{T_{\rm data}}$ [45]. On the other hand, gaps could influence the SGWB anisotropy characterization, as they might reduce the detector sensitivity to a particular region of the sky. If they appear with a random pattern (i.e. at random positions along the LISA orbit), it is conceivable that their influence is similar to that of a reduction in the overall mission duration. The worst-case scenario would be that of gaps with a periodicity that is a multiple of one year, so that LISA would always be blind at the times at which it is most sensitive to a specific region of the sky. However, we can foresee that LISA will be able to pick up the anisotropy of the SGWB only at very large scales, represented by the first few multipoles of the spherical harmonics expansion of the sky, say the first 10. Gaps with a duration of the order of a few days would correspond to a sensitivity loss at much smaller scales, for which the resolution of the instrument is already very low. On the basis of these arguments, we infer that the overall continuous data-stream duration, and the presence of gaps in the data stream, do not significantly alter the capability of LISA to characterize the anisotropy of the SGWB.
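The $\sqrt{T_{\rm data}}$ scaling used in this section can be illustrated with a short numerical sketch. It assumes the commonly used single-channel expression SNR = sqrt(T ∫ (Ω_signal/Ω_noise)² df) and a purely hypothetical, bucket-shaped noise curve (`omega_noise_toy`), which is not the SciRD sensitivity; the signal amplitude and all function names are likewise illustrative. Only the ratios between scenarios are meaningful.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def omega_signal(f, omega_0=1e-12, n_t=0.0):
    """Power-law SGWB, Omega_GW(f) = Omega_0 (f / 1 mHz)^n_T (illustrative amplitude)."""
    return omega_0 * (f / 1e-3) ** n_t


def omega_noise_toy(f):
    """Hypothetical bucket-shaped noise curve; NOT the SciRD sensitivity."""
    x = f / 3e-3
    return 1e-12 * (x ** -4 + 1.0 + x ** 3)


def snr(t_data_years, f_min=1e-4, f_max=0.1, n_points=2000):
    """SNR = sqrt(T * integral of (signal/noise)^2 df), trapezoidal rule on a log grid."""
    step = (math.log(f_max) - math.log(f_min)) / (n_points - 1)
    freqs = [f_min * math.exp(i * step) for i in range(n_points)]
    ratio_sq = [(omega_signal(f) / omega_noise_toy(f)) ** 2 for f in freqs]
    integral = sum(
        0.5 * (ratio_sq[i] + ratio_sq[i + 1]) * (freqs[i + 1] - freqs[i])
        for i in range(n_points - 1)
    )
    return math.sqrt(t_data_years * SECONDS_PER_YEAR * integral)


# Continuous-data durations (years) of the three scenarios without gaps.
scenarios = {"T4C": 3.0, "T5C": 3.75, "T6C": 4.5}
for name, t_data in scenarios.items():
    print(f"{name}: SNR / SNR(T4C) = {snr(t_data) / snr(scenarios['T4C']):.3f}")
```

For stationary signal and noise the ratios are independent of the spectral shapes: the SNR grows by about 12% for T5C and about 22% for T6C relative to T4C.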

Analysis of power law SGWB signals
To quantify the effect of increasing the overall continuous data-stream duration on the SGWB detection, we analyse the detection capabilities for the signals in Eqs. (4), (5) and (6) for the duration scenarios T4C, T5C and T6C. For the signal in Eq. (4), we adopt the fiducial detection criterion SNR > 10. For the duration scenarios T4C and T6C, this criterion is fulfilled in the parameter region $\{\Omega_0, n_T\}$ below the solid curve and the dashed curve of Fig. 11, respectively. The result highlights that for this kind of signal, the gain in parameter space from T4C to T6C is too small to justify an extension of the mission duration. We further investigate the detectability of the SGWBs in Eqs. (5) and (6) with a more elaborate detection criterion. Specifically, we adopt the Bayes factor B between a model with pure noise and a model with noise plus a generic power-law SGWB signal (see Ref. [46] for details). The result is shown in Fig. 12: the signals of SI7.1 and SI7.2 given in Eqs. (5)-(6) do satisfy B ≥ 100, meaning that they can be detected with high confidence even in the shortest mission duration Case T4C. Besides being detected, these signals are also reasonably well reconstructed. We use the SGWBinner code [47,48] to test this feature.
The SGWBinner code reconstructs the spectral shape of a SGWB signal in the LISA band, via parameter estimation of a series of power laws fitting the signal in frequency bins with adaptive size (the noise curve parameters are also reconstructed at the same time). In each bin, the reconstruction follows the parametrization $\Omega_{\rm GW} = \Omega_0 (f/f_*)^n$. At this stage of code development [48], we use a single TDI channel [47] as the consequent reconstruction improvements would rely on extra assumptions on the LISA noise. Figures 13 and 14 display the reconstruction prospects in the duration scenario T4C in the case of the SI7.1 and SI7.2 signals, respectively. Both signals can be reconstructed with reasonably small error bars even in the shortest mission duration scenario T4C.
Fig. 11 Contour regions of parameter space in which the SGWB signal $\Omega_{\rm GW}(f) = \Omega_0 (f/1\,{\rm mHz})^{n_T}$ has SNR > 10. This has been calculated using the SciRD sensitivity curve, for 3 years (solid line) and 4.5 years (dashed line) of continuous data stream
Fig. 12 The coloured contour lines represent the level of the signal amplitude $\Omega_{\rm GW}$ that would be detected with high confidence, B ≥ 100, for the three continuous data stream duration scenarios. The grey and pink lines represent the signals identified in SI7.1 and SI7.2, given in Eqs. (5)-(6)
In particular, the left and right panels of Fig. 15 show the 1σ and 2σ Fisher ellipses of the reconstructed parameters for the signal SI7.2 in the leftmost and rightmost reconstruction bins, respectively. Different colors correspond to different duration scenarios. In all duration scenarios, the reconstructed parameters are compatible with the true values (black dots) within 1σ. In the cases T5C and T6C, the areas of the 1σ ellipses are ∼1.1 and ∼1.4 times smaller than the area in the case T4C. The areas scale approximately as $1/T_{\rm data}$, corresponding to relative errors on the reconstruction parameters decreasing as $1/\sqrt{T_{\rm data}}$. A gain of about 20% in the parameter reconstruction of these signals is a target that should have lower priority than other possible improvements in the LISA mission.

Analysis of early universe sources
A first-order phase transition (FOPT) occurring in the primordial universe can generate a SGWB detectable by LISA. The FOPT parameters entering the SGWB signal are the transition temperature $T_*$, the strength α, the inverse relative duration $\beta/H_*$ and the bubble wall velocity $v_w$. Several mechanisms can source GWs: bubble wall collisions, and the sound waves and/or magnetohydrodynamic turbulence generated by them [49]. Here we focus on the GW signal produced by sound waves, which is the best characterized [50]. Fixing the FOPT temperature to 10 GeV, 80 GeV and 150 GeV, the bubble wall velocity to a highly relativistic value $v_w = 0.95$, and the number of relativistic degrees of freedom to $g_* = 100$, we quantify the gain in parameter space from increasing the continuous data stream duration. The result, shown in Fig. 16, is that the extra parameter region reached by increasing the mission duration from 4 to 6 years with D = 0.75 is too small to prioritize an extension of the mission (for details on the codes, see Refs. [51,52]). Concerning the 7442 FOPT benchmark points identified in Ref. [50], the detection prospects improve as follows: 478/7442 points for T4C; 516/7442 points for T5C; 538/7442 points for T6C.
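To put the benchmark counts quoted above in perspective, the following short sketch converts them into fractions of the full benchmark set and into gains relative to T4C. It uses only the numbers given in the text and assumes that they count the benchmark points within reach in each scenario; the variable names are illustrative.

```python
total_points = 7442  # FOPT benchmark points quoted in the text

# Counts quoted in the text for each duration scenario.
points_reached = {"T4C": 478, "T5C": 516, "T6C": 538}

for name, count in points_reached.items():
    fraction = 100.0 * count / total_points
    gain = 100.0 * (count / points_reached["T4C"] - 1.0)
    print(f"{name}: {fraction:.1f}% of benchmark points, {gain:+.1f}% relative to T4C")
```

The fraction rises only from about 6.4% (T4C) to about 7.2% (T6C), i.e. a relative gain of roughly 13%, consistent with the statement that the extra parameter region is small.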
A similar result is obtained in the case of the SGWB signal generated by second-order scalar perturbations, when the latter are enhanced by the presence of a bump in the primordial inflationary scalar power spectrum (see for instance [53] for the details of the computation). Figure 17 shows that not only is the gain in parameter space tiny, but the range of the parameter space which is scientifically the most relevant is also well within the reach of the three mission duration configurations. This corresponds to the range, in the amplitude of the bump of the scalar spectrum, for which this inflationary scenario leads to primordial black holes (PBHs) with masses that allow them to account for 100% of the dark matter in the Universe.

Constraints on dark matter
Fig. 16 The parameter region $\{\alpha, \beta/H_*\}$ that LISA can probe when the FOPT SGWB is dominated by the sound-wave contribution. The regions on the right of the curves, evaluated for some given values of $v_w$, $g_*$, $T_*$, are detectable with SNR > 10. Solid lines correspond to the scenario T4C while the dashed ones to T6C
Fig. 17 Red curve: amplitude of the scalar power spectrum that gives the totality of the dark matter being PBH, as a function of the PBH mass. Black curves: minimal amplitude needed to have SNR = 10 at LISA for T4C (solid line) and T6C (dotted line). The dotted vertical lines denote approximately the mass range of interest: the lower bound originates from the γ background due to PBH evaporation, and the higher bound originates from lensing (Subaru HSC)
Many theoretical models predict the existence of ultralight boson fields, which may be a significant fraction of the dark matter content in the Universe. Because of black hole superradiance, these fields may be sources of monochromatic GWs that can be detected either from isolated sources or as a stochastic background [54]. The analysis of Ref. [55] indicates that extending the mission duration from 4 to 6 years would increase LISA's sensitivity to resolvable and stochastic sources of this kind. The number of detectable events with phase-coherent searches scales as $T_{\rm data}^{3/2}$, while for semicoherent searches it scales as $T_{\rm data}^{3/4}$. For resolvable continuous GWs, this translates into a factor of ∼1.8 (∼1.4) increase in the number of sources detectable by a coherent (semicoherent) search. Mission duration also impacts the boundaries in parameter space of the expected constraints on boson masses. By extending the analysis of Refs. [55,56] to a more general mission duration, we find a difference in the interval of masses probed by this method of around 5-10% (e.g., a 4-year mission would constrain dark matter with particle masses in the range $[3.7 \times 10^{-19}, 2.3 \times 10^{-16}]$ eV, while a 6-year mission would constrain the range $[3.3 \times 10^{-19}, 2.7 \times 10^{-16}]$ eV). However, these numbers are heavily dependent on astrophysical population models that have large uncertainties.
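The factors quoted above for coherent and semicoherent searches follow directly from the stated power laws. A minimal check, assuming only those scalings and the extension from 4 to 6 years (the function name `gain_factor` is illustrative):

```python
def gain_factor(exponent, t_short=4.0, t_long=6.0):
    """Relative increase in detectable sources, assuming N ∝ T_data**exponent."""
    return (t_long / t_short) ** exponent


# Scaling exponents quoted in the text for the two search strategies.
search_exponents = {"coherent": 1.5, "semicoherent": 0.75}

for search, exponent in search_exponents.items():
    print(f"{search}: factor {gain_factor(exponent):.2f}")
```

This gives about 1.84 for the coherent search and about 1.36 for the semicoherent one, matching the rounded values ∼1.8 and ∼1.4 in the text. Because the duty cycle multiplies both durations equally, the ratio is the same whether elapsed or effective observing time is used.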
Searches for dark matter imprints on gravitational waveforms are not as developed [57][58][59][60][61][62][63]. Approaches using Newtonian expressions for dynamical friction, incorporating accretion but no backreaction on fluid-like dark matter configurations, find that the post-Newtonian (PN) phasing is affected at −5.5PN order [59,61]. For models where dark matter is an ultralight field the correction is a −6PN effect [64]. The impact of the duration of the mission can be estimated by connecting this phenomenology to the PN parameters (cf. Sect. 8). In most situations, the difference between a 4-year and a 6-year mission is a factor of 2 improvement in the constraints on dark matter density. This general prediction was confirmed by large N-body simulations of IMRIs in some particular scenarios [63]. Figure 18 shows the dephasing in the GW signal for two DM profiles, as a function of mission duration. The dephasing grows linearly (or faster) with the observation time. In some cases where the dephasing may be marginal (of order 1 cycle), increasing the observation time can be important for getting an effect large enough to be detectable.
Fig. 18 Change in the number of cycles due to a DM spike with respect to the vacuum case, for different total observation times, with the observation ending at the merger. These results were obtained by adapting the code developed in Ref. [63]. They refer to a central IMBH of mass $M_1 = 10^5 M_\odot$ and different masses $M_2$ for the smaller compact object, as shown in the legend. The difference between the two plots is in the properties of the DM spike (parametrized by $\gamma_{\rm sp}$)

Tests of general relativity
We now ask how the LISA mission duration affects our ability to test general relativity (GR) with LISA. We quantify the effect of mission duration by using parametrized tests and inspiral-merger-ringdown consistency tests.

Parametrized tests
In GR, the GW signal in the time domain can be written in the form $h(t, k) = A_{\rm GR}(t, k)\, e^{i\Phi_{\rm GR}(t, k)}$, where $A_{\rm GR}(t, k)$ is the amplitude and $\Phi_{\rm GR}(t, k)$ is the phase of the wave. These two quantities are the main observables. Non-GR effects can be classified into two categories: emission effects and propagation effects. Emission and propagation effects can modify both the amplitude and the phase of GW signals [65][66][67].
Let us first discuss the non-GR corrections to the amplitude. The amplitude is given by an initial amplitude at emission $A^i_{\rm GR}(t, k)$ multiplied by the transfer function $T_{\rm GR}(t, k)$ encoding information about the cosmological evolution, i.e. $A_{\rm GR}(t, k) = A^i_{\rm GR}(t, k)\, T_{\rm GR}(t, k)$. Corrections due to modified emission can be simply mimicked by taking the appropriate modified function $A^i_{\rm non-GR}(t, k)$ as the initial condition. If the background evolution is not ΛCDM, one would capture that with an appropriate transfer function $T_{\rm non-GR}(t, k)$. The precise measurement of the amplitude will be crucial, for instance, for the GW luminosity distance, enabling us to provide an independent measurement of the expansion rate $H_0$. Since there will be degeneracies between the dimming of the amplitude due to the expansion and due to new physics, one will need to theoretically model and observe the merger rate of compact binaries as a function of redshift. For instance, if the gravity theory contains additional non-abelian gauge fields [68] or tensor fields [69] belonging to the dark sector, they will yield a periodic effect on the amplitude due to GW oscillations. These effects can be parametrized in a model-independent way and tested against the redshift information. Therefore, the LISA mission duration will be crucial to obtain good statistics to break such degeneracies [70]. In the following we will solely focus on the modifications in the waveform phase and work in the Fourier domain.
Non-GR corrections to the inspiral part of the waveform phase in the Fourier domain can be prescribed within the parametrized post-Einstein (ppE) formalism [71] (or the generalized IMRPhenom formalism [72,73], which has a one-to-one correspondence with the ppE parametrization for corrections entering in the inspiral waveform [65]) as
$$\Psi = \Psi_{\rm GR} + \beta\, u^{2n-5}, \qquad (7)$$
where $\Psi_{\rm GR}$ is the waveform phase in GR and $u \equiv (\pi \mathcal{M} f)^{1/3}$. Here $\mathcal{M}$ and $f$ denote the chirp mass of the binary and the GW frequency, β represents the non-GR correction parameter, and the index n indicates that the correction enters at nth PN order relative to GR. Such a theory-agnostic formalism can be mapped to violations of various fundamental aspects of GR, such as the strong equivalence principle (time variation of G at −4PN, scalar dipole radiation at −1PN), Lorentz invariance (−1PN and 0PN), parity invariance (2PN), or a nonzero graviton mass (1PN) [65,66]. Such a formalism also allows us to probe dark matter effects (e.g., gravitational drag at −5.5PN or −6PN [61,64]) and frequency-dependent departures of the GW propagation speed from $c_T = 1$ (in this case, the PN order depends on the form of the dispersion relation). The top panel of Fig. 19 presents the ratio of the upper bound on β between continuous 3 years vs. 4.5 years observations. This ratio measures the improvement in tests of GR with 4.5 years of observation relative to 3 years of observation, and shows that the typical improvement is by a factor of 1-2. Following Ref. [65], the IMRPhenomD waveform has been used for the GR part of the waveform, and the measurability of β is estimated through a Fisher matrix analysis. EMRIs have a different behavior from other systems, probably because the dynamical frequency range is small, and longer observations help to break the degeneracy between β (at positive PN orders) and other parameters, like the masses. We assumed that the observation starts $T_{\rm data}$ before coalescence, which is the optimal case.
If we cannot detect the merger, it would be difficult to break the degeneracy between β and other parameters even for probing negative PN effects, and thus the measurability of β becomes much worse than the case considered here.
The bottom panel of Fig. 19 shows a similar result, but including gaps in observations. The bounds on β can improve by a factor of 3 compared to the continuous 3 years observation case. With a fixed elapsed time of 4 years, the improvement is up to a factor of 2. This is because the case with gaps can have a wider dynamical frequency range when performing a Fisher analysis. We also see that longer gap durations yield better improvements at probing non-GR effects in these examples. This is possibly because there is a significant difference in the frequency evolution in the last segment of observation (that contains the merger) compared to all the other segments. The amount of frequency change for the case of 5-day gaps (with 15 days observation segment) is larger than for 1-day gaps (with 3 days observation segment), which further helps to break the degeneracy between β and other parameters.
We can give a rough estimate of how Δβ scales with the observation time $T_{\rm data}$ at negative PN orders (at positive PN orders β has strong correlations with other parameters, and thus it is not easy to find such a scaling). If we neglect correlations between β and other parameters, Δβ is roughly given by

$$\Delta\beta \approx \left[ 4 \int_{f_{\min}}^{f_{\max}} \frac{|\partial_\beta \tilde h|^2}{S_n}\, df \right]^{-1/2}.$$

Here $f_{\min}$ and $f_{\max}$ are the minimum and maximum cut-off frequencies, $\tilde h$ is the waveform in Fourier space, and $S_n$ is the noise spectral density. The absolute value of the waveform amplitude in the frequency domain scales like $|\tilde h| \propto f^{-7/6}$, and $\partial_\beta \tilde h \equiv \partial \tilde h/\partial\beta \propto \tilde h\, f^{(2n-5)/3}$. Assuming a simple scaling for the noise as $S_n \propto f^{s}$, one finds, when the integral is dominated by its lower limit,

$$\Delta\beta \propto f_{\min}^{-(4n-3s-14)/6}.$$

Assuming that we start the observation a time $T_{\rm data}$ before coalescence, we have $f_{\min} \propto T_{\rm data}^{-3/8}$, and therefore

$$\Delta\beta \propto T_{\rm data}^{(4n-3s-14)/16}. \qquad (10)$$

We show this scaling in the top panel of Fig. 19 for s = 0 and s = −6. Observe that this analytic estimate with s = 0 agrees almost perfectly with the numerical result for the system with $(60, 50)\,M_\odot$. This is because $f_{\min}$ for such a system is $\sim 0.01$ Hz, where $S_n \propto f^{0}$. On the other hand, for systems with larger masses, $f_{\min}$ is much lower and the numerical results can be better captured with $S_n \propto f^{-6}$, which is the frequency dependence of the noise at low frequency. The deviation from this scaling is due to the various approximations used in this rough estimate, and in particular to the degeneracy between β and other parameters.

Fig. 19 Top: Improvement on constraining the non-GR parameter β in the phase, cf. Eq. (7), at different PN orders with a continuous $T_{\rm data} = 4.5$ years observation (scenario T6C in Sect. 1) relative to a $T_{\rm data} = 3$ years observation (scenario T4C) for various example systems. We assume that the observation starts at a time $T_{\rm data}$ before coalescence. The detector's low-frequency cutoff is assumed to be $10^{-4}$ Hz for all cases, except for the SMBH binary system $(2 \times 10^{6}, 10^{6})\,M_\odot$, for which we assumed the detector cutoff frequency to be at $10^{-5}$ Hz. If the cutoff frequency were at $10^{-4}$ Hz there would be no difference in terms of measuring β between the 3 years and 4.5 years cases for this SMBH binary system (the frequency 3 years before coalescence is already outside of this cutoff frequency, and thus a longer observation time does not change the measurability of β). We also show the rough analytic estimate of Eq. (10), or more precisely the quantity $(4.5/3)^{(4n-3s-14)/16}$ with s = 0 and s = −6. Bottom: same as in the top panel, but now including gaps in the observation. We compare the measurability of β for the 4 scenarios with gaps in Sect. 1 against the case with a continuous observation for 3 years (T4C). We assumed that mergers occur outside of the gaps.
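As a quick numerical illustration of the rough analytic estimate quoted in the caption of Fig. 19, the sketch below evaluates the quantity $(4.5/3)^{(4n-3s-14)/16}$ for a few cases. The function name and the specific PN orders chosen (n = −4 and n = −1) are illustrative assumptions; only the exponent, the noise slopes s = 0 and s = −6, and the observation times of 3 and 4.5 years are taken from the text.

```python
def delta_beta_ratio(n, s, t_long=4.5, t_short=3.0):
    """Ratio Delta-beta(t_long) / Delta-beta(t_short) under the rough scaling
    Delta-beta ∝ T_data**((4n - 3s - 14) / 16).

    n is the PN order (assumed convention: phase correction ∝ f**((2n - 5)/3)),
    s is the assumed noise slope in S_n ∝ f**s.
    Values below 1 mean the bound tightens with the longer observation.
    """
    exponent = (4.0 * n - 3.0 * s - 14.0) / 16.0
    return (t_long / t_short) ** exponent


if __name__ == "__main__":
    # Illustrative (assumed) PN orders; the two noise slopes are those used in the text.
    for n in (-4, -1):
        for s in (0, -6):
            ratio = delta_beta_ratio(n, s)
            print(f"n = {n:+d}, s = {s:+d}: ratio = {ratio:.3f}, "
                  f"improvement = {1.0 / ratio:.2f}x")
```

Under these assumptions the ratio falls below one whenever the exponent is negative; for n = −4 and s = 0 the improvement is roughly a factor of 2, while a steeper low-frequency noise slope (s = −6) reduces the gain.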

Inspiral-merger-ringdown consistency tests
Another model-independent test of GR with GWs is the inspiral-merger-ringdown consistency test [72,73,75-78], where we measure the final mass and spin of the remnant black hole with inspiral and merger-ringdown independently and check the consistency between the two measurements. We studied how such tests are affected by the mission duration for the two sources with masses $(10^{5}, 5 \times 10^{4})\,M_\odot$ and $(60, 50)\,M_\odot$ considered in the top panel of Fig. 19. As expected, the mission duration only changes the final mass and spin estimate from the inspiral portion, though the difference is small. We conclude that, at least for the systems studied here, the inspiral-merger-ringdown consistency tests are almost unaffected by the duration of the observation. In summary, longer observation times mainly improve bounds on non-GR effects entering at negative PN orders (such as varying-G effects) by a factor of 2-3. As shown in Table 3, a longer observation also yields more events, and hence better statistics.

Table 3 The first column indicates whether the number of events scales with the actual observing time $T_{\rm data}$; the second column indicates whether we expect better constraints and their scaling with the mission duration time $T_{\rm elapsed}$ ($= T_{\rm data}/D$ with a duty cycle D); the third column indicates whether we expect more statistics (e.g., mode stacking, coherent searches, etc.) [74,79-82]

Testing the nature of black holes
A key component of the LISA mission's scientific objectives is to test the nature of BHs and search for other dark compact objects [83]. In particular, elements of SO5 are addressed by investigations of these types, including SI5.1 ("Use ring-down characteristics observed in MBHB coalescences to test whether the post-merger objects are the black holes predicted by GR") and SI5.2 ("Use EMRIs to explore the multipolar structure of MBHs"). These investigations share methodologies with tests of the foundations of the gravitational interaction (Sect. 8) so, as a rule of thumb, we expect the same potential limitations due to a decrease of the effective mission duration.

Tests of the nature of black holes
Here we briefly list the tests of the nature of BHs and searches for compact objects we considered in this study.

Inspiral-based test with MBHBs, IMBHBs, and EMRIs
The sources for these tests are compact binaries in various ranges of masses and mass ratios. The dynamics of these binaries will be affected by dipolar radiation if the objects are charged (either under an EM or a dark field). It will also be impacted if the multipolar structure of the binary components differs from that predicted for a Kerr BH, where all multipoles are determined by the mass and spin through elegant relations [84]. In particular, smoking guns of the non-Kerrness of an object would be the presence of moments that break equatorial symmetry or axisymmetry, as in the case of multipolar boson stars [85] and of fuzzball microstate geometries [86-90]; or the lack of efficient absorption of radiation by the objects (i.e. tidal heating), at variance with the BH case. For EMRIs in the LISA band, measurements of tidal heating can be used to put a very stringent upper bound on the reflectivity of the object's surface, at the level of 0.01% [91]. In addition, the presence of tidal deformability effects (other than the aforementioned tidal heating), which are absent for BHs [92,93] but are generically non-zero for other objects, can leave detectable imprints in the LISA band [94-96].

Ringdown tests
Measuring the ringdown modes in the post-merger signal of a binary coalescence provides a clean and robust way to test the nature of the remnant. Detecting several QNMs would allow for multiple independent null-hypothesis (Kerr) tests, and enable GW spectroscopy [97,98], in particular for golden events [80,99]. Besides deforming the QNM spectrum, if the remnant differs from a Kerr BH, some further smoking gun deviations in the prompt ringdown can be the presence of other modes or extra degrees of freedom and the existence of mode doublets arising from isospectrality breaking [100]. Even in the absence of deviations in the prompt ringdown, GW echoes [101-103] in the late-time post-merger signal of a compact binary coalescence might be a generic smoking gun of new physics at the horizon scale (see [83,104] for some recent reviews). The echo amplitude depends on the object's reflectivity [105], which can be constrained only by SNRs of $\mathcal{O}(100)$ in the post-merger phase [106,107]. This makes LISA particularly well suited for echo searches and gives the tantalizing prospect of probing the near-horizon (possibly quantum) structure of dark compact objects. Finally, the high sensitivity of LISA could be used to test proposals for the area quantization of BHs [108,109] with suitably modified inspiral-merger-ringdown signals [110-112].

Quantifying the impact of a change in mission duration
The impact of a change in mission duration depends on the relative magnitude of the signal duration $T_{\rm signal}$ and of the mission duration $T_{\rm elapsed}$ (recall that the actual observing time is $T_{\rm data} = D \times T_{\rm elapsed}$, where D is the duty cycle). For tests of the nature of BHs, we expect four different scenarios (summarized in Table 3):

Case a: $T_{\rm signal} \ll T_{\rm elapsed}$. For signals that are short relative to the mission duration, we expect the primary benefit of a longer mission to be the detection of a larger number of signals, with $N_{\rm signal} \propto T_{\rm data}$. Multiple events can be combined in order to derive constraints on the nature of black holes, and such constraints should obey the usual $1/\sqrt{N_{\rm signal}}$ scaling (in the limit of a large number of similar detections). Thus, for these shorter transients, we expect bounds to improve as $\sqrt{T_{\rm data}}$. Signals in this category would include MBHBs, which are the primary candidates for no-hair tests with ringdown, for those parametrized inspiral tests which are impacted by properties of the BHs, and for post-merger echo searches of deviations from classical horizons [103-106].

Case b: $T_{\rm signal} \gg T_{\rm elapsed}$. For sources with signals that are long compared to the mission lifetime, increasing the mission duration could have a much stronger impact on the measurements. SOBHBs, when they last a significant portion of the mission duration, fall into this category, as do Galactic binaries which include BHs. The impact of mission duration on constraints for this class of systems depends on the scaling of the phase evolution with time for a given source. For approximately monochromatic sources, a change in the frequency derivative due to non-GR/non-BH effects would result in a phase drift $\propto T_{\rm elapsed}^{2}$, as discussed in Sect. 3. For these sources we then expect constraints to scale as $T_{\rm elapsed}^{2}$, and the number of detections will scale better than $T_{\rm data}$, since quiet signals can accumulate SNR over the entire observing time $T_{\rm data}$.

Case c: $T_{\rm signal} \sim T_{\rm elapsed}$. EMRI events are the most representative example of an intermediate case, and are particularly relevant for tests of the nature of supermassive objects since they can potentially provide unparalleled constraints. EMRIs can last a significant amount of time, and so we expect that the number of EMRI detections will improve faster than linearly with $T_{\rm data}$, as discussed in Sect. 4. A simple way to estimate the impact of mission duration on the detection of these sources is to require that the system be observed for at least some amount of observation time $T_0$ before it can be used. Then the amount of time during which these signals can actually be detected is $T_{\rm det} = T_{\rm data} - T_0$. By increasing $T_{\rm data}$ by a factor γ, we see that $T_{\rm det} \to \gamma T_{\rm det} + T_0(\gamma - 1)$. This results in an increase in the number of detections which is linear in γ, but with an additive factor. The lowest mass MBHBs will also take a significant amount of time to inspiral, and are covered by this intermediate case.

Case d: Rare golden events. Finally, for certain scientific goals and especially for precision tests of gravity and of the nature of BHs, rare golden events can make a major difference, since they are paramount for major and groundbreaking discoveries. The probability of detecting one or more rare events scales approximately with the amount of time observed, so that the expected number of such rare events also scales linearly with $T_{\rm data}$.
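The scalings of cases a to c can be made concrete with a small numerical sketch. The function names and the minimum observation time `t0` of 1 year are hypothetical choices made for illustration only; the duty cycle D = 0.75 and the 4-year and 6-year mission durations are those discussed in the text.

```python
import math


def stacked_bound_gain(gamma):
    """Case a: N_signal ∝ T_data and stacked bounds scale as 1/sqrt(N_signal),
    so stretching T_data by gamma tightens the bounds by sqrt(gamma)."""
    return math.sqrt(gamma)


def phase_drift_gain(gamma):
    """Case b: the phase drift of a nearly monochromatic source grows as T**2."""
    return gamma ** 2


def detection_window(t_data, t0):
    """Case c: time during which a long-lived signal can be detected,
    T_det = T_data - T_0 (zero if the data span is shorter than T_0)."""
    return max(t_data - t0, 0.0)


if __name__ == "__main__":
    duty_cycle = 0.75
    t_data_short = duty_cycle * 4.0   # 3 years of data from a 4-year mission
    t_data_long = duty_cycle * 6.0    # 4.5 years of data from a 6-year mission
    gamma = t_data_long / t_data_short
    t0 = 1.0  # hypothetical minimum observation time, in years

    t_det_short = detection_window(t_data_short, t0)
    t_det_long = detection_window(t_data_long, t0)

    print(f"gamma = {gamma:.2f}")
    print(f"case a, stacked bounds improve by {stacked_bound_gain(gamma):.2f}x")
    print(f"case b, phase drift grows by {phase_drift_gain(gamma):.2f}x")
    print(f"case c, detection window grows by {t_det_long / t_det_short:.2f}x")
```

With a nonzero `t0`, the detection window in case c grows by more than γ, which is the additive term $T_0(\gamma - 1)$ in the expression above.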
We conclude that SO5 of the LISA mission proposal, "Explore the fundamental nature of gravity and black holes," would be facilitated by a longer mission duration, with the expected number of events (including rare golden events of paramount importance for fundamental physics) increasing linearly with mission duration. In some cases, especially for long-duration signals, we expect better than linear improvement in the number of detected events and/or in the constraints derived from each event.

Table 4 SOs are listed in ascending order following the LISA proposal [1]. The hyperlinks in parentheses next to each SO refer to the sections of the present document used to draw the conclusions summarized in this table. The different colors (red, green, yellow, blue) indicate whether each SO/SI goal is met, according to the interpretation provided in the main text. In the definition of gaps, "one" means that the data set is only reduced by a factor of D = 0.75 relative to $T_{\rm elapsed}$ as a consequence of a single long gap either at the beginning or at the end of the mission: for example, in scenario T4C we have $T_{\rm data} = D \times T_{\rm elapsed} = 3$ years of continuous data. White entries appear because for SI6.1 we could not study the effect of gaps

Conclusions
In this paper we have examined the performance of the various scenarios described in the introduction with respect to the LISA SOs defined in the mission proposal [1] for the configuration SciRD. An in-depth scrutiny of the scientific capabilities of LISA has revealed that the adopted mission duration has a strong impact on several SOs and SIs, as defined in the LISA proposal. Although all areas of LISA science (astrophysics, cosmology, and fundamental physics) are affected to some extent, the impact is more prominent for some of the astrophysics goals.
Our main findings are summarized in Table 4, where the color code has the following interpretation:
- green: the objective, as defined in the LISA proposal for SciRD, can be achieved;
- yellow: we cannot establish whether the objective can be achieved, because of astrophysical uncertainties, or because the results would need deeper verification. Nonetheless, our investigation points towards a substantial performance degradation compared to SciRD;
- red: there is a significant danger of failing the objective as defined for SciRD;
- blue: there is an improvement in the capabilities of the instrument compared to SciRD.
In the table we only list the SOs for which configuration T4C (i.e. a reduction in the usable data stream due to the D = 0.75 duty cycle) corresponds to either a degradation of the SO (yellow) or danger of failing the goals stated in the LISA proposal (red).
Based on the analysis presented in this paper, we strongly recommend an extension to 6 years of mission operation. The recommendation is based on the following assessment of the impact of mission duration on individual LISA SOs.
- SO1. SI1.1: Enable joint gravitational and EM observations of Galactic binaries to study the interplay between gravitational radiation and tidal dissipation in interacting stellar systems. The study of this interplay relies on the measurement of the frequency derivatives of the GW signal, to discriminate GW vs mass transfer driven evolution, unveil tidal interactions, etc. The number of Galactic binaries for which $\dot f$ and $\ddot f$ can be measured scales with $T_{\rm elapsed}^{2}$ and $T_{\rm elapsed}^{3}$, respectively. The benefits of extending the mission to 6 years are therefore clear, especially when considering that measuring $\ddot f$ will be feasible only for a handful of sources.
- SO2. SI2.1: Search for seed black holes at cosmic dawn. Inclusion of gaps in the data stream for a total duty cycle D = 0.75 significantly affects the number of observable high-redshift (z > 10), low-mass ($M < 10^{3}\,M_\odot$) MBHBs. In our standard models, assuming a 4-years mission, those are reduced from ≈ 25 for SciRD to 10 for the scenarios with gaps in the data (T4G5 and T4G1). For more pessimistic scenarios, the number of low-mass MBHB detections decreases from ≈ 10 to 6. Those are dangerously low numbers that can jeopardize our ability to reconstruct the nature of the first MBH seeds. Extending the mission to 6 years would put this investigation on safer ground, increasing the low-mass/high-redshift seed MBH sample from 10 to 15 in our standard model. This is necessary in order to address SI2.1.
- SO2. SI2.3: Observation of EM counterparts to unveil the astrophysical environment around merging MBHBs. The number of sources at z < 2, which are the primary targets for EM follow-ups, is expected to be small ($\approx 2\ {\rm yr}^{-1}$ in our fiducial models). Compared to SciRD, the presence of a 0.75 duty cycle will severely degrade the SNR and sky localization of ≈ 30% of these sources, posing a significant threat to the success of associated EM searches. It is therefore essential to extend the mission to 6 years, which mitigates the risk of failing SI2.3, since the number of detected massive and nearby sources scales with $T_{\rm elapsed}$.
- SO3. SI3.1: Study the immediate environment of Milky Way-like MBHs at low redshift. The presence of gaps in the data will make it harder to observe EMRIs up to z ≈ 4, which is the goal stated in the LISA proposal. Our simulations indicate that the number of observable EMRIs scales with $\approx T_{\rm data}^{3/2}$, roughly doubling the number of observed systems for a mission duration extension from 4 to 6 years. This will mitigate the chances of missing EMRIs altogether should we face the most pessimistic astrophysical scenarios, which forecast ≈ 1 observable EMRI per year. The SNR for detecting deviations in EMRI waveforms due to environmental effects (e.g., the SOBH's interaction with circumbinary gas) scales more steeply with mission duration, as $\propto T^{2}$-$T^{3}$ [35], further justifying the extension to a 6 years mission.
- SO4. SI4.1: Study the close environment of stellar-origin black holes (SOBHs) by enabling multi-band and multi-messenger observations at the time of coalescence. The inclusion of gaps in the data and the relaxed high-frequency sensitivity requirement (by a factor 1.5 compared to the LISA proposal design) pose major obstacles to the fulfillment of this objective. With 4 years of observations and D = 0.75 duty cycle (i.e. $T_{\rm data} = 3$ years, as per T4C, T4G5, and T4G1), the expectation is to observe a couple of multi-band sources, and the SNR > 8 goal on GW150914-like sources is difficult to achieve. An extension to 6 years will double the number of multi-band systems, which is crucial for SI4.1.
- SO4. SI4.2: Disentangle SOBH binary formation channels. For the reasons mentioned above, the number of detectable SOBHs is likely going to be $\mathcal{O}(10)$, which might be insufficient to statistically discriminate formation channels via eccentricity measurements. Since the number of observable SOBHs also scales with $\approx T_{\rm data}^{3/2}$, an extension to 6 years will double the number of detections, allowing for a better measurement of the eccentricity distribution, which is of paramount importance for SI4.2.
- SO5. SI5.2: Use of EMRIs to test multipolar structure. Mapping the spacetime around a BH using an EMRI signal is not endangered by a mission duration of 4 years if EMRIs are observed (cf. SI3.1 above), but weak EMRI signals will build up throughout the mission duration. A longer mission thus results in improvements that scale faster than linear with the mission lifetime for these tests.
- SO5. SI5.3 and SI5.4: Propagation properties of GWs and other emission channels. Many fundamental questions in gravitational physics, such as the dispersion effects induced by a nonzero graviton mass, the existence of dipolar charges, a time-varying Newton's constant, and environmental effects due (say) to dark matter, can be addressed jointly via a parametrized formalism. Mission duration has an impact on our ability to constrain these parameters, especially when they affect the waveform at low frequencies, and require long observation times to remove degeneracies. A rough scaling of the bounds on the associated ppE coefficients with mission duration is given in Eq. (10): for example, the bounds on environmental and dark matter effects will degrade by up to a factor of two if the mission lifetime is reduced from 6 to 4 years. For the reasons highlighted above (SO2. SI2.3), we may also miss several golden events, and this would affect BH spectroscopy tests based on the detection of multiple harmonics of the ringdown.
- SO5. SI5.5: Test the existence of ultralight fields and discover dark matter spikes. Ultralight fields can produce monochromatic GW signals through superradiance. The mission duration has a significant impact on the number of resolvable sources of such monochromatic GWs, which scales super-linearly (cf. Table 3). Therefore, mission duration affects our ability to discover ultralight dark matter. It also impacts the constraints on the local dark matter density in some binaries, with up to a factor of 2 improvement if the mission is extended from 4 to 6 years.
- SO6. SI6.1 and SI6.2: Probe the rate of expansion of the Universe. Different categories of sources enable LISA to probe the expansion of the Universe at different redshifts. In particular, SI6.1 selects SOBHBs and EMRIs as distance indicators, to probe the Hubble parameter today with statistical identification of the redshift. Preliminary results using EMRIs alone seem to indicate that a 5-years mission with D = 0.75 (i.e. configuration T5C) is the minimum necessary to constrain the Hubble parameter today to better than 2% (SI6.1). SI6.2 selects MBHBs as distance indicators, with redshift identification coming from an EM counterpart. We propose a new FoM to meet this science objective, i.e. the measurement of the Hubble rate at redshift 2. In the most pessimistic astrophysical scenario for the MBHB formation channel, the FoM cannot be met, but it can be met for two more optimistic scenarios.
- SO7. SI7.1 and SI7.2: Understand stochastic GW backgrounds and their implications for the early Universe and TeV-scale particle physics. While extending the overall mission duration would improve the science return of LISA concerning SO7, both SI7.1 and SI7.2 can be met with 3 years of continuous data. Gaps are not expected to affect the detection of a stochastic GW background.
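The power-law scalings quoted in the list above translate into simple gain factors for an extension from 4 to 6 years. The sketch below is only a back-of-the-envelope illustration: the function names and dictionary labels are hypothetical, and the duty cycle is assumed to stay fixed at D = 0.75, so that ratios of $T_{\rm data}$ and of $T_{\rm elapsed}$ coincide.

```python
def effective_data_span(t_elapsed, duty_cycle=0.75):
    """Usable data span T_data = D * T_elapsed, in the same units as t_elapsed."""
    return duty_cycle * t_elapsed


def power_law_gain(t_new, t_old, index):
    """Gain factor for a quantity that scales as T**index."""
    return (t_new / t_old) ** index


# Scaling indices quoted in the text; the labels are illustrative only.
SCALING_INDICES = {
    "Galactic binaries with measurable f-dot (SI1.1)": 2.0,
    "Galactic binaries with measurable f-ddot (SI1.1)": 3.0,
    "massive nearby MBHBs (SI2.3)": 1.0,
    "observable EMRIs (SI3.1)": 1.5,
    "observable SOBHs (SI4.2)": 1.5,
}

if __name__ == "__main__":
    t_old = effective_data_span(4.0)  # 3 years of data
    t_new = effective_data_span(6.0)  # 4.5 years of data
    print(f"T_data: {t_old:.1f} yr -> {t_new:.1f} yr")
    for label, index in SCALING_INDICES.items():
        gain = power_law_gain(t_new, t_old, index)
        print(f"{label}: x{gain:.2f}")
```

For the $T^{3/2}$ scaling this gives a gain of about 1.8, consistent with the "roughly doubling" quoted for EMRIs and SOBHs.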
In summary, the introduction of a 75% duty cycle on a 4-years mission duration (i.e. configurations T4C, T4G5, and T4G1) has a detrimental effect on several of the SOs and SIs that are the foundation of the LISA science case.