1 Introduction

Massive stars, with masses above ∼8-10 \(M_{\odot}\), significantly impact the environment in which they are born. They affect the thermal properties and chemical composition of the parent cloud via photoionization and by heating the dust in the circumstellar discs of their neighbouring low-mass stars. They end their lives as supernovae, impacting the surroundings and the subsequent star formation by depositing large amounts of heavy elements and increasing the level of turbulence within the cloud. Different scenarios have been proposed to explain the formation of massive stars; however, no consensus has yet been reached on how this occurs (see McKee and Ostriker 2007, Zinnecker and Yorke 2007 or Motte et al. 2018 for a review). Bonnell et al. (1997, 2001) proposed the competitive core accretion model, where all bound objects accrete gas from their surroundings; objects located at the center of the cloud become more massive than those in the outskirts, thanks to the favourable conditions for accretion at the bottom of the potential well. McKee and Tan (2002, 2003) proposed a scaled-up version of the low-mass star paradigm (Shu et al. 1987), termed the turbulent core accretion model, where massive prestellar cores are supposed to have high gas pressures, supersonic turbulence and significant magnetic support, leading to a rather slow, almost monolithic and unfragmented collapse. One of the largest differences between these two models is the predicted collapse timescale. Therefore, a proper analysis of the collapse evolution is needed to distinguish between different formation scenarios. A useful observational tool to measure such timescales is provided by chemical clocks, i.e., molecular tracers that show drastic abundance changes with density and temperature variations as a function of time (e.g., Beuther et al. 2009, Fontani et al. 2011b, Molinari et al. 2016, Urquhart et al. 2019, Coletta et al. 2020, Sabatini et al. 2021, Mininni et al. 2021). To exploit this technique, it is possible to combine the information from different tracers. However, this is highly dependent on the observability of each molecular transition.

Infrared Dark Clouds (IRDC), massive quiescent clouds that represent the most likely birthplaces for the next generation of high-mass stars, are characterized by low gas temperatures (\(T_{\mathrm{gas}}<20\) K), high gas column densities (\(N_{\mathrm{gas}}\sim 10^{23-25}\text{ cm}^{-2}\)) and a large degree of CO-depletion (e.g., Fontani et al. 2011a, Caselli and Ceccarelli 2012, Caselli 2013, Giannetti et al. 2019, Jorgensen et al. 2020, Sabatini et al. 2020; see also Bergin and Tafalla 2007 for a review). The absence of C-bearing molecules in the gas phase enables further chemical reactions to take place, such as the enhancement of the abundances of deuterated molecules. Deuteration reactions start from the proton-deuteron exchange between HD and \(\mathrm{H}_{3}^{+}\),

$$\begin{aligned} {\mathrm{H}}_{3}^{+} +{\mathrm{HD}} \rightleftharpoons &{\mathrm{H}}_{2}{\mathrm{D}}^{+} +{ \mathrm{H}}_{2} +\Delta E_{1}, \end{aligned}$$
(1)
$$\begin{aligned} {\mathrm{H}}_{2}{\mathrm{D}}^{+} +{\mathrm{HD}} \rightleftharpoons &{\mathrm{D}}_{2}{\mathrm{H}}^{+} +{\mathrm{H}}_{2} +\Delta E_{2}, \end{aligned}$$
(2)
$$\begin{aligned} {\mathrm{D}}_{2}{\mathrm{H}}^{+} +{\mathrm{HD}} \rightleftharpoons &{\mathrm{D}}_{3}^{+} +{ \mathrm{H}}_{2} +\Delta E_{3}, \end{aligned}$$
(3)

where \(\Delta E_{1}\), \(\Delta E_{2}\) and \(\Delta E_{3}\) depend on the isomers involved in each reaction (Hugo et al. 2007). This set of reactions increases the abundances of deuterated species such as H2D+ and D2H+ over time. However, they can also be efficiently destroyed by the presence of CO in the gas phase, via

$$\begin{aligned} {\mathrm{H}}_{3}^{+} +{\mathrm{CO}} \rightleftharpoons &{\mathrm{HCO}}^{+} +{\mathrm{H}}_{2}, \end{aligned}$$
(4)
$$\begin{aligned} {\mathrm{H}}_{2}{\mathrm{D}}^{+} +{\mathrm{CO}} \rightleftharpoons &{\mathrm{DCO}}^{+} +{ \mathrm{H}}_{2}. \end{aligned}$$
(5)

In the cold and dense inner regions of IRDCs, the absence of CO boosts the formation of deuterated species and provides an insight into the onset of collapse in prestellar cores (Dalgarno and Lepp 1984). Models of the evolution of such deuteration reactions (Walmsley et al. 2004; van der Tak et al. 2005; Flower et al. 2006; Sipilä et al. 2013; Bovino et al. 2019) show a maximum H2D+ abundance right before the formation of a protostellar object, which would then heat up the gas and evaporate the CO from the dust grain surfaces, converting H2D+ back into DCO+ (following reaction (5)), and decreasing the abundance of H2D+.

All the models that attempt to estimate the evolution of deuteration reactions are sensitive to the initial value of the ratio between the different isomers of H2 (ortho-to-para ratio). This ratio can be observationally traced using the H2D+ and D2H+ molecules as proxies (e.g., Flower et al. 2006, Hugo et al. 2009, Brünken et al. 2014, Bovino et al. 2021). Unfortunately, observations of low-energy transitions from para-H2D+ suffer from large atmospheric attenuation at terahertz frequencies; hence, most of the efforts have been directed to observing the sub-millimeter transition of ortho-H2D+ \(J_{\mathrm{{K_{a}, K_{c}}}} = 1_{10}\)-\(1_{11}\) at ∼372.42 GHz (Amano and Hirao 2005). Several detections have been reported toward low-mass cores (Caselli et al. 2003; Vastel et al. 2006; Caselli et al. 2008; Parise et al. 2011; Friesen et al. 2014; Brünken et al. 2014; Miettinen 2020), since they are relatively close and are therefore easier to detect. In the context of high-mass sources, single-dish observations have also been carried out (Harju et al. 2006; Swift 2009; Pillai et al. 2012; Giannetti et al. 2019; Sabatini et al. 2020). However, massive sources are rather distant (≳1 kpc; Zinnecker and Yorke 2007, Giannetti et al. 2014 and König et al. 2017), and the angular resolutions available are not sufficient to resolve the inner regions of the cores. Sabatini et al. (2020) detected o-H2D+ toward 16 high-mass star-forming regions in different evolutionary stages, observed with the Atacama Pathfinder EXperiment 12-m telescope (APEX; Güsten et al. 2006), and found a correlation between the evolutionary state of the clumps and the abundance of o-H2D+, which is higher in younger sources (see also Giannetti et al. 2019). More recently, Redaelli et al. (2021) presented the first interferometric detections of o-H2D+ in high-mass star-forming clumps, observed with the Atacama Large Millimeter/submillimeter Array (ALMA; Wootten and Thompson 2009). Their observed line emission of o-H2D+ is rather narrow and subsonic, suggesting that the gas in their sources is cold (\(<10\) K) and dense (\(>10^{6}\text{ cm}^{-3}\)). This indicates a lack of protostellar heating and is more representative of a prestellar and sub-virial phase.

In this work, we developed a framework for synthetic observations and applied it to a set of numerical simulations of massive star formation, using the observability of the o-H2D+ transition \(J_{\mathrm{{K_{a}, K_{c}}}} = 1_{10}\)-\(1_{11}\) as our case study. Our aim is to show the difficulties involved in the observations of such distant and obscured sources. For this, we performed a set of radiative transfer (RT) simulations using these simulated cores as synthetic sources (see Sect. 2.2) and then post-processed them by adding instrument-related effects. In this last step, we distinguished between single-dish and interferometric observations, in order to understand the key differences that may arise when observing the same source with both techniques. Finally, we derived column densities of o-H2D+ from the resulting intensity distributions and compared our results both to values reported in the literature and to the physical column densities measured from the model.

In Sect. 2 we describe the steps followed for each synthetic observation. In Sect. 3 we show the radial distributions of the synthetic maps and the derived column densities, and then we present a comparison of the column densities derived from a single-dish telescope and an interferometer. In Sect. 4 we discuss the limiting cases of our results in terms of source distance and observing time. Finally, we provide a summary and conclusions in Sect. 5.

2 Methods

2.1 Workflow

To perform a successful synthetic observation, three main ingredients are needed: a synthetic source, a ray-tracer and a synthetic detector. The synthetic source, or model, can be either a simple density and temperature distribution or a full three-dimensional magneto-hydrodynamic (MHD) simulation (Haworth et al. 2018) (Fig. 1a). The ray-tracing is done by an RT code, which calculates the propagation of light within the source, accounting for emission, absorption and scattering, and creates a resulting flux distribution attenuated at a given distance (Fig. 1b). The synthetic detector is the numerical array in which this ideal intensity distribution is stored. To these intensity maps, instrument-related effects must be applied, such as convolution with a telescope beam (in the case of single-dish telescopes) or a reconstructed beam (in the case of interferometers), the addition of noise and, in the case of interferometers, image reconstruction from complex visibilities (Fig. 1c,d). Thus, in order to make synthetic observations as realistic as possible, the intensity maps from the RT simulations must be post-processed based on the specific properties of the telescope of interest. The fluxes resulting from each synthetic observation are finally converted into column densities, to then be compared against the values from the numerical simulations (Fig. 1e). All the functions and routines used in steps (c), (d) and (e) from Fig. 1 are provided in an online repository (Footnote 1).

Fig. 1

Workflow of each synthetic observation. The input MHD simulations are ray-traced using the POLARIS RT code and then post-processed to distinguish between interferometric and single-dish simulations, using the CASA software and a Python module written for this project, respectively. Each resulting intensity distribution is then converted into column densities for further comparison with real data

2.2 Synthetic source

The synthetic source used in this work is an isolated magnetized massive prestellar core taken from the set of 3D ideal-MHD simulations performed by Körtgen et al. (2017). They employed the Flash code (v4.2.2) (Fryxell et al. 2000), coupled with the Krome package (Grassi et al. 2014) to follow the deuteration chemistry of light hydrides.

2.2.1 Initial conditions

The core is initialized as an isolated Bonnor-Ebert (BE) sphere (Bonnor 1956 and Ebert 1955), supersonically turbulent and assumed to collapse isothermally at \(T_{\mathrm{gas}}=15\) K. The initial conditions of the core are listed in Table 1. The mass of the core is 60 \(M_{\odot}\). The central gas number densities evolve from \(n_{\mathrm{gas}}\sim 3\times 10^{6}\text{ cm}^{-3}\) (\(N_{\mathrm{gas}}=10^{23.5}\text{ cm}^{-2}\); at 0.1 free-fall times) to \(\sim 1\times 10^{8}\text{ cm}^{-3}\) (\(N_{\mathrm{gas}}=10^{25}\text{ cm}^{-2}\); at 0.5 free-fall times) as seen in Fig. 2. Radially averaged density profiles along the midplane of the core are also shown in Fig. 3; they show that the core is indeed a compact source, with most of the dense material lying within the inner ∼6000 au.

Fig. 2

Time evolution of the gas column density in the simulated collapsing core, shown at the two timesteps analyzed in this work. White contours represent the column density of o-H2D+ and the dashed circle represents the field of view of the APEX and ALMA telescopes (16.8″ at 372.42 GHz) at a distance of 1 kpc. The scalebar of 1″ illustrates the synthesized beam of the ALMA synthetic observations of the compact source

Fig. 3

Radially averaged midplane density profiles for the two time snapshots shown in Fig. 2

Table 1 Initial parameters of the core selected from Körtgen et al. (2017), labeled Lmu10M2. The name refers to a source within their sample that has a low surface density, a mass-to-flux ratio of 10 and a Mach number of 2

2.2.2 Deuteration chemistry

The simulations from Körtgen et al. (2017) included a detailed non-equilibrium chemical network, with 21 chemical species in the gas phase and also dust grains, along with their ionized states.

The network solved the deuteration reactions based on Walmsley et al. (2004), who assumed full depletion of elements heavier than He. This accounts for the CO freeze-out onto icy dust mantles, which has been observed to be effective at the densities studied here (\(\gtrsim 10^{4}\text{ cm}^{-3}\); Caselli et al. 1999, Tafalla et al. 2002, Giannetti et al. 2014). This assumption significantly reduces the size of the network and the computational time of the simulation, while still providing realistic results (e.g. Sabatini et al. 2019 and Bovino et al. 2019). Electron attachment, recombination of positive ions and grain surface reactions, such as the formation of H2 and HD, were also included, with the exception of D2, which is mainly formed in the gas phase. The full chemical network was solved and evolved using the Krome package (Grassi et al. 2014).

2.3 Radiative transfer calculations

We performed ray-tracing RT simulations with the code Polaris (v4.06; Footnote 2) (Reissl et al. 2016 and Brauer et al. 2017) based on the temperature and density distributions taken from two timesteps of the collapsing core simulations: one slightly after the initial state (0.1 \(t_{ \mathrm{ff}}\), i.e. 16 kyr, with \(t_{\mathrm{ff}}\) the free-fall time) and one at 0.5 \(t_{ \mathrm{ff}}\) (74 kyr, see Fig. 2). We simulated the emission of the o-H2D+ transition \(J_{\mathrm{{K_{a}, K_{c}}}} = 1_{10}\)-\(1_{11}\) at ∼372.42 GHz. For simplicity, dust was not included in our setup, which avoids the uncertainty introduced by continuum subtraction from the spectra, and the o-H2D+ emission was assumed to be optically thin. This assumption was assessed by computing the line optical depth from the synthetic cubes, using equation (10). We obtained values no larger than 0.1 even for the densest stages of the core, such as that shown in Fig. 8. However, the exclusion of a dust component in the setup can only be done safely when the opacity from the dust is known to be low enough so that line emission is not significantly extincted. In order to quantify the effect, we assume a dust composition of silicates and graphites, previously used in the modelling of dust in pre- and proto-stellar cores (Draine and Lee 1984; Ossenkopf and Henning 1994). This composition, at the frequency of 372.42 GHz, yields an opacity of \(\kappa _{\mathrm{abs}}<4.56\text{ cm}^{2}\text{ g}^{-1}\). This means that only at the peak dust surface density of the evolved stage of the core (\(\sim 1\text{ g cm}^{-2}\); for a dust-to-gas mass ratio of 0.01) would the optical depth reach \(\tau _{\nu}\sim 4.56\), being much lower around it. This upper limit in the opacity is based on the dust coagulation model of Ossenkopf and Henning (1994) after 100 kyr of evolution, which is longer than the latest time in our collapsing core simulations. The opacity they provide for the initial state of the coagulation model yields \(\tau \sim 0.9\), which means that the dust in both of the collapse stages we considered should be marginally optically thin over most of the core scales, with the only exception of the peak surface density at 74 kyr. Based on this, we neglect the effects of dust opacity in this work but acknowledge that it should be taken into account in more detailed investigations.

We studied distances from 1 up to 10 kpc. For each time and distance we generated spectral cubes, covering 10 km s−1 centered on the rest frequency of our line of interest, i.e. ∼372.42 GHz. We fixed the spectral resolution of each datacube to 0.03 km s−1 (∼38 kHz) for all simulations, by splitting the spectral range into 333 channel maps, meant to represent the highest resolution offered by the FLASH+ dual-frequency MPIfR principal investigator (PI) receiver (Klein et al. 2014) at a frequency of ∼372.42 GHz, mounted at the APEX telescope. Based on the gas column density distribution of the core (see Fig. 2), the lowest number densities are around \(1.62\times 10^{5}\text{ cm}^{-3}\), for a core radius of 0.2 pc. Such number densities are above the critical density of the o-H2D+ line emission at 372.42 GHz (\(1.3 \times 10^{5}\text{ cm}^{-3}\); Hugo et al. 2009) and therefore Local Thermodynamic Equilibrium (LTE) can be assumed when computing the level populations. The assumption of LTE means that the excitation temperature \(T_{\mathrm{ex}}\) and the kinetic gas temperature \(T_{\mathrm{gas}}\) are equal (15 K), and then

$$ T_{\mathrm{gas}}=T_{\mathrm{ex}}=\frac{h\nu _{ij}}{k_{\mathrm{B}}}\left [\ln{\left ( \frac{g_{i}\,n_{j}}{g_{j}\,n_{i}}\right )}\right ]^{-1}, $$
(6)

with \(g_{i}\) and \(g_{j}\) the statistical weights, and \(n_{i}\) and \(n_{j}\) the level populations following the Boltzmann distribution. The molecular data of o-H2D+ used in our simulations were obtained from the Cologne Database for Molecular Spectroscopy (Footnote 3; CDMS; Müller et al. 2005, Endres et al. 2016).
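The spectral setup and the LTE relation of equation (6) can be checked with a few lines of Python. The following is a minimal sketch, not the code used for the simulations: the function names are hypothetical, the upper level is identified with index \(i\), and the lower-level statistical weight is assumed to equal \(g_{u}=9\), a value that the text only quotes for the upper level.

```python
import math

C_KMS = 299_792.458          # speed of light [km/s]
H_OVER_K = 4.799243073e-11   # Planck constant over Boltzmann constant [K s]
NU_REST_HZ = 372.42e9        # rest frequency quoted in the text [Hz]


def channel_width_hz(dv_kms, nu_hz=NU_REST_HZ):
    """Doppler relation: a velocity width dv maps to dnu = nu * dv / c."""
    return nu_hz * dv_kms / C_KMS


def n_channels(span_kms, dv_kms):
    """Number of whole channels that fit in the velocity span."""
    return int(span_kms / dv_kms + 1e-9)  # epsilon guards against round-off


def lte_population_ratio(t_ex, g_up, g_low, nu_hz=NU_REST_HZ):
    """Boltzmann ratio n_up / n_low at excitation temperature t_ex [K]."""
    return (g_up / g_low) * math.exp(-H_OVER_K * nu_hz / t_ex)


def excitation_temperature(ratio_up_low, g_up, g_low, nu_hz=NU_REST_HZ):
    """Equation (6), with i the upper level and j the lower level."""
    return H_OVER_K * nu_hz / math.log(g_up / (g_low * ratio_up_low))


# 10 km/s sampled at 0.03 km/s, as in the text.
channels = n_channels(10.0, 0.03)
width_hz = channel_width_hz(0.03)

# Round trip: populations at 15 K, then back to a temperature.
# g_low = 9 is an assumption made for this illustration.
ratio = lte_population_ratio(15.0, g_up=9, g_low=9)
t_back = excitation_temperature(ratio, g_up=9, g_low=9)
```

The round trip recovers the input temperature by construction, which is a quick consistency check on the sign and index conventions of equation (6).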

2.4 Single-dish observations

The single-dish synthetic observations were performed as a post-processing of the ideal intensity maps from the radiative transfer calculation. This was done by adding instrument-related effects to the datacubes, such as the convolution of the cubes with a telescope beam and the addition of thermal noise (Fig. 1c).

2.4.1 PSF convolution

We convolved the datacubes with a 2D Gaussian kernel, resembling the Point-Spread-Function (PSF) of a parabolic single-dish telescope. We used a Full-Width-at-Half-Maximum (FWHM) beam size of 16.8 arcseconds, which is the effective resolution achieved by the APEX 12-m dish at a frequency of ∼372.42 GHz. After the convolution, we converted the maps from Jy pixel−1 into Jy beam−1 by rescaling the flux with the ratio of the area of a Gaussian beam to the area of a square pixel, as

$$ \frac{F}{\text{Jy}\,\text{beam}^{-1}} = \frac{\pi}{4\ln 2} \frac{\theta _{\mathrm{maj}}\,\theta _{\mathrm{min}}}{{\mathrm{arcsec}}^{2}}\left ( \frac{\text{pix}\,\text{size}}{\mathrm{arcsec}}\right )^{-2} \frac{F}{\text{Jy}\,\text{pixel}^{-1}}, $$
(7)

where \(\theta _{\mathrm{maj}}\) and \(\theta _{\mathrm{min}}\) are the FWHM of the major and minor axis of the beam in arcseconds, equal for a circular beam. At a distance of 1 kpc, the APEX beam corresponds to 16800 au. This means that these observations were spatially unresolved during almost the entire evolution.

Convolution of the images was performed using the convolve_fft function (Footnote 4) from the Astropy Python package (Astropy Collaboration et al. 2013; Price-Whelan et al. 2018). This function performs a Fourier-space convolution of a 2D data matrix with a Gaussian kernel of standard deviation \(\sigma ={\mathrm{FWHM}}\,/\,\sqrt{8\ln 2\,}\), where FWHM is the angular resolution in number of pixels.
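The two numerical ingredients of this step, the kernel width and the rescaling factor of equation (7), can be sketched as follows. This is an illustration only: the helper names and the pixel size of 0.5 arcseconds are hypothetical, and the convolution itself (done with Astropy in this work) is not reproduced here.

```python
import math


def fwhm_to_sigma(fwhm):
    """Standard deviation of a Gaussian with the given FWHM (same units)."""
    return fwhm / math.sqrt(8.0 * math.log(2.0))


def beam_area_in_pixels(theta_maj, theta_min, pix_size):
    """Gaussian beam area divided by the pixel area (factor in equation 7).

    All three arguments are in arcseconds.
    """
    return (math.pi / (4.0 * math.log(2.0))
            * theta_maj * theta_min / pix_size**2)


def jy_per_pixel_to_jy_per_beam(flux_jy_pixel, theta_maj, theta_min, pix_size):
    """Rescale a flux from Jy/pixel to Jy/beam following equation (7)."""
    return flux_jy_pixel * beam_area_in_pixels(theta_maj, theta_min, pix_size)


APEX_FWHM = 16.8   # arcsec, beam size quoted in the text
PIX_SIZE = 0.5     # arcsec, hypothetical pixel size for this example

# Kernel width in pixels, as needed by a pixel-based Gaussian kernel.
sigma_pix = fwhm_to_sigma(APEX_FWHM / PIX_SIZE)

# Number of pixels per beam, i.e. the Jy/pixel -> Jy/beam multiplier.
factor = beam_area_in_pixels(APEX_FWHM, APEX_FWHM, PIX_SIZE)
```

Because the factor scales with the inverse square of the pixel size, it has to be recomputed whenever the map is regridded, for example when the source distance is changed.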

2.4.2 Addition of noise

We converted the images from Jy beam−1 to brightness temperature (\(T_{ \mathrm{b}}\)) and added Gaussian noise to them, with a standard deviation (\(T_{ \mathrm{rms}}\)) derived from the equation

$$ T_{\mathrm{rms}} = \frac{T_{\mathrm{sys}}}{\sqrt{\Delta \nu \,t_{\mathrm{int}}}}, $$
(8)

where \(\Delta \nu \) is the spectral resolution, \(t_{\mathrm{int}}\) is the integration time and \(T_{\mathrm{sys}}\) is the system temperature (Kraus 1966). To obtain realistic values of \(T_{\mathrm{sys}}\) for a given observing setup, we used the APEX Observing time calculator (Footnote 5). The noise level added to the simulations was \(T_{\mathrm{rms}}\sim 8.6\) mK, corresponding to \(T_{\mathrm{sys}}\sim 543\) K for an on-source integration time of 6 hrs, a source elevation of 45 deg, a precipitable-water-vapor (pwv) level of 0.5 mm (typical for APEX observations at 372.42 GHz, e.g. Miettinen 2020, Sabatini et al. 2020) and a spectral resolution of \(\Delta v=0.3\) km s−1, achieved by binning the spectra by a factor of 10. We emphasize that the integration times presented here and in Sect. 4.1 correspond to on-source times only and do not consider calibration observations.
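As a rough cross-check, equation (8) can be evaluated directly for the setup quoted above. The sketch below uses hypothetical function names and converts the velocity resolution to a frequency width with the Doppler relation. The bare equation gives about 6 mK, somewhat below the ∼8.6 mK returned by the observing time calculator; the ratio is close to \(\sqrt{2}\), which would be consistent with an observing-mode factor included by the calculator, although this is an assumption that the text does not confirm.

```python
import math

C_KMS = 299_792.458     # speed of light [km/s]
NU_REST_HZ = 372.42e9   # rest frequency quoted in the text [Hz]


def velocity_to_frequency_width(dv_kms, nu_hz=NU_REST_HZ):
    """Frequency width [Hz] of a channel with velocity width dv [km/s]."""
    return nu_hz * dv_kms / C_KMS


def radiometer_rms(t_sys_k, dnu_hz, t_int_s):
    """Equation (8): T_rms = T_sys / sqrt(dnu * t_int), in kelvin."""
    return t_sys_k / math.sqrt(dnu_hz * t_int_s)


T_SYS = 543.0             # K, from the text
T_INT = 6.0 * 3600.0      # 6 hrs on source, in seconds

# Native (0.03 km/s) and binned (0.3 km/s) channels.
rms_native = radiometer_rms(T_SYS, velocity_to_frequency_width(0.03), T_INT)
rms_binned = radiometer_rms(T_SYS, velocity_to_frequency_width(0.3), T_INT)

# Binning by a factor of 10 lowers the noise by sqrt(10).
gain = rms_native / rms_binned
```

The same function shows the cost of observing fainter sources: halving the noise at fixed spectral resolution requires four times the integration time.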

2.5 Interferometric observations

We performed interferometric synthetic observations for ALMA, to ease the comparison with single-dish observations since it represents an array of APEX-like telescopes. We simulated the interferometric response of our intensity maps by using the Common Astronomy Software Applications package (CASA; Footnote 6) tasks simobserve (Footnote 7) and tclean (Footnote 8). The simobserve task is used to create a visibility measurement set from an image model, and tclean to reconstruct an image from the visibility table using the clean algorithm. An example script for the ALMA simulations is provided in the online repository mentioned in Sect. 2.1. The input image models were the output from the radiative transfer calculations, in units of Jy pixel−1.

We used the most compact array configuration C43-1 in band 7 (Cycle 7), obtaining a synthesized beam of about \(1''.01\times 0''.92\) at 372.42 GHz, in order to achieve the highest sensitivity among all configurations and to recover most of the extended emission.

Part of the analysis included simulations using the Atacama Compact Array (ACA; Iguchi et al. 2009). ACA is used to patch the shorter baselines in the uv-plane, not covered by the ALMA main array, retrieving information from large scale emission.

This configuration also aids in reducing the spatial filtering of the extended emission. Based on the same argument, we cleaned with natural weighting for all images, aiming for improved sensitivity over angular resolution, since the source is always resolved in our interferometric observations. The number of pixels and image size were adjusted accordingly for every source distance that we studied, in order to ensure a Nyquist sampling of the image (i.e., 5 pixels across the beam minor axis). We cleaned interactively, to define a proper mask for the source at each timestep and distance, checking the residual map at each major cycle. The cleaning threshold was also adjusted interactively and a multiscale deconvolver was used, for scales of 0 (point-source), 1, 5 and 10 times the beam size. The gridder was set to standard. The spectral channel width was 0.03 km s−1, although we also binned the spectrum up to 0.3 and 1 km s−1 when analyzing the low-S/N observations at large source distances. Thermal noise was added by the simobserve task, based on a sky temperature of 114.3 K, an atmospheric zenith opacity (\(\tau _{0}\)) of 0.579, a precipitable-water-vapor level of 0.658 mm and a ground temperature of 269 K. This led to noise levels of ∼1.4 mJy beam−1 (9 mK) for a 10 hrs integration time.
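The image-plane parameters described above follow from the beam size and the angular extent of the map. The sketch below collects them in a dictionary whose keys mirror keyword names of the CASA tclean task (cell, imsize, weighting, deconvolver, scales, gridder, specmode, interactive). Nothing is executed in CASA, the helper name is hypothetical, and the conversion of the multiscale sizes to pixels via the beam minor axis is an assumption made for this example; the actual script is the one in the online repository.

```python
import math


def imaging_setup(beam_min_arcsec, map_size_1kpc_arcsec, distance_kpc,
                  pixels_per_beam=5, scale_factors=(0, 1, 5, 10)):
    """Derive cell size, image size and multiscale sizes for one distance.

    The map is assumed to shrink linearly with distance (small angles).
    """
    cell = beam_min_arcsec / pixels_per_beam            # arcsec per pixel
    map_size = map_size_1kpc_arcsec / distance_kpc      # arcsec at distance
    imsize = math.ceil(map_size / cell)                 # pixels per side
    # Multiscale sizes are given in pixels: multiples of the beam width.
    scales = [int(round(f * pixels_per_beam)) for f in scale_factors]
    return {
        "cell": f"{cell:.3f}arcsec",
        "imsize": imsize,
        "weighting": "natural",
        "deconvolver": "multiscale",
        "scales": scales,
        "gridder": "standard",
        "specmode": "cube",
        "interactive": True,
    }


# Beam minor axis (0.92") and map size (41" at 1 kpc) as quoted in the text.
setup_1kpc = imaging_setup(0.92, 41.0, distance_kpc=1.0)
setup_5kpc = imaging_setup(0.92, 41.0, distance_kpc=5.0)
```

Keeping the cell size tied to the beam rather than to the map means that only the number of pixels changes with distance, which is one simple way to preserve the same sampling of the beam at every distance.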

2.6 Derivation of column densities

We derived source column densities from the single-dish and interferometric synthetic fluxes, following the procedure described in Mangum and Shirley (2015), and similarly used in Vastel et al. (2006), Caselli et al. (2003), Busquet et al. (2010) and Parise et al. (2011) for this specific line, where the total column density (\(N\)) of H2D+ is given by

$$ \frac{N(\mathrm{H_{2}D^{+}})}{\mathrm{cm^{-2}}} = \frac{8\pi \nu ^{3}}{c^{3}}\frac{Q(T_{\mathrm{ex}})}{g_{u}A_{ul}} \frac{e^{E_{u}/T_{\mathrm{ex}}}}{e^{h\nu /k T_{\mathrm{ex}}}-1}\int \tau dv, $$
(9)

with \(u\) and \(l\) referring to the upper and lower levels of the transition, respectively, \(k\) and \(h\) are the Boltzmann and Planck constants, respectively and \(\nu =372.42\) GHz is the frequency of the transition \(J_{\mathrm{{K_{a}, K_{c}}}} = 1_{10}\)-\(1_{11}\). The statistical weight of the upper level and the Einstein coefficient for the transition are \(g_{u}=9\) and \(A_{ul}=1.08\cdot 10^{-4}\) s−1, respectively. \(E_{u}=17.87\) K is the energy of the upper level, \(Q(T_{\mathrm{ex}})=11.70\) is the partition function of the molecule (for the ortho isomer only) at \(T_{\mathrm{ex}}=15\) K and \(\tau \) is the optical depth of the line, obtained from

$$ \tau _{\nu} =-\ln{\left (1- \frac{T_{\mathrm{b}}}{J_{\nu}(T_{\mathrm{ex}})-J_{\nu}(T_{\mathrm{CMB}})}\right )}, $$
(10)

with \(J_{\nu}\) the radiation temperature.
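Equations (9) and (10) can be combined into a short routine that turns a brightness-temperature spectrum into a column density. The following is a minimal sketch under stated assumptions: the function names are hypothetical, \(T_{\mathrm{CMB}}=2.725\) K is adopted here since the text does not quote a value, and the input spectrum is an invented Gaussian line (peak of 0.1 K, FWHM of 0.5 km s−1) used purely for illustration. The molecular constants are those listed above.

```python
import math

# Physical constants in CGS units.
H = 6.62607015e-27    # Planck constant [erg s]
K_B = 1.380649e-16    # Boltzmann constant [erg/K]
C = 2.99792458e10     # speed of light [cm/s]
T_CMB = 2.725         # K, assumed background temperature

# Line parameters quoted in the text.
NU = 372.42e9         # Hz
G_U = 9.0
A_UL = 1.08e-4        # s^-1
E_U = 17.87           # K
Q_15K = 11.70         # partition function at T_ex = 15 K
T_EX = 15.0           # K


def j_nu(temp, nu=NU):
    """Radiation temperature J_nu(T) = (h nu / k) / (exp(h nu / k T) - 1)."""
    x = H * nu / K_B
    return x / (math.exp(x / temp) - 1.0)


def optical_depth(t_b, t_ex=T_EX):
    """Equation (10): line optical depth from brightness temperature [K]."""
    return -math.log(1.0 - t_b / (j_nu(t_ex) - j_nu(T_CMB)))


def column_density(tau_channels, dv_kms, t_ex=T_EX, q=Q_15K):
    """Equation (9): column density [cm^-2] from per-channel optical depths."""
    integral = sum(tau_channels) * dv_kms * 1.0e5          # int tau dv [cm/s]
    prefactor = 8.0 * math.pi * NU**3 / C**3 * q / (G_U * A_UL)
    level_term = math.exp(E_U / t_ex) / (math.exp(H * NU / (K_B * t_ex)) - 1.0)
    return prefactor * level_term * integral


# Hypothetical Gaussian line sampled on 333 channels of 0.03 km/s.
DV = 0.03
velocities = [-5.0 + (i + 0.5) * DV for i in range(333)]
spectrum = [0.1 * math.exp(-4.0 * math.log(2.0) * v**2 / 0.5**2)
            for v in velocities]

taus = [optical_depth(t_b) for t_b in spectrum]
n_col = column_density(taus, DV)
```

For this invented spectrum the peak optical depth stays well below 0.1 and the resulting column density is a few times \(10^{11}\text{ cm}^{-2}\). Note that changing \(T_{\mathrm{ex}}\) consistently also requires the partition function at the new temperature, which is not given in the text.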

To compare our column densities derived from synthetic observations (\(N_{ \mathrm{APEX}}\) and \(N_{\mathrm{ALMA}}\)) to the model values (\(N_{\mathrm{Model}}\)), we integrated the o-H2D+ number density within the simulation box along the line-of-sight (LOS) and then derived \(N_{\mathrm{Model}}\) by averaging the column densities over the area subtended by the beam. For the comparison of observed quantities with the model values, in the following sections we derive column densities over several angular scales, depending on the telescope. When computing column density ratios, the same angular scales were used for both the model and synthetic maps.
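The derivation of \(N_{\mathrm{Model}}\) amounts to a sum along one axis of the simulation box followed by an average over an aperture. The sketch below illustrates this on a toy grid using plain Python lists. The function names are hypothetical and a top-hat circular aperture is assumed for the beam average; the weighting actually used may differ.

```python
import math


def column_density_map(density_cube, dz_cm):
    """Integrate number density [cm^-3] along z; cube is indexed [z][y][x]."""
    nz = len(density_cube)
    ny = len(density_cube[0])
    nx = len(density_cube[0][0])
    return [[sum(density_cube[k][j][i] for k in range(nz)) * dz_cm
             for i in range(nx)]
            for j in range(ny)]


def aperture_average(image, radius_pix):
    """Mean of the pixels within radius_pix of the map center (top-hat)."""
    ny, nx = len(image), len(image[0])
    cy, cx = (ny - 1) / 2.0, (nx - 1) / 2.0
    values = [image[j][i]
              for j in range(ny) for i in range(nx)
              if math.hypot(j - cy, i - cx) <= radius_pix]
    return sum(values) / len(values)


# Toy 5x5x5 box: a compact source occupying only the central sightline.
N_SIDE = 5
cube = [[[0.0] * N_SIDE for _ in range(N_SIDE)] for _ in range(N_SIDE)]
for k in range(N_SIDE):
    cube[k][2][2] = 1.0e3          # cm^-3, illustrative value

n_map = column_density_map(cube, dz_cm=1.0e15)   # illustrative cell length
n_small = aperture_average(n_map, radius_pix=0)  # central pixel only
n_large = aperture_average(n_map, radius_pix=2)  # 13 pixels
```

For a compact source the average drops as the aperture grows, which is the same dilution that affects the column densities retrieved with larger beams.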

3 Results

3.1 Radial distributions

We first studied the fluxes, intensities and column densities as derived from the output of the RT simulation, then from the ALMA simulation and also ALMA combined with ACA. The results presented in this section do not include thermal noise. This is to focus on the spatial and angular effects of the observations and to avoid any biased interpretation due to low signal-to-noise ratios (S/N). We performed observations using the two stages of the core shown in Fig. 2, and we varied the source distance between 1, 5 and 10 kpc, to understand how the spatial filtering affects the retrieved column densities. We show the resulting radially integrated flux distribution and radially averaged profiles of the intensity and column density, in Figs. 4, 5 and 6, respectively. The two columns in each figure represent the stages at 16 and 74 kyr, from left to right. Each panel shows the results from the RT simulations in solid-green, from ALMA simulations in dashed-orange and from ALMA combined with ACA in dotted-blue. In Fig. 6, we included the column densities obtained from the MHD simulation for reference, shown in semi-dashed-red.

Fig. 4

Cumulative distribution of the intensity at a distance of 1 kpc (top row), 5 kpc (middle row) and 10 kpc (bottom row), integrated within concentric circles of radii from 0.5 to 22 arcseconds, centered on the center of the map. The distributions are shown for two evolutionary stages, at 16 kyr and 74 kyr, the same as those depicted in Figs. 2 and 3. The fluxes are shown for the output of the radiative transfer simulations (solid-green), for ALMA (dashed-orange) and for ALMA combined with ACA (dotted-blue)

Fig. 5

Radially averaged intensity distribution over the same scales presented in Fig. 4

Fig. 6

Radially averaged column density distributions over the same scales presented in Fig. 4. The model (semidashed-red) curve represents the physical column density integrated along the simulation box

The flux distribution in Fig. 4 is an increasing function of radius because of the constantly increasing area over which it is integrated, i.e., it is a cumulative function of intensity. The increase is rather smooth for the early stage, at 16 kyr, since the gas surface density distribution is also a smooth function of radius, only slightly evolved from the initial Bonnor-Ebert sphere. The interferometric observations show some relevant differences compared to the RT case, which grow toward larger radii. A possible explanation for this is that the largest angular scale recoverable by ALMA in the most compact configuration C43-1 (Band 7) is 8.25 arcseconds, while the size of the map is 41 arcseconds (at 1 kpc). Therefore, part of the extended emission is not detected by ALMA (main array only). This effect is slightly mitigated with the inclusion of ACA in the observing array (with a largest angular scale of 19.3 arcseconds), because it helps to cover the short baselines and allows emission from larger angular scales to be recovered, increasing the total fluxes and bringing them closer to those from the RT calculation. This effect is also appreciable when comparing the upper, middle and bottom panels, representing increasing distances of 1, 5 and 10 kpc, respectively. The difference between fluxes from ALMA (both main array and combined) as compared to the RT case is largest at the closest distance and becomes almost negligible at 10 kpc. The reason is the smaller angular scale that the source covers at larger distances, which allows all the extended emission to lie within the ALMA field of view. All these effects produced by the different spatial coverage are similarly reflected in the intensity distributions shown in Fig. 5. The main difference in its radial profile as compared to that of the radial flux is that the intensity decreases toward larger radii because it is computed as a beam-averaged quantity. Since the large-scale emission of o-H2D+ is very faint, the detectable emission is rather compact (as seen in Fig. 2), and averaging over larger beams therefore decreases the retrieved intensity.
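The distance dependence of the spatial filtering can be illustrated by comparing the angular size of the map with the largest angular scales quoted above. The sketch below is a deliberately crude criterion with hypothetical names: it treats the 41 arcsecond map size as the source extent and the largest angular scale as a sharp cutoff, whereas in practice the filtering sets in gradually and depends on the brightness distribution.

```python
# Largest angular scales quoted in the text [arcsec].
LARGEST_ANGULAR_SCALE = {
    "ALMA C43-1": 8.25,
    "ALMA + ACA": 19.3,
}
MAP_SIZE_AT_1KPC = 41.0   # arcsec, map size at 1 kpc from the text


def angular_size(size_at_1kpc_arcsec, distance_kpc):
    """Small-angle scaling: angular size is inversely proportional to distance."""
    return size_at_1kpc_arcsec / distance_kpc


def fits_within_las(distance_kpc):
    """Whether the whole map fits within each array's largest angular scale."""
    size = angular_size(MAP_SIZE_AT_1KPC, distance_kpc)
    return {name: size <= las for name, las in LARGEST_ANGULAR_SCALE.items()}


# The three distances analyzed in this section.
coverage = {d: fits_within_las(d) for d in (1.0, 5.0, 10.0)}
```

With these numbers the map exceeds both scales at 1 kpc, only marginally fits at 5 kpc and fits comfortably at 10 kpc, in line with the trend described for Fig. 4.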

The column densities were derived from the intensity using equation (9) and therefore their radial distribution is similar to that of the intensity (see Fig. 6). For the column density analysis, we also included the values derived from the MHD simulations, labeled \(N_{\mathrm{Model}}\), which represent the true model values, in order to quantify the decrease in the measurement of column densities when derived from the ALMA simulations. The model values were obtained by integrating the o-H2D+ number density in the simulation box along the LOS and averaging it over concentric circles. The radii of such circles are the corresponding physical scales of apertures from 0.5 to ∼22 arcseconds, at distances of 1, 5 and 10 kpc. Our main purpose here is to check how well the column density can be recovered from the synthetic observation. The differences between ALMA and the model are as high as in the case for the intensity and follow the same trend at both timesteps. The origin of these differences is the same as for the flux and intensity distributions.

3.2 Comparison of inferred o-H2D+ column densities from ALMA and APEX

We also compared our observed column densities to values reported in the literature, in order to assess the reliability of our model and framework for synthetic observations. Common values of the o-H2D+ column density toward low- and high-mass sources lie between \(10^{11}\) and \(10^{13}\text{ cm}^{-2}\) (see Table 2). Most of these values come from single-dish observations, with the exception of the recent detections carried out with ALMA by Redaelli et al. (2021). Here we compare them to our results obtained from the single-dish and interferometric simulations, aiming to understand the key differences that arise when using either of the two approaches. Our results are presented in Fig. 7. We show the results of the model and of the synthetic observations with ALMA and APEX, and for this analysis we present several timesteps over the core evolution, from 16 kyr to 74 kyr. In the upper panel we also overplot the values from the literature, shown in Table 2, for reference. When comparing our results to real observations, we can see that, after ∼30 kyr of evolution, the column densities derived from ALMA lie well within the range of values reported in the literature for single-dish observations (Kong et al. 2016; Giannetti et al. 2019; Sabatini et al. 2020), but are lower than the results from Redaelli et al. (2021) by a factor of 10. Similarly, the values derived for APEX are 6 times lower (on average) than the real APEX observations (Giannetti et al. 2019; Sabatini et al. 2020). The difference in the derived column densities is very likely related to the assumed value of the excitation temperature \(T_{\mathrm{ex}}\), which we have fixed to 15 K, under the assumption of full thermal coupling with the gas (i.e., LTE conditions). The survey of low-mass sources conducted by Caselli et al. (2008) used a \(T_{\mathrm{ex}}\) of ∼7 K (average between all their starless cores), which was similarly used by Kong et al. (2016) and slightly increased to \(T_{\mathrm{ex}}=10\) K by Redaelli et al. (2021) when deriving column densities with ALMA. However, our ALMA synthetic column densities better match the results derived by Giannetti et al. (2019) and Sabatini et al. (2020), who assumed excitation temperatures in the range of \(10 - 20\) K for high-mass sources observed with APEX. To estimate the difference, we have additionally examined our column densities using lower excitation temperatures, and we find that a value of \(T_{\mathrm{ex}}=7\) K makes our ALMA values reproduce the observations from Redaelli et al. (2021) and our APEX values reproduce those from Giannetti et al. (2019) and Sabatini et al. (2020). As pointed out by Caselli et al. (2008), this comparison shows the effect that the choice of excitation temperature has on the derivation of column densities. Within the overall uncertainties, we thus consider our theoretical expectations to be consistent with the observed fluxes.

Fig. 7

Model versus synthetic column densities as a function of the collapse time, using a source distance of 1 kpc. The upper panel shows the column densities from the model, ALMA and APEX, within their maximum resolution element, i.e., 235 au, 1000 au (1″) and 16800 au (16.8″), respectively. The middle panel shows the column densities from the model and ALMA, both derived over the same extent of 6000 au (6″), which covers the densest and detectable parts of the core. The lower panel shows the model and APEX results, both averaged over the scale covered by the APEX beam (16800 au)

Table 2 List of column densities of o-H2D+ (N[o-H2D+]) from cores and clumps reported in the literature and used in Fig. 7

The upper panel of Fig. 7 compares the column densities derived from the model, ALMA and APEX, over their respective highest resolution element, i.e., 235 au, 1000 au (1″) and 16800 au (16.8″), to show how relevant the limitation in spatial resolution is and how averaging over larger areas leads to lower retrieved column densities. As expected, the APEX values are lower than those of ALMA and the model because the emission of the core is very compact, and therefore the average intensity is lower for a larger beam. This means that for single-dish telescopes and unresolved sources one must account for the beam filling factor of the source in order to avoid underestimating the real column densities.
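The size of the beam dilution effect can be illustrated with the commonly used filling factor for a Gaussian source observed with a Gaussian beam, \(\eta = \theta_{\mathrm{s}}^{2}/(\theta_{\mathrm{s}}^{2}+\theta_{\mathrm{b}}^{2})\). The sketch below is only indicative: the Gaussian shapes and the 2″ source size are assumptions made for this example and are not measured from our core.

```python
def beam_filling_factor(source_fwhm, beam_fwhm):
    """Filling factor of a Gaussian source in a Gaussian beam.

    Both sizes are FWHM values in the same angular units. The Gaussian
    approximation is an assumption of this sketch.
    """
    return source_fwhm**2 / (source_fwhm**2 + beam_fwhm**2)

# Hypothetical 2" source seen with a 1" (ALMA-like) and a 16.8" (APEX-like) beam.
eta_small_beam = beam_filling_factor(2.0, 1.0)
eta_large_beam = beam_filling_factor(2.0, 16.8)

# A beam-averaged column density can be corrected as N_true ~ N_obs / eta,
# provided the source size is known independently.
```

For a fixed source size, the larger beam yields the smaller filling factor, which is the sense of the trend seen between the ALMA and APEX points in the upper panel.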

In the second panel we show the model and ALMA values averaged over the same spatial extent, equal to 6000 au. This is about the largest angular scale recovered by ALMA in its most compact configuration and avoids the need to perform multiple pointings to cover the whole image. The scale is also sufficient to cover the densest regions that lead to detectable emission, as can be seen in the synthetic ALMA map shown in Fig. 8, for the core at the age of 74 kyr after a 10 hr observation. We compared over the same scale to focus on the effect of the sensitivity of the interferometer on the recovered physical values, without the column densities being affected by different spatial averages. The third panel shows a similar comparison but for the case of the APEX resolution (16.8″), which covers 16800 au. Based on the comparison in the middle and bottom panels, we see that when averaging over the same angular scale for both the model and APEX, the values retrieved by APEX are closer to the model than in the case of ALMA. This is due to the continuous spatial sampling that is achieved by single-dish telescopes, and points to the effect that the spatial filtering of interferometers has on lowering the estimated column densities.
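The effect of spatial filtering can be reproduced qualitatively with a toy experiment in which the lowest spatial frequencies of an image are removed, mimicking missing short baselines. This is a deliberately simplified sketch and not a CASA simulation: the Gaussian source, the grid size and the frequency cut `k_min` are all hypothetical choices, and no realistic uv-coverage, primary beam or noise is included.

```python
import numpy as np

def gaussian_image(n, sigma_pix):
    """Square n x n image of a centred 2D Gaussian with unit peak."""
    y, x = np.indices((n, n))
    c = n // 2
    return np.exp(-((x - c) ** 2 + (y - c) ** 2) / (2.0 * sigma_pix**2))

def filter_short_spacings(image, k_min):
    """Zero all Fourier modes with radial index below k_min (square images).

    This is a crude stand-in for the missing short baselines of an
    interferometer, including the zero spacing (total flux).
    """
    n = image.shape[0]
    k = np.fft.fftfreq(n) * n            # integer spatial-frequency index
    kx, ky = np.meshgrid(k, k)
    vis = np.fft.fft2(image)
    vis[np.hypot(kx, ky) < k_min] = 0.0
    return np.fft.ifft2(vis).real

def central_flux(image, radius_pix):
    """Sum of the image within a circular aperture around the centre."""
    n = image.shape[0]
    y, x = np.indices(image.shape)
    c = n // 2
    return image[np.hypot(x - c, y - c) <= radius_pix].sum()

model = gaussian_image(256, sigma_pix=20.0)
filtered = filter_short_spacings(model, k_min=2.0)
recovered_fraction = central_flux(filtered, 20.0) / central_flux(model, 20.0)
```

In this toy case the flux recovered in the central aperture is a fraction of the input flux, and the total flux of the filtered image is zero because the zero-spacing mode has been removed, which is why a single-dish measurement does not suffer from the same loss.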

Fig. 8

Integrated emission (moment 0) of o-H2D\(^{+}\,(1_{10}-1_{11})\) from an ALMA synthetic observation, produced by the core at the age of 74 kyr (cf. Fig. 2), at a distance of 1 kpc and after observing for 10 hr

We additionally repeated the same analysis including the combined effect of ALMA and ACA, in order to see how significant the increase in the recovered flux is when complementing the uv-coverage with a sampling of the short baselines. From the results shown in Fig. 9, we see that the inclusion of ACA increases the retrieved column densities at all stages of the collapse, similar to the effect seen in the radial profiles shown in Fig. 6, but this time taking into consideration the sensitivity of the observations by the inclusion of thermal noise. For the first two stages of evolution, we present our results as upper limits on the column density because the low abundance of o-H2D+ at the early timesteps makes the emission highly affected by the sensitivity of the observations.

Fig. 9

Column density evolution over time, analogous to Fig. 7. All points are averaged within a beam of 6 arcseconds, for the model (semi-dashed red), for ALMA (orange) and for ALMA+ACA (blue). The downward arrows represent upper limits for the column density, since the detections at early timesteps are highly affected by the sensitivity of the observations

4 Discussion

4.1 Minimum integration time for a \(5\sigma \) detection

We also studied the observability of the o-H2D+ \(J_{\mathrm{{K_{a}, K_{c}}}} = 1_{10}\)-\(1_{11}\) line as a function of the distance to the source and the integration time required, for interferometric observations only. Here we are interested in how the minimum integration time required for a 5\(\sigma \) detection, and the ratio of the measured column density to the model value, change when the source is placed farther away. For this analysis we performed ALMA (main-array only) simulations for the core at 74 kyr and at different distances, from 1 up to 7 kpc. The S/N value considered here was obtained by measuring the peak flux of the emission within the cube, in the regions and channels containing o-H2D+ emission (based on the ideal datacube from the radiative transfer), and dividing it by the cube rms. The rms of each cube was obtained by measuring the rms in 4 rectangular regions far from the source, for a few channels far from the line emission, and then averaging all of these values. The results are shown in the upper panel of Fig. 10. In order to keep the S/N constant at 5, we had to integrate longer for larger distances because the received flux decreases with distance. At a distance of 1 kpc, an S/N of 5 can be reached after 1 hr of time on source. At a distance of 7 kpc, an S/N of 5 requires at least 7 hr of integration. All these simulations were binned along the spectral axis by a factor of 10, from \(\Delta v=0.03\) km s\(^{-1}\) to 0.3 km s\(^{-1}\), to decrease the noise level, following equation (8).
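The S/N measurement and the spectral binning described above can be sketched as follows. This is an illustrative outline rather than the scripts used for this work: the function names, the box layout and the (channel, y, x) cube ordering are assumptions made here, and the binning is implemented as a plain average of adjacent channels, for which uncorrelated noise decreases as the square root of the binning factor.

```python
import numpy as np

def bin_channels(cube, factor):
    """Average groups of `factor` adjacent channels along axis 0.

    Trailing channels that do not fill a complete group are dropped.
    """
    nchan = (cube.shape[0] // factor) * factor
    trimmed = cube[:nchan]
    return trimmed.reshape(-1, factor, *cube.shape[1:]).mean(axis=1)

def offsource_rms(cube, boxes, channels):
    """Mean rms over rectangular off-source boxes in line-free channels.

    `boxes` is a list of (y0, y1, x0, x1) pixel ranges (hypothetical layout).
    """
    values = []
    for (y0, y1, x0, x1) in boxes:
        for ch in channels:
            patch = cube[ch, y0:y1, x0:x1]
            values.append(np.sqrt(np.mean(patch**2)))
    return float(np.mean(values))

def peak_snr(cube, line_mask, boxes, channels):
    """Peak value inside `line_mask` divided by the off-source rms."""
    return float(cube[line_mask].max() / offsource_rms(cube, boxes, channels))

# Toy cube: pure Gaussian noise plus one bright voxel (not simulation data).
rng = np.random.default_rng(0)
toy_cube = rng.normal(0.0, 1.0, size=(300, 64, 64))
toy_cube[150, 32, 32] += 50.0
corner_boxes = [(0, 16, 0, 16), (0, 16, 48, 64), (48, 64, 0, 16), (48, 64, 48, 64)]
line_free = range(0, 5)
mask = np.zeros(toy_cube.shape, dtype=bool)
mask[140:160, 24:40, 24:40] = True

rms_native = offsource_rms(toy_cube, corner_boxes, line_free)
rms_binned = offsource_rms(bin_channels(toy_cube, 10), corner_boxes, line_free)
snr_native = peak_snr(toy_cube, mask, corner_boxes, line_free)
```

In the toy cube the rms drops by roughly \(\sqrt{10}\) after binning by a factor of 10, which is the gain exploited when degrading the velocity resolution from 0.03 to 0.3 km s\(^{-1}\).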

Fig. 10

Top panel: Minimum integration time required to have a 5\(\sigma \) detection as a function of the distance to the source. The color-shaded region indicates observing times for which no detection was obtained. Bottom panel: Ratio of synthetic column densities as derived from ALMA over the values from the model, as a function of the source distance. The red line represents the ratios for the 5\(\sigma \) observations and the blue line shows the results for high S/N observations with a 10 hr integration time

Simulated observations lying in the red-shaded region of Fig. 10 led to an S/N value lower than 5 and were classified as non-detections.

4.2 Quantifying the column densities information loss

We are additionally interested in quantifying how much information is lost due to the post-processing presented in this paper. We estimate the amount of information loss by means of the ratio of the column densities from the ALMA synthetic observations (with 1″ resolution) to the column densities from the model (at intrinsic resolution), labeled \(N_{\mathrm{ALMA}} / N_{\mathrm{Model}}\). We initially compute this ratio for the low S/N (5\(\sigma \)) observations presented in Sect. 4.1. The results are shown in the lower panel of Fig. 10, represented by the red data points. Our results indicate a roughly constant underestimation of the column densities when the S/N is ∼5, for distances up to 5 kpc. The ratio is slightly higher at 6 and 7 kpc, likely because the effect of spatial filtering decreases, since all the emission of the source at 6 and 7 kpc lies well within the ALMA beam. For these low-S/N observations, the model column densities are underestimated by around 85% at 1 kpc and a bit more than 80% at 7 kpc. These column density ratios are rather low, since the source emission that is detected above the noise limit is a tiny fraction of the actual emission.

A natural question to ask is how much of the real column densities we lose if we aim for an actual high S/N detection. To understand this, we also performed ALMA observations at the same distances but increasing the integration time to 10 hr each. The S/N values of these observations were 14, 10.2, 9.6 and 9.0 for the 1, 2, 3 and 4 kpc distances, respectively, calculated in exactly the same way as in Sect. 4.1. In Fig. 8 we present the resulting map of the integrated emission (moment 0) of o-H2D+ for the highest S/N observation, which is produced by the core at the age of 74 kyr (from the density distribution shown in Fig. 2), at a distance of 1 kpc and observed for 10 hr. The results for the column density ratios at 1, 2, 3 and 4 kpc are shown by the blue line in the lower panel of Fig. 10. For this part of the analysis, we only report the results up to 4 kpc because those at 5, 6 and 7 kpc did not achieve an S/N much larger than 5, even when observed over several days or when binning down spectrally. For the high-S/N observation at 1 kpc, the model column densities are underestimated by a factor of ∼0.46. For the 2 kpc case, the underestimation increases a bit to ∼0.59 and then decreases to ∼0.56 and ∼0.30 at 3 and 4 kpc. The lack of a direct correlation between S/N and \(N_{\mathrm{ALMA}} / N_{\mathrm{Model}}\) is due to the interplay between the spatial filtering and the low recovery of source emission at lower S/N. In our results, the former is more relevant at 1 and 2 kpc while the latter is more relevant at 3 and 4 kpc.

5 Summary & conclusions

In this work we present a framework to produce synthetic observations using the radiative transfer code Polaris and the CASA observing tools. Our implementation focuses on the observability of the molecular line emission of ortho-H2D\(^{+} J_{\mathrm{{K_{a}, K_{c}}}} = 1_{10}\)-\(1_{11}\) at 372.42 GHz generated from the simulation of an isolated, rather compact high-mass prestellar core. We studied the differences that may arise when deriving column densities from a physical model and when deriving them from the flux of a single-dish or interferometric observation. We present a proof of concept of our framework and show that it reproduces the observed fluxes. Our main conclusions are as follows:

  • Column density estimates directly depend on the size of the area over which they are averaged. Hence, when observing sources with compact emission, observations performed at lower angular resolutions will tend to have large losses due to beam dilution effects. Similar results have been reported by Bovino et al. (2019). This effect is particularly relevant when comparing estimates between interferometric and single-dish observations. However, when comparing both approaches to the real values averaged over the respective spatial extent, single-dish telescope estimates are much closer to the model than those from interferometers. This is the result of a lack of sensitivity to the more extended emission due to missing short baselines, which decreases the overall source sampling as compared to single-dish telescopes, where the sampling is more extended and continuous within the beam.

  • The combined observations of ALMA and ACA improve the column density estimation with respect to the real values as compared to the ALMA main array only, because the ACA helps to reduce the effect of spatial filtering.

  • The correlation between distance and the fraction of column density obtained by ALMA (\(N_{\mathrm{ALMA}}/N_{\mathrm{Model}}\)) is not linear. Instead, it is determined by an interplay between the S/N and the spatial filtering.

We emphasize that the best estimates of the physical column densities of a source will be obtained through the combination of single-dish and interferometric observations. Observers commonly take this effect into account by including Total Power measurements in their setup. However, we have not included them in our analysis, in order to strictly compare single-dish versus interferometric observations. Our analysis confirms the effects that contribute to the information loss in the estimated column densities. These are beam dilution in the case of single-dish telescopes and spatial filtering in the case of interferometers.