1 Introduction

Except perhaps for the simplest objects, interferometric data are hard to interpret directly, and image reconstruction is a powerful tool for the scientific analysis of the observations. However, optical long-baseline interferometric instruments have, since their beginnings, faced a series of challenges in producing science-grade images. These challenges are related to:

  • Sparse sampling of the measurements due to a limited number of spatial frequencies;

  • Non-convex inverse problem to solve (i.e., there may be several local minima);

  • Phase disturbance by the atmosphere in front of the telescope or interferometer, which smears out the phase information of the object.

Since direct inversion of the data is neither possible nor recommended, image reconstruction algorithms are based on regularized minimization processes that iteratively solve an ill-posed (inverse) problem: the model of the data given the image is compared to the actual data in order to determine how to better fit the data while respecting some imposed constraints. These constraints are needed to avoid over-fitting the data (and thus erroneously explaining the noise as well as the significant signal) and to compensate for the sparsity of the data, which leads to an under-determined problem. The basic concept of this regularized minimization can be expressed in the following form:

$$ \boldsymbol{x}_{\text{ML}}=\underset{\boldsymbol{x}}{\text{argmin}}\left[\frac{1}{2}\chi^{2}(\boldsymbol{x})+\sum\limits_{i=1}^{n}\mu_{i} R(\boldsymbol{x})_{i}\right], $$
(1)

where χ2(x) measures the agreement between the data and a given imaging model (the likelihood term), R(x)i are the (prior) regularization functions used, and μi are the weighting factors that set the trade-off between the likelihood and the priors. There are several algorithms to perform the minimization that finds the most probable image xML. Two of the most important are Gradient Descent and Markov Chain Monte Carlo (MCMC). Gradient Descent is an optimization algorithm that takes steps proportional to the negative gradient of the objective function with respect to the image pixels in order to find the best solution. This method is fast; however, it may become trapped in local minima, which can lead to a misleading solution in the image reconstruction process. As a consequence, several variants of gradient descent have been developed to improve on the basic one, such as the Active Set Algorithm for Conjugate Gradient Descent (ASA-CG) [9].
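
The regularized descent just described can be sketched in a few lines. This is a toy example only: the linear "measurement" operator `A`, the quadratic smoothness prior standing in for R(x), and all the values are assumptions made here for illustration, not the implementation of ASA-CG or of any package cited in this paper.

```python
import numpy as np

# Toy version of Eq. (1): minimize 0.5*chi2(x) + mu*R(x) by plain
# gradient descent, with an assumed linear measurement operator and a
# squared finite-difference smoothness prior (both illustrative).
rng = np.random.default_rng(0)
n = 32
A = rng.normal(size=(n, n)) / np.sqrt(n)     # toy measurement operator
x_true = np.zeros(n)
x_true[10:20] = 1.0                          # toy 1-D brightness profile
y = A @ x_true + 0.05 * rng.normal(size=n)   # noisy "data"
mu = 0.01                                    # likelihood/prior trade-off

def objective(x):
    r = A @ x - y                            # residuals (unit errors assumed)
    return 0.5 * (r @ r) + mu * np.sum(np.diff(x) ** 2)

def gradient(x):
    g = A.T @ (A @ x - y)                    # gradient of the 0.5*chi2 term
    d = np.diff(x)
    g_reg = np.zeros_like(x)
    g_reg[:-1] -= 2.0 * d                    # gradient of the smoothness prior
    g_reg[1:] += 2.0 * d
    return g + mu * g_reg

x = np.zeros(n)
for _ in range(2000):
    x = x - 0.1 * gradient(x)                # fixed small step size

print(objective(np.zeros(n)) > objective(x))  # True: descent reduced the objective
```

The fixed step size works here only because the toy operator is well behaved; practical algorithms use line searches or conjugate directions instead.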

On the other hand, MCMC is based on a random process that moves flux elements across a pixel grid until the distribution of pixel fluxes fits the data and reaches an equilibrium distribution. The great advantage of this method is that it can find the global minimum, at the cost of being significantly slower than Gradient Descent.
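
A minimal sketch of such an element-based Metropolis sampler follows. It is written in the spirit of this family of methods, not as the code of any specific package: the Fourier "visibilities", the local-move proposal, and all the values are assumptions for illustration.

```python
import numpy as np

# The image is a set of unit flux elements on a 1-D pixel grid; a
# Metropolis step moves one element at a time and is accepted with
# probability min(1, exp(-(chi2_new - chi2_old)/2)).
rng = np.random.default_rng(1)
npix, nelem, sigma = 16, 40, 0.01

truth = np.zeros(npix)
truth[4:8] = 0.25                                  # toy brightness distribution
data = np.fft.rfft(truth)                          # toy complex "visibilities"
data += sigma * (rng.normal(size=data.size) + 1j * rng.normal(size=data.size))

def image(pos):
    return np.bincount(pos, minlength=npix) / nelem

def chi2(img):
    return np.sum(np.abs(np.fft.rfft(img) - data) ** 2) / sigma**2

pos = rng.integers(0, npix, size=nelem)            # random starting elements
c = chi2(image(pos))
c0 = c                                             # chi2 of the random start
for _ in range(20000):
    trial = pos.copy()
    i = rng.integers(nelem)
    trial[i] = (trial[i] + rng.integers(-2, 3)) % npix   # local move proposal
    c_trial = chi2(image(trial))
    if c_trial <= c or rng.random() < np.exp(0.5 * (c - c_trial)):
        pos, c = trial, c_trial                    # accept the move
```

After the loop, `image(pos)` approximates the truth and `c` is far below the starting `c0`; production samplers add temperature schedules (simulated annealing, parallel tempering) on top of this basic move.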

Using these methods, several image reconstruction software packages have been developed to analyse optical/infrared interferometric data, and most of them are available to the community. However, due to the nature of the minimization process, these packages cannot be treated as “black boxes” that produce science-grade images without any user interaction. The choice of regularization functions must be made with great care [32], always based on prior knowledge of the source’s brightness distribution. This choice has a significant influence on the resulting image (as we explore later), with some regularization functions emphasizing image smoothness while others emphasize sharp edges or compactness of the brightness distribution. A good image reconstruction is therefore not just a matter of having a good algorithm but also of using it correctly.

While most software packages can be understood from the principles summarized above, they still differ in a number of points. One of the most obvious differences is the set of regularizations implemented. Some offer many different possibilities, while others are more restricted. Another less obvious, but no less important, difference is the likelihood term. This is mainly because optical interferometers do not provide complex visibilities but other observables such as the powerspectrum, the closure phase and the chromatic differential phase. These quantities are less sensitive to the random delays induced by atmospheric turbulence, whereas the complex visibilities are completely destroyed by them. At this time, there is no consensus on the statistics of these observables, and the image reconstruction algorithms implement different expressions for the likelihood term.

During the last twenty years several algorithms have reached sufficient maturity to produce science-grade images, among them: the Building-Block Method [12], BSMEM [5], MiRA [44], SQUEEZE [3], WISARD [19], and IRBis [13]. The ability of these packages to recover images from optical/infrared long-baseline interferometric data has been demonstrated several times in the community through the “Interferometric Imaging Beauty Contest” (see e.g., [36]). Additionally, these packages have been used repeatedly to recover images from Fizeau interferometric data in Sparse Aperture Masking observations. For example, Tuthill et al. [45] used MACIM to recover the first images of Pinwheel nebulae, while [35] used BSMEM to recover images of several bow shocks around WR stars in the central parsec of our Galaxy.

However, the standard use of several of these packages was limited to recovering monochromatic images. With the consolidation of the spectro-interferometric capabilities of the second-generation instruments GRAVITY [8] and MATISSE [17] at the Very Large Telescope Interferometer (VLTI), some of the aforementioned packages have been updated, and new methods are being developed to obtain chromatic images in a systematic way. For example, at the VIII edition of the “Interferometric Imaging Beauty Contest”, a benchmark of different algorithms was carried out with simulated chromatic data for the first time [34].

2 The necessity of chromatic imaging

2.1 Scientific cases for optical/infrared interferometry

Optical interferometry allows one to probe the innermost regions of several astrophysical environments. Here we describe three examples, but there are of course many more. Around young stars, where accretion onto the star itself is almost over and the gas and dust disc has settled, the process of planet formation occurs, with planet accretion, planet migration, gap formation, etc. These regions measure only a few astronomical units across (∼ 10 au); for a system at 100 pc this corresponds to a hundred milli-arcseconds, i.e. barely the angular resolution of an 8 m-class telescope. Therefore, only the high angular resolution of long-baseline interferometry gives us the possibility to explore the inner regions (at scales of ∼ 1–10 milli-arcseconds) of such environments. However, such environments are quite complex, and properly determining their structure is only possible if image reconstruction techniques are robust.

Another example is the study of the line-emitting regions of massive stars. Here, it is not the photosphere of the star that is imaged to probe its nature and evolution, but the distribution and kinematics of the surrounding gas. These regions are also located only a few au from the star, but the sources are typically at distances larger than 1 kpc, preventing them from being resolved even with adaptive optics on 8 m-class telescopes. While continuum optical interferometric images have been reconstructed for a few of these objects, studies of the line-emitting region have remained difficult. Chromatic imaging, however, was recently attempted for Phi Per and Eta Car [24, 33], where it was shown that the inclusion of differential phase information can vastly improve the image quality.

The third example is asymptotic giant branch (AGB) stars. These objects represent one of the final stages in the evolution of low- and intermediate-mass stars. Since they return most of their processed material to the interstellar medium, they are important for understanding the chemical evolution of galaxies [11]. Understanding their atmospheric structure is therefore crucial to bridge the physics of the enriched stellar interior and the evolution-driven winds. Chromatic interferometric imaging allows us to resolve the structure of the photosphere together with the inner dust-shell layers, enabling us to investigate convection and the formation of dust at the most compact scales, among other physical phenomena (see e.g., [26, 28]).

2.2 Solving the phase problem

The problem of image reconstruction was already faced a few decades ago in radio astronomy. Several methods like CLEAN and the Maximum Entropy Method (MEM) were developed to produce images out of visibilities and phases. However, it was not until a new set of techniques was developed (e.g., hybrid mapping and self-calibration) that imaging with radio telescopes took off, allowing the recovery of spectral images (see e.g., [29, 31]). In optical/infrared long-baseline interferometry, it was considered until very recently that the phase information was completely lost due to the turbulent atmosphere. Therefore, the powerspectrum (often called the squared visibility) and the argument of the bispectrum (the so-called closure phase) were adopted as the main observables carrying the geometrical information of the source’s brightness distribution. However, the “new” available wavelength-dependent phase (the so-called “differential phase”), accessible through the spectro-interferometric capabilities of instruments like AMBER [30], GRAVITY [8], and MATISSE [17], provides much more information than the geometric information of the closure phase alone.
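
The immunity of these observables to per-telescope atmospheric phases can be checked numerically. The following is an illustrative toy on a single baseline triangle, with made-up object visibilities; it is not instrument code.

```python
import numpy as np

# Per-telescope random phases cancel in the powerspectrum |V|^2 and in
# the bispectrum argument over a closed triangle (baselines 1-2, 2-3, 3-1).
rng = np.random.default_rng(5)
# intrinsic object visibilities on the three baselines of a triangle
v12 = 0.8 * np.exp(0.3j)
v23 = 0.5 * np.exp(-0.7j)
v31 = 0.3 * np.exp(1.1j)
# random atmospheric phase above each telescope
p1, p2, p3 = rng.uniform(-np.pi, np.pi, size=3)
# measured visibilities: each baseline picks up a phase difference
m12 = v12 * np.exp(1j * (p1 - p2))
m23 = v23 * np.exp(1j * (p2 - p3))
m31 = v31 * np.exp(1j * (p3 - p1))

power = np.abs(m12) ** 2                 # powerspectrum: atmosphere-free
bispectrum = m12 * m23 * m31             # triple product: pistons cancel
closure = np.angle(bispectrum)           # the closure phase
intrinsic = np.angle(v12 * v23 * v31)
print(np.isclose(power, 0.64), np.isclose(closure, intrinsic))  # True True
```

Because p1, p2, p3 each appear once with a plus sign and once with a minus sign in the triple product, the closure phase equals the intrinsic bispectrum argument regardless of the atmosphere.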

The differential phase is an interferometric observable that provides information about the flux centroid position of the source’s brightness distribution across the observed bandpass. This observable was not considered in optical/infrared interferometric imaging until [21, 37], and [20]. Indeed, the differential phase provides a corrupted phase measurement which, in theory, can be incorporated into a self-calibration algorithm, in a way very similar to what is done in radio interferometry [29].
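
The essence of the differential phase can be illustrated with a toy calculation. For simplicity only, the atmospheric disturbance is modelled here as a single phase offset common to all spectral channels; real atmospheric pistons are wavelength dependent, so this is an assumption made purely for the sketch.

```python
import numpy as np

# The differential phase compares the phase in each spectral channel with
# the phase in a reference (continuum) channel, so a phase offset common
# to all channels cancels out.
rng = np.random.default_rng(2)
nchan = 8
object_phase = np.linspace(0.0, 0.5, nchan)     # intrinsic phase vs channel (rad)
vis = np.exp(1j * object_phase)                 # unit-amplitude object visibilities

atmo = rng.uniform(-np.pi, np.pi)               # unknown atmospheric phase offset
measured = vis * np.exp(1j * atmo)              # what the interferometer records

ref = measured[0]                               # continuum channel as reference
diff_phase = np.angle(measured * np.conj(ref))  # the unknown offset cancels

print(np.allclose(diff_phase, object_phase - object_phase[0]))  # True
```

The absolute phase of the reference channel remains unknown, which is exactly the missing piece that self-calibration schemes try to recover iteratively.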

As early as 2003, J. Monnier anticipated “revived activity [on self-calibration] as more interferometers with imaging capability begin to produce data”. And indeed, the conceptual bases for using differential phases in image reconstruction were laid in [21]. The Schmitt paper [37] was a first attempt to use differential phases in image reconstruction. They considered that the phase in the continuum was equal to zero, making it possible to use the differential phase (then equal to the phase) in the Hα emission line of the β Lyr system. In this way, they were able to image the shock region between the two stars at different orbital phases.

Millour et al. [20] went one step further by using an iterative process, similar to radio-interferometry self-calibration, to reconstruct the phase of the object from the closure phases and differential phases. This way, they could recover the image of a rotating gas+dust disc around a supergiant star, whose image is asymmetric even in the continuum (non-zero phase). This method was subsequently used in a few papers to reconstruct images of supergiant stars (see e.g., [25, 27]) and Eta Carinae (e.g., [33, 48]). More recently, Hone et al. [14] published the first chromatic reconstruction of the Brγ structure in a YSO with IRBis. Mourard et al. [24] presented an extended method to tackle the image reconstruction challenges posed by the lack of closure phases and of a proper calibration of the spectrally-dispersed visibilities.

2.3 Going one step further

An enormous effort has been made in the community to develop reconstruction methods one step further. For example, a project named POLCA funded by the French ANR (Agence Nationale de la Recherche) made the following advances on the subject:

  • Statistical analysis of AMBER data showed that interferometric data do not follow the usual assumptions of uncorrelated, Gaussian (Normal) noise. Correlations over time are significant and can be partly disentangled by considering differential visibilities in addition to absolute visibilities. A Student’s t distribution should be used for the noise on the visibilities instead of a Normal one, as expected since the visibility is computed as the ratio of two random variables, as in [41]. This could lead to future improvements in the descent algorithms used in model-fitting or image reconstruction [38].

  • A new core image reconstruction algorithm taking into account the wavelength dependence of the data has been developed and is distributed as the “PAINTER” software [38]. It works on chromatic datasets (i.e. both the bispectrum and the differential phase) and produces chromatic image cubes using the ADMM descent algorithm and spatio-spectral regularizations. An improved version, presented in [39], uses wavelets for the spatial regularization and the Discrete Cosine Transform (DCT) for the spectral regularization.

  • A future chromatic image reconstruction algorithm is being developed by F. Soulez & E. Thiébaut under the name “MIRA-3D” [40]. This algorithm uses a joint spectro-spatial regularisation and will thus be applicable to objects with components exhibiting different spectra.

  • The power of combining chromatic model-fitting and “grey” image reconstruction was demonstrated by [15] for low-spectral-resolution datasets. The software “SPARCO” was developed to demonstrate this. This technique has great potential and allows one to perform “numerical coronagraphy” on the interferometric data by removing the (main) contribution of the central star.

On the other hand, Fabien Baron, at Georgia State University, has developed a chromatic imaging algorithm called SQUEEZE [1, 2, 3]. This software uses MCMC as the engine for the minimization. The code is able to use Simulated Annealing and Parallel Tempering with Metropolis-Hastings moves. SQUEEZE accepts (i) visibilities (amplitudes and phases), (ii) powerspectra, (iii) bispectra, or a combination of them as input observables. One of the key characteristics of this software is the large number of regularization functions offered, such as the L0-norm, the L2-norm, Entropy, Total Variation, and Wavelets.

All these software efforts have had a tremendous impact on the development of chromatic imaging, and they could lead to a new generation of image reconstruction software. For example, combining the new core algorithms with the chromatic model-fitting features of “SPARCO” and self-calibration techniques could produce a leading-edge image reconstruction suite suitable for reducing both archival AMBER data and the new data being taken with the new spectro-imaging interferometric instruments GRAVITY and MATISSE at the VLTI, or MYSTIC [42, 43] at CHARA.

2.4 Toward integrated tools for image reconstruction

With the developments mentioned above, one can produce images using not only the closure phases but also the wavelength-differential phases. These developments have made it possible to recover all the spectral channels of a chromatic image simultaneously, while additional techniques like self-calibration have allowed imaging even with two telescopes. Nowadays, the number of image reconstruction packages available to the community is large; however, each of these packages uses different interfaces and programming languages, making comparisons relatively difficult in practice.

In this respect, a working group at JRA4 aims at easing the use of image reconstruction software through a web-based interface called OImaging, in a similar way to what has been done for model-fitting with the LITpro software at JMMC. Similarly, the input/output formats (OIFITS v1 vs OIFITS v2) and the visualization tools should be standardized to allow the comparison of different software packages and of different runs in the huge image-reconstruction parameter space.

Also of interest is a series of complementary recipes to produce “science-grade” images, like Low Frequency Filling, Monte Carlo analysis [22], and other recipes (which have no specific name) like using the central-symmetry properties of the object [16] or restricting the field of view of the reconstruction to the photosphere of the central star only [23]. The integration of such recipes with the available image reconstruction software, either by hard-coding them in the software or via external tools, would be valuable for the coming generation of imaging software.

3 An example application: the future MATISSE image reconstruction capacity, illustrated

Here, we have highlighted the importance of image reconstruction for scientifically assessing the information encoded in optical interferometry data. Characterizing the imaging capabilities of the different interferometric arrays is therefore necessary, especially in the frame of the upcoming infrared beam-combiners. MATISSE (Multi-Aperture mid-Infrared SpectroScopic Experiment; [17, 18]) is one of the second-generation interferometric instruments of the VLTI. The instrument is designed to combine up to four telescopes, either the Unit Telescopes (UTs) or the Auxiliary Telescopes (ATs), to measure visibilities, closure phases and differential phases in the mid-infrared. It represents a major advance over its predecessor MIDI, mainly because it will allow us to recover, for the first time, closure phases in three different bands: L and M (2.8–5.2 μm), and N (8–13 μm).

3.1 Data simulation

One of the major science cases of MATISSE is the characterization of proto-planetary discs around young stellar objects. In this respect, image reconstruction represents a unique tool to obtain constraints on (i) the physics of the inner discs, (ii) the signatures of interaction between forming planets and the dusty disc, (iii) the detection of companions in the disc-like structure, (iv) the signatures tracing different dust mineralogy (e.g., the silicate feature at 10 μm) and (v) the gas disc kinematics, among others. Therefore, we selected a prototypical Herbig Ae star as our image reconstruction target. HD179218 is a B9 star with an effective temperature Teff = 9600 K [7] and a stellar mass M = 2.9 M⊙ [46], located at a distance of 250 pc [47]. Thanks to a large collection of MIDI and Spectral Energy Distribution (SED) data, Menu et al. (in prep.) inferred a disc with an eccentric inner gap. Nevertheless, no image of this structure has been obtained so far. A set of radiative transfer images obtained from the Menu et al. model was used, together with the expected MATISSE uv coverage, to simulate the observables. We simulated three different interferometric arrays with the ATs, assuming MATISSE observations in low-resolution mode (R ∼ 35). The three simulated configurations sample some of the small (A0-B2-C1-D0), medium (D0-G2-J3-K0) and large (A0-G1-J2-J3) telescope configurations available at the VLTI. We considered that the target was successfully observed at six different position angles over three different nights, each night with a different AT configuration. The applied noise model was generated using the MATISSE simulator developed at the Observatoire de la Côte d’Azur by Alexis Matter. This simulator takes the pre-computed theoretical interferometric observables and adds two main types of noise: (i) the fundamental noise and (ii) the calibration noise.
Once the different error contributions are calculated, the theoretical observables are randomly changed following a Gaussian distribution within the computed error-bars. Figure 1 displays an example of the squared visibilities and closure phases recovered for the simulated AT configurations.
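
The final perturbation step of this noise model can be sketched as follows. The observable values and error bars below are placeholders chosen for illustration, not output of the MATISSE simulator.

```python
import numpy as np

# Given theoretical observables and their computed error bars, each
# observable is perturbed by a Gaussian draw with that standard deviation.
rng = np.random.default_rng(3)
v2_theory = np.array([0.81, 0.42, 0.15, 0.05])   # theoretical squared visibilities
v2_err = np.array([0.02, 0.02, 0.01, 0.01])      # computed error bars (1 sigma)
cp_theory = np.array([3.0, -12.0, 45.0])         # theoretical closure phases (deg)
cp_err = np.array([1.0, 2.0, 5.0])

v2_sim = v2_theory + v2_err * rng.normal(size=v2_theory.size)
cp_sim = cp_theory + cp_err * rng.normal(size=cp_theory.size)
# wrap the simulated closure phases back into [-180, 180) degrees
cp_sim = (cp_sim + 180.0) % 360.0 - 180.0
```

In a full simulation the error bars themselves would come from the fundamental and calibration noise terms mentioned above, rather than being fixed by hand.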

Fig. 1

The figure displays the simulated squared visibilities (left) and closure phases (right) used to demonstrate the MATISSE imaging capabilities. The plotted observables correspond to three different observing nights, each with a different AT quadruplet. The colors represent the effective wavelength, from 8 μm (blue) to 13 μm (red). Notice that the target is completely resolved and shows a clear deviation from point-symmetry, since the closure phases vary between -180° and 180°

3.2 Recovering the images

To recover the images from our simulated data set, we used SQUEEZE. One of the key parameters in interferometric image reconstruction is the selection of the regularization functions. An important advantage of SQUEEZE is that it can use multiple regularization functions simultaneously and, because it relies on MCMC sampling, they need not have well-defined gradients. The second important parameter for the reconstruction is the hyperparameter μ, which controls the trade-off between the likelihood and the prior information on the brightness distribution encoded in the regularizers. Selecting the appropriate value of μ is therefore crucial for the image reconstruction process. One of the most common methods to select the optimal μ is the L-curve [4, 6, 10, 15]: the image solution is computed for several values of μ, characterizing the response of the prior term versus the χ2. The optimal value normally lies at the first inflection point of the L-curve. Figure 2 displays an example of this method, where R(x)i (in this case Total Variation) is compared with χ2(x) for different values of μ. The plot clearly shows three regions: the so-called “under-regularized” region, in which different values of the regularizer produce similar χ2; the inflection point, which corresponds to the optimal value of μ; and the so-called “over-regularized” region, where the χ2 increases with every increment of μ, because the regularizer dominates the convergence criterion of the algorithm. To illustrate these effects, we display the role of μ in the reconstruction of our object in Fig. 3.
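
The L-curve sweep can be sketched on a toy quadratic problem, where each regularized solution has a closed form. An L2-norm penalty is used here as a stand-in for Total Variation (which has no closed-form solution), and the operator and values are illustrative assumptions.

```python
import numpy as np

# For each hyperparameter mu, solve min 0.5*||Ax - y||^2 + mu*||x||^2
# exactly via its normal equations and record the pair (chi2, R(x)).
rng = np.random.default_rng(4)
m, n = 40, 20
A = rng.normal(size=(m, n))                       # toy measurement operator
y = A @ rng.normal(size=n) + 0.5 * rng.normal(size=m)

mus = np.logspace(-3, 3, 13)                      # hyperparameter sweep
curve = []
for mu in mus:
    # stationarity: A^T (A x - y) + 2*mu*x = 0  =>  (A^T A + 2*mu*I) x = A^T y
    x = np.linalg.solve(A.T @ A + 2.0 * mu * np.eye(n), A.T @ y)
    curve.append((np.sum((A @ x - y) ** 2),       # chi2 term
                  np.sum(x ** 2)))                # regularization term R(x)

chi2s, regs = np.array(curve).T
# Plotting regs against chi2s traces the L-curve: chi2 grows and R(x)
# shrinks monotonically as mu increases; the elbow marks the optimal mu.
```

With a non-quadratic regularizer such as Total Variation, each point of the sweep instead requires a full iterative reconstruction, which is why the L-curve is the expensive part of tuning μ in practice.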

Fig. 2

L-curve obtained with different values of μ, using Total Variation as the regularizer. The vertical axis displays the value of R(x)i and the horizontal axis the χ2(x). The optimal value of μ lies at the elbow of the curve with the smallest χ2, in this case μ = 3 × 10³. The optimal value of μ, together with the under-regularized and over-regularized regions, are labelled on the frame

Fig. 3

The image shows the impact of μ on the image reconstruction process. Eight reconstructed images of the source are shown, using the hyperparameter values shown in Fig. 2. Notice that for the first three images (in the under-regularized region) the general morphology of the object is recovered, but the reconstructed maps are clearly noisier than the one with the optimal value of μ = 3000. However, when the value of the hyperparameter is too large, the structure of the source is completely erased and the code is not able to converge to reproduce the expected brightness distribution of the source

Figure 4 shows the N-band monochromatic reconstructions obtained with four different SQUEEZE regularization functions, to evaluate their impact. The target was assumed to behave as a grey body across all the channels inside the bandpass. Although this assumption does not hold for most astrophysical objects, it represents a good starting point to calibrate the different parameters used for the reconstruction. In this case, we used both the squared visibilities and the closure phases to recover the brightness distribution of our object. From this reconstruction, we note that the different regularizers were able to recover the general morphology of the target. Nevertheless, there were still some significant differences among them. For example, while Total Variation was able to recover a smooth rim morphology, the other regularizers underestimated the brightness distribution of the rim for position angles between 90° and 180° (East of North). Additionally, all the reconstructed images show several well-localized bright spots along the rim, instead of a uniform distribution as in the model. This highlights the importance of a good selection of the regularization function and demonstrates the necessity of user interaction in the reconstruction process. It also shows that the reconstructed images are still models of the object’s brightness distribution, and caution must be taken when physical parameters are derived from them.

Fig. 4

Best reconstructed monochromatic images using SQUEEZE with four different regularizers. The leftmost panel displays an average of the model image for comparison. The colormap of the images is normalized to the maximum pixel value in the model frame. Notice the dependence of the reconstructed image according to the used regularization function

While the monochromatic images are crucial as a first step of the reconstruction, one of the main goals of the new generation of infrared interferometers is to recover the morphological changes of astrophysical objects across the bandpass of the observations. This aspect is particularly important for MATISSE, which will have a bandpass as large as Δλ ∼ 5 μm in the N-band. We therefore explored this capability by performing a polychromatic reconstruction that includes the differential phase information of the data. The initial setup of the reconstruction uses the best image from the monochromatic reconstruction as a starting point. In the previous reconstructions, the L0-norm and the TV regularizers exhibited the best performance; therefore, for this new reconstruction, both of them were used together with a transpectral L2-norm regularization. Transpectral in this case means that the regularization was also applied along the wavelength direction; more information about the implementation can be found in the SQUEEZE documentation. The hyperparameters were tuned manually around the best values obtained from the L-curve analysis of the monochromatic case. Nevertheless, we are aware that selecting the optimal values from a multi-dimensional L-curve of the regularizers used could further optimize them. Fourteen images were recovered, each corresponding to one of the simulated spectral channels. Figure 5 displays the recovered chromatic images. It is clear that, with this initial setup, the rim morphology was reproduced in all the reconstructed channels. However, the central source was only recovered in the first four of them. It is important to mention that the total flux of the central source corresponds to only a small percentage of the total flux of the object: even in the first spectral channel at 8.18 μm it corresponds to only 5% of the total flux, decreasing for longer wavelengths down to ∼0.8% at 12.72 μm.
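
A minimal sketch of what a transpectral L2-norm penalty can look like follows; the exact form of the penalty and the cube shape are assumptions for illustration, and SQUEEZE's actual implementation may differ in detail.

```python
import numpy as np

def transpectral_l2(cube):
    """Sum of squared differences between adjacent spectral channels.

    cube has shape (nchan, ny, nx): one image per spectral channel.
    Penalizing these differences encourages neighbouring channels to
    look alike, i.e. smooth variation along the wavelength direction.
    """
    dspec = np.diff(cube, axis=0)      # difference along the wavelength axis
    return float(np.sum(dspec ** 2))

# A cube whose channels are identical pays no transpectral penalty...
flat = np.ones((14, 8, 8))
# ...while a cube that changes abruptly between channels pays a large one.
jumpy = np.ones((14, 8, 8))
jumpy[7:] = 0.0

print(transpectral_l2(flat), transpectral_l2(jumpy))  # 0.0 64.0
```

In a reconstruction, this term would be added to Eq. (1) with its own hyperparameter, alongside the purely spatial regularizers applied channel by channel.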

Fig. 5

Best reconstructed polychromatic images. The figure shows the 14 recovered images corresponding to each of the channels sampled with the simulated data. For comparison, each of the model images is also plotted. The wavelength of each image is shown in the model frames. The colormap of the images is normalized to the maximum pixel value in the model frames. Since the amount of flux in the central source is considerably smaller than the flux in the ring, the central object is only recovered in the first four frames; for the rest of the images, the structure of the ring is dominant

4 Conclusion

  • The recovery of milliarcsecond-resolution interferometric images in the infrared will represent a major breakthrough for the coming generation of beam-combiners. For example, on the one hand, MATISSE will allow us to image astrophysical objects in the mid-infrared with unprecedented resolution, a tremendous advantage with respect to its predecessor MIDI, which only allowed parametric modelling of the interferometric data. On the other hand, GRAVITY will enable the use of phase-referenced information through its so-called dual-field mode. Even the James Webb Space Telescope will include a Sparse Aperture Masking mode, which will require mature infrared image reconstruction algorithms to recover the brightness distribution of the imaged objects.

  • Our current understanding of the image reconstruction problem and the software developed so far allowed us to perform both monochromatic and polychromatic image reconstructions of simulated interferometric data. For the example presented here, we could recover the different components of a prototypical young stellar object. However, we have also shown that image reconstruction is still not trivial and requires a systematic study of the parameters used in the reconstruction, particularly of the different regularizers and of the values of the hyperparameters. It is therefore necessary to compare the results of image reconstruction across several software packages and methods to better understand the systematics. To accomplish this task, the benchmarking methods should be standardized so that the outcome of the different methods can be evaluated in a systematic way.

  • The better we understand the requirements for achieving science-grade images from interferometric observations, the easier it will be to provide the community with tools and procedures that make the current techniques more accessible. This task should be addressed in the coming years as part of an effort to broaden the field and engage more members of the international community.

  • Testing the imaging capabilities of the different algorithms is necessary to characterize them fully, and it is essential for the future of infrared interferometry in Europe.