Non-adaptive Heisenberg-limited metrology with multi-channel homodyne measurements

We show a protocol achieving the ultimate Heisenberg-scaling sensitivity in the estimation of a parameter encoded in a generic linear network, without employing any auxiliary networks, and without the need of any prior information on the parameter nor on the network structure. As a result, this protocol does not require a prior coarse estimation of the parameter, nor an adaptation of the network. The scheme we analyse consists of a single-mode squeezed state and homodyne detectors in each of the M output channels of the network encoding the parameter, making it feasible for experimental applications.


Introduction
Increasing the level of precision achievable in the estimation of physical properties of systems, such as temperatures, optical lengths and magnitude of external fields among others, is one of the multiple applications of quantum technologies that have been extensively studied in the recent years. In particular, the goal of quantum metrology-the field of science laying between quantum mechanics and estimation theory-is to propose and analyse estimation protocols that surpass the precision achievable by classical strategies by employing quantum probes and quantum measurement schemes. In fact it is well known that the classical limit on the precision achievable in the estimation of an unknown parameter when employing N probes, known as shot-noise limit, for which the error is of order 1/ √ N , can be surpassed by quantum strategies achieving the ultimate Heisenberg limit, where the estimation scales as 1/N [1][2][3][4][5][6][7][8][9].
The first proposed protocols reaching Heisenberg-scaling sensitivity heavily employed entanglement as a metrological resource [1][2][3], and several entanglement-based strategies have been recently studied with interesting results, especially in the cases of simultaneous estimation of multiple parameters with non-commuting generators [5][6][7][8][9][10]. Nonetheless, the entanglement fragility and the complicated procedures needed to generate entangled metrological probes, such as NOON or GHZ states [11,12], are two of the challenges that stimulated the search for more feasible estimation schemes making use of protocols implementing metrological resources that are easier to generate and to manipulate. Squeezed light [13,14] manifests useful properties (e.g. robustness to decoherence, relatively easy implementation, reduced noise below the vacuum shot-noise) which make it a perfect candidate as a feasible metrological resource [15][16][17][18][19]. Motivated by these favourable properties, many works have recently focused on the analysis and proposal of Gaussian metrological schemes, namely involving squeezed states as probes and homodyne detection as measurement, and the ultimate precision that these can achieve in the estimation of a single localised parameter [15][16][17][18]20,21], a function of parameters [22][23][24], or a single distributed parameter, such as the temperature or the electromagnetic field, affecting several component of the network [8,[25][26][27][28][29][30]. These schemes typically perform an analysis based on the study of the Cramér-Rao bound (CRB) [31], or its quantum counterpart [32], to assess the ultimate precision achievable in the estimation, and show whether the Heisenberg scaling can be achieved. Although the CRB analysis does not generally assure that the ultimate bound on the precision found can be achieved globally, namely with an estimation strategy which does not depend on the true value of the parameter to be estimated [33,34], local estimations are relevant in the typical interferometric framework, in which small deviations of the parameters need to be measured. Interestingly, it has been recently found that it is always possible to reach Heiseneberg-scaling sensitivity regardless of the structure of the network encoding the parameter, only employing a single squeezed vacuum state, a single-homodyne measurement, and an auxiliary network suitably engineered, whose preparation only requires a knowledge on the unknown parameter that can be obtained by a classical measurement [26,29,30]. The need for an auxiliary stage in such protocols arises from the fact that in general the probe is scattered by the network in all its output ports, while only a single port is eventually a e-mail: danilo.triggiani@port.ac.uk (corresponding author) b e-mail: vincenzo.tamma@port.ac.uk  Fig. 1 Optical set-up for the estimation at the Heisenberg-scaling precision of a single unknown parameter ϕ encoded arbitrarily in a generic passive Mchannel linear network. The parameter can be either localised in a single element of the network, or represent a global property affecting several components, such as a temperature or a magnetic field. A single source of coherent squeezed states with N = N D + N S average photons, where N D = d 2 and N S = sinh 2 r are the number of displaced and squeezed photons, respectively, is employed in a single input channel (the first in figure), while homodyne measurements are taken in every output channel measured through homodyne detection, so that an auxiliary network is required in order to refocus the probe on the only channel observed. A question that naturally arises is whether incrementing the number of observed channels would ease, if not completely lift, the requirement of an auxiliary stage and ultimately the requirement of a prior classical knowledge on the unknown parameter. Moreover, different Gaussian protocols rely on encoding the information about the unknown parameter on the displacement of the probe, requiring that a portion of the resources in the probe are employed in a non-vanishing displacement [28,35]. Despite concentrating all the photons in the squeezing is known to be the optimal allocation of the resources in the probe [26], encoding the parameter into a non-vanishing displacement can reduce the estimation process into the relatively simple task of inferring the parameter from the expectation value of a Gaussian probability density function [28].
In this work we investigate the ultimate precision achievable in the estimation of a parameter encoded in a generic linear network, when employing a single-mode squeezed coherent Gaussian state and performing homodyne detection on all the output channels (see Fig. 1). We show that, without making any assumption on the structure of the linear network nor on the nature of the parameter, it is always possible to reach Heisenberg-scaling sensitivity with such set-up, without the use of any auxiliary network. This allows for estimation protocols not requiring a preparatory stage nor a prior coarse estimation of the parameter, as opposed to the single-channel homodyne protocols in Refs. [29,30]. We also show that two independent contributions on the precision arise from our analysis: one originated from the presence of displaced photons in addition to squeezed photons, and the other from the squeezing of the probe. We find that both contributions can reach Heisenberg-scaling sensitivity independently, and this can be achieved expectedly when the local oscillators phases are chosen such that the noise in the outcome is reduced, namely when the squeezed quadratures are observed in each output channel. Thus, differently from the schemes in Refs. [29,30], it is then possible to employ this set-up to retrieve information on the parameter through measurements of the average value (i.e. the displacement) of the signal observed with homodyne detection, as well as through the modulation of the noise. Although it is not required to reach the Heisenberg-scaling sensitivity, the presence of an auxiliary network in general affects the precision of the estimation through a pre-factor multiplying the scaling. This comes in useful in those cases where priority is given to increasing the precision, at the expenses of engineering an auxiliary network to be added before the estimation protocol is started.

Set-up
Let us consider a generic M × M passive linear network whose action on any injected photon probe is given by the unitary operator U ϕ , in which the unknown parameter ϕ to be estimated is encoded in an arbitrary manner. The linearity and passivity of the network allow us to describe it with an M × M unitary matrix U ϕ related to the evolution operatorÛ ϕ bŷ The input probe is prepared in a single-mode squeezed coherent state |Ψ in =D 1 (d)Ŝ 1 (r )|vac with N = sinh 2 r + d 2 /2 ≡ N S + N D average number photons, whereŜ 1 (r ) = exp r (â †2 1 −â 2 1 )/2 is the squeezing operator with real squeezing parameter r , and 2 is the displacement operator with real displacement d, and it is injected in one input channel, say the first, of the linear network. In this case, only the first row of U ϕ is relevant in this protocol where we have made explicit the probability P j that each photon exits from the jth output port of the network, and the phaseγ j acquired in the process, with j from 1 to M. The unitarity of U ϕ assures that j P j = 1. A homodyne detection is then performed at each of the output channels, and the quadraturesx i,θ i are measured, where θ i is the ith local oscillator reference phase, from which we want to infer the value of ϕ. Due to the Gaussian nature of the scheme, the joint probability distribution p(x|ϕ) associated with the M-mode homodyne measurement is Gaussian Here, Σ is the M × M covariance matrix with elements (see "Appendix A") where δ i j is the Kronecker delta, γ i =γ i − θ i is the phase delay at the output of the ith channel relative to the correspondent local oscillator, and |Σ| is the determinant of Σ, which reads (see "Appendix A") and μ is the mean vector For any given unbiased estimatorφ, the statistical error in the estimation of ϕ after ν iterations of the measurement is limited by the Cramér-Rao bound (CRB) [31] Var where F (ϕ) is the Fisher information associated with the Gaussian distribution (3), and reads (see "Appendix B") where C = |Σ| Σ −1 is the cofactor matrix of Σ and Tr[·] denotes the trace. In the following we will discuss in detail expression (9) in the asymptotic limit of large N , showing which condition must be met in order for this set-up to reach Heisenberg-scaling precision in the estimation of ϕ, and compare differences and advantages with respect to the Heisenberg-scaling single-homodyne schemes [29,30]. We conclude this section by remarking that, in the case of a single channel, M = 1, the last term in the right-hand side of (9) vanishes, and thus the only relevant terms are the first two, containing the derivative of the mean μ and of the determinant of Σ, which reduces to the variance of a single-homodyne measurement. Interestingly enough, we will show that also in the multi-homodyne case, only the first two terms are relevant for the Heisenberg scaling in the asymptotic regime.

Heisenberg scaling of the Fisher information
In order to investigate the asymptotic behaviour of the Fisher information (9), it is convenient to express the elements of the cofactor matrix C in terms of the squeezing factor r (see "Appendix B"): where Notice that every element of C in the previous expressions, and of Σ in (4), scale at most as quick as N , namely , and the same asymptotic bounds hold for their derivatives with respect to ϕ, since neither P i norγ i depend on N . For this reason, in order for the Fisher information in (9) to asymptotically grow with Heisenberg scaling, it is essential to study the asymptotics of the determinant |Σ| in (5) and find the conditions for which it does not grow with N .
In fact, it is evident from Eq. (5) that in general |Σ| = O(N S ), and we show in "Appendix C" that the necessary condition for it to scale slower than N S is that the relative phases γ i tend to ±π/2 for large N S : in other words, the larger the number of photons employed in the squeezing of the probe to reach higher precisions, the closer the local oscillator phase needs to be tuned to the minimum-variance quadrature of each mode.
More precisely, as shown in "Appendix C", the conditions to reach Heisenberg scaling in the Fisher information (9), read When these conditions hold, we can introduce the finite quantities k i = lim N S →∞ N S (γ i ∓ π/2), and the determinant |Σ| reduces to while ∂ ϕ |Σ|, ∂ ϕ Σ, ∂ ϕ C and C tend to constant values, and ∂ ϕ μ scales as √ N D , thus making only the first two terms of the Fisher information dominant for large N .
Noticeably, both terms in the asymptotic Fisher information (14) give a Heisenberg-scaling precision in the estimation of the parameter ϕ, provided that both the average number of photons in the displacement N D and in the squeezing N S scale with the total average number of photons Moreover, it is worth noticing that the first term in Eq. (9), and thus in Eq. (14), depends on the information encoded in the displacement of the probe, and thus it vanishes if μ = 0, namely if the probe is a squeezed vacuum and N D = 0. The second term instead depends on the information on ϕ encoded in the variance of the measurement itself: it arises only from the interaction with the squeezed photons and vanishes if ∂ ϕ |Σ| = 0, namely when k avg = 0 in Eq. (14), and in particular when γ i = ±π/2, for i = 1, . . . , M, in Eq. (12), corresponding to quadratures with minimum squeezed variances, and thus locally insensible to the variations of the parameter.
Interestingly, this latter case is similar to the single squeezed vacuum and single-homodyne scenario found in the literature [29,30]: in fact, the second term in (14) represents a generalisation of the single-homodyne Fisher information F 1 (ϕ) = 8 (k)(∂ ϕ γ ) 2 N 2 , and it can be obtained by substituting k and ∂ ϕ γ with their averages over the probabilities P i , namely k → i P i k i and ∂ ϕ γ → i P i ∂ ϕ γ i .
We have then found that, also when employing multiple homodyne detections, one for each output port of the interferometer, the Heisenberg-scaling precision obtained through measurements of the squeezed noise (i.e. ∂ ϕ μ = 0) is only reached when the quantum fluctuations of the observed quadratures are reduced to their quantum limit, i.e. |Σ| = O(N −1 S ), while the variations of the unknown parameter ϕ still yield a visible effect on the outcomes of the measurements, i.e. ∂ ϕ |Σ| is not vanishing.
However, at the expense of introducing a nonzero displacement in the probe, it is possible to relax the condition ∂ ϕ |Σ| = 0, thus allowing us to choose k i = 0 in Eq. (13), thus effectively measuring the maximally squeezed quadratures at γ i = ±π/2. Indeed in such a case, even if the contribution to the Fisher information associated with only the squeezed photons in Eq. (14) is vanishing, it is still possible to reach Heisenberg-scaling precision through the information on the parameter encoded in the displacement of the probe.
An important feature of this protocol, which differentiate it from its single-homodyne counterpart, is that it does not require any adaptation of the network to the value of the unknown parameter, namely no auxiliary networks needs to be added at the input nor the output ofÛ ϕ to reach Heisenberg-scaling precision. The only condition (12) can be thought as a minimum-resolution requirement on the local oscillators phases, which can thus be achieved without adding further auxiliary networks.
However, this does not mean that the form of the network U ϕ does not affect the precision of our protocol in the estimation of ϕ: the terms k avg and (∂γ ) avg appearing in the constant factor in the Fisher information in Eq. (14) depend on the transition probabilities P i and on the derivatives of the relative phases ∂ ϕ γ i . In particular, an exceptionally poorly conceived network, e.g. one for which γ i is independent on ϕ for every i such that P i = 0, can be associated with a null factor (∂γ ) avg that sets to zero the Fisher information. In this case, adding a ϕ-independent auxiliary network V , either at the input or at the output of U ϕ , might modify both P i and γ i , and thus (∂γ ) avg .

Conclusions
We have shown that performing homodyne measurements at each output channel of an arbitrary linear network encoding an unknown distributed parameter ϕ to be estimated allows us to reach Heisenberg-scaling precision for a single-mode squeezed probe with no prior information on ϕ. The information on ϕ is encoded both in the displacement and in the squeezing of the probe, leading to two independent contributions which can both provide Heisenberg-scaling sensitivity. We have shown that the determinant of the covariance matrix associated with the measurement outcomes plays an important role in the enhanced sensitivity: in particular, we demonstrated that the conditions to reach Heisenberg scaling in either of the two contributions, which can be met manipulating the phases of the local oscillators, correspond to imposing that the determinant of the covariance matrix is of order N −1 for large N . Differently from protocols involving only homodyne measurements at a single channel, here there is no need for a refocusing auxiliary stage: the procedure is independent of the network and of the value of the parameter. This allows us to safely entrust the measurement operation to an independent party without sharing any information on the structure of the network, possibly opening up a further path towards secure sensing and cryptographic quantum metrology [39][40][41]. On the other hand, we showed that, despite not required to achieve the Heisenberg limit, one can still employ an auxiliary stage to further enhance the estimation precision by a constant factor. As a future step, it would be interesting to extend the results hereby presented beyond the Cramér-Rao bound analysis, also in the framework of a parameter distributed arbitrarily in a network. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Appendix A: Joint Detection Probability
Here we will first obtain the expectation value μ associated with the homodyne measurements on the probe after the interaction with the linear network shown in (6), then we derive the expression of the covariance matrix Σ shown in (4), and its determinant |Σ| in (5).
The initial phase-space displacement of the injected probe α 0 = Ψ in |ẑ|Ψ in , whereẑ = (x 1 , . . . ,x M ,p 1 , . . . ,p M ) and |Ψ in = D 1 (d)Ŝ 1 (r )|vac is a 2M-vector with d = √ N D . This vector is transformed by the linear network, and at the output reads where R is the 2M × 2M orthogonal and symplectic matrix associated with the interferometer unitary matrix U ϕ The local oscillator phases in the homodyne measurements are described by the 2M × 2M orthogonal matrix with Θ = diag(θ ) = diag(θ 1 , . . . , θ M ). This matrix represents a clockwise rotation in phase space for each of the M channels of the network, of angles θ i for the ith mode. The mean vector μ in Eq. (6) is then given by the first M elements of O θ α. The 2M × 2M symplectic covariance matrix Γ 0 of the squeezed stateŜ 1 (r )|vac reads where R is the M × M diagonal matrix R = diag(r, 0, . . . , 0). Once again, the action of the linear network U ϕ is represented by the orthogonal and symplectic matrix R in Eq. (17), so that the covariance matrix of the probe at the output is where, by direct calculation, The covariance matrix Σ = Σ x at the detection stage is then obtained by extracting the first M rows and columns from the matrix O θ Γ O T θ , and thus its elements as shown in (4) can be easily obtained. To evaluate the determinant |Σ|, we first notice that Γ 0 ≡ I/2 + K 0 , where I is the identity matrix and K 0 = Γ 0 − I/2 is a diagonal matrix of rank 2. Being the rank invariant under orthogonal rotations, the same holds true for O θ Γ O T θ ≡ I/2 + K , with rank(K ) = 2. By definition of rank, none of the sub-matrices of K can have rank greater than 2; hence, we can write with rank(A) ≤ 2, being A a sub-matrix of K . We can then apply the result presented in "Appendix E" to Σ and write |Σ| as a sum of determinants of the matrices obtained replacing any number of columns of I/2, with the respective columns of A where the first term is the determinant of I/2, the terms in the first summation are the contributions that arise substituting the ith column of I/2 with the ith column of A, and the terms in the last summations from substituting the ith and jth columns, and we also exploited the symmetry of A. Noticeably, since rank A ≤ 2, all the contributions involving the replacement of three or more columns of A are vanishing. By direct calculation, the expression in (5) can then be easily obtained.

Appendix B: Fisher information
In this Appendix we will obtain the expression for the Fisher information shown in Eq. (9), and the expression for the cofactor matrix in Eq. (10). By plugging the probability density function in Eq. (3) into the definition of the Fisher information (8), one can easily obtain where we used the matrix identity ∂ ϕ Σ −1 = −Σ −1 (∂ ϕ Σ)Σ −1 . We now can express the inverse of the covariance matrix in terms of its cofactor matrix C and its determinant, namely Σ −1 = C/ |Σ|, where the symmetry of the covariance matrix allows us to consider directly the cofactor matrix, and not its transpose. The second term reads We recognise in the first term of Eq. (27) Jacobi's formula for the derivative of the determinant, ∂ ϕ |Σ| = Tr[C∂ ϕ Σ], which allows us to obtain the expression shown in Eq. (9) In order to explicit the cofactor matrix C in terms of the elements of Σ, and thus in terms of squeezing parameter r , transition probabilities P i , and relative phases acquired γ i , i = 1, . . . , M, as displayed in (10), we first need to make some observations. First, the (s, t)-cofactor C st , which is defined as the determinant of the L − 1 × L − 1 sub-matrix of Σ obtained deleting the sth row and tth column, then multiplied by (−1) s+t , can also be thought as the determinant of the L × L matrix Σ [s,t],1 , where we denote with X [s,t],n the matrix obtained from the matrix X replacing all the elements in the the sth row and in the pth column with zeros, except the element (s, t) which is replaced by n, namely Second, as discussed in "Appendix A", Σ = I/2 + A, where A is a symmetric matrix with rank(A) = ρ ≤ 2. Thus, we can write the (s, t)-cofactor as C st = (I/2) [s,t],0 + A [s,t],1 , and evaluate this determinant as a sum of determinants of matrices obtained swapping columns of (I/2) [s,t],0 and A [s,t],1 , as discussed in detail in "Appendix E". Noticeably, by replacing a row and a column of A may increase its rank by one, so that rank(A [s,t],1 ) ≤ 3. It is convenient now to consider separately the simpler case s = t first, and then s = t. We notice that the matrix (I/2) [s,s],0 has a single zero eigenvalue, and thus each contribution to C ss is non-vanishing only if the sth columns of (I/2) [s,s],0 is replaced. We thus obtain which is the sum of terms obtained substituting the sth, the sth and ith, and the sth, ith and jth columns, respectively. Noticeably, replacing more than 3 columns yields vanishing contributions, since rank(A [s,t],1 ) ≤ 3. When s = t, (I/2) [s,t],0 has two null eigenvalues; hence, all the non-vanishing contributions must replace the sth and tth columns. For example, the only contribution obtained swapping the sth and tth columns is of the type where we also exploited the symmetry of A; the contribution obtained swapping the sth, tth and ith columns, with i = s, t are of the type where once again, we exploited the symmetry of A. Replacing more than 3 columns once again yields no contributions since rank(A [s,t],1 ) ≤ 3. The final expression for C st , s = t thus reads Replacing in (30) and (33) the definition of A = Σ − I/2, it is straightforward to obtain Eqs. (10).

Appendix C: Asymptotics
In this appendix we will study the asymptotic regime of the Fisher information (9), for large N = N S + N D = sinh 2 r + d 2 . We will show that the only conditions needed to reach Heisenberg scaling are the ones shown in (12), and that in this case the asymptotic expression for the Fisher information is (14). First, from the explicit expressions of Σ in (4) and C in (10), we notice that each of their matrix elements is at most of order of N S , since sinh r = √ N S , the single-photon probabilities P i are independent of N S , and the cosine and sine functions are limited. The same holds true for their derivatives: in particular ∂ ϕ P i and ∂ ϕ γ i = ∂ ϕγi do not depend on N S , where γ i =γ i − θ i is the phase acquired by the signal through the ith output port relatively to the local oscillators reference phases. Moreover, similar considerations can be applied for μ = O( √ N D ), and specifically for its derivative ∂ ϕ μ. This implies that, in order to reach a Heisenberg-scaling sensitivity, namely a scaling of order of N 2 in the Fisher information, the determinant |Σ| cannot be of any order higher than N 0 .
We thus first focus our attention on |Σ| shown in Eq. (5). In particular, we will suppose that, for large N , γ i tend to finite values γ 0i , namely that γ i = γ 0i + k i N −α , with k i ∈ R of order 1 and α > 0, since if γ i were to grow with N (i.e. α < 0), it would give rise to an oscillating asymptotic behaviour to |Σ|. By expanding the squeezing parameter r in powers of N S in (35), we obtain where In order to cancel the scaling with N S , D 1 must be equal to, or tend to, zero. After some trigonometry, and exploiting the passivity of the linear network U ϕ which sets i P i = 1, we can rewrite which tends to zero iff γ 0i = π/2 + nπ, with n ∈ Z.
In particular, the scaling of |Σ| for large N S will be of order N 0 S or lower only if γ i = π/2 + nπ + k i N −α S , with α ≥ 1/2. To see that, we notice that, for γ 0i = π/2, D 1 and D 2 scale with N −2α S , while D 3 scale with N 0 S . Thus, |Σ| scales with N 1−2α S for α ≤ 1 (and in particular with N 0 S for α = 1/2), and with N −1 S for α > 1. Noticeably, also D 2 tends to zero iff γ 0i = π/2 + nπ. We now study the asymptotics of the numerators appearing in the Fisher information, when the condition γ i = π/2 + k i N −α S , with α ≥ 1/2, is true. We first obtain the derivative of Σ from Eq. (4), and substitute γ i = π/2 + k i /N α S , with α ≥ 1/2, and we notice that it scales at most with N 1−α for 1/2 ≤ α < 1 and at most with N 0 for α ≥ 1. We then analyse the auxiliary term (11) of which we evaluate the derivative when which scales at most with N 1−α S for large N S . Moreover, the covariance matrix Σ in (4) asymptotically reads and thus, inserting (42) and (44) in the cofactor matrix (10), we obtain Finally, we can write It is easy to see that ∂ ϕ |Σ| scales at most with N 1−α S , since M i ∂ ϕ P i = 0, while the derivatives of the elements of C scale at most with N 1−α S for 1/2 ≤ α < 1, and at most with N 0 S for α ≥ 1. Our last step is to evaluate the asymptotics for ∂ ϕ μ, which can be easily evaluated differentiating equation (6) Now we can finally draw our conclusions and obtain the scaling of the Fisher information (9) by putting together all the asymptotic regimes found. First, we notice that, independently of α ≥ 1/2, the last term in Eq. (9) always scales with N S , hence only reaching shot-noise precision. The only terms which allow the Heisenberg scaling are then the first two: in fact, the first term scales with N D N 2α−1 S for 1/2 ≤ α ≤ 1, and with N D N S for α > 1, and thus reaching Heisenberg scaling for α ≥ 1 (condition α > 1 includes the case γ i = π/2), while the second term reaches sub-shot-noise scaling N 2−2|α−1| S for 1/2 < α < 3/2, with Heisenberg scaling for α = 1.
This shows that the only condition to reach Heisenberg scaling is that for i = 1, . . . , M, as shown in (12), or equivalently that asymptotically γ i π/2 + k i /N S with k i ∈ R of order 1, as well as that the only relevant terms under this condition are the first two in expression (9), and that only for α = 1 the first term is non-vanishing.
We can now finally prove (14). Substituting condition (50) inside Eqs. (41) and (43), we obtain from (46) the asymptotics where we once again exploited the passivity of the interferometer, so that M i=1 P i = 1, while we obtain from (36) which is the expression shown in (13). Lastly, from Eq. (49) when conditions (50) hold, we obtain the asymptotics Eqs. (51), (52) and (53) yield the asymptotic expression for the Fisher information shown in (14).

Appendix D: Maximum-likelihood estimator
We will now obtain the implicit equation that defines the maximum-likelihood estimatorφ MLE , which saturates the Heisenbergscaling Cramér-Rao bound (7), with the Fisher information given in Eq. (14), in the asymptotic regime of large ν. Let us then imagine that, after ν iterations, we collect ν sets of outcomes x i , i = 1, . . . , ν from the M homodyne measurements at the output of the linear networkÛ ϕ . The likelihood that these outcomes are observed for a given value ϕ of the unknown parameter is given by where p(x j |ϕ) is the probability density function given in (3). The maximum-Likelihood estimatorφ ≡φ(x 1 , . . . , x ν ) is defined as the value of ϕ which maximises the likelihood L that the outcomes x 1 , . . . , x ν are observed, and it is usually found by maximising the log-likelihood function where we exploited Jacobi's formula for the derivative of the determinant of a matrix, the identity Tr(Σ −1 ∂ ϕ Σ) = −Tr(∂ ϕ Σ −1 Σ), and the symmetry of Σ. Once inserting the expressions for the covariance matrix Σ from (4) and for the mean vector μ from (6) into the previous equation, one is able to obtain with numerical methods the maximum-likelihood estimator as the ϕ that solves (55). Equation (55) largely simplifies when the probe is a squeezed vacuum state-i.e. μ = 0-or when the maximally squeezed quadratures are measured-i.e. ∂ ϕ |Σ| = 0. In the first case, it becomes where it is possible to recognise the usual mean squared error estimatorΣ = ν j=1 x j x T j /ν for the variance matrix Σ, so that the solution of Eq. (56) can be seen as the value of ϕ that sets equal to zero a weighted mean of the elements of covariance matrix estimator. In the latter case, Eq. (55) becomes which can be seen as a weighted mean of the estimatorsμ = ν j=1 x j /ν of the mean μ.

Appendix E: Useful formulas for the determinant of a sum of two matrices
Let us consider an L × L matrix Z which can be written as is a real diagonal matrix, and rank(W ) = ρ ≤ L. In this appendix, we will show a way to write the determinant |Z | in terms of the elements of W , which is convenient for our purposes.
We exploit the identity [42] |Z | = |D where Θ α (X, Y ) is the sum of the determinants of the matrices obtained by replacing any set of α columns (rows) of X with the corresponding α columns (rows) of Y . Since rank(W ) = ρ, the rank of any α × α sub-matrix of W is zero if α > ρ, so we can write Let us make explicit the first, easier terms of the summation with the purpose to grasp the gist of this expression. For α = 0, no columns are replaced from D, hence Θ 0 (D, W ) = |D| = k d k . We notice that, if at least one of the eigenvalues of D is zero, this term vanishes. For α = 1, Θ 1 (D, W ) is the sum of determinants of matrices of the form ⎛ with i = 1, . . . , L. Due to the structure of this matrices, their determinant is straightforward and reduces to W ii × k =i d k . This means that, in general, Θ 1 (D, W ) = i W ii × k =i d k .
A key observation to make is that if two or more eigenvalues of D are zero, all these determinants are zero, and thus Θ 1 (D, W ) = 0; if instead a single eigenvalue is zero, say d j = 0, with j = 1, . . . , L, then only one of these determinants is non-vanishing, i.e. the determinant of the matrix obtained by replacing the jth column of D.
In this case then we have Θ 1 (D, W ) = W j j × k = j d k .
Let us now consider lastly the case α = 2. The matrices whose determinants contribute to Θ 2 (D, with i < i = 1, . . . , L. Once again, the determinants of these type of matrices are easy to be evaluated and read |W (i,i ) | k =i,i d k , where We notice that Once again, key observations can be made: if D has at least three null eigenvalues, then Θ 2 (D, W ) is vanishing. If only two eigenvalues are zero, e.g. d j = d j = 0, then there is only one contribution to Θ 2 (D, W ), given by the matrix obtained substituting the jth and j th columns of D, and in this case we have Θ 2 (D, W ) = |W ( j, j ) | k = j, j d k . If only one eigenvalue is zero, namely d j = 0, then θ 2 (D, W ) is given by the sum of all the determinants of the matrices where the jth column has been replaced, namely Θ 2 (D, Similarly, it is possible to extend these considerations to every value of α and finally obtain the compact form where C L α is the set of all the combinations of α items in a set of L, and W (γ ) denotes the α × α sub-matrix of W obtained by selecting the rows and columns with indices γ 1 , . . . , γ α .