Non-unitary evolution of neutrinos in matter and the leptonic unitarity test

We present a comprehensive study of the three-active plus $N$ sterile neutrino model as a framework for constraining leptonic unitarity violation induced at energy scales much lower than the electroweak scale. We formulate a perturbation theory with expansion in small unitarity violating matrix element $W$ while keeping (non-$W$ suppressed) matter effect to all orders. We show that under the same condition of sterile state masses $0.1\, \text{eV}^2 \lesssim m^2_{J} \lesssim (1-10)\, \text{GeV}^2$ as in vacuum, assuming typical accelerator based long-baseline neutrino oscillation experiment, one can derive a very simple form of the oscillation probability which consists only of zeroth-order terms with the unique exception of probability leaking term $\mathcal{C}_{\alpha \beta}$ of $\mathcal{O} (W^4)$. We argue, based on our explicit computation to fourth-order in $W$, that all the other terms are negligibly small after taking into account the suppression due to the mass condition for sterile states, rendering the oscillation probability {\em sterile-sector model independent}. Then, we identify a limited energy region in which this suppression is evaded and the effects of order $W^2$ corrections may be observable. Its detection would provide another way, in addition to detecting $\mathcal{C}_{\alpha \beta}$, to distinguish between low-scale and high-scale unitarity violation. We also solve analytically the zeroth-order system in matter with uniform density to provide a basis for numerical evaluation of non-unitary neutrino evolution.


Introduction
Studies of neutrino oscillation entered into a "matured phase" after the structure of the three-flavor lepton mixing [1] is elucidated. The long-lasted discovery phase of neutrino oscillation has been unambiguously concluded by the Super-Kamiokande (Super-K) atmospheric neutrino observation which discovered neutrino oscillation and hence neutrino mass [2]. It was followed by the KamLAND reactor and the solar neutrino experiments which uncovered the three-flavor nature of the mixing by observing oscillation and/or nonadiabatic flavor conversion of neutrinos in matter [3,4] in the 1-2 sector [5,6]. 1 The last step of understanding the three-flavor structure of neutrino oscillation was carried out by the reactor [13][14][15] and the accelerator [16,17] measurement of θ 13 . It lefts only the two remaining unknowns in the standard three-flavor mixing paradigm, that is, measurement of CP violating phase and determination of neutrino mass ordering. 2 The completion of the theory of the three-flavor neutrino mixing, however, necessitates the paradigm test. A well-known example of such efforts is to verify unitarity of the quark CKM matrix [19]. We have argued in ref. [20] that we may need a different strategy to test leptonic unitarity. That is, first prepare a generic framework which describes unitarity 1 The unique citation of the solar neutrino measurement here must be understood as the representative of all the foregoing solar neutrino experiments [7][8][9][10][11][12]. 2 Recently, however, there exists accumulating indication that CP phase δ takes value around ∼ 3π 2 [18].
violation at certain energy scale, and then confront it to experimental data. We contrasted the two typical alternatives, unitarity violation by new physics at high (E m W ) and low (E m W ) energy scales, which are dubbed as high-scale and low-scale unitarity violation, respectively. 3 They differ in certain characteristic features, such as absence (low-scale) or presence (high-scale) of violation of flavor universality and zero-distance flavor transition.
In a previous paper, we have proposed a model-independent framework for testing leptonic unitarity assuming low-scale unitarity violation [20]. Our framework is based on the three active plus N sterile lepton (called neutrino) system, which is unitary in the whole (3 + N ) dimensional state space but restriction to observables in the active neutrino subspace renders the theory non-unitary in that subspace. It is referred to as the "(3 + N ) space unitary model". We have shown that the restriction of sterile state masses to 0.1 eV 2 < ∼ m 2 J < ∼ 1 MeV 2 is sufficient to make the observables sterile-sector model independent. That is, the neutrino oscillation probability can be written in such a way that it is independent of details of the sterile mass spectrum and mixing with active neutrinos. The model-independent nature of the framework will be translated into that of the constraints obtained, thereby making leptonic unitarity test more powerful. See ref. [24,25] for a comprehensive analysis of the currently available neutrino data with the active plus sterile framework, and [25,26] for analyses of the future experiments.
Can one distinguish between high-scale and low-scale unitarity violation? The answer is in principle yes, given the above difference between them, which can only be done by detecting violation of flavor universality or zero-distance flavor transition. In ref. [20] we have pointed out another way of distinguishing them by observing the probability leaking term in the oscillation probability, which signals existence of energetically accessible sterile states. Unfortunately, this important trace of the hidden sterile sector has neither been mentioned in the context of unitarity test, nor included into the foregoing analyses before our analysis for JUNO [27]-like setting.
In this paper we will give a comprehensive treatment of the (3 + N ) space unitary model. Our particular emphasis in this paper is twofold: • We formulate a novel perturbative framework for the (3 + N ) space unitary model, which we call "small unitarity-violation perturbation theory". It allows us to calculate the oscillation probability in the presence of matter effect comparable in size to the vacuum effect.
• We show that the framework can be used in dual modes: It serves (1) as a suitable framework for leptonic unitarity test in neutrino oscillation experiments, and (2) as a hunting tool for unitarity violation effects, which could serve for another way of distinguishing low-scale unitarity violation from high-scale one.
It must be remarked that the sterile sector model-independent nature of the (3 + N ) space unitary model is demonstrated in ref. [20] only in vacuum and in matter to first order in matter perturbation theory. It is the first goal of this paper to formulate the appropriate framework to address this question, and show that the model-independence holds after inclusion of sizeable matter effect. In fact, we will observe that the same condition on the sterile neutrino masses guarantees this property. It may be useful in the application of our framework to some of the ongoing and next generation neutrino oscillation experiments [16,17,[28][29][30][31][32][33]. The discussion of the second point above, our second goal, will follow once the formulation of perturbation theory is completed. It may be worth to note that there exists difference between low-and high-scale unitarity violation with regard to need for and implication to future neutrino experiments. The scenario of high-scale unitarity violation has been studied extensively in the literature [25,[34][35][36][37][38][39][40][41]. 4 In this case, due to preserved SU (2) × U (1) symmetry at high scales, it is conceivable that the constraints from measurements using probes in the charged lepton sector play a dominant role. On the other hand, in the case of low-scale unitarity violation, neutrino oscillation experiments will play key role in constraining unitarity violation.

Essence of the present and the previous papers
In this section, we present essence of the present and the previous [20] papers in which we try to construct an adequate formulation to describe unitarity violation at low energies, E m W .

Unitary 3 active + N sterile neutrino system
We have shown in the (3 + N ) space unitary model that the active neutrino oscillation probability can be written in a sterile sector model-independent way under the constraint on sterile state masses to 0.1 eV 2 < ∼ m 2 J < ∼ 1 MeV 2 both in vacuum and to first order in matter effect [20]. The lower limit of the sterile mass range comes from the requirement that fast oscillations due to active-sterile and sterile-sterile mass differences are averaged out due to decoherence. Whereas the upper limit is for sterile neutrinos to take part in reactor neutrino experiments, which may be relaxed to m 2 J < ∼ (1 − 10) GeV 2 for accelerator neutrinos. Then, the obvious question is whether the conclusion remains the same when all order effect of matter is taken into account. We will answer the question in the positive in this paper.

Non-unitary evolution of neutrinos in vacuum
Despite large state space of (3 + N ) dimensions the resultant expression of the oscillation probability has a simple form under the above stated restriction on the sterile neutrino 4 Works have also been done on unitarity violation by sterile sector from somewhat different point of view, e.g., if it exists, how it could disturb measurement of lepton Kobayashi-Maskawa phase δCP, or mass ordering. See for example, [42][43][44][45][46][47][48][49]. mass spectrum. In vacuum it has the form in the appearance (α = β) as well as in the disappearance channels (α = β). 5 The ( 2. Probability leakage term C αβ appears reflecting the nature of low-energy unitarity violation in which the probability can flow out into the sterile state space from active neutrino space.
3. Due to non-unitarity, δ αβ term in the unitary case is modified to The points 2 and 3 above are important ones and the clarifying remarks about them are in order: • Presence or absence of the probability leakage term C αβ distinguishes between lowenergy and high-energy unitarity violation [20]. Unfortunately, C αβ may be small because it is of fourth order in W .
• Difference in normalization factor, the second term in (2.1), between unitary and nonunitary cases is of order ∼ W 4 (∼ W 2 ) in the appearance (disappearance) channels.
To understand the latter point, we notice that unitarity in the (3 + N ) space unitary model can be written as in the appearance channel (α = β), and in the disappearance channel (α = β), which justifies the above statement.
We emphasize, therefore, that the probability leaking term C αβ and the another con- in the oscillation probabilities are the same order, O(W 4 ), in the appearance channels. Hence, we do not see any good reasons why the former can be ignored, as was done in the existing literatures. It is also worth to note that O(W 2 ) difference in normalization in the disappearance channel would make detection of unitarity violation more feasible. It is one of the reasons for high sensitivity to unitarity violation that could be reached in disappearance measurement in the JUNO-like setting [20].

Non-unitary evolution of neutrinos in matter
We have also examined in ref. [20] the question of whether inclusion of the matter effect alters the above features of non-unitary evolution of neutrino states in vacuum. We have found that as far as we remain in the region of unitarity violating element |W | 0.1 6 or larger, the matter effect does not alter the above features of the oscillation probability in (2.1) under the same restriction on sterile masses. Notice that |W | 0.1 implies that the unitarity violating effect in the probability is of the order of |W | 4 ∼ 10 −4 , except for the O(W 2 ) difference in normalization constant in the disappearance probability. It is practically the limit of order of magnitude that can be explored by the next generation neutrino oscillation experiments.
We must note, however, that this conclusion is based on our treatment to first order in matter perturbation theory. Hence, the statement is not a very convincing one. Given the fact that setup of some of the next generation long-baseline (LBL) experiments require consideration of matter effect of comparable size with the vacuum effect, it is clear that a better treatment is necessary to understand the influence of the matter effect in the (3 + N ) model. In particular, the conditions that allows us to make the active space oscillation probabilities sterile-sector model independent have to be worked out. This task will be carried out in section 3.
As an outcome of the honest computation using the small unitarity-violation perturbation theory in the (3 + N ) space unitary model to order W 4 , we postulate the following theorem:

Uniqueness theorem
• All the matter dependent perturbative corrections in W in the oscillation probability vanish or can be ignored after averaging over the fast oscillations and using the suppression due to the large energy denominators, leaving only the probability leakage term C αβ , the first term in (2.1).
It must be remarked here that unitarity violation effects which are hidden in non-unitary active space mixing matrix U produces zeroth-to higher order effects of W . The above theorem is only about the terms generated by explicit perturbative corrections in W . 7 As a result of the explicit computation, we show that the oscillation probability in matter between active flavor neutrinos in the (3 + N ) space unitary model to fourth order in W in our small unitarity-violation perturbation theory can be written as where h i (i = 1, 2, 3) denote the energy eigenvalues of zeroth-order states of active neutrinos in matter, and X is the unitary matrix which diagonalize the zeroth-order Hamiltonian used to formulate our perturbation theory. C αβ is given in (2.2). The expression is valid under the restriction on sterile neutrino masses 0.1 For the restriction needed on the sterile state masses for smaller W , see section 3.4.
Notice that the normalization term, the second term both in (2.1) and (2.4), comes out from the contribution of zeroth-order Hamiltonian which contains all orders effect of the matter potential. It is true despite the fact that it is a vacuum effect, which indeed exists in vacuum. 8 The expression (2.4) is a very transparent result in the sense that (1) the vacuum non-unitary mixing matrix U is "dressed" in a simple way by the matter effect represented by X, the unitary matrix which diagonalizes the zeroth-order Hamiltonian, and (2) the probability leaking term and the normalization term stay as they are in vacuum.

Enhanced higher-order W corrections and nature of unitarity violation
We will point out that, despite our above theorem, there are regions of neutrino energy and baseline that condition for suppression due to the large energy denominators is not fully effective. In this region outside validity of the theorem we show that second order correction terms in W , together with the leaking term C αβ , may not be totally negligible, and it could be detectable. If it were the case, it could distinguish low-scale unitarity violation from high-scale one. 7 Our result also means that taking care of all order matter effect does not change the feature obtained by first-order treatment in matter perturbation theory. That is, the same condition on sterile states masses derived by using first-order matter perturbation theory suffices to guarantee the absence of matter-dependent higher order correction terms in W . The reasons for this feature will be explained at the end of section 3.5. 8 In matter, we have 3 j=1 (U X)αj(U X) * βj = 3 j,k,l=1 (U ) αk (U ) * βl X kj X * lj = 3 k=1 (U ) αk (U ) * βk where in the last step, we have used the unitarity relation 3 j=1 X kj X * lj = δ kl .

Small unitarity-violation perturbation theory of neutrino oscillation in matter
We formulate a perturbation theory of the (3 + N ) state unitary model using an expansion parameter of matrix elements of W signifying unitarity violation effects, assuming it small. In the main text we mostly confine ourselves to the formulas to second order in W , but include fourth order terms whenever it is necessary. Our formulation of the perturbative framework in this section will be done aiming at constructing a model-independent framework for leptonic unitarity test. Usage of the same probability formula as a hunting tool of unitarity violation and discriminator between low-scale and high-scale unitarity violation will be discussed in section 5. For simplicity, we take the uniform number density approximation for electrons and neutrons in matter. However, extension to the varying density case is, in principle, straightforward as far as adiabaticity holds.

3 active plus N sterile neutrino system in the flavor basis
The S matrix describes possible flavor changes after traversing a distance x ν α (x) = S αβ ν β (0), (3.1) and the oscillation probability is given by The neutrino evolution in flavor basis in the (3 + N ) space unitary model is governed by the Schrödinger equation Given the flavor basis Hamiltonian H, the S matrix is given by where T symbol indicates the "time ordering" (in fact "space ordering" here). The right-hand side of (3.3) may be written as e −iHx for the case of constant matter density. The flavor basis Hamiltonian H is (3 + N ) × (3 + N ) matrix:

2E
(J = 4, · · ·, 3 + N ). Here, m i (m J ) denote the mass of mostly active (sterile) neutrinos and E is the neutrino energy. ∆ A and ∆ B are related to Wolfenstein's matter potential [4] due to CC and NC reactions, a and b, as Here, G F is the Fermi constant, N e and N n are the electron and neutron number densities in matter. ρ and Y e denote, respectively, the matter density and number of electron per nucleon in matter. In (3.5), U denotes the flavor mixing matrix which relates (3 + N ) dimensional flavor neutrino states to the vacuum mass eigenstate basis as ν ζ = U ζzνz , where ζ runs over active flavor α = e, µ, τ and sterile flavor s = s 1 , · · ·, s N indices, z runs over mostly active i = 1, 2, 3 and mostly sterile mass eigenstate indices J = 4, 5, · · ·, N + 3.
As an application of our framework, we anticipate leptonic unitarity test in the LBL accelerator neutrino experiments which utilize atmospheric-scale neutrino oscillations. Therefore, we assume that the system satisfies the following conditions in formulating our perturbation theory where L denotes baseline, ∆m 2 ji ≡ m 2 j − m 2 i , and . (3.12) They probably ensure that our oscillation probability formulas have applicability to the terrestrial LBL and atmospheric neutrino experiments with baseline up to ∼ 10 4 km and energies from low to high, up to E ∼ 100 GeV. More precise discussions on where our formulas are valid will be given in sections 3.5 and 6.2.

Vacuum mass eigenstate basis, or tilde basis
To formulate perturbative treatment it is convenient to consider the vacuum mass eigenstate basis, the tilde basis, introduced in the previous sectioñ The tilde basis Hamiltonian is related to the flavor basis one as (3.14) The explicit form ofH is given bỹ We parametrize the (3 + N ) × (3 + N ) dimensional flavor mixing matrix U as The matrix U and V are 3 × 3 and N × N matrices, respectively, and W and Z have sizes that just fill in the space. In our (3 + N ) model, unitarity is obeyed in the whole (3 + N ) state space: Then, the HamiltonianH in vacuum mass eigenstate basis is given bỹ As in vacuum, the neutrino oscillation is governed only by the U and W matrices, and is independent of Z and V matrices. It is natural that V matrix does not show up in physical Hamiltonian matrix because the rotations inside sterile basis does not have any physical meaning, if we observe the system only by the Standard Model interactions.

Formulating small unitarity-violation perturbation theory
We now construct the small unitarity-violation perturbation theory. It is natural to consider the framework in which the tilde-basis HamiltonianH is decomposed into the un-perturbed and perturbed parts,H 0 +H 1 , as follows: Therefore, what we mean by "expansion by unitarity violation effect" is an expansion by the W matrix elements. 9 We assume, for simplicity, that all the W matrix elements are small and have the same order s . Then, 3 × N (N × 3) sub-matrix elements inH matt are of order s , while the pure sterile space N × N sub-matrix elements are of order 2 s . For simplicity, we often use the expression "expanding to order W n " which means to order n s in this paper.

Hat basis
To formulate perturbation theory withH 0 andH 1 given above we transform to a basis in which the un-perturbed part of the Hamiltonian is diagonal, which we call the "hat basis". Since the 3 × 3 sub-matrix ∆ a + U † AU inH 0 is hermitian, it can be diagonalized by the unitary transformation with X being the 3 × 3 unitary matrix. Then,H 0 can be diagonalized by using the zeroth-order Hamiltonian in the hat basis. SinceĤ 0 is diagonal it is easy to compute e ±iĤ 0 x : (3.23) Then, the perturbed Hamiltonian is given bŷ The eigenvalues ofH 0 is therefore h 1 , h 2 , h 3 , and ∆ J (J = 4, · · ·, 3 + N ). Therefore, the sterile neutrino masses are affected neither by the active states nor the matter potential in our zeroth-order unperturbed basis. It must be a good approximation because we have assumed that the sterile masses are much heavier than the active ones, and we are interested in the energy region a ∼ ∆m 2 31 .
To do real calculations of the S matrix elements we must solve the zeroth order Hamilto-nianH 0 . This task will be carried out in section 4.2, in which we derive explicit expressions of the eigenvalues h i and the unitary matrix X. Now, we formulate perturbation theory with the hat basis Hamiltonian,Ĥ 0 in (3.22) andĤ 1 in (3.24) after a clarifying note in the next subsection.

The relationship between quantities in various bases
So far we have introduced the tilde-and the hat-basis: where X is given by eq. (3.21). Therefore, Notice that both U and X are unitary, and hence UX is unitary too. The relationship between wave functions of various basis are given bŷ where y denote the hat-basis indices. Using the explicit parametrization of the U matrix we have It may be helpful for our discussions later to understand the relationship between S andŜ matrix elements. For this purpose, we denote them in the block form where the subscripts a and S indicate that they act (for the right index) to the active or the sterile subspaces. Notice that S aa and S aS , for example, are 3 × 3 and 3 × N matrices, respectively. Then, the relationship between S andŜ matrix elements can be written explicitly as

Computation ofŜ matrix elements
To as Ω(x) obeys the evolution equation where Then, Ω(x) can be computed perturbatively as where the "space-ordered" form in (3.35) is essential because of the highly nontrivial spatial dependence in H 1 . Upon obtaining Ω(x),Ŝ matrix can be obtained aŝ By knowingŜ matrix elements, the S matrix is obtained by using (3.27). The perturbing Hamiltonian H 1 defined in (3.34) has a structure (3.37) That is, (H 1 ) ij = 0 in the whole active neutrino subspace. The non-vanishing elements of H 1 are as follows: Inserting eq. (3.38) into (3.35), we can compute all the Ω matrix elements. The simplest ones in first order in H 1 , the second term in (3.35), are given by which serve as a building block of the perturbation series because of the structure in (3.35). The notation " [1]" implies that the terms come from first order perturbation with H 1 . For more about notations, see appendix A. We need to compute up to fourth order in H 1 because we want to keep all the order W 4 terms. The requirement arises because the probability leaking term, whose observation is crucial to distinguish between low-energy and high-energy unitarity violation, is of order W 4 . The other normalization term, the second term in (2.1), also deviates from the one in unitary case by a quantity of order W 4 in the appearance channels, but in an implicit way. The resulting expressions ofŜ matrix elements to order W 4 are summarized in appendix A.
There exists important consistency check in the calculation. That is, the identity relation betweenŜ matrix elements that follows from generalized T invariance: 10 whereŜ Ji is obtained by performing the exchange h i ↔ ∆ J inŜ iJ . The generalized T invariance relation will be explicitly verified by the computed results ofŜ matrix elements in appendix A. 11 To carry out a complete consistency check, we have obtained all theŜ matrix elements (includingŜ IJ ) to fourth order in W , and verified their generalized T invariance. However, only the ones which are required to obtain S matrix elements to fourth order are exhibited in appendix A.

Computation of S matrix elements
Given the results ofŜ matrix elements it is straightforward to calculate S matrix elements by using the formulas in eq. (3.27). The active neutrino space S matrix elements can be 10 As in the Standard Model in particle physics T invariance is broken in our system only by complex numbers in the mixing matrix. 11 SinceĤ system is a consistent dynamical system it is legitimate and easier to verify generalized T invariance in theŜ level, though it can be done in the S matrix level as well.
written in perturbative forms, Using (3.41) the explicit expressions of S matrix elements can be easily obtained with use ofŜ matrix elements given in appendix A. For example, S αβ in zeroth and second orders in W are given, respectively, by and

The oscillation probability to second order in W
In this section, we discuss the oscillation probability to second order in W . It is to illuminate the principle of calculation, how averaging over the fast oscillation works, and to show which constraints are obtained on the sterile state masses by the requirement of suppression by the energy denominator to make these sterile-sector model dependent terms negligible. Of course, we will calculate in this paper all the oscillation probabilities P (ν β → ν α ) in matter to fourth order in W to keep the necessary term, the probability leaking term C αβ , as mentioned earlier. The key features of the fourth-order terms will be described in the next section 3.6.
The following formulas include the cases of both disappearance (α = β) and appearance (α = β) channels. The oscillation probability P (ν β → ν α ) is given to second order in W as Notice that there is no matter dependent terms without suppression either by high-frequency oscillations ∝ cos(∆ K − h m )x (or sin), or by large energy denominators ∝ 1 ∆ K −h k . We take averaging over fast oscillations due to active-sterile and sterile-sterile mass squared differences which leads to where ... stands for averaging over neutrino energy within the uncertainty of energy resolution, as well as averaging over uncertainty of distance between production and detection points of neutrinos. 12 The second approximate equalities in (3.45) assume that there is no accidental degeneracy among the sterile state masses. That is, we assume that the relation |∆m 2 JK | |∆m 2 31 | always holds.
After averaging out the fast oscillations, P (ν β → ν α ) is given to second order in W by where the zeroth-order terms are rewritten in a familiar way. However, the expression in (3.46) does not mean a great simplification. That is, the matter dependent terms with sterile sector model-dependence remain.

Suppression by the energy denominator
We, therefore, have to examine the effect of suppression by the large energy denominator which characterizes transition between active-sterile sates, 1/(∆ K − h k ). We demand that the matter dependent terms in (3.46) be smaller than the probability leaking and the normalization terms of order ∼ W 4 . It leads to where L is the baseline distance and i and J denote, respectively, generic indices for active and sterile states. For notational convenience, we define λ i (i = 1, 2, 3) to be the eigenvalues of 3 × 3 submatrix 2EH 0 in (3.19) corresponding to the active neutrino mass squared in matter. Then, h i = λ i 2E . As one can easily see 13 the last one in (3.47) gives the severest 13 One can show that L ∼ 2E constraint (taking the matter potential due to CC in A and removing the factor 1 2E ) (3.48) Notice that, in order for the first inequality in (3.48) to be valid, we have restricted the energy region for a given matter density such that λ i remain in the order of active neutrino masses. Roughly speaking, it corresponds to −50 See e.g., figure 3 of ref. [50]. Clearly, it excludes the interesting region of "IceCube resonance" due to sterile neutrino mass of eV scales [51], for which an entirely different theoretical framework would be necessary. Then, we notice that in a regime W 2 ∼ 10 −2 , the condition in (3.48) is valid given the estimation (assuming Y e = 0.5) That is, the second-order matter dependent correction terms can be ignored in comparison with O(W 4 ) terms if ∆m 2 Jk > ∼ 0.1 eV 2 , which is already required in vacuum. If we want to treat the regime W 2 > ∼ 10 −n , we need to limit the sterile masses to ∆m 2 Jk m 2 J > ∼ 10 (n−3) eV 2 to keep our (3 + N ) space unitary model insensitive to details of the sterile sector [20]. We note, however, that terms of order W 4 ∼ 10 −4 may be the limit of exploration for near future neutrino oscillation experiments.
The condition (3.48) is identical with the one obtained using the first order matter perturbation theory [20], which may look strange to the readers. Let us understand the reason why taking care of all order matter effect does not alter the condition obtained by first-order treatment in matter perturbation theory. The matter-dependent term in the zeroth-order HamiltonianH 0 only involves U matrix, but no W matrix. Since we treat H 0 in an unperturbed fashion it produces all-order effect of the matter potential which is however independent of W matrix elements. On the other hand, perturbative effects that come from single or double powers in W inĤ 1 are always accompanied by the matter potential in the form of W A or W † A, as in eq. (3.37). That is, perturbative effect of W is always accompanied by matter potential, and hence can always be dealt with matter perturbation theory. It is the reason why the matter perturbation theory is able to yield the same condition on sterile masses as obtained in a fuller treatment of matter effect done in this paper. 14

The oscillation probability in fourth order in W
The oscillation probability in fourth order in W contains the two terms (3.50) We will show in appendix C.1 that the first term in (3.50), after averaging over the fast oscillations and using the suppression by energy denominator as discussed in the previous section, leaves the unique term, the probability leaking term C αβ in (2.1). We will also show in appendix C.2 that the second term in (3.50), under the same treatment for the first term, gives vanishing contribution. Therefore, no matter-dependent term remains after using energy denominator suppression and averaging over the fast oscillations.
In conclusion, the oscillation probability in matter between active flavor neutrinos in the (3 + N ) space unitary model to fourth order in W in our small unitarity-violation perturbation theory can be written as in eq. (2.4) in section 2. We hope that it serves as a useful tool to test leptonic unitarity in various ongoing and future neutrino oscillation experiments.

4
Analytical and numerical methods for solving non-unitary evolution in matter In this section, we describe the numerical and analytical methods for calculating the neutrino oscillation probability by solving non-unitary evolution in matter.

Numerical method for calculating neutrino oscillation probability
We describe a numerical method for computing the oscillation probability in matter. This method can be used, assuming adiabaticity, in cases with varying matter density. We show that in zeroth order in W the system simplifies to an evolution equation in the 3 × 3 active subspace.
We solve the Schrödinger equation in the vacuum mass eigenstate basis ("tilde basis"), where i = 1, 2, 3 and J = 4, 5, ···, 3+N denote mostly active and mostly sterile neutrino mass eigenstate labels, respectively. The initial condition with only active component implies Using the solution of equation (4.1), we need the wave function of active flavor component to calculate the probability at baseline x = L.
Therefore, in the mass-basis formulation only U and W are involved, which is consistent with our experience in W perturbation theory. An apparent contradiction to this property that one faces in the evolution equation in the flavor basis is resolved in appendix D.
A drawback of this method is that we have to solve explicitly the evolution of the sterile states which are coupled to the active states. Then, we need to specify the sterile sector model, and have to know how to deal with averaging over the fast modes.
We notice, however, that in the zeroth-order in W the system simplifies. Since the HamiltonianH is block-diagonal it suffices to solve the equation only in the 3 × 3 active neutrino subspace: The initial condition (4.2) and final reverse-back formula (4.3) involve only U matrix elements. Therefore, the oscillation probability in the zeroth-order in W can be calculable in a manner independent of sterile sector models. 15

An exact solution of zeroth-order oscillation probability
Here, we describe a method for obtaining the analytical solution of the zeroth-order Hamiltonian. The exact solution, as well as the numerical one described in the previous section, provides the basis for computing the higher order corrections in W .
We calculate an exact form of the oscillation probability P (ν β → ν α ) in leading order in our perturbative framework, the one in (2.4) except for C αβ , in the case of uniform matter density.
The zeroth-order S matrix element S (0) αβ in (3.42) can be written as 5) and the factor in parenthesis can be calculated by the KTY technique [52]. We want to diagonalize the Hamiltonian For our notational convenience we call this form of H 0 as H d . Note that h i = λ i 2E where with T = (2E) TrH 0 , (4.11) The adjugate of H 0 is defined as AdjH 0 ≡ (H 0 ) −1 detH 0 . Notice that T , A and D are invariant under unitary transformation of H 0 → KH 0 K † with K any unitary matrix and so are λ i . Following the notation in [52] we define p ij and q ij as (i, j = 1, 2, 3) (4.14) Notice that p ij and q ij are written only by the known (or given) quantities. Then, the equations together with unitarity of X, become the equations to determine XX † : They lead to the solution (k = 1, 2, 3) where k, l, m is cyclic, and sum over k is not implied in (4.17). Therefore, to zeroth-order in W expansion, the S matrix elements are given by , (4.18) and the oscillation probability by P (ν β → ν α ) = |S (0) αβ | 2 . Finally, armed with the solution (4.17), we can also calculate all higher order terms in oscillation probability for e.g. those in eq. (3.46) since only such combination X ik X * jk (no sum over k implied) can appear. In leptonic unitarity test, a clear understanding of the relationship between low-scale and high-scale unitarity violation may be one of the key issues. While the presence of probability leaking term C αβ , if detected, clearly testifies for low-scale unitarity violation [20], it would be better to provide a global view of the difference between them, looking for alternative ways to reach the goal. In this section, we would like to give a preliminary discussion on this topics. One would guess, on intuitive ground, that at zeroth order in W our system describes high-scale unitarity violation. There is no "W corrections" in high-scale unitarity violation because the energy scale is so high that the high-mass sector is truncated. It is in agreement with the formulations in ref. [25] with which we share the same evolution equation (4.4) in the vacuum mass eigenstate basis. See also [34]. 16 Based on this perspective, we just conclude this subsection by stating the generic characteristic differences between high-scale and low-scale unitarity violation: • Our system calculated with evolution equations (4.4) in leading (zeroth) order in W expansion, which is common to both high-and low-scale unitarity violation, would serve as a first indicator to the existence of leptonic unitarity violation. Existence of second and higher order corrections in W , if detected, uniquely identifies the case for low-scale unitarity violation.
• Presence (low-E) or absence (high-E) of the probability leaking term C αβ in (2.2) remains to be the best possible way to distinguish between high-and low-scale unitarity violation. 16 However, it appears that the flavor basis formulation of neutrino evolution in matter in high-scale unitarity violation poses highly nontrivial features such as non-Hermitian Hamiltonian [34], or the evolution equation i d dx να = j U ∆a + U † AU U † αβ ν β [41]. The latter is not equivalent to (4.4) in the vacuum mass eigenstate basis due to non-unitarity of the U matrix.

Oscillation probabilities with and without unitarity violation
Under the hope to uncover in which region of parameter space the effect of non-unitarity is most prominent, we first compare the oscillation probabilities with and without unitarity violation. We examine the three channels ν µ → ν e , ν µ → ν τ , and ν µ → ν µ . We do not intend to do quantitative analyses, nor attempt to cover the whole parameter space. Yet, we try to give the readers a feeling on how large and where are the effects of unitarity violation. Therefore, we present the results just for a specific choice of the parameters. Also, the use of uniform matter density ρ = 3.2 g cm −3 over the entire baseline is not realistic. 17 Notice that the results given in section 5.2 apply to both high-scale unitarity violation as well as low-scale one in its leading order in W . The probability leaking term is not included in the analysis in this section 5.2, but the effect is discussed in section 5.3.
non-unitary in E −L space. Here, the superscript (0) implies that it is calculated in zeroth-order in W using (4.4) with appropriate initial condition and final projection to flavor eigenstate. In most of the E − L space P (ν µ → ν e ) is small. However, we identify the two regions where P (ν µ → ν e ) is relatively large, > ∼ 0.3. One of them is at low energy, E < ∼ a few hundred MeV, and baseline L > ∼ 1000 km. The other one is a region E ∼ 10 GeV and L ∼ 10000 km. The former may be understood as due to the solar MSW enhancement, and the latter as the atmospheric MSW enhancement [3,4]. Roughly speaking, the regions with relatively large |∆P (ν µ → ν e )| overlap with these regions.

5.2.3
P (ν µ → ν τ ) and P (ν µ → ν µ ) In figures 2 and 3, the same quantities (in each upper (a) and lower (b) panel) are presented but in ν µ → ν τ and ν µ → ν µ channels, respectively. In contrast to ν µ → ν e channel, 17 One can apply our formulas of S matrix obtained under the constant matter density approximation to semi-realistic calculation for earth crossing neutrinos by using them in each shell (core, mantle, and crust regions, etc.) with proper connecting conditions at the boundaries. 18 It can be re-parametrized in terms of the "α matrix parametrization" defined in ref. [39].
non-unitary in space spanned by neutrino energy E and baseline L. In the lower panel (b), the iso-contour of the difference ∆P (ν µ → ν e ) ≡ P (ν µ → ν e ) standard − P (ν µ → ν e )  In the upper panel (a), presented is the iso-contour of P (ν µ → ν τ ) non-unitary is presented. The parameters used are the same as in figure 1. In the upper panel (a), presented is the iso-contour of P (ν µ → ν µ ) non-unitary is presented. The parameters used are the same as in figure 1. P (ν µ → ν τ ) and P (ν µ → ν µ ) contours are globally "vacuum effect dominated", apart from the solar MSW region, both in the standard (not shown) and the non-unitary cases. The first oscillation peak of P (ν µ → ν τ ) scales roughly as the vacuum oscillation peak does, L/10 3 km = 0.33E/1 GeV. This feature is more or less seen in P (ν µ → ν e ), but P (ν µ → ν τ ) has a higher peak height 0.7 − 0.8, and the effect of atmospheric MSW enhancement is less prominent.
For P (ν µ → ν µ ) non-unitary , roughly speaking, the relation P (ν µ → ν µ ) ≈ 1−P (ν µ → ν τ ) holds in region where P (ν µ → ν e ) is small. It must be the case in the unitary case, but even in non-unitary case the relation holds approximately because unitarity violation is small in our choice of the parameters. Therefore, P (ν µ → ν µ ) is large in region where P (ν µ → ν τ ) is small, and vice versa, as seen in figure 3. It appears that the anticorrelation is inherited to the relationship between ∆P (ν µ → ν µ ) and ∆P (ν µ → ν τ ).
Due to relatively large ∆P (ν µ → ν τ ) in the first and second oscillation maxima over the entire baseline, or similarly because of relatively large depletion of ∆P (ν µ → ν µ ) there, it should be possible to detect the effect of non-unitarity if the detector has a good τ (in the former channel) and µ (in the latter channel) detection capabilities. It would be ideal that the detector has sensitivities to the both channels because then one can take anticorrelation between µ and τ yields to enhance the sensitivity to unitarity violation.
Some comments on observational aspects: In the two regions where |∆P (ν µ → ν e )| is large, and ∆P (ν µ → ν µ ) in energy region E < ∼ 10 GeV may be explored by high-statistics atmospheric neutrino observation by Super-K/Hyper-K, or DUNE [28,30,31]. The atmospheric MSW enhanced region of P (ν µ → ν e ) would be a good target for PINGU extensions of IceCube and KM3NeT-ORCA [32,33]. P (ν µ → ν τ ) and P (ν µ → ν µ ) would be explored by them, with possibility of seeing anticorrelation between µ and τ yields. Although it is very interesting to investigate these experimental prospects, a detailed examination of these questions is beyond the scope of this paper.

5.3
The probability leaking and W correction terms As we stated in section 5.1, uncovering the effect of W correction or the probability leaking term C αβ in eq. (2.2) would offer another way of distinguishing low-scale unitarity violation from high-scale one. Let us give a brief sketch of how and where we might see visible effect.

A further note on the parameter choice
To discuss W correction and the probability leaking term we have to determine the W matrix. Given the non-unitary U matrix there is a way to construct the W matrix. In general, it is given by where S is a 3 × 3 matrix which diagonalizes 1 3×3 − U U † , w is diagonal matrix which consists of eigenvalues of 1 3×3 − U U † , and R is an arbitrary 3 × N complex matrix obeying RR † = 1 3×3 . The construction makes sense for N ≥ 3. Therefore, for a given N there is a large arbitrariness on the choice of the W matrix, and hence on the sizes of the W corrections and C αβ .
Lacking a guiding principle of how to choose the R matrix in (5.1), we examine the cases with largest and smallest possible values of C αβ for given values of unitarity violation 1 − 3 j=1 |U αj | 2 (α = e, µ, τ ). It is shown that in the (3 + N ) model C αβ is bounded from above and below as [20]  The lower bound is realized in the "universal scaling" model described in appendix E, which predicts (J = 4, 5, · · ·, 3 + N ). 19 It is shown in appendix E that under the assumption of equal sterile state masses the universal scaling model predicts the same W 2 correction terms as those of the (3 + 1) model.

How large are the W corrections and C αβ ?
Let us go back to the expression of the oscillation probability to second order in W , eq. (3.46), in section 3.4 to know where we might see visible effects. If we enter into the region ρE 10 (g/cm 3 )GeV at around the first oscillation maximum, the first two terms in 2Re{· · ·} in (3.46) can become large apart from W 2 suppression, It should be compared to (3.47). After taking account of W 2 suppression of ∼ 0.01 (assuming W 0.1), | AAL (∆ J −h i ) W 2 | ∼ 3 × 10 −2 at E ∼ 100 GeV, assuming ∆m 2 Ji = 0.1 eV 2 . To know more quantitatively their sizes, we fix the parameters as discussed in section 5.3.1 with a common m 2 J = 0.1 eV 2 , 20 and plot in figure 4 the order W 2 correction terms in P (ν µ → ν α ) in eq. (3.46) plus the probability leaking term C µα , α = e (top panel), α = τ (middle panel), and α = µ (bottom panel). Under the approximation of ignoring the fourth (and higher-order) corrections, they are identical to δP (ν µ → ν α ) ≡ P (ν µ → ν α ) − P (ν µ → ν α ) (0) . In each panel the three cases are examined. N = 1 case with maximal C µα (solid line), the universal scaling model with N = 3 (dotted line), and the order W 2 correction only (dashed line). The last case corresponds to the universal scaling model with N = ∞. The blue lines are for E = 10 GeV, and the red for E = 100 GeV.
Since C αβ is a constant term in the oscillation probability, it can in principle be distinguished from the other normalization term which shares U matrix element dependences with the oscillation terms. In particular, they can dominate for large m 2 J since the W 2 correction terms are suppressed by at least ∼ 1/m 2 J . In this case, they will be the sole indicator of low-scale unitarity violation. In general (though not in the N = 1 model), the order W 2 terms depend upon details of the sterile sector, e.g., matrix structure of W . Therefore, once the effect is seen it would give us useful information on the structure of low-scale leptonic unitarity violation.

Some remaining theoretical issues
In this section, we will give some remarks on the theoretical basis in our framework, basic one as well as on its perturbative aspects. They include our treatment of decoherence, generic structure of higher-order corrections and its relation to the "Uniqueness theorem" (see section 2.3), and absence of enhancement due to small solar mass splitting denominator.

6.1
Decoherence imposed onto coherent evolution system We have started with the Schrödinger equation (3.3) with Hamiltonian (3.5) assuming that all the neutrino states remain coherent. We have shown in this and the previous papers that the coherence between active and sterile, and sterile and sterile states are not maintained for sterile mass differences larger than 0.1 eV 2 . The effect of decoherence is taken into account by making average over the fast oscillations. We feel it desirable for the current treatment be replaced by the real quantum mechanical one using wave packets, in which the effect of decoherence would automatically come in. But, we do believe that our present framework is able to describe effectively the right physics derived from such improved treatment.

Smallness of expansion parameters and higher order corrections
Here, we discuss general structure of the perturbation series without recourse to averaging out the fast oscillations. The effective expansion parameters in our perturbative framework are the following four, We already saw them, except for the last one, in the discussion in section 3.5, and it can be seen by inspecting the expressions of the oscillation probabilities up to the fourth orders given in section 3.4 and appendix C. Formally, the expansion parameter is the first one in (6.1) in view of (3.35) with Ω [1], the kernel, in (3.39). But, the spacial integration in (3.35) produces different effective expansion parameters, the second and the third ones in (6.1). The extra factor of W 's without the kinematical factors is provided when transforming from theŜ to S matrices, as seen in section 3.3.4. For simplicity of the discussion in this section, we limit ourselves to the case of |W | ∼ 0.1. Under the same conditions we have imposed in section 3.5 the first one in (6.1) is 7.6 × 10 −4 for ∆m 2 Ji = 0.1 eV 2 and ρE = 10 (g/cm 3 )GeV, while the second and the third, which are comparable at around the first oscillation maximum, are estimated as 2.3 × 10 −2 for the same condition. Therefore, the smallness of the expansion parameter is ensured unless ρE 10 (g/cm 3 )GeV. In fact, a close examination of the order W 4 terms in the oscillation probability (see appendix C) shows that all the formally W 4 terms are actually further suppressed. The largest term in the fourth-order oscillation probabilities is of the one suppressed by a factor AW ∆ J −h i (ALW ) W 2 < ∼ 1.7 × 10 −7 , which is as small as ∼ 10 −4 even in the case |W | = 0.5. Therefore, we expect that the formula for the oscillation probability in (2.4) works under much relaxed conditions than the one in (3.48).

On Uniqueness theorem and matter-dependent dynamical phase
We have shown in sections 3.4 and 3.6 that there is no surviving matter dependent correction term in the oscillation probability up to order W 4 after averaging out fast oscillations and using the suppression by energy denominators. Should we expect that this feature is stable against higher order corrections beyond order W 4 , as postulated by the Uniqueness theorem, in our perturbative framework? We argue that the answer is Yes. We first note that higher-order corrections in terms of W are computed by using Ω[1] as the kernel, as indicated in eq. (3.35). Notice also that all the elements of Ω [1], except for Ω[1] JJ , carry the energy denominator, as shown in (3.39). Then, higher order correction terms are always accompanied by the energy denominators which are composed of some of the first three in (6.1), and therefore they are suppressed. The unique exception for it is the terms generated only by Ω[1] JJ which lacks the energy denominator. Therefore, apart from this special case, we have shown that higher-order corrections in W does not produce the surviving terms after averaging over fast oscillation and using the energy denominator suppression. It is consistent with what we saw in our explicit computation to order W 4 . This concludes our justification of the Uniqueness theorem.
We need to clear up the issue of special type of perturbative correction terms which involve only Ω[1] JJ as the kernel in (3.35). It produces the unique form ofŜ JJ aŝ a collection of terms of matter-dependent higher order renormalization to J W αJ W * βJ , the probability leaking term at the amplitude level. However, it exponentiates and has contribution to the S matrix element as 22 The unique form of S matrix, in principle, raises an interesting issue of dynamically generated phase produced jointly by unitarity violation and matter effect. 23 In our setting, however, it either disappears from the amplitude squared, or has vanishing effect when the high frequency oscillation is averaged out.

Absence of enhancement due to small solar mass splitting denominator
In perturbation theory one has to sum up intermediate states including off mass shell states. Therefore, even though we sit in the kinematic region where atmospheric-scale oscillations are large the energy denominator can become small, to the order of solar ∆m 2 mass splitting. Then, one might question whether the correction terms blow up at the small denominator, which would invalidate our perturbative treatment. Fortunately, one can show that the "singularity" which could be produced in the limit of small solar mass splitting always cancels against the small numerator of the similar size. This problem exists already in the second-order expression of the oscillation probability (3.44). See the second term in second order (in W ) term. If we denote h l − h k ≡ the term would have 1/ singularity in the limit of → 0. However, one can see by inspection by eye that the expression inside the square parenthesis is antisymmetric under l ↔ k, and hence it is of order or higher. Therefore, the singularity cancels. Notice that the antisymmetry under l ↔ k is not required for the whole expression including the matrix element factor.
The situation is a little bit more complicated in the fourth-order expression of the oscillation probability given in appendix C. In addition to 1/ singularity similar to the one we already saw, there exist apparent singularity of 1/ 2 type. See, for example, the second term in (C.5) and the last term in (C.7). But, an explicit calculation shows that the 1/ 2 singularity always cancels against order 2 numerator in the limit of small solar splitting.
This phenomenon is reminiscent of the finiteness of the oscillation probability at the small solar mass splitting limit in helio-perturbation theory with the unique expansion parameter (or a renormalized one), see e.g., [50] and the references therein. Possible interpretation of applicability of the perturbative framework to the region of solar level 22 It might be easier to obtain the phase factor if we use a different decomposition ofH from (3.19) by absorbing W † AW intoH0. 23 The phase itself needs not be small. Taking the matter potential of CC reaction and the earth diameter, . Therefore, ALW 2 can be order unity for |W | 0. 4. crossing has been discussed [56,57]. Another example for the similar phenomena is the one at the small atmospheric mass splitting limit with additional expansion parameter sin θ 13 . In this case it is observed that near the atmospheric resonance region not only the oscillation probability is finite but also its accuracy improves when the higher order terms to fourth order in sin θ 13 is added [58].
Then, one might ask if our small unitarity violation perturbation theory gives quantitatively accurate result at around the denominator with small solar mass splitting. However, we note that this problem is not relevant in our case because all these terms with apparent singularities vanish after averaging over the high-frequency oscillations and using the suppression by the sterile mass denominators. Yet, we must remark that if we investigate possible enhancement of the correction terms outside the condition (3.48), as done in section 5.3, the quantitative accuracy of the expression may become an issue.

Concluding remarks
In this paper, we have presented a comprehensive treatment of the three active plus N sterile neutrino model in the context of leptonic unitarity test. We have formulated an appropriate perturbative framework with expansion in small unitarity violating W matrix elements, while keeping (non-W suppressed) matter effect to all orders.
What we have done in this paper is mainly threefold: • We have shown that the same condition on sterile state masses 0.1 eV 2 < ∼ m 2 J < ∼ 1 MeV 2 , as we imposed in vacuum, is sufficient to make the (3 + N ) model sterile-sector model independent, apart from N dependence in the lower bound on probability leaking constant C αβ . We have shown in an explicit computation to fourth order in W that these higher order correction terms either vanish or are negligible after averaging over fast oscillations and using energy denominator suppression, leaving only the leaking term C αβ . We have argued by postulating the "Uniqueness theorem" that this feature prevails to all orders in W perturbation theory.
• As an outcome of our framework, we have derived very simple formulas for oscillation probabilities between active neutrinos in the (3+N ) model. We have analyzed them in ν µ → ν α channel, α = e, µ, τ , to know in which region of energy and baseline the effects of unitarity violation are large. We have observed relatively large effects in ν µ → ν µ and ν µ → ν τ channels and pointed out, though qualitatively, that anticorrelation of signals between them would enhance the sensitivity to unitarity violating effects.
• We have discussed the question of how to distinguish low-scale unitarity violation from high-scale one. We have pointed out that the second order W correction to the leading order, if detected, would signal low-scale unitarity violation. This is to add to the way we have discussed in [20], i.e., to detect the probability leaking term C αβ which testifies for the existence of "sterile", or undetectable but communicable, sector at low energy scales.
Notice that in "constraining mode" of unitarity violation, the model-independence of the framework translates into a universal nature of the bounds, thereby making them more powerful. Whereas in "discovery mode" of unitarity violation, the model dependence, in particular through the W dependent correction terms, is welcome because it serves for identifying the structure of the sterile sector.
During the course of this work, we have obtained the new results and had some interesting observations including: • We have obtained an exact solution of the Hamiltonian (4.6) for uniform matter density. It describes neutrino evolution in low-scale unitarity violation in zerothorder in W , which applies also to the case of high-scale unitarity violation. The key parts of the solution can also be utilized to calculate the higher order corrections in W . When applied to each shell inside the earth, it could provide a semi-quantitative way of simulating non-unitary neutrino evolution for the terrestrial experiments.
• We have observed outside of the region of validity of our Uniqueness theorem there is a limited phase space, ρE 10 (g/cm 3 )GeV, in which the second order W correction terms can be sizeable. If detected, it may allow us to probe structure of W matrix elements which bridges between active and sterile sectors.
• We have examined a toy model of equally distributed W matrix elements within each flavor, in which the probability leaking term scales as 1/N . Likewise, possible detection of C αβ may provide information of the hidden sterile sector.
We emphasize that, unlike in the case of high-scale unitarity violation, neutrino experiment is the most powerful way to execute leptonic unitarity test in the scenarios of low-scale unitarity violation. Nonetheless, we have to admit that our observations on what we could do for experimental detection of possible non-unitarity effects are too qualitative to make any definitive claim for possible detection in the future. Clearly, the better analyses are called for.
While we worked exclusively on the (3 + N ) state unitary model as a model of lowscale unitarity violation, we do not know if it is the unique choice, or it merely reflects our ignorance. Even in the case there exist more generic class of models for low-scale unitarity violation, the phenomenon of probability leaking is likely to survive. It is because the probability leaking must take place whenever the extra light sector exists and communicates with the three active neutrinos.
first research-oriented university in Ecuador [59], during which he was warmly supported by Ecuadorian people. He thanks kind supports by ICTP-SAIFR (FAPESP grant 2016/01343-7), UNICAMP (FAPESP grant 2014/19164-6), and PUC-Rio (CNPq) which enabled him to visit these institutions where part of this work was done. C.S.F. is supported by FAPESP under grants 2013/01792-8 and 2012/10995-7. H.N. is supported by CNPq. This work was supported in part by the Fermilab Neutrino Physics Center.

AŜ matrix elements
The method of computingŜ matrix elements is outlined in section 3.3.3. In this appendix A we carry out this task. With the expression of H 1 in (3.37) we compute perturbatively the matrix elements of Ω(x). Then,Ŝ(x) = e −iĤ 0 x Ω(x).
We denote computated results of the matrix elements ofŜ asŜ[n] to indicate that it is the one that comes from n-th order contribution in H 1 . Since the elements of H 1 are of order either W or W 2 ,Ŝ[n] generally has order W n or higher. To show that a particular contribution is of order W m we use the superscript "(m) . That is,Ŝ (m) [n] denotes contribution toŜ that arizes from n-th order perturbative contribution in H 1 and is of order W m .

A.1 Contribution toŜ matrix elements from zeroth and first order in H 1
The zeroth and first orderŜ matrix elements can be calculated as follows:

A.2 Contribution toŜ matrix elements from second order in H 1
Likewise,Ŝ matrix elements can be calculated in second order inĤ 1 by using the formula for Ω in (3.35) andŜ-Ω relation in (3.36) aŝ We now discuss generalized T invariance of second orderŜ matrix elements. Let us first examineŜ ij | i =j [2]. We first note that the matrix element transforms under generalized T transformation i ↔ j, U → U * , W → W * , etc. as Therefore, the matrix element part is consistent with generalized T invariance. However, the rest ofŜ ij [2]| i =j does not appear to be manifestly generalized T invariant. But, one notices thatŜ ij | i =j [2] can be written aŝ (A.4) Similarly, the first and the fourth terms inŜ IJ | I =J [2] are not manifestly generalized T invariant. But, they can also be written as manifestly generalized T invariant forms aŝ (A.5) Under generalized T transformation, the second term goes to the third term (both ix terms), and vice versa, and hence they are invariant. The situation is completely the same as in the fourth and the fifth terms. The first and the last terms are invariant in an analogous way asŜ ij | i =j [2]. ForŜ iJ [2] andŜ Ji [2] generalized T invariance holds as they are. ForŜ ii [2] andŜ II [2] the matrix factor is self-invariant, and therefore generalized T invariance holds.
What we should do in the rest of appendix is to computeŜ matrix elements perturbatively to fourth order in H 1 . In presenting the results of computation ofŜ matrix elements, however, we change strategy of our description. That is, • We presentŜ matrix elements at fixed order in W . To do this we, of course, have to sum up contributions that come from different order in H 1 . For example,Ŝ iJ [2].
• We present only the terms which are required to compute S matrix elements to order W 4 . In view of the relations betweenŜ and S matrix elements given in eq. (3.41), S IJ ,Ŝ IJ , andŜ (4) iJ (andŜ (4) Ji ) are all unnecessary.
• We only give the results of manifestly generalized T invariant form ofŜ matrix elements with which it must be straightforward to prove generalized T invariance.
A.3 Order W 2Ŝ matrix elements from Second-order in H 1 The second order in H 1 contributionŜ IJ [2] (andŜ II [2]) has both order W 2 and W 4 terms.
To compute S matrix elements to order W 4 we need only the former term because the latter produces order W 6 terms, as one can see in (3.41). Then, it is necessary to remind the readers that the expressions of order W 2 terms in S IJ [2] andŜ II [2] already exist as the first terms in (A.5) and (A.2), respectively.
By adding the contribution of the third and second order in H 1 we obtain forŜ matrix elementsŜ and similarly, forŜ (A.7) Using (A.6) and (A.7), one can easily confirm generalized T invarianceŜ Ji (U * , W * , etc).

B Structure of S matrix elements
Here we present details in computation of the S matrix elements.

B.1 The S matrix elements S (4) αβ
We decompose S (4) αβ into the following three pieces (include both α = β and α = β) where (n = 3, 4) "diag" and "offdiag", respectively, implies The latter two terms in (B.1) are given, respectively, by We do not display explicitly the expression of each term in (B.1). But, the notation of S Lk , whose notations are also transported to the oscillation probability.

C Expression of the oscillation probability in fourth order in W
The oscillation probability to second order in W is given in eq. (3.44) in section 3.4. What is left is, therefore, the expressions of the oscillation probability in fourth order in W , the explicit form of the two terms in (3.50), P (ν β → ν α ) = S C.1 Second order S matrix squared term: S The S matrix element S (2) αβ in eq. (3.43) contains four terms. To prevent too long expressions, we divide S (2) αβ 2 into the two terms, one sum of each term squared and the other one composed of cross terms. The first one is given by Except for the first term in (C.1) we did not try to unify the two exponential factors because the expressions become cumbersome. Apart from the last line in (C.1) all the terms are suppressed by the two energy denominators with ∆m 2 Jk which doubly suppress the active-sterile state transition. The first term in the last line is the probability leaking term mentioned in section 2.2.
The second term of S (2) αβ 2 (interference terms) is given by Ll .

E Universal scaling model of N sterile sector
Suppose that we obtain a particular parametrization of U matrix by taking N = 1 sterile sector, as we did in section 5.2. In this (3+1) model, the W matrix elements are completely determined, up to phase, by unitarity Now, we attempt to create a toy model of N sterile sector by "universal scaling". We postulate that all the W matrix elements are real and equal: which is consistent with (3 + N ) space unitarity. In this universal scaling model, the order W 2 correction terms in (3.46) remains unchanged provided that we further assume that all the sterile masses are equal. 24 It is because the W matrix elements enter into the W 2 terms in the form where n = 1 or 2. However, the leaking constant C αβ becomes smaller by a factor of N in the universal scaling model. In the (3 + 1) model, C αβ takes the largest value, the upper limit in eq. (5.2). Because C αβ is fourth order in W it is evident that in the universal scaling model, which is the lower limit of (5.2).