A framework for testing leptonic unitarity by neutrino oscillation experiments

If leptonic unitarity is violated by new physics at an energy scale much lower than the electroweak scale, which we call low-scale unitarity violation, it has different characteristic features from those expected in unitarity violation at high-energy scales. They include maintaining flavor universality and absence of zero-distance flavor transition. We present a framework for testing such unitarity violation at low energies by neutrino oscillation experiments. Starting from the unitary 3 active plus $N$ (arbitrary integer) sterile neutrino model we show that by restricting the active-sterile and sterile-sterile neutrino mass squared differences to $\gtrsim$ 0.1 eV$^2$ the oscillation probability in the $(3+N)$ model becomes insensitive to details of the sterile sector, providing a nearly model-independent framework for testing low-scale unitarity violation. Yet, the presence of the sterile sector leaves trace as a constant probability leaking term, which distinguishes low-scale unitarity violation from the high-scale one. The non-unitary mixing matrix in the active neutrino subspace is common for the both cases. We analyze how severely the unitarity violation can be constrained in $\nu_{e}$-row by taking a JUNO-like setting to simulate medium baseline reactor experiments. Possible modification of the features of the $(3+N)$ model due to matter effect is discussed to first order in the matter potential.


Introduction
Determination of leptonic mixing parameters, the three mixing angles and the two mass squared differences of neutrinos, marks a new epoch of physics beyond the standard model (SM). Despite that we still do not know the value of leptonic Kobayashi-Maskawa phase and the neutrino mass pattern, successes of hypothesis of using the standard three-flavor mixing in describing a wealth of experimental data prompts us to think about one further step, namely, the paradigm test. We feel that we have reached the right point that raising the question of how to test the three-flavor mixing framework itself is timely. The most common way of testing the framework is to verify unitarity of the mixing matrix. It appears to us that the two different strategies of testing leptonic unitarity are thinkable: • One is to show closing of the lepton unitarity triangle in an analogous way of unitarity test for the quark CKM matrix [1].
• The other is to prepare a model of unitarity violation, confront it against the available experimental data, and derive constraints on unitarity violation parameters.
The first method serves as a unitarity test purely within the framework of standard threeflavor mixing scheme of neutrinos, without recourse to any particular model of unitarity violation. This is the advantage of this method. See e.g., ref. [2] for this approach. In the second way, one introduces a general framework for leptonic unitarity violation, or a class of models that embody the property is constructed. To our opinion, there are pros and cons in the above two different strategies. In the first method, despite its charming model-independent nature, it is quite challenging to determine size of each side of the unitarity triangles experimentally [2]. The second method introduces model-dependent features into the unitarity test, which can be considered as a drawback of this approach. On the other hand, there is a definite underlying scenario behind the non-unitarity in the latter case, or at least general guidance to it. Therefore, once hinted, it may allow us to identify the cause of unitarity violation. We feel, therefore, that both methods of leptonic unitarity test must be pursued.
In studies of testing leptonic unitarity so far done along the second strategy above, it appears that people took two different attitudes. That is, one integrates out new physics effects at high energy scales to obtain effective theories of three generation leptons at low energies which represents non-unitarity in this limited subspace [3]. The other one takes more relaxed attitude in explicitly introducing SM singlet leptons, and examines the models in a relatively model independent fashion in SM subspace with non-unitarity [4,5]. In the latter approach, the masses of SM gauge-group singlet leptons can be large or small, reflecting varying underlying scenarios of new physics. If we take the former way, SU (2) × U (1) gauge invariance dictates that the same unitarity violation must also be manifest in the charged lepton sector. In the framework of ref. [3], generally speaking, the constraints on unitarity violation are dominated by the ones coming from the charged lepton sector. If we follow the latter way, it is more case-sensitive and neutrino experiments can play greater roles. One of the most interesting questions in the whole area of study of unitarity violation is to reveal qualitative differences between unitarity violation at highand low-energy scales.
It is the purpose of this paper to discuss low energy-scale unitarity violation, hereafter "low-scale unitarity violation" for short, in detail. By low-scale unitarity violation we mean that a "hidden" sector in state space to which probability flow occurs is located at low energy scales, like eV or MeV. It allows the hidden sector particles be produced along with neutrinos, and also they participate in neutrino oscillations assuming their mixing with neutrinos. We first recapitulate the interesting characteristic features of low-scale unitarity violation different from high-scale unitarity violation. They include (1) retaining flavor universality, and (2) lack of zero-distance flavor transitions. See section 2 for more about these points. Some specific scenarios of high-and low-scale unitarity violation were explored in refs. [6][7][8][9].
Then, we go on to construct a framework for experimental testing of low-scale unitarity violation. Since there is such interesting qualitative differences above between high-and low-scale unitarity violation, they must be tested and be distinguished from each other. We argue, in agreement with the preceding works [4,5], that the constraint by Z width measurement in LEP [10] makes extension of low-mass lepton sector to be essentially unique, only allowing inclusion of SM singlet fermions. Thus, our model of non-unitarity at low-energies utilizes three active neutrinos and an arbitrary numbers of sterile neutrino states.
We discuss in detail how the model prediction can be made insensitive to the details of the sterile sector, e.g., the mass spectrum of sterile neutrinos and mixing between active and sterile neutrinos. We find that the resultant expressions of oscillation probabilities in vacuum contain a new term, an explicit probability leakage term, which distinguishes between low-and high-scale unitarity violation. To our knowledge, the term has not been incorporated in the previous analyses of unitarity violation at low energies. We examine how this framework works by analyzing future medium-baseline reactor neutrino experiments. In the final two sections we discuss how CP violating terms in accelerator appearance measurement can be used to signal non-unitarity and how the matter effect affects the foregoing discussions above.
There is an obvious relation between the model we discuss in this paper and various versions of active plus sterile neutrino models proposed to provide description of the LSND-MiniBooNE anomaly (see a review [11] and references therein). We will make remarks on the relationship between them below at wherever appropriate. In particular, we should note that in the frameworks of 3 active plus a few sterile neutrinos the various bounds on the mixing parameters are derived by using the existing data. For the most recent comprehensive analysis, see ref. [12]. 1

Unitarity violation at high-and low-energy scales
The cause of unitarity violation in the lepton sector can be due to new physics beyond the SM at high-energy scales, or the ones at low energies. In the best studied high-scale seesaw scenario of neutrino mass [13][14][15][16], the three-flavor mixing of neutrinos has a tiny violation of unitarity due to the mixing of heavy right-handed neutrinos. A more generic formulation of high-scale unitarity violation was given by Antusch et al. [3] by taking the minimal unitarity violation scheme. One of the salient features in high-scale unitarity violation is that even though SM singlet leptons exist which mix with neutrinos, it is likely that such neutral leptons are much heavier than neutrinos. They are not produced copiously in the same processes as neutrinos are produced, and a physical transition from neutrinos to them are kinematically forbidden.
On the other hand, if we assume that unitarity violation occurs due to physics at an energy scale much lower than the electroweak scale, the light SM gauge group singlet leptons not only mix with neutrinos, but also their masses are so light that they participate in the process of neutrino oscillations. In this paper, we try to develop a framework for experimental test of unitarity violation by assuming such situation, to which we simply refer as "low-scale unitarity violation". Hereafter, we call the SM gauge group singlet fermions generically as "sterile neutrinos" for simplicity.
We notice that there are some characteristic features in high-and low-scale unitarity violation that one can recognize even without going into any details. They are: • Yes or no of violation of lepton universality: It is expected on general ground that due to non-unitarity of the lepton mixing matrix the lepton universality is violated. See refs. [3,4], and the references cited therein. 2 While it is a generic feature in highscale unitarity violation, lepton universality can be maintained in low-scale unitarity violation. It is because sterile neutrinos can be produced as well, for example in µ → e + steriles process. Assuming no detection sensitivity to sterile neutrinos, it masks the effect of non-unitary mixing matrix in the active neutrino sector.
• Yes or no of zero distance neutrino flavor transition: Similarly to the above point, in high-scale unitarity violation, kinematically forbidden active to sterile states transition entails zero-distance attenuation of probability of a given flavor neutrino [19]. It does not occur if sterile neutrinos can take part in the flavor oscillation processes, as we will show in section 4.1.
1 Though they are very relevant for this paper we do not implement these bounds into discussions in this paper. It is because they are not derived by using the generic (3 + N ) model, and the translation of their bound to our setting requires great care. Furthermore, the principal purpose of this paper is to provide suitable framework for leptonic unitarity test in high-precision experiments in the future. 2 General bounds on non-unitarity are discussed in the context of high-scale unitarity violation, e.g., in refs. [3,4,17,18].
• Of course, there are common features in high-and low-scale unitarity violation: Emission of sterile neutrinos, if kinematically allowed range of low to high masses, affects the observable spectrum of charged leptons. It can be utilized to place constraints on non-unitarity by using, e.g., electron spectra in beta and muon decays, or muon spectrum in pion decay, cosmological observations etc. See [20] for a comprehensive summary of the current status of the bounds for the 3+1 scenario. 3 In the rest of this paper, we construct a model of low-scale unitarity violation which can be used to test leptonic unitarity in neutrino experiments. Although the constraints from beta and muon decays etc. just mentioned above are relevant, we do not try to elaborate the discussions already given in [20] and the references cited therein.

A model of unitarity violation at low energies
Now, we introduce our model of unitarity violation at low energies. But, one recognizes immediately that there is no big room for this. Precision measurement of Z decay width at LEP [10] dictates that there is only three active neutrinos. Therefore, extra fermions we introduce which mix with neutrinos must be SM singlets, which we call "sterile neutrinos" in this paper. Then, we are left with the unique possibility, the system of three active neutrinos plus arbitrary number of sterile neutrinos which mix with each other. We denote the number of sterile neutrino states as N . We assume that our system of the three active neutrinos and N sterile neutrinos are complete, which we call the (3 + N ) space unitary model hereafter.
Though we deal with the particular model we want it as model-independent as possible within the framework of the (3 + N ) space unitary model. Therefore, we shall always keep number of sterile neutrinos N arbitrary in this paper. Toward constructing a framework for leptonic unitarity test, however, we must make additional requirement on our (3 + N ) space unitary model. We want to avoid the situation that experimental predictions of the model depend very sensitively on details of the N sterile neutrino sector, for example, on the mass spectrum of sterile states. In the rest of this section, we discuss how it can be achieved, and what are the conditions for this.

3 active +N sterile unitary system
To define the notation and for definiteness, we introduce the (3 + N ) space unitary system in vacuum. The Hamiltonian which governs the evolution of 3 active and N sterile state vector in flavor basis, ν = [ν e , ν µ , ν τ , ν s 1 , ν s 2 , · · ·, ν s N ] T , as i d dx ν = Hν is given by 4 Here, m i (m J ) denote the mass of mostly active (sterile) neutrinos and E is the neutrino energy.
For notations of the mass squared differences we use, in generic case including both active and sterile neutrinos, where a, b = 1, ..., 3 +N . When we want to distinguish between active-active, active-sterile, and sterile-sterile neutrino mass differences, we use where i, j, k, ... (small letters) = 1,2,3 are for active neutrinos, and I, J, K, ... (capital letters) = 4,.., 3+N for sterile neutrinos. The mixing matrix U relates the flavor eigenstate ν to the vacuum mass eigenstateν as ν ζ = U ζaνa (here, ζ = e, µ, τ, s 1 , ..., s N ), and hence it is a (3 + N ) × (3 + N ) matrix. By construction of the model, the matrix U is unitary. While the flavor index ζ above includes also sterile states, from now on, we will eventually single out only the active flavor indices α, β = e, µ, τ whenever they are explicitly specified in the formulas for the S matrix as well as for probabilities.

Averaging out the sterile oscillations due to decoherence
How to make our model insensitive to the mass spectrum in the sterile sector? The masses of the sterile neutrinos appear in the factor of e −iE J x = e −i degeneracy among the sterile states i.e. |∆m 2 JI | |∆m 2 31 |, the oscillation terms involving sterile masses can be averaged out due to (partial) decoherence if certain conditions are fulfilled. 5 Intuitively, decoherence occurs when the variation in the phase due to spatial and/or energy resolution is greater than 2π, From the terms in eq. (3.5) which depend, respectively, on the variation of baseline distance (δx) and that of energy (δE), we can classify the following two types of decoherence: i. Spatial resolution. In this case, decoherence happens if ii. Energy resolution. In this case, decoherence happens if Notice that the conditions derived heuristically above are in agreement with those obtained from formal approaches (i.e. wavepacket description) as e.g., in refs. [23][24][25]. Since we are interested in the decoherence involving sterile sector, the conditions (3.6) and (3.7) have to be fulfilled for ∆m 2 Ja which involve at least one sterile mass. This allows us to obtain lower bound on the scale of sterile sector |∆m 2 Ja | where our model becomes insensitive to the sterile mass spectrum. Notice that δx and δE in eqs. (3.6) and (3.7) are associated with the experimental setup (concerning both production and detection), e.g., with the production region of neutrinos and with energy resolution of a detector.
In the following, we discuss the conditions that must be satisfied for sterile oscillations to be averaged out. Most of the neutrino oscillation experiments work with the kinematical setting, either in which the former is for long-baseline (LBL) reactor neutrino experiments, KamLAND, JUNO, and RENO-50, and the latter for the accelerator LBL and the reactor θ 13 experiments. From eq. (3.6), the condition for averaging out due to the size of production region, i.e. δx = x prod reads x prod 4πE |∆m 2 Ja | , (3.9) 5 We are interested in partial decoherence where the oscillations involving sterile states are averaged out, whereas the active ones do oscillate.
where x prod denotes the size of the production region of neutrinos, e.g., core diameter for nuclear reactor neutrinos and the length of decay pipe for accelerator neutrino beams. Assuming the setting as in (3.8), the condition (3.9) leads to where we have taken, for the typical source sizes and baseline distances, x prod = 10 m and x = 50 km for reactor neutrinos, and x prod = 1 km and x = 1000 km for accelerator neutrinos. Due to energy resolution of a detector, using ∆m 2 21 = 7.5 × 10 −5 eV 2 , and |∆m 2 31 | = 2.4 × 10 −3 eV 2 , eqs. (3.7) and (3.8) lead to the condition on ∆m 2 Ja for the sterile oscillation to be averaged out: The aggressive choice of a typical 3% error in energy measurement in the former case is based on the JUNO proposal in [26], whereas a conservative choice of δE/E = 10% is made for accelerator neutrino experiments. Therefore, if we restrict ourselves to ∆m 2 Ji ∼ |∆m 2 JK | > ∼ 0.1 eV 2 , the fast oscillation due to the active-sterile and sterile-sterile mass squared differences can be averaged out by the effect of energy resolution. For the JUNOlike setting the requirement on |∆m 2 Ja | can be relaxed by an order of magnitude. We note that effect of averaging over the production points of neutrinos is less sizeable compared to that of energy resolution for the fast sterile oscillation to be averaged out, which therefore leads to more restrictive condition on the sterile state masses.

Requirement on the sterile mass spectrum
In addition to the condition of averaging out the fast sterile oscillations, we require that the masses of sterile neutrinos are light enough such that they can be produced in the same environment as neutrinos are produced. We do this because it is the most significant characteristic feature of unitarity violation at low energies. It gives raise to the condition m J < ∼ 1 MeV for reactor neutrinos, and m J < ∼ 100 MeV for accelerator neutrinos.
To summarize: In seeking the case that neutrino oscillation in our 3 active + N sterile neutrino system is insensitive to the detailed properties of the sterile sector, such as the mass spectrum of the sterile states and the fine structure of the active-sterile mixing, we require for sterile neutrino masses in our (3 + N ) space unitary model that The lower limit is from condition of averaging out the fast oscillations for accelerator neutrinos, and the upper one from producibility of sterile neutrinos in reactors.
With the conditions (3.6) and/or (3.7) of averaging out the fast oscillations being satisfied we can make approximation 6 where ... stands for averaging over neutrino energy within the uncertainty of energy resolution, as well as averaging over uncertainty of distance between production and detection points of neutrinos. The latter approximate equalities in (3.13) and (3.14) assume that there is no accidental degeneracy among the sterile state masses i.e. |∆m 2 JK | |∆m 2 31 |.

Cases in which sterile oscillations are not averaged out
While allowing wide range of sterile lepton masses, the condition (3.12) excludes the certain characteristic regions of sterile neutrino mass spectrum. Exclusion of higher masses is done under the spirit of low-scale unitarity violation, and therefore we consider it granted. But, there is no a priori reason for excluding sterile neutrino masses in regions ∆m 2 Ji ∼ |∆m 2 JK | ∼ the atmospheric ∆m 2 , or the solar ∆m 2 . In this case, however, one must expect severe model dependence in the experimental predictions by the (3 + N ) unitary model. Clearly, the number of CP violating phases depends on N , and the additional phases will play important roles in fitting data. Therefore, an extensive separate treatment is necessary to include this case.
How about sterile neutrino masses which are much lighter than the atmospheric, or the solar ∆m 2 ? Again, there is no a priori model-independent reason for excluding this case. If the active neutrino masses are such that KATRIN can detect the signal, m i > ∼ 0.2 eV, then, averaging out condition may be barely maintained for the fast active-sterile oscillation for very small sterile masses, ∆m 2 0.04 eV 2 . However, it would be accompanied by extremely slow developing (as a function of propagation distance x) sterile-sterile oscillations. Clearly, a separate analysis is needed to know to what extent the case survives a test with the currently available experimental data.
As a summary of our discussions in this section, we state as follows: If we construct (3 + N ) space unitary system as a model of low-scale unitarity violation, we can make the model predictions insensitive to details of the sterile neutrino sector, such as the mass spectrum. It requires us to restrict ourselves to the region of sterile neutrino masses 0.1 eV 2 < ∼ m 2 J < ∼ 1 MeV 2 . We assume this in our all subsequent discussions in this paper.

The oscillation probabilities in 3 active and N sterile model in vacuum
Given the Hamiltonian in (3.1), it is straightforward to compute the neutrino oscillation probabilities P (ν β → ν α ) in vacuum, where the Greek indices α, β, ... = e, µ, τ . Let us start by showing that there is no zero distance transition in our (3 + N ) space unitary model.

No zero distance transition in (3 + N ) × (3 + N ) unitary system
The oscillation probabilities take the form .
(4.1) At x = 0, P (ν β → ν α ) = δ αβ thanks to unitarity of the U matrix. It means, of course, no zero distance transition in the (3 + N ) space unitary model. This is in sharp contrast to the feature possessed by the high-scale unitarity violation [3,4].

The oscillation probabilities in the (3 + N ) model
Here, we derive the expressions of the oscillation probabilities in our (3 + N ) model when the active-sterile and sterile-sterile oscillations are averaged out. For this purpose we define a new notation of the (3 + N ) × (3 + N ) unitary matrix U. It can be parameterized as [4] satisfying UU † = U † U = 1 (3+N )×(3+N ) . The active space mixing matrix U is 3 × 3 matrix with elements U αi , the rectangular matrices W and Z are respectively 3 × N and N × 3 matrices with elements W αI and Z Iα , and the square matrix V is N × N matrix with elements V IJ . To develop general framework we do not make any assumptions on the size of W and Z matrix elements (besides |W |, |Z| < 1) in this paper. The oscillation probability is written in terms of S matrix as P (ν β → ν α ) = |S αβ | 2 , such that its integer index indicated by the capital letters like I, J and K (which run from 4 to 3 + N ) always refers to the sterile neutrino mass eigenstate. After squaring the S matrix, P (ν β → ν α ) has three terms: the first and second terms squared and the interference term, each of which can be easily computed. They are given, in order, as We notice that the last two lines vanish after averaging over energy resolution, as discussed in section 3. Then, we obtain the expressions of oscillation probabilities in our (3 + N ) model in vacuum. In the appearance channel, α = β, it reads and in the disappearance channel One should notice that after averaging over high-frequency sterile oscillations, the expressions in (4.6) and (4.7) have terms which look like the "zero-distance flavor transition". But, it cannot be the correct interpretation because the averaging procedure (even though it is on energy spectrum) inherently contains certain distance scale to observe destructive interference which leads to cancellation of oscillatory behavior.
The expression of the oscillation probabilities in (4.6) and (4.7) look similar to the ones in the standard three-flavor mixing. But, there are two important differences: • The active space mixing matrix U is not unitary, • There is a probability leaking term to the sterile neutrino sector, C αβ in (4.6) and C αα in (4.7).
The former is a common feature of the theories in which unitarity is violated in active neutrino subspace. In the unitary case the second term in (4.6) is δ αβ . On the other hand, the second point above, the existence of probability leaking term, is the characteristic feature of the low-scale unitarity violation. However, the term is omitted in the expression of the oscillation probability in the literature, e.g., in refs. [5,27], and was considered only for some specific models of sterile neutrinos, e.g., in [9,28]. Does the leaking term introduce a heavy model-dependence into the prediction by our (3 + N ) model? The answer is no: though it indeed displays some sterile sector model dependence, it is only a mild one. That is, the term can be treated as the channel dependent constant C αβ when this formula is used to analyze leptonic unitarity violation in vacuum.
We emphasize that the clearest evidence for low-scale unitarity violation is the demonstration of existence of probability leaking constant C αβ ≡ 3+N J=4 |W αJ | 2 |W βJ | 2 . Unfortunately, it would not be easy to carry out for the two reasons: (1) the term is small in size because it is the fourth order in unitarity-violating elements W αJ , and (2) it is just a constant term and hence it could be confused by the uncertainty in the flux normalization of neutrino beams.
Apart from the probability leaking term C αβ (α = β, α = β), our formulas agree with those of ref. [4]. On the other hand, the oscillation probability formulas in ref. [3] have extra normalization factor. Therefore, it looks like they do not agree with each other although they are both dealing with high-scale unitarity violation. But, since the normalization factor cancels against those included in the neutrino cross sections they are consistent, if the probability formulas in [4] are understood as the ones after the cancellation.

(3 + N ) state space unitarity and constraint on probability leaking term
In our three active plus N sterile neutrino model, unitarity is obeyed in the whole (3 + N ) state space, UU † = U † U = 1. It takes the form in the active 3 × 3 subspace The first relation in (4.9) implies that size of the probability leaking terms, C αβ or C αα , and the size of unitarity violation in active space U matrix are related to each other. In fact, it is easy to derive the upper and lower bounds on Since the last terms are non-negative we obtain the upper bounds 7 The lower bound is slightly nontrivial, but they are derived in appendix A: (4.12) The lower bounds depend on N , and therefore they are sterile-sector model dependent. But, since the upper bounds are more restrictive, as we will see in the analysis in section 5, we assume the least restrictive case, N = ∞ there. Using (4.11) and (4.12), and the fact that Suppose that the analysis of future experimental data indicates unitarity violation with nonzero value of C αα and

Summarizing our method of testing leptonic unitarity
Now, we can summarize our method of testing our (3 + N ) model of low-scale unitarity violation in vacuum: We fit the data by using the two ansatz: (1) the standard three-flavor mixing with unitary mixing matrix U PDG [1], and (2) the expressions of the oscillation probabilities in (4.6) and (4.7), with the non-unitary U matrix and the probability leaking terms C αβ and/or C αα . In the latter fit, it is important to place the constraints (4.11) on C αβ and C αα . In section 5 we present an analysis of simulated JUNO data within our formalism.
One can think of various features of the fit results that can be obtained in this way. To discuss possible implications, let us assume for conceptual clarity that a set of super-high precision measurement were done by experiments with perfectly controlled neutrino beam.
• If the fit results using (1) the standard three-flavor mixing, and (2) the (3 + N ) model reveal only small difference between them, it is an indication of absence of unitarity 7 As pointed out in ref. [5], for α = β and i = j cases respectively, there are two relevant bounds that can be obtained by applying Cauchy-Schwartz inequalities to unitarity constraints (4.9): These bounds are relevant when studying neutrino appearance να → ν β . violation. One can obtain quantitative bounds on how severely unitarity violation is constrained.
• If the fit revealed a discrepancy between (1) and (2), it is an indication of unitarity violation. It is likely that the first indication of unitarity violation comes from nonzero values of 1 − 3 i=1 |U αi | 2 (α = e, µ, τ ) in the disappearance channels, and/or 3 j=1 U αj U * βj in the appearance channels. They are both of the order of W 2 .
• If the measurement is sufficiently accurate to detect nonzero values of C αβ (α = β and/or α = β) of the order of W 4 , in addition to nonzero 1 − 3 i=1 |U αi | 2 and/or 3 j=1 U αj U * βj , it is a hint for low-scale unitarity violation.
• If the fit revealed a discrepancy between (1) and (2), indicating unitarity violation, and the fit results of C αβ (α = β and/or α = β) is outside the region allowed by the constraints (4.11). Nonvanishing C αβ suggests unitarity violation at low energies, which however implies that either both the conditions (3.6) and (3.7) are not satisfied or the scenario cannot be described by our (3 + N ) space unitary model.
The final consistency check for proving low-scale unitarity violation in the third case above is to verify (i) the consistency between the magnitudes of C αβ (∼ W 4 ) and , and (ii) over-all consistency between deviation of unitarity of U matrix and the size of W matrix expected from the (3 + N ) space unitarity (4.9). We note that the relative magnitudes of C αβ and 1 is also enforced by the upper and lower bounds (4.11) and (4.12), and therefore the property is in the heart of the (3 + N ) space unitary model.
A clarifying remark is in order: In the appearance oscillation probability, (4.6), 3 j=1 U αj U * βj comes in as squared and the term is of the same order ∼ W 4 as the leaking term C αβ . Therefore, one might think that the better accuracy may not be expected for 3 j=1 U αj U * βj . The statement above that " 3 j=1 U αj U * βj is the first indicator of unitarity violation" really means that the non-unitary U matrix elements are determined mostly by the x/E dependent oscillation terms and it determines (or strongly constrains) 3 j=1 U αj U * βj , and in this way a better accuracy is expected for The similar statement for disappearance channel also follows.

Unitarity violation: Case study using JUNO-like setting and the current constraints
In this section we carry out the first test of our framework describing low-scale unitarity violation by applying it to data to be obtained by medium-baseline reactor neutrino experiments. For definiteness we assume the JUNO-like setting as defined below. 8 We define our analysis method in section 5.1 and present the results in section 5.2. During the course of describing the results of our analysis, a comparison with the constraints currently available for the ν e channel will be done. For the ν µ and ν τ related channels, we will give a brief overview of the current constraints in section 5.3, together with miscellaneous remarks on the ν e channel.
In our analysis using the JUNO-like setting, we give special attention to the probability leaking term C αα (α = e) in eq. (4.7), as discussed in section 4.4. Of course, estimation of JUNO's capability of constraining (or probing) non-unitary nature of active space U matrix in the ν e sector is a very interesting point by itself. Yet, we must admit that our analysis using a simple-minded χ 2 cannot be considered as the real quantitative one. We use the expression of disappearance probability P (ν α → ν α ) (α = e) in eq. (4.7) for reactor neutrino analysis because it is identical to P (ν α →ν α ) assuming CPT invariance.

Analysis method
We basically follow the analysis done in [29] with some modification and simplification. In our statistical analysis, we define the χ 2 function which consists of two terms as, In the present analysis we do not take into account any data except for JUNO, not even precision measurement of sin 2 θ 13 by Daya Bay and RENO [30,31], which is expected to be improved to ∼ 3% level. On this point, we will make a comment in section 5.2. Following [32,33], the χ 2 stat is defined as, where dN obs /dE vis denotes the energy distributions of the observed (simulated) signal, and f norm is the flux normalization parameter for reactor neutrinos, to be varied freely subject to the pull term in χ 2 sys (see below) and we integrate up to E max vis = 8 MeV. Due to the lack of space, we do not describe here how to compute the event number distribution dN obs /dE vis , leaving it to appendix F.
We consider only one kind of systematic error to take into account the reactor neutrino flux uncertainty, and use σ fnorm = 3% as the reference value, assuming progress in understanding of the reactor neutrino flux at the JUNO measurement era. Yet, given the current status of simulating reactor neutrino flux, we also examine the case of σ fnorm = 6% for comparison.
There are five relevant free parameters to be fitted in our analysis, namely, |U e1 | 2 , |U e2 | 2 , 3 i=1 |U ei | 2 , C ee as well as the flux normalization parameter, f norm . These five parameters are varied freely under the conditions, as well as with the χ 2 sys defined in (5.3). For simplicity, we fix the two mass squared differences as ∆m 2 21 = 7.5 × 10 −5 eV 2 , ∆m 2 31 = 2.46 × 10 −3 eV 2 and consider only the case of normal mass hierarchy. We believe that even if we vary them our results would not change significantly.
Using the χ 2 function, we will determine the allowed ranges of the five parameters mentioned above, which will be projected into two or one dimensional subspace by using the conditions, 3, 6.18 and 11.93 (1, 4 and 9), (5.5) at 1, 2 and 3 σ CL, respectively, for two (one) degrees of freedom. The allowed contours obtained by following the above procedure for the cases of flux normalization uncertainties of 3% and 6% are presented in figures 1 and 2, respectively. Since we consider the input which corresponds to the case without unitarity violation, χ 2 min = 0 by construction as we do not take into account the statistical fluctuation in simulating the artificial data.
To understand better the features of the allowed contours in figures 1 and 2, we have also performed the analysis using the same procedure as above but without the constraints (5.4). The results of such analysis with σ fnorm = 3% are given in figure 3 in appendix B.

Analysis result
In this section we present the results of our analysis of simulated JUNO data with particular emphasis to the bounds on the parameters, C ee and 1 − 3 i=1 |U ei | 2 . A nonzero value of C ee implies existence of the low-scale unitarity violation, distinguishing it from the high-scale unitarity violation. Unfortunately, size of C ee is quite small because it is of the order of W 4 . While the latter, 1 − 3 i=1 |U ei | 2 , being of the order of W 2 , must be the first indicator of unitarity violation. We generate the input data without considering unitarity violation (corresponding to the standard three flavor scheme) but in the fit, we allow non-unitarity, in order to determine to what extent a JUNO-like experiment can constrain non-unitarity when the data are consistent with the standard three flavor scenario.

Comparison between the unitary and the non-unitary cases
In figures 1 and 2, presented are the allowed regions of C ee , 3 i=1 |U ei | 2 , |U ei | 2 (i = 1, 2), and the flux normalization f norm projected onto the various two-dimensional spaces at 1, 2, and 3 σ CL (each differentiated by colors) obtained with 5 years measurement by JUNO. 9 The reactor neutrino flux uncertainty is taken as 3% and 6% in figures 1 and 2, respectively. 9 To be more precise, we consider the total exposure corresponding to 5×35.8×20 = 3.58×10 3 kt·GW·yr.   Alternatively, the allowed regions of unitarity violation parameters C ee , and 1−(|U e1 | 2 + |U e2 | 2 + |U e3 | 2 ), as well as f norm , |U e1 | 2 , and |U e2 | 2 at 1 and 3 σ CL for 1 degree of freedom are summarized in table 1 for the both cases of the reactor flux normalization uncertainties of 3% and 6%.
We first concentrate on the former (the case for σ fnorm =3%) results given in figure 1. The colored solid contours are for the cases with unitarity violation, while the black dashed contours are for the standard unitary case. Since unitarity is preserved in the true (input) simulated data of JUNO, the contours obtained with ansatz assuming unitarity violation always contain the ones obtained with the standard unitary ansatz.
Let us understand some key features of figure 1. The unitarity violation parameter 1 − 3 i=1 |U ei | 2 is determined in strong correlation with the flux normalization f norm . It enters into the constant term in the probability in eq. (4.7) with α = β = e as where the approximate equality above is justified because of the smallness of C ee as seen in figure 1. Then, it is natural to expect that 1 − 3 i=1 |U ei | 2 would be constrained to the accuracy of ∼ 1 − 1 − σ fnorm ∼ 0.015. It seems to be consistent with figure 1, and the results given in table 1, 1 − 3 i=1 |U ei | 2 ≤ 0.01 (0.03) at 1σ (3σ) CL for one degree of freedom.
The probability leaking parameter C ee is constrained to be small, C ee < ∼ 2×10 −4 (10 −3 ) at 1σ (3σ) CL in figure 1 with two degrees of freedom, and C ee < 10 −4 (10 −3 ) at 1σ (3σ) CL in table 1 with one degree of freedom. The stringent constraints obtained for C ee can be understood as coming from the upper bound on C ee in eq. (5.4), which is imposed in the analysis. Using the above bound on the unitarity violation parameter with one degree of freedom, C ee ≤ (1 − 3 i=1 |U ei | 2 ) 2 = 10 −4 (9 × 10 −4 ) at 1σ (3σ) CL. They are quite consistent with the obtained upper bound on C ee in table 1. Noticing that 1 − 3 i=1 |U ei | 2 and C ee are of the order of W 2 and W 4 , respectively, it means that the W matrix elements are constrained to be order ∼ 10% by the JUNO measurement. 10 Does inclusion of precision data of sin 2 θ 13 to be obtained by future measurement by Daya Bay and RENO of ∼ 3% level significantly improve the sensitivity to unitarity violation? We believe that the answer is no, and here is the reasoning for our belief. The accuracy of measurement of sin 2 θ 13 in JUNO estimated in [33] is 7% level, which implies the accuracy δ(sin 2 θ 13 ) = 1.5×10 −3 . It probably means that in our framework the accuracy of measurement of |U e3 | 2 is ∼ 10 −3 , which is an order of magnitude smaller than the 1% level uncertainty of While |U e3 | 2 is measured by detecting small atmospheric ripples on the long-wavelength solar oscillations, 1 − 3 i=1 |U ei | 2 is determined in strong correlation with the flux normalization.
Therefore, it is important to reduce the flux uncertainty in order to increase the sensitivity to unitarity violation, and improvement of the |U e3 | 2 measurement would have much less impact on it.
To examine the effect of worsen reactor flux normalization uncertainty, we have repeated the same calculation with 6% error, as given in figure 2. As one can see from the figure, the over-all features of the correlation between the quantities of interests are unchanged. The extent of prolongation of contours due to the worsen flux uncertainty may be estimated once we understand the one for the unitarity violation parameter 1 − (|U e1 | 2 + |U e2 | 2 + |U e3 | 2 ). Following the same logic as above the accuracy of constraining this parameter is expected to be ∼ 1 − 1 − σ fnorm ∼ 0.03, which is again consistent with figure 2.   To know to what extent JUNO can tighten the current constraints on the ν e row elements, let us compare our results to the ones obtained in ref. [5]. We must remark that the authors of ref. [5] assumed 5% uncertainty of reactor neutrino flux. Whereas we use our results obtained by assuming 3% uncertainty for comparison. According to the estimate done in this reference (the fourth equation), the current uncertainties of |U e1 | 2 and |U e2 | 2 are 11% and 18% at 3σ CL, respectively. On the other hand, the results of our analysis with JUNO-like setting shows (see table 1) that at 3σ CL the uncertainties of |U e1 | 2 and |U e2 | 2 are, respectively, 1.9% and 2.3%. It implies a great improvement over the current constraints by a factor of 6 (8) for |U e1 | 2 (|U e2 | 2 ). For the 6% reactor flux normalization uncertainty, the uncertainties of both of |U e1 | 2 and |U e2 | 2 are 3.7% implying a factor of 3 (5) improvement for |U e1 | 2 (|U e2 | 2 ).

Understanding correlations between the parameters
One observes that, except for the ones which involve C ee , the allowed contours in the non-unitary case are much wider and expanded to the particular direction, indicating the correlations between the parameters taken in figure 1. Let us understand this feature. For this purpose we call readers attention to the bottom 4 panels (g), (h), (i), and (j) in figure 1. In the left-bottom panel (g), we see that |U e1 | 2 + |U e2 | 2 + |U e3 | 2 is restricted to be unity in the unitary case, as it should. Whereas, when unitarity violation is allowed, the contours are expanded into a left-up direction. The contour cannot expand to the right because |U e1 | 2 + |U e2 | 2 + |U e3 | 2 must be equal to or less than unity by (3 + N ) space unitarity, eq. (4.9). They can extend only to left-up direction because the effect of decrease of |U e1 | 2 +|U e2 | 2 +|U e3 | 2 has to be compensated by increase of the flux normalization f norm . It then explains the similar behavior of the contours in the panels (i), and (j). 12 In the panel (f), when unitarity violation is introduced, the allowed contours prolongate to left-down direction, indicating a positive correlation between |U e1 | 2 and |U e2 | 2 . If we assume the positive correlation between |U e1 | 2 and |U e2 | 2 , and taking into account that |U e3 | 2 |U e1 | 2 , |U e2 | 2 , we have the positive correlation between |U e1 | 2 and |U e1 | 2 + |U e2 | 2 + |U e3 | 2 (between |U e2 | 2 and |U e1 | 2 + |U e2 | 2 + |U e3 | 2 ), as indicated in the panel (b) ((d)). It almost completes the discussion to understand the features of correlations between the quantities plotted in figure 1. Now, what is left is to understand the reason for positive correlation between |U e1 | 2 and |U e2 | 2 , to which we now turn. In fact, it is quite a nontrivial feature to understand: If we run the same simulation without the constraint (4.11), we have a negative correlation between |U e1 | 2 and |U e2 | 2 . See figure 3 in appendix B. Here, we focus on the positive correlation between |U e1 | 2 and |U e2 | 2 seen in figure 1, and present a model to understand this feature. In appendix B, we will offer the possible explanation of negative correlation between |U e1 | 2 and |U e2 | 2 in the case without the constraint.
We have learned from the results of the analysis that 1 − 3 i=1 |U ei | 2 and C ee are consistently constrained to be small so that W 2 < ∼ 10 −2 . It means that the system is nearly unitary. In the unitary case, it is expected that the JUNO setting has sensitivity to the individual ∆m 2 31 and ∆m 2 32 waves. Let us suppose that this is the case also in the extended parameter space in our (3 + N ) model. Then, the suitable representation of P (ν e →ν e ) is given by the non-unitary version of the one derived in ref. [34] (α = e below): and φ α is a slowly varying function of x/E which depends only on the solar parameters, see [34]. The ± sign in front of φ α determines the mass ordering. Notice that the function inside the square bracket in (5.7) determines the way how the ∆m 2 31 and ∆m 2 32 waves are superposed, and we assume that the JUNO setting has the sensitivity to it, as was the case of our simple-minded analysis described in section 5.1 used for the unitary case [29]. Then, variations of the parameters must render the fast varying function of x/E inside the square bracket be invariant, at least approximately.
To compute the number of events, the probability in eq. (5.7) should be multiplied by the flux normalization factor f norm , as mentioned in the previous section. Then, we must analyze the effective probability defined as P (ν e →ν e ) eff ≡ f norm × P (ν e →ν e ). We now look for the transformations which render P (ν e →ν e ) eff invariant. They are where ξ is an arbitrary parameter. Notice that X, Y , ∆m 2 ee , and φ α are manifestly invariant under (5.10). The invariance of P (ν e →ν e ) eff under (5.10) implies that the allowed contours can be extended to this "invariance direction". Therefore, |U e1 | 2 and |U e2 | 2 must be positively correlated with each other, whereas |U ei | 2 (i = 1, 2) and f norm is negatively correlated. The former is consistent with the feature shown in panel (f), and the latter in agreement with the one in panel (g), (i), and (j) in figure 1. Similarly, C ee must have positive correlation with |U ei | 2 and negative correlation with f norm , the feature which, however, does not appear to be seen in figure 1. The most important reason for this is C ee is essentially determined by the conditions given in eq. (5.4), as mentioned earlier. 13

The current constraints on unitarity violation
We start by discussing the constraints obtained on unitarity violation in the ν µ and ν τ channels. We first focus on the relatively low mass sterile states m 2 J < ∼ 10 eV 2 , and rely on the results obtained by the authors of ref. [5], because their analysis is based on the (3 + N ) model. We also check the consistency of the results with those in ref. [12] keeping in mind that most of the analyses in this reference are done using the (3 + 1) model.
If the sterile states are more massive, m 2 J > ∼ 10 eV 2 , the kinematical constraints in beta and meson decays play more important role. As we mentioned in section 5.2.1, the kinematical constraint from neutrinoless double beta decay plays an important role for massive sterile neutrinos, |U e4 | 2 < ∼ 10 −3 for m 2 4 ∼ 1 keV 2 to |U e4 | 2 < ∼ 10 −6 for m 2 4 ∼ 1 MeV 2 [20]. However, no constraint on |U µ4 | 2 and |U τ 4 | 2 arizes for the mass range m 4 ≤ 1 MeV in which we are interested in the context of low-scale unitarity violation, according to the (3 + 1) model analysis in [20].
For unitarity violation at high scales, due to the SM SU (2) gauge invariance, the constraints coming from the charged lepton sector must also be considered [3]. While we do not describe them here the interested readers are advised to refer to, for example, refs. [3,4,17,18] and the ones quoted therein.

Structure of CP violation in the (3 + N ) space unitary model
As in the preceding section, we can use the formulas for P (ν µ → ν e ) and P (ν µ → ν µ ) given in (4.6) and (4.7) (β = µ, α = e etc.) to do unitarity test in the accelerator neutrino experiments with muon neutrino beam in near vacuum environment. While we postpone this task to future communications, we want to make remarks on structure of CP violation in the active neutrino sector of our (3 + N ) unitary model. We note that some authors addressed the issue of CP phase in theories with non-unitarity. See e.g., [37][38][39]. Yet, we believe that our discussion below nicely complements those given before.
The number of CP violating phases in non-unitary n × n U matrix can be counted by the similar way as in the CKM matrix: It is 2n 2 − n 2 − (2n − 1) = (n − 1) 2 , in which we have subtracted number of elements |U αi | and number of phases that can be absorbed into the neutrino wave functions. Hence, four phases exist in the U matrix in our (3 + N ) model (n = 3), and it can be parameterized, for example, as Using (4.6) the CP odd combination of the appearance oscillation probabilities is given by where we have defined the generalized Jarlskog invariants [40] J αβij ≡ Im U αi U * βi U * αj U βj . and taking imaginary part we obtain the relation Because of antisymmetry of J αβij mentioned above we can write S αβj as from which the relation S αβ1 + S αβ2 + S αβ3 = 0 follows. Then, one can easily show that CP odd combination ∆P βα can be written as 14 The form of CP-odd combination ∆P βα in (6.7) is interesting because CP violation effect is decomposed into two pieces, one unitary-like x/E dependence (first line), and the other "unitarity-violating" x/E dependence (second line). Of course, the coefficient of the first term receives unitarity violating effect through non-unitary U matrix elements in J αβ12 . But, it should be possible to disentangle between these two different x/E dependences by precision measurement of neutrino spectrum in the next generation experiments [41,42] provided that the non-unitarity effect is sufficiently large enough. Presence of the second term would provide with us a clear evidence for unitarity violation, because S αβi involves explicitly the W matrix elements which connect the active to sterile sectors. 15 To summarize: We have shown in near vacuum environments that the structure of CP odd combination of the appearance oscillation probabilities is illuminating enough to allow us to disentangle unitarity violating piece by studying the x/E dependence of the signal. 14  . By cyclic permutation one can obtain the other forms with the first coefficient J αβ23 or J αβ31 . 15 One must be careful so as not to misinterpret our statement. Through unitarity of U matrix (4.9), the U matrix always carries information of W matrix. Therefore, CP odd term is not the only place where we see the effect of non-unitarity. But, ∆P βα is special because an explicit W dependent piece may be singled out, as we emphasized above.

Unitarity violation in matter: Matter perturbation theory
In this paper we have developed a framework describing unitarity violation at low energies. It utilizes the three active and N sterile neutrino state space which is assumed to be complete, i.e., (3+N ) space unitarity. The key issue is whether the model can be formulated in such a way that its prediction is insensitive to the details of the sterile sector, for example, the sterile neutrino mass spectrum. In vacuum we have shown that our (3 + N ) model satisfies the requirement if m 2 J 0.1 eV 2 for J ≥ 4. An immediate question is if this feature survives in matter. In this section, we investigate this problem in a restricted framework of leading-order matter effect perturbation theory. We will answer the question in the positive but under the additional requirement, eq. (7.17).
We note that our approach which relies on matter perturbation theory is not purely academic. The resultant formulas for the disappearance and appearance probabilities, P (ν µ → ν µ ) and P (ν µ → ν e ), to first order in matter perturbation theory can be utilized in leptonic unitarity test in T2K and T2HK experiments [41,43]. Notice that keeping higher order terms in W is important because the bound obtainable by the ongoing and the next generation experiments may not be so stringent. Therefore, we do not make any assumptions on the size of W matrix elements in this paper (besides |W | < 1).

Matter perturbation theory of three active plus N sterile unitary system
We formulate the matter perturbation theory of (3 + N ) space unitary model by assuming x)E, with G F being the Fermi constant and N e (x) electron number density in matter, is the Wolfenstein matter potential [44]. In deriving the formulas for the oscillation probabilities, for simplicity, we assume chargeneutrality in matter, and take constant number density approximation for electron, proton and neutron. Inclusion of the spatial dependence can be done assuming adiabaticity, but it will not alter the results in a qualitative way.
To discuss neutrino oscillation in matter in the three active plus N sterile neutrino system the matter potential due to neutral current (NC) as well as charged current (CC) interactions must be taken into account. We therefore take the Hamiltonian in the flavor basis as 2E as before, as defined in eq. (3.2) and, The matter potentials A and B, which are respectively due to CC and NC interactions, take the forms and the values as where N n is the neutron number density in matter.

Perturbation theory in vacuum mass eigenstate basis
To formulate perturbative treatment it is convenient to work with the vacuum mass eigenstate basis defined asν = (U † )ν, in which the Hamiltonian is related to the flavor basis one asH ≡ U † HU =H 0 +H 1 , where 16 The S matrix in the flavor basis S(x) is related to the one in the vacuum mass eigenstate basisS(x) as We calculate perturbatively the elements ofS matrix. Toward the goal, we define Ω(x) as Ω(x) = e iH 0 xS (x), which obeys the evolution equation where Then, Ω(x) can be computed perturbatively as x 0 dx H 1 (x ) + · · ·, (7.9) 16 If we choose a different phase convention e.g.H1 = U † diag(∆A, 0, 0, ∆B, ..., ∆B)U, the S matrix discussed in the following will change but the physical observable (oscillation probability) remains the same, as it must. This is confirmed by an explicit calculation.
where the "space-ordered" form in (7.9) is essential because of the non-commutativity between H 1 (x) of different locations. Having obtained Ω(x),S matrix can be written as x Ω(x). (7.10) We calculateS matrix to first order in matter perturbation theory. SinceH 0 is diagonal, e ±iH 0 x takes the simple form diag e ±i∆ 1 x , e ±i∆ 2 x , e ±i∆ 3 x , e ±i∆ 4 x , · · ·, e ±i∆ 3+N x . Using eqs. (7.8) and (7.9) respectively, we first determine H 1 and then Ω. Using (7.5), the S matrix elements are given by theS matrix elements as where the expressions ofS matrix elements are given in appendix C. If we decompose S αβ to zeroth and the first order terms, S αβ = S (0) which is, of course, identical with (4.3), and The oscillation probabilities P (ν β → ν α ) in the appearance (β = α) and disappearance (β = α) channels can be computed to first order in matter perturbation theory as αβ . (7.14) Since the zeroth order term in P (ν β → ν α ) above is already given as the vacuum term, eq. (4.5), we only compute the first order matter correction terms. The results of P (ν α → ν α ) (1) and P (ν β → ν α ) (1) are given in appendices D and E, respectively.

Disappearance channels
For simplicity, we first discuss the oscillation probability in the disappearance channel. Given the zeroth-order term in eq. (4.5), we focus on the first-order term here. We present here P (ν α → ν α ) (1) after averaging over energy resolution and dropping the rapidly oscillating terms due to the large mass squared differences which involve sterile neutrinos 17 , leaving the full expression before averaging to appendix D. We find that the last two terms in (7.16) violate our requirement that the oscillation probability in our (3 + N ) model to be insensitive to the spectrum of sterile states unless they are smaller than C ab ∼ O(W 4 ) which implies A severer restriction is not required because these terms are already suppressed by W 2 apart from the energy denominator. From we notice that, unless W 2 is extremely small, W 2 < ∼ 10 −2 , the last two terms in (7.16) can be ignored under the same condition as in vacuum, ∆m 2 Jk > ∼ 0.1 eV 2 . If we discuss the region of W 2 which is much smaller, we need to restrict ourselves to the case of higher mass sterile neutrinos. If we treat the regime W 2 ∼ 10 −3 (W 2 < ∼ 10 −n ), we need to limit to ∆m 2 Jk m 2 J > ∼ 1 eV 2 (10 (n−3) eV 2 ) to keep our (3 + N ) space unitary model insensitive to details of the sterile sector. 17 The averaging out procedure involves not only (3.14) but also (∆Ax) sin(∆ k − ∆J )x ≈ (∆Ax) sin(∆K − ∆J )x ≈ 0, (7.15) and cosine as well. It is justified because the rapidly oscillating sine functions are imposed onto monotonic slowly increasing function of x. This feature arises due to |∆ A | ∆ J ≈ |A| Assuming the further restriction to the sterile mass spectrum such that condition (7.17) is fulfilled, we obtain the final form of the first-order matter correction to P (ν α → ν α ) as This expression is written in terms of only active space U matrix elements. Therefore, with additional condition on the sterile neutrino mass spectrum given in (7.17), the effect of unitarity violation is only through the non-unitarity U matrix to first order in matter perturbation theory. Thus, we find that the most important modification in the oscillation probability due to non-unitarity is in the vacuum expression in the disappearance channel.

Appearance channels
Despite the expression of P (ν β → ν α ) (1) given in appendix E is a little cumbersome, it has a simple form after averaging over neutrino energy within the energy resolution and using the condition (7.17): Again, the survived matter correction terms are written in terms of only active space U matrix elements, leaving the important effect of unitarity violation only in the vacuum term.
The obvious question would be: Do the features obtained in the leading order in matter perturbation theory, in particular, the restriction to the sterile masses (7.17), prevails to higher orders? A tantalizing feature of the sterile mass condition (7.17) is that its fulfillment relies on smallness of A/∆m 2 Jk in our present discussion. Therefore, better treatment of the matter effect is necessary to know whether our (3 + N ) model can be insensitive to details of the sterile sector under reasonably strong matter effect. We hope to return to these questions in the near future.

Conclusions
In this paper, we have discussed the relationship between low-scale unitarity violation, the one due to new physics at much lower energies than the electroweak scale, and the conventional high-scale unitarity violation. They include (1) presence (absence) of lepton flavor universality in low-scale (high-scale) unitarity violation, and (2) absence (presence) of zero-distance flavor transition in low-scale (high-scale) unitarity violation. In the case of low-scale unitarity violation, it is likely that extension of low energy lepton sector may enrich the features of neutrino mixing and the effects could be detectable by the precision neutrino oscillation experiments.
To provide a framework for leptonic unitarity test, by embodying such features of lowscale unitarity violation, we have constructed a three-active plus N -sterile neutrino model which is assumed to be unitary in the whole (3 + N ) dimensional state space. Presence of the sterile sector results in non-unitarity in active three neutrino subspace. Though inside this specific model, we sought the possibility that the framework is nearly modelindependent to better serve unitarity test. Namely, we require the prediction of the (3+N ) model be insensitive to the properties of the sterile sector, such as the number of states N and detailed features of the mass spectrum. We have shown that restriction to the sterile neutrino masses to m 2 J ≥ 0.1 eV 2 (J ≥ 4), due to decoherence, is sufficient to achieve the desired properties, under a mild assumption of no accidental degeneracy in the mass spectrum, i.e., |∆m 2 Ja | |∆m 2 31 |, or ∆m 2 21 where J = 4, .., 3 + N, a = 1, ..., 3 + N . The characteristic features of unitarity violation, as modeled by our (3 + N ) space unitary model, are as follows: • the neutrino oscillation probability contains the constant term C αβ in (4.8) (α = β for appearance channels, and α = β for disappearance channels), describing the probability leaking into the sterile subspace.
• the mixing matrix in 3 × 3 active neutrino subspace is non-unitary.
While the second feature is common to high-and low-scale unitarity violation, the first feature is unique to low-scale unitarity violation. Since probability leaking occurs due to presence of sterile sector which has energies comparable to active neutrinos we suspect that the first feature above is generic in low-scale unitarity violation even outside of our (3 + N ) model. In our (3 + N ) space unitary model, the first observable which signals non-unitarity would be nonzero values of 1 − 3 i=1 |U αi | 2 (α = e, µ, τ ) in the disappearance channels, and/or On the other hand, the probability leaking term C αβ (see (4.8)) is of the order of W 4 . To verify low-scale unitarity violation, finding a nonzero values of C αβ would be enough. But, to prove that unitarity violation occurs in the manner predicted by the (3 + N ) space unitary model, the consistency between order of magnitudes of 3 j=1 U αj U * βj 2 and C αβ (α = β) (and the corresponding quantities in the disappearance channels) must be checked.
Thus, we have presented a framework for analysis of unitarity violation in the lepton sector which is suitable for low-scale unitarity violation. To examine how it works we have analyzed a simulated data of medium baseline reactor neutrino experiments prepared by assuming a JUNO-like setting. By analyzing the data with our simple-minded statistical procedure, we have shown that the expected superb performance of JUNO would allow us to constrain unitarity violation and the probability leaking parameters as 1 − 3 i=1 |U ei | 2 ≤ 0.01(0.03) and C ee < ∼ 10 −4 (10 −3 ) at 1σ (3σ) CL (one degree of freedom), respectively, by its 5 years measurement.
We have also discussed in a qualitative way how to detect unitarity violation in accelerator appearance measurement. Using the antisymmetry property of the generalized Jarlskog invariants we have shown in section 6 that the CP odd combination P (ν β → ν α ) − P (ν β →ν α ) can be decomposed into the two terms with different x/E dependences. See eq. (6.7). If measurement of neutrino energy spectra is sufficiently accurate it would be possible to single out the explicit W matrix dependent piece, providing with us a clear evidence for unitarity violation.
Finally, we have addressed the question of how inclusion of the matter effect alters the nearly model-independent feature of our (3+N ) space unitary model. We have learned that if we discuss the region W 2 > ∼ 10 −2 the condition on the sterile neutrino masses m 2 J > ∼ 0.1 eV 2 needed in vacuum is sufficient, but if we want to treat case of even smaller W 2 , W 2 < ∼ 10 −n , restriction to sterile masses to m 2 J > ∼ 10 (n−3) eV 2 is necessary for our (3 + N ) space unitary model be insensitive to details of the sterile sector. Though our treatment in section 7 is restricted to first order in matter perturbation theory it is perfectly applicable to the analysis for a class of the LBL experiments, for example, T2HK. Clearly the similar discussion must be attempted under environment of larger matter effect that is expected in some of the next generation LBL experiments such as DUNE.
A Bounds on the probability leaking term by (3 + N ) space unitarity Here we would like to derive upper and lower bounds on C αβ ≡ 3+N J=4 |W αJ | 2 |W βJ | 2 taking into account the constraint from (3 + N ) space unitarity. First we have the following identity where in the second line, we have used the unitarity constraint (second relation of (4.9)). Holding the first term fixed, we can maximize (minimize) C αβ by minimizing (maximizing) the non-negative second term. Geometrically, the lengths of vectors W α ≡ {W α1 , ..., W αN } and W β ≡ {W β1 , ..., W βN } in N -vector space are fixed and we are rotating them to find configurations which minimize or maximize C αβ . The second term is non-negative and its minimum is zero. 18 Hence C αβ is bounded from above by The maximum of the second term in eq. (A.1) occurs when all the elements of W α and W β are respectively equal, W αI ≡ v and W βJ ≡ w: 19 The second step above follows from the definition C min αβ = N v 2 w 2 . Solving for C min αβ , we have The bound on C αα follows by the similar treatment. 18 There is a unique configuration: WαI , W βI = 0 for one and only one I while the rest are zero. 19 If the readers are not convinced by this argument they can derive the same lower bound (A.4) by using the Lagrange multiplier method in which one considers and minimize H in terms of |WαJ |, η, and ξ.

B Comparison between the non-unitary constrained and constraint-free cases
We recognized that for better understanding of the correlations between |U e1 | 2 and |U e2 | 2 , and other features of the contours allowed by JUNO data, it is worthwhile to examine the case with and without the constraint (4.11) and compare the results of both cases. The resultant contours of such analysis are presented in figure 3. The solid and the dashed contours are the cases with and without constraints (4.11). Of course, the regions outside the solid contours are unphysical in our (3 + N ) state space unitary model. Yet, comparison between the cases with and without is revealing to understand the features of the contours, as we see below.
Whereas the dashed contours correspond also to the cases with unitarity violation but without the above restrictions.
One immediately notices that there exist clear differences between the allowed contours obtained with and without constraints, both in size of the contours and the characteristic features of correlations. In particular, the features of correlations between |U e1 | 2 and |U e2 | 2 are completely different, as seen in the panel (f). That is, while |U e1 | 2 and |U e2 | 2 are positively correlated in the case with constraint, they are negatively correlated in the case without constraint with significantly prolongated contours toward left-up direction. Let us understand the features of the results. See section 5.2.2 for a model that enables us to understand the positive correlation between |U e1 | 2 and |U e2 | 2 in the case with constraint (4.11).
It appears to us that the features of the contours without constraint can be understood by the following model. Suppose that ∆m 2 31 and ∆m 2 32 waves cannot be discriminated by the JUNO-like setting. 20 Then, we can make approximation |∆m 2 31 | ≈ |∆m 2 32 | ≡ ∆m 2 atm , which leads to theν e oscillation probability P (ν e →ν e ) = C ee + |U e1 Again we can consider the effective probability by multiplying the flux normalization factor f norm to the probability as done in section 5.2.2, P (ν e →ν e ) eff ≡ f norm ×P (ν e →ν e ). If we assume that fit to the data can separate the oscillations of the two different frequencies associated with ∆m 2 21 and ∆m 2 atm as well as the constant term, we can obtain the following three observables, if written with the notations in addition to ∆m 2 21 and ∆m 2 atm . Clearly, the three observables cannot determine the five parameters, and this is the reason why the size of the allowed regions without restrictions (dashed curves) shown in figure 3 are much larger than those with restrictions (solid curves).
Let us now focus on the problem of the correlations. Unlike in the case discussed in section 5, the presence of the leaking term C ee without conditions given in eq. (5.4) makes a big difference. First, from figure 3(g) we notice the negative correlation between f norm and x + y + z, which is naturally expected to keep the energy independent term constant within some uncertainty. This behavior is qualitatively similar to the case with restrictions shown by the solid curves. The impact of the inclusion of C ee without restrictions is to enlarge significantly the allowed region but keeping the same qualitative feature of anticorrelation.
One might argue that if x + y + z is decreased, f norm does not necessarily have to be increased since C ee can be increased such that the energy independent term (the first quantity given in (B.2)) is kept constant, which is true. However, it would not be possible to keep also the coefficients of energy dependent terms (the second and third quantities given in (B.2)) simultaneously constant by increasing C ee and decreasing x + y + z. Therefore, the anticorrelation between f norm and x + y + z is needed even if C ee can be varied freely.
However, we did not find any significant correlation between C ee and f norm as we can see in figure 3(h). This is because the parameter C ee appears only once in the probability as a constant term, completely independent from other terms whereas x, y and z appear also in the third and fourth terms. If C ee is increased (decreased), f norm does not have 20 Notice that this is a completely different question from whether the JUNO setting can discriminate between ∆m 2 31 and ∆m 2 32 waves in the unitary case.
to be decreased (increased) because the other parameters x, y and z can be independently adjusted (no restriction for x, y and z) such that the effective probability is kept constant. Let us try to understand also the other correlations among the parameters. In particular, we focus on the ones in the panels (a), (c), (e) and (f) of figure 3. For the sake of discussion let us assume that f norm is fixed to some value, e.g., to unity, as its variation seems to be not essential to understand the correlations we want to discuss below. It may be partly because, among the five parameters, f norm is already restricted by the pull term given in eq. (5.3).
For a given value of f norm , if C ee is increased, (x + y + z) 2 must be decreased which can explains the behavior we see in figure 3 (a). In our analysis, the true (input) values of x, y and z (let us denote, respectively as x 0 , y 0 and z 0 ) are set to be, respectively, x 0 = 0.675, y 0 = 0.303 and z 0 = 0.0218. Ignoring the small value of z 0 , when C ee is increased, x + y should be decreased but keeping xy constant. We note that this is possible only if x extends to the region which is smaller than x 0 and simultaneously y extends to the region where its value is larger than y 0 , 21 which can explain the behaviors we can see in the panels (c) and (e) of figure 3. By combining the results shown in panels (a), (c) and (e), we can also understand the correlations we see in the panels of (b) and (d). Once we understand the correlations of C ee − x (negative) and C ee − y (positive), we can understand why x = |U e1 | 2 and y = |U e2 | 2 are anticorrelated to each other as we can see in the panel (f) in figure 3.

CS matrix elements
Here, we present the results of the active-active (ij), active-sterile (iJ), and the sterilesterile (IJ) space matrix elements ofS to first order in matter perturbation theory formulated in section 7. The ii and ij elements arẽ The iJ and Ij elements arẽ (C.2) 21 One might wonder why the other possibility, increasing and decreasing, respectively, x and y from x0 and y0, does not work. The reason is as follows. Suppose that x and y are varied from x0 and y0 as x0 → rx0 and y0 → y0/r such that xy kept constant. Then, if Cee is increased, we need rx0 +y0/r < x0 +y0, which implies r < 1 for x0 > y0.
In sterile-sterile subspace,S matrix is given bỹ

D Oscillation probabilities in the disappearance channels
The first-order matter correction in the disappearance oscillation probability P (ν α → ν α ) is given by (next page)

E Oscillation probabilities in the appearance channels
Here, we give the result of first-order matter correction term in the appearance oscillation probability P (ν α → ν α ). For bookkeeping purpose we decompose it into the three terms: The first term is given by The third term is given by -39 -

F Number of events for the JUNO-like setting
We compute the distribution of the number of events coming from the inverse β-decay (IBD) reaction,ν e + p → e + + n, as a function of the visible energy by performing the following integral, dσ(E, E e ) dE e ×P i (ν e →ν e ; L i , E)R(E e , E vis ), (F.1) where n p is the number of target (free protons), assumed to be ∼ 1.44 × 10 33 for 20 kt (assuming a similar proton fraction 12% as in the case of the Daya Bay detectors [45]), t exp is the exposure, det is the detection efficiency assumed to be 100% for simplicity, dφ i (E)/dE is the differential fluxes of reactor neutrinos, dσ(E, E e )/dE e is the IBD cross section, P i (ν e →ν e ; L i , E) is theν e survival probabilities for a given baseline L i and neutrino energy E, and R(E e , E vis ) is the Gaussian resolution function (see below). We ignore the matter effect and use the probability in vacuum as it is an excellent approximation. We note, however, that it is necessary to take it into account for the precision measurement of the solar parameters, see [33,46]. As a reasonable approximation for our purpose, we ignore the neutron recoil in the IBD reaction and simply assume that neutrino energy, E, and the positron energy, E e , is related as E e = E − (m n − m p ) E − 1.3 MeV. Due to the finite energy resolution, the event distribution can not be obtained as a function of E e (true positron energy) but as a function of the reconstructed or so called visible energy, E vis , which is approximately related to neutrino energy as E vis E − (m n − m p ) + m e , after taking into account the energy resolution (see the text below). Regarding the cross section, dσ(E, E e )/dE e , we use the one found in [47].
The differential flux of reactor neutrino dφ(E)/dE can be computed as, where P th is the thermal power of the reactor, E 210 MeV is the average energy released by per fission computed by taking into account the ratios of the fuel compositions of the reactor (see below).
We can replace, as a good approximation, the reactor complex consisting of 6 and 4 reactors, respectively, at Yangjiang and Taishan sites by a single reactor with the thermal power of 35.8 GW placed at the baseline L = 52.5 km from the JUNO detector. We also include the contributions from the far reactor complexes at Daya Bay (with the baseline of 215 km) and Huizhou (with the baseline of 265 km) sites, which contribute, respectively, about 3% and 2% in terms of the total number of events.
For the reactor spectra S(E), which is nothing but the number of neutrinos being emitted per fission per energy (MeV), we use the convenient analytic expressions found in [48] with the typical fuel compositions of the reactors, 235 U: 239 Pu: 238 U: 241 Pu = 0.59: 0.28: 0.07: 0.06. For simplicity, we ignore the contributions for geoneutrinos in this work, as it is not very important for our purpose.
R(E e , E vis ) is the function which takes into account the finite energy resolution of the detector and is given by where the energy resolution is assumed to be [49], σ(E e ) (E e + m e ) = 3% (E e + m e )/MeV , (F.4) The expected total number of events at JUNO for the 5 years of exposure with 100% detection efficiency (corresponding to the total exposure of 3.6 × 10 3 kt·GW·yr) is 1.4 × 10 5 .