Neyman-type sample allocation for domains-efficient estimation in multistage sampling


We consider the problem of sample allocation in two- and three-stage sampling. We seek an allocation which is both multi-domain and population efficient. Choudhry et al. (Survey Methods 38(1):23–29, 2012) recently considered such a problem for one-stage stratified simple random sampling without replacement (SRSWOR) in domains. Their approach was to minimize the sample size under constraints on the relative variances in all domains and on the overall relative variance. To attain this goal, they used nonlinear programming. Alternatively, we minimize here the relative variances in all domains (controlling them through given priority weights) as well as the overall relative variance under a constraint imposed on the total (expected) cost. We consider several two- and three-stage sampling schemes. Our aim is to shed some light on the analytic structure of the solutions rather than to derive a purely numerical tool for sample allocation. To this end, we develop the eigenproblem methodology introduced for optimal allocation problems in Niemiro and Wesołowski (Appl Math 28:73–82, 2001) and recently updated in Wesołowski and Wieczorkowski (Commun Stat Theory Methods 46(5):2212–2231, 2017) by taking into account several new sampling schemes and, more importantly, the (single) total expected variable cost constraint. Such an approach allows for solutions which are direct generalizations of the Neyman-type allocation. The structure of the solution is deciphered from the explicit allocation formulas given in terms of an eigenvector $\underline{v}^*$ of a population-based matrix $\mathbf{D}$. The solution we provide can be viewed as a multi-domain version of the Neyman-type allocation in multistage stratified SRSWOR schemes.


Introduction
Consider a stratified SRSWOR in a population $U$ of size $N$ with strata $U_1,\dots,U_I$, which form a partition of $U$, and let $N_h$ denote the size of the stratum $U_h$. For a variable $Y$ in $U$, we denote $y_k = Y(k)$, $k\in U$. The standard estimator of the total $\tau=\sum_{k\in U} y_k$ has the form $\hat\tau_{st}=\sum_{h=1}^I N_h\bar y_h$, where $\bar y_h=\frac{1}{n_h}\sum_{k\in S_h} y_k$ with $n_h$ denoting the size of the sample $S_h$ drawn from the stratum $U_h$, $h=1,\dots,I$. The variance of this estimator is
$$D^2(\hat\tau_{st})=\sum_{h=1}^I N_h^2\left(\frac{1}{n_h}-\frac{1}{N_h}\right)S_h^2,$$
where $S_h^2=\frac{1}{N_h-1}\sum_{k\in U_h}(y_k-\bar y_{U_h})^2$ is the population variance in the stratum $U_h$ and $\bar y_{U_h}=\frac{1}{N_h}\sum_{k\in U_h}y_k$. The basic question for such a setting is the optimal allocation, $n=(n_1,\dots,n_I)$, of the sample among the strata. To this end, one may fix a given (relative) variance of the estimator $\hat\tau_{st}$ and minimize the costs expressed, for example, by the total sample size $\sum_{h=1}^I n_h$. A related approach is to fix the total sample size $n=n_1+\dots+n_I$ and minimize the (relative) variance. Both cases are examples of the classical Neyman optimal allocation procedure which, for example, in the second case results in the allocation
$$n_h=n\,\frac{N_hS_h}{\sum_{g=1}^I N_gS_g},\quad h=1,\dots,I.$$
In both settings, the result is a simple consequence of minimization using the Lagrange function or can be derived from the Cauchy-Schwarz inequality.
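The classical Neyman allocation recalled above can be illustrated with a minimal sketch. It is written in Python (not a language used by the authors), the function names are hypothetical, and the three-stratum population is made up purely for illustration.

```python
def neyman_allocation(n, sizes, sds):
    """Real-valued Neyman allocation n_h = n * N_h*S_h / sum_g N_g*S_g."""
    products = [N_h * S_h for N_h, S_h in zip(sizes, sds)]
    total = sum(products)
    return [n * p / total for p in products]


def stratified_total_variance(alloc, sizes, sds):
    """Variance of the stratified SRSWOR estimator of the total."""
    return sum(N_h ** 2 * (1.0 / n_h - 1.0 / N_h) * S_h ** 2
               for n_h, N_h, S_h in zip(alloc, sizes, sds))


# Hypothetical population: stratum sizes N_h and standard deviations S_h.
sizes = [1000, 500, 200]
sds = [2.0, 4.0, 10.0]
n = 120

neyman = neyman_allocation(n, sizes, sds)
# Proportional allocation, used only as a benchmark for comparison.
proportional = [n * N_h / sum(sizes) for N_h in sizes]

var_neyman = stratified_total_variance(neyman, sizes, sds)
var_proportional = stratified_total_variance(proportional, sizes, sds)
```

In this made-up example all products $N_hS_h$ coincide, so the Neyman allocation splits the sample equally among the strata and yields a smaller variance than the proportional benchmark. Box constraints and integer rounding are deliberately ignored here.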
Recently, there has been growing interest in more refined allocation methods (also in two-stage sampling) which are based on nonlinear programming and ensure efficient estimation for the whole population; see, for example, Clark and Steel (2000), Lednicki and Wieczorkowski (2003), Clark (2009), Khan et al. (2010), Münnich et al. (2012), Gabler et al. (2012), Ballin and Barcaroli (2013) and Valliant et al. (2013, 2015). Much less is known about allocation procedures which are domains efficient or both population and domains efficient; see, for example, Costa et al. (2004), Longford (2006), Choudhry et al. (2012), referred to as CRH in the sequel, Molefe and Clark (2015) and Keto and Pahkinen (2017). All of them are again based on nonlinear programming and are designed for single-stage sampling schemes. To the best of our knowledge, the only examples of domains-efficient allocation procedures in two-stage sampling schemes are those related to the eigenproblem approach. This approach will be explained and discussed in the sequel.
In the stratified SRSWOR, we may treat strata as domains (consequently, we will change the subscript $h$ denoting a stratum into $i$ denoting a domain), that is, we would like to control not only the overall (relative) variance but also the (relative) variances in each of the domains. In the context of both multi- and small-area estimation, Longford (2006) suggested minimizing (under a constraint given by the total sample size) the objective function
$$\sum_{i=1}^I P_i\,D^2(\bar y_i)+G\,P_+\,D^2(\bar y_{st}),\qquad (1)$$
where $P_i$, $i=1,\dots,I$, are relative preassigned weights which describe the "importance" of the domains, $P_+=\sum_{i=1}^I P_i$ and $G$ is a weight responsible for the priority given to the variance of the population mean estimator. In the context of model-assisted methodology, this approach has been recently developed in Molefe and Clark (2015). Mathematically, the problem reduces to the Neyman allocation scheme. Similarly, when a given value is assigned to (1), the total sample size is minimized. The weights $(P_i,\ i=1,\dots,I)$ are designed to cover, at least to some extent, jointly the optimality issue for the domains and for the whole population. As pointed out in Friedrich and Münnich (2018), the approach of Gabler et al. (2012) can be used also in this context (actually, they mention the case with $GP_+=0$). Since the objective function (1) is a weighted sum of the domain and population variances, this approach does not give any convenient tool to control the quality of the estimators of the population and domain means. Moreover, it is not clear how to assess the impact of the values of the weights $P_i$, $i=1,\dots,I$, and $GP_+$ on the variances $D^2(\bar y_i)$, $i=1,\dots,I$, and $D^2(\bar y_{st})$. These issues are clearly visible in the numerical example given in "Appendix," where this approach is compared with the one we propose in this paper.
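The remark that the weighted-sum criterion reduces to a Neyman-type scheme can be made concrete with a small sketch. It assumes that the objective is $\sum_i P_iD^2(\bar y_i)+GP_+D^2(\bar y_{st})$ with $D^2(\bar y_i)=(1/n_i-1/N_i)S_i^2$ and $D^2(\bar y_{st})=\sum_i (N_i/N)^2(1/n_i-1/N_i)S_i^2$; under this assumption the Lagrange argument gives $n_i\propto S_i\sqrt{P_i+GP_+(N_i/N)^2}$. The function name and the data are hypothetical, and Python is used only for illustration.

```python
from math import sqrt


def weighted_sum_allocation(n, sizes, sds, P, G):
    """Allocation minimizing sum_i P_i*Var(ybar_i) + G*P_plus*Var(ybar_st)
    subject to sum_i n_i = n (box constraints ignored).

    Assumes SRSWOR within each domain, so the objective equals
    sum_i (P_i + G*P_plus*W_i^2) * S_i^2 / n_i plus a constant, W_i = N_i/N.
    """
    N = float(sum(sizes))
    P_plus = sum(P)
    scores = [S_i * sqrt(P_i + G * P_plus * (N_i / N) ** 2)
              for N_i, S_i, P_i in zip(sizes, sds, P)]
    total = sum(scores)
    return [n * s / total for s in scores]


# Hypothetical example: equal domain sizes and equal priorities, G = 0.
alloc_domains_only = weighted_sum_allocation(
    60, sizes=[100, 100, 100], sds=[1.0, 2.0, 3.0], P=[1.0, 1.0, 1.0], G=0.0)

# Same domains, but with some weight put on the population mean.
alloc_mixed = weighted_sum_allocation(
    60, sizes=[100, 200, 700], sds=[1.0, 2.0, 3.0], P=[1.0, 1.0, 1.0], G=5.0)
```

With $G=0$ and equal $P_i$ the allocation is proportional to $S_i$; increasing $G$ shifts the sample toward large domains, but the sketch also shows why the resulting domain variances are hard to read off from the weights alone.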
Our approach can be treated as an alternative to the direct setting of CRH, who also considered both multi- and small-area estimation. CRH minimize the total sample size $g(n)=n_1+\dots+n_I$ under upper bounds (2) imposed on the relative variances $T_i$ of the estimators of the domain totals $\tau_i=\sum_{k\in U_i}y_k$, $i=1,\dots,I$, and an upper bound (3) imposed on the relative variance $S$ of the estimator of the population total. Note that in this approach one specifies conditions for each of the domains and for the whole population separately. The problem was solved under additional box constraints of the form $0<n_i\le N_i$, $i=1,\dots,I$, by a nonlinear programming (NLP) method involving the popular Newton-Raphson algorithm. An NLP solution such as the one described above is an efficient tool for applications, and purely numerical approaches of this kind are popular in real surveys. Their drawback is that they give just numerical values and do not provide any information on the structure of the solution, which, for example, can be important for designing priorities for the domains.
Now we will describe an alternative approach to the problem of domains- and overall-efficient allocation in the sampling scheme considered in CRH. The approach reveals the analytic form of the solution. The respective expression is based on a unique direction in the space $\mathbb{R}^I$, where the dimension $I$ is equal to the number of domains. The rest of this section is just a warm-up illustration of the eigenproblem methodology which we will apply in full in several multistage schemes in the main part of the paper.
We would like to minimize each $T_i$, $i=1,\dots,I$, as well as $S$ under the constraint on the total size of the sample. This can be achieved in the following way. To each domain $U_i$, a (known) priority weight $\kappa_i>0$ is assigned. These weights describe the ratios of the relative variances through
$$\frac{T_i}{T_j}=\frac{\kappa_i}{\kappa_j},\quad i,j=1,\dots,I.$$
Equivalently, we can write
$$T_i=\kappa_iT,\quad i=1,\dots,I,\qquad (4)$$
where $T$ is an unknown positive constant. Such an approach gives full control over how the relative variances of the estimators vary across the domains; see the numerical example in "Appendix." Moreover, under the above constraint, the unknown parameter $T$ controls not only the relative variances in the domains but also the overall relative variance $S$ of the estimator of the population mean. This follows from the fact that under (4), due to (3) and the independence of sampling in different domains, the relative overall variance $S$ can be written as
$$S=\frac{1}{\tau^2}\sum_{i=1}^I\tau_i^2T_i=T\sum_{i=1}^I\kappa_i\frac{\tau_i^2}{\tau^2}.$$
Therefore, when we optimize the relative variances within the domains, the overall relative variance gets automatically optimized as well. This general rule will hold also for the multistage schemes considered in the sequel.
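Since $T_i=\frac{N_i^2}{\tau_i^2}\left(\frac{1}{n_i}-\frac{1}{N_i}\right)S_i^2$, denoting
$$a_i=\frac{N_iS_i}{\tau_i\sqrt{\kappa_i}},\quad c_i=\frac{N_iS_i^2}{\tau_i^2\kappa_i},\quad v_i=\frac{n_i}{a_i},\quad i=1,\dots,I,$$
Eq. (4) can be written as
$$T=\frac{a_i}{v_i}-c_i,\quad i=1,\dots,I,\qquad (5)$$
and, since $n_i=a_iv_i$, the constraint $\sum_{i=1}^I n_i=n$ assumes the form
$$\underline{a}^T\underline{v}=n.\qquad (6)$$
Multiplying (5) by $v_i$ and using (6), we get
$$Tv_i=a_i\,\frac{\underline{a}^T\underline{v}}{n}-c_iv_i,\quad i=1,\dots,I,$$
which is equivalent to $\mathbf{D}\underline{v}=T\underline{v}$, where $\mathbf{D}=\frac{1}{n}\underline{a}\,\underline{a}^T-\mathrm{diag}(\underline{c})$, $\underline{a}=(a_1,\dots,a_I)^T$, $\underline{c}=(c_1,\dots,c_I)^T$, $\underline{v}=(v_1,\dots,v_I)^T$ and $\mathrm{diag}(\underline{c})$ is a diagonal matrix with the vector $\underline{c}$ being its diagonal. Consequently, by the Perron-Frobenius theorem (for more details, see the Proof of Theorem 2.1), there exists a unique simple positive eigenvalue $\lambda^*$ of the matrix $\mathbf{D}$ and the respective eigenspace is spanned by a vector $\underline{v}^*$ with all components positive. This vector $\underline{v}^*$, up to normalization, that is, the respective direction in the space $\mathbb{R}^I$, is responsible for the efficient allocation. Therefore, in our problem above, $T=\lambda^*$ and $\underline{v}$ is proportional to $\underline{v}^*$. Using again the constraint on the sample size, we see that
$$n_i=n\,\frac{a_iv_i^*}{\sum_{j=1}^I a_jv_j^*},\quad i=1,\dots,I.$$
Moreover, with this optimal allocation, $T_i=\kappa_i\lambda^*$, $i=1,\dots,I$, and $S=\lambda^*\sum_{i=1}^I\kappa_i\tau_i^2/\tau^2$.
Remark 1.1
Of course, there is an alternative numerical solution of this problem; see, for example, Lednicki and Wesołowski (1994) (referred to as LW below). From (5), one gets
$$n_i=\frac{a_i^2}{T+c_i},\quad i=1,\dots,I.\qquad (7)$$
Now the sample size constraint leads to the equation
$$\sum_{i=1}^I\frac{a_i^2}{T+c_i}=n\qquad (8)$$
for the unknown $T$. It is obvious that there exists a unique positive solution $T=T^*$, which has to be found numerically. Then, the allocation is given by (7) with $T=T^*$.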
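The direct numerical route of Remark 1.1 can be sketched as follows. The sketch assumes single-stage SRSWOR in each domain, so that the relative variance of the estimator of the domain total is $T_i=\frac{N_i^2S_i^2}{\tau_i^2}\left(\frac{1}{n_i}-\frac{1}{N_i}\right)$, and solves the sample-size equation for $T$ by bisection. Python, the function names and the data are illustrative assumptions, not the authors' implementation.

```python
def allocation_for_T(T, sizes, sds, totals, kappa):
    """Real-valued n_i solving T_i = kappa_i * T for a given T > 0."""
    alloc = []
    for N_i, S_i, tau_i, k_i in zip(sizes, sds, totals, kappa):
        b_i = (N_i * S_i / tau_i) ** 2        # N_i^2 S_i^2 / tau_i^2
        alloc.append(b_i / (k_i * T + b_i / N_i))
    return alloc


def solve_T(n, sizes, sds, totals, kappa, iterations=200):
    """Bisection for the unique T > 0 with sum_i n_i(T) = n."""
    if not 0 < n < sum(sizes):
        raise ValueError("need 0 < n < total population size")
    lo, hi = 0.0, 1.0
    # The total sample size decreases in T; enlarge hi until it drops below n.
    while sum(allocation_for_T(hi, sizes, sds, totals, kappa)) > n:
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sum(allocation_for_T(mid, sizes, sds, totals, kappa)) > n:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical domains: sizes, standard deviations, totals, priority weights.
sizes = [1000, 500, 200]
sds = [2.0, 4.0, 10.0]
totals = [50000.0, 30000.0, 20000.0]
kappa = [0.5, 0.3, 0.2]

T_star = solve_T(120, sizes, sds, totals, kappa)
alloc = allocation_for_T(T_star, sizes, sds, totals, kappa)
```

Because the total sample size is strictly decreasing in $T$ and equals the population size at $T=0$, bisection is sufficient here. The resulting allocation satisfies $T_i/\kappa_i=T^*$ in every domain by construction.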
As we mentioned above, there are alternatives to the eigenproblem approach to the (domains-population)-efficient allocation issue in the case of SRSWOR in domains. Besides the possibility mentioned in Remark 1.1, the same allocation can be obtained (up to box constraints) by the CRH methodology if $T_i:=\kappa_iT^*$ (with the value of $T^*$ as computed in the eigenproblem procedure) and $g$ is minimized through the NLP procedure. Similarly, each of these three approaches (CRH, LW and eigenproblem) can be applied in the case of stratified SRSWOR in each of the domains. It suffices to start the procedure with the Neyman allocation in every domain.
However, the situation changes drastically when two-stage (or multistage) sampling is taken into account. Then, as will be explained in the following sections, even in the simplest case of two-stage sampling with SRSWOR at both stages (and no stratification), the formula relating the sample sizes at the first and the second stage to the variances, an analog of the one which led to (7), does not yield a simple equation for the unknown $T$ such as (8) in Remark 1.1. Therefore, such a direct numerical approach is not possible. To the best of our knowledge, no analogs of the NLP procedure from CRH are available in the literature in the multistage setting. Nevertheless, nonlinearly constrained optimization solvers, for example MINOS, MOSEK or IPOPT, available on the Web through the NEOS server, can be used as potential tools for NLP answers to the two-stage extension of the original CRH problem.
It appears that in this situation, as well as in more complicated ones, the optimal allocation problem can be conveniently handled through the eigenproblem methodology, which provides insight into the structure of the optimal solutions, though in some non-typical cases it may give results which are only approximately optimal. It suffers from the same drawbacks as the original Neyman optimal allocation; i.e., the natural box constraints can be violated and the solution typically is not integer valued. The main aim of the present paper is to show how such an eigenproblem approach works in several new settings involving multistage sampling. In Sect. 2, we consider two-stage sampling with stratified SRSWOR at both stages. Special simplified cases are described in Sect. 3. Then, we deal with the situation in which pps sampling with replacement is used at one of the stages while at the other the sample is drawn according to SRSWOR. Finally, in Sect. 5 we analyze three-stage sampling with SRSWOR at every stage. In all these cases, the allocation problem with the total cost constraint is solved via an eigenproblem for rank-one perturbations of diagonal matrices. The case of pps sampling with replacement at the first stage and SRSWOR at the second stage is rather special: then, the eigenproblem is for a matrix of rank one and thus an analytic form of the eigenvector responsible for the allocation is available.
The eigenproblem approach to efficient allocation in domains was originally proposed in Niemiro and Wesołowski (2001) (NW in the sequel) and recently developed in Wesołowski and Wieczorkowski (2017) (WW in the sequel). The major difference between the setting of these two papers and our setting is the form of the cost constraints: here, we consider a single total cost constraint, while in these earlier papers two constraints, one on the size of the PSU sample and one on the expected size of the SSU sample, were imposed jointly. This change in the cost constraints has important consequences. Due to the form of the cost constraint, our solution is a direct generalization of the Neyman-type allocation. In particular, it gives the Neyman-type solution when there are no domains (i.e., the whole population is a single domain). At the technical level, the population matrix $\mathbf{D}$, on which everything is based, is a rank-one perturbation of a diagonal matrix, while it was a rank-two perturbation of a diagonal matrix in NW and WW. There is also an important difference from NW and WW with respect to the structure of the allocation. The common feature is that there is an eigenvector $\underline{v}^*$ of the matrix $\mathbf{D}$ which plays an important role in the optimal allocation; however, in the case we consider here, it influences only the optimal allocation at the first stage, while in the cases considered in NW and WW the optimal allocation at both stages depends explicitly on the respective version of $\underline{v}^*$.
2 Two-stage sampling with stratified SRSWOR at both stages
For any $i=1,\dots,I$, the subpopulation $V_i$ of primary sampling units (PSUs) of the $i$th domain in $U$ is stratified: $V_i=\bigcup_{h=1}^{H_i}V_{i,h}$. Also, every PSU, understood as a collection of secondary sampling units (SSUs), is stratified: the $j$th PSU from $V_{i,h}$ is the union $\bigcup_{g=1}^{G_{i,h,j}}W_{i,h,j,g}$ of its SSU strata. A sample $S$ is chosen as follows: at the first stage, a PSU sample $S^{(I)}_{i,h}$ of size $m_{i,h}$ is drawn by SRSWOR from $V_{i,h}$; at the second stage, for every PSU $j\in S^{(I)}_{i,h}$, a sample of $n_{i,h,j,g}$ SSUs is drawn by SRSWOR from $W_{i,h,j,g}$.
As an example, one can consider a survey of a population of students in a given country with parameters to be estimated at the region level (subpopulations) and at the whole country level as well. Then, SSUs are students, while PSUs are schools. Schools in each region are stratified into educational districts, and students in each school are stratified into grades. That is, $U$ is the population of students, $V_i$ is the subpopulation of schools in the $i$th region, while $V_{i,h}$ is the stratum of schools in the $h$th district of $V_i$ and $M_{i,h}$ is the number of schools in $V_{i,h}$. Moreover, $W_{i,h,j,g}$ denotes the students of grade $g$ of the $j$th school from district $V_{i,h}$ and $N_{i,h,j,g}$ denotes the number of students in $W_{i,h,j,g}$. A sample $S^{(I)}_{i,h}$ of $m_{i,h}$ schools is drawn according to SRSWOR from $V_{i,h}$, and then a sample of $n_{i,h,j,g}$ students is drawn according to SRSWOR from $W_{i,h,j,g}$ for every school $j\in S^{(I)}_{i,h}$. Here and below, in the formulas for variances, a single subscript $i$ refers to the $i$th region.
The variance of the $\pi$-estimator of the total of $Y$ over the subpopulation $U_i$ has the form given, for example, in Särndal et al. (1992, Ch. 4.3). The actual cost of the survey generated by the sample $S$ can be modeled by a quantity in which $c^2_{I,i,h}>0$ and $c^2_{II,i,h,j}>0$ are the costs generated by a single PSU from the $h$th stratum of PSUs in the $i$th domain (we assume that it is constant within the stratum) and by a single SSU from the $j$th PSU of the $h$th stratum of PSUs in the $i$th domain (we assume that it is constant within the PSU), respectively. Obviously, due to the randomness of $S^{(I)}_{i,h}$, the actual cost is a random variable.
In such a situation, when one wants to impose a constraint on the total cost, the standard approach is to impose a constraint on its expected value, the expected variable cost (EVC), see, for example, Ch. 12.8.1 of Särndal et al. (1992); in the case considered here, this is the constraint (9), which fixes the EVC at a given value $C$. We also assume that priority weights $(\kappa_i,\ i=1,\dots,I)\in(0,1)^I$, such that $\sum_{i=1}^I\kappa_i=1$, for the relative variances of the estimators of means in subpopulations are preassigned, that is,
$$T_i=\kappa_iT,\quad i=1,\dots,I.\qquad (10)$$
We use the notation $\gamma^2_{i,h}$ since we will be assuming that this quantity is nonnegative.
We want to find the allocation, that is, a set of two tables: a two-way table $m=(m_{i,h})$ and a four-way table $n=(n_{i,h,j,g})$, which gives minimal domain-wise relative variances $T_i$, $i=1,\dots,I$, and minimal overall relative variance $S$, under the constraints (10) imposed through the priority weights and the EVC constraint (9).
The result below says that this can be achieved by searching for a positive eigenvalue of a certain matrix based on population quantities and cost coefficients. The allocation is obtained from the respective eigenvector. The approach parallels earlier developments in this setting where, instead of a single total average cost constraint, the first-stage and second-stage costs were treated separately. In particular, NW considered a two-stage scheme with separate constraints on the sizes of the PSU and SSU samples and with stratified sampling either at the first or at the second stage. As has already been mentioned, a similar problem has been recently investigated in WW for two-stage stratified SRSWOR at both stages as well as for a scheme with the stratified Hartley-Rao scheme at the first stage and stratified SRSWOR at the second stage (some variations of these two basic setups were also considered there). In that paper, again two constraints were jointly imposed: one for the cost incurred by the PSU sample size, $\sum_{i=1}^I\sum_{h=1}^{H_i}m_{i,h}=m$, and one for the cost generated by the expected SSU sample size, $\sum_{i=1}^I\sum_{h=1}^{H_i}\frac{m_{i,h}}{M_{i,h}}\sum_{j\in V_{i,h}}\sum_{g=1}^{G_{i,h,j}}n_{i,h,j,g}=n$ (these formulas obviously refer to stratified SRSWOR at both stages). In the meantime, the eigenproblem approach has been further developed in a series of papers: Kozak (2004) (a multivariate version of NW was considered with an application to agricultural surveys) and Kozak and Zieliński (2005) (the original eigenproblem approach from NW, where it was assumed that relative variances are the same for all domains, was adapted to include priority weights for domains; an application to a real forestry survey was also given). Only single-stage schemes were considered in both these papers. In the context we consider here, probably the most interesting paper is Kozak et al. (2008).
These authors were concerned with a two-stage sampling with stratification at the first stage together with a single cost constraint similar to (9) and domains-related constraints like (10). However, their approach was restricted to the case when SSU's sample sizes are the same for all PSU's in a given stratum of a given domain. Also they did not consider stratification at the second stage. The latter restriction does not seem to be as serious as the former.
In our main result below, we use the notation introduced earlier in this section.
Theorem 2.1 Assume that $\mathbf{D}$ has a positive eigenvalue $\lambda^*$ with a respective eigenvector $\underline{v}^*=(v_1^*,\dots,v_I^*)^T$. Then, $\lambda^*$ is simple and unique and $\underline{v}^*$ has all coordinates of the same sign. The allocation which minimizes all relative variances in domains $T_i$, $i=1,\dots,I$ (as well as the relative variance $S$ in the whole population) under the domain relative variance constraints (10) and the overall EVC constraint (9) is given by (11) and (12). Moreover, the minimal relative variances in the domains are $T_i=\kappa_iT$, $i=1,\dots,I$, and the overall relative variance $S$, which is $T$ multiplied by a population constant, is given by (13).
Remark 2.1
Note that when condition (14) is satisfied for a matrix of the form $\mathbf{D}=\frac{1}{C}\underline{a}\,\underline{a}^T-\mathrm{diag}(\underline{c})$ with $C>0$ and $\underline{a},\underline{c}\in(0,\infty)^d$, then $\mathbf{D}$ has a positive eigenvalue (see Prop. 2.1 in WW). In the framework of Theorem 2.1, condition (14) becomes a condition on the population quantities and cost coefficients entering $\underline{a}$, $\underline{c}$ and $C$. The above assumption as well as the assumption that $\gamma^2_{i,h}>0$ is related to the convexity of the function being minimized, and as such they are necessary also for convex NLP methods to provide the unique solution (see also Remark 2.3).

Remark 2.2
Note that the problem we solved in Theorem 2.1 can be formulated equivalently as: Minimize the overall relative variance $S$ under constraints (10) on the relative variances $T_i$ in domains ($i=1,\dots,I$) and the expected overall cost constraint (9). The reason for the validity of such a rephrasing of the original problem is that, under (10), $S$ is proportional to $T$, namely $S=T\sum_{i=1}^I\kappa_i\tau_i^2/\tau^2$, so that minimizing $S$ and minimizing $T$ are equivalent.

Remark 2.3
The optimal allocation problem in two-stage sampling when no domain efficiency is taken into account has the well-known Neyman-type solution. For example, in the case of no stratification at both stages, such a solution under the EVC constraint is given in Ch. 12.8.1 of Särndal et al. (1992). Our formulas (11) and (12) reduce to (12.8.13) and (12.8.14) of Särndal et al. (1992) in the case when $I=1$, $H_1=1$ and $G_{1,1,j}=1$, that is, when the whole population is a single domain and neither the PSUs nor the SSUs within a PSU are stratified. The optimal allocation in the case of a single domain with stratified SRSWOR for PSUs and SRSWOR for SSUs in every PSU from the first-stage sample is considered in Saini and Kumar (2015). The authors provide the NLP solution and then conclude that the same result can be obtained via a Neyman-type approach. Actually, they consider a $p$-variate case. However, their optimal allocation formulas (16) and (17) for $p=1$ are again special cases of (11) and (12). Note that the assumption $A_h>0$, needed also for the numerical solution in that paper, is (again for $p=1$) in full agreement with $\gamma^2_{i,h}>0$, which we assume in Theorem 2.1. In the case of the population $U$ consisting of just a single domain, i.e., when $I=1$, the eigenvector cancels out from (11), so that formulas (11) and (12) for the optimal allocation, as well as formula (13) for the optimal relative variance, simplify immediately (with the index $i=1$ suppressed). The resulting formulas are exact versions of the Neyman optimal allocation and the Neyman optimal variance for two-stage sampling with stratified SRSWOR at both stages.

Remark 2.4
The allocation results given in Theorem 2.1 should be compared to the domain-efficient allocation in the same stratified SRSWOR at both stages but with separate constraints on the size of the first-stage sample and on the expected size of the second-stage sample, as given in Theorem 3.3 of WW. The basic difference is that in the latter paper both $m_{i,h}$ and $n_{i,h,j,g}$ depend on the eigenvector $\underline{v}^*$, while in the above result the eigenvector appears only in formula (11) for $m_{i,h}$ and formula (12) is free from $\underline{v}^*$. This is the major, and by no means obvious, structural consequence of the fact that the constraint we consider here is imposed on the expected costs of the first and the second stage jointly.
Proof of Theorem 2.1 Note that since $\kappa_i$, $i=1,\dots,I$, are fixed and known, minimizing the relative variances $T_i=T_i(m,n)$, $i=1,\dots,I$, is equivalent to minimizing $T$ under constraints (9) and (10). We therefore form the Lagrange function with multipliers $\lambda_i$, $i=1,\dots,I$, for constraints (10) and $\mu$ for constraint (9). The stationarity conditions give $\lambda_i>0$ and the relations (15) and (16); in particular, after multiplication by $m_{i,h}$, the second and fourth terms of the relevant condition cancel out due to (15). Now (12) follows by combining (16) with (15). To find $m_{i,h}$, we plug (15) and (16) into the cost constraint (9) and obtain (17). Now let us insert (15) and (16) into the constraint (10). In the resulting equation, we multiply both sides by $v_i:=\sqrt{\lambda_i}$, divide by $\rho_i$ and expand $\sqrt{\mu}$ according to (17). The identity obtained in this way is valid for any $i=1,\dots,I$ and, in terms of the vector $\underline{a}$ defined in the formulation of the theorem, it says that $\underline{v}$ is an eigenvector of $\mathbf{D}$ associated with the eigenvalue $T$. The final part of the proof follows closely the argument given in WW and is recalled here just for the readers' convenience.
Consider now the matrix $\mathbf{D}$ and let $\lambda^*$ be its positive eigenvalue. To show that it is simple and unique and that the eigenvector $\underline{v}^*$ attached to this eigenvalue has all coordinates of the same sign, we use the celebrated Perron-Frobenius theorem: If $\mathbf{A}$ is a matrix with all entries strictly positive, then there exists a positive eigenvalue $\nu$ of $\mathbf{A}$ which is simple and such that $\nu>|\lambda|$ for any other eigenvalue $\lambda$ of $\mathbf{A}$. The respective eigenvector (attached to $\nu$) has all entries strictly positive (up to scalar multiplication); see, for example, Kato (1981, Th. 7.3 in Ch. 1).
Fix a number $\rho>\max_{1\le i\le I}c_i>0$. The matrix $\mathbf{D}+\rho\mathbf{I}$, where $\mathbf{I}$ is the identity matrix, has all entries strictly positive. For any eigenvalue $\delta_j$ of $\mathbf{D}$ and a respective eigenvector $\underline{w}_j$,
$$(\mathbf{D}+\rho\mathbf{I})\underline{w}_j=(\delta_j+\rho)\underline{w}_j,\quad j=1,\dots,I.$$
That is, $\delta_j+\rho$ and $\underline{w}_j$, $j=1,\dots,I$, are respective eigenvalues and eigenvectors of the matrix $\mathbf{D}+\rho\mathbf{I}$. By the Perron-Frobenius theorem, there exists $j_0$ such that $\delta_{j_0}+\rho\ge|\delta_j+\rho|$ for any $j=1,\dots,I$ and the respective eigenvector $\underline{w}_{j_0}$ has all entries of the same sign. Consequently, $\delta_{j_0}+\rho\ge\delta_j+\rho$, and thus $\delta_{j_0}\ge\delta_j$ for any $j=1,\dots,I$. Therefore, by the assumption that $\lambda^*$ is the unique positive eigenvalue of $\mathbf{D}$, it follows that $T=\lambda^*=\delta_{j_0}$ and the respective eigenvector $\underline{v}^*=\underline{w}_{j_0}$ has all entries of the same sign. Now formulas (11) and (12) follow directly from (16) and (15).
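The shift argument used in the proof also suggests a simple way of computing the eigenpair numerically. The following sketch applies power iteration to $\mathbf{D}+\rho\mathbf{I}$ for a matrix of the form $\mathbf{D}=\frac{1}{C}\underline{a}\,\underline{a}^T-\mathrm{diag}(\underline{c})$; it is an illustration in Python with hypothetical function names and made-up vectors `a`, `c`, not the procedure used by the authors (who rely on an eigen-decomposition routine).

```python
from math import sqrt


def apply_D(a, c, C, v):
    """Compute D v for D = a a^T / C - diag(c) without forming D."""
    s = sum(a_j * v_j for a_j, v_j in zip(a, v)) / C
    return [a_i * s - c_i * v_i for a_i, c_i, v_i in zip(a, c, v)]


def perron_pair(a, c, C, max_iter=10000, tol=1e-13):
    """Power iteration on D + rho*I, rho > max(c), so all entries are positive."""
    rho = max(c) + 1.0
    v = [1.0 / sqrt(len(a))] * len(a)          # positive unit start vector
    for _ in range(max_iter):
        w = [x + rho * v_i for x, v_i in zip(apply_D(a, c, C, v), v)]
        norm = sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
        change = max(abs(x - y) for x, y in zip(w, v))
        v = w
        if change < tol:
            break
    # D is symmetric, so the Rayleigh quotient of the unit vector v gives lambda.
    lam = sum(x * y for x, y in zip(apply_D(a, c, C, v), v))
    return lam, v


# Hypothetical inputs with positive components.
a = [3.0, 2.0, 1.0]
c = [0.5, 0.2, 0.1]
C = 10.0
lam, v = perron_pair(a, c, C)
residual = max(abs(x - lam * v_i) for x, v_i in zip(apply_D(a, c, C, v), v))
```

For this made-up input the largest eigenvalue is positive, the computed eigenvector has all components positive, and the eigenvalue also solves the scalar equation $\sum_i a_i^2/(\lambda+c_i)=C$, which links the eigenproblem to the direct numerical approach available in the single-stage case.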

Remark 2.5
Of course, as always when such allocation problems are solved without the natural box constraints $m_{i,h}\le M_{i,h}$ and $n_{i,h,j,g}\le N_{i,h,j,g}$ (and this is the case for the eigenproblem approach), the solution may violate some of them. Then, it is standard to set $m_{i,h}=M_{i,h}$ and $n_{i,h,j,g}=N_{i,h,j,g}$ in all instances where the respective box constraint is violated and then to repeat the minimization procedure for the reduced population and the reduced cost constraint. This may produce solutions which are not optimal (though, typically, close to optimal). On the other hand, it is known, for example, for the problem of optimal allocation in stratified SRSWOR, that the population can be reduced when the optimal solution requires taking $n_h=N_h$ in some strata. Then, minimization can be performed on such a reduced population; see, for example, Lemma 1 in Stenger and Gabler (2005). This approach has been developed by introducing box constraints into the numerical procedure of optimal allocation in Gabler et al. (2012); computational aspects of such procedures are analyzed in Münnich et al. (2012) (with further references given in that paper). We also do not consider here exact optimality with respect to integer solutions. In this context, it is worth mentioning again stratified SRSWOR, for which an integer-valued optimal allocation has been recently given in Wright (2017) and, even earlier, another one by Friedrich et al. (2015). Here, we are fully satisfied with, for example, random rounding of a non-integer allocation, which typically gives solutions close to optimal.
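The "fix and repeat" treatment of violated box constraints described in this remark can be sketched for the simplest case of single-stage stratified SRSWOR with a fixed total sample size. This is only an illustration in Python with a hypothetical function name and made-up data; the multistage version would apply the same idea to $m_{i,h}$ and $n_{i,h,j,g}$ with a reduced cost constraint.

```python
def neyman_with_take_all(n, sizes, sds):
    """Neyman allocation with the box constraints n_h <= N_h handled by
    taking all units in violating strata and re-allocating the rest."""
    if n > sum(sizes):
        raise ValueError("sample size exceeds population size")
    alloc = [None] * len(sizes)
    active = list(range(len(sizes)))
    remaining = float(n)
    while active:
        total = sum(sizes[h] * sds[h] for h in active)
        trial = {h: remaining * sizes[h] * sds[h] / total for h in active}
        over = [h for h in active if trial[h] > sizes[h]]
        if not over:
            for h in active:
                alloc[h] = trial[h]
            break
        for h in over:                       # take-all strata
            alloc[h] = float(sizes[h])
            remaining -= sizes[h]
        active = [h for h in active if h not in over]
    return alloc


# Hypothetical strata: the third one is small but very variable.
alloc = neyman_with_take_all(120, sizes=[1000, 500, 20], sds=[2.0, 4.0, 100.0])
```

In the example, the unconstrained Neyman allocation would ask for 40 units from a stratum of size 20, so that stratum is taken completely and the remaining 100 units are re-allocated between the other two strata.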

Stratification only at the first stage
This is probably the most popular of the two-stage schemes used in practice. In this case, we have $G_{i,h,j}=1$ for any $(i,h,j)$, which allows for a considerable simplification of the notation used in Sect. 2. The constraints imposed by the priority weights for the relative variances in domains are the corresponding special case of (10). From Theorem 2.1 [if its assumptions, in particular (14), are satisfied], we conclude that the optimal allocation at the first stage is given by (18), where $\underline{v}^*$ is the eigenvector (with positive components) of the matrix $\mathbf{D}=\frac{1}{C}\underline{a}\,\underline{a}^T-\mathrm{diag}(\underline{c})$ with $\underline{a}$ and $\underline{c}$ defined in (19), and that the optimal allocation at the second stage is given by (20). Due to its important role in practice, we chose this setting for presenting the core part of an R code which produces the domain-efficient allocation. Assume that the R vectors a and c, holding $\underline{a}$ and $\underline{c}$, have already been computed according to (19) and that the vector of priority weights $(\kappa_i,\ i=1,\dots,I)$ is denoted by kap. Then, to find the respective eigenvector, one may use R code whose essence is the function eigen. After computing the eigenvector v, holding $\underline{v}^*$, one can calculate the optimal sample sizes $m_{i,h}$ and $n_{i,h,j}$ according to (18) and (20), respectively.
The R code referred to above was adapted from the full R code available at https://github.com/rwieczor/eigenproblem_sample_allocation, which was created (in connection with WW) for optimal fixed precision allocation in subpopulations in two-stage sampling with the stratified Hartley-Rao $\pi$ps scheme at the first stage and SRSWOR at the second stage and with constraints imposed separately on the size of the sample at the first stage and on the expected size of the sample at the second stage.

Stratification only at the second stage
Here, we have $H_i=1$ for any $i$, which again allows us to simplify the notation of Sect. 2. In the constraints imposed by the priority weights for the relative variances in domains, which are the corresponding special case of (10), $N_{i,j,g}$ is the number of SSUs, $S^2_{i,j,g}$ is the population variance among SSUs and $n_{i,j,g}$ is the sample size in the $g$th SSU stratum of the $j$th PSU from $V_i$. Moreover, $G_{i,j}$ is the number of SSU strata in the $j$th PSU from $V_i$ and $M_i=\#(V_i)$. The cost constraint is the corresponding version of (9). From Theorem 2.1 (if its assumptions are satisfied), it follows that the optimal allocations at the first and at the second stage are the respective special cases of (11) and (12).

No stratification at stage one and two
That is, we assume $H_i=1$ and $G_{i,h,j}=1$ for any $(i,h,j)$. In this case, the formulas simplify further. In the constraints imposed by the priority weights for the relative variances in domains, $N_{i,j}$ is the number of SSUs, $S^2_{i,j}$ is the population variance among SSUs and $n_{i,j}$ is the sample size in the $j$th PSU from $V_i$, with $M_i$ and $D_i$ defined as in Sect. 3.2. The cost constraint (9) assumes a simple form. From Theorem 2.1 (if its assumptions are satisfied), we conclude that the optimal allocation at the first stage is expressed through the eigenvector $\underline{v}^*$ (having all components positive) of the matrix $\mathbf{D}=\frac{1}{C}\underline{a}\,\underline{a}^T-\mathrm{diag}(\underline{c})$ with $\underline{c}$ defined as in Sect. 3.2, while the optimal allocation at the second stage is free from $\underline{v}^*$.

pps Sampling at the first stage and SRSWOR at the second stage
We draw the ordered PSU sample $S^{(I)}=(K_1,\dots,K_m)$ using pps sampling, meaning that PSUs are drawn $m$ times with replacement (that is, independently), the $j$th PSU being drawn with probability $p_j$ proportional to its size, $j\in V$ (the population of PSUs). Then, if the $j$th PSU belongs to $S^{(I)}$, we draw from it (by SRSWOR) a sample of SSUs of size $n_j$, $j\in S^{(I)}$. Such a sampling scheme is considered in Ch. 4.5 of Särndal et al. (1992) (in particular, the unbiased estimator and its variance are given in Result 4.5.1). A population-efficient allocation procedure for this setup has been given recently in Valliant et al. (2015) as one of the options in the PracTools R package. The importance of this scheme is due to the fact that when the sample of PSUs is sufficiently small, sampling with and without replacement give practically the same results. Consequently, very often in practice, the first-stage variance in $\pi$ps sampling without replacement is approximated by its pps version. It appears that in such a case the eigenproblem methodology we develop here allows for a closed analytic formula for the eigenvector responsible for the domains-efficient allocation. This follows from the fact that the respective population matrix is of rank one. The details are given below.
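The first-stage draw described above can be illustrated by a short sketch: PSUs are drawn independently with probabilities proportional to size, and the population total is estimated by averaging the ratios of PSU totals to their drawing probabilities. For simplicity the sketch assumes that the PSU totals are known exactly (no second-stage subsampling), so it only illustrates the first stage; the function names and data are hypothetical, and Python is used for illustration only.

```python
import random


def draw_pps_with_replacement(sizes, m, seed=None):
    """Draw m PSU indices independently, PSU j with probability p_j ~ size."""
    rng = random.Random(seed)
    total = float(sum(sizes))
    probs = [size / total for size in sizes]
    draws = rng.choices(range(len(sizes)), weights=probs, k=m)
    return draws, probs


def pps_estimate_of_total(draws, probs, psu_totals):
    """Average of (PSU total / drawing probability) over the ordered sample."""
    return sum(psu_totals[j] / probs[j] for j in draws) / len(draws)


# Hypothetical PSUs whose totals are exactly proportional to their sizes.
sizes = [10, 20, 30, 40]
psu_totals = [5.0 * size for size in sizes]
draws, probs = draw_pps_with_replacement(sizes, m=3, seed=1)
estimate = pps_estimate_of_total(draws, probs, psu_totals)
```

When the PSU totals are exactly proportional to the sizes, every ratio equals the population total, so the estimator has zero first-stage variance whatever PSUs are drawn. This is the usual motivation for size-proportional drawing probabilities.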
The unbiased estimator of the population total $\tau=\sum_{k\in U}y_k$ is
$$\hat\tau=\frac{1}{m}\sum_{r=1}^m\frac{\hat t_{K_r}}{p_{K_r}},$$
where $\hat t_j=\frac{N_j}{n_j}\sum_{k\in S_j}y_k$ is the SRSWOR estimator of the total $t_j=\sum_{k\in PSU_j}y_k$, based on the sample $S_j$ of $n_j$ out of $N_j$ SSUs, for any PSU $j$. Its variance is given in Result 4.5.1 of Särndal et al. (1992). To obtain the optimal allocation of the sample at the first and at the second stage in the domains with given priority weights $\kappa_i$, $i=1,\dots,I$, we need to minimize the relative variances $T_i$ of the estimators of the domain totals under the constraints given by the priority weights, $T_i=\kappa_iT$, and the EVC constraint (21), where $C$ is the total expected cost of the survey, $c_{I,i}$ is the cost incurred by a PSU from $V_i$ (assumed to be constant within the domain) and $c_{II,i,j}$ is the cost incurred by an SSU belonging to the $j$th PSU from the $i$th domain. This setting is somewhat different from, and actually simpler than, those considered earlier. This is due to the fact that in the expression for $T_i$ all summands are multiplied by $1/m_i$.
Theorem 4.1 Assume that for any $i = 1, \ldots, I$ j∈V i Then, the allocation minimizing $T_i = \kappa_i T$, $i = 1, \ldots, I$ (as well as the relative variance $S$ in the whole population) under the cost constraint (21) has the form

Proof As in the proof of Theorem 2.1, we consider the Lagrange function Again, following the steps of the proof of Theorem 2.1, we arrive at Thus, the formula for $n_{i,j}$ follows. Inserting both expressions from (22) into the cost constraint, we obtain On the other hand, plugging formulas (22) into the constraints $T_i = \kappa_i T$, we get That is, $\underline{v}$ is an eigenvector of the matrix $\mathbf{D} = \frac{1}{C}\,\underline{a}\,\underline{a}^T$ associated with the eigenvalue $T$. Since the matrix $\mathbf{D}$ is positive semidefinite of rank 1, the number $T$ is its only nonzero eigenvalue, which is positive and simple. Moreover, note that $\underline{v}^* := \sqrt{C}\,\underline{a}$ is an eigenvector of $\mathbf{D}$ associated with the eigenvalue $\|\underline{a}\|^2/C$. Finally, from (22), we obtain
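The rank-one claim used in the proof is easy to check numerically. The sketch below uses hypothetical values for the vector a and the cost C (they are not taken from any survey) and verifies that $\sqrt{C}\,a$ is an eigenvector of $\frac{1}{C} a a^T$ with eigenvalue $\|a\|^2/C$, without forming the matrix explicitly.

```python
import math


def rank_one_matvec(a, C, v):
    """Return D v for D = (1/C) a a^T, using only the inner product a.v."""
    dot = sum(ai * vi for ai, vi in zip(a, v))
    return [ai * dot / C for ai in a]


# Hypothetical inputs, chosen only for illustration.
a = [0.8, 1.5, 0.6]
C = 100.0

v_star = [math.sqrt(C) * ai for ai in a]        # candidate eigenvector sqrt(C) * a
eigenvalue = sum(ai * ai for ai in a) / C       # ||a||^2 / C

Dv = rank_one_matvec(a, C, v_star)
# Largest componentwise deviation of D v* from eigenvalue * v*.
residual = max(abs(x - eigenvalue * y) for x, y in zip(Dv, v_star))
```

Any nonzero multiple of a would serve equally well as the eigenvector; all other eigenvalues of a rank-one matrix are zero, which is why no numerical eigen-solver is needed in this scheme.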

SRSWOR at the first stage and pps sampling at the second stage
For completeness of the picture of two-stage sampling involving the pps approach, let us consider the situation in which the PSU sample $S^{(I)}$ is drawn by SRSWOR and the SSU samples are drawn with replacement, with probabilities $p_k$ proportional to the size of the $k$th unit. Here, the simplification of Sect. 4.1 is no longer available. This case falls under the general framework developed in Sect. 3.3.
The standard estimator of the total is $\hat{t}$, where $m$ is the number of PSUs drawn by SRSWOR from the total of $M$ PSUs in the population, $n_j$ is the number of "with-replacement" draws from the $j$th PSU, and $K_{j,\ell}$ is the SSU drawn from the $j$th PSU in the $\ell$th draw (with replacement), $j \in V$ (the PSU population of size $M$). Evidently, $\hat{t}$ is unbiased for the population total. Its variance is and, for any $j \in V$, $t_j = \sum_{k \in \mathrm{PSU}_j} y_k$. Consequently, to obtain the optimal allocation of the samples (at the first and second stage) in the domains with given priority weights $\kappa_i$, $i = 1, \ldots, I$, we need to minimize under the constraints given by the priority weights, $T_i = \kappa_i T$, and the expected cost constraint where $C$ is the total expected cost of the survey, $c_{I,i}$ is the cost incurred by a PSU from $V_i$ (assumed to be constant within the domain) and $c_{II,i,j}$ is the cost incurred by an SSU belonging to the $j$th PSU from the $i$th domain.
Since the structure of the problem is exactly the same as for the one considered in Sect. 3.3, we conclude that the optimal allocation has the form

Three-stage sampling without stratification
In multistage sampling, one typically does not go beyond three stages. The three-stage scheme is described in detail, for example, in Särndal et al. (1992, Ch. 4.4.2). The optimal allocation of the sample among the three stages under cost constraints, with the additional simplifying assumption that the sizes of the SSU and TSU (tertiary sampling unit) samples do not depend on the PSU or the SSU, respectively, was already studied in Cochran (1977, Ch. 10.8) (see also Singh 2003, Ch. 10.4). Recently, an optimal allocation procedure using a simplified variance formula with the standard constraints on the total costs was designed in Valliant et al. (2015) as a part of their PracTools R package. An application of such a simple three-stage sampling design is given, for example, in Tate and Hudgens (2007).
In this section, we analyze the eigenproblem approach to the domain-efficient allocation of sample in three-stage sampling, but first we recall the Neyman-type optimal allocation in the case of no domains.
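It is well known that the variance of the standard estimator $\hat{t}$ of the total of a variable $Y$ in a population $U$ under three-stage sampling with SRSWOR at every stage has the form
$$D^2 = L^2\left(\frac{1}{\ell}-\frac{1}{L}\right)S_I^2 + \frac{L}{\ell}\sum_{j=1}^{L} M_j^2\left(\frac{1}{m_j}-\frac{1}{M_j}\right)S_{II,j}^2 + \frac{L}{\ell}\sum_{j=1}^{L}\frac{M_j}{m_j}\sum_{k=1}^{M_j} N_{j,k}^2\left(\frac{1}{n_{j,k}}-\frac{1}{N_{j,k}}\right)S_{III,j,k}^2,$$
where $L$ and $\ell$ denote the number of PSUs, $M_j$ and $m_j$ the number of SSUs in the $j$th PSU, and $N_{j,k}$ and $n_{j,k}$ the number of TSUs in the $(j,k)$th SSU, in the population and in the sample, respectively; moreover, $S_I^2$, $S_{II,j}^2$ and $S_{III,j,k}^2$ denote the population variances for PSUs in $U$, SSUs in the $j$th PSU and TSUs in the $(j,k)$th SSU.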
Then, the minimization of $D^2$ (or $D^2/\tau^2$) under the cost constraints where $c_I^2$, $c_{II,j}^2$ and $c_{III,j,k}^2$ are the costs generated by each PSU, each SSU belonging to the $j$th PSU and each TSU belonging to the $k$th SSU of the $j$th PSU, while $C$ denotes the overall cost of the survey, obtained through the standard Neyman approach leads to the following optimal allocation solution where $\gamma^2 = L\left(L S_I^2 - \sum_{j=1}^{L} M_j S_{II,j}^2\right)$ is assumed to be positive, where $\beta_j^2 = M_j\left(M_j S_{II,j}^2 - \sum_{k=1}^{M_j} N_{j,k} S_{III,j,k}^2\right)$ is also assumed to be positive, and where $\delta_{j,k}^2 = N_{j,k}^2 S_{III,j,k}^2$. The optimal variance assumes the form

Since we will be considering three-stage sampling in subpopulations, all these quantities will be related to a subpopulation by an additional subscript $i = 1, \ldots, I$. As in the previous sections, we are interested in minimizing the relative variances in the subpopulations, provided they satisfy the constraints defined by the priority weights ($\kappa_i$, $i = 1, \ldots, I$), which assume the form where $T$ is unknown and has to be minimized under an additional total EVC constraint, which in the case of three-stage sampling assumes the form where $c_{I,i}^2$, $c_{II,i,j}^2$ and $c_{III,i,j,k}^2$ are the costs generated by each PSU from the $i$th subpopulation, each SSU belonging to the $j$th PSU of the $i$th subpopulation and each TSU belonging to the $k$th SSU of the $j$th PSU of the $i$th subpopulation, while $C$ denotes the overall cost of the survey. Therefore, the Lagrange function, up to a constant shift, is of a rather complicated, though regular, form: Denoting ρ i = τ 2 √ κ i and differentiating $F$ with respect to (1) $\ell_i$, (2) $m_{i,j}$ and (3) $n_{i,j,k}$, we get equations (30), (31) and (32), respectively. Note that from (32), we get Now multiply (32) by $n_{i,j,k}/m_{i,j}$ and insert it into (31). After cancellations, one gets Now multiply (31) by $m_{i,j}/\ell_i$ and insert both into (30). After cancellations, one gets Formulas for $n_{i,j,k}$ and $m_{i,j}$ follow directly from (33) and (34) and from (34) and (35), respectively.
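As a numerical sanity check on the no-domain quantities $\gamma^2$, $\beta_j^2$ and $\delta_{j,k}^2$ defined above, the sketch below evaluates the three-stage variance in two ways: directly from the standard textbook formula for SRSWOR at every stage (assumed here, since it is the form these three quantities are consistent with), and regrouped by the powers of $1/\ell$, $1/m_j$ and $1/n_{j,k}$, which gives $\gamma^2/\ell + (L/\ell)\sum_j \beta_j^2/m_j + (L/\ell)\sum_j (M_j/m_j)\sum_k \delta_{j,k}^2/n_{j,k} - L S_I^2$. All function names and all numbers are hypothetical.

```python
def neyman_quantities(L, S2_I, M, S2_II, N, S2_III):
    """gamma^2, beta_j^2 and delta_{j,k}^2 as defined in the text."""
    gamma2 = L * (L * S2_I - sum(M[j] * S2_II[j] for j in range(L)))
    beta2 = [
        M[j] * (M[j] * S2_II[j] - sum(N[j][k] * S2_III[j][k] for k in range(M[j])))
        for j in range(L)
    ]
    delta2 = [[N[j][k] ** 2 * S2_III[j][k] for k in range(M[j])] for j in range(L)]
    return gamma2, beta2, delta2


def variance_direct(L, S2_I, M, S2_II, N, S2_III, ell, m, n):
    """Three-stage SRSWOR variance of the total estimator (assumed textbook form)."""
    var = L ** 2 * (1 / ell - 1 / L) * S2_I
    for j in range(L):
        var += (L / ell) * M[j] ** 2 * (1 / m[j] - 1 / M[j]) * S2_II[j]
        for k in range(M[j]):
            fpc = 1 / n[j][k] - 1 / N[j][k]
            var += (L / ell) * (M[j] / m[j]) * N[j][k] ** 2 * fpc * S2_III[j][k]
    return var


def variance_regrouped(L, S2_I, M, gamma2, beta2, delta2, ell, m, n):
    """Same variance, written in terms of gamma^2, beta_j^2 and delta_{j,k}^2."""
    var = gamma2 / ell - L * S2_I               # -L * S_I^2 is the constant shift
    for j in range(L):
        var += (L / ell) * beta2[j] / m[j]
        for k in range(M[j]):
            var += (L / ell) * (M[j] / m[j]) * delta2[j][k] / n[j][k]
    return var


# Hypothetical toy population: 3 PSUs with 2, 3 and 2 SSUs.
L, S2_I = 3, 500.0
M, S2_II = [2, 3, 2], [50.0, 40.0, 60.0]
N = [[5, 6], [4, 5, 6], [7, 5]]
S2_III = [[1.0, 2.0], [1.5, 1.0, 0.5], [2.0, 1.0]]
# Hypothetical allocation: ell PSUs, m[j] SSUs, n[j][k] TSUs.
ell, m, n = 2, [1, 2, 1], [[2, 3], [2, 2, 3], [3, 2]]

gamma2, beta2, delta2 = neyman_quantities(L, S2_I, M, S2_II, N, S2_III)
direct = variance_direct(L, S2_I, M, S2_II, N, S2_III, ell, m, n)
regrouped = variance_regrouped(L, S2_I, M, gamma2, beta2, delta2, ell, m, n)
```

The two evaluations agree up to rounding, and the positivity assumptions on $\gamma^2$ and $\beta_j^2$ can be checked on the same inputs before any allocation formula is applied.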
After inserting (33), (34) and (35) into (29), we obtain On the other hand, if we plug (33), (34) and (35) into the constraint (28), we obtain Denote $v_i = \sqrt{\lambda_i}$ and multiply the above equation by $v_i/\kappa_i$. Then, after cancellations and upon denoting

Assume that the matrix $\mathbf{D}$ has a positive eigenvalue $\lambda^*$. Then, it is unique and simple, and the respective eigenvector $\underline{v}^*$ has all coordinates of the same sign. The allocation $\ell$, $m$ and $n$ which minimizes all relative domain-wise variances $T_i$, $i = 1, \ldots, I$ (as well as the relative variance $S$ in the whole population) under the constraints $T_i = \kappa_i T$, $i = 1, \ldots, I$, and under the EVC constraint (29) has the form

Remark 5.1 Note that in the case of no domains, i.e., when $I = 1$, the allocation formulas (37)-(39) as well as the formula for the optimal variance (40) simplify to the Neyman-type allocation and optimal variance formulas given in (24)-(26) and (27), respectively. Note also that only the allocation of the first-stage sample and the optimal base of the variance depend on the eigenvector $\underline{v}^*$. Formulas (38) and (39) for the allocation of the second- and third-stage samples are given directly in terms of population quantities with no reference to the eigenvector $\underline{v}^*$.
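The uniqueness and sign claims about the positive eigenvalue can be illustrated on a toy matrix. The sketch below assumes, purely for illustration, that the matrix has the form $a a^T - \mathrm{diag}(c)$ with positive entries $a_i$ and $c_i$, that is, a rank-one perturbation of a diagonal matrix; the actual entries of $\mathbf{D}$ are not reproduced here, and the function name and numbers are hypothetical. For this form, a positive eigenvalue $\lambda$ must solve $\sum_i a_i^2/(\lambda + c_i) = 1$, whose left-hand side is strictly decreasing in $\lambda$, so there is at most one positive root, and it exists exactly when $\sum_i a_i^2/c_i > 1$.

```python
def positive_eigenpair(a, c):
    """Positive eigenvalue and eigenvector of a a^T - diag(c), or None if there is none.

    Assumes all a_i > 0 and c_i > 0 (illustrative form, not the paper's exact matrix).
    """
    def f(lam):
        return sum(ai * ai / (lam + ci) for ai, ci in zip(a, c))

    if f(0.0) <= 1.0:
        return None                              # no positive eigenvalue
    lo, hi = 0.0, sum(ai * ai for ai in a)       # f(hi) < 1 because every c_i > 0
    for _ in range(200):                         # bisection on the decreasing function f
        mid = 0.5 * (lo + hi)
        if f(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    v = [ai / (lam + ci) for ai, ci in zip(a, c)]  # all coordinates share the sign of a_i
    return lam, v


# Hypothetical inputs.
a = [1.0, 2.0, 1.5]
c = [0.5, 1.0, 2.0]
lam, v = positive_eigenpair(a, c)

# Check D v = lam v componentwise, with D v = a (a.v) - c * v.
dot = sum(ai * vi for ai, vi in zip(a, v))
residual = max(abs(ai * dot - ci * vi - lam * vi) for ai, ci, vi in zip(a, c, v))
```

Under this assumed form the eigenvector is available in closed form once the scalar root is found, so only a one-dimensional root search is needed rather than a full eigen-decomposition.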

Conclusions
In this paper, we search for Neyman-type solutions to domains-efficient allocation in multistage stratified sampling. Such a solution can be seen as an alternative to the purely numerical one proposed in CRH for the stratified single-stage scheme. We develop the eigenproblem method originating in NW and use eigenvalues and eigenvectors to obtain an allocation which, under specified priority coefficients for the constraints on the domains' relative variances, ensures optimal estimation both in the whole population and in the domains. In particular, we consider two- and three-stage sampling. The novelty of the solutions we provide here, compared with what is known for the eigenproblem approach to domains-efficient allocation, lies in several aspects. The most important is that, in contrast to earlier situations, as, for example, in WW, a single total cost constraint is taken into account. In previous papers, two constraints, related to the (expected) sample sizes of the PSUs and SSUs, respectively, were instead imposed jointly. Those papers considered two-stage sampling with SRSWOR (or Hartley-Rao) schemes with stratification either at the first or at the second stage. Here, we apply the eigenproblem methodology also to new sampling schemes: stratified SRSWOR at both stages, pps sampling with replacement combined with SRSWOR (at either the first or the second stage), and three-stage sampling with SRSWOR at each stage. In each of these cases, the allocation which ensures optimality (under given domain priority weights) of the estimators of domain totals is given in terms of eigenvectors of a population-dependent matrix (which typically is a rank-one perturbation of a diagonal matrix). Moreover, the standard errors of the estimates in the domains and in the whole population are given in terms of the respective eigenvalue. The latter allows one to interpret the solution as a direct generalization of the Neyman-type optimal allocation to the multi-domain case.
Another important consequence of the approach we use here is that the analytic formulas we obtained reveal the structure of the optimal allocation. For example, it is visible that only the first-stage optimal allocation is influenced by the eigenvector $\underline{v}^*$ of the population matrix $\mathbf{D}$.