1 Introduction

Consider a stratified SRSWOR in a population U of size N with strata \(U_1,\ldots ,U_I\), which form a partition of U, and let \(N_h\) denote the size of the stratum \(U_h\). For a variable \({\mathcal {Y}}\) in U, we denote \(y_k={\mathcal {Y}}(k)\), \(k\in U\). The standard estimator of the total \(\tau =\sum _{k\in U}\,y_k\) has the form \({\hat{\tau }}_{st}=\sum _{h=1}^I\,N_h{\bar{y}}_h\), where \({\bar{y}}_h=\tfrac{1}{n_h}\sum _{k\in {\mathcal {S}}_h}\,y_k\) with \(n_h\) denoting the size of the sample \({\mathcal {S}}_h\) drawn from the stratum \(U_h\), \(h=1,\ldots ,I\). The variance of this estimator is \(D^2({\hat{\tau }}_{st})=\sum _{h=1}^I\,\left( \tfrac{1}{n_h}-\tfrac{1}{N_h}\right) \,N_h^2S_h^2\), where \(S_h^2=\tfrac{1}{N_h-1}\sum _{k\in U_h}\,(y_k-{\bar{y}}_h)^2\) is the hth stratum population variance.

The basic question for such a setting is the optimal allocation, \({\underline{n}}=(n_1,\ldots ,n_I)\), of the sample among the strata. To this end, one may fix a given (relative) variance of the estimator \({\hat{\tau }}_{st}\) and minimize the costs expressed, for example, by the total sample size \(\sum _{h=1}^I\,n_h\). A related approach is to fix a total sample size \(n=n_1+\cdots +n_I\) and minimize the (relative) variance. Both cases are examples of the classical Neyman optimal allocation procedure which, for example, in the second case results in the allocation \(n_h=n\tfrac{N_hS_h}{\sum _{g=1}^I\,N_gS_g}\), \(h=1,\ldots ,I\). In both settings, the result is a simple consequence of minimization using the Lagrange function or can be derived from the Cauchy–Schwarz inequality.
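As an illustration, the fixed-sample-size Neyman allocation can be sketched in a few lines of Python. The function names and the numerical inputs below are hypothetical and serve only to show that the allocation is proportional to \(N_hS_h\) and gives a smaller variance than proportional allocation for these inputs; box constraints and rounding to integers are ignored.

```python
def neyman_allocation(n, N, S):
    """Split the total sample size n across strata proportionally to N_h * S_h."""
    weights = [N_h * S_h for N_h, S_h in zip(N, S)]
    total = sum(weights)
    return [n * w / total for w in weights]


def variance_of_total(alloc, N, S):
    """Variance of the stratified estimator of the total for a given allocation."""
    return sum((1.0 / n_h - 1.0 / N_h) * N_h**2 * S_h**2
               for n_h, N_h, S_h in zip(alloc, N, S))


# Hypothetical stratum sizes and standard deviations (illustration only).
N = [1000, 500, 1500]
S = [2.0, 8.0, 1.0]
n = 150

neyman = neyman_allocation(n, N, S)
proportional = [n * N_h / sum(N) for N_h in N]

print(neyman)
print(variance_of_total(neyman, N, S), variance_of_total(proportional, N, S))
```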

Recently, there has been growing interest in more refined allocation methods (also in two-stage sampling) based on nonlinear programming ensuring efficient estimation procedures for the whole population, see, for example, Clark and Steel (2000), Lednicki and Wieczorkowski (2003), Clark (2009), Khan et al. (2010), Münnich et al. (2012), Gabler et al. (2012), Ballin and Barcaroli (2013), Valliant et al. (2013, 2015). Much less is known about allocation procedures which are domains efficient or both population and domains efficient; see, for example, Costa et al. (2004), Longford (2006), Choudhry et al. (2012) (referred to as CRH in the sequel), Molefe and Clark (2015) and Keto and Pahkinen (2017). All of them are again based on nonlinear programming and are designed for single-stage sampling schemes. To the best of our knowledge, the only examples of domains-efficient allocation procedures in two-stage sampling schemes are those related to the eigenproblem approach. This approach will be explained and discussed in the sequel.

In the stratified SRSWOR, we may treat strata as domains (consequently, we will change the subscript h denoting a stratum into i denoting a domain), that is, we would like to control not only the overall (relative) variance but also (relative) variances in each of the domains. In the context of both multi- and small-area estimation, Longford (2006) suggested minimizing (under a constraint given by the total sample size) the objective function

$$\begin{aligned} \sum _{i=1}^I\,P_i\,D^2({\bar{y}}_i)+GP_+D^2({\bar{y}}_{st}), \end{aligned}$$
(1)

where \(P_i\), \(i=1,\ldots ,I\), are preassigned relative weights which describe the “importance” of domains, \(P_+=\sum _{i=1}^I\,P_i\) and G is a weight expressing the priority given to the variance of the population mean estimator. In the context of model-assisted methodology, this approach has been recently developed in Molefe and Clark (2015). Mathematically, the problem reduces to the Neyman allocation scheme. Similarly, when a given value is assigned for (1), the total sample size is minimized. The weights \((P_i,\,i=1,\ldots ,I)\) are designed in order to cover, at least to some extent, jointly the optimality issue for domains and for the whole population. As pointed out in Friedrich and Münnich (2018), the approach of Gabler et al. (2012) can be used also in this context (actually, they mention the case with \(GP_+=0\)). Since the objective function (1) is a weighted sum of domain and population variances, this approach does not give any convenient tool to control the quality of the estimators of the population and domain means. Moreover, it is not clear how to assess the impact of the values of the weights \(P_i\), \(i=1,\ldots ,I\), and \(GP_+\) on the variances \(D^2({\bar{y}}_i)\), \(i=1,\ldots ,I\), and \(D^2({\bar{y}}_{st})\). These issues are clearly visible in the numerical example given in “Appendix,” where this approach is compared with the one we propose in this paper.

Our approach can be treated as an alternative to the direct setting of CRH. They proposed an approach in which both multi- and small-area estimation were also considered. CRH minimize the total sample size

$$\begin{aligned} g({\underline{n}})=n_1+\cdots +n_I \end{aligned}$$

under the constraints for relative variances of estimators of domain totals

$$\begin{aligned} T_i:=N_i^2\left( \tfrac{1}{n_i}-\tfrac{1}{N_i}\right) \,\tfrac{S_i^2}{\tau _i^2}\le RV_{oi},\quad i=1,\ldots ,I, \end{aligned}$$
(2)

where \(\tau _i=\sum _{k\in U_i}\,y_k\) is the total for the domain \(U_i\), \(i=1,\ldots ,I\), and the constraint on the relative variance of the estimator of population total

$$\begin{aligned} {S}:=\tfrac{1}{\tau ^2}\,\sum _{i=1}^I\,\left( \tfrac{1}{n_i}-\tfrac{1}{N_i}\right) N_i^2 S_i^2=\tfrac{1}{\tau ^2}\,\sum _{i=1}^I\,T_i\tau _i^2\le RV_o. \end{aligned}$$
(3)

Note that in this approach one specifies conditions for each of the domains and for the whole population separately. The problem was solved under additional box constraints of the form \(0<n_i\le N_i\), \(i=1,\ldots ,I\), by a nonlinear programming method involving the popular Newton–Raphson algorithm.
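To make constraints (2) and (3) concrete, the following Python sketch evaluates the domain relative variances \(T_i\) and the overall relative variance \({S}\) for a given allocation and checks feasibility together with the box constraints. It is not the CRH algorithm itself (no optimization is performed); the function names, the bounds and the numerical inputs are hypothetical.

```python
def relative_variances(alloc, N, sd, tau_dom):
    """Return (T, S_rel): domain relative variances (2) and overall relative variance (3)."""
    T = [N_i**2 * (1.0 / n_i - 1.0 / N_i) * s_i**2 / t_i**2
         for n_i, N_i, s_i, t_i in zip(alloc, N, sd, tau_dom)]
    tau = sum(tau_dom)  # population total is the sum of the domain totals
    S_rel = sum(T_i * t_i**2 for T_i, t_i in zip(T, tau_dom)) / tau**2
    return T, S_rel


def is_feasible(alloc, N, sd, tau_dom, rv_dom, rv_pop):
    """Check the box constraints 0 < n_i <= N_i and the bounds in (2) and (3)."""
    T, S_rel = relative_variances(alloc, N, sd, tau_dom)
    box = all(0 < n_i <= N_i for n_i, N_i in zip(alloc, N))
    return box and all(T_i <= r for T_i, r in zip(T, rv_dom)) and S_rel <= rv_pop


# Hypothetical two-domain population and allocation (illustration only).
N = [1000, 500]
sd = [2.0, 8.0]
tau_dom = [5000.0, 10000.0]
alloc = [40.0, 80.0]

T, S_rel = relative_variances(alloc, N, sd, tau_dom)
print(T, S_rel)
print(is_feasible(alloc, N, sd, tau_dom, rv_dom=[0.004, 0.002], rv_pop=0.0012))
```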

The NLP solution, such as the one described above, is an efficient tool for applications. Such purely numerical approaches to allocation problems are popular in real surveys. A drawback of such methods is that they give only numerical values and do not provide any information on the structure of the solution, which, for example, can be important for designing priorities for the domains.

Now we will describe an alternative approach to the problem of domains-overall-efficient allocation in the sampling scheme considered in CRH. The approach makes it possible to see the analytic form of the solution. The respective expression is based on a unique direction in the space \({{\mathbb {R}}}^I\), where the dimension I is equal to the number of domains. The rest of this section is just a warm-up illustration for the eigenproblem methodology, which we will apply in full in several multistage schemes in the main part of the paper.

We would like to minimize each \(T_i\), \(i=1,\ldots ,I\), as well as \({S}\) under the constraint on the total size of the sample. It can be achieved in the following way. To each domain \(U_i\), its (known) priority weight \(\kappa _i>0\) is assigned. These weights describe ratios of relative variances through

$$\begin{aligned} \tfrac{T_i}{T_j}=\tfrac{\kappa _i}{\kappa _j}\,\quad \forall \,i,j=1,\ldots ,I. \end{aligned}$$

Equivalently, we can write

$$\begin{aligned} \left( \tfrac{1}{n_i}-\tfrac{1}{N_i}\right) \tfrac{N_i^2S_i^2}{\tau _i^2}=\kappa _i T,\quad i=1,\ldots ,I, \end{aligned}$$
(4)

where T is an unknown positive constant. This approach allows one to fully control the variability across domains of the (relative) variances of the estimators; see the numerical example in “Appendix.” Moreover, under the above constraint, the unknown parameter T controls not only relative variances in domains but also the overall relative variance \({S}\) of the estimator of the population total. It follows from the fact that under (4), due to (3), the relative overall variance S can be written as

$$\begin{aligned} {S}=\left( \tfrac{1}{\tau ^2}\,\sum _{i=1}^I\,\kappa _i\tau ^2_i \right) \,T. \end{aligned}$$

Therefore, when we optimize relative variances within domains, the overall relative variance gets automatically optimized as well. This general rule will hold also for the multistage schemes considered in the sequel.

Upon denoting \(\gamma _i^2=\tfrac{N_i^2S_i^2}{\tau _i^2}\), \(i=1,\ldots ,I\), Eq. (4) can be written as

$$\begin{aligned} \tfrac{\gamma _i^2}{\kappa _i\,n_i}-\tfrac{\gamma _i^2}{\kappa _iN_i}=T,\quad i=1,\ldots ,I. \end{aligned}$$
(5)

Now we denote \(v_i=\tfrac{n_i\sqrt{\kappa _i}}{\gamma _i}\), \(i=1,\ldots ,I\), and, due to (5), the constraint \(\sum _{i=1}^I\,n_i=n\) assumes the form

$$\begin{aligned} 1=\tfrac{n}{\sum _{i=1}^I\,\tfrac{\gamma _i}{\sqrt{\kappa _i}}\,v_i}. \end{aligned}$$
(6)

Multiplying (5) by \(v_i\) and using (6), we get

$$\begin{aligned} n^{-1}\tfrac{\gamma _i}{\sqrt{\kappa _i}}\,\sum _{r=1}^I\,\tfrac{\gamma _r}{\sqrt{\kappa _r}}\,v_r-\tfrac{\gamma _i^2}{\kappa _i\,N_i}\,v_i=Tv_i,\quad i=1,\ldots ,I, \end{aligned}$$

which is equivalent to

$$\begin{aligned} \left( \tfrac{{\underline{a}}\,{\underline{a}}^T}{n}-\mathrm {diag}({\underline{c}})\right) \,{\underline{v}}=T{\underline{v}}\end{aligned}$$

with \({\underline{v}}=(v_1,\ldots ,v_I)^T\), where

$$\begin{aligned} {\underline{a}}=\left( \tfrac{\gamma _i}{\sqrt{\kappa _i}},\,i=1,\ldots ,I\right) ^T,\quad \quad {\underline{c}}=\left( \tfrac{\gamma _i^2}{\kappa _i\,N_i},\,i=1,\ldots ,I\right) ^T \end{aligned}$$

and \(\mathrm {diag}({\underline{c}})\) is a diagonal matrix with the vector \({\underline{c}}\) on its diagonal. Consequently, by the Perron–Frobenius theorem (for more details, see the Proof of Theorem 2.1), the matrix \(\mathbf{D}=\tfrac{{\underline{a}}\,{\underline{a}}^T}{n}-\mathrm {diag}({\underline{c}})\) has a unique positive eigenvalue \(\lambda ^*\), which is simple, and the respective eigenspace is spanned by a vector \({\underline{v}}^*\) with all components positive. This vector \({\underline{v}}^*\) (up to normalization), that is, the respective direction in the space \({{\mathbb {R}}}^I\), determines the efficient allocation. Therefore, in our problem above, \(T=\lambda ^*\), \({\underline{v}}=(v_i,\,i=1,\ldots ,I)={\underline{v}}^*\) and thus

$$\begin{aligned} n_i\propto \tfrac{\gamma _i}{\sqrt{\kappa _i}}\,v^*_i,\quad i=1,\ldots ,I. \end{aligned}$$

Using again the constraint on the sample size, we see that

$$\begin{aligned} n_i=n\,\tfrac{\gamma _i\,v_i^*}{\sqrt{\kappa _i}\,\sum _{r=1}^I\,\tfrac{\gamma _r}{\sqrt{\kappa _r}}\,v_r^*},\quad i=1,\ldots ,I. \end{aligned}$$

Moreover, with this optimal allocation

$$\begin{aligned} T_i=\kappa _i \lambda ^*,\quad i=1,\ldots ,I\qquad \text{ and }\qquad {S}=\tfrac{\lambda ^*}{\tau ^2}\,\sum _{i=1}^I\,\kappa _i\tau _i^2. \end{aligned}$$
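The construction above can be sketched numerically as follows. This is a minimal illustration, assuming NumPy is available and using its symmetric eigensolver `numpy.linalg.eigh`; the function name `eigen_allocation` and the input values are hypothetical. The sketch builds \(\mathbf{D}\), takes the eigenvector of the largest eigenvalue, and rescales it to the allocation; the last line recomputes the relative variances \(T_i\), so that \(T_i/\kappa _i=\lambda ^*\) can be checked directly.

```python
import numpy as np


def eigen_allocation(n, N, S, tau, kappa):
    """Allocation n_i with T_i / kappa_i constant, via D = a a^T / n - diag(c)."""
    N, S, tau, kappa = (np.asarray(x, dtype=float) for x in (N, S, tau, kappa))
    gamma = N * S / tau                      # gamma_i = N_i S_i / tau_i
    a = gamma / np.sqrt(kappa)
    c = gamma**2 / (kappa * N)
    D = np.outer(a, a) / n - np.diag(c)      # symmetric, so eigh applies
    eigvals, eigvecs = np.linalg.eigh(D)     # eigenvalues in ascending order
    lam, v = eigvals[-1], eigvecs[:, -1]
    if lam <= 0:
        raise ValueError("D has no positive eigenvalue")
    v = v * np.sign(v.sum())                 # eigh determines v only up to sign
    alloc = n * a * v / np.dot(a, v)         # rescale so that the sizes sum to n
    return lam, alloc


# Hypothetical inputs (illustration only).
N = [1000, 500, 1500]
S = [2.0, 8.0, 1.0]
tau = [5000.0, 10000.0, 3000.0]
kappa = [1.0, 1.0, 2.0]
lam, alloc = eigen_allocation(150, N, S, tau, kappa)

# Relative variances implied by the allocation; T_i / kappa_i should equal lam.
T = (1.0 / alloc - 1.0 / np.array(N)) * (np.array(N) * np.array(S) / np.array(tau))**2
print(lam, alloc, T / np.array(kappa))
```

As with the Neyman allocation, the resulting sizes are real numbers and are not guaranteed to respect the box constraints.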

Remark 1.1

Of course, there is an alternative numerical solution of this problem—see, for example, Lednicki and Wesołowski (1994) (referred to as LW below). From (5), one gets

$$\begin{aligned} n_i=\tfrac{\gamma _i^2N_i}{\kappa _i N_i T+\gamma _i^2},\quad i=1,\ldots ,I. \end{aligned}$$
(7)

Now the sample size constraint leads to the equation

$$\begin{aligned} n=\sum _{i=1}^I\,\tfrac{\gamma _i^2N_i}{\kappa _iN_i T+\gamma _i^2} \end{aligned}$$
(8)

for unknown T. It is obvious that there exists a unique positive solution \(T=T^*\), which has to be derived numerically. Then, the allocation is given by (7) with \(T=T^*\).
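A minimal sketch of this numerical route is given below. It solves (8) by bisection, which is one possible choice (the text does not prescribe a root-finding method), relying on the fact that the right-hand side of (8) decreases from \(\sum _i N_i\) at \(T=0\) to zero; hence \(n<\sum _i N_i\) is assumed. The function name and the inputs are hypothetical.

```python
def solve_allocation(n, N, gamma2, kappa, iterations=200):
    """Solve (8) for T by bisection and return (T, allocation from (7))."""
    def sizes(T):
        return [g2 * N_i / (k * N_i * T + g2) for g2, N_i, k in zip(gamma2, N, kappa)]

    if not 0 < n < sum(N):
        raise ValueError("need 0 < n < sum(N) for a positive solution")
    lo, hi = 0.0, 1.0
    while sum(sizes(hi)) > n:        # enlarge the bracket until the total drops below n
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sum(sizes(mid)) > n:      # total size is decreasing in T
            lo = mid
        else:
            hi = mid
    T = 0.5 * (lo + hi)
    return T, sizes(T)


# Hypothetical inputs (illustration only); gamma2[i] stands for gamma_i^2.
N = [1000, 500, 1500]
gamma2 = [0.16, 0.16, 0.25]
kappa = [1.0, 1.0, 2.0]
n = 150

T_star, alloc = solve_allocation(n, N, gamma2, kappa)
print(T_star, alloc)
```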

As we mentioned above, there are alternatives to the eigenproblem approach to the (domains-population)-efficient allocation issue in the case of SRSWOR in domains. Apart from the possibility mentioned in Remark 1.1, the same allocation can be obtained (up to box constraints) by the CRH methodology if \(T_i:=\kappa _i T^*\) (with the value of \(T^*\) as computed in the eigenproblem procedure) and g is minimized through the NLP procedure. Similarly, each of these three approaches (CRH, LW and eigenproblem) can be applied in the case of stratified SRSWOR in each of the domains. It suffices to start the procedure with the Neyman allocation in every domain.

However, the situation changes drastically when two-stage (or multistage) sampling is taken into account. Then, as will be explained in the following sections, even in the simplest case of two-stage sampling with SRSWOR at both stages (and no stratification), the formula relating the sizes of samples at the first and the second stage with variances, an analog of the one which led to (7), does not yield a simple equation, such as (8) in Remark 1.1, for the unknown T. Therefore, such a direct numerical approach is not possible. To the best of our knowledge, no analogs of the NLP procedure from CRH are available in the literature in the multistage setting. Nevertheless, nonlinearly constrained optimization solvers, for example MINOS, MOSEK or IPOPT, available on the Web through the NEOS server, can be used as potential tools for NLP answers to the two-stage extension of the original CRH problem.

It appears that in this, as well as in more complicated situations, the optimal allocation issue can be conveniently handled through the eigenproblem methodology, which provides insight into the structure of the optimal solutions, though in some non-typical cases it may give only approximately optimal results. It suffers from the same drawbacks as the original Neyman optimal allocation; i.e., the natural box constraints can be violated and the solution typically is not integer valued. The main aim of the present paper is to show how such an eigenproblem approach works in several new settings involving multistage sampling. In Sect. 2, we consider two-stage sampling with stratified SRSWOR at both stages. Special simplified cases are described in Sect. 3. Then, we deal with the situation in which at one of the stages \(\mathrm {pps}\) sampling with replacement is used while at the other the sample is drawn according to the SRSWOR. Finally, in Sect. 5 we analyze three-stage sampling with SRSWOR at every stage. In all these cases, the allocation problem with the total cost constraint is solved via an eigenproblem for rank-one perturbations of diagonal matrices. The case of the \(\mathrm {pps}\) sampling with replacement at the first stage and the SRSWOR at the second stage is rather special: then the eigenproblem is for a matrix of rank one and thus an analytic form of the eigenvector responsible for allocation is available.

The eigenproblem approach to efficient allocation in domains was originally proposed in Niemiro and Wesołowski (2001) (NW in the sequel) and recently developed in Wesołowski and Wieczorkowski (2017) (WW in the sequel). The major difference between the setting of these two papers and our setting is the form of the cost constraints: Here, we consider a single total cost constraint, while two constraints, one on the size of the PSUs sample and one on the expected size of the SSUs sample, were imposed jointly in these earlier papers. There are important consequences of such a change in the cost constraints. Due to the form of the cost constraint, our solution is a direct generalization of the Neyman-type allocation. In particular, it gives the Neyman-type solution in the case when there are no domains (i.e., the whole population is a single domain). At the technical level, the population matrix \(\mathbf{D}\), on which everything is based, is a rank-one perturbation of a diagonal matrix, while it was a rank-two perturbation of a diagonal matrix in NW and WW. There is also an important difference from NW and WW with respect to the structure of the allocation. The common feature is that there is an eigenvector \({\underline{v}}^*\) of the matrix \(\mathbf{D}\) which plays an important role in the optimal allocation; however, in the case we consider here, it influences only the optimal allocation at the first stage, while in the cases considered in NW and WW the optimal allocation at both stages depends explicitly on the respective version of \({\underline{v}}^*\).

2 Two-stage sampling with stratified SRSWOR at both stages

For any \(i=1,\ldots ,I\), the subpopulation \({\mathcal {V}}_i\) of primary sampling units (PSUs) of the ith domain in U is stratified: \({\mathcal {V}}_i=\bigcup _{h=1}^{H_i}\,{\mathcal {V}}_{i,h}\). Let \(M_{i,h}\) denote the number of PSUs in \({\mathcal {V}}_{i,h}\). Also, every PSU, understood as a collection of secondary sampling units (SSUs), is stratified: A PSU j from the stratum \({\mathcal {V}}_{i,h}\) is stratified into \(\bigcup _{g=1}^{G_{i,h,j}}\,{\mathcal {W}}_{i,h,j,g}\).

A sample \({\mathcal {S}}\) is chosen as follows: At the first stage, a PSU’s sample \({\mathcal {S}}^{(I)}_{i,h}\) of size \(m_{i,h}\) is selected from \({\mathcal {V}}_{i,h}\) according to the SRSWOR, \(h=1,\ldots ,H_i\), \(i=1,\ldots ,I\). At the second stage for each PSU \(j\in {\mathcal {S}}^{(I)}_{i,h}\), an SSU’s sample \({\mathcal {S}}^{(II)}_{i,h,j,g}\) of size \(n_{i,h,j,g}\) is selected from \({\mathcal {W}}_{i,h,j,g}\), according to the SRSWOR, \(g=1,\ldots ,G_{i,h,j}\). Let \(N_{i,h,j,g}\) denote the number of SSUs in \({\mathcal {W}}_{i,h,j,g}\). Finally, we set

$$\begin{aligned} {\mathcal {S}}=\bigcup _{i=1}^I\,\bigcup _{h=1}^{H_i}\,\bigcup _{j\in {\mathcal {S}}^{(I)}_{i,h}}\,\bigcup _{g=1}^{G_{i,h,j}}\,{\mathcal {S}}_{i,h,j,g}^{(II)}. \end{aligned}$$

As an example, one can consider a survey of the population of students in a given country with parameters to be estimated at the regional level (subpopulations) and at the whole-country level as well. Then, SSUs are just students, while PSUs are schools. Schools in each region are stratified into educational districts, and students in each school are stratified into grades. That is, U is the population of students, \({\mathcal {V}}_i\) is the subpopulation of schools in the ith region, while \({\mathcal {V}}_{i,h}\) is the stratum of schools in the hth district of \({\mathcal {V}}_i\) and \(M_{i,h}\) is the number of schools in \({\mathcal {V}}_{i,h}\). Moreover, \({\mathcal {W}}_{i,h,j,g}\) denotes students of grade g of the jth school from district \({\mathcal {V}}_{i,h}\) and \(N_{i,h,j,g}\) denotes the number of students in \({\mathcal {W}}_{i,h,j,g}\). A sample \({\mathcal {S}}^{(I)}_{i,h}\) of \(m_{i,h}\) schools is drawn according to SRSWOR from \({\mathcal {V}}_{i,h}\), and then a sample \({\mathcal {S}}^{(II)}_{i,h,j,g}\) of \(n_{i,h,j,g}\) students is drawn by SRSWOR from \({\mathcal {W}}_{i,h,j,g}\) for each school j belonging to \({\mathcal {S}}^{(I)}_{i,h}\). Here and below in the formulas for variances, a single subscript i refers to region \({\mathcal {V}}_i\), a double subscript ih refers to district \({\mathcal {V}}_{i,h}\), a triple subscript ihj refers to the jth school from district \({\mathcal {V}}_{i,h}\) and a quadruple subscript ihjg refers to grade g of the jth school in district \({\mathcal {V}}_{i,h}\).

The variance of the \(\pi \)-estimator of the total of \({\mathcal {Y}}\) over the subpopulation \(U_i\) has the form, see, for example, Särndal et al. (1992, Ch. 4.3),

$$\begin{aligned}&\sum _{h=1}^{H_i}\,\left( \tfrac{1}{m_{i,h}}-\tfrac{1}{M_{i,h}}\right) M^2_{i,h}D^2_{i,h}\\&\qquad +\,\sum _{h=1}^{H_i}\tfrac{M_{i,h}}{m_{i,h}}\, \sum _{j\in {\mathcal {V}}_{i,h}}\,\sum _{g=1}^{G_{i,h,j}}\,\left( \tfrac{1}{n_{i,h,j,g}}-\tfrac{1}{N_{i,h,j,g}}\right) N_{i,h,j,g}^2S_{i,h,j,g}^2, \end{aligned}$$

where

$$\begin{aligned} D^2_{i,h}=\tfrac{1}{M_{i,h}-1}\sum _{j\in {\mathcal {V}}_{i,h}}\left( t_j-{\bar{t}}_{i,h}\,\right) ^2 \end{aligned}$$

with

$$\begin{aligned} t_j=\sum _{k\in PSU(j)}\,y_k\qquad \forall \;\;\mathrm {PSU}(j),\quad {\bar{t}}_{i,h}=\tfrac{1}{M_{i,h}}\sum _{j\in {\mathcal {V}}_{i,h}}\,t_j; \end{aligned}$$

and

$$\begin{aligned} S_{i,h,j,g}^2=\tfrac{1}{N_{i,h,j,g}-1}\,\sum _{k\in {\mathcal {W}}_{i,h,j,g}}\,\left( y_k-{\bar{t}}_{i,h,j,g}\right) ^2 \end{aligned}$$

with

$$\begin{aligned} {\bar{t}}_{i,h,j,g}=\tfrac{1}{N_{i,h,j,g}}\,\sum _{k\in {\mathcal {W}}_{i,h,j,g}}\,y_k. \end{aligned}$$

The actual cost of the survey generated by the sample \({\mathcal {S}}\) can be modeled by the quantity

$$\begin{aligned} \sum _{i=1}^I\,\sum _{h=1}^{H_i}\,\left( c_{I,i,h}^2\,m_{i,h}+ \sum _{j\in {\mathcal {S}}^{(I)}_{i,h}}\,c_{II,i,h,j}^2\,\sum _{g=1}^{G_{i,h,j}}\,n_{i,h,j,g}\right) , \end{aligned}$$

where \(c_{I,i,h}^2>0\) and \(c_{II,i,h,j}^2>0\) are costs generated by a single PSU from hth stratum of PSUs in ith domain (we assume that it is constant within the stratum) and a single SSU from jth PSU of hth stratum of PSUs in ith domain (we assume that it is constant within the PSU), respectively. Obviously, due to randomness of \({\mathcal {S}}^{(I)}_{i,h}\), the actual cost is a random variable. In such a situation, when one wants to impose a constraint on the total cost, the standard approach is to impose a constraint on its expected variable cost (EVC), see, for example, Ch. 12.8.1 of Särndal et al. (1992), which in the case considered here assumes the form:

$$\begin{aligned} \sum _{i=1}^I\,\sum _{h=1}^{H_i}\,c_{I,i,h}^2m_{i,h} +\sum _{i=1}^I\,\sum _{h=1}^{H_i}\,\tfrac{m_{i,h}}{M_{i,h}}\,\sum _{j\in {\mathcal {V}}_{i,h}}\,c_{II,i,h,j}^2\,\sum _{g=1}^{G_{i,h,j}}\,n_{i,h,j,g}=C. \end{aligned}$$
(9)
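For concreteness, the left-hand side of (9) can be evaluated as in the following Python sketch. The nested-dictionary layout (keys `m`, `M`, `cI2`, `psus`, `cII2`, `n`) is a hypothetical data structure chosen for the illustration; `cI2` and `cII2` hold the squared cost coefficients \(c_{I,i,h}^2\) and \(c_{II,i,h,j}^2\), and the list of strata runs over all pairs (i, h).

```python
def expected_variable_cost(strata):
    """Left-hand side of (9): expected cost of the two-stage sample."""
    total = 0.0
    for s in strata:
        first_stage = s["cI2"] * s["m"]
        # Each PSU of the stratum enters the sample with probability m / M.
        second_stage = (s["m"] / s["M"]) * sum(
            psu["cII2"] * sum(psu["n"]) for psu in s["psus"]
        )
        total += first_stage + second_stage
    return total


# Hypothetical single stratum with M = 4 PSUs, each split into two SSU strata.
strata = [{
    "m": 2, "M": 4, "cI2": 10.0,
    "psus": [{"cII2": 1.0, "n": [3, 2]} for _ in range(4)],
}]
evc = expected_variable_cost(strata)
print(evc)
```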

We also assume that priority weights \((\kappa _i,\,i=1,\ldots ,I)\in (0,1)^I\), such that \(\sum _{i=1}^I\,\kappa _i=1\), for relative variances of estimators of means in subpopulations are preassigned, that is

$$\begin{aligned} T_i= & {} \tfrac{1}{\tau _i^2}\sum _{h=1}^{H_i}\,\tfrac{1}{m_{i,h}}\left( \gamma _{i,h}^2+M_{i,h} \sum _{j\in {\mathcal {V}}_{i,h}}\,\sum _{g=1}^{G_{i,h,j}}\,\tfrac{\beta _{i,h,j,g}^2}{n_{i,h,j,g}}\right) \nonumber \\&-\tfrac{\sum _{h=1}^{H_i}\,M_{i,h}\,D_{i,h}^2}{\tau _i^2}=\kappa _i\,T,\qquad i=1,\ldots ,I, \end{aligned}$$
(10)

where

$$\begin{aligned} \gamma _{i,h}^2=M_{i,h}\,\left( M_{i,h}D_{i,h}^2-\sum _{j\in {\mathcal {V}}_{i,h}}\,\sum _{g=1}^{G_{i,h,j}}\,N_{i,h,j,g}\,{S}_{i,h,j,g}^2\right) \end{aligned}$$

and

$$\begin{aligned} \beta _{i,h,j,g}=N_{i,h,j,g}\,S_{i,h,j,g} \end{aligned}$$

for \(\tau _i=\sum _{h=1}^{H_i}\,\sum _{j\in {\mathcal {V}}_{i,h}}\,t_j\), \(i=1,\ldots ,I\). We wrote above \(\gamma _{i,h}^2\) since we will be assuming that it is nonnegative.
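The population quantities entering (10) can be computed from unit-level data along the following lines. This is a sketch for a single stratum \({\mathcal {V}}_{i,h}\), assuming the values \(y_k\) are known for every unit and every SSU stratum has at least two units; the function name and the toy data are hypothetical. It uses `statistics.variance` from the Python standard library, which divides by the number of observations minus one, as in the definitions of \(D^2_{i,h}\) and \(S^2_{i,h,j,g}\).

```python
from statistics import variance


def stratum_quantities(psus):
    """psus[j][g] is the list of y-values in SSU stratum g of PSU j.

    Returns (M, D2, gamma2, betas) for one PSU stratum.
    """
    M = len(psus)
    totals = [sum(sum(w) for w in psu) for psu in psus]   # PSU totals t_j
    D2 = variance(totals)                                  # divisor M - 1
    within = 0.0
    betas = []
    for psu in psus:
        row = []
        for w in psu:
            N_w = len(w)
            S2 = variance(w)                               # divisor N - 1
            within += N_w * S2
            row.append(N_w * S2 ** 0.5)                    # beta = N * S
        betas.append(row)
    gamma2 = M * (M * D2 - within)
    return M, D2, gamma2, betas


# Hypothetical toy stratum: 3 PSUs, each with 2 SSU strata of 2 units.
psus = [
    [[1.0, 3.0], [2.0, 2.0]],
    [[10.0, 14.0], [5.0, 7.0]],
    [[0.0, 2.0], [1.0, 1.0]],
]
M, D2, gamma2, betas = stratum_quantities(psus)
print(M, D2, gamma2, betas)
```

A negative value of `gamma2` would signal that the nonnegativity assumption does not hold for the stratum.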

We want to find the allocation, that is, a set of two tables: a two-way table \({\underline{m}}=(m_{i,h})\) and a four-way table \({\underline{n}}=(n_{i,h,j,g})\), which gives minimal domain-wise relative variances \(T_i\), \(i=1,\ldots ,I\), and minimal relative overall variance \({S}\), under the constraints (10) imposed through priority weights and the EVC constraint (9).

The result below says that this can be achieved by searching for a positive eigenvalue of a certain matrix based on population quantities and cost coefficients. The allocation is obtained from the respective eigenvector. The approach parallels earlier developments in this setting where, instead of using a single total average cost constraint, the first-stage and second-stage costs were treated separately. In particular, NW in 2001 considered a two-stage scheme with separate constraints for the sizes of the PSUs and SSUs samples and with stratified sampling either at the first or at the second stage. As has already been mentioned, a similar problem has been recently investigated in WW for two-stage stratified SRSWOR at both stages as well as for a scheme with the stratified Hartley–Rao scheme at the first stage and stratified SRSWOR at the second stage (some variations of these two basic setups were also considered there). In that paper, again two constraints were jointly imposed: one for the cost incurred by the PSUs sample size, \(\sum _{i=1}^I\sum _{h=1}^{H_i}m_{i,h}=m\), and one for the cost generated by the expected SSUs sample size, \(\sum _{i=1}^I\,\sum _{h=1}^{H_i}\,\tfrac{m_{i,h}}{M_{i,h}}\,\sum _{j\in {\mathcal {V}}_{i,h}}\,\sum _{g=1}^{G_{i,h,j}}\,n_{i,h,j,g}=n\) (these formulas obviously refer to the stratified SRSWOR at both stages). In the meantime, the eigenproblem approach has been further developed in a series of papers: Kozak (2004) (a multivariate version of NW was considered with an application to agricultural surveys) and Kozak and Zieliński (2005) (the original eigenproblem approach from NW, where it was assumed that relative variances are the same for all domains, was adapted to include priority weights for domains; also an application related to a real forestry survey was given). Only single-stage schemes were considered in both these papers. In the context we consider here, probably the most interesting paper is Kozak et al. (2008). These authors were concerned with two-stage sampling with stratification at the first stage together with a single cost constraint similar to (9) and domains-related constraints like (10). However, their approach was restricted to the case when the SSU sample sizes are the same for all PSUs in a given stratum of a given domain. Also, they did not consider stratification at the second stage. The latter restriction does not seem to be as serious as the former.

In our main result below, we use the notation introduced earlier in this section.

Theorem 2.1

Assume that \(M_{i,h}D_{i,h}^2>\sum _{j\in {\mathcal {V}}_{i,h}}\,\sum _{g=1}^{G_{i,h,j}}\,N_{i,h,j,g}\,S_{i,h,j,g}^2\) for all \(h=1,\ldots ,H_i\), \(i=1,\ldots ,I\). Let \(\mathbf{D}=\tfrac{{\underline{a}}\,{\underline{a}}^T}{C}-\mathrm {diag}({\underline{c}})\) where \({\underline{a}}=(a_i,\,i=1,\ldots ,I)^T\), \({\underline{c}}=(c_i,\,i=1,\ldots ,I)\),

$$\begin{aligned} a_i=\tfrac{\nu _i}{\rho _i},\quad \text{ with }\quad \nu _i=\sum _{h=1}^{H_i}\left( c_{I,i,h}\gamma _{i,h}+\sum _{j\in {\mathcal {V}}_{i,h}}\,c_{II,i,h,j} \,\sum _{g=1}^{G_{i,h,j}}\,\beta _{i,h,j,g}\right) ,\quad \rho _i=\tau _i\sqrt{\kappa _i}\end{aligned}$$

and

$$\begin{aligned} c_i=\tfrac{1}{\rho _i^2}\,\sum _{h=1}^{H_i}\,M_{i,h}\,D_{i,h}^2,\quad i=1,\ldots ,I. \end{aligned}$$

Assume that \(\mathbf{D}\) has a positive eigenvalue \(\lambda ^*\) with a respective eigenvector \({\underline{v}}^*=(v_1^*,\ldots ,v_I^*)^T\).

Then, \(\lambda ^*\) is simple and unique and \({\underline{v}}^*\) has all coordinates of the same sign.

The allocation which minimizes all relative variances in domains \(T_i\), \(i=1,\ldots ,I\), (as well as the relative variance \({S}\) in the whole population) under domain relative variance constraints (10) and overall EVC constraint (9) is given by

$$\begin{aligned} m_{i,h}=C\tfrac{v_i^*\gamma _{i,h}}{\rho _ic_{I,i,h}\sum _{r=1}^I\,v_r^*\,\nu _r/\rho _r} \end{aligned}$$
(11)

and

$$\begin{aligned} n_{i,h,j,g}=\tfrac{c_{I,i,h}M_{i,h}\beta _{i,h,j,g}}{c_{II,i,h,j}\gamma _{i,h}}. \end{aligned}$$
(12)

Moreover, the minimal relative variances in the domains are \(T_i=\kappa _i\,T\), \(i=1,\ldots ,I\) and the overall relative variance is \({S}=\tfrac{T}{\tau ^2}\sum _{i=1}^I \,\kappa _i\tau _i^2\) with

$$\begin{aligned} T=\lambda ^*=\tfrac{1}{\sum _{i=1}^I\rho _i^2}\,\left[ \tfrac{1}{C}\,\left( \sum _{i=1}^I\,\tfrac{\rho _i}{v_i^*}\,\nu _i\right) \left( \sum _{i=1}^I\,\tfrac{v_i^*}{\rho _i}\,\nu _i\right) -\sum _{i=1}^I\,\sum _{h=1}^{H_i}\,M_{i,h}\,D_{i,h}^2\right] . \end{aligned}$$
(13)
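The allocation of Theorem 2.1 can be sketched in Python as follows, assuming NumPy is available. The nested-dictionary input (keys `tau`, `kappa`, `strata`, `M`, `D2`, `gamma`, `cI`, `psus`, `cII`, `beta`) and all numerical values are hypothetical; `cI` and `cII` denote the unsquared coefficients \(c_{I,i,h}\) and \(c_{II,i,h,j}\). The sketch forms \({\underline{a}}\), \({\underline{c}}\) and \(\mathbf{D}\), extracts the eigenpair of the largest eigenvalue, and evaluates (11) and (12); box constraints and integrality are not enforced.

```python
import numpy as np


def two_stage_allocation(domains, C):
    """Return (lam, m, n): eigenvalue and allocation tables from (11) and (12)."""
    nu = np.array([
        sum(s["cI"] * s["gamma"]
            + sum(p["cII"] * sum(p["beta"]) for p in s["psus"])
            for s in d["strata"])
        for d in domains
    ])
    rho = np.array([d["tau"] * np.sqrt(d["kappa"]) for d in domains])
    a = nu / rho
    c = np.array([sum(s["M"] * s["D2"] for s in d["strata"]) for d in domains]) / rho**2
    D = np.outer(a, a) / C - np.diag(c)
    eigvals, eigvecs = np.linalg.eigh(D)      # ascending eigenvalues, D symmetric
    lam, v = eigvals[-1], eigvecs[:, -1]
    if lam <= 0:
        raise ValueError("D has no positive eigenvalue")
    v = v * np.sign(v.sum())                  # fix the sign of the eigenvector
    scale = C / np.dot(v, a)
    m = [[scale * v[i] * s["gamma"] / (rho[i] * s["cI"]) for s in d["strata"]]
         for i, d in enumerate(domains)]
    n = [[[[s["cI"] * s["M"] * b / (p["cII"] * s["gamma"]) for b in p["beta"]]
           for p in s["psus"]] for s in d["strata"]] for d in domains]
    return lam, m, n


# Hypothetical population: 2 domains, 50 PSUs per stratum, 2 SSU strata per PSU.
psus = [{"cII": 1.0, "beta": [200.0, 200.0]} for _ in range(50)]
domains = [
    {"tau": 1.0e5, "kappa": 0.5, "strata": [
        {"M": 50, "D2": 1.0e4, "gamma": 4000.0, "cI": 5.0, "psus": psus}]},
    {"tau": 2.0e5, "kappa": 0.5, "strata": [
        {"M": 50, "D2": 4.0e4, "gamma": 9000.0, "cI": 5.0, "psus": psus},
        {"M": 50, "D2": 1.0e4, "gamma": 4000.0, "cI": 5.0, "psus": psus}]},
]
C = 3000.0
lam, m, n = two_stage_allocation(domains, C)
print(lam, m)
```

Substituting the returned tables into (9) and (10) gives a direct numerical check that the cost equals C and that \(T_i/\kappa _i\) is the same for all domains.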

Remark 2.1

Note that when the condition

$$\begin{aligned} \sum _{i=1}^d\,\tfrac{a_i^2}{c_i}>C \end{aligned}$$
(14)

is satisfied for a matrix of the form \(\mathbf{D}=\tfrac{{\underline{a}}\,{\underline{a}}^T}{C}-\mathrm {diag}({\underline{c}})\) with \(C>0\) and \({\underline{a}},{\underline{c}}\in (0,\infty )^d\), then \(\mathbf{D}\) has a positive eigenvalue (see Prop. 2.1 in WW). Note that in the framework of Theorem 2.1, where \(d=I\), condition (14) assumes the form

$$\begin{aligned} \sum _{i=1}^I\,\tfrac{\nu _i^2}{\sum _{h=1}^{H_i}\,M_{i,h}\,D_{i,h}^2}>C. \end{aligned}$$

The above assumption as well as the assumption that \(\gamma _{i,h}^2>0\) is related to convexity of the function being minimized, and as such they are necessary also for the convex NLP methods to provide the unique solution (see also Remark 2.3).
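Condition (14) is easy to check before any eigenproblem is solved. The sketch below, assuming NumPy and using hypothetical function names and inputs, evaluates the condition and compares it with the sign of the largest eigenvalue of \(\tfrac{{\underline{a}}\,{\underline{a}}^T}{C}-\mathrm {diag}({\underline{c}})\) for two values of C; it is a numerical illustration on one small example, not a proof.

```python
import numpy as np


def satisfies_condition_14(a, c, C):
    """True when sum_i a_i^2 / c_i > C."""
    a, c = np.asarray(a, dtype=float), np.asarray(c, dtype=float)
    return bool(np.sum(a**2 / c) > C)


def largest_eigenvalue(a, c, C):
    """Largest eigenvalue of a a^T / C - diag(c)."""
    a, c = np.asarray(a, dtype=float), np.asarray(c, dtype=float)
    return float(np.linalg.eigvalsh(np.outer(a, a) / C - np.diag(c))[-1])


# Hypothetical inputs with sum_i a_i^2 / c_i = 5.
a, c = [1.0, 2.0], [1.0, 1.0]
for C in (4.0, 6.0):
    print(C, satisfies_condition_14(a, c, C), largest_eigenvalue(a, c, C))
```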

Remark 2.2

Note that the problem we solved in Theorem 2.1 can be formulated equivalently as: Minimize the overall relative variance \({S}\) under constraints (10) on relative variances \(T_i\) in domains (\(i=1,\ldots ,I\)) and the expected overall cost constraint (9). The reason for validity of such a rephrasing of the original problem is that \({S}=T\tfrac{1}{\tau ^2}\sum _{i=1}^I\,\kappa _i\tau _i^2\) which is a consequence of \(T_i=\kappa _i T\), \(i=1,\ldots ,I\).

Remark 2.3

The optimal allocation problem in two-stage sampling when no domains efficiency is taken into account has the well-known Neyman-type solution. For example, in the case of no stratification at both stages, such a solution under the EVC constraint is given in Ch. 12.8.1 of Särndal et al. (1992). Our formulas (11) and (12) reduce to (12.8.13) and (12.8.14) of Särndal et al. (1992) in the case when \(I=1\), \(H_1=1\) and \(G_{1,1,j}=1\), that is, when the whole population is a single domain and neither PSUs nor SSUs within PSUs are stratified. The optimal allocation in the case of a single domain with stratified SRSWOR for PSUs and SRSWOR for SSUs in every PSU from the first-stage sample is considered in Saini and Kumar (2015). The authors provide the NLP solution and then conclude that the same result can be obtained via a Neyman-type approach. Actually, they consider a p-variate case. However, their optimal allocation formulas (16) and (17) for \(p=1\) are again special cases of (11) and (12). Note that the assumption \(A_h>0\), needed also for the numerical solution in that paper, is (again for \(p=1\)) in full agreement with \(\gamma _{i,h}^2>0\), which we assume in Theorem 2.1.

In the case of the population U consisting just of a single domain, i.e., when \(I=1\), the eigenvector cancels out from (11) and formulas (11) and (12) for optimal allocation are immediately reduced to (with the index \(i=1\) suppressed)

$$\begin{aligned} m_h=C\tfrac{\gamma _{h}}{c_{I,h}\,\sum _{\ell =1}^H\left( c_{I,\ell }\gamma _{\ell }+\sum _{j\in {\mathcal {V}}_{\ell }}\,c_{II,\ell ,j} \,\sum _{g=1}^{G_{\ell ,j}}\,\beta _{\ell ,j,g}\right) }\quad \text{ and }\quad n_{h,j,g}=\tfrac{c_{I,h}\,M_h\beta _{h,j,g}}{c_{II,h,j}\gamma _h}. \end{aligned}$$

Moreover, the optimal variance, obtained by multiplying the optimal relative variance (13) by \(\tau ^2\), assumes the form

$$\begin{aligned} D^2_{opt}=\tfrac{1}{C}\,\left[ \sum _{h=1}^H\left( c_{I,h}\gamma _h+\sum _{j\in {\mathcal {V}}_h}\,c_{II,h,j} \,\sum _{g=1}^{G_{h,j}}\,\beta _{h,j,g}\right) \right] ^2-\sum _{h=1}^H\,M_h\,D_h^2. \end{aligned}$$

Note that these formulas are exact versions of the Neyman optimal allocation and the Neyman optimal variance for two-stage sampling with stratified SRSWOR at both stages.
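As a quick check of this reduction (a worked step spelled out for convenience, using only the notation of Theorem 2.1): for \(I=1\) we have \(\kappa _1=1\), hence \(\rho _1=\tau \), and the eigenvector has a single component \(v_1^*\), so that (11) and (13) give

$$\begin{aligned} m_{1,h}=C\,\tfrac{v_1^*\gamma _{1,h}}{\rho _1c_{I,1,h}\,v_1^*\,\nu _1/\rho _1}=C\,\tfrac{\gamma _{1,h}}{c_{I,1,h}\,\nu _1},\qquad \tau ^2\,T=\tfrac{1}{C}\,\nu _1^2-\sum _{h=1}^{H_1}\,M_{1,h}\,D_{1,h}^2. \end{aligned}$$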

Remark 2.4

The allocation results given in Theorem 2.1 should be compared to the domain-efficient allocation in the same stratified SRSWOR on both stages but with separate constraints for the size of the first-stage sample and for the expected size of the second-stage sample as given in Theorem 3.3 of WW. The basic difference is that in the latter paper both \(m_{i,h}\) and \(n_{i,h,j,g}\) depend on the eigenvector \({\underline{v}}^*\), while in the above result the eigenvector appears only in formula (11) for \(m_{i,h}\) and formula (12) is free from \({\underline{v}}^*\). This is the major, and by no means obvious, structural consequence of the fact that the constraint we consider here is imposed on the expected costs of the first and the second stage jointly.

Proof of Theorem 2.1

Note that since \(\kappa _i\), \(i=1,\ldots ,I\), are fixed and known, minimizing relative variances \(T_i=T_i({\underline{m}},{\underline{n}})\), \(i=1,\ldots ,I\), is equivalent to minimizing T under constraints (9) and (10). Therefore, the Lagrange function has the form

$$\begin{aligned}&F(T,{\underline{m}},{\underline{n}})=T+\sum _{i=1}^I\,\lambda _i\left( \tfrac{T_i({\underline{m}},{\underline{n}})}{\kappa _i}-T\right) \\&\qquad +\,\mu \left( \sum _{i=1}^I\,\sum _{h=1}^{H_i}\,c_{I,i,h}^2\,m_{i,h} +\sum _{i=1}^I\,\sum _{h=1}^{H_i}\,\tfrac{m_{i,h}}{M_{i,h}}\,\sum _{j\in {\mathcal {V}}_{i,h}}\,c_{II,i,h,j}^2\,\sum _{g=1}^{G_{i,h,j}}\,n_{i,h,j,g}\right) . \end{aligned}$$

Note that

$$\begin{aligned} \tfrac{\partial \,F}{\partial \,n_{i,h,j,g}}=-\tfrac{\lambda _iM_{i,h}\beta _{i,h,j,g}^2}{\rho _i^2 m_{i,h}n_{i,h,j,g}^2}+\mu \,\tfrac{m_{i,h}}{M_{i,h}}\,c_{II,i,h,j}^2=0. \end{aligned}$$

Consequently, \(\lambda _i>0\) and

$$\begin{aligned} m_{i,h}n_{i,h,j,g}=\tfrac{\sqrt{\lambda _i}M_{i,h}\beta _{i,h,j,g}}{\,c_{II,i,h,j}\rho _i\sqrt{\mu }}. \end{aligned}$$
(15)

Moreover,

$$\begin{aligned} \tfrac{\partial \,F}{\partial \,m_{i,h}}= & {} -\tfrac{\lambda _i}{m_{i,h}^2\rho _i^2}\left( \gamma _{i,h}^2+M_{i,h} \sum _{j\in {\mathcal {V}}_{i,h}}\,\sum _{g=1}^{G_{i,h,j}}\,\tfrac{\beta _{i,h,j,g}^2}{n_{i,h,j,g}}\right) \\&+\,\mu \left( c_{I,i,h}^2+\tfrac{1}{M_{i,h}}\sum _{j\in {\mathcal {V}}_{i,h}}\,c_{II,i,h,j}^2\,\sum _{g=1}^{G_{i,h,j}}\,n_{i,h,j,g}\right) =0. \end{aligned}$$

The above after multiplication by \(m_{i,h}\) can alternatively be written as

$$\begin{aligned}&-\tfrac{\lambda _i\gamma _{i,h}^2}{m_{i,h}\rho _i^2} -\tfrac{\lambda _iM_{i,h}}{\rho _i^2}\sum _{j\in {\mathcal {V}}_{i,h}}\,\sum _{g=1}^{G_{i,h,j}}\,\tfrac{\beta _{i,h,j,g}^2}{m_{i,h}n_{i,h,j,g}}\\&\qquad +\,\mu \,c_{I,i,h}^2m_{i,h} +\mu \tfrac{1}{M_{i,h}}\sum _{j\in {\mathcal {V}}_{i,h}}\,c_{II,i,h,j}^2\,\sum _{g=1}^{G_{i,h,j}}\,m_{i,h}n_{i,h,j,g}=0. \end{aligned}$$

Note that due to (15), the second and fourth terms above cancel out and thus

$$\begin{aligned} m_{i,h}=\tfrac{\sqrt{\lambda _i}\gamma _{i,h}}{c_{I,i,h}\,\rho _i\sqrt{\mu }}. \end{aligned}$$
(16)

Now (12) follows by combining (16) with (15).

To find \(m_{i,h}\), we plug (15) and (16) into the cost constraint (9) and obtain

$$\begin{aligned} \sqrt{\mu }=\tfrac{1}{C}\,\sum _{i=1}^I\,\tfrac{\sqrt{\lambda _i}}{\rho _i}\,\nu _i \end{aligned}$$
(17)

Now let us insert (15) and (16) in the constraint (10). It leads to the equation

$$\begin{aligned} \sum _{h=1}^{H_i}\,\tfrac{\gamma _{i,h}^2\,c_{I,i,h}\sqrt{\mu }}{\sqrt{\lambda _i}\gamma _{i,h}} +\sum _{h=1}^{H_i}\,M_{i,h}\sum _{j\in {\mathcal {V}}_{i,h}}\sum _{g=1}^{G_{i,h,j}} \,\tfrac{\beta ^2_{i,h,j,g}\,c_{II,i,h,j}\,\sqrt{\mu }}{\sqrt{\lambda _i}M_{i,h}\beta _{i,h,j,g}}-\tfrac{1}{\rho _i}\sum _{h=1}^{H_i}\,M_{i,h}\,D_{i,h}^2=\rho _i\,T. \end{aligned}$$

Multiply both sides by \(v_i:=\sqrt{\lambda _i}\), divide by \(\rho _i\) and rewrite as

$$\begin{aligned} \sqrt{\mu }\tfrac{\nu _i}{\rho _i}-c_iv_i=Tv_i. \end{aligned}$$

Expanding now \(\sqrt{\mu }\) according to (17), we obtain

$$\begin{aligned} C^{-1}\,\tfrac{\nu _i}{\rho _i}\,\sum _{r=1}^I\,\tfrac{v_r\nu _r}{\rho _r}-c_iv_i=Tv_i, \end{aligned}$$

which is valid for any \(i=1,\ldots ,I\). Note that in terms of the vector \({\underline{a}}\) defined in the formulation of the theorem, the above can be rewritten as

$$\begin{aligned} \tfrac{a_i\,{\underline{a}}^T}{C}{\underline{v}}-c_iv_i=Tv_i,\quad i=1,\ldots ,I, \end{aligned}$$

or equivalently \(\left( \tfrac{{\underline{a}}\,{\underline{a}}^T}{C}-\mathrm {diag}({\underline{c}})\right) {\underline{v}}=T{\underline{v}}\).

The final part of the proof follows closely the argument given in WW and is recalled here just for the readers’ convenience.

Consider now the matrix \(\mathbf{D}\) and let \(\lambda ^*\) be its positive eigenvalue. To show that it is simple, unique and the eigenvector \({\underline{v}}^*\) attached to this eigenvalue has all coordinates of the same sign, we use the celebrated Perron–Frobenius theorem: If \(\mathbf{A}\) is a matrix with all strictly positive entries, then \(\mathbf{A}\) has a positive eigenvalue \(\nu \) which is simple and such that \(\nu >|\lambda |\) for any other eigenvalue \(\lambda \) of \(\mathbf{A}\). The respective eigenvector (attached to \(\nu \)) has all entries strictly positive (up to scalar multiplication); see, for example, Kato (1981, Th. 7.3 in Ch. 1).

Fix a number \(\rho> \max _{1\le i\le I} c_i>0\). The matrix \(\mathbf{D} + \rho \mathbf{I}\), where \(\mathbf{I}\) is the identity matrix, has all entries strictly positive. For any eigenvalue \(\delta _j\) of \(\mathbf{D}\) and respective eigenvector \({\underline{w}}_j\)

$$\begin{aligned} (\mathbf{D} + \rho \mathbf{I}){\underline{w}}_j = (\delta _j+\rho ){\underline{w}}_j,\quad j=1,\ldots ,I. \end{aligned}$$

That is, \(\delta _j+\rho \) and \({\underline{w}}_j\), \(j = 1,\ldots ,I\), are the respective eigenvalues and eigenvectors of the matrix \(\mathbf{D} + \rho \mathbf{I}\). By the Perron–Frobenius theorem, there exists \(j_0\) such that \(\delta _{j_0}+\rho \ge |\delta _j+\rho |\) for any \(j=1,\ldots ,I\) and the respective eigenvector \({\underline{w}}_{j_0}\) has all entries of the same sign. Consequently, \(\delta _{j_0}+\rho \ge \delta _j+\rho \), and thus \(\delta _{j_0}\ge \delta _j\) for any \(j=1,\ldots ,I\). Therefore, by the assumption that \(\lambda ^*\) is the unique positive eigenvalue of \(\mathbf{D}\), it follows that \(T = \lambda ^* =\delta _{j_0}\) and the respective eigenvector \({\underline{v}}^* ={\underline{w}}_{j_0}\) has all entries of the same sign.

Now formulas (11) and (12) follow directly from (16) and (15). \(\square \)
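The shift argument used in the proof also suggests a simple numerical route to \({\underline{v}}^*\). The Python sketch below is an illustration only, not the procedure used by the authors: it assumes that all \(a_i\) are positive, uses hypothetical inputs and a function name of our choosing, and runs plain power iteration on the entrywise positive matrix \(\mathbf{D}+\rho \mathbf{I}\), after which the shift is undone.

```python
import math


def perron_pair(a, c, C, iters=500):
    """Largest eigenvalue of D = a a^T / C - diag(c) and its eigenvector,
    obtained by power iteration on the shifted matrix D + rho*I."""
    I = len(a)
    rho = max(c) + 1.0  # any rho > max_i c_i makes every entry positive
    B = [[a[i] * a[j] / C + (rho - c[i] if i == j else 0.0) for j in range(I)]
         for i in range(I)]
    v = [1.0] * I
    for _ in range(iters):
        w = [sum(B[i][j] * v[j] for j in range(I)) for i in range(I)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the (symmetric) shifted matrix, then undo the shift.
    w = [sum(B[i][j] * v[j] for j in range(I)) for i in range(I)]
    return sum(x * y for x, y in zip(v, w)) - rho, v


# Hypothetical inputs, for illustration only.
a = [3.0, 2.0, 4.0]
c = [0.5, 0.2, 0.8]
C = 10.0
T, v = perron_pair(a, c, C)
print(T, v)
```

Since the iteration starts from a positive vector and the shifted matrix has positive entries, every iterate stays positive, which mirrors the sign statement for \({\underline{v}}^*\). The returned eigenvalue is positive only when the assumption of the theorem holds for the inputs.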

Remark 2.5

Of course, as always when such allocation problems are solved without the natural box constraints \(m_{i,h}\le M_{i,h}\) and \(n_{i,h,j,g}\le N_{i,h,j,g}\) (and this is the case for the eigenproblem approach), the solution may violate some of them. It is then standard to set \(m_{i,h}=M_{i,h}\) and \(n_{i,h,j,g}=N_{i,h,j,g}\) wherever the respective box constraint is violated and to repeat the minimization procedure for the reduced population and the reduced cost constraint. This may produce solutions which are not optimal (though, typically, close to optimal). On the other hand, it is known, for example, for the problem of optimal allocation in stratified SRSWOR, that the population can be reduced when the optimal solution requires taking \(n_h=N_h\) in some strata. The minimization can then be performed on the reduced population; see, for example, Lemma 1 in Stenger and Gabler (2005). This approach has been developed by introducing box constraints into the numerical procedure of optimal allocation in Gabler et al. (2012); computational aspects of such procedures are analyzed in Münnich et al. (2012) (with further references given in that paper).
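The "fix and repeat" step described above is easiest to see in the single-stage case mentioned in the remark. The following Python sketch applies it to the classical Neyman allocation \(n_h\propto N_hS_h\) in stratified SRSWOR. It is a minimal illustration with hypothetical inputs and a function name of our choosing, it assumes \(S_h>0\) and \(n\le \sum _h N_h\), and it is not the two-stage procedure of this paper.

```python
def neyman_with_caps(n, N, S):
    """Neyman allocation n_h ~ N_h * S_h with the box constraints n_h <= N_h.

    Strata whose trial allocation exceeds N_h are fixed at N_h and the
    remaining sample is re-allocated among the other strata."""
    alloc = [0.0] * len(N)
    free = set(range(len(N)))
    remaining = float(n)
    while free:
        total = sum(N[h] * S[h] for h in free)
        trial = {h: remaining * N[h] * S[h] / total for h in free}
        over = [h for h in free if trial[h] > N[h]]
        if not over:
            for h in free:
                alloc[h] = trial[h]
            break
        for h in over:  # take-all strata
            alloc[h] = float(N[h])
            remaining -= N[h]
            free.discard(h)
    return alloc


# Hypothetical strata: sizes, standard deviations and total sample size.
N = [100, 50, 10]
S = [1.0, 2.0, 30.0]
print(neyman_with_caps(60, N, S))
```

In this example the unconstrained allocation would assign 36 units to the third stratum, which has only 10, so that stratum is taken in full and the remaining 50 units are split between the first two strata.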

We also do not consider here exact optimality with respect to integer solutions. In this context, it is worth mentioning again stratified SRSWOR, for which an integer-valued optimal allocation has recently been given in Wright (2017) and, even earlier, in Friedrich et al. (2015). Here, we are fully satisfied with, for example, random rounding of a non-integer allocation, which typically gives solutions close to optimal.
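One common reading of random rounding is to round each non-integer sample size down or up at random, with the probability of rounding up equal to the fractional part, so that the expected value is unchanged. The Python sketch below illustrates this reading. The function name, the seed and the numbers are hypothetical, and the text does not specify which rounding rule is meant.

```python
import math
import random


def random_round(x, rng):
    """Round x down or up at random so that the expected value equals x."""
    lo = math.floor(x)
    return lo + (1 if rng.random() < x - lo else 0)


rng = random.Random(2024)        # fixed seed, for reproducibility only
allocation = [12.3, 7.0, 4.75]   # hypothetical non-integer allocation
rounded = [random_round(x, rng) for x in allocation]
print(rounded)
```

Values that are already integers are left unchanged. The rounded sizes need not add up to the original total, so the cost constraint holds only in expectation.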

3 Special cases

3.1 Stratification only at the first stage

This is probably the most popular of the two-stage schemes used in practice. In this case, we have \(G_{i,h,j}=1\) for any \((i,h,j)\), which allows for a considerable simplification of the notation used in Sect. 2. The constraints imposed by priority weights for relative variances in domains assume the form

$$\begin{aligned} T_i= & {} \tfrac{1}{\tau _i^2}\,\sum _{h=1}^{H_i}\,\tfrac{1}{m_{i,h}}\left( \gamma _{i,h}^2+M_{i,h} \sum _{j\in {\mathcal {V}}_{i,h}}\,\tfrac{\beta _{i,h,j}^2}{n_{i,h,j}}\right) \\&-\,\tfrac{1}{\tau _i^2}\sum _{h=1}^{H_i}\,M_{i,h}\,D_{i,h}^2=\kappa _i\,T,\qquad i=1,\ldots ,I, \end{aligned}$$

where

$$\begin{aligned} \gamma _{i,h}^2=M_{i,h}\,\left( M_{i,h}D_{i,h}^2-\sum _{j\in {\mathcal {V}}_{i,h}}\,\,N_{i,h,j}\,S_{i,h,j}^2\right) ,\qquad \beta _{i,h,j}=N_{i,h,j}{S}_{i,h,j} \end{aligned}$$

and, in the jth PSU from \({\mathcal {V}}_{i,h}\), the number of SSUs is \(N_{i,h,j}\), the population variance among SSUs is \(S_{i,h,j}^2\) and the sample size is \(n_{i,h,j}\). Here, \(D_{i,h}^2\) and \(M_{i,h}\) have the same definition as in Sect. 2.

The cost constraint (9) changes to

$$\begin{aligned} \sum _{i=1}^I\,\sum _{h=1}^{H_i}\,c_{I,i,h}^2m_{i,h} +\sum _{i=1}^I\,\sum _{h=1}^{H_i}\,\tfrac{m_{i,h}}{M_{i,h}}\,\sum _{j\in {\mathcal {V}}_{i,h}}\,c_{II,i,h,j}^2\,n_{i,h,j}=C \end{aligned}$$

From Theorem 2.1 [if its assumptions, in particular the respective version of (14), are satisfied], we conclude that the optimal allocation at the first stage is:

$$\begin{aligned} m_{i,h}=C\tfrac{v_i^* \gamma _{i,h}}{\rho _i\,c_{I,i,h}\, \sum _{r=1}^I\,v_r^*\,\nu _r/\rho _r}\quad \text{ where }\quad \nu _r=\sum _{s=1}^{H_r}\left( c_{I,r,s}\gamma _{r,s}+\sum _{t\in {\mathcal {V}}_{r,s}}\,c_{II,r,s,t}\,\beta _{r,s,t}\right) , \end{aligned}$$
(18)

\({\underline{v}}^*\) is the eigenvector (with positive components) of the matrix \(\mathbf{D}=\tfrac{{\underline{a}}{\underline{a}}^T}{C}-\mathrm {diag}({\underline{c}})\) with

$$\begin{aligned} a_i=\tfrac{\nu _i}{\rho _i},\quad c_i=\tfrac{1}{\rho _i^2}\,\sum _{h=1}^{H_i}\,M_{i,h}\,D_{i,h}^2, \end{aligned}$$
(19)

and the optimal allocation at the second stage is

$$\begin{aligned} n_{i,h,j}=\tfrac{c_{I,i,h}M_{i,h}\beta _{i,h,j}}{c_{II,i,h,j}\,\gamma _{i,h}}. \end{aligned}$$
(20)

Due to its important role in practice, we chose this setting for presenting the core part of an R-code which produces the domain-efficient allocation. Assume that vectors a\(:={\underline{a}}\) and c\(:={\underline{c}}\) have already been computed according to (19) and that the vector of priority weights \((\kappa _i,\, i=1,\ldots ,I)\) is denoted by \(\texttt {kap}\). Then, to find the respective eigenvector, one may use the following code in \(\mathrm {R}\) (function eigen being its essence)

[Figure a: listing of the R code; not reproduced here]

After computing the eigenvector \({\underline{v}}:=\texttt {v}\) as given in the last line of the R-code above, one can calculate the optimal sample sizes \(m_{i,h}\) and \(n_{i,h,j}\) according to (18) and (20), respectively.

The R-code given above was adapted from the full R-code as given in https://github.com/rwieczor/eigenproblem_sample_allocation, which was created (in connection with WW) for optimal fixed precision allocation in subpopulations in two-stage sampling with the stratified Hartley–Rao \(\pi \)ps scheme at the first stage and SRSWOR at the second stage and with constraints imposed separately on the size of the sample at the first and on the expected size of the sample at the second stage.
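For readers without the R listing at hand, the whole computation of this subsection can be sketched in a few lines of Python. This is not the authors' code and it does not call a general eigen-solver. It relies on the following observation, which follows from the componentwise eigen-equation in the proof of Theorem 2.1: \(v_i\) is proportional to \(a_i/(T+c_i)\), and the positive eigenvalue T solves \(\sum _i a_i^2/(T+c_i)=C\), which is found here by bisection. The sketch assumes \(\rho _i=\tau _i\sqrt{\kappa _i}\), \(c_i>0\) and \(\sum _i a_i^2/c_i>C\). All names and numbers are hypothetical and not calibrated to any real survey.

```python
import math


def domain_allocation(C, domains):
    """domains: list of dicts with keys tau, kappa, strata; each stratum is a
    dict with keys c_I, gamma, D2 and psus = [(c_II, beta), ...], one pair per
    PSU of the stratum, so that M = len(psus)."""
    rho = [d["tau"] * math.sqrt(d["kappa"]) for d in domains]  # assumed rho_i
    nu = [sum(s["c_I"] * s["gamma"] + sum(c2 * b for c2, b in s["psus"])
              for s in d["strata"]) for d in domains]
    a = [x / r for x, r in zip(nu, rho)]                       # as in (19)
    c = [sum(len(s["psus"]) * s["D2"] for s in d["strata"]) / r ** 2
         for d, r in zip(domains, rho)]

    def g(T):
        return sum(ai ** 2 / (T + ci) for ai, ci in zip(a, c)) - C

    if g(0.0) <= 0.0:
        raise ValueError("no positive eigenvalue for these inputs")
    lo, hi = 0.0, sum(ai ** 2 for ai in a) / C   # g(lo) > 0 > g(hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) > 0.0 else (lo, mid)
    T = 0.5 * (lo + hi)
    v = [ai / (T + ci) for ai, ci in zip(a, c)]
    denom = sum(vi * ai for vi, ai in zip(v, a))
    m = [[C * vi * s["gamma"] / (r * s["c_I"] * denom) for s in d["strata"]]
         for d, vi, r in zip(domains, v, rho)]                 # as in (18)
    n = [[[s["c_I"] * len(s["psus"]) * b / (c2 * s["gamma"])
           for c2, b in s["psus"]] for s in d["strata"]]
         for d in domains]                                     # as in (20)
    return T, v, m, n


def relative_variances(domains, m, n):
    """T_i evaluated at a given allocation."""
    out = []
    for d, m_i, n_i in zip(domains, m, n):
        var = 0.0
        for s, m_ih, n_ih in zip(d["strata"], m_i, n_i):
            M = len(s["psus"])
            inner = sum(b ** 2 / x for (_, b), x in zip(s["psus"], n_ih))
            var += (s["gamma"] ** 2 + M * inner) / m_ih - M * s["D2"]
        out.append(var / d["tau"] ** 2)
    return out


def expected_cost(domains, m, n):
    """Left-hand side of the cost constraint of this subsection."""
    total = 0.0
    for d, m_i, n_i in zip(domains, m, n):
        for s, m_ih, n_ih in zip(d["strata"], m_i, n_i):
            second = sum(c2 ** 2 * x for (c2, _), x in zip(s["psus"], n_ih))
            total += m_ih * (s["c_I"] ** 2 + second / len(s["psus"]))
    return total


# Hypothetical inputs, for illustration only.
domains = [
    {"tau": 100.0, "kappa": 1.0, "strata": [
        {"c_I": 2.0, "gamma": 30.0, "D2": 4.0, "psus": [(1.0, 4.0), (1.5, 5.0)]},
        {"c_I": 1.0, "gamma": 10.0, "D2": 1.0,
         "psus": [(1.0, 2.0), (1.0, 3.0), (2.0, 1.0)]}]},
    {"tau": 50.0, "kappa": 2.0, "strata": [
        {"c_I": 3.0, "gamma": 20.0, "D2": 9.0, "psus": [(1.0, 2.0), (2.0, 3.0)]}]},
]
T, v, m, n = domain_allocation(10.0, domains)
ratios = [t / d["kappa"] for t, d in zip(relative_variances(domains, m, n), domains)]
print(T, ratios, expected_cost(domains, m, n))
```

At the computed allocation, every ratio \(T_i/\kappa _i\) should coincide with T and the expected cost should equal C, which gives a quick sanity check of the inputs \({\underline{a}}\) and \({\underline{c}}\).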

3.2 Stratification only at the second stage

Here, we have \(H_i=1\) for any i, which again simplifies the notation of Sect. 2. The constraints imposed by priority weights for relative variances in domains assume the form

$$\begin{aligned} T_i=\tfrac{1}{m_i\tau _i^2}\left( \gamma _i^2+M_i \sum _{j\in {\mathcal {V}}_i}\,\sum _{g=1}^{G_{i,j}}\tfrac{\beta _{i,j,g}^2}{n_{i,j,g}}\right) -\tfrac{1}{\tau _i^2}M_i\,D_i^2=\kappa _i\,T,\qquad i=1,\ldots ,I, \end{aligned}$$

where

$$\begin{aligned} \beta _{i,j,g}=N_{i,j,g}S_{i,j,g},\qquad D^2_i=\tfrac{1}{M_i-1}\sum _{j\in {\mathcal {V}}_i}\left( t_j-{\bar{t}}_i\,\right) ^2, \end{aligned}$$

and \(N_{i,j,g}\) is the number of SSUs, \(S_{i,j,g}^2\) is the population variance among SSUs and \(n_{i,j,g}\) is the sample size in the gth SSU stratum of the jth PSU from \({\mathcal {V}}_i\). Moreover, \(G_{i,j}\) is the number of SSU strata in the jth PSU from \({\mathcal {V}}_i\), \(M_i=\#({\mathcal {V}}_i)\) and

$$\begin{aligned} \gamma _i^2=M_i\,\left( M_iD_i^2-\sum _{j\in {\mathcal {V}}_i}\,\sum _{g=1}^{G_{i,j}}\,N_{i,j,g}\,S_{i,j,g}^2\right) . \end{aligned}$$

Here, the version of the cost constraint (9) is

$$\begin{aligned} \sum _{i=1}^I c_{I,i}^2\,m_i +\sum _{i=1}^I\,\tfrac{m_i}{M_i}\,\sum _{j\in {\mathcal {V}}_i}\,c_{II,i,j}^2\,\sum _{g=1}^{G_{i,j}}\,n_{i,j,g}=C. \end{aligned}$$

From Theorem 2.1 (if its assumptions are satisfied), it follows that the optimal allocation at the first stage is

$$\begin{aligned} m_i=C\tfrac{v_i^* \gamma _i}{\rho _i\,c_{I,i}\, \sum _{r=1}^I\,v_r^*\,\nu _r/\rho _r},\quad \text{ where }\quad \nu _r= c_{I,r}\gamma _r+\sum _{t\in {\mathcal {V}}_r}\,c_{II,r,t}\,\sum _{u=1}^{G_{r,t}}\beta _{r,t,u} , \end{aligned}$$

\({\underline{v}}^*\) is the eigenvector (having all components positive) of the matrix \(\mathbf{D}=\tfrac{{\underline{a}}\,{\underline{a}}^T}{C}-\mathrm {diag}({\underline{c}})\) with

$$\begin{aligned} a_i=\tfrac{\nu _i}{\rho _i},\quad c_i=\tfrac{M_iD_i^2}{\rho _i^2}, \end{aligned}$$

and the optimal allocation at the second stage is

$$\begin{aligned} n_{i,j,g}=\tfrac{c_{I,i}M_i\beta _{i,j,g}}{c_{II,i,j}\gamma _i}. \end{aligned}$$

3.3 No stratification at stage one and two

That is, we assume \(H_i=1\) and \(G_{i,h,j}=1\) for any \((i,h,j)\). In this case, the formulas are further simplified. The constraints imposed by priority weights for relative variances in domains assume the form

$$\begin{aligned} T_i=\tfrac{1}{\tau _i^2m_i}\left( \gamma _i^2+M_i \sum _{j\in {\mathcal {V}}_i}\,\tfrac{\beta _{i,j}^2}{n_{i,j}}\right) -\tfrac{1}{\tau _i^2}\,M_i\,D_i^2=\kappa _i\,T,\qquad i=1,\ldots ,I, \end{aligned}$$

where

$$\begin{aligned} \beta _{i,j}=N_{i,j}S_{i,j} \end{aligned}$$

and \(N_{i,j}\) is the number of SSUs, \(S_{i,j}^2\) is the population variance among SSUs and \(n_{i,j}\) is the sample size in the jth PSU from \({\mathcal {V}}_i\). Moreover,

$$\begin{aligned} \gamma _i^2=M_i\,\left( M_iD_i^2-\sum _{j\in {\mathcal {V}}_i}\,N_{i,j}\,S_{i,j}^2\right) \end{aligned}$$

with \(M_i\) and \(D_i^2\) defined as in Sect. 3.2. The cost constraint (9) assumes the simple form

$$\begin{aligned} \sum _{i=1}^I\,c_{I,i}^2\, m_i +\sum _{i=1}^I\,\tfrac{m_i}{M_i}\,\sum _{j\in {\mathcal {V}}_i}\,c_{II,i,j}^2\,n_{i,j}=C. \end{aligned}$$

From Theorem 2.1 (if its assumptions are satisfied), we conclude that the optimal allocation at the first stage is

$$\begin{aligned} m_i=C\tfrac{v_i^* \gamma _i}{\rho _i\,c_{I,i}\, \sum _{r=1}^I\,v_r^*\,\nu _r/\rho _r},\quad \text{ where }\quad \nu _r=c_{I,r}\gamma _r+\sum _{t\in {\mathcal {V}}_r}\,c_{II,r,t}\,\beta _{r,t}, \end{aligned}$$

\({\underline{v}}^*\) is the eigenvector (having all components positive) of the matrix \(\mathbf{D}=\tfrac{{\underline{a}}\,{\underline{a}}^T}{C}-\mathrm {diag}({\underline{c}})\) with

$$\begin{aligned} a_i=\tfrac{\nu _i}{\rho _i}, \end{aligned}$$

\({\underline{c}}\) defined as in Sect. 3.2 and the optimal allocation at the second stage is

$$\begin{aligned} n_{i,j}=\tfrac{c_{I,i}M_i\beta _{i,j}}{c_{II,i,j}\,\gamma _i}. \end{aligned}$$
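As a consistency check, consider a single domain, \(I=1\). Then \(\mathbf{D}\) is the scalar \(a_1^2/C-c_1\) and one may take \(v_1^*=1\). Assuming \(\rho _1=\tau _1\sqrt{\kappa _1}\) as in Sect. 2 and \(a_1=\nu _1/\rho _1\), \(c_1=M_1D_1^2/\rho _1^2\) as in Sect. 3.2, the formulas of this subsection reduce to

$$\begin{aligned} m_1=\tfrac{C\,\gamma _1}{c_{I,1}\,\nu _1},\qquad n_{1,j}=\tfrac{c_{I,1}M_1\beta _{1,j}}{c_{II,1,j}\,\gamma _1},\qquad \kappa _1T=\tfrac{1}{\tau _1^2}\left( \tfrac{\nu _1^2}{C}-M_1D_1^2\right) , \end{aligned}$$

that is, to the Neyman-type allocation and optimal variance recalled before Remark 2.4, taken with one stratum and divided by \(\tau _1^2\). In this case, the eigenvalue T is positive exactly when \(\nu _1^2>C\,M_1D_1^2\).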

4 Two-stage sampling with \(\mathrm {pps}\) sampling

4.1 \(\mathrm {pps}\) Sampling at the first stage and SRSWOR at the second stage

We draw the PSUs ordered sample \({\mathcal {S}}^{(I)}=(K_1,\ldots ,K_m)\) using \(\mathrm {pps}\) sampling, meaning that PSUs are drawn m times with replacement (that is, independently), jth with probability \(p_j\) which is proportional to its size, \(j\in {\mathcal {V}}\) (population of PSUs). Then, if jth PSU belongs to \({\mathcal {S}}^{(I)}\), we draw (by SRSWOR) from it a sample (of size \(n_j\)) of SSUs, obtaining in this way the sample \({\mathcal {S}}^{(II)}_j\), \(j\in {\mathcal {S}}^{(I)}\). Such a sampling scheme is considered in Ch. 4.5 of Särndal et al. (1992) (in particular, in Result 4.5.1 the unbiased estimator and its variance are given). A population-efficient allocation procedure for this setup has been given recently in Valliant et al. (2015) as one of the options in the PracTools R package. The importance of this scheme is due to the fact that when the sample of PSUs is small relative to the population of PSUs, sampling with and without replacement give nearly the same results. Consequently, very often in practice, the first-stage variance in \(\pi \mathrm {ps}\) without replacement sampling is approximated by its \(\mathrm {pps}\) version. It turns out that in this case the eigenproblem methodology we develop here allows for a closed analytic formula for the eigenvector responsible for the domains-efficient allocation. This follows from the fact that the respective population matrix is of rank one. The details are given below.

The unbiased estimator of the population total \(\tau =\sum _{k\in U}\,y_k\) is

$$\begin{aligned} {\hat{\tau }}=\tfrac{1}{m}\sum _{r=1}^m\,\tfrac{{\hat{t}}_{K_r}}{p_{K_r}}, \end{aligned}$$

where \({\hat{t}}_j=\tfrac{N_j}{n_j}\sum _{k\in {\mathcal {S}}_j^{(II)}}\,y_k\) for any PSU j, with \(N_j\) denoting the number of SSUs in the jth PSU. The variance of \({\hat{\tau }}\) has the form

$$\begin{aligned} D^2({\hat{\tau }})=\tfrac{1}{m}\sum _{j\in {\mathcal {V}}}\,p_j\left( \tfrac{t_j}{p_j}-\tau \right) ^2+\tfrac{1}{m}\sum _{j\in {\mathcal {V}}}\,\tfrac{D^2_j}{p_j}, \end{aligned}$$

where for any \(j\in {\mathcal {V}}\) we denote

$$\begin{aligned} t_j=\sum _{k\in PSU(j)}\,y_k,\quad \quad D^2_j=N_j^2\left( \tfrac{1}{n_j}-\tfrac{1}{N_j}\right) S^2_j,\quad \quad S^2_j=\tfrac{1}{N_j-1}\sum _{k\in PSU(j)}\,\left( y_k-\tfrac{t_j}{N_j}\right) ^2. \end{aligned}$$

To obtain the optimal allocation of the sample at the first and at the second stage in the domains with given priority weights \(\kappa _i\), \(i=1,\ldots ,I\), we need to minimize

$$\begin{aligned} T_i=\tfrac{1}{\tau _i^2}\left( \tfrac{1}{m_i}\sum _{j\in {\mathcal {V}}_i}\,p_{i,j}\left( \tfrac{t_{i,j}}{p_{i,j}}-\tau _i\right) ^2+\tfrac{1}{m_i}\sum _{j\in {\mathcal {V}}_i}\,\tfrac{D^2_{i,j}}{p_{i,j}}\right) ,\quad i=1,\ldots ,I, \end{aligned}$$

under the constraints given by the priority weights \(T_i=\kappa _i T\) and the EVC constraint

$$\begin{aligned} \sum _{i=1}^I\,\left( m_ic_{I,i}^2+m_i\,\sum _{j\in {\mathcal {V}}_i}\,c_{II,i,j}^2p_{i,j}n_{i,j}\right) =C, \end{aligned}$$
(21)

where C is the total expected cost of the survey, \(c_{I,i}^2\) is the cost incurred by a PSU from \({\mathcal {V}}_i\) (assumed to be constant within the domain) and \(c_{II,i,j}^2\) is the cost incurred by an SSU belonging to the jth PSU from the ith domain.

This setting is somewhat different from, and actually simpler than, the one considered earlier. This is because in the expression for \(T_i\) all summands are multiplied by \(1/m_i\).

Theorem 4.1

Assume that for any \(i=1,\ldots ,I\)

$$\begin{aligned} \sum _{j\in {\mathcal {V}}_i}\,\left[ p_{i,j}\left( \tfrac{t_{i,j}}{p_{i,j}}-\tau _i\right) ^2-\tfrac{N_{i,j}S_{i,j}^2}{p_{i,j}}\right] >0. \end{aligned}$$

Then, the allocation minimizing \(T_i=\kappa _i\,T\), \(i=1,\ldots ,I,\) (as well as the relative variance \({S}\) in the whole population) under the cost constraint (21) has the form

$$\begin{aligned} m_i=C\tfrac{A_i\left( c_{I,i}A_i+\sum _{j\in {\mathcal {V}}_i}\,c_{II,i,j}B_{i,j}\sqrt{p_{i,j}}\right) }{c_{I,i}\,\sum _{r=1}^I\,\left( c_{I,r}A_r+\sum _{s\in {\mathcal {V}}_r}\,c_{II,r,s}B_{r,s}\sqrt{p_{r,s}}\right) ^2},\quad i=1,\ldots ,I, \end{aligned}$$

and

$$\begin{aligned} n_{i,j}=\tfrac{c_{I,i}B_{i,j}}{A_i\,c_{II,i,j}\sqrt{p_{i,j}}},\quad j\in {\mathcal {V}}_i,\;i=1,\ldots ,I, \end{aligned}$$

where

$$\begin{aligned} A_i^2=\tfrac{1}{\tau _i^2\kappa _i}\,\sum _{j\in {\mathcal {V}}_i}\,\left[ p_{i,j}\left( \tfrac{t_{i,j}}{p_{i,j}}-\tau _i\right) ^2-\tfrac{N_{i,j}S_{i,j}^2}{p_{i,j}}\right] \quad \text{ and }\quad B_{i,j}^2=\tfrac{N_{i,j}^2S_{i,j}^2}{\tau _i^2\kappa _i\,p_{i,j}}. \end{aligned}$$

Proof

As in the proof of Theorem 2.1, we consider the Lagrange function

$$\begin{aligned} F(T,(m_i),(n_{i,j});(\lambda _i),\mu )= & {} T+\sum _{i=1}^I\,\lambda _i\left( \tfrac{A_i^2}{m_i}+\sum _{j\in {\mathcal {V}}_i}\,\tfrac{B_{i,j}^2}{m_in_{i,j}}-T\right) \\&+\,\mu \sum _{i=1}^I\,m_i\left( c_{I,i}^2+\sum _{j\in {\mathcal {V}}_i}\,c_{II,i,j}^2p_{i,j}n_{i,j}\right) . \end{aligned}$$

Again, following the steps of the proof of Theorem 2.1, we arrive at

$$\begin{aligned} m_in_{i,j}=\tfrac{\sqrt{\lambda _i}B_{i,j}}{\sqrt{\mu }\,c_{II,i,j}\sqrt{p_{i,j}}}\quad \text{ and }\quad m_i=\tfrac{\sqrt{\lambda _i}A_i}{\sqrt{\mu }c_{I,i}}. \end{aligned}$$
(22)

Thus, the formula for \(n_{i,j}\) follows.

Inserting both expressions from (22) into the cost constraint (21), we obtain

$$\begin{aligned} \sqrt{\mu }=\,\tfrac{{\underline{a}}^T\,{\underline{v}}}{C}, \end{aligned}$$

where

$$\begin{aligned} {\underline{a}}=(a_1,\ldots ,a_I)^T\qquad \text{ with }\quad a_i=c_{I,i}A_i+\sum _{j\in {\mathcal {V}}_i}\,c_{II,i,j}B_{i,j}\sqrt{p_{i,j}} \end{aligned}$$

and \({\underline{v}}=(v_1,\ldots ,v_I)\) with \(v_i=\sqrt{\lambda _i}\).

On the other hand, plugging formulas (22) into the constraints \(T_i=\kappa _i T\), we get

$$\begin{aligned} Tv_i=\sqrt{\mu }\,a_i=\tfrac{1}{C}\,{\underline{a}}^T\,{\underline{v}}\,a_i,\quad i=1,\ldots ,I, \end{aligned}$$

which is equivalent to

$$\begin{aligned} \tfrac{1}{C}\,{\underline{a}}\,{\underline{a}}^T\,{\underline{v}}=T{\underline{v}}. \end{aligned}$$

That is, \({\underline{v}}\) is an eigenvector of the matrix \(\mathbf{D}=\tfrac{1}{C}{\underline{a}}\,{\underline{a}}^T\) associated with eigenvalue T. Since the matrix \(\mathbf{D}\) is positive semidefinite of rank 1, the number T is its only nonzero eigenvalue; it is simple and positive. Moreover, note that \({\underline{v}}^*:=\sqrt{C}{\underline{a}}\) is an eigenvector of \(\mathbf{D}\) associated with the eigenvalue \(\Vert {\underline{a}}\Vert ^2/C\), so that \(T=\Vert {\underline{a}}\Vert ^2/C\). Finally, from (22), we obtain

$$\begin{aligned} m_i=C\tfrac{a_iA_i}{c_{I,i}\,\Vert {\underline{a}}\Vert ^2}\qquad \text{ and }\qquad n_{i,j}=\tfrac{c_{I,i}\,B_{i,j}}{A_i\,c_{II,i,j}\sqrt{p_{i,j}}},\quad j\in {\mathcal {V}}_i,\;i=1,\ldots ,I. \end{aligned}$$

\(\square \)
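Because the eigenvector is available in closed form, the allocation of Theorem 4.1 needs no numerical eigen-solver. The Python sketch below is an illustration only. It takes \(A_i\) and \(B_{i,j}\) as given and assumes they are normalized as in the Lagrange function of the proof, so that \(T_i/\kappa _i=A_i^2/m_i+\sum _jB_{i,j}^2/(m_in_{i,j})\). Function names and numbers are hypothetical.

```python
import math


def pps_allocation(C, domains):
    """domains: list of dicts with keys c_I, A and psus = [(c_II, B, p), ...]."""
    a = [d["c_I"] * d["A"]
         + sum(c2 * B * math.sqrt(p) for c2, B, p in d["psus"])
         for d in domains]
    norm2 = sum(x * x for x in a)                      # ||a||^2
    m = [C * ai * d["A"] / (d["c_I"] * norm2) for ai, d in zip(a, domains)]
    n = [[d["c_I"] * B / (d["A"] * c2 * math.sqrt(p)) for c2, B, p in d["psus"]]
         for d in domains]
    return m, n, norm2 / C                             # last item: T


def expected_cost(domains, m, n):
    """Left-hand side of the cost constraint (21)."""
    return sum(
        mi * (d["c_I"] ** 2
              + sum(c2 ** 2 * p * x for (c2, _B, p), x in zip(d["psus"], ni)))
        for d, mi, ni in zip(domains, m, n))


def scaled_variances(domains, m, n):
    """A_i^2/m_i + sum_j B_ij^2/(m_i n_ij) for every domain."""
    return [d["A"] ** 2 / mi
            + sum(B ** 2 / (mi * x) for (_c2, B, _p), x in zip(d["psus"], ni))
            for d, mi, ni in zip(domains, m, n)]


# Hypothetical inputs, for illustration only (p sums to 1 within a domain).
domains = [
    {"c_I": 2.0, "A": 1.5,
     "psus": [(1.0, 0.8, 0.5), (1.0, 0.6, 0.3), (2.0, 0.4, 0.2)]},
    {"c_I": 1.0, "A": 0.9, "psus": [(1.5, 0.5, 0.6), (1.0, 0.7, 0.4)]},
]
m, n, T = pps_allocation(50.0, domains)
print(expected_cost(domains, m, n), scaled_variances(domains, m, n), T)
```

At this allocation the expected cost should equal C and all scaled variances should equal \(T=\Vert {\underline{a}}\Vert ^2/C\).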

4.2 SRSWOR at the first stage and pps sampling at the second stage

For completeness of the picture for two-stage sampling involving pps approach, let us consider the situation when the PSUs sample \({\mathcal {S}}^{(I)}\) is drawn through SRSWOR and the SSUs sample by sampling with replacement with probabilities \(p_k\) proportional to the size of kth unit. Here, the simplification of Sect. 4.1 is no longer available. This case falls under the general framework developed in Sect. 3.3.

The standard estimator of the total is

$$\begin{aligned} {\hat{t}}=\tfrac{M}{m}\,\sum _{j\in {\mathcal {S}}^{(I)}}\,\tfrac{1}{n_j}\sum _{\ell =1}^{n_j}\,\tfrac{y_{K_{j,\ell }}}{p_{K_{j,\ell }}}, \end{aligned}$$

where m is the number of PSUs drawn by the SRSWOR from the total of M PSUs in the population, \(n_j\) is the number of “with-replacement” draws from jth PSU, \(K_{j,\ell }\) is the SSU drawn from jth PSU in the \(\ell \)th draw (with replacement), \(j\in {\mathcal {V}}\) (PSUs population of size M). Evidently, \({\hat{t}}\) is unbiased for the population total. Its variance is

$$\begin{aligned} D^2({\hat{t}})=M^2\left( \tfrac{1}{m}-\tfrac{1}{M}\right) S_I^2+\tfrac{M}{m}\,\sum _{j\in {\mathcal {V}}}\,\tfrac{1}{n_j}D^2_{II,j}, \end{aligned}$$

where

$$\begin{aligned} S_I^2=\tfrac{1}{M-1}\,\sum _{j\in {\mathcal {V}}}\,\left( t_j-{\bar{t}}\right) ^2,\quad {\bar{t}}=\tfrac{1}{M}\,\sum _{j\in {\mathcal {V}}}\,t_j, \end{aligned}$$

and for any \(j\in {\mathcal {V}}\)

$$\begin{aligned} t_j=\sum _{k\in PSU_j}\,y_k,\quad \quad D_{II,j}^2=\sum _{k\in PSU_j}\,\left( \tfrac{y_k}{p_k}-t_j\right) ^2p_k. \end{aligned}$$

Consequently, to obtain the optimal allocation of the samples (on the first and second stage) in the domains with given priority weights \(\kappa _i\), \(i=1,\ldots ,I\), we need to minimize

$$\begin{aligned} T_i=\tfrac{1}{\tau _i^2}\left( \tfrac{M_i^2S_{I,i}^2}{m_i}+\tfrac{M_i}{m_i}\,\sum _{j\in {\mathcal {V}}_i}\,\tfrac{1}{n_{i,j}}D^2_{II,i,j}-M_iS_{I,i}^2\right) ,\quad i=1,\ldots ,I, \end{aligned}$$

under the constraints given by the priority weights \(T_i=\kappa _i T\) and the expected cost constraint

$$\begin{aligned} \sum _{i=1}^I\,\left( c_{I,i}^2m_i+\tfrac{m_i}{M_i}\,\sum _{j\in {\mathcal {V}}_i}\,c_{II,i,j}^2n_{i,j}\right) =C, \end{aligned}$$

where C is the total expected cost of the survey, \(c_{I,i}^2\) is the cost incurred by a PSU from \({\mathcal {V}}_i\) (assumed to be constant within the domain) and \(c_{II,i,j}^2\) is the cost incurred by an SSU belonging to the jth PSU from the ith domain.

Since the structure of the problem is exactly the same as for the one considered in Sect. 3.3, we conclude that the optimal allocation has the form

$$\begin{aligned} m_i=C\tfrac{v_i^* \gamma _i}{\sqrt{\kappa _i}\,c_{I,i}\, \sum _{r=1}^I\,v_r^*\,\tfrac{1}{\sqrt{\kappa _r}}\,\left( c_{I,r}\gamma _r+\sum _{t\in {\mathcal {V}}_r}\,c_{II,r,t}\,\beta _{r,t}\right) },\quad i=1,\ldots ,I, \end{aligned}$$

and

$$\begin{aligned} n_{i,j}=C\tfrac{M_i}{m_i}\,\tfrac{v_i^*\beta _{i,j}}{\sqrt{\kappa _i}\,c_{II,i,j}\,\sum _{r=1}^I\,v_r^*\,\tfrac{1}{\sqrt{\kappa _r}}\, \left( c_{I,r}\,\gamma _r+\sum _{t\in {\mathcal {V}}_r}\,c_{II,r,t}\,\beta _{r,t}\right) },\quad j\in {\mathcal {V}}_i,\;i=1,\ldots ,I, \end{aligned}$$

where

$$\begin{aligned} \gamma _i^2=\tfrac{M_i^2S_{I,i}^2}{\tau _i^2},\qquad \beta _{i,j}^2=\tfrac{D^2_{II,i,j}}{\tau _i^2} \end{aligned}$$

and \({\underline{v}}^*\) is the eigenvector (having all components of the same sign) of the matrix \(\mathbf{D}=\tfrac{{\underline{a}}{\underline{a}}^T}{C}-\mathrm {diag}({\underline{c}})\) with components of \({\underline{a}}\) of the form

$$\begin{aligned} a_i=\tfrac{c_{I,i}\gamma _i\,+\,\sum _{j\in {\mathcal {V}}_i}\,c_{II,i,j}\,\beta _{i,j}}{\sqrt{\kappa _i}}\quad \text{ and }\quad c_i=\tfrac{M_iS_{I,i}^2}{\tau _i^2\kappa _i},\quad i=1,\ldots ,I. \end{aligned}$$

5 Three-stage sampling without stratification

In multistage sampling, typically, we do not go beyond three-stage sampling. This scheme is described in detail, for example, in Särndal et al. (1992, Ch. 4.4.2). The optimal allocation of the sample among the three stages under the cost constraints, with the additional simplifying assumption that the sizes of SSU and TSU (tertiary sampling unit) samples do not depend on PSU or SSU, respectively, was already studied in Cochran (1977, Ch. 10.8) (see also Singh 2003, Ch. 10.4). Recently, the optimal allocation procedure, using a simplified variance formula with the standard constraints regarding the total costs, was designed in Valliant et al. (2015) as a part of their PracTools R package. An application of such a simple three-stage sampling design is given, for example, in Tate and Hudgens (2007).

In this section, we analyze the eigenproblem approach to the domain-efficient allocation of sample in three-stage sampling, but first we recall the Neyman-type optimal allocation in the case of no domains.

It is well known that the variance of the standard estimator \({\hat{t}}\) of the total of a variable \({\mathcal {Y}}\) in a population U under three-stage sampling with SRSWOR on every stage has the form

$$\begin{aligned} D^2= & {} \left( \tfrac{1}{\ell }-\tfrac{1}{L}\right) \,L^2S_I^2+\tfrac{L}{\ell }\sum _{j=1}^L\,\left( \tfrac{1}{m_j}-\tfrac{1}{M_j}\right) \,M_j^2S_{II,j}^2\\&+\,\tfrac{L}{\ell }\sum _{j=1}^L\,\tfrac{M_j}{m_j}\,\sum _{k=1}^{M_j}\,\left( \tfrac{1}{n_{j,k}}-\tfrac{1}{N_{j,k}}\right) N_{j,k}^2S_{III,j,k}^2, \end{aligned}$$

where L and \(\ell \) denote the number of PSUs, \(M_j\) and \(m_j\) the number of SSUs in the jth PSU, \(N_{j,k}\) and \(n_{j,k}\) the number of TSUs in (jk)th SSU, in population and in the sample, respectively; moreover, \(S_I^2\), \(S_{II,j}^2\), \(S_{III,j,k}^2\) denote population variances for PSUs in U, SSUs in jth PSU and TSUs in (jk)th SSU.

Then, the minimization of \(D^2\) (or \(D^2/\tau ^2\)) under the cost constraints

$$\begin{aligned} c_I^2\ell +\tfrac{\ell }{L}\sum _{j=1}^{L}\,c_{II,j}^2m_{j} +\tfrac{\ell }{L}\sum _{j=1}^{L}\tfrac{m_{j}}{M_{j}}\sum _{k=1}^{M_{j}}\,c_{III,j,k}^2n_{j,k}=C, \end{aligned}$$
(23)

where \(c_{I}^2\), \(c_{II,j}^2\) and \(c_{III,j,k}^2\) are the costs generated by each PSU, each SSU belonging to the jth PSU and each TSU belonging to the kth SSU of the jth PSU, while C denotes the overall cost of the survey, carried out through the standard Neyman approach, leads to the following optimal allocation

$$\begin{aligned} \ell =\tfrac{C\gamma }{c_I\left( c_I\gamma +\sum _{j=1}^L\left( c_{II,j}\beta _j+\sum _{k=1}^{M_j}\,c_{III,j,k}\delta _{j,k}\right) \right) }, \end{aligned}$$
(24)

where \(\gamma ^2=L(LS_I^2-\sum _{j=1}^L\,M_jS_{II,j}^2)\) is assumed to be positive,

$$\begin{aligned} m_j=\tfrac{c_IL\beta _j}{c_{II,j} \gamma },\quad j=1,\ldots ,L, \end{aligned}$$
(25)

where \(\beta _j^2=M_j\left( M_jS_{II,j}^2-\sum _{k=1}^{M_j}\,N_{j,k}S_{III,j,k}^2\right) \) is also assumed to be positive, and

$$\begin{aligned} n_{j,k}=\tfrac{c_{II,j}M_j\delta _{j,k}}{c_{III,j,k}\beta _j},\quad k=1,\ldots ,M_j,\;j=1,\ldots ,L, \end{aligned}$$
(26)

where \(\delta _{j,k}^2=N_{j,k}^2S_{III,j,k}^2\). The optimal variance assumes the form

$$\begin{aligned} D^2_{opt}=\tfrac{1}{C}\left( c_I\gamma +\sum _{j=1}^L\left( c_{II,j}\beta _j+\sum _{k=1}^{M_j}\,c_{III,j,k}\delta _{j,k}\right) \right) ^2-LS_I^2. \end{aligned}$$
(27)
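Formulas (24)–(27) can be verified numerically in the same spirit as for two-stage sampling. The Python sketch below is an illustration under stated assumptions: \(\gamma \), \(\beta _j\) and \(\delta _{j,k}\) are taken as given positive numbers, L and \(M_j\) are read off from the lengths of the input lists, and all names and values are hypothetical. It checks that the allocation satisfies the cost constraint (23) and that the non-constant part of the variance equals the first term of (27).

```python
def three_stage_neyman(C, c_I, gamma, psus):
    """psus: list of dicts with keys c_II, beta and ssus = [(c_III, delta), ...];
    L = len(psus) and M_j = len(ssus)."""
    L = len(psus)
    S = c_I * gamma + sum(
        p["c_II"] * p["beta"] + sum(c3 * d for c3, d in p["ssus"])
        for p in psus)
    ell = C * gamma / (c_I * S)                                   # (24)
    m = [c_I * L * p["beta"] / (p["c_II"] * gamma) for p in psus]  # (25)
    n = [[p["c_II"] * len(p["ssus"]) * d / (c3 * p["beta"])
          for c3, d in p["ssus"]] for p in psus]                   # (26)
    return ell, m, n, S


def expected_cost(c_I, psus, ell, m, n):
    """Left-hand side of the cost constraint (23)."""
    L = len(psus)
    total = c_I ** 2 * ell
    for p, m_j, n_j in zip(psus, m, n):
        third = sum(c3 ** 2 * x for (c3, _), x in zip(p["ssus"], n_j))
        total += ell / L * m_j * (p["c_II"] ** 2 + third / len(p["ssus"]))
    return total


def variable_variance(gamma, psus, ell, m, n):
    """Variance D^2 without the constant term -L*S_I^2."""
    L = len(psus)
    total = gamma ** 2
    for p, m_j, n_j in zip(psus, m, n):
        inner = sum(d ** 2 / x for (_, d), x in zip(p["ssus"], n_j))
        total += L * (p["beta"] ** 2 + len(p["ssus"]) * inner) / m_j
    return total / ell


# Hypothetical inputs, for illustration only.
psus = [
    {"c_II": 1.0, "beta": 3.0, "ssus": [(0.5, 2.0), (0.5, 1.0)]},
    {"c_II": 1.5, "beta": 2.0, "ssus": [(1.0, 1.5), (0.5, 2.5), (1.0, 1.0)]},
]
C, c_I, gamma = 200.0, 2.0, 12.0
ell, m, n, S = three_stage_neyman(C, c_I, gamma, psus)
print(expected_cost(c_I, psus, ell, m, n))                   # should reproduce C
print(variable_variance(gamma, psus, ell, m, n), S ** 2 / C)  # should agree
```

As before, box constraints and integrality are ignored in this sketch.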

Since we will be considering three-stage sampling in subpopulations, all these quantities will be related to a subpopulation by additional subscript \(i=1,\ldots ,I\).

Similarly, as in previous sections, we will be interested in minimization of relative variances in subpopulations, provided they satisfy the constraints defined by priority weights \((\kappa _i,\,i=1,\ldots ,I)\), which assume the form

$$\begin{aligned}&\tfrac{\left( \tfrac{1}{\ell _i}-\tfrac{1}{L_i}\right) \,L_i^2S_{I,i}^2 +\tfrac{L_i}{\ell _i}\sum _{j=1}^{L_i}\,\left( \tfrac{1}{m_{i,j}}-\tfrac{1}{M_{i,j}}\right) \,M_{i,j}^2S_{II,i,j}^2 +\tfrac{L_i}{\ell _i}\sum _{j=1}^{L_i}\,\tfrac{M_{i,j}}{m_{i,j}}\,\sum _{k=1}^{M_{i,j}}\,\left( \tfrac{1}{n_{i,j,k}}-\tfrac{1}{N_{i,j,k}}\right) N_{i,j,k}^2S_{III,i,j,k}^2}{\tau _i^2}\nonumber \\&\qquad =\kappa _iT \end{aligned}$$
(28)

where T is unknown and has to be minimized under an additional total EVC constraint which in the case of three-stage sampling assumes the form

$$\begin{aligned} \sum _{i=1}^I\,c_{I,i}^2\ell _i+\sum _{i=1}^I\,\tfrac{\ell _i}{L_i}\sum _{j=1}^{L_i}\,c_{II,i,j}^2m_{i,j} +\sum _{i=1}^I\,\tfrac{\ell _i}{L_i}\sum _{j=1}^{L_i}\tfrac{m_{i,j}}{M_{i,j}}\sum _{k=1}^{M_{i,j}}\,c_{III,i,j,k}^2n_{i,j,k}=C,\nonumber \\ \end{aligned}$$
(29)

where \(c_{I,i}^2\), \(c_{II,i,j}^2\) and \(c_{III,i,j,k}^2\) are the costs generated by each PSU from the ith subpopulation, each SSU belonging to the jth PSU of the ith subpopulation and each TSU belonging to the kth SSU of the jth PSU of the ith subpopulation, while C denotes the overall cost of the survey.

Therefore, the Lagrange function, up to a constant shift, is of a rather complicated, though regular form:

$$\begin{aligned}&F(T,{\underline{\ell }},{\underline{m}},{\underline{n}})=T+\sum _{i=1}^I\,\tfrac{\lambda _i}{\tau _i^2}\,\left( \tfrac{\gamma _i^2 +L_i\sum _{j=1}^{L_i}\,\tfrac{\beta _{i,j}^2+M_{i,j} \,\sum _{k=1}^{M_{i,j}}\,\tfrac{\delta _{i,j,k}^2}{n_{i,j,k}}}{m_{i,j}}}{\kappa _i\ell _i}-T\right) \\&\quad +\,\mu \left( \sum _{i=1}^I\,\ell _i\left( c_{I,i}^2+\tfrac{1}{L_i}\sum _{j=1}^{L_i}\,m_{i,j}\left( c_{II,i,j}^2 +\tfrac{1}{M_{i,j}}\sum _{k=1}^{M_{i,j}}\,c_{III,i,j,k}^2n_{i,j,k}\right) \right) \right) , \end{aligned}$$

where

$$\begin{aligned}&\gamma _i^2=L_i\left( L_iS_{I,i}^2-\sum _{j=1}^{L_i}\,M_{i,j}S_{II,i,j}^2\right) ,\\&\beta _{i,j}^2=M_{i,j}\left( M_{i,j}S_{II,i,j}^2-\sum _{k=1}^{M_{i,j}}\,N_{i,j,k}S_{III,i,j,k}^2\right) \quad \text{ and }\quad \delta _{i,j,k}^2=N_{i,j,k}^2S_{III,i,j,k}^2. \end{aligned}$$

Denoting \(\rho _i=\tau _i\sqrt{\kappa _i}\) and differentiating F with respect to:

  1. \(\ell _i\) we get

    $$\begin{aligned}&-\tfrac{\lambda _i}{\rho _i^2\ell _i^2}\left( \gamma _i^2+L_i\sum _{j=1}^{L_i}\,\tfrac{1}{m_{i,j}}\left( \beta _{i,j}^2 +M_{i,j}\,\sum _{k=1}^{M_{i,j}}\,\tfrac{\delta _{i,j,k}^2}{n_{i,j,k}}\right) \right) \nonumber \\&\qquad +\,\mu \left( c_{I,i}^2+\tfrac{1}{L_i}\sum _{j=1}^{L_i}m_{i,j}\,\left( c_{II,i,j}^2+\tfrac{1}{M_{i,j}}\sum _{k=1}^{M_{i,j}}\,c_{III,i,j,k}^2n_{i,j,k}\right) \right) =0,\qquad \quad \end{aligned}$$
    (30)
  2. \(m_{i,j}\) we get

    $$\begin{aligned}&-\tfrac{\lambda _iL_i}{\rho _i^2\ell _i m_{i,j}^2}\left( \beta _{i,j}^2+M_{i,j}\,\sum _{k=1}^{M_{i,j}}\,\tfrac{\delta _{i,j,k}^2}{n_{i,j,k}}\right) \nonumber \\&\qquad +\,\mu \tfrac{\ell _i}{L_i}\left( c_{II,i,j}^2 +\tfrac{1}{M_{i,j}}\sum _{k=1}^{M_{i,j}}\,c_{III,i,j,k}^2n_{i,j,k}\right) =0, \end{aligned}$$
    (31)
  3. \(n_{i,j,k}\) we get

    $$\begin{aligned} -\tfrac{\lambda _iL_iM_{i,j}}{\rho _i^2\ell _im_{i,j}n_{i,j,k}^2}\delta _{i,j,k}^2+\mu \tfrac{\ell _i}{L_i}\, \tfrac{m_{i,j}}{M_{i,j}}\,c_{III,i,j,k}^2=0. \end{aligned}$$
    (32)

Note that from (32), we get

$$\begin{aligned} \ell _i m_{i,j} n_{i,j,k}=\tfrac{\sqrt{\lambda _i}\,L_iM_{i,j}\delta _{i,j,k}}{\sqrt{\mu }\,\sqrt{\kappa _i}\,c_{III,i,j,k}}. \end{aligned}$$
(33)

Now multiply (32) by \(n_{i,j,k}/m_{i,j}\), sum over \(k=1,\ldots ,M_{i,j}\), and insert the result into (31). After cancellations, one gets

$$\begin{aligned} \ell _i m_{i,j}=\tfrac{\sqrt{\lambda _i}\,L_i\beta _{i,j}}{\sqrt{\mu }\,\sqrt{\kappa _i}\,c_{II,i,j}}. \end{aligned}$$
(34)

Next, multiply (31) by \(m_{i,j}/\ell _i\), sum over \(j=1,\ldots ,L_i\), and insert the result into (30). After cancellations, one gets

$$\begin{aligned} \ell _i=\tfrac{\sqrt{\lambda _i}\gamma _i}{\sqrt{\mu }\,\sqrt{\kappa _i}\,c_{I,i}}. \end{aligned}$$
(35)

Formulas for \(n_{i,j,k}\) and \(m_{i,j}\) follow directly from (33) and (34) and from (34) and (35), respectively.

After inserting (33), (34) and (35) into (29), we obtain

$$\begin{aligned} \sqrt{\mu }=\tfrac{1}{C}\sum _{i=1}^I\,\sqrt{\lambda _i}\,\tfrac{c_{I,i}\gamma _i+\sum _{j=1}^{L_i}\,\left( c_{II,i,j}\beta _{i,j} +\sum _{k=1}^{M_{i,j}}\,c_{III,i,j,k}\delta _{i,j,k}\right) }{\sqrt{\kappa _i}}. \end{aligned}$$
(36)

On the other hand, if we plug (33), (34) and (35) into the constraint (28), we obtain

$$\begin{aligned}&\tfrac{\sqrt{\mu }\,\sqrt{\kappa _i}\,c_{I,i}\,\gamma _i}{\sqrt{\lambda _i}}+ L_i\sum _{j=1}^{L_i}\,\left( \tfrac{\sqrt{\mu }\,\sqrt{\kappa _i}\,c_{II,i,j}\,\beta _{i,j}}{\sqrt{\lambda _i}\,L_i}+M_{i,j} \,\sum _{k=1}^{M_{i,j}}\,\tfrac{\sqrt{\mu }\,\sqrt{\kappa _i}\,c_{III,i,j,k}\,\delta _{i,j,k}}{\sqrt{\lambda _i}\,L_iM_{i,j}}\right) \\&\quad -\,\tfrac{L_iS_{I,i}^2}{\kappa _i\tau _i^2}=\kappa _iT. \end{aligned}$$

Denote \(v_i=\sqrt{\lambda _i}\) and multiply the above equation by \(v_i/\kappa _i\). Then, after cancellations and upon denoting

$$\begin{aligned} \nu _i=c_{I,i}\,\gamma _i+ \,\sum _{j=1}^{L_i}\,\left( c_{II,i,j}\,\beta _{i,j}+\,\sum _{k=1}^{M_{i,j}}\,c_{III,i,j,k}\,\delta _{i,j,k}\right) \end{aligned}$$

we have

$$\begin{aligned} \sqrt{\mu }\,\tfrac{\nu _i}{\sqrt{\kappa _i}}-c_iv_i=Tv_i, \end{aligned}$$

where \(c_i=\tfrac{L_iS_{I,i}^2}{\kappa _i\tau _i^2}\), \(i=1,\ldots ,I\). Expanding now \(\sqrt{\mu }\) according to (36), we conclude that

$$\begin{aligned} \tfrac{{\underline{a}}{\underline{a}}^T}{C}{\underline{v}}-\mathrm {diag}({\underline{c}}){\underline{v}}=T{\underline{v}}, \end{aligned}$$

where \({\underline{a}}=(a_i,\,i=1,\ldots ,I)^T\) with

$$\begin{aligned} a_i=\tfrac{\nu _i}{\sqrt{\kappa _i}},\quad i=1,\ldots ,I, \end{aligned}$$

and \({\underline{c}}=(c_i,\,i=1,\ldots ,I)\). Consequently, for \(\mathbf{D}=\tfrac{{\underline{a}}{\underline{a}}^T}{C}-\mathrm {diag}({\underline{c}})\) we have \(\mathbf{D}{\underline{v}}=T{\underline{v}}\).
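Numerically, \(\mathbf{D}\) is a symmetric rank-one perturbation of a diagonal matrix, so its eigenpairs can be obtained with a standard symmetric eigensolver. The following Python sketch is a minimal illustration, assuming that \({\underline{a}}\), \({\underline{c}}\) and C have already been computed; the function name `positive_eigenpair` and the toy numbers are hypothetical and are not taken from the paper.

```python
import numpy as np


def positive_eigenpair(a, c, C):
    """Return (T, v, D) where D = a a^T / C - diag(c) and D v = T v with T > 0.

    Raises ValueError if D has no positive eigenvalue.
    """
    a = np.asarray(a, dtype=float)
    c = np.asarray(c, dtype=float)
    D = np.outer(a, a) / C - np.diag(c)

    # eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(D)
    T = eigvals[-1]
    if T <= 0:
        raise ValueError("D has no positive eigenvalue")

    v = eigvecs[:, -1]
    if v.sum() < 0:      # eigenvectors are defined up to sign; pick the positive one
        v = -v
    return T, v, D


# Invented toy inputs with I = 3 domains.
T, v, D = positive_eigenpair(a=[2.0, 3.0, 4.0], c=[0.5, 1.0, 1.5], C=10.0)
print("T =", T)
print("v =", v)
```

Only the direction of \({\underline{v}}\) matters for the allocation, so any positive rescaling of the returned eigenvector can be used.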

Theorem 5.1

Assume that

$$\begin{aligned} L_iS_{I,i}^2-\sum _{j=1}^{L_i}\,M_{i,j}S_{II,i,j}^2>0,\quad i=1,\ldots ,I, \end{aligned}$$

and

$$\begin{aligned} M_{i,j}S_{II,i,j}^2-\sum _{k=1}^{M_{i,j}}\,N_{i,j,k}S_{III,i,j,k}^2>0,\quad j=1,\ldots ,L_i,\;i=1,\ldots ,I. \end{aligned}$$

Assume that the matrix \(\mathbf{D}\) has a positive eigenvalue \(\lambda ^*\). Then it is the unique positive eigenvalue of \(\mathbf{D}\), it is simple, and the respective eigenvector \({\underline{v}}^*\) has all coordinates of the same sign.

The allocation \({{\underline{\ell }}}\), \({\underline{m}}\) and \({\underline{n}}\) which minimizes all relative domain-wise variances \(T_i\), \(i=1,\ldots ,I\), (as well as the relative variance \({S}\) in the whole population) under the constraints \(T_i=\kappa _i T\), \(i=1,\ldots ,I\), and under the EVC constraint (29) has the form

$$\begin{aligned}&\ell _i=C\tfrac{v_i^*\gamma _i}{\sqrt{\kappa _i}\,c_{I,i}\sum _{r=1}^I\,v_r^*\,\tfrac{1}{\sqrt{\kappa _r}}\nu _r}, \end{aligned}$$
(37)
$$\begin{aligned}&m_{i,j}=\tfrac{c_{I,i}L_i\beta _{i,j}}{\gamma _i c_{II,i,j}} \end{aligned}$$
(38)

and

$$\begin{aligned} n_{i,j,k}=\tfrac{c_{II,i,j}M_{i,j}\delta _{i,j,k}}{\beta _{i,j}c_{III,i,j,k}} \end{aligned}$$
(39)

for any \(k=1,\ldots ,M_{i,j}\), \(j=1,\ldots ,L_i\), \(i=1,\ldots ,I\).

Moreover, the minimal relative variances in the domains are \(T_i=\kappa _i T\), \(i=1,\ldots ,I\), where T, the base of the relative variance, has the form

$$\begin{aligned} T=\lambda ^*=\tfrac{1}{C}\left( \sum _{i=1}^I\,\tfrac{\nu _i\tau _i\sqrt{\kappa _i}}{v_i^*}\right) \left( \sum _{i=1}^I\,\tfrac{\nu _iv_i^*}{\tau _i\sqrt{\kappa _i}}\right) -\sum _{i=1}^I\,L_iS_{I,i}^2. \end{aligned}$$
(40)

Remark 5.1

Note that in the case of no domains, i.e., when \(I=1\), the allocation formulas (37)–(39) as well as the formula for the optimal variance (40) are simplified to the Neyman-type allocation and optimal variance formulas as given in (24)–(26) and (27), respectively.
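To see this for (37), put \(I=1\): the sum in the denominator consists of the single term \(v_1^*\nu _1/\sqrt{\kappa _1}\), so \(v_1^*\) and \(\sqrt{\kappa _1}\) cancel and

$$\begin{aligned} \ell _1=C\,\tfrac{\gamma _1}{c_{I,1}\,\nu _1}, \end{aligned}$$

with \(\nu _1\) as defined before Theorem 5.1.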

Note also that only the allocation of the first-stage sample and the optimal base of the variance depend on the eigenvector \({\underline{v}}^*\). Formulas (38) and (39) for the allocation of the second- and third-stage samples are given directly in terms of population quantities with no reference to the eigenvector \({\underline{v}}^*\).
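The allocation formulas (37)–(39) can be evaluated directly once \({\underline{v}}^*\) is available. The Python sketch below implements them as printed, under several assumptions that are not part of the paper: the function names, the nested-list data layout (with \(L_i\) and \(M_{i,j}\) read off as list lengths), the toy numbers, and the form of the expected cost, which is read off the \(\mu \)-term of F and assumed to be the left-hand side of (29). Sample sizes are returned as real numbers, without rounding.

```python
import math


def nu_values(gamma, cI, beta, cII, delta, cIII):
    """nu_i = c_{I,i} gamma_i + sum_j (c_{II,i,j} beta_{i,j} + sum_k c_{III,i,j,k} delta_{i,j,k})."""
    return [
        cI[i] * gamma[i]
        + sum(
            cII[i][j] * beta[i][j]
            + sum(c3 * d for c3, d in zip(cIII[i][j], delta[i][j]))
            for j in range(len(beta[i]))
        )
        for i in range(len(gamma))
    ]


def allocate(v, C, kappa, gamma, cI, beta, cII, delta, cIII):
    """Evaluate (37)-(39); v plays the role of the eigenvector v*."""
    I = len(gamma)
    nu = nu_values(gamma, cI, beta, cII, delta, cIII)
    denom = sum(v[r] * nu[r] / math.sqrt(kappa[r]) for r in range(I))
    # (37): first-stage sizes, the only ones that depend on v
    ell = [C * v[i] * gamma[i] / (math.sqrt(kappa[i]) * cI[i] * denom) for i in range(I)]
    # (38): second-stage sizes, with L_i = len(beta[i])
    m = [[cI[i] * len(beta[i]) * b / (gamma[i] * c2) for b, c2 in zip(beta[i], cII[i])]
         for i in range(I)]
    # (39): third-stage sizes, with M_{i,j} = len(delta[i][j])
    n = [[[cII[i][j] * len(delta[i][j]) * d / (beta[i][j] * c3)
           for d, c3 in zip(delta[i][j], cIII[i][j])]
          for j in range(len(beta[i]))]
         for i in range(I)]
    return ell, m, n


def expected_cost(ell, m, n, cI, cII, cIII):
    """Assumed cost: sum_i ell_i (cI^2 + (1/L_i) sum_j m_ij (cII^2 + (1/M_ij) sum_k cIII^2 n_ijk))."""
    total = 0.0
    for i in range(len(ell)):
        inner = 0.0
        for j in range(len(m[i])):
            third = sum(c3 ** 2 * nk for c3, nk in zip(cIII[i][j], n[i][j])) / len(n[i][j])
            inner += m[i][j] * (cII[i][j] ** 2 + third)
        total += ell[i] * (cI[i] ** 2 + inner / len(m[i]))
    return total


# Invented toy data with I = 2 domains.
kappa = [1.0, 2.0]
gamma, cI = [30.0, 20.0], [2.0, 3.0]
beta, cII = [[5.0, 4.0], [6.0]], [[1.0, 1.5], [2.0]]
delta = [[[2.0, 3.0], [1.0]], [[4.0, 2.0, 1.0]]]
cIII = [[[0.5, 1.0], [0.8]], [[1.0, 0.5, 2.0]]]

ell, m, n = allocate([0.6, 0.8], 100.0, kappa, gamma, cI, beta, cII, delta, cIII)
print(ell, m, n, expected_cost(ell, m, n, cI, cII, cIII))
```

Calling `allocate` with a different positive vector in place of `[0.6, 0.8]` changes only `ell`, which is one way to observe that the second- and third-stage sizes do not depend on the eigenvector.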

6 Conclusions

In this paper, we search for Neyman-type solutions to domains-efficient allocation in multistage stratified sampling. Such a solution can be seen as an alternative to the purely numerical one proposed in CRH for the stratified single-stage scheme. We develop the eigenproblem method originating in NW and use eigenvalues and eigenvectors for allocation which, under specified priority coefficients for the constraints on the domains' relative variances, assures optimal estimation both in the whole population and in the domains. In particular, we consider two- and three-stage sampling. The solutions we provide here are novel in several respects compared with what is known for the eigenproblem approach to domains-efficient allocation. The most important is that, in contrast to earlier work, for example WW, a single total cost constraint is taken into account. In previous papers, two constraints related to the (expected) sample sizes of the PSUs and SSUs, respectively, were instead jointly imposed. Those papers considered two-stage sampling with SRSWOR (or Hartley–Rao) schemes with stratification at either the first or the second stage. Here, we apply the eigenproblem methodology also to new sampling schemes: stratified SRSWOR at both stages as well as \(\mathrm {pps}\) sampling with replacement and SRSWOR either at the first or the second stage, and to three-stage sampling with SRSWOR at each stage. In each of these cases, the allocation which assures optimality (under given domain priority weights) of estimators of domain totals is given in terms of eigenvectors of a population-dependent matrix (which typically is a rank-one perturbation of a diagonal matrix). Moreover, the standard errors of the estimates in the domains and in the whole population are given in terms of the respective eigenvalue. The latter allows one to interpret the solution as a direct generalization of Neyman-type optimal allocation to the multi-domain case.
Another important consequence of the approach we use here is that the analytic formulas we obtained reveal the structure of the optimal allocation. For example, it is visible that only the first-stage optimal allocation is influenced by the eigenvector \({\underline{v}}^*\) of the population matrix \(\mathbf{D}\).