Order Statistics Based on a Combined Simple Random Sample from a Finite Population and Applications to Inference

Ozturk, Omer; Balakrishnan, Narayanaswamy; Kravchuk, Olena

doi:10.1007/s13171-020-00228-x

Published: 12 January 2021

Sankhya A, Volume 85, pages 77–101 (2023)

Omer Ozturk ORCID: orcid.org/0000-0002-0235-4772¹,
Narayanaswamy Balakrishnan² &
Olena Kravchuk³

Abstract

In this paper, we study the probability distributions of order statistics from a set obtained by combining several simple random samples (SRSs) selected from the same finite population. Each simple random sample is selected without replacement and therefore contains no ties. In the combined sample, however, the same observation may appear more than once, since every SRS is drawn from the same finite population. Consequently, the number of distinct observations in the combined sample is a discrete random variable, and we provide its probability mass function. Next, using the order statistics of the combined SRSs, we construct confidence intervals for the quantiles and outer-inner confidence intervals for the quantile interval of a finite population. Finally, we present a prediction interval for a future observation from the same finite population.

Acknowledgments

This research is supported by the Australian Grain Research and Development Corporation (GRDC) as part of the Statistics for the Australian Grains Industry project (UA00164). The data were collected by Bachelor of Agriculture Science students of the School of Agriculture, Food and Wine, University of Adelaide in 2019 under the supervision of Mr. Peter Kasprzak. The data were kindly made available for our purposes and other publications, as well as to improve the management and research decisions in the Coombe vineyard.

Author information

Authors and Affiliations

Department of Statistics, The Ohio State University, 1958 Neil Avenue, Columbus, OH, 43210, USA
Omer Ozturk
Department of Mathematics and Statistics, McMaster University, 1280 Main Street West, Ontario, L8S 4K1, Canada
Narayanaswamy Balakrishnan
School of Agriculture, Food and Wine, University of Adelaide, Adelaide, Australia
Olena Kravchuk

Corresponding author

Correspondence to Omer Ozturk.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Proof 1 (Proof of Theorem 1).

Suppose j = 0. Then, there is no tie in the combined sample. The number of combined samples with no ties can be obtained by selecting n₁ units out of N in the first sample and then n₂ units out of the remaining N − n₁ in the second sample, that is,

$$ \text{Number of samples } S_{1:2} \text{ with no ties} = \binom{N}{n_{1}} \binom{N-n_{1}}{n_{2}}. $$

Since the samples are selected independently, we readily have

$$ P(j=0) = \frac{\binom{N}{n_{1}} \binom{N-n_{1}}{n_{2}}}{\binom{N}{n_{1}} \binom{N}{n_{2}}}. $$

Now, suppose 0 < j ≤ n₁. Then there are j tied pairs in the combined sample S_1:2. Suppose that j specific units of the population appear in both samples. Then, there are

$$ \binom{N-j}{n_{1}-j} \binom{N-n_{1}}{n_{2}-j} $$

different ways to select the remaining untied units in both samples. Since the j tied units can be selected from the population in $\binom{N}{j}$ different ways, using the product rule we then obtain

$$ P(j~\text{tied pairs in } S_{1:2}) = \frac{\binom{N}{j} \binom{N-j}{n_{1}-j} \binom{N-n_{1}}{n_{2}-j}}{\binom{N}{n_{1}} \binom{N}{n_{2}}}, $$

which completes the proof. □
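The counting argument above can be checked numerically on a small population. The following is a minimal Python sketch (Python rather than the appendix's R, purely as an independent cross-check) that evaluates the tie probability from the proof and compares it with a brute-force enumeration of every pair of samples. The function names and the values N = 7, n₁ = 3, n₂ = 4 are illustrative assumptions and do not come from the paper.

```python
from itertools import combinations
from math import comb


def tie_pmf(j, N, n1, n2):
    """P(j tied pairs in S_1:2) as stated in the proof, for 0 <= j <= min(n1, n2)."""
    return comb(N, j) * comb(N - j, n1 - j) * comb(N - n1, n2 - j) / (
        comb(N, n1) * comb(N, n2)
    )


def tie_pmf_brute_force(N, n1, n2):
    """Enumerate every pair of samples and tabulate the number of shared units."""
    counts = [0] * (min(n1, n2) + 1)
    units = range(N)
    for s1 in combinations(units, n1):
        first = set(s1)
        for s2 in combinations(units, n2):
            counts[len(first.intersection(s2))] += 1
    total = comb(N, n1) * comb(N, n2)
    return [c / total for c in counts]


# Illustrative values only (hypothetical, not taken from the paper).
N, n1, n2 = 7, 3, 4
formula = [tie_pmf(j, N, n1, n2) for j in range(min(n1, n2) + 1)]
brute = tie_pmf_brute_force(N, n1, n2)
for j, (p, q) in enumerate(zip(formula, brute)):
    print(f"j={j}: formula={p:.6f}  enumeration={q:.6f}")
```

The enumeration is only feasible for small N, but it exercises both the j = 0 case and the 0 < j ≤ n₁ case of the proof with the same expression.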

Proof 2 (Proof of Theorem 2).

This is a special case of Theorem 3 and so there is no need to present its proof. □

Proof 3 (Proof of Theorem 3).

By conditioning on u_K = u, we can write

$$ P(Z_{(i:u)}=x_{t} \mid u_{K}=u) = f_{(i:u)}(x_{t}), $$

where f_(i:u) is the pmf of the i-th order statistic from a sample of size u selected without replacement from the population $\mathcal{P}$. If i < n_K, then P(u_K ≥ i) = 1, and the unconditional distribution follows from the joint distribution as

$$ P(Z_{(i:n_{T})}=x_{t}) = \sum\limits_{u=n_{K}}^{n_{T}} f_{(i:u)}(x_{t}) P(u_{K}=u), \quad i=1, \ldots, n_{K}-1. $$

If i ≥ n_K, the i-th order statistic is observed only if there are at least i distinct observations in the combined sample S_1:K. Hence, we must use the truncated distribution of u_K to marginalize the distribution of $Z_{(i:n_{T})}$ to get

$$ P(Z_{(i:n_{T})}=x_{t}) = \sum\limits_{u=n_{K}}^{n_{T}} \frac{f_{(i:u)}(x_{t}) P(u_{K}=u)}{P(u_{K} \ge i)}, \quad i=n_{K}, \ldots, n_{T}. $$

The proof is completed by combining these two pieces. □
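The marginalization in this proof can be illustrated for K = 2 with a short Python sketch. It rests on several assumptions that are not stated in this appendix: the population values x_1 < … < x_N are distinct; f_(i:u) is the standard order-statistic pmf under sampling without replacement, C(t−1, i−1) C(N−t, u−i) / C(N, u); f_(i:u)(x_t) is taken to be 0 when u < i; n_K is read as the larger sample size and n_T as the total sample size; and the pmf of u_K is obtained from the tie probability in Proof 1 through u = n₁ + n₂ − j. All function names and numerical values are hypothetical.

```python
from math import comb


def f_order(i, u, t, N):
    """pmf of the i-th order statistic at x_t for an SRS of size u drawn without
    replacement from x_1 < ... < x_N; taken as 0 when u < i (assumed convention)."""
    if u < i or u > N or t < i or N - t < u - i:
        return 0.0
    return comb(t - 1, i - 1) * comb(N - t, u - i) / comb(N, u)


def distinct_pmf(N, n1, n2):
    """pmf of u_K for K = 2: u = n1 + n2 - j, with P(j) as in Proof 1."""
    pmf = {}
    for j in range(min(n1, n2) + 1):
        p = comb(N, j) * comb(N - j, n1 - j) * comb(N - n1, n2 - j) / (
            comb(N, n1) * comb(N, n2)
        )
        pmf[n1 + n2 - j] = p
    return pmf


def z_pmf(i, N, n1, n2):
    """P(Z_(i:n_T) = x_t) for t = 1..N, covering both cases of the proof."""
    pu = distinct_pmf(N, n1, n2)
    # P(u_K >= i); this equals 1 (up to rounding) when i < n_K.
    tail = sum(p for u, p in pu.items() if u >= i)
    return [
        sum(f_order(i, u, t, N) * p for u, p in pu.items()) / tail
        for t in range(1, N + 1)
    ]


# Illustrative values only: N = 10, n1 = 3, n2 = 4, so n_K = 4 and n_T = 7.
for i in (2, 6):
    probs = z_pmf(i, 10, 3, 4)
    print(f"i={i}: total probability = {sum(probs):.12f}")
```

Dividing by P(u_K ≥ i) in both cases is harmless because the tail probability is 1 whenever i < n_K, so a single expression reproduces the two pieces of the proof, and each resulting pmf sums to one over the population values.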

R-function

#############################
# This function is used by StepKF.
# For each k in KV it returns the probability of k ties when two sets of
# n and m distinct units from a population of size N are combined.
tieF <- function(KV, N, n, m) {
  ret <- rep(0, length(KV))
  ki <- 1
  for (k in KV) {
    if (k > min(n, m)) stop("k must not exceed the minimum of n and m")
    ret[ki] <- choose(N, k) * choose(N - k, n - k) * choose(N - n, m - k) /
      (choose(N, n) * choose(N, m))
    ki <- ki + 1
  }
  return(ret)
}

########################################
# This function computes the probability mass function
# of the number of distinct observations in the
# K-th combined sample.
# P1: probability mass function of the number of distinct observations in the (K-1)-st step
# NV: sample size vector, ordered from smallest to largest (at least two samples)
# N:  population size; it is not an argument and must be defined in the calling environment
StepKF <- function(NV) {
  K <- length(NV)                       # length of the sample size vector
  P1 <- matrix(c(NV[1], 1), ncol = 2)   # step 1: NV[1] distinct observations with probability 1
  for (k in 2:K) {
    MU <- sum(NV[1:k])                  # maximum number of distinct observations in step k
    Ck <- NV[k]:MU                      # range of the number of distinct observations in step k
    P2 <- matrix(0, ncol = 2, nrow = length(Ck))  # updated pmf in step k
    P2[, 1] <- Ck                       # numbers of distinct observations in step k
    dv <- P1[, 1]                       # numbers of distinct observations in step k-1
    for (dd in dv) {
      Jk <- 0:min(dd, NV[k])            # possible numbers of ties in step k, given that
                                        # step k-1 had dd distinct observations
      JkD <- sort(Jk, decreasing = TRUE)
      dold <- which(dd == dv)
      MV <- sort(c(P1[dold, 1], NV[k]))
      for (jkn in JkD) {
        dnew <- which(Ck == (dd + NV[k] - jkn))
        # probability of jkn ties when the distinct observations of step k-1
        # are combined with sample n_k; this updates the pmf in step k
        P2[dnew, 2] <- P2[dnew, 2] + P1[dold, 2] * tieF(jkn, N, MV[1], MV[2])
      }
    }
    P1 <- P2
  }
  colnames(P1) <- c("u_K", "P(U_K)")
  return(P1)
}
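As an independent cross-check of the recursion that StepKF carries out, here is a minimal Python sketch of the same idea: start from the first sample, and at each step combine the current number of distinct units with the next sample using the tie probability from tieF. This is a sketch under stated assumptions rather than a port of the authors' code: the names `tie_prob` and `distinct_count_pmf` are hypothetical, the population size N is passed explicitly instead of being read from the environment, every sample size is assumed to be at most N, and the sample sizes and N below are illustrative only.

```python
from math import comb


def tie_prob(j, N, n, m):
    """Same expression as tieF: P(j ties) when sets of n and m distinct units
    from a population of size N are combined."""
    if j > min(n, m) or m - j > N - n:
        return 0.0
    return comb(N, j) * comb(N - j, n - j) * comb(N - n, m - j) / (
        comb(N, n) * comb(N, m)
    )


def distinct_count_pmf(sizes, N):
    """pmf of the number of distinct units after combining SRSs of the given sizes."""
    pmf = {sizes[0]: 1.0}  # a single SRS has no ties
    for n_k in sizes[1:]:
        updated = {}
        for u, p in pmf.items():
            # j ties between the u distinct units so far and the new sample
            for j in range(min(u, n_k) + 1):
                q = tie_prob(j, N, u, n_k)
                if q > 0:
                    key = u + n_k - j
                    updated[key] = updated.get(key, 0.0) + p * q
        pmf = updated
    return dict(sorted(pmf.items()))


# Illustrative values only (hypothetical, not taken from the paper).
pmf = distinct_count_pmf([2, 3, 3], N=6)
for u, p in pmf.items():
    print(f"u={u}: P(u_K=u)={p:.6f}")
print("total:", sum(pmf.values()))
```

Because the dictionary is keyed by the number of distinct units, this version does not need the sample sizes to be sorted, and values that cannot occur (more distinct units than N) never enter the table.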

About this article

Cite this article

Ozturk, O., Balakrishnan, N. & Kravchuk, O. Order Statistics Based on a Combined Simple Random Sample from a Finite Population and Applications to Inference. Sankhya A 85, 77–101 (2023). https://doi.org/10.1007/s13171-020-00228-x

Received: 04 March 2019
Accepted: 04 November 2020
Published: 12 January 2021
Issue Date: February 2023
DOI: https://doi.org/10.1007/s13171-020-00228-x
