Abstract
In sample surveys, the model calibration approach improves on the usual calibration approach by generalizing it to obtain model-assisted estimators from more complex models based on complete auxiliary information. In many surveys, the study and auxiliary variables vary across locations, and observations tend to be more similar for nearby units than for units located farther apart. In such situations, a single global model cannot adequately explain the relationships between the variables. This phenomenon, known as spatial non-stationarity, is addressed by the geographically weighted regression (GWR) model, which can capture spatially varying relationships between variables. In the present study, GWR-based model calibration estimators of the population total of the study variable were developed for geo-referenced complex survey designs in which complete auxiliary information, along with the spatial locations, is available at the population level. The asymptotic properties of the developed estimators were evaluated under a set of assumptions, and their variances and variance estimators were derived under the same assumptions. In a spatial simulation study, the developed estimators were found to be more efficient than the existing estimators. Supplementary materials accompanying this paper appear online.
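The local fit underlying GWR can be illustrated with a minimal sketch. At each location, a coefficient vector is obtained by weighted least squares, with kernel weights that decay with distance from that location. The Gaussian kernel, the fixed bandwidth, the optional multiplication by design weights, and the function name `gwr_coefficients` are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth, design_weights=None):
    """Hypothetical helper: one local coefficient vector per location.

    Solves (X' W_i X) beta_i = X' W_i y at every location i, where W_i holds
    Gaussian kernel weights (an assumed kernel choice), optionally multiplied
    by survey design weights.
    """
    coords = np.asarray(coords, dtype=float)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    d = np.ones(n) if design_weights is None else np.asarray(design_weights, dtype=float)
    betas = np.empty((n, X.shape[1]))
    for i in range(n):
        # Distance from location i to every observed location
        dist = np.linalg.norm(coords - coords[i], axis=1)
        # Gaussian kernel: nearby units get weights close to 1
        w = np.exp(-0.5 * (dist / bandwidth) ** 2) * d
        XtW = X.T * w  # scales column j of X' by w[j]
        betas[i] = np.linalg.solve(XtW @ X, XtW @ y)
    return betas

# Toy data: ten units on a line with a spatially constant relationship
rng = np.random.default_rng(0)
coords = np.column_stack([np.arange(10.0), np.zeros(10)])
x = rng.uniform(0.0, 1.0, 10)
X = np.column_stack([np.ones(10), x])
y = 1.0 + 2.0 * x
betas = gwr_coefficients(coords, X, y, bandwidth=2.0)
```

When the relationship is spatially stationary, as in the toy data above, every local fit recovers the same global coefficients; spatial non-stationarity would show up as variation across the rows of `betas`.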
References
Ahmad T, Bhatia VK, Sud UC, Rai A, Sahoo PM (2013) Study to develop an alternative methodology for estimation of cotton production. Project Report, IASRI publication, New Delhi
Ahmad T, Sud UC, Rai A, Sahoo PM (2020) An alternative sampling methodology for estimation of cotton yield using double sampling approach. J Indian Soc Agric Stat 74(3):217–226
Basu D (1971) In: Godambe VP, Sprott DA (eds) Foundations of statistical inference: a symposium. Holt, Rinehart and Winston, Toronto
Biswas A, Rai A, Ahmad T, Sahoo PM (2017) Spatial estimation and rescaled spatial bootstrap approach for finite population. Commun Stat Theory Methods 46(1):373–388. https://doi.org/10.1080/03610926.2014.995820
Brunsdon C, Fotheringham AS, Charlton ME (1996) Geographically weighted regression: a method for exploring spatial non-stationarity. Geogr Anal 28(4):281–298. https://doi.org/10.1111/j.1538-4632.1996.tb00936.x
Cassel CM, Särndal CE, Wretman JH (1976) Some results on generalized difference estimation and generalized regression estimation for finite populations. Biometrika 63(3):615–620. https://doi.org/10.2307/2335742
Deville JC, Särndal CE (1992) Calibration estimators in survey sampling. J Am Stat Assoc 87(418):376–382. https://doi.org/10.2307/2290268
Fotheringham AS, Brunsdon C, Charlton M (1996) The geography of parameter space: an investigation of spatial non-stationarity. Int J Geogr Inf Sci 10:605–627. https://doi.org/10.1080/02693799608902100
Fotheringham AS, Brunsdon C, Charlton M (2002) Geographically weighted regression: the analysis of spatially varying relationships. John Wiley & Sons, UK
Horvitz DG, Thompson DJ (1952) A generalization of sampling without replacement from a finite universe. J Am Stat Assoc 47:663–685
Liu C, Wei C, Su Y (2018) Geographically weighted regression model-assisted estimation in survey sampling. J Nonparam Stat 30(4):906–925. https://doi.org/10.1080/10485252.2018.1499907
Pebesma EJ (2004) Multivariable geostatistics in S: the GSTAT package. Comput Geosci 30(7):683–691. https://doi.org/10.1016/j.cageo.2004.03.012
Särndal CE (1980) On \(\Pi \)-inverse weighting versus best linear weighting in probability sampling. Biometrika 67(3):639–650
Särndal CE, Swensson B, Wretman J (1992) Model assisted survey sampling. Springer-Verlag, New York
Wu C, Sitter RR (2001) A model calibration approach to using complete auxiliary information from survey data. J Am Stat Assoc 96:185–193. https://doi.org/10.1198/016214501750333054
Wu C, Thompson ME (2020) Sampling theory and practice. Springer, Cham
Acknowledgements
The authors would like to thank the anonymous referees for their constructive comments and suggestions, which led to a significant improvement of the manuscript.
Ethics declarations
Conflicts of interest
The authors declare no potential conflict of interest relevant to this article.
Data Availability Statement
Data sharing is not applicable.
Appendix
Proof of Theorem 1
Applying a Taylor series expansion to \(\varvec{x}_{i}^{T}\hat{\varvec{\beta }}^{\varvec{gwr}}\left( u_{i} \right) \) about \(\hat{\varvec{\beta }}^{\varvec{gwr}}\left( u_{i} \right) =\bar{\varvec{\beta } }^{\varvec{gwr}}\left( u_{i} \right) \), we get
where \({\varvec{\beta }\left( u_{i} \right) }^{*}\) lies between \(\hat{\varvec{\beta }}^{\varvec{gwr}}\left( u_{i} \right) \) and \(\bar{\varvec{\beta } }^{\varvec{gwr}}\left( u_{i} \right) \).
Using assumptions (ii) and (iii) and summing both sides of Eq. (A1) over the whole population, we get
Using assumptions (ii) and (iii), multiplying both sides of Eq. (A1) by the survey weights, and summing over the whole sample, we get
Using assumption (i), we get
Subtracting Eq. (A3) from Eq. (A2) and using Eq. (A4), we get
Now, since both \(\hat{B}_{N}=O_{p}(1)\) and \(\hat{B}_{N}^{*}=O_{p}(1)\), using the result of Eq. (A4) in Eq. (6), we can finally write
Since \(\hat{Y}_{HT}\) is a design-unbiased estimator of the population total Y, both \(\hat{Y}_{MC,1}\) and \(\hat{Y}_{MC,2}\) are asymptotically design-unbiased.
Now, if \(\hat{\varvec{\beta }}^{\varvec{gwr}}\left( u_{i} \right) \varvec{\rightarrow \beta }\left( u_{i} \right) \), where \(\varvec{\beta }\left( u_{i} \right) \) is the true superpopulation parameter, then
Similarly, \(E_{\xi }\left( \hat{B}_{N}^{*} \right) =1\), where the expectation is taken with respect to the superpopulation model \(\xi \).
Hence, using assumption (ii), we get
Similarly, \(E_{\xi }\left( \hat{Y}_{MC,2}-Y \right) =0\).
Hence, both estimators \(\hat{Y}_{MC,1}\) and \(\hat{Y}_{MC,2}\) are model-unbiased, and Theorem 1 is proved. \(\square \)
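As a point of reference for the quantities appearing in this proof, the following minimal sketch computes a model calibration estimator of the general form given by Wu and Sitter (2001): the Horvitz-Thompson estimator plus a regression-type adjustment on the fitted values. It is an illustration under stated assumptions and not necessarily the exact definition of \(\hat{Y}_{MC,1}\) or \(\hat{Y}_{MC,2}\) in this paper: the tuning constants \(q_{i}\) are set to 1, the fitted values are assumed to be available for every population unit (for example, from a GWR fit), and the function name `model_calibration_total` is hypothetical.

```python
import numpy as np

def model_calibration_total(y_s, mu_s, d_s, mu_pop_total):
    """Hypothetical helper: Y_HT + (sum_U mu - sum_s d*mu) * B_hat.

    y_s, mu_s, d_s : sample values of y, fitted values, and design weights
    mu_pop_total   : sum of fitted values over the whole population
    B_hat is the design-weighted slope of y on the fitted values
    (all q_i = 1, an assumption of this sketch).
    """
    y_s, mu_s, d_s = (np.asarray(a, dtype=float) for a in (y_s, mu_s, d_s))
    y_ht = np.sum(d_s * y_s)      # Horvitz-Thompson estimate of the total of y
    mu_ht = np.sum(d_s * mu_s)    # Horvitz-Thompson estimate of the total of mu
    n_hat = np.sum(d_s)           # estimated population size
    mu_bar, y_bar = mu_ht / n_hat, y_ht / n_hat
    b_hat = (np.sum(d_s * (mu_s - mu_bar) * (y_s - y_bar))
             / np.sum(d_s * (mu_s - mu_bar) ** 2))
    return y_ht + (mu_pop_total - mu_ht) * b_hat

# Toy population of N = 8 units in which y is exactly linear in the fitted values
mu_pop = np.array([2.0, 5.0, 3.0, 8.0, 6.0, 4.0, 7.0, 1.0])
y_pop = 3.0 + 2.0 * mu_pop
sample = np.array([0, 2, 5, 7])                   # sample of size n = 4
d_s = np.full(sample.size, mu_pop.size / sample.size)  # equal weights N / n

y_mc = model_calibration_total(y_pop[sample], mu_pop[sample], d_s, mu_pop.sum())
y_ht = np.sum(d_s * y_pop[sample])
```

In this toy population the study variable is an exact linear function of the fitted values and the weights sum to N, so `y_mc` reproduces the population total, whereas `y_ht` does not for this particular sample.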
Proof of Theorem 2
Applying a second-order Taylor series expansion to \(\varvec{x}_{i}^{T}\hat{\varvec{\beta }}^{\varvec{gwr}}\left( u_{i} \right) \) about \(\hat{\varvec{\beta }}^{\varvec{gwr}}\left( u_{i} \right) =\bar{\varvec{\beta } }^{\varvec{gwr}}\left( u_{i} \right) \), we get
where \({\varvec{\beta }\left( u_{i} \right) }^{*}\) lies between \(\hat{\varvec{\beta }}^{\varvec{gwr}}\left( u_{i} \right) \) and \(\bar{\varvec{\beta } }^{\varvec{gwr}}\left( u_{i} \right) \).
Using assumption (iv) in the above expression, we get
where \(\varvec{K}\left( \varvec{x}_{i}, \bar{\varvec{\beta } }^{\varvec{gwr}}\left( u_{i} \right) \right) =\left. \frac{\partial }{\partial \varvec{t}_{i}}\varvec{x}_{i}^{T}\varvec{t}_{i} \right| _{\varvec{t}_{i}=\bar{\varvec{\beta } }^{\varvec{gwr}}\left( u_{i} \right) }\).
Taking the difference of the above two equations, we get
By assumption (ii), we get
By assumption (i), we get
Thus, using the results of Eqs. (A9) and (A10) in Eq. (A8), we get
Following Wu and Sitter (2001), we can write \(\hat{B}_{N}= B_{N}+o_{p}(1)\) and \(\hat{B}_{N}^{*}= B_{N}^{*}+o_{p}(1)\).
Hence, the proposed estimators can be linearized as
where \(Z_{i}=y_{i}-\mu _{i}B_{N}\) and \(Z_{i}^{'}=y_{i}-\mu _{i}B_{N}^{*}\).
Thus, following Särndal et al. (1992), the asymptotic design variances of \(\hat{Y}_{MC,1}\) and \(\hat{Y}_{MC,2}\) are expressed as
Following Särndal et al. (1992), the variance estimators of the proposed model calibration estimators are expressed as
where \(z_{i}=y_{i}-\hat{\mu }_{i}\hat{B}_{N}\) and \(z_{i}^{'}=y_{i}-\hat{\mu }_{i}\hat{B}_{N}^{*}\).
Hence, Theorem 2 is proved. \(\square \)
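The role of the residuals \(z_{i}\) in variance estimation can be sketched numerically. The code below applies the general Horvitz-Thompson-type variance estimator of Särndal et al. (1992) to residuals of the form \(y_{i}-\hat{\mu }_{i}\hat{B}_{N}\) and checks it against the familiar closed form for simple random sampling without replacement. The function name `ht_variance_estimate`, the toy data, and the value used for \(\hat{B}_{N}\) are hypothetical, and the paper's own variance expressions may differ in detail.

```python
import numpy as np

def ht_variance_estimate(z, pi, pi_joint):
    """Hypothetical helper: sum over sample pairs (i, j) of
    ((pi_ij - pi_i * pi_j) / pi_ij) * (z_i / pi_i) * (z_j / pi_j).

    pi       : first-order inclusion probabilities of the sampled units
    pi_joint : second-order inclusion probabilities, with pi on the diagonal
    """
    z = np.asarray(z, dtype=float)
    pi = np.asarray(pi, dtype=float)
    pi_joint = np.asarray(pi_joint, dtype=float)
    z_expanded = z / pi
    delta = (pi_joint - np.outer(pi, pi)) / pi_joint
    return float(z_expanded @ delta @ z_expanded)

# Toy sample of n = 5 units from a population of N = 20 (made-up values)
N, n = 20, 5
y_s = np.array([12.0, 15.0, 9.0, 20.0, 14.0])
mu_hat_s = np.array([11.0, 16.0, 10.0, 18.0, 15.0])
b_hat = 1.0                      # stands in for the estimated coefficient B_N
z = y_s - mu_hat_s * b_hat       # linearized residuals z_i

# Inclusion probabilities under simple random sampling without replacement
pi = np.full(n, n / N)
pi_joint = np.full((n, n), n * (n - 1) / (N * (N - 1)))
np.fill_diagonal(pi_joint, n / N)

v_general = ht_variance_estimate(z, pi, pi_joint)
# Closed form under the same design: N^2 * (1 - n/N) * s_z^2 / n
v_srs = N**2 * (1 - n / N) * z.var(ddof=1) / n
```

Under simple random sampling without replacement the general estimator and the closed form coincide, which gives a quick check of an implementation before it is applied to an unequal-probability design.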
About this article
Cite this article
Saha, B., Biswas, A., Ahmad, T. et al. Geographically Weighted Regression-Based Model Calibration Estimation of Finite Population Total Under Geo-referenced Complex Surveys. JABES (2023). https://doi.org/10.1007/s13253-023-00576-9
DOI: https://doi.org/10.1007/s13253-023-00576-9