Abstract
To analyze singly-truncated bivariate economic data, we establish a class of singly-truncated bivariate normal distributions by stochastically representing the original bivariate normal random vector as a mixture of the singly-truncated part and its complementary components. Aided by this stochastic representation, we construct two unified and simple algorithms, the expectation–maximization (EM) algorithm and the minorization–maximization (MM) algorithm, to calculate the maximum likelihood estimates of the means and covariance matrix for the model of interest. In addition, we develop a data augmentation (DA) algorithm for posterior sampling in Bayesian analysis. Simulation results and two real data applications in economics, corroborated by comparisons with existing methods, demonstrate the effectiveness and stability of the proposed methodologies.
Funding
Y LIU’s research was fully supported by grants (23YJC910004 and 22YJAZH038) from the Humanities and Social Science Foundation of the Ministry of Education of China and a grant (12171483) from the National Natural Science Foundation of China. GL TIAN’s research was fully supported by a grant (12171225) from the National Natural Science Foundation of China. C ZHANG’s research was fully supported by a grant (11801380) from the National Natural Science Foundation of China. H QIN’s research was fully supported by a grant (12371261) from the National Natural Science Foundation of China.
Ethics declarations
Conflict of interest
This work does not have any conflicts of interest.
Appendices
Appendix A: Proof of Theorem 2.1
To obtain the conditional distribution of \(\textbf{x}|\textbf{y}\), we first derive the conditional distribution of \((\textbf{x},\textbf{u})|\textbf{y}\). Given both \(\textbf{y}={\varvec{y}}\) and \(U_4=1\), i.e., \(\textbf{u}=(0,0,0,1)\), the complete-data random vector \(\textbf{x}\) coincides with the observed-data random vector \(\textbf{y}\), so the conditional distribution of \(\textbf{x}|(\textbf{y}={\varvec{y}},\textbf{u}=(0,0,0,1))\) degenerates to the point \({\varvec{y}}\). Therefore,
Note that when \(U_4=0\), \(\textbf{u}\) can be \((1,0,0,0)\), \((0,1,0,0)\) or \((0,0,1,0)\). Thus, given \(\textbf{y}={\varvec{y}}\) and \(U_4=0\), i.e., \(\textbf{u}\ne (0,0,0,1)\), the conditional distribution of \(\textbf{x}|(\textbf{y}={\varvec{y}},\textbf{u}\ne (0,0,0,1))\) is that of the unobserved random vector \(\textbf{y}_1\), \(\textbf{y}_2\) or \(\textbf{y}_3\), depending on the value of \(\textbf{u}\). Therefore,
where \(f_1(\cdot )\), \(f_2(\cdot )\) and \(f_3(\cdot )\) denote the density functions of the complementary singly-truncated bivariate normal vectors \(\textbf{y}_1\), \(\textbf{y}_2\) and \(\textbf{y}_3\), respectively.
By combining Eq. (A.1) with Eq. (A.2), the conditional distribution of \(\textbf{x}|\textbf{y}\) is determined by the following density function:
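The mixture structure behind this proof can be illustrated by simulation. The sketch below is only an illustration under stated assumptions: the truncation point `a`, the convention that the retained region is \(\{x_1\ge a_1, x_2\ge a_2\}\), and all parameter values are hypothetical choices, not taken from the paper. It draws from an untruncated bivariate normal, splits the draws into the four regions indexed by the indicator pair, and treats the draws in region (1, 1) as a singly-truncated sample; the other three groups play the role of the complementary components.

```python
import math
import random

def draw_bvn(rng, mu, sigma1, sigma2, rho):
    """One draw from a bivariate normal with means mu, standard
    deviations sigma1 and sigma2, and correlation rho."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    x1 = mu[0] + sigma1 * z1
    x2 = mu[1] + sigma2 * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return x1, x2

def split_by_region(draws, a):
    """Group draws by the indicator pair (1[x1 >= a1], 1[x2 >= a2]).
    Assumption: region (1, 1) is the retained (truncated) part."""
    regions = {(0, 0): [], (0, 1): [], (1, 0): [], (1, 1): []}
    for x1, x2 in draws:
        regions[(int(x1 >= a[0]), int(x2 >= a[1]))].append((x1, x2))
    return regions

rng = random.Random(2024)            # fixed seed for reproducibility
mu, a = (1.0, 0.5), (0.0, 0.0)       # hypothetical means and truncation point
draws = [draw_bvn(rng, mu, 1.0, 2.0, 0.6) for _ in range(20000)]
regions = split_by_region(draws, a)

# Empirical mixing proportions, Monte Carlo estimates of the region probabilities
props = {key: len(pts) / len(draws) for key, pts in regions.items()}
truncated_sample = regions[(1, 1)]   # plays the role of the observed data
print(props)
```

Pooling the four groups recovers the original untruncated sample, which is the sense in which the bivariate normal vector is a mixture of the truncated part and its complements.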
Appendix B: Details for Deriving \(Q({\varvec{\mu }},{\varvec{\Sigma }}|{\varvec{\mu }}^{(t)},{\varvec{\Sigma }}^{(t)})\) specified by Eq. (3.9) in the proposed MM algorithm
The difficulty in deriving explicit expressions for the MLEs of \(({\varvec{\mu }},{\varvec{\Sigma }})\) from the observed-data log-likelihood function \(\ell ({\varvec{\mu }},{\varvec{\Sigma }}|Y_\textrm{obs})\) defined by (3.1) lies in the integral term \(-n\log [p_{11}({\varvec{\mu }},{\varvec{\Sigma }})]\). Thus, the key to finding a surrogate Q-function for \(\ell ({\varvec{\mu }},{\varvec{\Sigma }}|Y_\textrm{obs})\) is to bound this term appropriately from below.
Note that \(p_{00}({\varvec{\mu }},{\varvec{\Sigma }})+p_{01}({\varvec{\mu }},{\varvec{\Sigma }})+p_{10}({\varvec{\mu }},{\varvec{\Sigma }})+p_{11}({\varvec{\mu }},{\varvec{\Sigma }})=1\) and that \(\log (\cdot )\) is a concave function. Thus,
where \(({\varvec{\mu }}^{(t)},{\varvec{\Sigma }}^{(t)})\) is the t-th approximation of the MLEs \(({\varvec{{\hat{\mu }}}}, {\varvec{{\hat{\Sigma }}}})\). Based on (B.1), we have
Then,
where \(c_1^{(t)}\) is a constant that does not depend on \(({\varvec{\mu }},{\varvec{\Sigma }})\).
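A numerical check can make this first minorization step concrete. The sketch below assumes one lower bound that is consistent with the description above, namely that applying the concavity of the log function to \(p_{11}+(1-p_{11})=1\) gives \(-\log p_{11}\ge \frac{1-p_{11}^{(t)}}{p_{11}^{(t)}}\log (1-p_{11})+\text{const}\); the paper's own displayed inequality may be written differently. The function names and the value of `p11_t` are hypothetical.

```python
import math

def neg_log_p11(p11):
    """The hard term of the log-likelihood, per observation."""
    return -math.log(p11)

def minorizer(p11, p11_t):
    """Assumed lower bound on -log(p11), built from Jensen's inequality
    at the current iterate p11_t; it involves p11 only through log(1 - p11)."""
    w = (1.0 - p11_t) / p11_t
    const = -math.log(p11_t) - w * math.log(1.0 - p11_t)
    return w * math.log(1.0 - p11) + const

p11_t = 0.35                               # hypothetical current iterate
grid = [k / 100 for k in range(1, 100)]    # candidate values of p11
gaps = [neg_log_p11(p) - minorizer(p, p11_t) for p in grid]

# The gap should be nonnegative everywhere and vanish at p11 = p11_t
print(min(gaps), neg_log_p11(p11_t) - minorizer(p11_t, p11_t))
```

The two defining properties of a minorizing function are visible here: the bound never exceeds the target, and the two touch at the current iterate.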
However, from \(Q_1({\varvec{\mu }},{\varvec{\Sigma }}|{\varvec{\mu }}^{(t)},{\varvec{\Sigma }}^{(t)})\), we still cannot obtain the MLEs in closed form because of the complex integral form of \(\log [ p_{00}({\varvec{\mu }},{\varvec{\Sigma }})+p_{01}({\varvec{\mu }},{\varvec{\Sigma }})+p_{10}({\varvec{\mu }},{\varvec{\Sigma }}) ]\). We therefore construct a surrogate function for this term, again using the concavity of the log function. Since
therefore,
where \(c_2^{(t)}\) is a constant that does not depend on \(({\varvec{\mu }},{\varvec{\Sigma }})\).
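The second use of concavity can be written out as a worked inequality. What follows is a sketch consistent with the surrounding description rather than a reproduction of the paper's display; the weights \(w_{jk}^{(t)}\) are notation introduced here for illustration, \(p_{jk}\) abbreviates \(p_{jk}({\varvec{\mu }},{\varvec{\Sigma }})\), and \(p_{jk}^{(t)}\) abbreviates \(p_{jk}({\varvec{\mu }}^{(t)},{\varvec{\Sigma }}^{(t)})\). All sums run over \((j,k)\in \{(0,0),(0,1),(1,0)\}\).

```latex
\log\bigl[p_{00}+p_{01}+p_{10}\bigr]
  = \log\Bigl[\sum_{(j,k)} w_{jk}^{(t)}\,\frac{p_{jk}}{w_{jk}^{(t)}}\Bigr]
  \ge \sum_{(j,k)} w_{jk}^{(t)} \log p_{jk}
      - \sum_{(j,k)} w_{jk}^{(t)} \log w_{jk}^{(t)},
\qquad
w_{jk}^{(t)} = \frac{p_{jk}^{(t)}}{p_{00}^{(t)}+p_{01}^{(t)}+p_{10}^{(t)}} .
```

The last sum does not depend on \(({\varvec{\mu }},{\varvec{\Sigma }})\), and equality holds at \(({\varvec{\mu }},{\varvec{\Sigma }})=({\varvec{\mu }}^{(t)},{\varvec{\Sigma }}^{(t)})\) because every ratio \(p_{jk}/w_{jk}^{(t)}\) then takes the same value.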
Unfortunately, the Q-function given by Eq. (B.3) is still not suitable because of the complex integrals contained in \(\log [p_{00}({\varvec{\mu }},{\varvec{\Sigma }})]\), \(\log [p_{01}({\varvec{\mu }},{\varvec{\Sigma }})]\) and \(\log [p_{10}({\varvec{\mu }},{\varvec{\Sigma }})]\). To solve this problem, we use the following integral version of Jensen’s inequality:
where \({\varvec{z}}\) is a real vector in the domain \({{\mathbb {Z}}}\), \(f(\cdot )\) is a positive multivariate real function defined on \({{\mathbb {Z}}}\) and \(g(\cdot )\) is a multivariate pdf defined on \({{\mathbb {Z}}}\). By employing Eq. (B.4), we have
where \(c_{31}^{(t)}\) is a constant free of \(({\varvec{\mu }},{\varvec{\Sigma }})\). Similarly,
where \(c_{32}^{(t)}\) and \(c_{33}^{(t)}\) are normalizing constants. By combining Eqs. (B.5)–(B.7), the result shown in Eq. (3.9) can be obtained immediately.
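The integral form of Jensen's inequality used in this appendix can be checked numerically in a simplified setting. The sketch below is a univariate analogue under stated assumptions: the region is taken to be \((-\infty ,a)\), the variance is fixed at one, and the pdf \(g\) is chosen as the normal density at the current iterate `mu_t` renormalized to that region. These choices, and all function names, are illustrative and are not taken from the paper. The bound is evaluated by a midpoint rule.

```python
import math

def log_phi(z, mu):
    """Log density of N(mu, 1) at z."""
    return -0.5 * (z - mu) ** 2 - 0.5 * math.log(2.0 * math.pi)

def Phi(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def log_p(mu, a):
    """log Pr(Z < a) for Z ~ N(mu, 1): the log of an integral."""
    return math.log(Phi(a - mu))

def jensen_lower_bound(mu, mu_t, a, lo=-12.0, steps=20000):
    """Midpoint-rule value of the integral over (lo, a) of
    g(z) * log(phi(z; mu) / g(z)), with g the N(mu_t, 1) density
    renormalized to (-inf, a). The mass below lo is negligible here."""
    log_p_t = math.log(Phi(a - mu_t))
    h = (a - lo) / steps
    total = 0.0
    for k in range(steps):
        z = lo + (k + 0.5) * h
        log_g = log_phi(z, mu_t) - log_p_t
        total += math.exp(log_g) * (log_phi(z, mu) - log_g) * h
    return total

a, mu_t = 0.0, 0.5                         # hypothetical cut point and iterate
gaps = {mu: log_p(mu, a) - jensen_lower_bound(mu, mu_t, a)
        for mu in (-1.0, 0.0, 0.5, 1.5)}
print(gaps)
```

The bound replaces the log of an integral by an integral of a log density, which is quadratic in `mu`; this is the property that makes the resulting surrogate easy to maximize.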
Appendix C: Numerical Simulation Results
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liu, Y., Tian, GL., Zhang, C. et al. A General Inferential Framework for Singly-Truncated Bivariate Normal Models with Applications in Economics. Comput Econ (2024). https://doi.org/10.1007/s10614-023-10525-w
DOI: https://doi.org/10.1007/s10614-023-10525-w