Abstract
We present a new model for analyzing social networks. Specifically, we propose a joint random-effects linear model for analyzing longitudinal inflated [0,1]-support and inflated count response variables, allowing for non-ignorable missing values in the inflated [0,1]-support response variable. Inference is based on the posterior distribution of the unknowns given all available information, and a Monte Carlo EM algorithm is used to estimate the posterior distribution of the parameters. The sensitivity of the results to the assumptions is investigated by perturbing the missingness mechanism from missing at random to not missing at random, and the influence of small perturbations of these elements on the posterior displacement is also studied. To show the applicability of the proposed model, we present results from analyzing the Enron email dataset and a student activity and profile dataset. We also provide a new statistical monitoring approach for longitudinal social network datasets that takes into account attributes which are important in various applications. For this purpose, a complete definition of the responsiveness rate in social networks as a [0,1]-support variable is presented.
Notes
\(P_{Y_{(i,j)t}}^{INFBE_{k,l}}\), a probability measure on the measurable space \(({\Omega }, \mathfrak {B})\), is absolutely continuous with respect to the \(\sigma\)-finite measure \(\mu = \mu_L + \delta_k + \delta_l\), where \(\mu_L\) indicates the Lebesgue measure and \(\delta_c\) is a point mass at \(c\).
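As an illustration of what this absolute continuity implies, a density with respect to \(\mu = \mu_L + \delta_k + \delta_l\) can be sketched as below. This is a generic form for a beta distribution inflated at two points, not necessarily the parametrization used in the paper: the mixing probabilities \(\pi_k, \pi_l\), the shape parameters \(a, b\), and the assumption that \(k\) and \(l\) are distinct points of \([0,1]\) (for example 0 and 1) are all introduced here for illustration only.

\[
f(y) = \frac{dP_{Y_{(i,j)t}}^{INFBE_{k,l}}}{d\mu}(y) =
\begin{cases}
\pi_k, & y = k,\\
\pi_l, & y = l,\\
(1 - \pi_k - \pi_l)\,\dfrac{\Gamma(a+b)}{\Gamma(a)\,\Gamma(b)}\, y^{a-1}(1-y)^{b-1}, & y \in (0,1)\setminus\{k,l\},
\end{cases}
\qquad \pi_k, \pi_l \ge 0,\ \ \pi_k + \pi_l < 1,\ \ a, b > 0.
\]

Under these assumptions the point masses \(\delta_k\) and \(\delta_l\) carry the inflation probabilities, while the Lebesgue part \(\mu_L\) carries the continuous beta component, so that \(\int f \, d\mu = \pi_k + \pi_l + (1 - \pi_k - \pi_l) = 1\).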
Funding
Open access funding provided by Shahid Beheshti University.
Ethics declarations
Conflict of Interest
The authors declare that there is no conflict of interest.
About this article
Cite this article
Samani, E.B., Tabrizi, E. Joint Linear Modeling of Mixed Data and Its Application to Email Analysis. Sankhya B 85, 175–209 (2023). https://doi.org/10.1007/s13571-023-00304-w
DOI: https://doi.org/10.1007/s13571-023-00304-w
Keywords
- Joint linear model
- Inflated beta distributions
- Inflated power series distribution
- Sensitivity analysis
- Responsiveness rate
- Social network