Joint Linear Modeling of Mixed Data and Its Application to Email Analysis

Samani, Ehsan Bahrami; Tabrizi, Elham

doi:10.1007/s13571-023-00304-w

Published: 10 March 2023

Sankhya B, Volume 85, pages 175–209 (2023)

Abstract

We present a new model for social network data that allows practitioners in the field to analyze such networks. Specifically, we propose a joint random-effects linear model for longitudinal inflated [0,1]-support and inflated count response variables, allowing for non-ignorable missing values in the inflated [0,1]-support response. Inference is based on the posterior distribution of the unknowns given all available information, and a Monte Carlo EM algorithm is used to estimate the posterior distribution of the parameters. We also investigate the sensitivity of the results to the assumptions, in particular to perturbation of the missingness mechanism from missing at random to not missing at random, and we study the influence of small perturbations of these elements on the posterior displacement. To show the applicability of the proposed model, we present results from analyzing the Enron email dataset and a student activity and profile dataset. We also provide a new statistical monitoring approach for longitudinal social network datasets that takes into account attributes that are important in various applications. For this purpose, a complete definition of the responsiveness rate in social networks as a [0,1]-support variable is presented.
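The kind of data the abstract describes can be illustrated with a small simulation. The sketch below is not the authors' model: it assumes a zero-and-one inflated beta response, a zero-inflated Poisson count, logit and log links, and a single shared normal random effect per subject, with all parameter values and function names chosen purely for illustration. It omits the missing-data mechanism and the Monte Carlo EM estimation entirely.

```python
import math
import random


def sample_inflated_beta(p0, p1, mean, precision, rng):
    """Draw one value from a zero-and-one inflated beta distribution.

    p0 and p1 are the point masses at 0 and 1 (assumed names); the remaining
    mass 1 - p0 - p1 follows Beta(mean * precision, (1 - mean) * precision).
    """
    u = rng.random()
    if u < p0:
        return 0.0
    if u < p0 + p1:
        return 1.0
    return rng.betavariate(mean * precision, (1.0 - mean) * precision)


def sample_zero_inflated_poisson(pi0, rate, rng):
    """Draw one count: a structural zero with probability pi0, else Poisson(rate)."""
    if rng.random() < pi0:
        return 0
    # Knuth's multiplication method; adequate for small rates.
    threshold, k, prod = math.exp(-rate), 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_subject(n_times, rng, sigma_b=0.5):
    """Simulate one subject's longitudinal pair of responses.

    A single normal random effect b enters both linear predictors, which
    induces dependence between the two responses (illustrative values only).
    """
    b = rng.gauss(0.0, sigma_b)
    rows = []
    for t in range(n_times):
        mean = 1.0 / (1.0 + math.exp(-(0.2 + 0.1 * t + b)))  # logit link
        rate = math.exp(0.5 + 0.05 * t + b)                   # log link
        y_rate = sample_inflated_beta(0.1, 0.05, mean, 10.0, rng)
        y_count = sample_zero_inflated_poisson(0.2, rate, rng)
        rows.append((t, y_rate, y_count))
    return rows


if __name__ == "__main__":
    for row in simulate_subject(5, random.Random(1)):
        print(row)
```

Sharing one random effect across both predictors is the simplest way to make the two responses dependent; a fitted joint model could instead use correlated response-specific effects.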

Notes

\(P_{Y_{(i,j)t}}^{INFBE_{k,l}}\), a probability measure on the measurable space \((\Omega, \mathfrak{B})\), is absolutely continuous with respect to the \(\sigma\)-finite measure \(\mu = \mu_L + \delta_k + \delta_l\), where \(\mu_L\) denotes the Lebesgue measure and \(\delta_c\) is a point mass at \(c\).
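
As an illustration of what this absolute continuity provides, a density with respect to \(\mu\) can be written in mixture form. The symbols \(p_k\) and \(p_l\) (point masses at \(k \neq l\)), \(f\) (a Lebesgue density on the continuous part of the support, for example a beta density), and \(g\) are notation assumed here and are not taken from the article:

\[
g(y) \;=\; p_k\,\mathbb{1}\{y = k\} \;+\; p_l\,\mathbb{1}\{y = l\} \;+\; (1 - p_k - p_l)\, f(y)\,\mathbb{1}\{y \notin \{k, l\}\},
\]

so that, for any measurable set \(A\),

\[
P(Y \in A) \;=\; \int_A g \, d\mu \;=\; p_k\,\mathbb{1}_A(k) \;+\; p_l\,\mathbb{1}_A(l) \;+\; (1 - p_k - p_l) \int_A f(y)\, dy .
\]

The two point-mass terms come from integrating against \(\delta_k\) and \(\delta_l\), and the last term from integrating against \(\mu_L\), under which the single points \(k\) and \(l\) have measure zero.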

Funding

Open access funding provided by Shahid Beheshti University.

Author information

Authors and Affiliations

Department of Statistics, Faculty of Mathematical Science, Shahid Beheshti University, Tehran, Iran
Ehsan Bahrami Samani & Elham Tabrizi

Corresponding author

Correspondence to Ehsan Bahrami Samani.

Ethics declarations

Conflict of Interest

The authors declare that there is no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Electronic supplementary material accompanies the online version of this article (RAR archive, 324 KB).

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Samani, E.B., Tabrizi, E. Joint Linear Modeling of Mixed Data and Its Application to Email Analysis. Sankhya B 85, 175–209 (2023). https://doi.org/10.1007/s13571-023-00304-w

Received: 06 May 2022
Accepted: 18 January 2023
Published: 10 March 2023
Issue Date: May 2023
DOI: https://doi.org/10.1007/s13571-023-00304-w

PACS Nos

Primary 62J02; Secondary 62J05
