Non-parametric comparison and classification of two large-scale populations

Ghoreishi, S. K.; Wu, Jingjing; Ghoreishi, Ghazal S.

doi:10.1007/s42952-022-00198-w

Research Article
Published: 21 November 2022

Volume 52, pages 234–247, (2023)

Journal of the Korean Statistical Society

S. K. Ghoreishi¹,
Jingjing Wu² &
Ghazal S. Ghoreishi³

Abstract

In this paper, we investigate a non-parametric approach to comparing two groups in microarray data. The approach uses a threshold penalized-distance likelihood function, which combines a penalty with a suitable threshold distance and is applicable when the sample size is small or when the data are not normally distributed. We also use this function to classify new data, basing the classification only on the objects identified as differing between the two groups rather than on all objects. A real data application illustrates our methods.
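
As a rough illustration of the select-then-classify idea described in the abstract, the following Python sketch screens features with a non-parametric statistic and then classifies a new observation using only the features that were flagged. It does not implement the paper's threshold penalized-distance likelihood function, whose form is not given in this preview. As a stand-in, it uses a Wilcoxon rank-sum z-score with a fixed cutoff for the comparison step and a nearest-centroid rule for the classification step. All function names, the cutoff of 3.0, and the synthetic data are assumptions made for illustration only.

```python
import math
import random
from statistics import mean


def rank_sum_z(a, b):
    """Wilcoxon rank-sum z-score for group a versus group b.

    Uses midranks for ties and the normal approximation; the variance
    is not tie-corrected, which is adequate for continuous data.
    """
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    n = len(pooled)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # average 1-based rank of the tie block
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1
    w = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    na, nb = len(a), len(b)
    mu = na * (na + nb + 1) / 2
    sigma = math.sqrt(na * nb * (na + nb + 1) / 12)
    return (w - mu) / sigma


def select_features(g1, g2, threshold=3.0):
    """Indices of features whose |z| exceeds the (assumed) threshold."""
    p = len(g1[0])
    return [j for j in range(p)
            if abs(rank_sum_z([r[j] for r in g1],
                              [r[j] for r in g2])) > threshold]


def classify(x, g1, g2, selected):
    """Nearest-centroid label (1 or 2) using only the selected features."""
    if not selected:
        raise ValueError("no features selected; lower the threshold")

    def sq_dist(group):
        return sum((x[j] - mean([r[j] for r in group])) ** 2
                   for j in selected)

    return 1 if sq_dist(g1) <= sq_dist(g2) else 2


def simulate_groups(n=15, p=200, n_signal=5, shift=4.0, seed=0):
    """Synthetic stand-in data: only the first n_signal features differ."""
    rng = random.Random(seed)
    g1 = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    g2 = [[rng.gauss(shift if j < n_signal else 0, 1) for j in range(p)]
          for _ in range(n)]
    return g1, g2


if __name__ == "__main__":
    g1, g2 = simulate_groups()
    selected = select_features(g1, g2)
    new_obs = [4.0] * 5 + [0.0] * 195
    print("selected features:", selected)
    print("predicted group:", classify(new_obs, g1, g2, selected))
```

Restricting the distance computation to the selected features mirrors the abstract's point that classification is based on the objects identified as different, not on all objects; with many uninformative features, including them would mostly add noise to the distances.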

Data availability

We used Efron’s microarray prostate data (the singh2002 data set, available in the “sda” library in R).

Funding

This work received no funding.

Author information

Authors and Affiliations

Department of Statistics, University of Qom, Qom, Iran
S. K. Ghoreishi
Department of Mathematics and Statistics, University of Calgary, Calgary, Canada
Jingjing Wu
Faculty of Electrical Engineering, Shahid Beheshti University, Tehran, Iran
Ghazal S. Ghoreishi

Corresponding author

Correspondence to S. K. Ghoreishi.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Ethical conduct

In this work, we applied our methodology to a well-known dataset, so there are no ethical conduct issues to declare.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ghoreishi, S.K., Wu, J. & Ghoreishi, G.S. Non-parametric comparison and classification of two large-scale populations. J. Korean Stat. Soc. 52, 234–247 (2023). https://doi.org/10.1007/s42952-022-00198-w

Received: 11 April 2022
Accepted: 28 October 2022
Published: 21 November 2022
Issue Date: March 2023
DOI: https://doi.org/10.1007/s42952-022-00198-w
