Abstract
Single-cell ribonucleic acid (RNA) sequencing profiles the transcriptome of each cell individually and helps to identify rare cell populations. Traditional analysis tools make it difficult to interpret transcriptomic profiles at single-cell resolution, and machine learning methods play an important role in overcoming this limitation. In this paper, we analyze single-cell RNA-seq data by applying linear dimensionality reduction, identifying highly variable features, clustering the cells, applying nonlinear dimensionality reduction, and identifying marker genes. Analyses of this kind are needed to characterize the transcriptomic profiles of individual cells and the heterogeneity among them. Our study is intended to help researchers doing fundamental research on single-cell RNA sequencing data in bioinformatics and computational biology.
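The steps named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the count matrix and gene names are invented toy values, the order of the steps follows common practice rather than anything stated here, variance ranking stands in for a proper highly-variable-feature method, k-means with k = 2 stands in for whatever clustering the paper uses, and nonlinear dimensionality reduction is omitted because it adds nothing at this scale.

```python
import math

# Toy count matrix (rows = cells, columns = genes). All values are made up.
GENES = ["geneA", "geneB", "geneC", "geneD", "geneE"]
COUNTS = [
    [90, 5, 20, 3, 10],
    [80, 8, 22, 2, 11],
    [85, 6, 19, 4, 9],
    [4, 70, 21, 3, 10],
    [6, 75, 18, 2, 12],
    [5, 82, 20, 4, 10],
]


def log_normalize(counts, scale=10_000):
    """Scale each cell to `scale` total counts, then apply log(1 + x)."""
    return [[math.log1p(c / sum(cell) * scale) for c in cell] for cell in counts]


def top_variable_genes(matrix, n_top):
    """Indices of the n_top genes with the highest variance across cells."""
    n_cells = len(matrix)
    variances = []
    for g in range(len(matrix[0])):
        col = [row[g] for row in matrix]
        mean = sum(col) / n_cells
        variances.append(sum((v - mean) ** 2 for v in col) / (n_cells - 1))
    ranked = sorted(range(len(variances)), key=lambda g: variances[g], reverse=True)
    return sorted(ranked[:n_top])


def first_pc_scores(matrix, n_iter=200):
    """Linear reduction: project cells onto the first principal component,
    found by power iteration on the centered data."""
    n_cells, n_genes = len(matrix), len(matrix[0])
    means = [sum(row[g] for row in matrix) / n_cells for g in range(n_genes)]
    centered = [[row[g] - means[g] for g in range(n_genes)] for row in matrix]
    vec = [1.0] + [0.0] * (n_genes - 1)
    for _ in range(n_iter):
        scores = [sum(c * v for c, v in zip(row, vec)) for row in centered]
        new = [sum(centered[i][g] * scores[i] for i in range(n_cells))
               for g in range(n_genes)]
        norm = math.sqrt(sum(x * x for x in new))
        if norm == 0:  # no variance left to explain
            break
        vec = [x / norm for x in new]
    return [sum(c * v for c, v in zip(row, vec)) for row in centered]


def two_means_1d(values, n_iter=20):
    """Cluster scalar values into two groups with k-means (k = 2)."""
    centers = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(n_iter):
        labels = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
                  for v in values]
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:
                centers[k] = sum(members) / len(members)
    return labels


def marker_gene(matrix, labels, cluster, gene_names):
    """Gene whose mean expression is most elevated in `cluster` versus the rest."""
    inside = [row for row, lab in zip(matrix, labels) if lab == cluster]
    outside = [row for row, lab in zip(matrix, labels) if lab != cluster]
    diffs = [
        sum(r[g] for r in inside) / len(inside)
        - sum(r[g] for r in outside) / len(outside)
        for g in range(len(gene_names))
    ]
    return gene_names[diffs.index(max(diffs))]


normalized = log_normalize(COUNTS)
hvg = top_variable_genes(normalized, n_top=2)
reduced = [[row[g] for g in hvg] for row in normalized]
labels = two_means_1d(first_pc_scores(reduced))
markers = {k: marker_gene(normalized, labels, k, GENES) for k in sorted(set(labels))}

print("highly variable genes:", [GENES[g] for g in hvg])
print("cluster labels:", labels)
print("marker genes:", markers)
```

On real data the same stages would normally be run with an established toolkit, which also provides graph-based clustering and nonlinear embeddings for visualization; the sketch only shows how the stages feed into one another.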
Data availability
Enquiries about data availability should be directed to the authors.
Funding
The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.
Ethics declarations
Conflict of interest
All authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
About this article
Cite this article
Rajesh, M., Martha, S. Analysis of transcriptome of single-cell RNA sequencing data using machine learning. Soft Comput 27, 9131–9141 (2023). https://doi.org/10.1007/s00500-023-08432-1
DOI: https://doi.org/10.1007/s00500-023-08432-1