Analysis of transcriptome of single-cell RNA sequencing data using machine learning

Soft Computing

Abstract

Single-cell ribonucleic acid (RNA) sequencing analyzes the transcriptome of each cell individually and helps to identify rare cell populations. With traditional applications, it is difficult to understand and analyze the transcriptomic profiles of cells at the single-cell level, and machine learning techniques play a major role in overcoming this difficulty. In this paper, we analyzed single-cell RNA-seq data by performing linear dimensionality reduction, identifying highly variable features, clustering the cells, performing nonlinear dimensionality reduction, and identifying gene markers. This type of analysis is needed to address the challenges of characterizing the transcriptomic profiles of cells and their heterogeneity. Our study is intended to help researchers doing fundamental research in bioinformatics and computational biology on single-cell RNA sequencing data.
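
The analysis steps named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the toy count matrix, all function names, the use of variance to rank features, the power-iteration estimate of the first principal component, and the two-cluster 1D k-means are assumptions chosen for brevity. The nonlinear dimensionality reduction step is omitted, and the steps are ordered as normalize, select variable genes, reduce, cluster, find markers.

```python
import math
import random

def log_normalize(counts, scale=10_000):
    """Scale each cell (row) to `scale` total counts, then apply log1p."""
    return [[math.log1p(c / sum(cell) * scale) for c in cell] for cell in counts]

def top_variable_genes(matrix, n_top):
    """Indices of the n_top genes (columns) with the highest sample variance."""
    n_cells = len(matrix)
    variances = []
    for g in range(len(matrix[0])):
        col = [row[g] for row in matrix]
        mean = sum(col) / n_cells
        variances.append(sum((x - mean) ** 2 for x in col) / (n_cells - 1))
    ranked = sorted(range(len(variances)), key=lambda g: variances[g], reverse=True)
    return sorted(ranked[:n_top])

def first_pc_scores(matrix, n_iter=200, seed=0):
    """Project cells onto the first principal component (power iteration)."""
    n_cells, n_genes = len(matrix), len(matrix[0])
    means = [sum(row[g] for row in matrix) / n_cells for g in range(n_genes)]
    centered = [[row[g] - means[g] for g in range(n_genes)] for row in matrix]
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(n_genes)]
    for _ in range(n_iter):
        scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
        v = [sum(scores[i] * centered[i][g] for i in range(n_cells))
             for g in range(n_genes)]
        norm = math.sqrt(sum(w * w for w in v))
        v = [w / norm for w in v]
    return [sum(x * w for x, w in zip(row, v)) for row in centered]

def two_means_1d(values, n_iter=20):
    """k-means with k=2 on scalar values; a stand-in for graph-based clustering."""
    c0, c1 = min(values), max(values)
    labels = []
    for _ in range(n_iter):
        labels = [0 if abs(x - c0) <= abs(x - c1) else 1 for x in values]
        g0 = [x for x, lab in zip(values, labels) if lab == 0]
        g1 = [x for x, lab in zip(values, labels) if lab == 1]
        c0, c1 = sum(g0) / len(g0), sum(g1) / len(g1)
    return labels

def marker_gene(matrix, labels, cluster):
    """Gene with the largest mean expression difference, cluster vs. the rest."""
    inside = [row for row, lab in zip(matrix, labels) if lab == cluster]
    outside = [row for row, lab in zip(matrix, labels) if lab != cluster]
    def diff(g):
        return (sum(r[g] for r in inside) / len(inside)
                - sum(r[g] for r in outside) / len(outside))
    return max(range(len(matrix[0])), key=diff)

# Hypothetical toy data: 6 cells (rows) x 4 genes (columns) of raw counts.
counts = [
    [60, 5, 20, 15], [58, 6, 21, 15], [62, 4, 19, 15],
    [5, 60, 20, 15], [6, 58, 21, 15], [4, 62, 19, 15],
]

norm = log_normalize(counts)
hvg = top_variable_genes(norm, n_top=2)
reduced = [[row[g] for g in hvg] for row in norm]
labels = two_means_1d(first_pc_scores(reduced))
markers = {c: marker_gene(norm, labels, c) for c in sorted(set(labels))}

print("highly variable genes:", hvg)
print("cluster labels:", labels)
print("marker gene per cluster:", markers)
```

On real data, each of these stand-ins would normally be replaced by a dedicated single-cell analysis toolkit; the sketch only shows how the outputs of one step feed the next.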

Data availability

Enquiries about data availability should be directed to the authors.

Funding

The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.

Author information

Corresponding author

Correspondence to Mothe Rajesh.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Rajesh, M., Martha, S. Analysis of transcriptome of single-cell RNA sequencing data using machine learning. Soft Comput 27, 9131–9141 (2023). https://doi.org/10.1007/s00500-023-08432-1

  • DOI: https://doi.org/10.1007/s00500-023-08432-1
