Parallel Density-Based Downsampling of Cytometry Data

  • Martin Nemček
  • Tomáš Jarábek (email author)
  • Mária Lucká
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1005)


Identification of cellular populations is the first step in analyzing cytometry data. To identify both abundant populations and rare outlying ones, the data must first pass through a density-based preprocessing step that equalizes the representation of the populations. Density-based downsampling keeps representative points in the cellular space while discarding irrelevant ones. We propose a fast and fully deterministic algorithm for density calculation based on space partitioning and a tree representation, together with an iterative approach to downsampling that uses this fast density calculation. We compared our algorithm with SPADE, the most widely used approach in this area, and achieved comparable results in a significantly shorter runtime.
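As a rough illustration of the general idea only, the following minimal Python sketch estimates density by counting points per grid cell and then keeps a bounded number of points from each occupied cell. It is not the algorithm proposed in the paper: the uniform grid, the per-cell cap, and the names `grid_density`, `downsample`, `cell_size`, and `max_per_cell` are all assumptions made for this example.

```python
from collections import defaultdict
from math import floor


def grid_density(points, cell_size):
    """Return (density, cells): density[i] is the number of points that
    share a grid cell with point i; cells maps a cell key to point indices.
    The uniform grid is an illustrative assumption, not the paper's method."""
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        # Integer grid coordinates identify the cell containing the point.
        key = tuple(floor(x / cell_size) for x in p)
        cells[key].append(idx)
    density = [0] * len(points)
    for members in cells.values():
        for idx in members:
            density[idx] = len(members)
    return density, cells


def downsample(points, cell_size, max_per_cell):
    """Keep at most max_per_cell points (in input order) from every occupied
    cell, so dense regions are thinned while sparse ones are left intact."""
    _, cells = grid_density(points, cell_size)
    kept = []
    for members in cells.values():
        kept.extend(members[:max_per_cell])
    return [points[i] for i in sorted(kept)]


# Toy data: a dense cluster near the origin and one rare outlying point.
dense = [(0.01 * i, 0.01 * i) for i in range(50)]
rare = [(5.2, 5.7)]
sample = downsample(dense + rare, cell_size=1.0, max_per_cell=5)
```

Because no random sampling is involved, the same input always yields the same output. In this toy case the dense cluster is cut to five points while the single rare point is retained.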


Downsampling · Cytometry · Density · Parallel computing · Outlier detection



This work was partially supported by the Scientific Grant Agency of The Slovak Republic, Grant No. VG 1/0458/18 and APVV-16-0484.



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Martin Nemček (1)
  • Tomáš Jarábek (1, email author)
  • Mária Lucká (1)
  1. Faculty of Informatics and Information Technologies, Slovak University of Technology, Bratislava, Slovakia
