Identification of cellular populations is the first step in analyzing cytometry data. To identify both abundant populations and rare outlying ones, the data must first be preprocessed with a density-based method that equalizes the representation of populations. Density-based downsampling keeps representative points in the cellular space while discarding irrelevant ones. We propose a fast, fully deterministic algorithm for density calculation based on space partitioning and a tree representation, together with an iterative approach to downsampling that uses this fast density calculation. We compared our algorithm with SPADE, the most widely used approach in this area, and achieved comparable results in a significantly shorter runtime.
Keywords: Downsampling, Cytometry, Density, Parallel computing, Outlier detection
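To make the idea of deterministic, density-based downsampling concrete, the following minimal Python sketch may help. It is not the algorithm proposed in the paper: a uniform grid stands in for the space partitioning and tree representation, a single pass replaces the iterative scheme, and all names (`grid_cell`, `group_by_cell`, `downsample`, `cell_size`, `target_per_cell`) are hypothetical. It only illustrates how counting points per partition gives a density estimate that can be used to thin dense populations while leaving rare ones intact.

```python
from collections import defaultdict
from math import floor


def grid_cell(point, cell_size):
    """Map a point to the integer coordinates of its grid cell."""
    return tuple(floor(x / cell_size) for x in point)


def group_by_cell(points, cell_size):
    """Group point indices by grid cell; the group size is the local density."""
    cells = defaultdict(list)
    for idx, point in enumerate(points):
        cells[grid_cell(point, cell_size)].append(idx)
    return cells


def downsample(points, cell_size, target_per_cell):
    """Keep at most target_per_cell points from every cell.

    Sparse cells (rare populations) are kept whole; dense cells are thinned
    by taking evenly spaced members. No randomness is involved, so the
    result is fully deterministic.
    """
    kept = []
    for members in group_by_cell(points, cell_size).values():
        n = len(members)
        if n <= target_per_cell:
            kept.extend(members)
        else:
            kept.extend(
                members[(i * n) // target_per_cell]
                for i in range(target_per_cell)
            )
    return sorted(kept)


# Toy data: one abundant population (100 points) and one rare one (3 points).
abundant = [(i / 100, i / 100) for i in range(100)]
rare = [(10.1, 10.1), (10.2, 10.2), (10.3, 10.3)]
kept_indices = downsample(abundant + rare, cell_size=1.0, target_per_cell=5)
print(kept_indices)
```

In this toy run the abundant population is reduced to five points while all three rare points survive, which is the equalizing effect the abstract describes. A fixed grid scales poorly with the number of dimensions, which is one reason a tree-based partitioning, as the abstract mentions, is the more practical choice for real cytometry data.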
This work was partially supported by the Scientific Grant Agency of The Slovak Republic, Grant No. VG 1/0458/18 and APVV-16-0484.