- Title
Using single-cell cytometry to illustrate integrated multi-perspective evaluation of clustering algorithms using Pareto fronts.
- Authors
Putri, Givanna H; Koprinska, Irena; Ashhurst, Thomas M; King, Nicholas J C; Read, Mark N
- Abstract
Motivation: Many 'automated gating' algorithms now exist to cluster cytometry and single-cell sequencing data into discrete populations. Comparative algorithm evaluations on benchmark datasets rely either on a single performance metric, or a few metrics considered independently of one another. However, single metrics emphasize different aspects of clustering performance and do not rank clustering solutions in the same order. This underlies the lack of consensus between comparative studies regarding optimal clustering algorithms and undermines the translatability of results onto other non-benchmark datasets.
Results: We propose the Pareto fronts framework as an integrative evaluation protocol, wherein individual metrics are instead leveraged as complementary perspectives. Judged superior are algorithms that provide the best trade-off between the multiple metrics considered simultaneously. This yields a more comprehensive and complete view of clustering performance. Moreover, by broadly and systematically sampling algorithm parameter values using the Latin Hypercube sampling method, our evaluation protocol minimizes (un)fortunate parameter value selections as confounding factors. Furthermore, it reveals how meticulously each algorithm must be tuned in order to obtain good results, vital knowledge for users with novel data. We exemplify the protocol by conducting a comparative study between three clustering algorithms (ChronoClust, FlowSOM and Phenograph) using four common performance metrics applied across four cytometry benchmark datasets. To our knowledge, this is the first time Pareto fronts have been used to evaluate the performance of clustering algorithms in any application domain.
Availability and implementation: Implementation of our Pareto front methodology and all scripts and datasets to reproduce this article are available at https://github.com/ghar1821/ParetoBench.
Supplementary information: Supplementary data are available at Bioinformatics online.
- Subjects
LATIN hypercube sampling; CYTOMETRY; ALGORITHMS; KEY performance indicators (Management)
- Publication
Bioinformatics, 2021, Vol 37, Issue 14/15, p1972
- ISSN
1367-4803
- Publication type
Article
- DOI
10.1093/bioinformatics/btab038
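
The two ideas the abstract combines, Latin Hypercube sampling of parameter values and a Pareto front over several metrics, can be illustrated with a minimal sketch. This is not the authors' implementation (theirs is in the ParetoBench repository); the function names, the parameter bounds, and the example scores are hypothetical, and the sketch assumes every metric is oriented so that higher is better.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Sample n_samples points inside the given (low, high) bounds.

    Each dimension is cut into n_samples equal-width strata, and every
    stratum is used exactly once per dimension (in shuffled order).
    """
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        strata = list(range(n_samples))
        rng.shuffle(strata)
        width = (high - low) / n_samples
        # One uniformly random point inside each stratum.
        columns.append([low + (s + rng.random()) * width for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


def dominates(a, b):
    """True if a is at least as good as b on every metric and strictly
    better on at least one (assumption: higher is better)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(scores):
    """Return the indices of score vectors that no other vector dominates."""
    return [
        i for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical parameter ranges for some clustering algorithm.
param_sets = latin_hypercube(4, [(0.0, 1.0), (0.0, 8.0)])

# Hypothetical (metric_1, metric_2) scores, one pair per clustering solution.
scores = [(0.9, 0.5), (0.7, 0.8), (0.6, 0.6), (0.9, 0.4)]
front = pareto_front(scores)  # solutions 0 and 1 are non-dominated
```

In this toy example, solution 2 is dominated by solution 1 and solution 3 by solution 0, so only the first two remain on the front. In a real evaluation, each sampled parameter set would be run through a clustering algorithm and scored before the front is computed.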