A theorem proving approach for automatically synthesizing visualizations of flow cytometry data

Sunny Raj; Faraz Hussain; Zubir Husein; Neslisah Torosdagli; Damla Turgut; Narsingh Deo; Sumanta Pattanaik; Chung-Che (Jeff) Chang; Sumit Kumar Jha

doi:10.1186/s12859-017-1662-4

BMC Bioinformatics (Jun 2017)

Affiliations

Sunny Raj: Computer Science Department, University of Central Florida
Faraz Hussain: School of Computing, University of Utah
Zubir Husein: Computer Science Department, University of Central Florida
Neslisah Torosdagli: Computer Science Department, University of Central Florida
Damla Turgut: Computer Science Department, University of Central Florida
Narsingh Deo: Computer Science Department, University of Central Florida
Sumanta Pattanaik: Computer Science Department, University of Central Florida
Chung-Che (Jeff) Chang: Department of Pathology, Florida Hospital
Sumit Kumar Jha: Computer Science Department, University of Central Florida

DOI: https://doi.org/10.1186/s12859-017-1662-4
Journal volume & issue: Vol. 18, no. S8
pp. 1 – 11

Abstract

Background: Polychromatic flow cytometry is a popular technique that is widely used in the medical sciences, especially for studying phenotypic properties of cells. The high dimensionality of the data generated by flow cytometry usually makes them difficult to visualize. The naive solution of plotting a two-dimensional graph for every combination of observables becomes impractical as the number of dimensions increases. A natural solution is to project the data from the original high-dimensional space to a lower-dimensional space while approximately preserving the overall relationship between the data points. The expert can then easily visualize and analyze this low-dimensional embedding of the original dataset.

Results: This paper describes a new method, SANJAY, for visualizing high-dimensional flow cytometry datasets. The technique uses a decision procedure to automatically synthesize two-dimensional and three-dimensional projections of the original high-dimensional data while trying to minimize distortion. We compare SANJAY to the popular multidimensional scaling (MDS) approach for visualization of small data sets drawn from a representative set of benchmarks, and our experiments show that SANJAY produces distortions that are 1.44 to 4.15 times smaller than those produced by MDS. Our experimental results also show that SANJAY outperforms the Random Projections technique in terms of the distortion of the projections.

Conclusions: We describe a new algorithmic technique that uses a symbolic decision procedure to automatically synthesize low-dimensional projections of flow cytometry data, which typically have a high number of dimensions. To our knowledge, our algorithm is the first application of automated theorem proving to the automatic generation of highly accurate, low-dimensional visualizations of high-dimensional data.
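To make the notion of projection distortion concrete, the following is a minimal Python sketch of the Random Projections baseline mentioned above, together with one possible distortion measure. It is not the SANJAY algorithm, which relies on a decision procedure. The abstract does not define distortion, so the measure used here (largest pairwise distance ratio divided by the smallest) is an assumption, as are the function names `random_projection` and `distortion` and the synthetic data.

```python
import math
import random
from itertools import combinations


def random_projection(points, target_dim, seed=0):
    """Project points with a Gaussian random matrix (illustrative baseline only)."""
    rng = random.Random(seed)
    source_dim = len(points[0])
    # Entries are N(0, 1) scaled by 1/sqrt(target_dim), a common convention.
    matrix = [
        [rng.gauss(0.0, 1.0) / math.sqrt(target_dim) for _ in range(source_dim)]
        for _ in range(target_dim)
    ]
    return [
        [sum(row[j] * p[j] for j in range(source_dim)) for row in matrix]
        for p in points
    ]


def distortion(original, projected):
    """Assumed measure: max over min of projected/original pairwise distance ratios.

    A value of 1.0 means all pairwise distances are scaled uniformly.
    """
    ratios = []
    for i, j in combinations(range(len(original)), 2):
        d_high = math.dist(original[i], original[j])
        if d_high == 0.0:
            continue  # skip coincident points; their ratio is undefined
        ratios.append(math.dist(projected[i], projected[j]) / d_high)
    if not ratios:
        raise ValueError("need at least two distinct points")
    if min(ratios) == 0.0:
        return math.inf  # two distinct points collapsed onto each other
    return max(ratios) / min(ratios)


# Synthetic stand-in for a small high-dimensional dataset (20 points, 8 dimensions).
data_rng = random.Random(42)
points = [[data_rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(20)]
projected = random_projection(points, target_dim=2)
print("distortion of a 2-D random projection:", distortion(points, projected))
```

Under this assumed measure, a lower value indicates that the low-dimensional picture preserves the relative distances between data points more faithfully, which is the kind of quantity on which the abstract compares SANJAY with MDS and Random Projections.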

Published in BMC Bioinformatics

ISSN: 1471-2105 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Biology (General)
Website: http://www.biomedcentral.com/bmcbioinformatics/
