Converting tabular data into images for deep learning with convolutional neural networks

Yitan Zhu; Thomas Brettin; Fangfang Xia; Alexander Partin; Maulik Shukla; Hyunseung Yoo; Yvonne A. Evrard; James H. Doroshow; Rick L. Stevens

doi:10.1038/s41598-021-90923-y

Scientific Reports (May 2021)

Affiliations

Yitan Zhu: Computing, Environment and Life Sciences, Argonne National Laboratory
Thomas Brettin: Computing, Environment and Life Sciences, Argonne National Laboratory
Fangfang Xia: Computing, Environment and Life Sciences, Argonne National Laboratory
Alexander Partin: Computing, Environment and Life Sciences, Argonne National Laboratory
Maulik Shukla: Computing, Environment and Life Sciences, Argonne National Laboratory
Hyunseung Yoo: Computing, Environment and Life Sciences, Argonne National Laboratory
Yvonne A. Evrard: Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Inc.
James H. Doroshow: Developmental Therapeutics Branch, National Cancer Institute
Rick L. Stevens: Computing, Environment and Life Sciences, Argonne National Laboratory

DOI: https://doi.org/10.1038/s41598-021-90923-y
Journal volume & issue: Vol. 11, no. 1, pp. 1–11

Abstract

Convolutional neural networks (CNNs) have been successfully used in many applications where important information about data is embedded in the order of features, such as speech and imaging. However, most tabular data do not assume a spatial relationship between features, and thus are unsuitable for modeling using CNNs. To meet this challenge, we develop a novel algorithm, image generator for tabular data (IGTD), to transform tabular data into images by assigning features to pixel positions so that similar features are close to each other in the image. The algorithm searches for an optimized assignment by minimizing the difference between the ranking of distances between features and the ranking of distances between their assigned pixels in the image. We apply IGTD to transform gene expression profiles of cancer cell lines (CCLs) and molecular descriptors of drugs into their respective image representations. Compared with existing transformation methods, IGTD generates compact image representations with better preservation of feature neighborhood structure. Evaluated on benchmark drug screening datasets, CNNs trained on IGTD image representations of CCLs and drugs exhibit a better performance of predicting anti-cancer drug response than both CNNs trained on alternative image representations and prediction models trained on the original tabular data.
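The core idea in the abstract (place features on a pixel grid so that the ranking of feature-to-feature distances matches the ranking of pixel-to-pixel distances) can be illustrated with a small sketch. This is not the authors' implementation. All function names are hypothetical, and several choices are assumptions made only for illustration: feature distance is taken as 1 minus the Pearson correlation, pixel distance is Euclidean, the mismatch is measured as a sum of absolute rank differences, tied distances receive arbitrary distinct ranks, and the search is a plain random pairwise swap that keeps a swap only when the error decreases.

```python
import numpy as np

def rank_matrix(dist):
    """Replace each pairwise distance by its rank among all distinct pairs."""
    n = dist.shape[0]
    iu = np.triu_indices(n, k=1)
    order = np.argsort(dist[iu], kind="stable")
    ranks = np.empty(len(order), dtype=float)
    ranks[order] = np.arange(1, len(order) + 1)  # ties get arbitrary distinct ranks
    out = np.zeros((n, n))
    out[iu] = ranks
    return out + out.T  # symmetric, zero diagonal

def feature_distances(X):
    """Assumed feature distance: 1 - Pearson correlation between columns of X."""
    return 1.0 - np.corrcoef(X, rowvar=False)

def pixel_distances(n_rows, n_cols):
    """Euclidean distance between pixel coordinates, in row-major pixel order."""
    coords = np.array([(r, c) for r in range(n_rows) for c in range(n_cols)], dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def assignment_error(R_feat, R_pix, perm):
    """Sum of absolute rank differences; perm[p] is the feature placed at pixel p."""
    return np.abs(R_feat[np.ix_(perm, perm)] - R_pix).sum() / 2.0

def assign_features(X, n_rows, n_cols, n_iter=2000, seed=0):
    """Greedy random-swap search (a simplification, assumed for this sketch)."""
    n = X.shape[1]
    if n != n_rows * n_cols:
        raise ValueError("number of features must equal n_rows * n_cols")
    rng = np.random.default_rng(seed)
    R_feat = rank_matrix(feature_distances(X))
    R_pix = rank_matrix(pixel_distances(n_rows, n_cols))
    perm = np.arange(n)
    err = assignment_error(R_feat, R_pix, perm)
    history = [err]
    for _ in range(n_iter):
        i, j = rng.choice(n, size=2, replace=False)
        perm[[i, j]] = perm[[j, i]]          # tentatively swap two pixels' features
        new_err = assignment_error(R_feat, R_pix, perm)
        if new_err < err:
            err = new_err                    # keep the improving swap
        else:
            perm[[i, j]] = perm[[j, i]]      # undo the swap
        history.append(err)
    return perm, history

def to_image(sample, perm, n_rows, n_cols):
    """Arrange one sample's feature values into its image representation."""
    return np.asarray(sample)[perm].reshape(n_rows, n_cols)
```

For a data matrix `X` with 16 feature columns, `perm, history = assign_features(X, 4, 4)` returns the pixel-to-feature assignment and the error trace, and `to_image(X[0], perm, 4, 4)` yields the 4 by 4 image for the first sample. The same `perm` is reused for every sample, so all images share one feature layout.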

Published in Scientific Reports

ISSN: 2045-2322 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine; Science
Website: https://www.nature.com/srep/
