Examining data visualization pitfalls in scientific publications

Vinh T Nguyen; Kwanghee Jung; Vibhuti Gupta

doi:10.1186/s42492-021-00092-y

Visual Computing for Industry, Biomedicine, and Art (Oct 2021)

Affiliations

Vinh T Nguyen: Department of Information Technology, TNU – University of Information and Communication Technology
Kwanghee Jung: Department of Educational Psychology, Leadership, and Counseling, Texas Tech University
Vibhuti Gupta: Department of Computer Science and Data Science, Meharry Medical College

DOI: https://doi.org/10.1186/s42492-021-00092-y
Journal volume & issue: Vol. 4, no. 1
pp. 1 – 15

Abstract

Data visualization blends art and science to convey stories from data via graphical representations. Considering different problems, applications, requirements, and design goals, it is challenging to combine these two components at their full force. While the art component involves creating visually appealing and easily interpreted graphics for users, the science component requires accurate representations of a large amount of input data. With a lack of the science component, visualization cannot serve its role of creating correct representations of the actual data, thus leading to wrong perception, interpretation, and decision. It might be even worse if incorrect visual representations were intentionally produced to deceive the viewers. To address common pitfalls in graphical representations, this paper focuses on identifying and understanding the root causes of misinformation in graphical representations. We reviewed the misleading data visualization examples in the scientific publications collected from indexing databases and then projected them onto the fundamental units of visual communication such as color, shape, size, and spatial orientation. Moreover, a text mining technique was applied to extract practical insights from common visualization pitfalls. Cochran’s Q test and McNemar’s test were conducted to examine if there is any difference in the proportions of common errors among color, shape, size, and spatial orientation. The findings showed that the pie chart is the most misused graphical representation, and size is the most critical issue. It was also observed that there were statistically significant differences in the proportion of errors among color, shape, size, and spatial orientation.

Published in Visual Computing for Industry, Biomedicine, and Art

ISSN: 2524-4442 (Online)
Publisher: SpringerOpen
Country of publisher: Singapore
LCC subjects: Fine Arts: Drawing. Design. Illustration; Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software
Website: http://vciba.springeropen.com
