IEEE Access (Jan 2023)

Toward a Practical and Timely Diagnosis of Application’s I/O Behavior

  • Tania Esteves,
  • Ricardo Macedo,
  • Rui Oliveira,
  • Joao Paulo

DOI
https://doi.org/10.1109/ACCESS.2023.3322104
Journal volume & issue
Vol. 11
pp. 110184 – 110207

Abstract

We present DIO, a generic tool for observing inefficient and erroneous I/O interactions between applications and in-kernel storage backends that lead to performance, dependability, and correctness issues. DIO eases the analysis and enables near real-time visualization of complex I/O patterns for data-intensive applications generating millions of storage requests. This is achieved by non-intrusively intercepting system calls, enriching collected data with relevant context, and providing timely analysis and visualization for traced events. We demonstrate its usefulness by analyzing four production-level applications. Results show that DIO enables diagnosing inefficient I/O patterns that lead to poor application performance, unexpected and redundant I/O calls caused by high-level libraries, resource contention in multithreaded I/O that leads to high tail latency, and erroneous file accesses that cause data loss. Moreover, through a detailed evaluation, we show that, when comparing DIO’s inline diagnosis pipeline with a similar state-of-the-art solution, our system captures up to 28× more events while keeping tracing performance overhead between 14% and 51%.

Keywords