IEEE Access (Jan 2019)

A Methodology to Assess Output Vulnerability Factors for Detecting Silent Data Corruption

  • Junchi Ma,
  • Zongtao Duan,
  • Lei Tang

DOI
https://doi.org/10.1109/ACCESS.2019.2936893
Journal volume & issue
Vol. 7
pp. 118135 – 118145

Abstract

As process technology scales, electronic devices become more susceptible to soft errors induced by radiation. Silent data corruption (SDC) is considered the most severe outcome of a soft error. The likelihood that a faulty variable produces SDC varies widely from one variable to another. Without profiling the vulnerability of variables, the derived detectors often suffer from a low SDC detection rate or unacceptable overhead. To assess the vulnerability of variables to SDC, this paper proposes a metric called Output Vulnerability Factor (OVF). The metric is used to rank each variable's priority in the detector derivation process so that the most SDC-prone variables in the program are selectively protected. The calculation of OVF is based on the enhanced Dynamic Dependence Graph (eDDG), a proposed instruction-level error propagation model. We filter out the edges representing identified crash propagation paths and perform a backward traversal of the eDDG to obtain the SDC propagation paths. Further, the error masking probability is estimated for the edges that correspond to value comparison and logic operations. Fault injections show that our approach achieves an SDC detection rate of 65.0% when the top 10% of variables ranked by OVF are monitored. Compared with previous methods, the SDC detection rate increases by 12-21%.
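
The pipeline outlined in the abstract (filter crash edges, traverse backward from the outputs, discount by masking probability, rank) can be illustrated with a small sketch. The abstract does not give the OVF formula, so the score below is a stand-in assumption: the probability that an error in a node reaches a program output along at least one non-crash path, with edges treated as independent. The graph, the edge kinds, the masking values, and the names `toy_scores` and `top_fraction` are all hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict, deque

# Hypothetical toy graph: (source, destination, kind, masking_probability).
# kind == "crash" marks an edge on an identified crash propagation path.
# All nodes and numbers are invented for illustration.
EDGES = [
    ("a", "b", "data", 0.0),
    ("b", "out", "data", 0.0),
    ("c", "out", "compare", 0.5),
    ("d", "addr", "crash", 0.0),
    ("addr", "out", "data", 0.0),
]
OUTPUTS = {"out"}


def toy_scores(edges, outputs):
    """Stand-in score (not the paper's OVF): chance an error reaches an output."""
    nodes = set(outputs)
    for s, d, _kind, _m in edges:
        nodes.update((s, d))
    # Step 1: filter out edges that represent crash propagation.
    kept = [(s, d, m) for s, d, kind, m in edges if kind != "crash"]
    preds = defaultdict(list)   # destination -> [(source, masking probability)]
    pending = defaultdict(int)  # node -> successors not yet processed
    for s, d, m in kept:
        preds[d].append((s, m))
        pending[s] += 1
    # miss[n]: probability that an error in n reaches no output at all.
    miss = {n: 1.0 for n in nodes}
    score = {}
    # Step 2: backward traversal, starting from nodes with no remaining
    # successors. A dynamic dependence graph is assumed to be acyclic.
    queue = deque(n for n in nodes if pending[n] == 0)
    while queue:
        n = queue.popleft()
        score[n] = 1.0 if n in outputs else 1.0 - miss[n]
        for p, m in preds[n]:
            # Step 3: the edge passes the error on with probability (1 - m).
            miss[p] *= 1.0 - (1.0 - m) * score[n]
            pending[p] -= 1
            if pending[p] == 0:
                queue.append(p)
    return score


def top_fraction(score, outputs, fraction):
    """Return the highest-scoring non-output nodes, e.g. fraction=0.10."""
    ranked = sorted((n for n in score if n not in outputs),
                    key=lambda n: (-score[n], n))
    return ranked[:max(1, math.ceil(fraction * len(ranked)))]


scores = toy_scores(EDGES, OUTPUTS)
protected = top_fraction(scores, OUTPUTS, 0.5)
```

In this toy graph, node `d` only feeds a crash edge, so it scores zero and would not be selected for protection, while node `c` is discounted because its only path to the output passes through a comparison edge with an assumed masking probability of 0.5.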

Keywords