Frontiers in Computer Science (Oct 2024)

A survivability analysis of enterprise hard drives incorporating the impact of workload

  • Aman Mallik,
  • B Ranjith Reddy,
  • Gadadhar Sahoo

DOI
https://doi.org/10.3389/fcomp.2024.1400943
Journal volume & issue
Vol. 6

Abstract

Introduction: Hard disk drive (HDD) failure is a significant cause of downtime in enterprise storage systems. Research suggests that data access rates strongly influence the survival probability of HDDs.

Methods: This paper proposes a model to estimate the probability of HDD failure, using factors such as the total data (TD) read or written and the average access rate (AAR) for a specific drive model. The study utilizes a dataset of HDD failures to analyze the effects of these variables.

Results: The model was validated using case studies, demonstrating a strong correlation between access rate management and reduced HDD failure risk. The results indicate that managing data access rates through improved throttle commands can significantly enhance drive reliability.

Discussion: Our approach suggests that optimizing throttle commands at the storage controller level can help mitigate the risk of HDD failure by controlling data access rates, thereby improving system longevity and reducing downtime in enterprise storage systems.
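
To make the idea of a workload-aware survival estimate concrete, the following is a minimal sketch only, not the model proposed in the paper. It swaps in the standard Kaplan-Meier estimator, uses total data (TD) as the exposure axis in place of time, and splits drives into two groups by average access rate (AAR). The function names, the threshold, and the toy records are all hypothetical and chosen purely for illustration.

```python
from collections import Counter


def kaplan_meier(exposures, failed):
    """Kaplan-Meier survival estimate.

    exposures: total data read/written per drive (any consistent unit)
    failed:    True if the drive failed at that exposure, False if it was
               still healthy when last observed (censored)
    Returns a list of (exposure, survival_probability) at each failure point.
    """
    deaths = Counter(t for t, f in zip(exposures, failed) if f)
    at_risk = len(exposures)
    survival = 1.0
    curve = []
    for t in sorted(set(exposures)):
        d = deaths.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Every drive observed at t (failed or censored) leaves the risk set.
        at_risk -= sum(1 for x in exposures if x == t)
    return curve


def curves_by_access_rate(drives, aar_threshold):
    """Split drives by AAR and estimate one survival curve per group.

    drives: iterable of (total_data, avg_access_rate, failed) tuples
    aar_threshold: hypothetical cut-off separating low from high AAR
    """
    groups = {"low_aar": ([], []), "high_aar": ([], [])}
    for total_data, aar, did_fail in drives:
        key = "high_aar" if aar > aar_threshold else "low_aar"
        groups[key][0].append(total_data)
        groups[key][1].append(did_fail)
    return {k: kaplan_meier(e, f) for k, (e, f) in groups.items()}


# Made-up records for illustration: (total data, average access rate, failed?)
toy_drives = [
    (1, 5, True), (2, 5, False), (3, 5, True), (4, 5, True),
    (2, 1, False), (4, 1, True),
]
curves = curves_by_access_rate(toy_drives, aar_threshold=3)
```

Comparing the two curves at the same TD value gives a rough view of how access rate relates to survival in a given dataset. A real analysis would need a drive-model-specific dataset and, most likely, a regression-style model that treats AAR as a continuous covariate rather than a two-way split.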

Keywords