Online Academic Journal of Information Technology (Sep 2016)

Investigation of Amazon and Google for Fault Tolerance Strategies in Cloud Computing Services

  • Sinan Can Açan,
  • Shereen Al-raheym

DOI
https://doi.org/10.5824/1309-1581.2016.4.001.x
Journal volume & issue
Vol. 7, no. 25
pp. 7 – 22

Abstract

Cloud computing has recently become an attractive topic because it offers information technology solutions through virtual machines as on-demand services for sharing and consuming resources over the Internet. With the rapid development of such services, fault tolerance in the cloud has become a major concern, since reliability, availability and dependability are especially critical to this new service type. This work investigates techniques and means of making cloud services fault tolerant and of keeping cloud customers’ systems and enterprises that run on the cloud safe from failures. Failures in cloud-enabled services should be expected to occur, and hence they should be handled. The essential benefits of implementing fault tolerance strategies are ensuring business continuity, avoiding financial loss, recovering systems from failures, and providing disaster recovery. The specific focus is to explore scenarios for avoiding or recovering from failures through redundancy, checkpointing and replication. Commercial IaaS providers such as Amazon’s AWS and Google’s GCE are taken as examples of how providers protect their infrastructure from failures; in this way, a robust architecture with fault tolerance properties could be built for a system or enterprise. On this basis, general conceptual steps with fault tolerance considerations are proposed.

Keywords