Memory sharing for handling memory overload on physical machines in cloud data centers

Yaozhong Ge; Yu-Chu Tian; Zu-Guo Yu; Weizhe Zhang

doi:10.1186/s13677-023-00405-x

Journal of Cloud Computing: Advances, Systems and Applications (Feb 2023)

Affiliations

Yaozhong Ge: School of Computer Science, Queensland University of Technology
Yu-Chu Tian: School of Computer Science, Queensland University of Technology
Zu-Guo Yu: Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, and Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University
Weizhe Zhang: School of Cyberspace Science, Harbin Institute of Technology

DOI: https://doi.org/10.1186/s13677-023-00405-x
Journal volume & issue: Vol. 12, no. 1
pp. 1 – 20

Abstract

Over-committing computing resources is a widely adopted strategy for increased cluster utilization in Infrastructure as a Service (IaaS) cloud data centers. A potential consequence of over-committing computing resources is memory overload of physical machines (PMs). Memory overload occurs if memory usage exceeds a defined alarm threshold, exposing running computation tasks to the risk of being terminated by the operating system. A prevailing measure to handle memory overload of a PM is live migration of virtual machines (VMs). However, this not only consumes network bandwidth, CPU, and other resources, but also causes temporary unavailability of the VMs being migrated. To handle memory overload, we present a memory sharing system in this paper for PMs in cloud data centers. With memory sharing, a PM automatically borrows memory from a remote PM when necessary, and releases the borrowed memory when memory overload disappears. This is implemented through swapping inactive memory pages to remote memory resources. Experimental studies conducted on InfiniBand-networked PMs show that the memory sharing system is fully functional. The measured throughput and latency are around 929 Mbps and 1.3 μs, respectively, on average for remote memory access. They are similar to those from accessing a local non-volatile memory express (NVMe) solid-state drive, and thus are promising in real applications.
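The borrow-and-release behaviour described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names MemoryState and decide, the 0.9 alarm threshold, and the idea of returning a simple action string are all assumptions made for illustration. The sketch only shows the threshold logic (borrow while usage exceeds the alarm threshold, release borrowed memory once the overload has disappeared), not the page swapping over InfiniBand.

```python
from dataclasses import dataclass


@dataclass
class MemoryState:
    """Hypothetical snapshot of one physical machine's memory."""
    total_mb: int          # local physical memory
    used_mb: int           # local memory currently in use
    borrowed_mb: int = 0   # memory currently borrowed from a remote PM


def decide(state: MemoryState, alarm_threshold: float = 0.9) -> str:
    """Return 'borrow', 'release', or 'hold' for the given state.

    alarm_threshold is an assumed value; the abstract only says that
    overload means usage exceeding a defined alarm threshold.
    """
    usage = state.used_mb / state.total_mb
    if usage > alarm_threshold:
        # Memory overload: ask a remote PM for memory.
        return "borrow"
    if state.borrowed_mb > 0:
        # Overload has disappeared: give borrowed memory back.
        return "release"
    return "hold"


if __name__ == "__main__":
    print(decide(MemoryState(total_mb=1000, used_mb=950)))
    print(decide(MemoryState(total_mb=1000, used_mb=500, borrowed_mb=256)))
    print(decide(MemoryState(total_mb=1000, used_mb=500)))
```

A production system would likely add hysteresis between the borrow and release conditions to avoid oscillation near the threshold; that refinement is an assumption here and is not stated in the abstract.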

Published in Journal of Cloud Computing: Advances, Systems and Applications

ISSN: 2192-113X (Online)
Publisher: SpringerOpen
Country of publisher: United Kingdom
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics: Computer engineering. Computer hardware; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://journalofcloudcomputing.springeropen.com
