Achieving Consistent Real-Time Latency at Scale in a Commodity Virtual Machine Environment Through Socket Outsourcing-Based Network Stacks

Oscar F. Garcia; Yasushi Shinjo; Calton Pu

doi:10.1109/ACCESS.2018.2877296

IEEE Access (Jan 2018)

Affiliations

Oscar F. Garcia: ORCiD; Department of Computer Science, University of Tsukuba, Tsukuba, Japan
Yasushi Shinjo: Department of Computer Science, University of Tsukuba, Tsukuba, Japan
Calton Pu: College of Computing, Georgia Institute of Technology, Atlanta, GA, USA

Journal volume & issue: Vol. 6, pp. 69961–69977

Abstract

It is challenging to achieve a consistent real-time (RT) response time in commodity virtual machine (VM) environments because they have longer and more complex network protocol stacks. This paper analyzes such network stacks and proposes a method that achieves consistent latency in a Linux KVM-based hosted environment. The analysis first identifies a priority inversion in the interrupt-first host kernel of vanilla Linux, and the proposed method addresses it by using the PREEMPT_RT patch. The analysis then identifies a second priority inversion in the softirq handling of the host kernel, which the proposed method addresses by dividing softirq handling into RT and non-RT types. Finally, the analysis identifies cache pollution caused by co-located non-RT servers, as well as the same softirq priority inversion in a guest kernel. The proposed method addresses both by socket outsourcing, in which a guest kernel delegates network processing to the host kernel. With these measures, the proposed method achieved consistent latency. Compared to the threaded interrupt handling method, it reduced the standard deviation (SD) of the latencies of a simple RT server by a factor of 6 while achieving 5.6% higher throughput and 32% lower CPU utilization. Compared to the exclusive CPU method, it reduced the SD by a factor of 2 and prevented underutilization of the exclusive CPU. The proposed method was also more scalable in terms of the number of RT VMs: a four-CPU host was able to execute 40 RT VMs while maintaining the throughputs of non-RT servers.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
