Dynamic K-Means Clustering of Workload and Cloud Resource Configuration for Cloud Elastic Model

Tariq Daradkeh; Anjali Agarwal; Marzia Zaman; Nishith Goel

doi:10.1109/ACCESS.2020.3042716

IEEE Access (Jan 2020)

Affiliations

Tariq Daradkeh: Department of Electrical and Computer Engineering, Concordia University, Montreal, QC, Canada
Anjali Agarwal: Department of Electrical and Computer Engineering, Concordia University, Montreal, QC, Canada
Marzia Zaman: Cistech Limited, Ottawa, ON, Canada
Nishith Goel: Cistech Limited, Ottawa, ON, Canada

DOI: https://doi.org/10.1109/ACCESS.2020.3042716
Journal volume & issue: Vol. 8, pp. 219430–219446

Abstract

Cloud elasticity involves timely provisioning and de-provisioning of computing resources and adjusting resource sizes to meet dynamic workload demand. This requires fast and accurate resource scaling methods that match workload demands at minimum cost (e.g., pay-as-you-go). Two dynamically changing parameters must be defined in an elastic model: the workload resource demand classes and the data center resource reconfiguration classes. These parameters are not labeled for the cloud management system when data center logs are captured. Building an advanced elastic model, which defines multiple classes under these two categories (i.e., for workload and for provisioning), is a critical task. A dynamic method is therefore required to define the workload classes and the resource provisioning classes during the configuration time window. Unsupervised learning models such as K-Means face several challenges, including time complexity, selection of the optimum number of clusters (representing the classes), and determination of the cluster centroid values. Such clustering methods depend on minimizing the mean square error between the members of a class and the center of that class's population. These methods are often enhanced with guidelines for finding the centroids, but they still suffer from the limitations of K-Means. For clustering cloud log traces, most reported work uses K-Means clustering to label workload types; however, no reported work labels data center scaling classes. In this work, a novel clustering method is proposed to analyze the characteristics of both workloads and data center configurations, guided by a random variable model transformation (kernel density estimator). This method enhances K-Means clustering by automatically determining the optimum number of classes and finding the mean centroids of the clusters. In addition, it improves the accuracy and time complexity of the standard K-Means clustering model by identifying the best correlation between clustering attributes using statistical correlation methods.
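The general idea of letting a kernel density estimator guide K-Means can be illustrated with a minimal one-dimensional sketch. This is not the authors' algorithm, only one plausible reading of the abstract: the local maxima of a Gaussian kernel density estimate suggest both the number of clusters and their starting centroids, and ordinary K-Means iterations then refine them. The function names (`kde_density`, `kde_peaks`, `kmeans_1d`), the bandwidth value, and the synthetic "utilization" samples are all hypothetical choices made for illustration.

```python
import math


def kde_density(samples, bandwidth):
    """Return a Gaussian kernel density estimate of 1-D samples as a callable."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def kde_peaks(samples, bandwidth, grid_size=201):
    """Evaluate the KDE on a uniform grid and return the interior local maxima."""
    density = kde_density(samples, bandwidth)
    lo, hi = min(samples), max(samples)
    step = (hi - lo) / (grid_size - 1)
    xs = [lo + i * step for i in range(grid_size)]
    ys = [density(x) for x in xs]
    return [
        xs[i]
        for i in range(1, grid_size - 1)
        if ys[i - 1] < ys[i] >= ys[i + 1]
    ]


def nearest(x, centroids):
    """Index of the centroid closest to x."""
    return min(range(len(centroids)), key=lambda k: abs(x - centroids[k]))


def kmeans_1d(samples, init_centroids, max_iter=100):
    """Standard K-Means (Lloyd) iterations in 1-D from the given centroids."""
    centroids = list(init_centroids)
    for _ in range(max_iter):
        groups = [[] for _ in centroids]
        for x in samples:
            groups[nearest(x, centroids)].append(x)
        # An empty group keeps its previous centroid.
        updated = [
            sum(g) / len(g) if g else c for g, c in zip(groups, centroids)
        ]
        if updated == centroids:
            break
        centroids = updated
    labels = [nearest(x, centroids) for x in samples]
    return centroids, labels


if __name__ == "__main__":
    # Hypothetical CPU-utilization samples grouped around three levels.
    data = [c + d for c in (10, 50, 90) for d in range(-4, 5)]
    peaks = kde_peaks(data, bandwidth=5.0)  # bandwidth is an assumed value
    centroids, labels = kmeans_1d(data, peaks)
    print("clusters suggested by KDE:", len(peaks))
    print("refined centroids:", centroids)
```

In this sketch the bandwidth controls how many modes the density estimate exposes, so it indirectly determines the number of clusters; how the paper selects it, and how the approach extends to multiple attributes, is not stated in the abstract.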

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
