Automated Clustering of Virtual Machines based on Correlation of Resource Usage
Abstract
The recent growth in demand for modern applications, combined with the shift to the Cloud computing paradigm, has led to the establishment of large-scale cloud data centers. The increasing size of these infrastructures represents a major challenge in terms of monitoring and management of the system resources. Available solutions typically consider every Virtual Machine (VM) as a black box with independent characteristics, and address scalability issues by reducing the number of monitored resource samples, in most cases considering only average CPU usage sampled at a coarse time granularity. We claim that scalability issues can be addressed by leveraging the similarity between VMs in terms of resource usage patterns. In this paper we propose an automated methodology to cluster VMs depending on the usage of multiple resources, both system- and network-related, assuming no knowledge of the services executed on them. This innovative methodology exploits correlations in resource usage to cluster together similar VMs. We evaluate the methodology through a case study with data coming from an enterprise data center, and we show that high performance may be achieved in automatic VM clustering. Furthermore, we estimate the reduction in the amount of data collected, thus showing that our proposal may simplify the monitoring requirements and help administrators make decisions on the resource management of cloud computing data centers.
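As a rough illustration of the idea summarized above, the following minimal sketch clusters VMs by the correlation between their resource usage series. It is not the authors' implementation. The feature definition (pairwise Pearson correlations among the metrics of each VM), the three metrics, the synthetic data, the helper names `correlation_features`, `kmeans`, and `make_vm`, and the choice of k = 2 are all assumptions made for the example.

```python
import numpy as np

def correlation_features(samples):
    """Describe one VM by the correlations among its resource metrics.

    samples: array of shape (n_metrics, n_timesteps).
    Returns the upper triangle of the Pearson correlation matrix
    (hypothetical feature choice, not taken from the paper).
    """
    corr = np.corrcoef(samples)               # rows are treated as variables
    rows, cols = np.triu_indices(corr.shape[0], k=1)
    return corr[rows, cols]

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means with random initial centers drawn from X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance of every point to every center, then nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic data (assumption): 8 VMs, 3 metrics (CPU, memory, network).
# In the first 4 VMs network follows CPU; in the last 4 it moves opposite.
rng = np.random.default_rng(42)
T = 500

def make_vm(sign):
    cpu = rng.normal(size=T)
    mem = rng.normal(size=T)
    net = sign * cpu + 0.1 * rng.normal(size=T)
    return np.vstack([cpu, mem, net])

vms = [make_vm(+1.0) for _ in range(4)] + [make_vm(-1.0) for _ in range(4)]
X = np.array([correlation_features(vm) for vm in vms])   # shape (8, 3)
labels, centers = kmeans(X, k=2)
print(labels)
```

In this toy setting the two groups differ only in the sign of the CPU/network correlation, so the correlation-based features separate them even though each individual metric has a similar distribution in both groups.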
Keywords
Cloud computing, VM Clustering, k-means, Correlation analysis
DOI: http://dx.doi.org/10.24138/jcomss.v8i4.164
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.