Autonomic Cloud Management System (ACMS)

Overview

Cloud computing enhances collaboration, scalability, and accessibility by reducing cost, improving performance, and providing on-demand computing, storage, and network resources that can be accessed from heterogeneous thin or thick client platforms (e.g., mobile phones, laptops, and PDAs). While current high-performance computing systems are designed to handle the peak workload, the workload varies greatly at runtime. Many studies have shown that data center servers typically operate at a low utilization of 10% to 15%, while their power consumption remains close to that at peak load. Virtualization and cloud management are common methods for increasing utilization efficiency. However, power consumption is still a challenge. According to IDC, the cost of powering and cooling data centers has doubled since 2000 and was expected to reach $40 billion annually by 2012. In addition, it is estimated that the power cost over the lifetime of a cloud system can be 2-3 times higher than the infrastructure cost. High power consumption also creates hot spots in a data center, which reduce reliability in the long term. Power and performance management is therefore a major challenge.

Our goal is to minimize power consumption while maintaining high performance for Platform as a Service (PaaS) by scaling hardware resources up or down at runtime. In this project, we are developing an autonomic power and performance management system based on “AppFlow-based reasoning” targeting cloud systems and data centers. AppFlow is an n-dimensional data structure used to characterize the current operating points of hardware and software resources and to keep track of their projections. In our approach, we classify workloads into a set of workload types, and for each workload type we model its behavior as one AppFlow type.
As in case-based reasoning, online monitoring and analysis of the workload then aims to determine the AppFlow type that most accurately describes the current workload. Once that type is determined, we configure the data center or cloud system resources accordingly, so that workload performance is maximized and power consumption is minimized, as shown in the figure below. Our experimental results show that this approach can reduce power consumption by up to 84% compared to static resource allocation and by up to 30% compared to other methods, with minimal performance degradation.
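
The matching step described above can be illustrated with a minimal sketch. It is not the ACMS implementation: the feature set (CPU, memory, and I/O utilization), the four AppFlow types, their signatures, the resource configurations, and the nearest-signature rule are all assumptions chosen for illustration. The names ResourceConfig, AppFlowType, match_appflow, and reconfigure are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceConfig:
    """Hypothetical hardware allocation tied to one AppFlow type."""
    cores: int
    memory_gb: int
    cpu_freq_ghz: float


@dataclass(frozen=True)
class AppFlowType:
    """Hypothetical AppFlow type: a reference operating point plus a config."""
    name: str
    signature: tuple  # (cpu_util, mem_util, io_util), each in [0, 1]
    config: ResourceConfig


# Illustrative values only; a real system would derive these from profiling.
APPFLOW_TYPES = [
    AppFlowType("cpu_intensive", (0.9, 0.3, 0.1), ResourceConfig(8, 4, 2.6)),
    AppFlowType("memory_intensive", (0.3, 0.9, 0.2), ResourceConfig(2, 16, 1.6)),
    AppFlowType("io_intensive", (0.2, 0.3, 0.9), ResourceConfig(2, 4, 1.2)),
    AppFlowType("idle", (0.05, 0.1, 0.05), ResourceConfig(1, 2, 1.2)),
]


def match_appflow(sample, types=APPFLOW_TYPES):
    """Return the AppFlow type whose signature is closest to the monitored sample.

    Euclidean distance is an assumed similarity measure, used here in the
    spirit of retrieving the nearest case in case-based reasoning.
    """
    return min(types, key=lambda t: math.dist(sample, t.signature))


def reconfigure(sample):
    """Pick the resource configuration for the current monitored sample."""
    return match_appflow(sample).config


if __name__ == "__main__":
    current = (0.85, 0.25, 0.15)  # monitored (cpu, mem, io) utilization
    flow = match_appflow(current)
    print(flow.name, flow.config)
```

In this sketch, each monitoring interval produces one utilization sample, the closest AppFlow type is selected, and its associated configuration is applied. Scaling down to the "idle" configuration when utilization is low is where the power savings would come from under these assumptions.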