Blue Eyes: Scalable and Reliable System Management for Cloud Computing

abstract

With the advent of cloud computing, massive and automated system management has become more important for successful and economical operation of computing resources. However, traditional monolithic system management solutions are designed to scale to only hundreds or thousands of systems at most. In this paper, we present Blue Eyes, a new system management solution to handle hundreds of thousands of systems. Blue Eyes enables highly scalable and reliable system management with a multi-server scale-out architecture. In particular, we structure the management servers into a hierarchical tree to achieve scalability, and management information is replicated into secondary servers to provide reliability and high availability. In addition, Blue Eyes is designed to extend the existing single server implementation without significantly restructuring the code base. Several experimental results with the prototype have demonstrated that Blue Eyes can reliably handle typical management tasks for a large scale of endpoints with dynamic load-balancing across the servers, near linear performance gain with server additions, and an acceptable network overhead. © 2009 IEEE.

name of conference

2009 IEEE International Symposium on Parallel & Distributed Processing

authors

Da Silva, Dilma

published proceedings

2009 IEEE International Symposium on Parallel & Distributed Processing

author list (cited authors)

Song, S., Ryu, K. D., & Da Silva, D.

citation count

6

complete list of authors

Song, Sukhyun||Ryu, Kyung Dong||Da Silva, Dilma

publication date

January 2009

publisher

Institute of Electrical and Electronics Engineers (IEEE)

published in

Proceedings of the International Parallel and Distributed Processing Symposium, IPDPS

keywords

Automated System Management
Availability
Blue Eyes
Cloud Computing
Dynamic Load-balancing
Eyes
File Servers
Hierarchical Tree
Information Management
Internet
Large-scale Systems
Management Information
Management Server
Massive System Management
Multiserver Scale-out Architecture
Network Overhead
Network Servers
Performance Gain
Prototypes
Reliable System Management
Resource Allocation
Resource Management
Scalability
Scalable System Management

Digital Object Identifier (DOI)

10.1109/ipdps.2009.5161232

International Standard Book Number (ISBN) 13

9781424437504

start page

1

end page

8

URL

http://dx.doi.org/10.1109/ipdps.2009.5161232

document type

Conference Paper