Blue Eyes: Scalable and Reliable System Management for Cloud Computing Conference Paper

abstract

  • With the advent of cloud computing, massive and automated system management has become more important for successful and economical operation of computing resources. However, traditional monolithic system management solutions are designed to scale to only hundreds or thousands of systems at most. In this paper, we present Blue Eyes, a new system management solution to handle hundreds of thousands of systems. Blue Eyes enables highly scalable and reliable system management with a multi-server scale-out architecture. In particular, we structure the management servers into a hierarchical tree to achieve scalability, and management information is replicated to secondary servers to provide reliability and high availability. In addition, Blue Eyes is designed to extend the existing single-server implementation without significantly restructuring the code base. Several experimental results with the prototype have demonstrated that Blue Eyes can reliably handle typical management tasks for a large number of endpoints with dynamic load balancing across the servers, near-linear performance gain with server additions, and an acceptable network overhead. © 2009 IEEE.

name of conference

  • 2009 IEEE International Symposium on Parallel & Distributed Processing

published proceedings

  • 2009 IEEE International Symposium on Parallel & Distributed Processing

author list (cited authors)

  • Song, S., Ryu, K. D., & Da Silva, D.

citation count

  • 6

complete list of authors

  • Song, Sukhyun; Ryu, Kyung Dong; Da Silva, Dilma

publication date

  • January 2009