Basser Seminar Series

SHelp: Automatic Self-healing for Multiple Application Instances in a Virtual Machine Environment

Speaker: Associate Professor Bing Bing Zhou
School of IT, University of Sydney

Time: Friday 3 September 2010, 4:00-5:00pm
Refreshments will be available from 3:30pm

Location: The University of Sydney, School of IT Building, Lecture Theatre (Room 123), Level 1


Abstract

When multiple instances of an application run on multiple virtual machines, an interesting problem is how to use the fault-handling result from one application instance to heal the same fault when it occurs on sibling instances, and hence to ensure high service availability in a cloud computing environment. This talk discusses SHelp, a lightweight runtime system that can survive software failures in a virtual machine framework. Technically, it applies "weighted" rescue points and error virtualization techniques to make applications bypass the faulty path. The rescue point database adopts a two-level storage hierarchy so that applications running on different virtual machines can share error-handling information, which reduces redundancy and allows faster, more effective recovery from future faults caused by the same bugs. Evaluation results will be presented showing that SHelp can make server applications recover from certain bugs in just a few seconds with modest performance overhead.
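For readers unfamiliar with the idea, the sharing of rescue points between sibling instances might be pictured roughly as follows. This is only an illustrative Python sketch, not SHelp's implementation: the abstract does not say how weights are assigned or how the two storage levels are organised, so the class names, the "increment the weight on each use" rule, the fault identifiers, and the per-VM cache in front of a shared store are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class RescuePoint:
    function: str      # hypothetical: function where execution is rolled back to
    error_value: int   # hypothetical: value returned to simulate an error
    weight: int = 0    # assumed rule: raised each time the point is used


@dataclass
class RescuePointStore:
    """Hypothetical two-level store: a per-VM cache in front of a shared level."""
    shared: dict = field(default_factory=dict)  # fault id -> RescuePoint, all VMs
    local: dict = field(default_factory=dict)   # fault id -> RescuePoint, this VM

    def lookup(self, fault_id):
        # Check the local level first, then fall back to the shared level.
        if fault_id in self.local:
            return self.local[fault_id]
        point = self.shared.get(fault_id)
        if point is not None:
            self.local[fault_id] = point  # cache for later faults on this VM
        return point

    def record(self, fault_id, point):
        # Remember which rescue point handled this fault and raise its weight.
        point.weight += 1
        self.local[fault_id] = point
        self.shared[fault_id] = point


def choose_rescue_point(candidates):
    """Pick the candidate with the largest weight (first one wins a tie)."""
    return max(candidates, key=lambda p: p.weight)


def handle_fault(store, fault_id, candidates):
    """Reuse a known rescue point for this fault, else pick one by weight."""
    point = store.lookup(fault_id)
    if point is None:
        point = choose_rescue_point(candidates)
    store.record(fault_id, point)
    return point


# Two sibling instances on different VMs share the same second-level store.
shared_level = {}
vm_a = RescuePointStore(shared=shared_level)
vm_b = RescuePointStore(shared=shared_level)

candidates = [RescuePoint("parse_request", -1),
              RescuePoint("read_header", -1, weight=2)]

first = handle_fault(vm_a, "bug-1", candidates)   # chosen by weight on VM A
second = handle_fault(vm_b, "bug-1", [])          # reused on VM B via shared level
```

In this sketch the second instance never has to search for a rescue point itself; it finds the one recorded by its sibling, which is the kind of reuse the abstract describes.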

Speaker's biography

Bing Bing Zhou received his BS degree from Nanjing Institute of Technology, China, and his PhD degree in Computer Science from the Australian National University. He is currently an associate professor at the University of Sydney.

His research interests include parallel/distributed computing, Grid and cloud computing, peer-to-peer systems, algorithms, and bioinformatics.