Phase-based reboot: Reusing operating system execution phases for cheap reboot-based recovery

Kazuya Yamakita, Hiroshi Yamada, Kenji Kono

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

23 Citations (Scopus)

Abstract

Although operating systems (OSes) are crucial to achieving high availability of computer systems, modern OSes are far from bug-free. Rebooting the OS is simple, powerful, and sometimes the only remedy for kernel failures. Once we accept reboot-based recovery as a fact of life, we should try to ensure that the downtime caused by reboots is as short as possible. This paper presents phase-based reboots that shorten the downtime caused by reboot-based recovery. The key idea is to divide a boot sequence into phases. The phase-based reboot reuses a system state in the previous boot if the next boot reproduces the same state. A prototype of the phase-based reboot was implemented on Xen 3.4.1 running para-virtualized Linux 2.6.18. Experiments with the prototype show that it successfully recovered from kernel transient failures inserted by a fault injector, and its downtime was 34.3 to 93.6% shorter than that of the normal reboot-based recovery.
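The key idea in the abstract (divide the boot sequence into phases and reuse a phase's saved state when the next boot would reproduce it) can be sketched roughly as follows. This is only an illustrative model, not the authors' Xen-based implementation: the `Phase` type, the `reusable` flag, the phase names, and every timing value are hypothetical assumptions chosen for the example, and the abstract does not say how the prototype decides that a state is reproducible.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Phase:
    name: str            # hypothetical phase label, not taken from the paper
    boot_seconds: float  # assumed time a normal boot spends in this phase
    reusable: bool       # assumed: True if the next boot would reproduce this state


def reusable_prefix(phases: Sequence[Phase]) -> int:
    """Count the leading phases whose saved state could be reused."""
    count = 0
    for phase in phases:
        if not phase.reusable:
            break  # later phases depend on this one, so stop here
        count += 1
    return count


def downtime_reduction(phases: Sequence[Phase], restore_seconds: float) -> float:
    """Fraction of normal reboot time saved by skipping the reusable phases."""
    full = sum(p.boot_seconds for p in phases)
    skipped = reusable_prefix(phases)
    remaining = sum(p.boot_seconds for p in phases[skipped:])
    # Restoring a saved state is assumed to cost a fixed amount of time.
    phased = remaining + (restore_seconds if skipped else 0.0)
    return (full - phased) / full


# Made-up numbers for illustration only.
example = [
    Phase("hardware init", 10.0, True),
    Phase("kernel boot", 20.0, True),
    Phase("service startup", 30.0, False),
]
reduction = downtime_reduction(example, restore_seconds=5.0)
```

In this toy model the achievable saving depends on how far into the boot sequence the saved state stays reusable. That is one plausible reading of why the reported reduction spans a range (34.3 to 93.6%), but the abstract itself does not explain the spread.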

Original language: English
Title of host publication: 2011 IEEE/IFIP 41st International Conference on Dependable Systems and Networks, DSN 2011
Pages: 169-180
Number of pages: 12
DOIs: https://doi.org/10.1109/DSN.2011.5958216
Publication status: Published - 2011 Aug 26
Event: 2011 IEEE/IFIP 41st International Conference on Dependable Systems and Networks, DSN 2011 - Hong Kong, Hong Kong
Duration: 2011 Jun 27 to 2011 Jun 30

Publication series

Name: Proceedings of the International Conference on Dependable Systems and Networks

Other

Other: 2011 IEEE/IFIP 41st International Conference on Dependable Systems and Networks, DSN 2011
Country: Hong Kong
City: Hong Kong
Period: 11/6/27 to 11/6/30

Keywords

  • Operating System Reliability
  • Reboot-based Recovery
  • Virtualization

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

Cite this

Yamakita, K., Yamada, H., & Kono, K. (2011). Phase-based reboot: Reusing operating system execution phases for cheap reboot-based recovery. In 2011 IEEE/IFIP 41st International Conference on Dependable Systems and Networks, DSN 2011 (pp. 169-180). [5958216] (Proceedings of the International Conference on Dependable Systems and Networks). https://doi.org/10.1109/DSN.2011.5958216