Can Linux be rejuvenated without reboots?

Takeshi Yoshimura, Hiroshi Yamada, Kenji Kono

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

Operating systems (OSes) are crucial for achieving high availability of computer systems. Even if the applications running on the operating system are highly available, a bug inside the kernel may result in a failure of the entire software stack. Rejuvenating OSes is a promising approach to prevent and recover from transient errors. Unfortunately, OS rejuvenation takes a lot of time because we do not have any method other than rebooting the entire OS. In this paper we explore the possibility of rejuvenating Linux without reboots. In our previous research, we investigated the scope of error propagation in Linux. The propagation scope is process-local if the error is confined in the process context that activated it. The scope is kernel-global if the error propagates to other processes' contexts or global data structures. If most errors are process-local, we can rejuvenate the Linux kernel without rebooting the entire kernel because the kernel goes back to a consistent and clean state simply by killing and revoking the resources of the faulting process. Our conclusion is that Linux can be rejuvenated without reboots with high probability. Linux is coded in a defensive way and thus, most of the manifested errors (96%) were process-local and only one error was kernel-global.
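
The two propagation scopes defined in the abstract can be illustrated with a minimal sketch. It is not the authors' tooling: the `ErrorRecord` layout, its field names, and the helper functions are hypothetical, introduced only to show how the definition separates errors that could be cleared by killing the faulting process from errors that would still need a reboot.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Iterable


class Scope(Enum):
    PROCESS_LOCAL = "process-local"
    KERNEL_GLOBAL = "kernel-global"


@dataclass(frozen=True)
class ErrorRecord:
    """One manifested error (hypothetical layout, not taken from the paper)."""
    activating_pid: int              # process context that activated the error
    corrupted_pids: FrozenSet[int]   # process contexts whose state was affected
    touched_global_data: bool        # True if a global kernel structure was affected


def classify(err: ErrorRecord) -> Scope:
    """Apply the abstract's definition of propagation scope."""
    reached_other_process = bool(err.corrupted_pids - {err.activating_pid})
    if reached_other_process or err.touched_global_data:
        return Scope.KERNEL_GLOBAL
    return Scope.PROCESS_LOCAL


def recoverable_without_reboot(err: ErrorRecord) -> bool:
    """A process-local error is assumed to be cleared by killing the faulting
    process and revoking its resources, as the abstract argues."""
    return classify(err) is Scope.PROCESS_LOCAL


def process_local_fraction(errors: Iterable[ErrorRecord]) -> float:
    """Share of errors that are process-local."""
    errors = list(errors)
    if not errors:
        raise ValueError("no errors to classify")
    local = sum(1 for e in errors if classify(e) is Scope.PROCESS_LOCAL)
    return local / len(errors)


# Made-up records for illustration only; they are not experimental data.
SAMPLE = [
    ErrorRecord(101, frozenset({101}), False),
    ErrorRecord(102, frozenset(), False),
    ErrorRecord(103, frozenset({103}), False),
    ErrorRecord(104, frozenset({104, 200}), False),  # reached another process
]
```

Under this reading, an error counts as kernel-global as soon as either condition holds, so a single corrupted global data structure is enough to rule out process-level recovery.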

Original language: English
Title of host publication: Proceedings - 2011 3rd International Workshop on Software Aging and Rejuvenation, WoSAR 2011
Pages: 50-55
Number of pages: 6
DOIs: https://doi.org/10.1109/WoSAR.2011.12
Publication status: Published - 2011
Event: 3rd International Workshop on Software Aging and Rejuvenation, WoSAR 2011 - Hiroshima, Japan
Duration: 2011 Nov 29 to 2011 Dec 1

Other

Other: 3rd International Workshop on Software Aging and Rejuvenation, WoSAR 2011
Country: Japan
City: Hiroshima
Period: 11/11/29 to 11/12/1

Fingerprint

Faulting
Computer operating systems
Linux
Data structures
Computer systems
Availability

Keywords

  • Fault Injection
  • Operating System Dependability
  • Rejuvenation
  • Scope of Error Propagation
  • Software Faults

ASJC Scopus subject areas

  • Software

Cite this

Yoshimura, T., Yamada, H., & Kono, K. (2011). Can Linux be rejuvenated without reboots? In Proceedings - 2011 3rd International Workshop on Software Aging and Rejuvenation, WoSAR 2011 (pp. 50-55). [6141725] https://doi.org/10.1109/WoSAR.2011.12

@inproceedings{81ae95cbaaf141e6a9a9220beb10ed3b,
title = "Can Linux be rejuvenated without reboots?",
abstract = "Operating systems (OSes) are crucial for achieving high availability of computer systems. Even if the applications running on the operating system are highly available, a bug inside the kernel may result in a failure of the entire software stack. Rejuvenating OSes is a promising approach to prevent and recover from transient errors. Unfortunately, OS rejuvenation takes a lot of time because we do not have any method other than rebooting the entire OS. In this paper we explore the possibility of rejuvenating Linux without reboots. In our previous research, we investigated the scope of error propagation in Linux. The propagation scope is process-local if the error is confined in the process context that activated it. The scope is kernel-global if the error propagates to other processes' contexts or global data structures. If most errors are process-local, we can rejuvenate the Linux kernel without rebooting the entire kernel because the kernel goes back to a consistent and clean state simply by killing and revoking the resources of the faulting process. Our conclusion is that Linux can be rejuvenated without reboots with high probability. Linux is coded in a defensive way and thus, most of the manifested errors (96{\%}) were process-local and only one error was kernel-global.",
keywords = "Fault Injection, Operating System Dependability, Rejuvenation, Scope of Error Propagation, Software Faults",
author = "Takeshi Yoshimura and Hiroshi Yamada and Kenji Kono",
year = "2011",
doi = "10.1109/WoSAR.2011.12",
language = "English",
isbn = "9780769546162",
pages = "50--55",
booktitle = "Proceedings - 2011 3rd International Workshop on Software Aging and Rejuvenation, WoSAR 2011",

}

TY - GEN

T1 - Can Linux be rejuvenated without reboots?

AU - Yoshimura, Takeshi

AU - Yamada, Hiroshi

AU - Kono, Kenji

PY - 2011

Y1 - 2011

AB - Operating systems (OSes) are crucial for achieving high availability of computer systems. Even if the applications running on the operating system are highly available, a bug inside the kernel may result in a failure of the entire software stack. Rejuvenating OSes is a promising approach to prevent and recover from transient errors. Unfortunately, OS rejuvenation takes a lot of time because we do not have any method other than rebooting the entire OS. In this paper we explore the possibility of rejuvenating Linux without reboots. In our previous research, we investigated the scope of error propagation in Linux. The propagation scope is process-local if the error is confined in the process context that activated it. The scope is kernel-global if the error propagates to other processes' contexts or global data structures. If most errors are process-local, we can rejuvenate the Linux kernel without rebooting the entire kernel because the kernel goes back to a consistent and clean state simply by killing and revoking the resources of the faulting process. Our conclusion is that Linux can be rejuvenated without reboots with high probability. Linux is coded in a defensive way and thus, most of the manifested errors (96%) were process-local and only one error was kernel-global.

KW - Fault Injection

KW - Operating System Dependability

KW - Rejuvenation

KW - Scope of Error Propagation

KW - Software Faults

UR - http://www.scopus.com/inward/record.url?scp=84857174776&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84857174776&partnerID=8YFLogxK

U2 - 10.1109/WoSAR.2011.12

DO - 10.1109/WoSAR.2011.12

M3 - Conference contribution

SN - 9780769546162

SP - 50

EP - 55

BT - Proceedings - 2011 3rd International Workshop on Software Aging and Rejuvenation, WoSAR 2011

ER -