Asynchronous failure detectors

Cornejo, Alejandro; Lynch, Nancy; Sastry, Srikanth

dc.contributor.author	Cornejo Collado, Alex
dc.contributor.author	Lynch, Nancy Ann
dc.contributor.author	Sastry, Srikanth
dc.date.accessioned	2014-09-25T19:28:28Z
dc.date.available	2014-09-25T19:28:28Z
dc.date.issued	2012-07
dc.identifier.isbn	9781450314503
dc.identifier.uri	http://hdl.handle.net/1721.1/90357
dc.description.abstract	Failure detectors - oracles that provide information about process crashes - are an important abstraction for crash tolerance in distributed systems. Although current failure-detector theory provides great generality and expressiveness, it also poses significant challenges in developing a robust hierarchy of failure detectors. We address some of these challenges by proposing a variant of failure detectors called asynchronous failure detectors and an associated modeling framework. Unlike the traditional failure-detector framework, our framework eschews real time completely. We show that asynchronous failure detectors are sufficiently expressive to include several popular failure detectors. Additionally, we show that asynchronous failure detectors satisfy many desirable properties: they are self-implementable, guarantee that stronger asynchronous failure detectors solve more problems, and ensure that their outputs encode no information other than process crashes. We introduce the notion of a failure detector being representative of a problem to capture the idea that some problems encode the same information about process crashes as their weakest failure detectors do. We show that a large class of problems, called finite problems, do not have representative failure detectors.	en_US
dc.description.sponsorship	National Science Foundation (U.S.) (Science and Technology Center, grant agreement CCF-0939370)	en_US
dc.description.sponsorship	National Science Foundation (U.S.) (NSF Award Number CCF-0726514)	en_US
dc.description.sponsorship	National Science Foundation (U.S.) (NSF Award Number CCF-0937274)	en_US
dc.description.sponsorship	United States. Air Force Office of Scientific Research (AFOSR Award Number FA9550-08-1-0159)	en_US
dc.description.sponsorship	National Science Foundation (U.S.) (NSF Award Number CNS-1035199)	en_US
dc.language.iso	en_US
dc.publisher	Association for Computing Machinery	en_US
dc.relation.isversionof	http://dx.doi.org/10.1145/2332432.2332482	en_US
dc.rights	Creative Commons Attribution-Noncommercial-Share Alike	en_US
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/4.0/	en_US
dc.source	MIT web domain	en_US
dc.title	Asynchronous failure detectors	en_US
dc.type	Article	en_US
dc.identifier.citation	Cornejo, Alejandro, Nancy Lynch, and Srikanth Sastry. “Asynchronous Failure Detectors.” Proceedings of the 2012 ACM Symposium on Principles of Distributed Computing - PODC ’12 (2012), July 16–18, 2012, Madeira, Portugal. ACM New York, NY, USA. p.243-252.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science	en_US
dc.contributor.mitauthor	Cornejo Collado, Alex	en_US
dc.contributor.mitauthor	Lynch, Nancy Ann	en_US
dc.contributor.mitauthor	Sastry, Srikanth	en_US
dc.relation.journal	Proceedings of the 2012 ACM symposium on Principles of distributed computing - PODC '12	en_US
dc.eprint.version	Author's final manuscript	en_US
dc.type.uri	http://purl.org/eprint/type/JournalArticle	en_US
eprint.status	http://purl.org/eprint/status/PeerReviewed	en_US
dspace.orderedauthors	Cornejo, Alejandro; Lynch, Nancy; Sastry, Srikanth	en_US
dc.identifier.orcid	https://orcid.org/0000-0003-3045-265X
dspace.mitauthor.error	true
mit.license	OPEN_ACCESS_POLICY	en_US
mit.metadata.status	Complete

Files in this item

Name: Lynch_Asynchronous failure.pdf
Size: 220.2Kb
Format: PDF

This item appears in the following Collection(s)

MIT Open Access Articles