Fault tolerant dynamic agent systems
Author(s)
Roewe, James M
Other Contributors
Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.
Advisor
Larry Rudolph.
Abstract
Partial system snapshots reduce the cost per node so that it depends only on the size of the connected group rather than on the size of the full system. These groups can be determined during system operation from the communication patterns between nodes. The number of nodes that must roll back after a failure is limited to the size of the affected snapshot group, which reduces the work lost. These changes to snapshot algorithms are necessary because, as the system grows, the cost per node of a snapshot increases and the expected time between failures decreases.
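As an illustration only, the idea of deriving snapshot groups from communication patterns can be sketched as a connected-components computation over observed messages. This is not the algorithm from the thesis. The function names `snapshot_groups` and `rollback_set`, the node labels, and the message log are all hypothetical, and the sketch assumes that any two nodes that exchanged a message belong to the same group.

```python
from collections import defaultdict


def snapshot_groups(nodes, messages):
    """Partition nodes into groups connected by observed messages.

    Assumption: a message between two nodes places them in the same
    snapshot group. Uses a simple union-find structure.
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the root, halving the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for sender, receiver in messages:
        root_s, root_r = find(sender), find(receiver)
        if root_s != root_r:
            parent[root_s] = root_r  # merge the two groups

    groups = defaultdict(set)
    for n in nodes:
        groups[find(n)].add(n)
    return list(groups.values())


def rollback_set(groups, failed_node):
    """Return the group containing the failed node.

    Under the stated assumption, only these nodes roll back.
    """
    for group in groups:
        if failed_node in group:
            return group
    raise KeyError(failed_node)


# Hypothetical system: five nodes and three observed messages.
nodes = ["a", "b", "c", "d", "e"]
messages = [("a", "b"), ("b", "c"), ("d", "e")]
groups = snapshot_groups(nodes, messages)
affected = rollback_set(groups, "d")
```

In this toy run a failure of node `d` affects only the two-node group containing `d` and `e`, while `a`, `b`, and `c` keep their progress. Both the snapshot cost and the rollback scope follow the group size rather than the system size.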
Description
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2005. Includes bibliographical references (p. 67-68).
Date issued
2005
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.