Automatic Software Upgrades for Distributed Systems
Author(s)
Ajmani, Sameer
Download: MIT-CSAIL-TR-2005-078.ps (129.1Mb)
Other Contributors
Programming Methodology
Abstract
Upgrading the software of long-lived, highly-available distributed systems is difficult. It is not possible to upgrade all the nodes in a system at once, since some nodes may be unavailable and halting the system for an upgrade is unacceptable. Instead, upgrades may happen gradually, and there may be long periods of time when different nodes are running different software versions and need to communicate using incompatible protocols. We present a methodology and infrastructure that address these challenges and make it possible to upgrade distributed systems automatically while limiting service disruption.
Our methodology defines how to enable nodes to interoperate across versions, how to preserve the state of a system across upgrades, and how to schedule an upgrade so as to limit service disruption. The approach is modular: defining an upgrade requires understanding only the new software and the version it replaces.
The upgrade infrastructure is a generic platform for distributing and installing software while enabling nodes to interoperate across versions. The infrastructure requires no access to the system source code and is transparent: node software is unaware that different versions even exist. We have implemented a prototype of the infrastructure called Upstart that intercepts socket communication using a dynamically-linked C++ library. Experiments show that Upstart has low overhead and works well for both local-area and Internet systems.
Date issued
2005-11-30
Other identifiers
MIT-CSAIL-TR-2005-078
MIT-LCS-TR-1012
Series/Report no.
Massachusetts Institute of Technology Computer Science and Artificial Intelligence Laboratory