Census: Location-Aware Membership Management for Large-Scale Distributed Systems
Author(s): Cowling, James Alexander; Ports, Dan R. K.; Liskov, Barbara H.; Popa, Raluca Ada; Gaikwad, Abhijeet
We present Census, a platform for building large-scale distributed applications. Census provides a membership service and a multicast mechanism. The membership service provides every node with a consistent view of the system membership, which may be global or partitioned into location-based regions. Census distributes membership updates with low overhead, propagates changes promptly, and is resilient to both crashes and Byzantine failures. We believe that Census is the first system to provide a consistent membership abstraction at very large scale, greatly simplifying the design of applications built atop large deployments such as multi-site data centers. Census builds on a novel multicast mechanism that is closely integrated with the membership service. It organizes nodes into a reliable overlay composed of multiple distribution trees, using network coordinates to minimize latency. Unlike other multicast systems, it avoids the cost of using distributed algorithms to construct and maintain trees. Instead, each node independently produces the same trees from the consistent membership view. Census uses this multicast mechanism to distribute membership updates, along with application-provided messages. We evaluate the platform under simulation and on a real-world deployment on PlanetLab. We find that it imposes minimal bandwidth overhead, is able to react quickly to node failures and changes in the system membership, and can scale to substantial size.
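The idea that each node can independently produce the same trees from a consistent membership view can be illustrated with a minimal sketch. This is not the Census algorithm; it only shows that a deterministic function of the shared view yields identical trees everywhere without any coordination protocol. The function name `build_tree`, the two-dimensional coordinates, the hash-based root choice, and the `fanout` parameter are all assumptions made for illustration.

```python
import hashlib
import math


def build_tree(view, tree_index=0, fanout=2):
    """Build one distribution tree from a membership view.

    view: dict mapping node id (str) -> (x, y) network coordinates.
    Returns a dict mapping each node id to its parent id (root -> None).
    Hypothetical sketch: every step depends only on the view contents,
    so any node holding the same view computes the same tree.
    """
    # Pick the root deterministically; varying tree_index gives
    # different roots for different trees.
    def rank(node_id):
        return hashlib.sha256(f"{tree_index}:{node_id}".encode()).hexdigest()

    root = min(view, key=rank)
    parent = {root: None}
    child_count = {root: 0}

    # Place the remaining nodes closest-to-root first; ties broken by id.
    rest = sorted(
        (n for n in view if n != root),
        key=lambda n: (math.dist(view[n], view[root]), n),
    )
    for node in rest:
        # Attach to the nearest already-placed node with spare capacity,
        # using coordinates as a stand-in for network latency.
        candidates = [p for p in parent if child_count[p] < fanout]
        best = min(candidates, key=lambda p: (math.dist(view[node], view[p]), p))
        parent[node] = best
        child_count[best] += 1
        child_count[node] = 0
    return parent
```

Two nodes that hold the same view, even if they store it in a different order, obtain the same parent map, so no tree-construction messages need to be exchanged. The sketch does nothing to handle inconsistent views or faulty nodes, which the paper addresses through the membership service itself.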
Department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Proceedings of the 2009 USENIX Annual Technical Conference
Cowling, J., et al. "Census: Location-Aware Membership Management for Large-Scale Distributed Systems." Proceedings of the 2009 USENIX Annual Technical Conference (San Diego: June 14-19, 2009).
Author's final manuscript