HASLab

Group communication.

Leitão J, Carvalho N, Pereira JO, Oliveira R, Rodrigues L. 2010. On Adding Structure to Unstructured Overlay Networks. Handbook of Peer-to-Peer Networking.

Unstructured peer-to-peer overlay networks are very resilient to churn and topology changes, while requiring little maintenance cost. Therefore, they are a suitable infrastructure on which to build highly scalable large-scale services in dynamic networks. Typically, the overlay topology is defined by a peer sampling service that aims at maintaining, in each process, a random partial view of peers in the system. The resulting random unstructured topology is suboptimal when a specific performance metric is considered. On the other hand, structured approaches (for instance, a spanning tree) may optimize a given target performance metric but are highly fragile. In fact, the cost of maintaining structures with strong constraints may easily become prohibitive in highly dynamic networks. This chapter discusses different techniques that aim at combining the advantages of unstructured and structured networks. Namely, we focus on two distinct approaches, one based on optimizing the overlay and another based on optimizing the gossip mechanism itself.

Leitão J, Pereira JO, Rodrigues L. 2010. Gossip-based broadcast. Handbook of Peer-to-Peer Networking.

Gossip, or epidemic, protocols have emerged as a powerful strategy to implement highly scalable and resilient reliable broadcast primitives on large scale peer-to-peer networks. Epidemic protocols are scalable because they distribute the load among all nodes in the system and resilient because they have an intrinsic level of redundancy that masks node and network failures. This chapter provides an introduction to gossip-based broadcast on large-scale unstructured peer-to-peer overlay networks: it surveys the main results in the field, discusses techniques to build and maintain the overlays that support efficient dissemination strategies, and provides an in-depth discussion and experimental evaluation of two concrete protocols, named HyParView and Plumtree.
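The load-spreading and redundancy described in this abstract can be illustrated with a toy simulation. The following is only a minimal sketch, not any protocol from the chapter: the function name `gossip_broadcast`, the `fanout` parameter, and the assumption that every node can sample peers uniformly from the full membership (rather than from a partial view) are all illustrative choices.

```python
import random


def gossip_broadcast(n, fanout, seed=0):
    """Simulate push gossip from node 0 and return the set of nodes reached.

    Assumption: each node picks `fanout` targets uniformly at random from
    all other nodes, which stands in for a real peer sampling service.
    """
    rng = random.Random(seed)
    delivered = {0}
    frontier = [0]
    while frontier:
        next_frontier = []
        for node in frontier:
            peers = [p for p in range(n) if p != node]
            # Every infected node forwards once, so load is shared evenly.
            for target in rng.sample(peers, min(fanout, len(peers))):
                if target not in delivered:
                    delivered.add(target)
                    next_frontier.append(target)
        frontier = next_frontier
    return delivered


if __name__ == "__main__":
    reached = gossip_broadcast(n=1000, fanout=4)
    print(f"reached {len(reached)} of 1000 nodes")
```

Because targets are chosen at random, some nodes receive the message several times and, with a small fanout, a few may receive none; that redundancy is what masks failures, at the cost of extra messages.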

Pereira JO, Rodrigues L, Oliveira R. 2000. Semantically reliable multicast protocols. 9th IEEE Symposium on Reliable Distributed Systems - SRDS. :60-69.

Reliable multicast protocols can strongly simplify the design of distributed applications. However, it is hard to sustain a high multicast throughput when groups are large and heterogeneous. In an attempt to overcome this limitation, previous work has focused on weakening reliability properties. The authors introduce a novel reliability model that exploits semantic knowledge to decide under which specific conditions messages can be purged without compromising application correctness. This model is based on the concept of message obsolescence: a message becomes obsolete when its content or purpose is overwritten by a subsequent message. We show that message obsolescence can be expressed in a generic way and can be used to configure the system to achieve higher multicast throughput.

Pereira JO, Rodrigues L, Oliveira R. 2003. Semantically reliable multicast: Definition, implementation, and performance evaluation. IEEE Transactions on Computers. 52:150-165.

Semantic reliability is a novel correctness criterion for multicast protocols based on the concept of message obsolescence: A message becomes obsolete when its content or purpose is superseded by a subsequent message. By exploiting obsolescence, a reliable multicast protocol may drop irrelevant messages to find additional buffer space for new messages. This makes the multicast protocol more resilient to transient performance perturbations of group members, thus improving throughput stability. This paper describes our experience in developing a suite of semantically reliable protocols. It summarizes the motivation, definition, and algorithmic issues and presents performance figures obtained with a running implementation. The data obtained experimentally is compared with analytic and simulation models. This comparison allows us to confirm the validity of these models and the usefulness of the approach. Finally, the paper reports the application of our prototype to distributed multiplayer games.
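The idea of dropping obsolete messages to free buffer space can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the `Message` fields, the `ObsolescenceBuffer` class, and the rule that a message is made obsolete by any later message carrying the same `key` are assumptions chosen for the example (a position update in a game is one case where such a rule is plausible).

```python
from collections import namedtuple

# Hypothetical message shape: `key` names the item a message updates.
Message = namedtuple("Message", ["seq", "key", "payload"])


class ObsolescenceBuffer:
    """Bounded buffer that purges obsolete messages before rejecting new ones."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.messages = []

    def _purge_obsolete(self, incoming):
        # Assumed obsolescence relation: a later message with the same key
        # supersedes every earlier one for that key.
        latest = {}
        for m in self.messages + [incoming]:
            latest[m.key] = max(latest.get(m.key, m.seq), m.seq)
        self.messages = [m for m in self.messages if m.seq == latest[m.key]]

    def offer(self, msg):
        """Enqueue msg, purging only when the buffer is full; False if no room."""
        if len(self.messages) >= self.capacity:
            self._purge_obsolete(msg)
        if len(self.messages) >= self.capacity:
            return False
        self.messages.append(msg)
        return True


if __name__ == "__main__":
    buf = ObsolescenceBuffer(capacity=2)
    buf.offer(Message(1, "pos:a", "x=1"))
    buf.offer(Message(2, "pos:b", "x=5"))
    buf.offer(Message(3, "pos:a", "x=2"))  # supersedes seq 1
    print([m.seq for m in buf.messages])
```

Purging only when the buffer is full keeps the common path cheap; a slow receiver then loses only messages whose content has already been superseded.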

Rodrigues L, Pereira JO, Handurukande S, Guerraoui R, Kermarrec AM. 2003. Adaptive gossip-based broadcast. DSN - International Conference on Dependable Systems and Networks. :47-56.

This paper presents a novel adaptation mechanism that allows every node of a gossip-based broadcast algorithm to adjust the rate of message emission 1) to the amount of resources available to the nodes within the same broadcast group and 2) to the global level of congestion in the system. The adaptation mechanism can be applied to all gossip-based broadcast algorithms we know of and, without decreasing reliability, makes their use more realistic in practical situations where nodes have limited resources whose quantity changes dynamically with time.

Carvalho N, Pereira JO, Rodrigues L. 2006. Towards a generic group communication service. On The Move To Meaningful Internet Systems (OTM) International Symposium on Distributed Objects, Middleware, and Applications (DOA). 4276:1485-1502.

View synchronous group communication is a mature technology that greatly eases the development of reliable distributed applications by enforcing precise message delivery semantics, especially in the face of faults. It is therefore found at the core of multiple widely deployed and used middleware products. Although the implementation of a group communication system is a complex task, application developers may benefit from the fact that multiple group communication toolkits are currently available and supported. Unfortunately, each communication toolkit has a different interface that differs from every other interface in subtle syntactic and semantic aspects. This hinders the design, implementation and maintenance of applications using group communication and forces developers to commit beforehand to a single toolkit, thus imposing a significant hurdle to portability. In this paper we propose jGCS, a generic group communication service for Java, that specifies an interface as well as minimum semantics that allow application portability. This interface accommodates existing group communication services, enabling implementation independence. Furthermore, it provides support for the latest state-of-the-art mechanisms that have been proposed to improve the performance of group-based applications. To support our claims, we present and experimentally evaluate implementations of jGCS for several major group communication systems, namely, Appia, Spread/FlushSpread and JGroups, and describe the port of a large middleware product to jGCS.

Pereira JO, Oliveira R, Rodrigues L. 2006. Efficient epidemic multicast in heterogeneous networks. On The Move To Meaningful Internet Systems - OTM - International Workshop on Reliability in Decentralized Distributed Systems. 4278:1520-1529.

The scalability and resilience of epidemic multicast, also called probabilistic or gossip-based multicast, rests on its symmetry: Each participant node contributes the same share of bandwidth thus spreading the load and allowing for redundancy. On the other hand, the symmetry of gossiping means that it does not avoid nodes or links with less capacity. Unfortunately, one cannot naively avoid such symmetry without also endangering scalability and resilience. In this paper we point out how to break out of this dilemma, by lazily deferring message transmission according to a configurable policy. An experimental proof-of-concept illustrates the approach.

Leitão J, Pereira JO, Rodrigues L. 2007. HyParView: A membership protocol for reliable gossip-based broadcast. The 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks - DSN. :419-428.

Gossip, or epidemic, protocols have emerged as a powerful strategy to implement highly scalable and resilient reliable broadcast primitives. For scalability reasons, each participant in a gossip protocol maintains a partial view of the system. The reliability of the gossip protocol depends upon some critical properties of these views, such as degree distribution and clustering coefficient. Several algorithms have been proposed to maintain partial views for gossip protocols. In this paper, we show that under a high number of faults, these algorithms take a long time to restore the desirable view properties. To address this problem, we present HyParView, a new membership protocol to support gossip-based broadcast that ensures high levels of reliability even in the presence of high rates of node failure. The HyParView protocol is based on a novel approach that relies on the use of two distinct partial views, which are maintained with different goals by different strategies.

Leitão J, Pereira JO, Rodrigues L. 2007. Epidemic broadcast trees. 26th IEEE Symposium on Reliable Distributed Systems - SRDS. :301-310.

There is an inherent trade-off between epidemic and deterministic tree-based broadcast primitives. Tree-based approaches have a small message complexity in steady-state but are very fragile in the presence of faults. Gossip, or epidemic, protocols have a higher message complexity but also offer much higher resilience. This paper proposes an integrated broadcast scheme that combines both approaches. We use a low cost scheme to build and maintain broadcast trees embedded on a gossip-based overlay. The protocol sends the message payload preferably via tree branches but uses the remaining links of the gossip overlay for fast recovery and expedited tree healing. Experimental evaluation presented in the paper shows that our new strategy has a low overhead and is able to support a large number of faults while maintaining high reliability.
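One way to picture a tree embedded in a gossip overlay is to let redundant payload deliveries demote links. The sketch below is loosely inspired by the abstract and is not the protocol from the paper: the `Node` class, the `eager`/`lazy` link sets, the synchronous FIFO delivery, and the absence of failures are simplifying assumptions, and the recovery path over the remaining links is omitted entirely.

```python
from collections import deque


class Node:
    def __init__(self, name, neighbours):
        self.name = name
        self.eager = set(neighbours)  # links that carry full payloads
        self.lazy = set()             # links kept only as a fallback
        self.delivered = set()


def broadcast(nodes, source, msg_id):
    """Disseminate msg_id from source; return the number of payloads sent."""
    payloads = 0
    queue = deque()
    nodes[source].delivered.add(msg_id)
    for peer in sorted(nodes[source].eager):
        queue.append((source, peer))
    while queue:
        sender, receiver = queue.popleft()
        payloads += 1
        node = nodes[receiver]
        if msg_id in node.delivered:
            # Redundant payload: demote this link to lazy on both ends.
            node.eager.discard(sender)
            node.lazy.add(sender)
            nodes[sender].eager.discard(receiver)
            nodes[sender].lazy.add(receiver)
            continue
        node.delivered.add(msg_id)
        for peer in sorted(node.eager - {sender}):
            queue.append((receiver, peer))
    return payloads


if __name__ == "__main__":
    names = range(5)
    # Assumed overlay: a complete graph on five nodes.
    nodes = {i: Node(i, [j for j in names if j != i]) for i in names}
    first = broadcast(nodes, 0, "m1")
    second = broadcast(nodes, 0, "m2")
    print(first, second)
```

In this failure-free setting the first broadcast floods the overlay and prunes every redundant link, so the second broadcast travels only along the surviving eager links, which form a spanning tree. The lazy links are what a fuller design would use to detect and repair a broken branch.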

Leitão J, Marques J, Pereira JO, Rodrigues L. 2009. X-BOT: A Protocol for Resilient Optimization of Unstructured Overlays. IEEE International Symposium On Reliable Distributed Systems - SRDS.

Gossip, or epidemic, protocols have emerged as a highly scalable and resilient approach to implement several application level services such as reliable multicast, data aggregation, and publish-subscribe, among others. All these protocols organize nodes in an unstructured random overlay network. In many cases, it is interesting to bias the random overlay in order to optimize some efficiency criteria, for instance, to reduce the stretch of the overlay routing. In this paper we propose X-BOT, a new protocol that makes it possible to bias the topology of an unstructured gossip overlay network. X-BOT is completely decentralized and, unlike previous approaches, preserves several key properties of the original (non-biased) overlay (most notably, the node degree and consequently, the overlay connectivity). Experimental results show that X-BOT can generate more efficient overlays than previous approaches.

José Orlando Pereira

Distributed Systems

groupcomm