Recent Publications

Bahsoun JP, Guerraoui R, Shoker A.  2015.  Making BFT Protocols Really Adaptive. In the Proceedings of the 29th IEEE International Parallel & Distributed Processing Symposium.

Many state-machine Byzantine Fault Tolerant (BFT) protocols have been introduced so far, each addressing a different subset of conditions and use-cases. However, if the underlying conditions of a service span different subsets, no single protocol is likely to be the best fit. This yields robustness and performance issues, which may be even worse in services that exhibit fluctuating conditions and workloads.
In this paper, we reconcile existing state-machine BFT protocols in a single adaptive BFT system, called ADAPT, aiming to cover a larger set of conditions and use-cases, ideally the union of those addressed by the individual protocols. At any time, the protocol currently running in ADAPT can be aborted and replaced by another protocol in response to a change (an event) in the underlying system conditions. The protocol to launch is chosen by an “evaluation process” that takes into consideration both the characteristics of each protocol and its performance. This is achieved by applying mathematical formulas that match the profiles of the protocols to given user (e.g., service owner) preferences. ADAPT can assess the profiles of protocols (e.g., throughput) at run-time using Machine Learning prediction mechanisms to obtain accurate evaluations. We compare ADAPT with well-known BFT protocols, showing that it outperforms them as system conditions change and under dynamic workloads.
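To make the evaluation step above more concrete, the sketch below shows one way a weighted matching of protocol profiles against user preferences could look in Python. The protocol names, profile fields, and weights are hypothetical illustrations, not the formulas or prediction mechanisms used in the paper.

```python
# Illustrative sketch of ADAPT-style protocol selection: each protocol has a
# profile (characteristics plus predicted performance) and the user supplies
# preference weights; the highest-scoring protocol is launched.
# Names, fields, and weights are hypothetical, not taken from the paper.

def score(profile: dict, preferences: dict) -> float:
    """Weighted sum of profile metrics; higher is better."""
    return sum(weight * profile.get(metric, 0.0)
               for metric, weight in preferences.items())

def choose_protocol(profiles: dict, preferences: dict) -> str:
    """Return the name of the best-matching protocol for the current conditions."""
    return max(profiles, key=lambda name: score(profiles[name], preferences))

# Example: normalised predicted throughput and tolerance to faulty clients.
profiles = {
    "PBFT":    {"throughput": 0.6, "client_fault_tolerance": 0.9},
    "Zyzzyva": {"throughput": 0.9, "client_fault_tolerance": 0.4},
}
preferences = {"throughput": 0.7, "client_fault_tolerance": 0.3}

print(choose_protocol(profiles, preferences))  # -> "Zyzzyva" under these weights
```

In ADAPT the profile values themselves would come from run-time Machine Learning predictions rather than the fixed numbers used here.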

Almeida PS, Shoker A, Moreno CB.  2015.  Efficient State-based CRDTs by Delta-Mutation. In the Proceedings of the International Conference on NETworked sYStems. 9466.

CRDTs are distributed data types that make eventual consistency of a distributed object possible and non-ad-hoc. Specifically, state-based CRDTs ensure convergence by disseminating the entire state, which may be large, and merging it with other replicas; whereas operation-based CRDTs disseminate operations (i.e., small states) assuming an exactly-once reliable dissemination layer. We introduce Delta State Conflict-Free Replicated Datatypes (δ-CRDT) that can achieve the best of both worlds: small messages with an incremental nature, as in operation-based CRDTs, disseminated over unreliable communication channels, as in traditional state-based CRDTs. This is achieved by defining δ-mutators to return a delta-state, typically much smaller than the full state, which is joined to both the local and remote states. We introduce the δ-CRDT framework, and we explain it by establishing a correspondence to current state-based CRDTs. In addition, we present an anti-entropy algorithm that ensures causal consistency, and we introduce two δ-CRDT specifications of well-known replicated datatypes.
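The core idea, a mutator that returns a small delta which is joined into both the local and remote states, can be pictured with a grow-only counter. The sketch below is a minimal Python illustration of that idea, not the paper's formal specification; because the join is idempotent and commutative, duplicated or reordered deltas are harmless, which is what allows dissemination over unreliable channels.

```python
# Minimal sketch of a state-based grow-only counter (GCounter) with a
# delta-mutator, in the spirit of δ-CRDTs: inc_delta() returns a small delta
# that is joined locally and shipped to other replicas instead of the full state.

def join(a: dict, b: dict) -> dict:
    """Join of two GCounter states: pointwise maximum per replica id."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in a.keys() | b.keys()}

def inc_delta(state: dict, replica: str) -> dict:
    """Delta-mutator: returns only the entry that changed."""
    return {replica: state.get(replica, 0) + 1}

def value(state: dict) -> int:
    return sum(state.values())

# Replica "a" increments twice; only the deltas are disseminated.
state_a: dict = {}
d1 = inc_delta(state_a, "a"); state_a = join(state_a, d1)
d2 = inc_delta(state_a, "a"); state_a = join(state_a, d2)

# Replica "b" applies the (possibly reordered, duplicated) deltas it receives.
state_b: dict = {}
state_b = join(state_b, d2)
state_b = join(state_b, d2)   # duplicate delivery is harmless
state_b = join(state_b, d1)   # reordering is harmless
assert value(state_b) == value(state_a) == 2
```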

Almeida PS, Moreno CB, Fonte V, Gonçalves R.  2015.  Concise Server-Wide Causality Management for Eventually Consistent Data Stores. DAIS - IFIP International Conference on Distributed Applications and Interoperable Systems.

Large scale distributed data stores rely on optimistic replication to scale and remain highly available in the face of network partitions. Managing data without coordination results in eventually consistent data stores that allow for concurrent data updates. These systems often use anti-entropy mechanisms (like Merkle Trees) to detect and repair divergent data versions across nodes. However, in practice hash-based data structures are too expensive for large amounts of data and create too many false conflicts. Another aspect of eventual consistency is detecting write conflicts. Logical clocks are often used to track data causality, necessary to detect causally concurrent writes on the same key. However, there is a non-negligible per-key metadata overhead, which also keeps growing over time, proportionally to the node churn rate. Another challenge is deleting keys while respecting causality: while the values can be deleted, per-key metadata cannot be permanently removed without coordination. We introduce a new causality management framework for eventually consistent data stores that leverages node logical clocks (Bitmapped Version Vectors) and a new key logical clock (Dotted Causal Container) to provide advantages on multiple fronts: 1) a new efficient and lightweight anti-entropy mechanism; 2) greatly reduced per-key causality metadata size; 3) accurate key deletes without permanent metadata.
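As an illustration of how a node-wide logical clock can keep per-peer causality information compact, the sketch below represents the set of events seen from one peer as a base counter plus a bitmap of events beyond it. This is a simplified reading of the bitmapped representation mentioned above; the class name, fields, and normalisation step are illustrative assumptions and do not reproduce the paper's Bitmapped Version Vector or Dotted Causal Container definitions.

```python
# Illustrative per-peer dot representation: a base counter (all events 1..base
# seen) plus a bitmap for events seen beyond the base. Simplified sketch only.

class DotSet:
    def __init__(self) -> None:
        self.base = 0      # events 1..base have all been seen
        self.bitmap = 0    # bit i set => event base+1+i has been seen

    def add(self, counter: int) -> None:
        """Record that event number `counter` from this peer was seen."""
        if counter <= self.base:
            return                            # already covered by the base
        self.bitmap |= 1 << (counter - self.base - 1)
        while self.bitmap & 1:                # normalise: fold contiguous bits
            self.base += 1
            self.bitmap >>= 1

    def contains(self, counter: int) -> bool:
        return counter <= self.base or bool(
            (self.bitmap >> (counter - self.base - 1)) & 1)

ds = DotSet()
for c in (1, 2, 5):      # events 3 and 4 not yet received
    ds.add(c)
assert ds.base == 2 and ds.contains(5) and not ds.contains(3)
ds.add(3); ds.add(4)     # gap filled, base advances past 5
assert ds.base == 5 and ds.bitmap == 0
```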

Abade T, Campos JC, Moreira R, Silva CC, Silva JL.  2015.  Immersiveness of Ubiquitous Computing Environments Prototypes: A case study. Lecture Notes in Computer Science. 9189.

The development of ubiquitous computing (ubicomp) environments raises several challenges in terms of their evaluation. Ubicomp virtual reality prototyping tools enable users to experience the system to be developed and are of great help in facing those challenges, as they support developers in assessing the consequences of a design decision in the early phases of development. Given the situated nature of ubicomp environments, a particular issue to consider is the level of realism provided by the prototypes. This work presents a case study where two ubicomp prototypes, featuring different levels of immersion (desktop-based versus CAVE-based), were developed and compared. The goal was to determine the cost/benefit relation of the two solutions, which one provided better user experience results, and whether or not simpler solutions provide the same user experience results as more elaborate ones.

Neves DT, Gonçalves RC.  2015.  On the Synthesis and Reconfiguration of Pipelines. ARCS '15: Proceedings of the 28th International Conference on Architecture of Computing Systems.

In recent years we have observed great advances in parallel platforms and the exponential growth of datasets in several domains. Undoubtedly, parallel programming is crucial to harness the performance potential of such platforms and to cope with very large datasets. However, quite often one has to deal with legacy software systems that may use third-party frameworks, libraries, or tools, and that may be executed on different multicore architectures. Managing different software configurations and adapting them to different needs is an arduous task, particularly when it has to be carried out by scientists or when dealing with irregular applications. In this paper, we present an approach to abstract legacy software systems using workflow modeling tools. We show how a basic pipeline is modeled and adapted, using model transformations, to different application scenarios, either to obtain better performance or more reliable results. Moreover, we explain how the system we provide to support the approach is easily extensible to accommodate new tools and algorithms. We show how a pipeline of three irregular applications, all from phylogenetics, is mapped to parallel implementations. Our studies show that the derived programs neither downgrade performance nor sacrifice scalability, even in the presence of a set of asymmetric tasks and when using third-party tools.

Jorge T, Maia F, Matos M, Pereira JO, Oliveira R.  2015.  Practical Evaluation of Large Scale Applications. Proceedings of the 15th IFIP International Conference on Distributed Applications and Interoperable Systems (DAIS). :124–137.

Designing and implementing distributed systems is a hard endeavor, both at an abstract level when designing the system, and at a concrete level when implementing, debugging and evaluating it. This stems not only from the inherent complexity of writing and reasoning about distributed software, but also from the lack of tools for testing and evaluating it under realistic conditions. Moreover, the gap between the protocols' specifications found in research papers and their implementations in real code is huge, leading to inconsistencies that often result in the implementation no longer following the specification. As an example, the specification of the popular Chord DHT comprises a few dozen lines, while its Java implementation, OpenChord, is close to twenty thousand lines, excluding libraries. This makes it hard and error-prone to change the implementation to reflect changes in the specification, regardless of programmers' skill. Besides, critical behavior due to the unpredictable interleaving of operations and network uncertainty can only be observed in a realistic setting, limiting the usefulness of simulation tools. We believe that being able to write an algorithm implementation very close to its specification, and evaluating it in a real environment, is a big step in the direction of building better distributed systems. Our approach leverages the MINHA platform to offer a set of built-in primitives that allow one to program very close to pseudo-code. This high-level implementation can interact with off-the-shelf existing middleware and can be gradually replaced by a production-ready Java implementation. In this paper, we present the system design and showcase it using a well-known algorithm from the literature.

Belo O, Faria J, Ribeiro AN, Oliveira B, Santos V.  2015.  Modeling E-Government Processes Using Yawl: Half-Way Towards Their Effective Real Implementation. Proceedings of the 8th International Conference on Theory and Practice of Electronic Governance - ICEGOV. :288–291.

Today E-Government institutions face many challenges related to the quality and effectiveness of the services they provide. In most cases, their users are more demanding, imposing new ways of acting and dealing with their needs, often requesting expeditious and effective attendance. Independently of their nature, we believe that such pertinent characteristics begin to be sustained as soon as we start to study and model E-Government processes. Modeling and simulation are useful tools for assuring the availability of E-Government services in many respects, contributing significantly to improving process implementation, from inception to final software application roll-out and maintenance. In this paper we study the use of YAWL – a workflow language – for modeling E-Government processes, showing through a real-world application case how it can help us in the construction of effective models that may be used as a basis for understanding and building the corresponding software applications.

Daniels W, Proença J, Clarke D, Joosen W, Hughes D.  2015.  Refraction: Low-Cost Management of Reflective Meta-Data in Pervasive Component-Based Applications. Proceedings of the 18th International Symposium on Component-Based Software Engineering - CBSE. :27–36.

This paper proposes the concept of refraction, a principled means to lower the cost of managing reflective meta-data for pervasive systems. While prior work has demonstrated the benefits of reflective component-based middleware for building open and reconfigurable applications, the cost of using remote reflective operations remains high. Refractive components address this problem by selectively augmenting application data flows with their reflective meta-data, which travels at low cost to refractive pools, which serve as loci of inspection and control for the distributed application. Additionally, reactive policies are introduced, providing a mechanism to trigger reconfigurations based on incoming reflective meta-data. We evaluate the performance of refraction in a case study of automatic configuration repair for a real-world pervasive application. We show that refraction reduces network overhead in comparison to the direct use of reflective operations while not increasing development overhead. To enable further experimentation with the concept of refraction, we provide RxCom, an open-source refractive component model and supporting runtime environment.
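A heavily simplified sketch of the flow described above: messages carry a small piece of reflective meta-data to a pool, where reactive policies inspect it and may trigger reconfigurations. All names, the message format, and the policy condition below are hypothetical illustrations; this is not the RxCom API.

```python
# Illustrative sketch of the refraction idea: application messages piggyback
# reflective meta-data to a central "pool", where reactive policies inspect it.

from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Message:
    payload: bytes
    meta: dict = field(default_factory=dict)   # piggybacked reflective meta-data

@dataclass
class RefractivePool:
    policies: list = field(default_factory=list)

    def receive(self, msg: Message) -> None:
        for policy in self.policies:
            policy(msg.meta)                   # reactive policies see meta-data only

def repair_policy(meta: dict) -> None:
    # Hypothetical policy: if a component reports a dangling binding, ask the
    # node to rewire it (here just printed; a real system would reconfigure).
    if meta.get("binding_state") == "dangling":
        print(f"reconfigure node {meta['node']}: rebind {meta['component']}")

pool = RefractivePool(policies=[repair_policy])
pool.receive(Message(b"sensor reading",
                     meta={"node": "n3", "component": "Filter",
                           "binding_state": "dangling"}))
```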

Ramachandran GS, Daniels W, Proença J, Michiels S, Joosen W, Hughes D, Porter B.  2015.  Hitch Hiker: A Remote Binding Model with Priority Based Data Aggregation for Wireless Sensor Networks. Proceedings of the 18th International Symposium on Component-Based Software Engineering, CBSE. :43–48.

The aggregation of network traffic has been shown to enhance the performance of wireless sensor networks. By reducing the number of packets that are transmitted, energy consumption, collisions and congestion are minimised. However, current data aggregation schemes restrict developers to a specific network structure or cannot handle multi-hop data aggregation. In this paper, we propose Hitch Hiker, a remote component binding model that provides support for multi-hop data aggregation. Hitch Hiker uses component meta-data to discover remote component bindings and to construct a multi-hop overlay network within the free payload space of existing traffic flows. This overlay network provides end-to-end routing of low-priority traffic while using only a small fraction of the energy of standard communication. We have developed a prototype implementation of Hitch Hiker for the LooCI component model. Our evaluation shows that Hitch Hiker consumes minimal resources and that using Hitch Hiker to deliver low-priority traffic reduces energy consumption by up to 15%.
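The piggybacking idea can be pictured as filling the unused payload space of an outgoing packet with queued low-priority chunks. The sketch below assumes a hypothetical frame format, MTU, and queueing discipline; it is an illustration of the concept, not the Hitch Hiker protocol or the LooCI API.

```python
# Illustrative sketch: low-priority data rides in the free payload space of
# frames already being sent for high-priority traffic. A real frame format
# would also delimit the original payload so the receiver can separate it.

from collections import deque

MTU = 64                       # hypothetical radio payload size in bytes

low_priority = deque([b"temp:21", b"humid:40", b"batt:87"])

def build_frame(high_priority: bytes) -> bytes:
    """Send the high-priority payload, padding the rest with hitch-hiking data."""
    frame = bytearray(high_priority)
    while low_priority and len(frame) + 1 + len(low_priority[0]) <= MTU:
        chunk = low_priority.popleft()
        frame += bytes([len(chunk)]) + chunk     # length-prefixed piggybacked chunk
    return bytes(frame)

frame = build_frame(b"ALARM:door-open")
print(len(frame), frame)        # one transmission carries both kinds of traffic
```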

Backes M, Barbosa MB, Fiore D, Reischuk RM.  2015.  ADSNARK: Nearly Practical and Privacy-Preserving Proofs on Authenticated Data. 2015 IEEE Symposium on Security and Privacy, SP 2015, San Jose, CA, USA, May 17-21, 2015. :271–286.

We study the problem of privacy-preserving proofs on authenticated data, where a party receives data from a trusted source and is requested to prove computations over the data to third parties in a correct and private way, i.e., the third party learns no information on the data but is still assured that the claimed proof is valid. Our work particularly focuses on the challenging requirement that the third party should be able to verify the validity with respect to the specific data authenticated by the source — even without having access to that source. This problem is motivated by various scenarios emerging from several application areas such as wearable computing, smart metering, or general business-to-business interactions. Furthermore, these applications also demand that any meaningful solution satisfy additional properties related to usability and scalability. In this paper, we formalize the above three-party model, discuss concrete application scenarios, and then design, build, and evaluate ADSNARK, a nearly practical system for proving arbitrary computations over authenticated data in a privacy-preserving manner. ADSNARK improves significantly over state-of-the-art solutions for this model. For instance, compared to corresponding solutions based on Pinocchio (Oakland'13), ADSNARK achieves up to 25× improvement in proof computation time and a 20× reduction in prover storage space.

Guimarães P, Pereira JO.  2015.  X-Ray: Monitoring and analysis of distributed database queries. Distributed Applications and Interoperable Systems (DAIS, with DisCoTec).

The integration of multiple database technologies, including both SQL and NoSQL, allows using the best tool for each aspect of a complex problem and is increasingly sought in practice. Unfortunately, this makes it difficult for database developers and administrators to obtain a clear view of the resulting composite data processing paths, as they combine operations chosen by different query optimisers, implemented by different software packages, and partitioned across distributed systems.

This work addresses this challenge with the X-Ray framework, which allows monitoring code to be added to a Java-based distributed system by manipulating its bytecode at runtime. The resulting information is collected in a NoSQL database and then processed to visualise data processing paths as required for optimising integrated database systems. This proposal is demonstrated with a distributed query over a federation of Apache Derby database servers, and its performance is evaluated with the standard TPC-C benchmark workload.

Macedo N, Cunha A, Guimarães T.  2015.  Exploring Scenario Exploration. Proceedings of the 18th International Conference on Fundamental Approaches to Software Engineering.

Model finders are very popular for exploring scenarios, helping users validate specifications by navigating through conforming model instances. To be practical, the semantics of such scenario exploration operations should be formally defined and, ideally, controlled by the users, so that they are able to quickly reach interesting scenarios.
This paper explores the landscape of scenario exploration operations, by formalizing them with a relational model finder. Several scenario exploration operations provided by existing tools are formalized, and new ones are proposed, namely to allow the user to easily explore very similar (or different) scenarios, by attaching preferences to model elements. As a proof-of-concept, such operations were implemented in the popular Alloy Analyzer, further increasing its usefulness for (user-guided) scenario exploration.

Lamas J, Silva CC, Silva M, Mouta S, Campos JC, Santos J.  2015.  Measuring end-to-end delay in real-time auralisation systems. Euronoise – 10th European Congress and Exposition on Noise Control Engineering.

One of the major challenges in the development of an immersive system is handling the delay between the tracking of the user's head position and the updated projection of a 3D image or auralised sound, also called end-to-end delay. Excessive end-to-end delay can result in a general decrement of the “feeling of presence”, the occurrence of motion sickness, and poor performance in perception-action tasks. These latencies must be known in order to provide insights on the technological (hardware/software optimization) or psychophysical (recalibration sessions) strategies to deal with them. Our goal was to develop a new measurement method of end-to-end delay that is both precise and easily replicated. We used a Head and Torso Simulator (HATS) as an auditory signal sensor, a fast-response photo-sensor to detect a visual stimulus response from a Motion Capture System, and a voltage input trigger as a real-time event. The HATS was mounted on a turntable, which allowed us to precisely change the 3D sound relative to the head position. When the virtual sound source was at 90º azimuth, the corresponding HRTF would set all the intensity values to zero; at the same time, a trigger would register the real-time event of turning the HATS to 90º azimuth. Furthermore, with the HATS turned 90º to the left, the motion capture marker would fall exactly on the photo-sensor receptor. This method allowed us to precisely measure the delay from tracking to displaying. Moreover, our results show that the method of tracking, its tracking frequency, and the rendering of the sound reflections are the main predictors of end-to-end delay.

Moreno CB, Lima R, Miranda H.  2015.  Adaptive Broadcast Cancellation Query Mechanism for Unstructured Networks. 9th International Conference on Next Generation Mobile Applications, Services and Technologies.

The availability of cheap embedded sensors in mobile devices boosted the emergence of unstructured networks using wireless technologies without centralised administration. However, a simple task such as collecting a temperature reading needs a discovery service to find a thermometer. Usually, resource discovery relies on flooding mechanisms, wasting energy and compromising system availability. On the other hand, energy-efficient solutions based on broadcast cancellation mechanisms have a significant impact on latency. The paper proposes ABC (Adaptive Broadcast Cancellation), a new algorithm that uses knowledge acquired in the past to accelerate queries towards the resource. Each node listens to its neighbours and the acquired context is stored in a variation of Bloom filters.
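As a rough illustration of how overheard context can steer a query, the sketch below keeps, per neighbour, a record of resources heard from it (a plain set stands in for the Bloom-filter variant mentioned above) and prefers a neighbour believed to know the resource before falling back to broadcast. The forwarding rule and data structures are assumptions for illustration, not the ABC algorithm's actual cancellation and timing logic.

```python
# Illustrative sketch of using overheard context to direct queries instead of
# blindly flooding. A plain set per neighbour stands in for the Bloom-filter
# variant; the forwarding rule is an assumption for this example.

overheard: dict = {
    "n1": {"thermometer"},
    "n2": {"camera", "gps"},
}

def record_overheard(neighbour: str, resource: str) -> None:
    overheard.setdefault(neighbour, set()).add(resource)

def next_hop(resource: str):
    """Prefer a neighbour believed to know the resource; None => fall back to broadcast."""
    for neighbour, resources in overheard.items():
        if resource in resources:
            return neighbour
    return None

record_overheard("n3", "thermometer")
print(next_hop("thermometer"))   # "n1" (or "n3"): forward towards the resource
print(next_hop("microphone"))    # None: fall back to a delayed, cancellable broadcast
```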

Lima R, Moreno CB, Miranda H.  2015.  FBL - Filtro Bloom Linear. Infórum 2015.

Data structures that store information probabilistically (in particular, Bloom filters) are characterised by allowing the trade-off between storage-space efficiency and answer precision to be tuned. This possibility has motivated their use in adverse scenarios, for example in sensor networks, where devices have limited resources (memory, CPU, energy). This article presents a Linear Bloom Filter (FBL) mechanism, which associates with each element a probability quantised in the real interval [0,1], overcoming the limitations imposed by the binary nature of Bloom filters. The results show that an FBL can be parameterised so as to keep the error acceptable as a function of the number of bits used in the quantisation and the number of hash functions used for indexing. The article discusses the application of FBLs to dissemination and resource-discovery mechanisms in sensor networks, showing how they help keep the size of the messages exchanged by the sensors constant, independently of the size of the network.
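A minimal sketch of the idea of storing a quantised probability per element, instead of a single presence bit, is shown below. The update rule (keep the maximum per cell) and the query rule (read the minimum over the k cells) are assumptions made for this sketch and may differ from the FBL's actual rules.

```python
# Minimal sketch of a "linear" Bloom filter: each element maps, via k hash
# functions, to cells holding a q-bit quantised probability in [0, 1].
# Update/query rules here are assumptions for illustration only.

import hashlib

M = 1024        # number of cells
K = 3           # number of hash functions
Q_BITS = 4      # quantisation: levels 0..15 map to probabilities 0..1

cells = [0] * M

def _positions(element: str) -> list:
    return [int(hashlib.sha256(f"{i}:{element}".encode()).hexdigest(), 16) % M
            for i in range(K)]

def insert(element: str, probability: float) -> None:
    level = round(probability * (2 ** Q_BITS - 1))
    for pos in _positions(element):
        cells[pos] = max(cells[pos], level)

def query(element: str) -> float:
    level = min(cells[pos] for pos in _positions(element))
    return level / (2 ** Q_BITS - 1)

insert("sensor-42", 0.8)
print(round(query("sensor-42"), 2))   # ~0.8 (up to quantisation error)
print(query("sensor-99"))             # 0.0 unless all k cells collide
```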