Recent Publications

Lima R, Moreno CB, Miranda H. 2012. Stopping ongoing broadcasts in large MANETs. Proceedings of the 1st European Workshop on AppRoaches to MObiquiTous Resilience - ARMOR. 4:4.

Broadcast is a communication primitive building block widely used in mobile ad-hoc networks (MANETs) for the exchange of control packets and resource location for upper-level services such as routing and management protocols. Flooding is the simplest broadcast algorithm, but it wastes a lot of energy and bandwidth, as it leads to many redundant radio transmissions. An optimization to flooding is to contain it once the resource has been found. In this paper, we compare the impact on latency and power consumption of four competing approaches for flooding containment. The results show that stopping ongoing broadcasts can achieve promising performance increases over other flooding-based techniques when applied in large-scale MANETs with scarce power resources. In addition, results show that both the network topology and the number of copies of the resource affect the performance of each search approach differently.

Moreno CB, Almeida PS, Fonte V, Preguiça N, Gonçalves R. 2012. Brief announcement: efficient causality tracking in distributed storage systems with dotted version vectors. Proceedings of the symposium on Principles of distributed computing - PODC. :335–336.

Version vectors (VV) are used pervasively to track dependencies between replica versions in multi-version distributed storage systems. In these systems, VV tend to have a dual functionality: identify a version and encode causal dependencies. In this paper, we show that by maintaining the identifier of the version separate from the causal past, it is possible to verify causality in constant time (instead of O(n) for VV) and to precisely track causality with information with size bounded by the degree of replication, and not by the number of concurrent writers.
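As a rough illustration of the constant-time check described in this abstract, the sketch below contrasts an entry-by-entry version vector comparison with a check that looks up only one version's identifier in the other version's causal past. The names `vv_leq`, `DottedVersion`, and `happens_before`, and the dictionary-based layout, are hypothetical simplifications for illustration, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


def vv_leq(a: Dict[str, int], b: Dict[str, int]) -> bool:
    """Plain version-vector comparison: inspects every entry of a, so O(n)."""
    return all(count <= b.get(node, 0) for node, count in a.items())


@dataclass(frozen=True)
class DottedVersion:
    # Assumed layout: the identifier of this version, kept separate ...
    dot: Tuple[str, int]   # (replica id, counter)
    # ... from the causal past it was written on top of.
    past: Dict[str, int]   # version vector of the causal past


def happens_before(a: DottedVersion, b: DottedVersion) -> bool:
    """a precedes b if a's identifier is covered by b's causal past.

    This is a single dictionary lookup, independent of the vector size.
    """
    replica, counter = a.dot
    return counter <= b.past.get(replica, 0)


# Two writes in sequence at replica A, then a write at replica B that
# had only seen the first one.
v1 = DottedVersion(dot=("A", 1), past={})
v2 = DottedVersion(dot=("A", 2), past={"A": 1})
v3 = DottedVersion(dot=("B", 1), past={"A": 1})

first_precedes_second = happens_before(v1, v2)
v2_v3_concurrent = not happens_before(v2, v3) and not happens_before(v3, v2)
```

In this sketch `v2` and `v3` are reported as concurrent because neither identifier appears in the other's causal past.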

Moreno CB, Bieniusa A, Zawirski M, Preguiça N, Shapiro M, Balegas V, Duarte S. 2012. Brief announcement: Semantics of eventually consistent replicated sets. Proceedings of the 26th international conference on Distributed Computing - ICDCS. 7611:441–442.

This paper studies the semantics of sets under eventual consistency. The set is a pervasive data type, used either directly or as a component of more complex data types, such as maps or graphs. Eventual consistency of replicated data supports concurrent updates, reduces latency and improves fault tolerance, but forgoes strong consistency (e.g., linearisability). Accordingly, several cloud computing platforms implement eventually-consistent replicated sets [2,4].

Paulo J, Reis P, Pereira JO, Sousa AL. 2012. DEDISbench: A Benchmark for Deduplicated Storage Systems. In proceedings of International Symposium on Secure Virtual Infrastructures - ACSAC. 7566.

Deduplication is widely accepted as an effective technique for eliminating duplicated data in backup and archival systems. Nowadays, deduplication is also becoming appealing in cloud computing, where large-scale virtualized storage infrastructures hold huge data volumes with a significant share of duplicated content. There have thus been several proposals for embedding deduplication in storage appliances and file systems, providing different performance trade-offs while targeting both user and application data, as well as virtual machine images.

It is, however, hard to determine to what extent deduplication is useful in a particular setting and which technique will provide the best results. In fact, existing disk I/O micro-benchmarks are not designed for evaluating deduplication systems: they follow simplistic approaches for generating the data to be written, which lead to unrealistic amounts of duplicates.

We address this with DEDISbench, a novel micro-benchmark for evaluating the disk I/O performance of block-based deduplication systems. As the main contribution, we introduce the generation of a realistic duplicate distribution based on real datasets. Moreover, DEDISbench also allows simulating access hotspots and different load intensities for I/O operations. The usefulness of DEDISbench is shown by comparing it with the Bonnie++ and IOzone open-source disk I/O micro-benchmarks in assessing two open-source deduplication systems, Opendedup and Lessfs, using Ext4 as a baseline. As a secondary contribution, our results lead to novel insights into the performance of these file systems.

Couto R, Ribeiro AN, Campos JC. 2012. A Java based PSM/PIM and pattern inference approach. 35th annual Software Engineering Workshop - SEW.

Due to the constant increase in the number of platforms and languages available to software developers, we are reaching high levels of complexity. New techniques are needed to abstract away this complexity. A solution to this problem was presented by the Object Management Group (OMG) by specifying Model Driven Engineering (MDE). MDE bases its development process on the definition and transformation of models, specifically Computation Independent Models (CIM), Platform Independent Models (PIM) and Platform Specific Models (PSM). The Unified Modeling Language (UML) allows the creation of Platform Specific Models (PSM) and Platform Independent Models (PIM), or even more specific diagrams such as class diagrams. Some years before the appearance of MDE, Erich Gamma et al. catalogued a set of correct means of producing software. These means are called design patterns, and their importance has already been widely recognized. These patterns are useful not only in software development, but also in the software analysis process. Based on Java programs, this document presents the feasibility of transforming source code into MDE models. This code will be transformed into PIM and PSM diagrams, from which design patterns will be inferred. As such, a tool which implements these functionalities will be specified. Implemented as a plugin, it maps the information onto a metamodel to obtain an intermediate information representation. Based on that representation, it provides information abstraction by transforming PSM into PIM models. The inference of design patterns is possible due to the representation of the information contained in the metamodel as Prolog facts, which will be the basis for the design pattern search. Being a reverse engineering process, it allows the process to be started from the source code (and not from models, as predicted by MDE).

Silva CE, Campos JC. 2012. Can GUI implementation markup languages be used for modelling? Human Centred Software Engineering - HCSE. 7623:112-129.

The current diversity of available devices and form factors increases the need for model-based techniques to support adapting applications from one device to another. Most work on user interface modelling is built around declarative markup languages. Markup languages play a relevant role, not only in the modelling of user interfaces, but also in their implementation. However, the languages used by each community (modellers/developers) have, to a great extent, evolved separately. This means that the step from concrete model to final interface becomes needlessly complicated, requiring either compilers or interpreters to bridge this gap. In this paper we compare a modelling language (UsiXML) with several markup implementation languages. We analyse whether it is feasible to use the implementation languages as modelling languages.

Harrison M, Campos JC, Masci P, Thomas N. 2012. Modelling and systematic analysis of interactive systems. Proceedings of the Workshop on Formal Methods in Human-Machine Interaction (Formal H). :25-28.

Two aspects of our research concern the application of formal methods in human-computer interaction. The first aspect is the modelling and analysis of interactive devices with a particular emphasis on the user-device dyad. The second is the modelling and analysis of ubiquitous systems where there are many users, one might say crowds of users. The common thread of both is to articulate and prove properties of interactive systems, to explore interactive behaviour as it influences the user, with a particular emphasis on interaction failure. The goal is to develop systematic techniques that can be packaged in such a way that they can be used effectively by developers. This “whitepaper” will briefly describe the two approaches and their potential value as well as their limitations and development opportunities.

Silva JL, Campos JC, Harrison M. 2012. Formal analysis of Ubiquitous Computing environments through the APEX framework. Symposium on Engineering Interactive Computing Systems - EICS. :131-140.

Ubiquitous computing (ubicomp) systems involve complex interactions between multiple devices and users. This complexity makes it difficult to establish whether: (1) observations made about use are truly representative of all possible interactions; (2) desirable characteristics of the system are true in all possible scenarios. To address these issues, techniques are needed that support an exhaustive analysis of a system's design. This paper demonstrates one such exhaustive analysis technique that supports the early evaluation of alternative designs for ubiquitous computing environments. The technique combines models of behavior within the environment with a virtual world that allows its simulation. The models support checking of properties based on patterns. These patterns help the analyst to generate and verify relevant properties. Where these properties fail, the scenarios suggested by the failure provide an important aid to redesign. The proposed technique uses APEX, a framework for rapid prototyping of ubiquitous environments based on Petri nets. The approach is illustrated through a smart library example. Its benefits and limitations are discussed.

Barbosa MB, Farshim P. 2012. Delegatable Homomorphic Encryption with Applications to Secure Outsourcing of Computation. The Cryptographers' Track at the RSA Conference on Topics in Cryptology - CT-RSA. 7178.

In this work we propose a new cryptographic primitive called Delegatable Homomorphic Encryption (DHE). This allows a Trusted Authority to control/delegate the capability to evaluate circuits over encrypted data to untrusted workers/evaluators by issuing tokens. This primitive can be seen either as a public-key counterpart to Verifiable Computation, where input generation and output verification are performed by different entities, or as a generalisation of Fully Homomorphic Encryption enabling control over computations on encrypted data.

Our primitive comes with a series of extra features as follows: 1) there is a one-time setup procedure for all circuits; 2) senders do not need to be aware of the functions which will be evaluated on the encrypted data, nor do they need to register keys; 3) tokens are independent of senders and receiver; and 4) receivers are able to verify the correctness of computation given short auxiliary information on the input data and the function, independently of the complexity of the computed circuit.

We give a modular construction of such a DHE scheme from three components: Fully Homomorphic Encryption (FHE), Functional Encryption (FE), and a (customised) MAC. As a stepping stone, we first define Verifiable Functional Encryption (VFE), and then show how one can build a secure DHE scheme from a VFE and an FHE scheme. We also show how to build the required VFE from a standard FE together with a MAC scheme. All our results hold in the standard model. Finally, we show how one can build a verifiable computation (VC) scheme generically from a DHE. As a corollary, we get the first VC scheme which remains verifiable even if the attacker can observe verification results.

Brumley B, Barbosa MB, Page D, Vercauteren F. 2012. Practical realisation and elimination of an ECC-related software bug attack. The Cryptographers' Track at the RSA Conference on Topics in Cryptology - CT-RSA. 7178.

We analyse and exploit implementation features in OpenSSL version 0.9.8g which permit an attack against ECDH-based functionality. The attack, although more general, can recover the entire (static) private key from an associated SSL server via 633 adaptive queries when the NIST curve P-256 is used. One can view it as a software-oriented analogue of the bug attack concept due to Biham et al. and, consequently, as the first bug attack to be successfully applied against a real-world system. In addition to the attack and a posteriori countermeasures, we show that formal verification, while rarely used at present, is a viable means of detecting the features which the attack hinges on. Based on the security implications of the attack and the extra justification posed by the possibility of intentionally incorrect implementations in collaborative software development, we conclude that applying and extending the coverage of formal verification to augment existing test strategies for OpenSSL-like software should be deemed a worthwhile, long-term challenge.

Macedo N, Pacheco H, Cunha A. 2012. Relations as executable specifications: taming partiality and non-determinism using invariants. 13th International Conference on Relational and Algebraic Methods in Computer Science - RAMICS. LNCS 7560:146–161.

The calculus of relations has been widely used in program specification and reasoning. It is very tempting to use such specifications as running prototypes of the desired program, but, even considering finite domains, the inherent partiality and non-determinism of relations make this impractical and highly inefficient. To tame partiality we prescribe the usage of invariants, represented by coreflexives, to characterize the exact domains and codomains of relational specifications. Such invariants can be used as pre-condition checkers to avoid runtime errors. Moreover, we show how such invariants can be used to narrow the non-deterministic execution of relational specifications, making it viable for a relevant class of problems. In particular, we show how the proposed techniques can be applied to execute specifications of bidirectional transformations, a domain where partiality and non-determinism are paramount.

Maia F, Matos M, Rivière E, Oliveira R. 2012. Slead: low-memory, steady distributed systems slicing. Proceedings of the 12th international conference on Distributed Applications and Interoperable Systems - IFIP. :1–15.

Slicing a large-scale distributed system is the process of autonomously partitioning its nodes into k groups, named slices. Slicing is associated with an order on node-specific criteria, such as available storage, uptime, or bandwidth. Each slice corresponds to the nodes between two quantiles in a virtual ranking according to the criteria.

For instance, a system can be split into three groups, one with nodes with the lowest uptimes, one with nodes with the highest uptimes, and one in the middle. Such a partitioning can be used by applications to assign different tasks to different groups of nodes, e.g., assigning critical tasks to the more powerful or stable nodes and less critical tasks to other slices.

Assigning a slice to each node in a large-scale distributed system, where no global knowledge of nodes’ criteria exists, is not trivial. Recently, much research effort was dedicated to guaranteeing a fast and correct convergence in comparison to a global sort of the nodes.

Unfortunately, state-of-the-art slicing protocols exhibit flaws that preclude their application in real scenarios, in particular with respect to cost and stability. In this paper, we identify two such flaws: steadiness issues, where nodes at a slice border constantly change slice, and large memory requirements for adequate convergence. We provide practical solutions for both. Our solutions are generic and can be applied to two different state-of-the-art slicing protocols with little effort, while preserving the desirable properties of each. The effectiveness of the proposed solutions is extensively studied in several simulated experiments.
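The slicing described in this abstract can be sketched as follows. This is only the centralized "global sort" reference that distributed protocols are compared against, assuming every node's criterion is known in one place. It is not the Slead protocol itself. The function name `global_slices` and the sample uptimes are hypothetical.

```python
from typing import Dict


def global_slices(criteria: Dict[str, float], k: int) -> Dict[str, int]:
    """Assign each node to one of k slices using a global sort.

    Nodes are ranked by their criterion (e.g. uptime); the node at
    position i of n in the ranking falls into slice floor(i * k / n),
    i.e. the quantile interval that contains its rank.
    """
    ranking = sorted(criteria, key=criteria.get)  # lowest value first
    n = len(ranking)
    return {node: (i * k) // n for i, node in enumerate(ranking)}


# Hypothetical uptimes (in hours) for six nodes, split into three slices:
# lowest uptimes, middle, and highest uptimes.
uptimes = {"a": 5, "b": 50, "c": 500, "d": 1, "e": 20, "f": 900}
slices = global_slices(uptimes, k=3)
```

A distributed slicing protocol has to approximate this assignment without any node ever holding the full `criteria` map, which is where the convergence, steadiness, and memory concerns raised in the abstract come from.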

Arriaga A, Barbosa MB, Farshim P. 2012. On the Joint Security of Signature and Encryption Schemes under Randomness Reuse: Efficiency and Security Amplification. 10th International Conference on Applied Cryptography and Network Security - ACNS. :206-223.

We extend the work of Bellare, Boldyreva and Staddon on the systematic analysis of randomness reuse to construct multi-recipient encryption schemes to the case where randomness is reused across different cryptographic primitives. We find that through the additional binding introduced through randomness reuse, one can actually obtain a security amplification with respect to the standard black-box compositions, and achieve a stronger level of security. We introduce stronger notions of security for encryption and signatures, where challenge messages can depend in a restricted way on the random coins used in encryption, and show that two variants of the KEM/DEM paradigm give rise to encryption schemes that meet this enhanced notion of security. We obtain the most efficient signcryption scheme to date that is secure against insider attackers without random oracles.

Barbosa MB, Pinto A, Gomes B. 2012. Generically extending anonymization algorithms to deal with successive queries. 21st ACM International Conference on Information and Knowledge Management - CIKM. :1362-1371.

This paper addresses the scenario of multi-release anonymization of datasets. We consider dynamic datasets where data can be inserted and deleted, and view this scenario as a case where each release is a small subset of the dataset corresponding, for example, to the results of a query. Compared to multiple releases of the full database, this has the obvious advantage of faster anonymization. We present an algorithm for post-processing anonymized queries that prevents anonymity attacks using multiple released queries. This algorithm can be used with several distinct protection principles and anonymization algorithms, which makes it generic and flexible. We give an experimental evaluation of the algorithm and compare it to m-invariance both in terms of efficiency and data quality. To this end, we propose two data quality metrics based on Shannon's entropy, and show that they can be seen as a refinement of existing metrics.

Cunha A, Pacheco H. 2012. Multifocal: A Strategic Bidirectional Transformation Language for XML Schemas. Proceedings of the 5th International Conference on Model Transformation - ICMT. 7307:89–104.

Lenses are one of the most popular approaches to define bidirectional transformations between data models. However, writing a lens transformation typically implies describing the concrete steps that convert values in a source schema to values in a target schema. In contrast, many XML-based languages allow writing structure-shy programs that manipulate only specific parts of XML documents without having to specify the behavior for the remaining structure. In this paper, we propose a structure-shy bidirectional two-level transformation language for XML Schemas, that describes generic type-level transformations over schema representations coupled with value-level bidirectional lenses for document migration. When applying these two-level programs to particular schemas, we employ an existing algebraic rewrite system to optimize the automatically-generated lens transformations, and compile them into Haskell bidirectional executables. We discuss particular examples involving the generic evolution of recursive XML Schemas, and compare their performance gains over non-optimized definitions.
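For readers unfamiliar with lenses, the sketch below shows the basic idea the abstract builds on: a pair of functions, one extracting a view from a source and one writing an updated view back. It is a generic illustration in Python, not Multifocal syntax or its generated Haskell code; the `Lens` class and `name_lens` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Lens:
    get: Callable[[Any], Any]       # source -> view
    put: Callable[[Any, Any], Any]  # (source, updated view) -> new source


# A lens focusing on the "name" field of a record; every other field is
# carried over unchanged when the view is written back.
name_lens = Lens(
    get=lambda source: source["name"],
    put=lambda source, view: {**source, "name": view},
)

person = {"name": "Ann", "age": 30}
renamed = name_lens.put(person, "Bob")

# Round-trip properties usually expected of a well-behaved lens.
get_put_holds = name_lens.put(person, name_lens.get(person)) == person
put_get_holds = name_lens.get(renamed) == "Bob"
```

Here the lens spells out exactly which field is read and written; the structure-shy style described in the abstract aims to avoid writing such steps by hand for every part of a schema.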