Rui Oliveira

Postdoctoral Researcher at HASLab/INESC TEC & University of Minho

The High Assurance Software Lab (HASLab) is an R&D unit at INESC TEC, a leading research institution in Portugal. The HASLab specialises in the rigorous development of software applications for critical systems and infrastructures, drawing on expertise in software engineering, dependable distributed systems, and cryptography and information security.

The HASLab has open positions for Postdoctoral researchers and Ph.D. students, and offers a significant number of internships to prospective Ph.D. candidates.

Projects

  • LeanBigData: Ultra-Scalable and Ultra-Efficient Integrated and Visual Big Data Analytics (FP7-619606) (Local coordinator)
  • CoherentPaaS: A Coherent and Rich PaaS with a Common Programming Model (FP7-611068) (Local coordinator)
  • PRACTICE: Privacy-enhanced and Secure Computations on Potentially Malicious Clouds (FP7-609611)

BIO

I was born in Luanda, Angola. In 1975, just before the independence of Angola (until then a Portuguese colony), I came to Portugal. I did almost all my studies in Porto, where I live. I graduated in Electrical and Computer Engineering from the University of Porto in 1991. I then went to the University of Minho, in Braga, to take the MSc course in Computer Science. My thesis, on the relationship between subclassing and subtyping in object-oriented languages, was concluded in 1994.

Beernaert L, Gomes P, Matos M, Vilaça R, Oliveira R. 2013. Evaluating Cassandra as a manager of large file sets. Proceedings of the 3rd International Workshop on Cloud Data and Platforms (with EuroSys 2013). PDF: clouddp-cassfs.pdf

All companies developing their business on the Web, not only giants like Google or Facebook but also small companies focused on niche markets, face scalability issues in data management. The case study of this paper is content management systems for classified or commercial advertisements on the Web. The data involved has a very significant growth rate and a read-intensive access pattern with a reduced update rate. Typically, data is stored in traditional file systems hosted on dedicated servers or Storage Area Network devices due to the generalization and ease of use of file systems. However, this ease of implementation and usage has a disadvantage: the centralized nature of these systems leads to availability, elasticity and scalability problems. The scenario under study, undemanding in terms of the system's consistency and with a simple interaction model, is well suited to a distributed database such as Cassandra, conceived precisely to dynamically handle large volumes of data. In this paper, we analyze the suitability of Cassandra as a substitute for file systems in content management systems. The evaluation, conducted using real data from a production system, shows that using Cassandra one can easily get horizontal scalability of storage, redundancy across multiple independent nodes, and load distribution imposed by the periodic activities of safeguarding data, while ensuring a comparable performance to that of a file system.
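
The core idea of the study—keying file content by identifier in a Cassandra table instead of a file system path hierarchy—can be illustrated with a minimal sketch using the DataStax Python driver. The keyspace, table and column names below are illustrative assumptions, not the schema of the paper's prototype.

    # Minimal sketch: storing and fetching file blobs in Cassandra
    # (illustrative schema; not the prototype evaluated in the paper).
    from cassandra.cluster import Cluster

    cluster = Cluster(["127.0.0.1"])
    session = cluster.connect()

    session.execute("""
        CREATE KEYSPACE IF NOT EXISTS ads
        WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}
    """)
    session.execute("""
        CREATE TABLE IF NOT EXISTS ads.files (path text PRIMARY KEY, content blob)
    """)

    def put_file(path, data):
        # Each file becomes a single row; Cassandra handles replication
        # and partitioning across nodes transparently.
        session.execute("INSERT INTO ads.files (path, content) VALUES (%s, %s)",
                        (path, data))

    def get_file(path):
        row = session.execute("SELECT content FROM ads.files WHERE path = %s",
                              (path,)).one()
        return row.content if row else None

    put_file("/classified/12345.jpg", b"...image bytes...")
    print(len(get_file("/classified/12345.jpg") or b""))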

Matos M, Felber P, Oliveira R, Pereira JO, Rivière E. 2013. Scaling up Publish/Subscribe Overlays using Interest Correlation for Link Sharing. IEEE Transactions on Parallel and Distributed Systems. 24:2462-2471. PDF: tpds-stan-main.pdf

Topic-based publish/subscribe is at the core of many distributed systems, ranging from application integration middleware to news dissemination. Therefore, much research has been dedicated to publish/subscribe architectures and protocols, and in particular to the design of overlay networks for decentralized topic-based routing and efficient message dissemination. Nonetheless, existing systems fail to take full advantage of shared interests when disseminating information, hence suffering from high maintenance and traffic costs, or construct overlays that cope poorly with the scale and dynamism of large networks. In this paper we present StaN, a decentralized protocol that optimizes the properties of gossip-based overlay networks for topic-based publish/subscribe by sharing a large number of physical connections without disrupting their logical properties. StaN relies only on local knowledge and operates by leveraging common interests among participants to improve global resource usage and promote topic and event scalability. The experimental evaluation under two real workloads, both via a real deployment and through simulation, shows that StaN provides an attractive infrastructure for scalable topic-based publish/subscribe.
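
The underlying idea of link sharing by interest correlation can be sketched very simply: when a physical connection to a neighbour exists for one topic, it is reused for every other topic both nodes subscribe to. The sketch below is only an illustration of that principle under assumed names; it is not StaN's actual neighbour-selection algorithm.

    # Illustrative sketch of link sharing by interest correlation:
    # one physical connection carries several per-topic overlay edges.

    def shared_topics(my_topics, peer_topics):
        return my_topics & peer_topics

    class Node:
        def __init__(self, topics):
            self.topics = set(topics)
            self.links = {}          # peer id -> topics routed over that link

        def connect(self, peer_id, peer_topics):
            overlap = shared_topics(self.topics, set(peer_topics))
            if overlap:
                # one TCP connection, many logical (per-topic) overlay edges
                self.links.setdefault(peer_id, set()).update(overlap)

    n = Node({"sports", "finance", "weather"})
    n.connect("peer-7", {"finance", "weather", "music"})
    print(n.links)   # {'peer-7': {'finance', 'weather'}} -> 1 link, 2 overlays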

Matos M, Schiavoni V, Felber P, Oliveira R, Rivière E. 2013. Lightweight, efficient, robust epidemic dissemination. Journal of Parallel and Distributed Computing. 73(7):987-999. PDF: jpdc-brisa.pdf

Today's intensive demand for data such as live broadcasts or news feeds requires efficient and robust dissemination systems. Traditionally, designs focus on extremes of the efficiency/robustness spectrum, either by using structures such as trees for efficiency or by using loosely-coupled epidemic protocols for robustness. We present Brisa, a hybrid approach combining the robustness of epidemics with the efficiency of structured approaches. With Brisa, embedded dissemination structures implicitly emerge from an underlying epidemic substrate. The structures' links are chosen with local knowledge only, while still ensuring connectivity. Failures can be promptly compensated and repaired thanks to the epidemic substrate, and their impact on dissemination delays is masked by the use of multiple independent structures. Besides presenting the protocol design, we conduct an extensive evaluation in real environments, analyzing the effectiveness of the structure creation mechanism and its robustness under dynamic conditions. Results confirm Brisa as an efficient and robust approach to data dissemination in large dynamic environments.
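
The principle of letting a dissemination structure emerge from an epidemic substrate can be illustrated with a minimal parent-selection rule: keep the first sender of a message as tree parent and ask the remaining gossip neighbours to stop pushing. This is a sketch of the principle only, not the paper's full protocol (which also guards against cycles and maintains multiple structures).

    # Sketch: first sender becomes the tree parent; redundant epidemic
    # links are deactivated but remain available if the parent fails.

    class Node:
        def __init__(self, node_id):
            self.node_id = node_id
            self.parent = None
            self.muted = set()          # neighbours asked to stop pushing

        def on_push(self, sender, message_id):
            if self.parent is None:
                self.parent = sender    # first sender becomes the parent
                return "keep"           # keep this link active in the tree
            if sender != self.parent:
                self.muted.add(sender)  # redundant link: deactivate it,
                return "prune"          # it can be re-enabled on failure
            return "keep"

    n = Node("n42")
    print(n.on_push("n7", "m1"))   # keep  -> n7 is the parent
    print(n.on_push("n3", "m1"))   # prune -> redundant link deactivated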

Maia F, Matos M, Rivière E, Oliveira R. 2013. Slicing as a distributed systems primitive. Proceedings of the 6th Latin-American Symposium on Dependable Computing (LADC). PDF: slicing_primitive.pdf

Large-scale distributed systems appear as the major infrastructures for supporting planet-scale services. These systems call for appropriate management mechanisms and protocols. Slicing is an example of an autonomous, fully decentralized protocol suitable for large-scale environments. It aims at organizing the system into groups of nodes, called slices, according to an application-specific criterion, where the size of each slice is relative to the size of the full system. This allows assigning a certain fraction of nodes to different tasks, according to their capabilities. Although useful, current slicing techniques lack some features of considerable practical importance. This paper proposes a slicing protocol that builds on existing solutions and addresses some of their frailties. We present novel solutions to deal with non-uniform slices and to perform online and dynamic slice schema reconfiguration. Moreover, we describe how to provision a slice-local Peer Sampling Service for upper protocol layers and how to enhance slicing protocols with the capability of slicing over more than one attribute. Slicing is presented as a complete, dependable and integrated distributed systems primitive for large-scale systems.
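
The essence of slicing—placing each node into a slice according to the relative rank of one of its attributes—can be sketched as follows. A node estimates its rank from a sample of peer attribute values (which a real slicing protocol would gather and refine by gossip) and maps that rank onto slice boundaries expressed as fractions of the system. Names and parameters here are illustrative assumptions.

    # Sketch: rank estimation from a sample of peer attribute values,
    # followed by a lookup into cumulative slice boundaries.

    def estimate_rank(my_value, sampled_values):
        below = sum(1 for v in sampled_values if v < my_value)
        return below / max(len(sampled_values), 1)   # in [0, 1)

    def slice_of(rank, boundaries):
        # boundaries are cumulative fractions, e.g. [0.1, 0.5, 1.0] gives
        # three slices holding 10%, 40% and 50% of the nodes respectively.
        for i, b in enumerate(boundaries):
            if rank < b:
                return i
        return len(boundaries) - 1

    sample = [120, 80, 300, 45, 210, 95, 60]   # e.g. peers' available bandwidth
    print(slice_of(estimate_rank(150, sample), [0.1, 0.5, 1.0]))   # -> slice 2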

Vilaça R, Cruz F, Pereira JO, Oliveira R. 2013. An Effective Scalable SQL Engine for NoSQL Databases. Proceedings of the 13th IFIP Distributed Applications and Interoperable Systems (DAIS). PDF: paper_29.pdf

NoSQL databases were initially devised to support a few concrete extreme-scale applications. Since the specificity and scale of the target systems justified the investment of manually crafting application code, their limited query and indexing capabilities were not a major impediment. However, with a considerable number of mature alternatives now available, there is an increasing willingness to use NoSQL databases in a wider and more diverse spectrum of applications and, for most of them, hand-crafted query code is not an enticing trade-off. In this paper we address this shortcoming of current NoSQL databases with an effective approach for executing SQL queries while preserving their scalability and schema flexibility. We show how a full-fledged SQL engine can be integrated atop HBase, leading to an ANSI SQL compliant database. Under a standard TPC-C workload our prototype scales linearly with the number of nodes in the system and outperforms a NoSQL TPC-C implementation optimized for HBase.
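
The trade-off the paper addresses can be seen in a small sketch: without an SQL layer, even a trivial aggregation over an HBase table must be hand-coded against the low-level API (here shown via the happybase Python client, used only for illustration), whereas with an SQL engine on top it is a one-line query. The table and column names are hypothetical, and this is not the engine described in the paper.

    # Hand-crafted equivalent of:
    #   SELECT SUM(amount) FROM orders WHERE district = 7
    import happybase

    connection = happybase.Connection("hbase-host")   # assumed Thrift gateway
    orders = connection.table("orders")               # illustrative table name

    total = 0.0
    for _key, data in orders.scan(columns=[b"d:district", b"d:amount"]):
        if data[b"d:district"] == b"7":
            total += float(data[b"d:amount"])
    print(total)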

Maia F, Matos M, Vilaça R, Pereira JO, Oliveira R, Rivière E. 2013. DataFlasks: an epidemic dependable key-value substrate. Proceedings of the 3rd International Workshop on Dependability of Clouds, Data Centers and Virtual Computing Environments (with DSN 2013). PDF: dataflasks_dcdv13.pdf

Recently, tuple-stores have become pivotal structures in many information systems. Their ability to handle large datasets makes them important in an era with unprecedented amounts of data being produced and exchanged. However, these tuple-stores typically rely on structured peer-to-peer protocols, which assume moderately stable environments. Such an assumption does not always hold for very large scale systems on the scale of thousands of machines. In this paper we present a novel approach to the design of a tuple-store. Our approach follows a stratified design based on an unstructured substrate. We focus on this substrate and on how the use of epidemic protocols allows reaching high dependability and scalability.

Matos M, Schiavoni V, Felber P, Oliveira R, Rivière E. 2012. Brisa: Combining efficiency and reliability in epidemic data dissemination. Proceedings of the 26th IEEE International Parallel & Distributed Processing Symposium (IPDPS). PDF: ipdps-brisa.pdf

There is an increasing demand for efficient and robust systems able to cope with today's global needs for intensive data dissemination, e.g., media content or news feeds. Unfortunately, traditional approaches tend to focus on one end of the efficiency/robustness design spectrum, by either leveraging rigid structures such as trees to achieve efficient distribution, or using loosely-coupled epidemic protocols to obtain robustness. In this paper we present BRISA, a hybrid approach combining the robustness of epidemic-based dissemination with the efficiency of tree-based structured approaches. This is achieved by having dissemination structures such as trees implicitly emerge from an underlying epidemic substrate through a judicious selection of links. These links are chosen with local knowledge only and in such a way that the completeness of data dissemination is not compromised, i.e., the resulting structure covers all nodes. Failures are treated as an integral part of the system, as the dissemination structures can be promptly compensated and repaired thanks to the underlying epidemic substrate. Besides presenting the protocol design, we conduct an extensive evaluation in a real environment, analyzing the effectiveness of the structure creation mechanism and its robustness under faults and churn. Results confirm BRISA as an efficient and robust approach to data dissemination at large scale.

Beernaert L, Matos M, Vilaça R, Oliveira R. 2012. Automatic elasticity in OpenStack. Proceedings of the Workshop on Secure and Dependable Middleware for Cloud Monitoring and Management (with Middleware 2012). PDF: sdmcmm-elastack.pdf

Cloud computing infrastructures are the most recent approach to the development and conception of computational systems. Cloud infrastructures are complex environments with various subsystems, each one with its own challenges. Cloud systems should be able to provide the following fundamental property: elasticity. Elasticity is the ability to automatically add and remove instances according to the needs of the system. This is a requirement for pay-per-use billing models. Various open source software solutions allow companies and institutions to build their own Cloud infrastructure. However, in most of these, the elasticity feature is quite immature. Monitoring and timely adapting the active resources of a Cloud computing infrastructure is key to providing the elasticity required by diverse, multi-tenant and pay-per-use business models. In this paper, we propose Elastack, an automated monitoring and adaptive system, generic enough to be applied to existing IaaS frameworks and intended to enable the elasticity they currently lack. Our approach offers any Cloud infrastructure the mechanisms to implement automated monitoring and adaptation as well as the flexibility to go beyond these. We evaluate Elastack by integrating it with OpenStack, showing how easy it is to add these important features with a minimal, almost imperceptible, amount of modifications to the default installation.
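
A minimal monitor-and-adapt loop of the kind Elastack automates is sketched below. The `cloud` object stands for a hypothetical IaaS client with average_cpu(), instance_count(), add_instance() and remove_instance() operations; it is a placeholder for illustration, not the OpenStack API or the mechanism used by the paper's prototype, and the thresholds are arbitrary.

    # Sketch of a threshold-based elasticity loop (hypothetical `cloud` client).
    import time

    SCALE_UP_CPU = 0.80     # add an instance above 80% average CPU
    SCALE_DOWN_CPU = 0.20   # remove one below 20%

    def elasticity_loop(cloud, period_s=60, min_instances=1):
        while True:
            cpu = cloud.average_cpu()
            if cpu > SCALE_UP_CPU:
                cloud.add_instance()
            elif cpu < SCALE_DOWN_CPU and cloud.instance_count() > min_instances:
                cloud.remove_instance()
            time.sleep(period_s)   # re-evaluate periodically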

Jimenez-Peris R, Patiño-Martínez M, Brondino I, Pereira JO, Oliveira R, Vilaça R, Kemme B, Ahmad Y. 2012. CumuloNimbo: Parallel-Distributed Transactional Processing. Proceedings of the CloudFutures Workshop. PDF: cumulonimbo-ultra-scalable-transactional-processing-v6.pdf

CumuloNimbo aims at solving the lack of scalability of transactional applications, which represent a large fraction of existing applications. CumuloNimbo aims at conceiving, architecting and developing a transactional, coherent, elastic and ultra-scalable Platform as a Service. Its goals are: to be ultra-scalable and dependable, able to scale from a few users to many millions of users while providing continuous availability; to support transparent migration of multi-tier applications (e.g. Java EE applications, relational DB applications, etc.) to the cloud with automatic scalability and elasticity; to avoid reprogramming of applications and non-transparent scalability techniques such as sharding; and to support transactions for new data stores such as cloud data stores, graph databases, etc. The main challenges are: update ultra-scalability (a million update transactions per second and as many read-only transactions as needed), strong transactional consistency, non-intrusive elasticity, inexpensive high availability, and low latency. CumuloNimbo goes beyond the state of the art by transparently scaling transactional applications to very large rates without sharding, the current practice in today's cloud. In this paper we describe the CumuloNimbo architecture and its performance.

Vilaça R, Oliveira R, Pereira JO. 2011. A correlation-aware data placement strategy for key-value stores. Proceedings of the 11th IFIP Distributed Applications and Interoperable Systems (DAIS). PDF: dais11.pdf

Key-value stores hold the unprecedented bulk of the data produced by applications such as social networks. Their scalability and availability requirements often outweigh sacrificing richer data and processing models, and even elementary data consistency. Moreover, existing key-value stores have only random or order-based placement strategies. In this paper we exploit arbitrary data relations easily expressed by the application to foster data locality and improve the performance of complex queries common in social network read-intensive workloads. We present a novel data placement strategy, supporting dynamic tags, based on multidimensional locality-preserving mappings. We compare our data placement strategy with the ones used in existing key-value stores under the workload of a typical social network application and show that the proposed correlation-aware data placement strategy offers a major improvement on the system's overall response time and network requirements.
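
One well-known example of a multidimensional locality-preserving mapping is Z-order (Morton) interleaving: the bits of two tag coordinates are interleaved into a single key so that items with nearby tags land on nearby positions of the key space used for placement. The sketch below only illustrates that principle; the paper's actual mapping and tag model may differ.

    # Sketch: Z-order (Morton) interleaving of two tag coordinates.

    def interleave_bits(x, y, bits=16):
        key = 0
        for i in range(bits):
            key |= ((x >> i) & 1) << (2 * i)       # even bit positions <- x
            key |= ((y >> i) & 1) << (2 * i + 1)   # odd bit positions  <- y
        return key

    # Two items with correlated (nearby) tags get close placement keys...
    print(interleave_bits(100, 200), interleave_bits(101, 201))
    # ...while an unrelated item lands far away in the key space.
    print(interleave_bits(60000, 3))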

Matos M, Vilaça R, Pereira JO, Oliveira R. 2011. An epidemic approach to dependable key-value substrates. Proceedings of the 1st International Workshop on Dependability of Clouds, Data Centers and Virtual Computing Environments (with DSN). PDF: dcdv-epidemicstorage.pdf

The sheer volumes of data handled by today's Internet services demand uncompromising scalability from the persistence substrates. Such demands have been successfully addressed by highly decentralized key-value stores invariably governed by a distributed hash table. The availability of these structured overlays rests on the assumption of a moderately stable environment. However, as scale grows to unprecedented numbers of nodes, the occurrence of faults and churn becomes the norm rather than the exception, precluding the adoption of rigid control over the network's organization. In this position paper we outline the major ideas of a novel architecture designed to handle today's very large scale demand and its inherent dynamism. The approach rests on the well-known reliability and scalability properties of epidemic protocols to minimize the impact of churn. We identify several challenges that such an approach implies and speculate on possible solutions to ensure data availability and adequate access performance.

Cruz F, Gomes P, Oliveira R, Pereira JO. 2011. Assessing NoSQL Databases for Telecom Applications. Proceedings of the IEEE 13th Conference on Commerce and Enterprise Computing (CEC). PDF: o_publicado.pdf

The constant evolution of access technologies is making Internet access more ubiquitous, faster, better and cheaper. In connection with the proliferation of Internet access, Cloud Computing is changing the way users look at data, moving from local applications and installations to remote services accessible from any device. This new paradigm presents numerous opportunities that even traditional businesses like telecoms cannot ignore, in particular by enabling new and more cost-effective solutions to old problems. The work presented in this paper provides a detailed description of how a telecom application can be migrated to a NoSQL database. In particular, it points out the necessary change in how we reason about data, as well as in the data structures that support it, in order to take full advantage of Cloud Computing. In addition, we also present a preliminary evaluation of different data persistency paradigms based on a fully tunable simulation platform that mimics the operation of a telecom business.

Vilaça R, Cruz F, Oliveira R. 2010. On the expressiveness and trade-offs of large scale tuple stores. Proceedings of On the Move to Meaningful Internet Systems (OTM). PDF: cr.pdf

Massive-scale distributed computing is a challenge at our doorstep. The current exponential growth of data calls for massive-scale capabilities of storage and processing. This is being acknowledged by several major Internet players embracing the cloud computing model and offering first generation distributed tuple stores. Having all started from similar requirements, these systems ended up providing a similar service: a simple tuple store interface that allows applications to insert, query, and remove individual elements. Furthermore, while availability is commonly assumed to be sustained by the massive scale itself, data consistency and freshness is usually severely hindered. By doing so, these services focus on a specific narrow trade-off between consistency, availability, performance, scale, and migration cost that is much less attractive to common business needs. In this paper we introduce DataDroplets, a novel tuple store that shifts the current trade-off towards the needs of common business users, providing additional consistency guarantees and higher level data processing primitives smoothing the migration path for existing applications. We present a detailed comparison between DataDroplets and existing systems regarding their data model, architecture and trade-offs. Preliminary results of the system's performance under a realistic workload are also presented.

Campos F, Pereira JO, Oliveira R. 2011. Achieving Eventual Leader Election in WS-Discovery. Proceedings of the 5th Latin-American Symposium on Dependable Computing (LADC). PDF: cpo11.pdf

The Devices Profile for Web Services (DPWS) provides the foundation for seamless deployment, autonomous configuration, and joint operation for various computing devices in environments ranging from simple personal multimedia setups and home automation to complex industrial equipment and large data centers. In particular, WS-Discovery provides dynamic rendezvous for clients and services embodied in such devices. Unfortunately, failure detection implicit in this standard is very limited, both by embodying static timing assumptions and by omitting liveness monitoring, leading to undesirable situations in demanding application scenarios. In this paper we identify these undesirable outcomes and propose an extension of WS-Discovery that allows failure detection to achieve eventual leader election, thus preventing them.
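
The eventual-leader abstraction the paper targets can be sketched with heartbeat-based failure detection: each node periodically announces itself, suspects peers whose announcements stop arriving, and deterministically picks the smallest non-suspected identifier as leader. This is only an illustration of the abstraction under assumed names and timeouts, not the WS-Discovery extension proposed in the paper.

    # Sketch: eventual leader election on top of heartbeat-based suspicion.
    import time

    class EventualLeader:
        def __init__(self, my_id, timeout_s=10.0):
            self.my_id = my_id
            self.timeout_s = timeout_s
            self.last_seen = {my_id: time.time()}

        def on_hello(self, peer_id):
            # Called whenever an announcement/heartbeat from peer_id arrives.
            self.last_seen[peer_id] = time.time()

        def leader(self):
            now = time.time()
            alive = [p for p, t in self.last_seen.items()
                     if now - t <= self.timeout_s]
            return min(alive) if alive else self.my_id

    node = EventualLeader("urn:device:42")
    node.on_hello("urn:device:17")
    print(node.leader())   # 'urn:device:17' while its heartbeats keep arriving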

Maia F, Oliveira R, Inigo A-J. 2010. About the Feasibility of Transactional Support in Cloud Computing (Fast Abstract). Proceedings of the 8th European Dependable Computing Conference (EDCC). PDF: feasibility.pdf

In this paper we review what has been stated so far about transactional support in cloud computing environments. Then, we propose to extend these approaches with ideas already established in replicated databases, such as the certification process, to solve certain problems regarding coordination in the commit phase of transactions in the cloud.

Matos M, Sousa AL, Pereira JO, Oliveira R, Deliot E, Murray P. 2009. CLON: Overlay networks and gossip protocols for cloud environments. Proceedings of On the Move to Meaningful Internet Systems (OTM). 5870. PDF: doa-clon.pdf

Although epidemic or gossip-based multicast is a robust and scalable approach to reliable data dissemination, its inherent redundancy results in high resource consumption on both links and nodes. This problem is aggravated in settings that have costlier or resource-constrained links, as happens in Cloud Computing infrastructures composed of several interconnected data centers across the globe. The goal of this work is therefore to improve the efficiency of gossip-based reliable multicast by reducing the load imposed on those constrained links. In detail, the proposed CLON protocol combines an overlay that gives preference to local links and a dissemination strategy that takes locality into account. Extensive experimental evaluation using a very large number of simulated nodes shows that this results in a reduction of traffic on constrained links by an order of magnitude, while at the same time preserving the resilience properties that make gossip-based protocols so attractive.
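
The locality bias at the heart of this approach can be sketched as a peer-selection rule that draws gossip targets mostly from the local data center and only occasionally over a costlier inter-data-center link. The probability and names below are illustrative assumptions; CLON's actual overlay construction and dissemination rules are more involved.

    # Sketch: locality-biased gossip target selection.
    import random

    REMOTE_PROBABILITY = 0.1   # illustrative bias, not the paper's parameter

    def pick_gossip_targets(local_peers, remote_peers, fanout):
        targets = []
        for _ in range(fanout):
            use_remote = remote_peers and random.random() < REMOTE_PROBABILITY
            pool = remote_peers if use_remote else local_peers
            targets.append(random.choice(pool))   # mostly local, rarely remote
        return targets

    local = ["dc1-n%d" % i for i in range(8)]
    remote = ["dc2-n%d" % i for i in range(8)]
    print(pick_gossip_targets(local, remote, fanout=4))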

Position: Associate Professor