Recent Publications

Matos M.  2009.  Network-Aware Epidemic Broadcast. masterthesis.pdf

Epidemic multicast is an emerging resilient and scalable approach to the reliable dissemination of application data in the context of very large scale distributed systems. Unfortunately, resilience and scalability come at the cost of considerable redundancy, which leads to high resource consumption on both links and nodes. In environments with resource-constrained links, such as a Cloud Computing infrastructure composed of data centers organized in a federation around the globe, this high resource consumption precludes the use of this class of protocols. The goal of this dissertation is therefore to cope with the constraints of these scenarios by reducing the network load imposed on the constrained long-distance links. This is achieved by constructing an overlay that reflects the characteristics of the links, and by using a dissemination protocol that takes locality into account when transmitting message payloads. We conducted an extensive experimental evaluation that shows an improvement of over an order of magnitude in the number of messages that traverse the costlier links, without endangering the resilience and scalability properties that make epidemic-based protocols so attractive.
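As a rough illustration of the dissemination idea (a sketch only, not the thesis protocol; all names are illustrative), one gossip round can push full payloads over cheap local links while sending only message identifiers over costly wide-area links, leaving remote peers to pull missing payloads later:

```haskell
-- Sketch: eager payload push on local links, lazy (id-only) push on
-- costly remote links. A peer that receives only an Advert can later
-- pull the payload from a nearby peer that already holds it.
data Locality = Local | Remote deriving (Eq, Show)

data Peer = Peer { peerName :: String, locality :: Locality }
  deriving Show

data Gossip = Payload String String  -- message id and full body
            | Advert String          -- message id only
  deriving Show

-- Decide, per selected peer, what this round actually transmits.
gossipRound :: String -> String -> [Peer] -> [(Peer, Gossip)]
gossipRound mid body peers =
  [ (p, if locality p == Local then Payload mid body else Advert mid)
  | p <- peers ]

main :: IO ()
main = mapM_ print
  (gossipRound "m1" "state update"
     [Peer "dc1-node" Local, Peer "dc2-node" Remote])
```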

Nunes A.  2009.  P2P content-push in the Internet.

Syndicated content-push in the Internet is a huge success, and web feeds are being used to convey various types of content: from news headlines, to podcasts and video podcasts, to being a feature of Web 2.0 websites. This diversity led to the appearance of several frameworks, each tailored to a specific content type. At the same time, interest in social networking exploded, as more and more websites for this purpose were launched. Syndicated content and social networking websites are now intimately connected.
In this work, a generic, modular, P2P content-push architecture is proposed. It provides an evolutionary upgrade path based on the technologies already in use in the Internet for interoperability, thus neither disrupting the current infrastructure nor changing participants' habits. It also leverages social networks for content discovery and aggregation, using peer-to-peer protocols for distribution. A partial implementation of this architecture, dubbed SEEDS, is also presented.
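A minimal sketch of the aggregation side only (illustrative types, not the SEEDS design): entries gathered from several feeds are deduplicated by identifier and merged into one reverse-chronological stream.

```haskell
import Data.List (sortBy, nubBy)
import Data.Ord (comparing, Down (..))

-- Illustrative feed entry: a guid, a publication timestamp, a title.
data Entry = Entry { guid :: String, published :: Int, title :: String }
  deriving Show

-- Merge entries from several feeds, drop duplicates (same guid),
-- newest first.
aggregate :: [[Entry]] -> [Entry]
aggregate = sortBy (comparing (Down . published))
          . nubBy (\a b -> guid a == guid b)
          . concat
```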

Abreu R.  2009.  Spectrum-based fault localization in embedded software. thesis_rui_abreu.pdf

Modern everyday devices such as televisions rely increasingly on (embedded) software. Features implemented in software are often cheaper to implement, easier and more flexible to modify, and more portable than when implemented in hardware. Such properties are extremely important because, nowadays, many devices serve no single purpose but instead offer several functionalities, which need to be easily modified or upgraded to meet the high expectations of consumers. As an example, a mobile phone is used not only to make phone calls but also, e.g., as a navigation system, which has to contain the most up-to-date navigation engine and maps.
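The paragraph above is motivational; the technique the title names, spectrum-based fault localization, ranks program components by comparing their involvement in passing and failing test runs. A minimal sketch (illustrative code, not the thesis tooling) using the Ochiai similarity coefficient, which this line of work showed to be particularly effective:

```haskell
-- Rank a component by the Ochiai similarity between its involvement
-- vector (one Bool per test: was the component executed?) and the
-- test outcomes (True = the test failed). Higher = more suspicious.
ochiai :: [Bool] -> [Bool] -> Double
ochiai involved failed =
  let count p = fromIntegral (length (filter p (zip involved failed)))
      ef = count (\(i, f) -> i && f)       -- executed, test failed
      nf = count (\(i, f) -> not i && f)   -- not executed, test failed
      ep = count (\(i, f) -> i && not f)   -- executed, test passed
  in if ef == 0 then 0 else ef / sqrt ((ef + nf) * (ef + ep))
```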

Oliveira N.  2009.  Improving Program Comprehension Tools for Domain Specific Languages.

Since the dawn of time, curiosity and the need to improve their quality of life have led humans to find means to understand everything surrounding them, aiming at improving it. As the creative abilities of some grew, the comprehension capacity of others followed in their steps. Disassembling physical objects to comprehend the connections between the pieces, in order to understand how they work together, is common human behavior. With the arrival of computers, humans felt the need to apply the same technique (disassemble to comprehend) to their programs.
Traditionally, these programs are written in general-purpose programming languages. Hence, the techniques and artifacts used to aid program comprehension were built to facilitate the work of software programmers in maintaining and improving programs developed by others. Generally, these generic languages deal with concepts at a level that the human brain can hardly understand, so understanding programs written in these languages is a hard task: the distance between the concepts at the program level and the concepts at the problem level is too big.
Thus, just as in politics, justice, medicine, etc., specialized vocabularies are regularly used to ease comprehension between people, in programming, languages that address a specific domain were created. These programming languages raise the abstraction level of the program domain, shortening the gap to the concepts of the problem domain.
Tools and techniques for program comprehension commonly address the program domain and take little advantage of the problem domain. This master's thesis assumes the hypothesis that it is easier to comprehend a program when the underlying problem and program domains are known and a bridge between them is established. A program comprehension technique for domain specific languages is then conceived, proposed and discussed. The main objective is to take advantage of the large amount of knowledge about the problem domain inherent to a domain specific language, and to improve traditional program comprehension tools that, until now, only dealt with the program domain. This creates connections between the program and problem domains. The final result shows, visually, what happens internally at the program domain level, synchronized with what happens externally, at the problem level.

Rodrigues N.  2008.  Slicing techniques applied to architectural analysis of legacy software. tese.pdf

Program understanding is emerging as a key concern in software engineering. In a situation in which the only quality certificate of a running software artifact is still life-cycle endurance, customers and software producers are little prepared to modify or improve running code. However, faced with so risky a dependence on legacy software, managers are more and more prepared to spend resources to increase confidence in, i.e., the level of understanding of, their (otherwise untouchable) code. In fact, the technological and economical relevance of legacy software, as well as the complexity of its re-engineering, entails the need for rigour.
Addressing such a scenario, this thesis advocates the use of direct source code analysis for both the understanding and the transformation of software systems. In particular, the thesis focuses on the development and application of slicing techniques at both the “micro” and “macro” structural levels of software. The former deals with the fine-grained structures of programs, slicing operating over elementary program entities such as types, variables or procedure identifiers. The latter, on the other hand, addresses architectural issues and interaction modes across the modules, components or services into which a system is decomposed.
At the “micro” level, this thesis delves into the problem of slicing functional programs, a paradigm that is gaining importance and has generally been neglected by the slicing community. Three different approaches to functional slicing are proposed, accompanied by the presentation of the HaSlicer application, a software tool developed as a proof-of-concept for some of the ideas discussed. A comparison between the three approaches, their practical application, and the motivation for continuing to investigate new functional slicing processes are also discussed.
Slicing at the “macro” level is the theme of the second part of this thesis, which addresses the problem of extracting from source code the coordination model that governs interaction between a system's components. This line of research delivers two approaches for abstracting software systems' coordination models, one of the most vital structures for software architectural analysis. Again, a software tool, CoordInspector, is introduced as a proof-of-concept.
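For the “micro” level, a minimal sketch of the underlying idea (not HaSlicer itself): once dependencies between program entities are extracted, a backward slice is everything reachable backwards from the slicing criterion.

```haskell
import qualified Data.Set as Set

-- Dependency edges: (entity, entity it depends on).
type DepGraph = [(String, String)]

-- Backward slice: the criterion plus everything it transitively
-- depends on.
backwardSlice :: DepGraph -> String -> Set.Set String
backwardSlice g = go Set.empty
  where
    go seen x
      | x `Set.member` seen = seen
      | otherwise =
          foldl go (Set.insert x seen) [d | (e, d) <- g, e == x]
```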

Cunha A.  2005.  Point-free Program Calculation. tese.pdf

Due to referential transparency, functional programming is particularly appropriate for equational reasoning. In this thesis we reason about functional programs by calculation, using a program calculus built upon two basic ingredients. The first is a set of recursion patterns that allow us to define recursive functions implicitly. These are encoded as higher-order operators that encapsulate typical forms of recursion, such as the well-known foldr operator on lists. The second is a point-free style of programming in which programs are expressed as combinations of simpler functions, without ever mentioning their arguments. The basic combinators are derived from standard categorical constructions, and are characterized by a rich set of equational laws. In order to be able to apply this calculational methodology to real lazy functional programming languages, a concrete category of partial functions and elements is used.
While recursion patterns are already well accepted, and a lot of research has been carried out on this topic, the same cannot be said about point-free programming. This thesis addresses precisely this component of the calculus. One of the contributions is a mechanism to translate classic pointwise code into the point-free style. This mechanism can be applied to a λ-calculus rich enough to represent the core functionality of a real functional programming language. A library that enables programming in a pure point-free style within Haskell is also presented. This library is useful for understanding the expressions resulting from the above translation, since it allows their direct execution and, where applicable, the graphical visualization of recursion trees. Another contribution of the thesis is a framework for performing point-free calculations with higher-order functions. This framework is based on the internalization of some basic combinators, and considerably shortens calculations in this setting. In order to assess the viability of mechanizing calculations, several issues are discussed concerning the decidability of equality and the implementation of fusion laws.
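A standard Haskell illustration of the two ingredients (ordinary Haskell, not the thesis library): recursion captured by the foldr pattern, and the same function written pointwise versus point-free.

```haskell
-- Pointwise: the argument and the lambda plumbing are explicit.
sumPointwise :: [Int] -> Int
sumPointwise xs = foldr (\x acc -> x + acc) 0 xs

-- Point-free: the same function as a combination of simpler ones,
-- without ever mentioning its argument.
sumPointFree :: [Int] -> Int
sumPointFree = foldr (+) 0

-- Composition instead of nested application.
countEvens :: [Int] -> Int
countEvens = length . filter even
```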

Frade MJ.  2004.  Type-Based Termination of Recursive Definitions and Constructor Subtyping in Typed Lambda Calculi. thesis.pdf

In type systems, a combination of subtyping and overloading is a way to achieve more precise typings. This thesis explores how to use these mechanisms in two directions: (i) as a way to ensure termination of recursive functions; (ii) as a way to capture in a type-theoretic context the use of subtyping as inclusion between inductively defined sets.
The first part of the thesis presents a mechanism that ensures termination through types and defines a system that incorporates it. More precisely, we formalize the notion of type-based termination using a restricted form of type dependency (also known as indexed types). Every datatype is replaced by a family of approximations indexed over a set of stages; then being in a certain approximation means that a term can be seen as having a certain bound on constructor usage. We introduce λˆ, a simply typed λ-calculus à la Curry, supporting parametric inductive datatypes, case-expressions and letrec-expressions with termination ensured by types. We show that λˆ enjoys important meta-theoretical properties, including confluence, subject reduction and strong normalization. We also show that the calculus is powerful enough to encode many recursive definitions rejected by existing type systems, and give some examples. We prove that this system encompasses in a strict way Giménez' λς, a system in which termination of typable expressions is ensured by a syntactical condition constraining the uses of recursive calls in the body of definitions.
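In λˆ-style notation (a sketch reconstructed from the abstract; the precise rules are those of the thesis), a stage ı bounds constructor usage, the successor stage ı̂ allows one more constructor, and recursion is only typed when recursive calls apply to strictly smaller stages:

```latex
% Stage-indexed approximations: Nat^i embeds into Nat^(i-hat), and the
% successor constructor raises the stage by one.
\[
\mathsf{Nat}^{\imath} \subseteq \mathsf{Nat}^{\hat{\imath}}
\qquad
\mathsf{s} : \mathsf{Nat}^{\imath} \rightarrow \mathsf{Nat}^{\hat{\imath}}
\]
% letrec is well-typed only if, assuming f works at stage i, the body
% handles stage i-hat, for a fresh stage variable i; the conclusion
% then holds at an arbitrary stage s.
\[
\frac{\Gamma,\; f : \mathsf{Nat}^{\imath} \rightarrow \sigma \;\vdash\;
      e : \mathsf{Nat}^{\hat{\imath}} \rightarrow \sigma
      \quad (\imath\ \text{fresh})}
     {\Gamma \;\vdash\; \mathsf{letrec}\; f = e \;:\;
      \mathsf{Nat}^{s} \rightarrow \sigma}
\]
```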
The second part of the thesis studies properties of a type system featuring constructor subtyping. Constructor subtyping is a form of subtyping in which an inductive type σ is viewed as a subtype of another inductive type τ if each constructor c of σ is also a constructor of τ (but τ may have more constructors), and whenever c : θ → σ is a declaration for σ, then c : θ′ → τ is a declaration for τ with θ ≤ θ′. In this thesis we allow for this form of subtyping in the system λcs, which is a simply typed λ-calculus à la Curry, supporting mutually recursive parametric datatypes, case-expressions and letrec-expressions. We establish the properties of confluence, subject reduction and decidability of type checking for this calculus. As the system features general recursion, the reduction calculus is obviously non-terminating. However, we sketch two ways of achieving strong normalization. One way is to constrain the system to guard-by-destructors recursion, following what is done for λς. The other way is to enrich the type system with stages (following the ideas presented for λˆ) and to enforce termination through typing.
Potential uses of constructor subtyping include proof assistants and functional programming languages. In particular, constructor subtyping provides a suitable foundation for extensible datatypes, and is especially well suited to reuse. The combination of subtyping between datatypes and overloading of constructors allows the definition of new datatypes by restricting or expanding the set of constructors of an already defined datatype. This flexibility in the definition of datatypes induces a convenient form of code reuse for recursive functions, allowing the definition of new functions by restricting or expanding already defined ones. We enrich a calculus featuring constructor subtyping with a mechanism to define extensible overloaded recursive functions by pattern-matching, obtaining the system λcs+fun. We formalize the concept of a well-formed environment of function declarations and establish that under such environments the properties of confluence, subject reduction and decidability of type checking hold. Moreover, we prove that the requirements imposed on well-formed environments are decidable, and show how standard techniques can still be used for compiling pattern-matching into case-expressions.
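The even and odd naturals are the classic example of constructor subtyping. Haskell has no such subtyping, so the sketch below merely mimics the inclusions Even ≤ Nat and Odd ≤ Nat with explicit coercions that λcs would make implicit (illustrative code, not the thesis calculus):

```haskell
-- Nat has constructors zero and successor; Even and Odd reuse them,
-- each constructor of Even/Odd being (morally) a constructor of Nat.
data Nat  = Zero | Succ Nat   deriving Show
data Even = EZero | ESucc Odd deriving Show
data Odd  = OSucc Even        deriving Show

-- The inclusions Even <= Nat and Odd <= Nat, written as coercions.
evenToNat :: Even -> Nat
evenToNat EZero     = Zero
evenToNat (ESucc o) = Succ (oddToNat o)

oddToNat :: Odd -> Nat
oddToNat (OSucc e) = Succ (evenToNat e)
```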

Pereira JO.  2002.  Semantically Reliable Group Communication. phd.pdf

Current usage of computers and data communication networks for a variety of daily tasks calls for widespread deployment of fault tolerance techniques with inexpensive off-the-shelf hardware and software. Group communication is in this context a particularly appealing technology, as it provides the application programmer with reliability guarantees that greatly simplify many fault tolerance techniques. It has however been reported that the performance of group communication toolkits in large and heterogeneous systems is frequently disappointing. Although this can be overcome by relaxing reliability guarantees, the resulting protocol is often much less useful than group communication, in particular for strongly consistent replication. The challenge is thus to relax reliability and still provide a convenient set of guarantees for fault tolerant programming. This thesis addresses models and mechanisms that, by selectively relaxing reliability guarantees, offer both the convenience of group communication for fault tolerant programming and high performance. The key to our proposal is to use knowledge about the semantics of the messages exchanged to determine which messages need to be reliably delivered, hence semantic reliability. In many applications, some messages implicitly convey or overwrite other messages sent shortly before, making them obsolete while still in transit. By omitting only the delivery of obsolete messages, performance can be improved without impact on the correctness of the application. Specifications and algorithms for a complete semantically reliable group communication protocol suite are introduced, encompassing ordered and view-synchronous multicast. The protocols are then evaluated with analytical and simulation models and with a prototype implementation. The discussion of a concrete application illustrates the resulting programming interface and performance.
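A minimal sketch of the obsolescence idea only (not the thesis protocol suite; the key-based obsolescence relation here is a simple special case): in a send buffer, a newer message that overwrites the same key makes the buffered one obsolete, so only the newest update per key still needs reliable delivery.

```haskell
import qualified Data.Map as Map

type Key      = String
type Buffer a = Map.Map Key a

-- Enqueue an update; a buffered message with the same key becomes
-- obsolete and is replaced instead of ever being transmitted.
enqueue :: Key -> a -> Buffer a -> Buffer a
enqueue = Map.insert

main :: IO ()
main = print (enqueue "stock/ACME" 11    -- illustrative key
               (enqueue "stock/ACME" 10 Map.empty))
```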

Campos JC.  1999.  Automated Deduction and Usability Reasoning. ycst-2000-09.pdf

Reasoning about the usability of a given interactive system's design is a difficult task. However, it is a task that must be performed if downstream costs with extensive redesign are to be avoided. Traditional usability testing alone cannot avoid these costs, since it is too often performed late in the development life cycle. In recent years, approaches have been put forward that attempt to reason about the usability of a design from early in development. Mainstream approaches are based on some kind of (more or less structured) inspection of a design by usability experts. This type of approach breaks down for systems with large and complex user interfaces, and again extensive testing is required. In an attempt to deal with these issues, there have been efforts to apply software engineering reasoning tools to the development of interactive systems. The article reviews this work and puts forward some ideas for the future.

Pereira JO.  1998.  Comunicação Fiável em Sistemas Distribuídos em Grande Escala. tese.pdf

Reliable broadcast protocols in process groups are a programming tool used in dependable distributed systems to isolate each application from the complexity arising from communication and from the various faults that can occur in the system. This is achieved by offering applications the abstraction of a broadcast channel with very strong quality-of-service guarantees, which is enough to ensure the correctness of a broad spectrum of applications with minimal effort from the programmer.
Implementing reliable broadcast protocols over large-scale networks introduces an additional level of complexity, since the characteristics of these networks make the correct and efficient implementation of services with very strong guarantees difficult.
Several proposals try to overcome this problem by providing services that offer fewer quality-of-service guarantees but show higher performance in the situations where they are usable. However, the fewer guarantees are offered, the fewer applications can be correctly supported, since different applications tolerate different relaxations of the quality of service.
This thesis proposes a reliable broadcast protocol that can be configured for each application, so that it can be simultaneously efficient and correct because it is tailored to the application at hand.
To this end, the specification of broadcast protocols is first decomposed into a set of parameters that can be applied separately to each message of a session, and which can be combined into protocol specifications suited to each application.
An open, object-oriented implementation strategy is then proposed, which allows protocols to be composed from independent per-message modules corresponding to the specification parameters.
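A minimal sketch of the per-message parameterization (illustrative names, not the thesis notation): each message of a session carries its own quality-of-service choices, and a configured protocol is just a policy assigning such a choice to every message.

```haskell
-- Illustrative per-message specification parameters.
data Reliability = BestEffort | Reliable deriving Show
data Order       = Unordered | Fifo | Total deriving Show

data MsgSpec = MsgSpec { reliability :: Reliability
                       , order       :: Order } deriving Show

-- A configurable protocol specification: a policy mapping each
-- application message to the guarantees it actually needs.
type Policy msg = msg -> MsgSpec
```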

Almeida PS.  1998.  Control of object sharing in programming languages. psa-phd-thesis.pdf

Current data abstraction mechanisms are not adequate to control sharing of state in the general case involving objects in linked structures. They only prevent direct access to the state variables of single objects, as opposed to considering the state reachable from an object and the inter-object references, neglecting the fact that an object is not, in general, self-contained. The pervasive possibility of sharing is a source of errors and an obstacle both to reasoning about programs and to language implementation techniques.
This thesis presents balloon types, a general extension to programming languages which makes the ability to share state a first class property of a data type, resolving a long-standing flaw in existing data abstraction mechanisms. Balloon types provide the balloon invariant, which expresses a strong form of encapsulation of state: it is guaranteed that no state reachable (directly or transitively) by an object of a balloon type is referenced by any `external' object.
The mechanism is syntactically very simple, relying on a non-trivial static analysis to perform checking. The static analysis is presented as an abstract interpretation based on a denotational semantics of a simple imperative first-order language with constructs for creating and manipulating objects.
Balloon types are applicable in a wide range of areas such as program transformation, memory management and distributed systems. They are the key to obtaining self-contained composite objects, truly opaque data abstractions and value types---important concepts for the development of large scale, provably correct programs.
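A minimal sketch of the balloon invariant itself, checked dynamically over an explicit object graph (the thesis enforces it statically through abstract interpretation; all names here are illustrative): no object outside the balloon may reference any state reachable from the balloon object, other than the balloon object itself.

```haskell
import qualified Data.Set as Set

type Ref  = Int
type Heap = [(Ref, [Ref])]  -- each object and the objects it references

refsOf :: Heap -> Ref -> [Ref]
refsOf h r = concat [rs | (o, rs) <- h, o == r]

-- Everything reachable from r, including r itself.
reachable :: Heap -> Ref -> Set.Set Ref
reachable h = go Set.empty
  where
    go seen x
      | x `Set.member` seen = seen
      | otherwise = foldl go (Set.insert x seen) (refsOf h x)

-- Balloon invariant for balloon object b: no external object may
-- reference the internal state (reachable from b, excluding b).
balloonInvariant :: Heap -> Ref -> Bool
balloonInvariant h b =
  let internal = Set.delete b (reachable h b)
      external = [o | (o, _) <- h, not (o `Set.member` reachable h b)]
  in all (\o -> all (`Set.notMember` internal) (refsOf h o)) external
```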

Frade MJ.  1995.  Comportamento e Estado. compest.pdf

In this work we introduce a formalism for specifying the behaviour of computational systems, based on logic.
Our basic unit of software specification is called an agent, and is nothing more than a particular logical system. The agent's logical language allows us to express the relationship between the occurrence of events and the properties an agent exhibits. Events are seen as logical connectives, and the logical assertions reflect the causality between events and properties: the essential cause of a property in the face of an event.
Because we are interested in knowing which sequence of events "justifies" a property, we have traces coming from the past to the present. Our perspective is that of knowing which behaviour could justify the characteristics of an agent at some instant. Obviously, these characteristics will condition the future behaviour of the agent. This dependency is expressed in a deduction system that reflects the notion of causality.
To characterize the agents semantically, a connection is made between the logical system and its topological models. Stone duality provides a junction between the logical system and its semantics, offering two alternative perspectives: the algebraic or logical perspective, and the topological perspective.
We present two types of models for the agents: the trace model, a denotational model intimately connected to the notions of event and trace, and the state model, an operational model based on the notions of state and state transition.
We establish a relation between these two types of models: we see how a state model may be obtained from a trace model. States arise as equivalence classes of past behaviours. We also study the properties of the state models which result from the conversion of trace models. We see how topological properties, such as separation conditions, sobriety and coherence, characterize the state space of these models, and how from behaviour we can generate a minimal state space.

Campos JC.  1993.  GAMA-X - Geração Semi-Automática de Interfaces Sensíveis ao Contexto. mestrado.pdf

The separation of interactive systems into a computational component and a dialogue component made possible the appearance of technology specifically aimed at the implementation of user interfaces. However, despite the large amount of technology available, building an interface is still a difficult and time-consuming process. This situation gave rise to the study and development of methodologies for formally specifying human-computer dialogue, aiming at a faster and safer development of that layer. The interface of an interactive system should not be merely a passive transmission medium between the user and the application, but a reflection of the application, letting the user know its current state and thus easing the dialogue with it. The overly rigid separation between the computational layer and the dialogue layer proposed by many models does not allow the development of interfaces of this kind, since the semantic information is encapsulated in the computational component. Based on these observations, a dialogue specification formalism, Interaction Scripts (Guiões de Interacção), was studied and implemented; it allows the inclusion of semantic conditions in the specification, making it possible to clearly express and differentiate the dynamic components of the dialogue, in particular the application's context-sensitive dialogue. The use of a CCS-like notation allows the specification of concurrent dialogues, with Petri nets used to give them semantics and as an implementation model for Interaction Scripts. The architecture of the GAMA-X system, a UIMS for the CAMILA language based on Interaction Scripts, is also presented, and its runtime component is implemented, thereby demonstrating the validity of Interaction Scripts as a formalism for dialogue specification.