Recent Publications

Jesus P, Moreno CB, Almeida PS.  2015.  Flow updating: Fault-tolerant aggregation for dynamic networks. Journal of Parallel and Distributed Computing. 78:53-64.

Data aggregation is a fundamental building block of modern distributed systems. Averaging-based approaches, commonly designated gossip-based, are an important class of aggregation algorithms as they allow all nodes to produce a result, converge to any required accuracy, and work independently from the network topology. However, existing approaches exhibit many dependability issues when used in faulty and dynamic environments. This paper describes and evaluates a fault-tolerant distributed aggregation technique, Flow Updating, which overcomes the problems in previous averaging approaches and is able to operate on faulty dynamic networks. Experimental results show that this novel approach outperforms previous averaging algorithms: it self-adapts to churn and input value changes without requiring any periodic restart, tolerates node crashes and high levels of message loss, and works in asynchronous networks. Realistic concerns, such as the use of unreliable failure detectors and asynchrony, have been taken into account in evaluating Flow Updating, targeting its application to realistic environments.
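
For context, the sketch below illustrates the baseline class of averaging (gossip-based) aggregation algorithms that Flow Updating improves upon: each node repeatedly averages its estimate with that of a random neighbour, so all estimates converge to the global average. This is a minimal illustration of plain pairwise gossip, not of Flow Updating itself; all names and parameters are illustrative.

```python
import random

def gossip_average(values, neighbours, rounds=100):
    """Plain pairwise-averaging gossip (the baseline class Flow Updating improves on).

    values:     dict mapping node id -> initial input value
    neighbours: dict mapping node id -> list of neighbouring node ids
    Returns the per-node estimates after the given number of rounds; in a
    connected, fault-free network every estimate converges to the global average.
    """
    est = dict(values)  # each node starts by estimating its own input
    for _ in range(rounds):
        node = random.choice(list(est))
        if not neighbours[node]:
            continue
        peer = random.choice(neighbours[node])
        # Pairwise averaging conserves the global sum ("mass"); this is exactly
        # the invariant that message loss or crashes break in this naive scheme.
        avg = (est[node] + est[peer]) / 2
        est[node] = est[peer] = avg
    return est

# Example: a line network 0 - 1 - 2 with inputs 0, 0, 9 (true average = 3).
print(gossip_average({0: 0.0, 1: 0.0, 2: 9.0},
                     {0: [1], 1: [0, 2], 2: [1]}, rounds=500))
```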

Muschevici R, Proença J, Clarke D.  2015.  Feature Nets: behavioural modelling of software product lines. Software & Systems Modeling. 14(3):1-26.

Software product lines (SPLs) are diverse systems that are developed using a dual engineering process: (a) family engineering defines the commonality and variability among all members of the SPL, and (b) application engineering derives specific products based on the common foundation combined with a variable selection of features. The number of derivable products in an SPL can thus be exponential in the number of features. This inherent complexity poses two main challenges when it comes to modelling: firstly, the formalism used for modelling SPLs needs to be modular and scalable; secondly, it should ensure that all products behave correctly by providing the ability to analyse and verify complex models efficiently. In this paper, we propose to integrate an established modelling formalism (Petri nets) with the domain of software product line engineering. To this end, we extend Petri nets to Feature Nets. While Petri nets provide a framework for formally modelling and verifying single software systems, Feature Nets offer the same sort of benefits for software product lines. We show how SPLs can be modelled in an incremental, modular fashion using Feature Nets, provide a Feature Nets variant that supports modelling dynamic SPLs, and propose an analysis method for SPLs modelled as Feature Nets. By facilitating the construction of a single model that includes the various behaviours exhibited by the products in an SPL, we make a significant step towards efficient and practical quality assurance methods for software product lines.
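
To make the modelling idea concrete, the following minimal sketch (ours, not the paper's formalisation) shows the essence of a Feature Net: an ordinary Petri-net transition additionally guarded by an application condition over the product's selected features, so a single net captures the behaviour of every product in the line. The coffee-machine example and the "Milk" feature are hypothetical.

```python
def enabled(marking, transition, features):
    """A transition can fire only if (a) its input places hold enough tokens and
    (b) its application condition -- a predicate over the product's feature
    selection -- holds for the chosen product."""
    if not transition["condition"](features):
        return False
    return all(marking.get(p, 0) >= n for p, n in transition["consume"].items())

def fire(marking, transition):
    """Standard Petri-net firing: consume input tokens, produce output tokens."""
    new = dict(marking)
    for p, n in transition["consume"].items():
        new[p] = new.get(p, 0) - n
    for p, n in transition["produce"].items():
        new[p] = new.get(p, 0) + n
    return new

# Hypothetical coffee-machine SPL: adding milk is only possible in products
# that select the "Milk" feature.
add_milk = {
    "condition": lambda fs: "Milk" in fs,
    "consume": {"coffee_ready": 1},
    "produce": {"coffee_with_milk": 1},
}
marking = {"coffee_ready": 1}
if enabled(marking, add_milk, features={"Milk"}):
    marking = fire(marking, add_milk)
print(marking)  # {'coffee_ready': 0, 'coffee_with_milk': 1}
```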

Harrison M, Campos JC, Masci P.  2015.  Reusing models and properties in the analysis of similar interactive devices. Innovations in Systems and Software Engineering. 11:95-111.

The paper is concerned with the comparative analysis of interactive devices. It compares two devices by systematically checking a set of template properties that are designed to explore important interface characteristics. The two devices are designed to support similar tasks in a clinical setting. The devices differ as a result of judgements based on a range of considerations including software. Variations between designs are often relatively subtle and do not always become evident through even relatively thorough user testing. Notwithstanding their subtlety, these differences may be important to the safety or usability of the device. The illustrated approach uses formal techniques to provide the analysis, which means that a similar analysis can be applied systematically.

Murta D, Oliveira JN.  2015.  A study of risk-aware program transformation. Science of Computer Programming. 110:51-77.

In the trend towards tolerating hardware unreliability, accuracy is exchanged for cost savings. Running on less reliable machines, functionally correct code becomes risky and one needs to know how risk propagates so as to mitigate it. Risk estimation, however, seems to live outside the average programmer's technical competence and core practice. In this paper we propose that program design by source-to-source transformation be risk-aware in the sense of making probabilistic faults visible and supporting equational reasoning, à la the Bird-de Moor algebra of programming, on the probabilistic behaviour of programs caused by faults. This paper studies, in particular, the propagation of faults across standard program transformation techniques known as tupling and fusion, enabling the fault of the whole to be expressed in terms of the faults of its parts.
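
For readers unfamiliar with the two transformations named above, the sketch below shows textbook instances of tupling (combining two traversals into one) and fusion (composing two maps into one). It is only an illustration of the transformations whose fault propagation the paper studies, not of the paper's probabilistic calculus; all names are ours.

```python
def sum_and_count(xs):
    """Tupling: the separate recursions for sum(xs) and len(xs) are combined
    into a single pass that returns the pair."""
    total, count = 0, 0
    for x in xs:          # one traversal instead of two
        total, count = total + x, count + 1
    return total, count

def fused_map(f, g, xs):
    """Fusion: map(f, map(g, xs)) is replaced by a single map of the composed
    function, removing the intermediate list."""
    return [f(g(x)) for x in xs]

print(sum_and_count([3, 1, 4, 1, 5]))              # (14, 5)
print(fused_map(lambda x: x + 1, abs, [-2, 3]))    # [3, 4]
```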

Castro N, Azevedo PJ.  2015.  Automatically estimating iSAX parameters. Intelligent Data Analysis. 19(3).

The indexable Symbolic Aggregate Approximation (iSAX) is widely used in time series data mining. Its popularity arises from the fact that it greatly reduces time series size, is symbolic, allows lower bounding, and is space efficient. However, it requires setting two parameters, the symbolic word length and the alphabet size, which limits the applicability of the technique. The optimal parameter values are highly application-dependent. Typically, they are either set to a fixed value or experimentally probed for the best configuration. In this work we propose an approach to automatically estimate iSAX's parameters. The approach, AutoiSAX, not only discovers the best parameter setting for each time series in the database, but also finds the alphabet size for each iSAX symbol within the same word. It is based on simple and intuitive ideas from time series complexity and statistics. The technique can be smoothly embedded in existing data mining tasks as an efficient sub-routine. We analyze its impact on visualization interpretability, classification accuracy and motif mining. Our contribution aims to make iSAX a more general approach as it evolves towards a parameter-free method.
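
As a reminder of what the two estimated parameters control, here is a plain SAX discretisation sketch (z-normalisation, piecewise aggregate approximation, then symbol assignment via equiprobable breakpoints of the standard normal). It is not the AutoiSAX estimation procedure described in the paper, and the function and parameter names are ours.

```python
from statistics import NormalDist, mean, stdev

def sax_word(series, word_length, alphabet_size):
    """Plain SAX discretisation, showing what the two parameters that AutoiSAX
    estimates actually control: word_length (number of PAA segments) and
    alphabet_size (number of symbols)."""
    # 1. z-normalise the series
    mu, sigma = mean(series), stdev(series)
    z = [(x - mu) / sigma for x in series]
    # 2. Piecewise Aggregate Approximation: mean of each of word_length segments
    n = len(z)
    paa = [mean(z[i * n // word_length:(i + 1) * n // word_length])
           for i in range(word_length)]
    # 3. Map each segment mean to a symbol using equiprobable breakpoints of
    #    the standard normal distribution
    breakpoints = [NormalDist().inv_cdf(k / alphabet_size)
                   for k in range(1, alphabet_size)]
    return "".join(chr(ord("a") + sum(v > b for b in breakpoints)) for v in paa)

print(sax_word([1.0, 1.2, 0.9, 3.1, 3.0, 2.8, 0.2, 0.1],
               word_length=4, alphabet_size=4))
```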

Azevedo PJ, Magalhães A.  2015.  Contrast set mining in temporal databases. Expert Systems - The Journal of Knowledge Engineering. 32(3):435-443.

Understanding the underlying differences between groups or classes in certain contexts can be of the utmost importance. Contrast set mining relies on discovering significant patterns by contrasting two or more groups. A contrast set is a conjunction of attribute–value pairs that differ meaningfully in its distribution across groups. A previously proposed technique, rules for contrast sets, seeks to express each contrast set found in terms of rules. This work extends rules for contrast sets to a temporal data mining task. We define a set of temporal patterns in order to capture the significant changes in the contrasts discovered along the considered timeline. To evaluate the proposal's accuracy and ability to discover relevant information, two different real-life data sets were studied using this approach.
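
The core notion can be illustrated with a deliberately naive sketch: enumerate small conjunctions of attribute-value pairs and keep those whose support differs across two groups by more than a threshold. This omits the statistical significance testing of real contrast-set miners and the temporal patterns introduced in the paper; the names, threshold and data are hypothetical.

```python
from itertools import combinations

def contrast_sets(groups, min_diff=0.3, max_size=2):
    """Naive contrast-set search: report conjunctions of attribute=value pairs
    whose support differs between two groups by at least min_diff.
    groups: dict of group name -> list of records (each record is a dict)."""
    def support(records, cset):
        hits = sum(all(r.get(a) == v for a, v in cset) for r in records)
        return hits / len(records)

    (_, recs_a), (_, recs_b) = groups.items()
    items = {(a, v) for r in recs_a + recs_b for a, v in r.items()}
    found = []
    for size in range(1, max_size + 1):
        for cset in combinations(sorted(items), size):
            diff = support(recs_a, cset) - support(recs_b, cset)
            if abs(diff) >= min_diff:
                found.append((cset, round(diff, 2)))
    return found

# Tiny hypothetical example: two groups of records with two attributes each.
groups = {
    "treated": [{"smoker": "no", "outcome": "good"}] * 7
             + [{"smoker": "yes", "outcome": "poor"}] * 3,
    "control": [{"smoker": "no", "outcome": "good"}] * 3
             + [{"smoker": "yes", "outcome": "poor"}] * 7,
}
print(contrast_sets(groups))  # e.g. ('smoker', 'no') has support 0.7 vs 0.3
```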

Moreira JM, Cunha A, Macedo N.  2015.  An ORCID based synchronization framework for a national CRIS ecosystem. F1000Research. 4(181).

PTCRIS (Portuguese Current Research Information System) is a program aiming at the creation and sustained development of a national integrated information ecosystem, to support research management according to the best international standards and practices. This paper reports on the experience of designing and prototyping a synchronization framework for PTCRIS based on ORCID (Open Researcher and Contributor ID). This framework embraces the "input once, re-use often" principle, and will enable a substantial reduction of the research output management burden by allowing automatic information exchange between the various national systems.
The design of the framework followed best practices in rigorous software engineering, namely well-established principles in the research field of consistency management, and relied on formal analysis techniques and tools for its validation and verification.
The notion of consistency between the services was formally specified and discussed with the stakeholders before the technical aspects on how to preserve said consistency were explored. Formal specification languages and automated verification tools were used to analyze the specifications and generate usage scenarios, useful for validation with the stakeholders and essential to certify compliant services.

Barbosa LS, Oliveira N, Rodrigues F.  2015.  Towards an engine for coordination-based architectural reconfigurations. Computer Science and Information Systems. 12(2):607-634.

Software reconfigurability became increasingly relevant to the architectural process due to the growing dependency of modern societies on reliable and adaptable systems. Such systems are supposed to adapt themselves to surrounding environmental changes with minimal service disruption, if any. This paper introduces an engine that statically applies reconfigurations to (formal) models of software architectures. Reconfigurations are specified using a domain-specific language, ReCooPLa, which targets the manipulation of software coordination structures, typically used in service-oriented architectures (SOA). The engine is responsible for the compilation of ReCooPLa instances and their application to the relevant coordination structures. The resulting configurations are amenable to formal analysis of qualitative and quantitative (probabilistic) properties.

Moreno CB, Jesus P, Almeida PS.  2015.  A Survey of Distributed Data Aggregation Algorithms. IEEE Communications Surveys and Tutorials. 17(1):381-404.

Distributed data aggregation is an important task, allowing the decentralized determination of meaningful global properties, which can then be used to direct the execution of other applications. The resulting values are derived by the distributed computation of functions like Count, Sum, and Average. Application examples include determining the network size, total storage capacity, average load, majorities, and many other properties. In the last decade, many different approaches have been proposed, with different trade-offs in terms of accuracy, reliability, and message and time complexity. Due to the considerable amount and variety of aggregation algorithms, it can be difficult and time consuming to determine which techniques will be more appropriate to use in specific settings, justifying the existence of a survey to aid in this task. This work reviews the state of the art on distributed data aggregation algorithms, providing three main contributions. First, it formally defines the concept of aggregation, characterizing the different types of aggregation functions. Second, it succinctly describes the main aggregation techniques, organizing them in a taxonomy. Finally, it provides some guidelines toward the selection and use of the most relevant techniques, summarizing their principal characteristics.

Vilaça R, Jimenez-Peris R, Patiño-Martínez M, Kemme B, Brondino I, Pereira JO, Cruz F, Oliveira R, Ahmad MY.  2015.  CumuloNimbo: A Cloud Scalable Multi-tier SQL Database. IEEE Data Engineering Bulletin. 38(1):73-83.

This article presents an overview of the CumuloNimbo platform. CumuloNimbo is a framework for multi-tier applications that provides scalable and fault-tolerant processing of OLTP workloads. Its main novelty is that it provides a standard SQL interface and full transactional support without resorting to sharding and without requiring advance knowledge of the workload. Scalability is achieved by distributing request execution and transaction control across many compute nodes while data is persisted in a distributed data store.

Silva A, Jacobs B, Sokolova A.  2015.  Trace semantics via determinization. Journal of Computer and System Sciences. 81(5):859-879.

This paper takes a fresh look at the topic of trace semantics in the theory of coalgebras. The first development of coalgebraic trace semantics used final coalgebras in Kleisli categories, stemming from an initial algebra in the underlying category. This approach requires some non-trivial assumptions, like dcpo enrichment, which do not always hold, even in cases where one can reasonably speak of traces (like for weighted automata). More recently, it has been noticed that trace semantics can also arise by first performing a determinization construction. In this paper, we develop a systematic approach, in which the two approaches correspond to different orders of composing a functor and a monad, and accordingly, to different distributive laws. The relevant final coalgebra that gives rise to trace semantics does not live in a Kleisli category, but more generally, in a category of Eilenberg-Moore algebras. In order to exploit its finality, we identify an extension operation that changes the state space of a coalgebra into a free algebra, which abstractly captures determinization of automata. Notably, we show that the two different views on trace semantics are equivalent, in the examples where both approaches are applicable.
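
A concrete touchstone for the abstract construction is the classical subset construction for nondeterministic automata, the instance that the generic determinization specialises to in the powerset case: after determinizing, two states accept the same traces exactly when the corresponding subset states are related by bisimilarity. The sketch below shows only this classical construction; identifiers are illustrative.

```python
def determinize(delta, accepting, initial):
    """Classical subset construction for a nondeterministic finite automaton.
    delta:     dict mapping (state, letter) -> set of successor states
    accepting: set of accepting NFA states
    initial:   set of initial NFA states
    Returns the DFA transition table over subset states, the accepting subset
    states, and the initial subset state; accepted traces are preserved."""
    alphabet = {letter for (_, letter) in delta}
    start = frozenset(initial)
    dfa, final, todo = {}, set(), [start]
    while todo:
        state = todo.pop()
        if state in dfa:
            continue
        if state & accepting:
            final.add(state)
        dfa[state] = {}
        for letter in alphabet:
            succ = frozenset(q for p in state for q in delta.get((p, letter), ()))
            dfa[state][letter] = succ
            todo.append(succ)
    return dfa, final, start

# NFA over {a, b} accepting the words that end in "ab".
delta = {("q0", "a"): {"q0", "q1"}, ("q0", "b"): {"q0"}, ("q1", "b"): {"q2"}}
dfa, final, start = determinize(delta, accepting={"q2"}, initial={"q0"})
state = start
for letter in "aab":          # run the determinized automaton on the word "aab"
    state = dfa[state][letter]
print(state in final)         # True: "aab" ends in "ab"
```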

Furniss D, Masci P, Curzon P, Mayer A, Blandford A.  2015.  Exploring medical device design and use through layers of Distributed Cognition: How a glucometer is coupled with its context. Journal of Biomedical Informatics. 53:330-341. https://doi.org/10.1016/j.jbi.2014.12.006

Medical devices are becoming more interconnected and complex, and are increasingly supported by fragmented organizational systems, e.g. through different processes, committees, supporting staff and training regimes. Distributed Cognition has been proposed as a framework for understanding the design and use of medical devices. However, it is not clear that it has the analytic apparatus to support the investigation of such complexities. This paper proposes a framework that introduces concentric layers to DiCoT, a method that facilitates the application of Distributed Cognition theory. We use this to explore how an inpatient blood glucose meter is coupled with its context. The analysis is based on an observational study of clinicians using a newly introduced glucometer on an oncology ward over approximately 150 hours (11 days and 4 nights). Using the framework we describe the basic mechanics of the system, incremental design considerations, and larger design considerations. The DiCoT concentric layers (DiCoT-CL) framework shows promise for analyzing the design and use of medical devices, and how they are coupled with their context.

Masci P, Curzon P, Furniss D, Blandford A.  2015.  Using PVS to support the analysis of distributed cognition systems. Innovations in Systems and Software Engineering. 11(2):113-130. https://doi.org/10.1007/s11334-013-0202-2

The rigorous analysis of socio-technical systems is challenging, because people are inherent parts of the system, together with devices and artefacts. In this paper, we report on the use of PVS as a way of analysing such systems in terms of Distributed Cognition. Distributed Cognition is a conceptual framework that allows us to derive insights about plausible user trajectories in socio-technical systems by exploring what information in the environment provides resources for user action, but its application has traditionally required substantial craft skill. DiCoT adds structure and method to the analysis of socio-technical systems from a Distributed Cognition perspective. In this work, we demonstrate how PVS can be used with DiCoT to conduct a systematic analysis. We illustrate how a relatively simple use of PVS can help a field researcher (i) externalise assumptions and facts, (ii) verify the consistency of the logical argument framed in the descriptions, (iii) help uncover latent situations that may warrant further investigation, and (iv) verify conjectures about potential hazards linked to the observed use of information resources. Evidence is also provided that formal methods and empirical studies are not alternative approaches for studying a socio-technical system, but that they can complement and refine each other. The combined use of PVS and DiCoT is illustrated through a case study concerning a real-world emergency medical dispatch system.

Masci P, Rukšėnas R, Oladimeji P, Cauchi A, Gimblett A, Li Y, Curzon P, Thimbleby H.  2015.  The benefits of formalising design guidelines: a case study on the predictability of drug infusion pumps. Innovations in Systems and Software Engineering. 11:73-93. https://doi.org/10.1007/s11334-013-0200-4

A demonstration is presented of how automated reasoning tools can be used to check the predictability of a user interface. Predictability concerns the ability of a user to determine the outcomes of their actions reliably. It is especially important in situations such as a hospital ward where medical devices are assumed to be reliable devices by their expert users (clinicians) who are frequently interrupted and need to quickly and accurately continue a task. There are several forms of predictability. A definition is considered where information is only inferred from the current perceptible output of the system. In this definition, the user is not required to remember the history of actions that led to the current state. Higher-order logic is used to specify predictability, and the Symbolic Analysis Laboratory (SAL) is used to automatically verify predictability on real interactive number entry systems of two commercial drug infusion pumps -- devices used in the healthcare domain to deliver fluids (e.g., medications, nutrients) into a patient's body in controlled amounts. Areas of unpredictability are precisely identified with the analysis. Verified solutions that make an unpredictable system predictable are presented, through design modifications and verified user strategies that mitigate the identified issues.

Madeira A, Martins MA, Barbosa LS, Hennicker R.  2015.  Refinement in hybridised institutions. Formal Aspects of Computing. 27(2):375-395.