Causality is Simple

10/12/2016

By Carlos Baquero, HASLab, INESC TEC & University of Minho.

Abstract. Causality is an essential component of how we make sense of the physical world, and of our relations to other humans. If I put a cup on the table, and look back at it, I expect it to be there. I also expect to get a reply to my postcards after I send them, and not before. These days hardly any service can claim not to have some form of distributed algorithm at its core. In a distributed scenario, if we are not careful, it is very easy to break the causal sense of things. In a key-value store my writes can be directed to one replica, and my subsequent reads served from an outdated one: my cup might not be there when I look back. Message dissemination middleware might not always provide the ordering I expect: I might receive some replies before the questions that led to them. Luckily, most of these problems were already there 30 years ago, although on a much smaller scale, and many techniques have been developed to keep track of causality and make sense of the complex interactions in modern systems. However, developers often look at techniques such as replica synchronization with version vectors, or causal broadcasting algorithms, as black boxes, or as complex sets of rules that have to be followed and not questioned. This talk will focus on bringing back the intuition on causality, and will show that keeping some simple concepts in mind makes it possible to understand how version vectors and vector clocks work, where they differ, and how to use more sophisticated mechanisms to handle millions of concurrent clients in modern distributed data stores.

Keywords. Distributed Systems, Causality, Time.

About the speaker. Carlos Baquero is an Assistant Professor at the University of Minho and a Senior Researcher at the High Assurance Laboratory within INESC TEC. He teaches several courses in the area of Distributed Systems. He is one of the inventors of Conflict-free Replicated Data Types (CRDTs). In the 90s, motivated by mobile computing and disconnected operation for file systems, he studied data types with merge (and fork) operations over semi-lattices, a precursor to state-based CRDTs. He is interested in causality and in distributed aggregation algorithms, and especially likes all things distributed that eventually merge. Carlos has published dozens of papers in reputable conferences and journals on version vectors, distributed data, aggregation, and Bloom filters. He has chaired and served as a TPC member for many conferences and workshops, has supervised over ten PhD and MSc students, and has been invited to give several talks at academic and industrial events. He is currently the leader of the NORTE 2020 TEC4Growth SMILES project at INESC TEC.

LOCATION AND TIME

Address: University of Minho, Gualtar Campus, Braga, Portugal

Building: Departamento de Informática, Building 07

Coffee session: 1:30 PM to 2:00 PM, Sala de Estar, 4th floor

Talk session: 2:00 PM to 3:00 PM, Auditório A2, 1st floor
