Spreadsheet Engineering

2/24/2014
EECS Colloquium - Oregon State University

Spreadsheets play a pivotal role in modern society. They are inherently multi-purpose and widely used, both by individuals to cope with simple needs and by large companies as integrators of complex systems and as support for business decisions. Spreadsheets have probably passed the point of no return in terms of importance: it is estimated that 95% of all U.S. firms use them for financial reporting, 90% of all analysts in industry perform calculations in spreadsheets, and 50% of all spreadsheets are the basis for decisions. This importance, however, has not been matched by effective mechanisms for error prevention.

To aid in error prevention, in this talk we present a methodology for spreadsheet engineering. First, we present data mining and database techniques to reason about spreadsheet data. These techniques are used to compute relationships between spreadsheet elements (cells, columns, and rows). These relationships are then used to infer a model defining the business logic of the spreadsheet. Such a model of spreadsheet data is a visual domain-specific language that we embed in a well-known spreadsheet system. The embedded model is the building block for model-driven spreadsheet development, where advanced techniques guarantee model-instance synchronization. In this model-driven environment, any user data update has to follow the model-instance conformance relation, thus guiding spreadsheet users to introduce correct data. Bidirectional transformations are used to synchronize models and instances after users update or evolve either the model or the instance.
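
As a rough illustration of the pipeline described above, the following Python sketch infers relationships between columns and then checks a user update against them. It assumes the relationships are functional dependencies between single columns, which is one common database-style choice; the abstract does not say which relationships the talk actually uses. The function names (`holds`, `infer_dependencies`, `conforms`) and the sample rows are hypothetical and stand in for the inferred model and the spreadsheet instance.

```python
from itertools import permutations

def holds(rows, lhs, rhs):
    """Return True if column `lhs` functionally determines column `rhs`."""
    seen = {}
    for row in rows:
        key, value = row[lhs], row[rhs]
        # The first value seen for a key is recorded; any later mismatch breaks the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True

def infer_dependencies(rows, columns):
    """Assumed stand-in for model inference: all single-column dependencies that hold."""
    return [(a, b) for a, b in permutations(columns, 2) if holds(rows, a, b)]

def conforms(rows, new_row, dependencies):
    """Check that adding `new_row` keeps every inferred dependency valid."""
    return all(holds(rows + [new_row], a, b) for a, b in dependencies)

# Made-up sample data: each dict is one spreadsheet row.
rows = [
    {"client": "C1", "city": "Braga", "item": "pen"},
    {"client": "C2", "city": "Porto", "item": "pen"},
    {"client": "C1", "city": "Braga", "item": "ink"},
]
deps = infer_dependencies(rows, ["client", "city", "item"])

good_update = {"client": "C3", "city": "Lisboa", "item": "pen"}
bad_update = {"client": "C1", "city": "Porto", "item": "cap"}
```

Here `conforms(rows, good_update, deps)` accepts the new row, while `conforms(rows, bad_update, deps)` rejects it because client C1 is already associated with a different city. A model-driven environment would use a check of this kind to guide users toward correct data, although the real system presumably relies on a richer model than pairwise column dependencies.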