Data Science Journal (Jul 2013)
Data Provenance and Trust
Abstract
The Oxford Dictionary defines provenance as “the place of origin, or earliest known history of something.” Applied to digital artefacts, the term has taken on a more general meaning: it refers not only to the origin of an artefact but also to its changes over time. By changes we mean not only the artefact's successive digital snapshots but also the processes that caused and materialised each change. As an example, consider a database record r created at time t0; an update u to that record at time t1 causes it to have the value r′. In terms of provenance, we want to record not only the snapshots (t0, r) and (t1, r′) but also the transformation u that, when applied to (t0, r), results in (t1, r′), that is, u(t0, r) = (t1, r′).
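The database example can be sketched in a few lines of Python. This is only an illustrative sketch, not a method described in the article: the names `Snapshot`, `ProvenanceEntry`, `ProvenanceLog`, and `add_interest` are hypothetical, and the sketch assumes that an update u can be modelled as a function from one snapshot to the next and identified by a name.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional


@dataclass(frozen=True)
class Snapshot:
    """A record value r observed at time t, i.e. the pair (t, r)."""
    time: int
    value: Any


@dataclass(frozen=True)
class ProvenanceEntry:
    """One snapshot plus the name of the transformation that produced it."""
    snapshot: Snapshot
    produced_by: Optional[str]  # None for the initial creation of the record


@dataclass
class ProvenanceLog:
    """Append-only history of a single record (hypothetical structure)."""
    entries: List[ProvenanceEntry] = field(default_factory=list)

    def create(self, t0: int, r: Any) -> Snapshot:
        """Record the initial snapshot (t0, r)."""
        snap = Snapshot(t0, r)
        self.entries.append(ProvenanceEntry(snap, None))
        return snap

    def apply(self, name: str, u: Callable[[Snapshot], Snapshot]) -> Snapshot:
        """Apply u to the latest snapshot and record both the result and u's name."""
        if not self.entries:
            raise ValueError("the record must be created before it is updated")
        before = self.entries[-1].snapshot
        after = u(before)  # u(t0, r) = (t1, r')
        if after.time <= before.time:
            raise ValueError("an update must move the record forward in time")
        self.entries.append(ProvenanceEntry(after, name))
        return after


# Hypothetical update u: one time step later, the value has grown by 5.
def add_interest(s: Snapshot) -> Snapshot:
    return Snapshot(s.time + 1, s.value + 5)


log = ProvenanceLog()
log.create(0, 100)                       # snapshot (t0, r)
log.apply("add_interest", add_interest)  # snapshot (t1, r') and the name of u
```

The log then holds both snapshots together with the name of the transformation linking them, so the later snapshot can be re-derived by applying the recorded update to the earlier one.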