Data Provenance
Project title: Open, reproducible analysis and reporting of data provenance for high-security health and administrative data
Funder: Wellcome Trust
Duration: 01/05/2021 to 01/05/2022
Chief Investigator: Dr Jessica Butler
Other UoA Investigators: Dr Milan Markovic, Professor Corri Black, Ms Katie Wilde, Professor Nir Oren, DaSH research coordinators, DaSH analysts
In the UK, many types of routinely collected data from the NHS and other government agencies are available for research. To protect privacy, data governance law requires that only the project-specific portion of the data be released, and only after it has been extracted, filtered and anonymised. Currently, researchers receive little information about the methods used to produce their data. This lack of transparency increases the risk that errors propagate undetected and leaves the resulting research difficult or impossible to evaluate and reproduce. We will co-design, pilot, and evaluate methods for recording and reporting provenance for research using high-security data. The result will be a method for reporting data provenance that maintains privacy and makes the research more findable, accessible, interoperable, and reproducible. Our approach recognises that meeting the needs of both data guardians and researchers requires active cooperation. The project is a collaboration between data guardians, computing scientists specialising in provenance and trust, an expert in service evaluation methods, and a specialist in open science practice.