Global Health, Epidemiology and Genomics

Provenance for data-driven healthcare

Posted on 25 January 2017 by Jat Singh

It is often said that we are in the midst of a ‘data revolution’. With more data being generated than ever before, there is great potential for data analytics to transform clinical care and health research, and to enable next-generation health services, including telemedicine and precision medicine.

Opportunities to transform healthcare come from access to data. Data analysts often seek to combine data from a number of sources to create a richer, more holistic base from which to derive insights. In practice, this requires data to be shared (or at least made accessible to analytical or computational processes), often across administrative boundaries. For instance, data from a general practitioner may be combined with hospital-managed electronic health records, augmented by data feeds from a patient’s wearable technologies and sensors in their home. As such, data is becoming increasingly federated, providing opportunities to use interconnected but decentralised data sources and stores to answer research questions. However, managing data in such an environment raises a number of challenges, particularly where collaboration is required. Here we briefly consider two aspects, data security and data quality, where transparency matters.

Data security

In the healthcare context, there is (rightly) a considerable focus on data confidentiality. At a technical level, this is typically realised through access controls that define who may access which information. For instance, there may be a rule that a doctor can only access a patient’s record if there is a treating relationship. Such controls are crucial to any data governance model, and work well where a common regime can apply, e.g. within a single organisation (such as a hospital) or a single platform.
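As an illustration only, the treating-relationship rule mentioned above could be sketched as follows. The class name `AccessPolicy`, its methods and the identifiers used are hypothetical and not drawn from any particular system; real access control regimes are considerably richer.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AccessPolicy:
    """Hypothetical policy: a clinician may read a record only if treating the patient."""

    # Pairs of (clinician_id, patient_id) with an active treating relationship.
    treating: set[tuple[str, str]] = field(default_factory=set)

    def add_treating_relationship(self, clinician_id: str, patient_id: str) -> None:
        self.treating.add((clinician_id, patient_id))

    def may_access(self, clinician_id: str, patient_id: str) -> bool:
        # Access is granted only when the relationship has been recorded.
        return (clinician_id, patient_id) in self.treating


policy = AccessPolicy()
policy.add_treating_relationship("dr_adams", "patient_17")

print(policy.may_access("dr_adams", "patient_17"))  # True
print(policy.may_access("dr_baker", "patient_17"))  # False
```

Note that a check of this kind only works where a single party evaluates it, which is why it suits a single organisation or platform.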

However, more controls are required where data is federated and shared across systems. Consider a hospital that provides data to a research organisation. The hospital has control over the data released to the researchers, but after the transfer, the hospital effectively loses visibility and thus control over (the researchers’ copy of) that data. In practical terms, this means that sharing and collaboration agreements in such environments are largely based on trust.

Given the sensitivity of health data, and the overarching responsibilities and obligations of those who deal in personal information, this lack of transparency disincentivises sharing and collaboration. This runs directly against the vision of the ‘data revolution’, where data can be used and re-used, when and where appropriate, to bring value and innovation to health services.

Data quality

Another consideration is data quality, which directly impacts the value of any analysis. However, assessing data quality in the kind of federated environment described above can be difficult. Data will originate from a variety of sources, leading to variance even within the same dataset. Such concerns can be mitigated in controlled environments where equipment and procedures can be standardised, such as in a clinical unit or research project. However, where data is collected in a more ad hoc manner and comes from a range of sources including consumer devices, such as a patient’s wearable technology, contextual information surrounding the data becomes increasingly important: how the data was generated, processed and transformed, and so on.

Provenance: Tracking data flow

Data provenance is an emerging area of research that aims to address these challenges. Provenance can be described as ‘data about data’, providing details of the data life cycle: where, when and by what or whom the data was produced, where it was transferred, and how it was processed.

Provenance techniques can complement general access control regimes by improving levels of transparency. By making visible the flow of information, it is possible to track what is happening to data, even after it moves “out of one’s hands”. In line with the previous example, a strong provenance infrastructure could allow the hospital to ‘see’ that the research organisation is using and handling the data appropriately. Raising levels of transparency facilitates accountability, and therefore works to encourage data sharing and collaboration.
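To make the idea concrete, here is a minimal sketch of a provenance log for the hospital example, assuming each processing step records what it produced, who performed it and which data it was derived from. The `ProvenanceEvent` fields, the `lineage` helper and the identifiers are illustrative assumptions, not an established provenance model.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class ProvenanceEvent:
    """One step in a data item's life cycle (all field names are illustrative)."""

    data_id: str                          # identifier of the data item produced
    activity: str                         # what happened, e.g. "collected"
    agent: str                            # who or what performed the activity
    derived_from: tuple[str, ...] = ()    # identifiers of the input data items
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def lineage(events: list[ProvenanceEvent], data_id: str) -> list[ProvenanceEvent]:
    """Walk back from data_id towards its sources, collecting each event once."""
    by_id = {event.data_id: event for event in events}
    trail: list[ProvenanceEvent] = []
    pending = [data_id]
    seen: set[str] = set()
    while pending:
        current = pending.pop()
        if current in seen or current not in by_id:
            continue
        seen.add(current)
        event = by_id[current]
        trail.append(event)
        pending.extend(event.derived_from)
    return trail


events = [
    ProvenanceEvent("ehr-001", "collected", "hospital"),
    ProvenanceEvent("ehr-001-anon", "anonymised", "hospital", ("ehr-001",)),
    ProvenanceEvent("study-input", "transferred", "research_org", ("ehr-001-anon",)),
]

for event in lineage(events, "study-input"):
    print(event.activity, "by", event.agent)
```

In this sketch the hospital could query the log to confirm that the copy held by the research organisation was derived from an anonymised record. In practice this depends on the log being shared and trustworthy across both organisations.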

Provenance information also helps in assessing data quality. Knowing where and how data was created, and the processes to which it was subject (e.g. whether it passed through a sanitisation, validation or anonymisation routine), can influence its interpretation and handling. For instance, provenance details might highlight that a particular reading came from a faulty or inaccurate sensor, or an untrusted source, in which case those readings might be ignored or transformed before use. Such information also helps in identifying procedural issues, for instance, where a member of staff consistently generates data that does not accord with that of other staff, which may indicate a training issue.
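A short sketch of how such provenance details might be used to screen readings before analysis is shown below. The `Reading` fields, the `UNTRUSTED_SOURCES` set and the source names are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Reading:
    value: float
    source: str       # device or system that produced the reading
    validated: bool   # whether the reading passed a validation routine


# Hypothetical set of sources flagged as faulty or untrusted.
UNTRUSTED_SOURCES = {"wearable_b"}


def usable(readings: list[Reading]) -> list[Reading]:
    """Keep only validated readings that come from sources not flagged as untrusted."""
    return [
        r for r in readings
        if r.validated and r.source not in UNTRUSTED_SOURCES
    ]


readings = [
    Reading(72.0, "clinic_monitor", True),
    Reading(190.0, "wearable_b", True),    # source flagged as untrusted
    Reading(75.0, "wearable_a", False),    # did not pass validation
]

kept = usable(readings)
print([r.value for r in kept])  # [72.0]
```

Here flagged readings are simply dropped. An analysis could instead down-weight or correct them, depending on what the provenance record reveals.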

Information management is key to realising the potential of data-driven healthcare. Though work on provenance is in its early stages, the techniques described show great promise for improving transparency in federated data environments, which will help to mitigate certain governance risks and to incentivise data sharing and collaboration.

Technological advances tags: Data provenance / data quality / data revolution / data security / federated data

© Copyright 2015 Cambridge University Press