We follow our participants’ health by linking to their electronic medical records, which include data on hospital stays, cancer diagnoses and causes of death.

These data show researchers which health conditions participants experience over time and when they were diagnosed. Combined with other data types, they have enabled researchers around the world to make health discoveries that would not otherwise have been possible.

Healthcare records data at a glance

Primary care data

We receive coded GP data, which contain codes about diagnoses, prescriptions and referrals, but no confidential notes or letters. Learn why access to GP data is so important.

  • Current availability
    • 230,000 participants up to 2016 or 2017 (depending on data supplier)
  • Future availability
    • 500,000 participants

Hospital inpatient data

We receive coded hospital data, which contain information about diagnoses and procedures, for all of our participants.

Cancer data

We receive data on all of our participants’ cancer diagnoses from national cancer registries.

Death data

If one of our participants dies, we receive information about the date and cause of death from national death registries.

Algorithmically-defined health outcomes and first occurrences of health outcomes

For dementia, stroke and some other conditions, we apply algorithms that combine data from across different medical records (and self-report) to identify whether a participant has a certain health outcome and when it was first diagnosed.

  • Current availability
    • 500,000 participants (depending on underlying coverage)
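
As a rough illustration of the "first occurrence" idea described above, the sketch below pools coded records from several sources and keeps the earliest matching record for each participant. It is not the algorithm we use: the `Record` fields, the source labels and the placeholder code `CODE_A` are all assumptions made up for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, Set


@dataclass(frozen=True)
class Record:
    """One coded event for one participant (hypothetical structure)."""
    participant_id: int
    source: str        # hypothetical labels: "gp", "hospital", "death", "self_report"
    code: str          # diagnosis code as recorded by the source
    event_date: date


def first_occurrence(records: Iterable[Record],
                     outcome_codes: Set[str]) -> Dict[int, Record]:
    """Return, per participant, the earliest record whose code is in outcome_codes."""
    earliest: Dict[int, Record] = {}
    for rec in records:
        if rec.code not in outcome_codes:
            continue  # record does not relate to the outcome of interest
        current = earliest.get(rec.participant_id)
        if current is None or rec.event_date < current.event_date:
            earliest[rec.participant_id] = rec
    return earliest


# Made-up example data: participant 1 appears in two sources, participant 2 has
# no record matching the outcome.
example_records = [
    Record(1, "hospital", "CODE_A", date(2012, 6, 15)),
    Record(1, "gp", "CODE_A", date(2009, 3, 1)),
    Record(2, "gp", "CODE_B", date(2010, 1, 20)),
]
result = first_occurrence(example_records, {"CODE_A"})
```

Here `result` holds one entry for participant 1, taken from whichever source recorded the outcome earliest, and no entry for participant 2. A real algorithm would also need to handle source-specific coding systems, missing dates and the reliability of self-reported data, none of which this sketch attempts.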

Healthcare records research stories

Read a selection of stories about how healthcare is being changed by discoveries made with healthcare records.

Analysis of the ‘fingerprint’ of blood vessels in the retina could make it possible for people to keep tabs on their cardiovascular health during routine eye tests.

DNA from 1,600 ancient people and 400,000 UK Biobank participants reveals why MS is more common among northern Europeans: they are more closely related to the ancient people in whom some of the genetic risk factors for the disease emerged.

An automated algorithm that assesses heart-surrounding fat and predicts heart failure could one day help clinicians to better support patients.

Data from over 450,000 UK Biobank participants show the importance of considering family history alongside genetics before making decisions on invasive preventative surgeries.

Explore our other data categories

Magnetic resonance images, bone-density scans, carotid artery ultrasound and more

Proteins, metabolites, infectious disease markers and other biomarkers

Genotyping, exome and whole-genome information

Participants’ information on health and lifestyle collected via online or touchscreen questionnaires

Baseline data from physical exams, vision and hearing tests, activity monitor and more

Participants’ self-reported data on health and lifestyle

Derived data on participants’ environment, such as local air and noise pollution