In early 2001 the first human genome sequence was published. This was the culmination of more than a century of phenomenal genetic discovery: the identification of chromosomes in 1882; the discovery of the structure of DNA in the 1950s; and finally the publication of the 3 billion letters that make up our genetic code.

Profound advances in genetics have continued into this century. A single genome can now be sequenced in just days for less than $1,000. Low-cost sequencing has accelerated the growth of genetic data, which has advanced our understanding of the traits that make us human and that confer susceptibility to disease. Yet despite this progress, the vast majority of our genome remains relatively unexplored.

Much focus to date has been on the minority of our genome that encodes proteins. This has been driven in part by biological logic, but also by data availability and cost. Research into the function of the remaining 97 per cent, the “non-coding” regions (until recently often dubbed “junk”), has been limited. However, there is now abundant evidence that genetic variation outside of coding regions is a critical aspect of our biology. Much of this type of DNA is thought to control which genes are turned on or off – a mechanism already known to be important in a number of diseases, including Parkinson’s.

In November 2021 the UK Biobank (UKBB) project released 200,000 whole human genomes from largely healthy participants: the single biggest human genome dataset to date. Scientists hope that this dataset will enhance understanding of the function and impact of the non-coding genome on human health and disease, as well as improving drug discovery and development. But it holds further value: participants in the project have given broad consent to the use of their genomes such that a swathe of future research possibilities now opens up.

The release of the 200,000 genomes last year will be followed by a further 300,000 in 2023. It’s a boon for scientific progress, and a relative steal: the whole project cost £200 million, far less than the $1 billion it cost to sequence a single genome in 2001.

This piece is from the Witness section of our New Humanist spring 2022 edition.