A diagram of the protein barnase in a folded state

Proteins are molecular machines that control every biological process. For each function, a different protein is required, and each protein is encoded within the DNA sequence of genes. A protein starts life as a linear chain of building blocks, called amino acids. Within milliseconds this chain folds into a 3D shape, whose distinctive form is essential for the job the protein has evolved to do.

This much we know. But understanding how a protein gets from an unfolded to a folded state is one of the greatest puzzles in biology. It is a staggeringly complicated problem, as illustrated in 1969 by Cyrus Levinthal’s thought experiment: imagine a protein made up of 101 amino acids. Now assume that each amino acid can adopt just three orientations relative to its neighbour. This model protein would be able to adopt 3¹⁰⁰, or about 5 × 10⁴⁷, conformations. If a protein could search through all these possible shapes at the incredibly fast rate of one every 100 femtoseconds (which it can’t), it would still take a timespan many times greater than the age of the Universe. So there must be some way in which proteins are directed down an efficient folding pathway.
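For readers who want to check Levinthal's arithmetic, here is a minimal sketch in Python. It uses only the figures quoted above (100 links between 101 amino acids, three orientations per link, one conformation tried every 100 femtoseconds). The age of the Universe, taken here as roughly 13.8 billion years, and the function name are assumptions added for illustration.

```python
# A back-of-envelope check of Levinthal's thought experiment.
# Figures from the text: 101 amino acids, 3 orientations, 100 fs per try.
AMINO_ACIDS = 101
LINKS = AMINO_ACIDS - 1              # each link joins two neighbours
ORIENTATIONS_PER_LINK = 3
SECONDS_PER_TRY = 100e-15            # 100 femtoseconds

# Assumptions not stated in the text:
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.38e10      # roughly 13.8 billion years


def exhaustive_search_years(links=LINKS,
                            orientations=ORIENTATIONS_PER_LINK,
                            seconds_per_try=SECONDS_PER_TRY):
    """Years needed to try every conformation once, one after another."""
    conformations = orientations ** links
    return conformations * seconds_per_try / SECONDS_PER_YEAR


conformations = ORIENTATIONS_PER_LINK ** LINKS
years = exhaustive_search_years()
universe_ages = years / AGE_OF_UNIVERSE_YEARS

print(f"Conformations: {conformations:.2e}")
print(f"Search time:   {years:.2e} years")
print(f"That is about {universe_ages:.1e} times the age of the Universe")
```

Under these assumptions the exhaustive search comes out at the order of 10²⁷ years, which is why a purely random search cannot be how real proteins fold.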

Solving this protein-folding problem is essentially a computational challenge: one that was formalised in 1994 through the Critical Assessment of Structure Prediction (CASP) competition, which pits some of the world’s most powerful supercomputers and advanced AIs against each other. The winner of the last CASP, held late last year, was the AlphaFold team from DeepMind. Given only amino acid sequences, AlphaFold solved the structures of two thirds of 100 proteins on the task list, with an accuracy comparable with laboratory-based techniques.

This step-change in structure prediction means AlphaFold is now tantalisingly close to the holy grail of structural biochemistry: the ability to predict the structure of any protein from just the DNA sequence of the gene that codes for it. Such a breakthrough would make it significantly easier to deal with some of the world’s greatest challenges, from tackling genetic diseases to developing proteins that can break down industrial waste.

This piece is a preview from the Witness section of New Humanist spring 2021.