How Google made AlphaFold 3

Google has lifted the lid on how the latest iteration of its protein-structure prediction model, AlphaFold, was built.

The third version of AlphaFold was released earlier this year and, according to Google, it “predicts the structure and interactions of all of life’s molecules with unprecedented accuracy”.

The AI model now predicts the shapes of proteins as well as other molecules such as DNA, RNA and ligands – small molecules that bind to other molecules and affect how proteins behave. In humans, for example, ligands can act on cells to trigger our sense of smell or signal our brains to release neurotransmitters that affect mood.

For the third version of AlphaFold, Google has teamed up with Isomorphic Labs to put the model to work on drug discovery. Many drugs are ligands, and the pair are hoping that by better understanding the shape of both proteins and ligands, researchers can find new targets for drugs to work on, and also create entirely new types of drugs.

Because AlphaFold 3 needed information on different classes of biomolecules, it was trained on a much larger dataset – and needed a new architecture as a result.

AlphaFold 3 vs AlphaFold 2 (and 1)

AlphaFold 3 uses a diffusion-based generative model (the approach behind many AI image generators) to generate the molecules’ 3D structures, rather than the complex, custom geometry-based module that AlphaFold 2 used.
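To give a rough feel for what "diffusion-based" means, here is a minimal toy sketch in Python. It is not AlphaFold 3's code: the function names, the noise schedule and the step size are all assumptions made for illustration, and the "denoiser" simply returns a known answer, where a real model would use a trained neural network to predict the structure from the input sequence. What it shows is the general idea: start from random atom positions and refine them step by step while the injected noise shrinks.

```python
import math
import random


def toy_denoiser(noisy_coords, target_coords):
    # Hypothetical stand-in for a trained network. A real model would
    # predict clean coordinates from the noisy ones; this toy version
    # just returns the known target so the loop can be run end to end.
    return target_coords


def sample_structure(target_coords, steps=50, seed=0):
    rng = random.Random(seed)
    # Start from pure Gaussian noise: one random (x, y, z) per atom.
    coords = [[rng.gauss(0.0, 10.0) for _ in range(3)] for _ in target_coords]
    for step in range(steps):
        # Assumed schedule: noise level falls linearly to zero.
        sigma = 10.0 * (1 - (step + 1) / steps)
        predicted = toy_denoiser(coords, target_coords)
        # Move halfway towards the prediction, then re-inject a small
        # amount of noise that shrinks as the steps progress.
        coords = [
            [c + 0.5 * (p - c) + rng.gauss(0.0, 0.1 * sigma)
             for c, p in zip(atom, pred_atom)]
            for atom, pred_atom in zip(coords, predicted)
        ]
    return coords


def rmsd(a, b):
    # Root-mean-square deviation between two equal-length lists of atoms.
    total = sum((x - y) ** 2 for p, q in zip(a, b) for x, y in zip(p, q))
    return math.sqrt(total / len(a))


# Made-up positions for four atoms, purely for illustration.
target = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [2.0, 1.4, 0.0], [3.5, 1.4, 0.3]]
result = sample_structure(target)
print(f"RMSD from target after denoising: {rmsd(result, target):.3f}")
```

The point of the sketch is the loop: the structure is not computed in one pass but emerges from repeated small corrections to noisy coordinates.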

AlphaFold was created in 2016 by DeepMind, the AI company Google bought in 2014, which trained the model on 100,000 protein structures and the amino acid sequences that underpin them.

The structure of a protein can be complex. All proteins start off as chains of amino acid molecules linked together. Those chains form simple shapes, such as pleated sheets or helices – basically long coils. From there, they build into more complex 3D shapes. These shapes are unique to each protein and determine its function and how it interacts with other molecules, such as medicines.

Pre-AI, however, working out how a protein would look in 3D just by studying its amino acid chain was slow, expensive, and unreliable.

The first release of AlphaFold came in 2018; Google unveiled its successor in 2020. Google reports that AlphaFold 2 has been used by more than two million scientists since then, working on projects including malaria vaccines, antibiotic resistance, and treatment for cancer.

Researchers can access AlphaFold through the AlphaFold server, where they can input amino acid sequences along with other molecules and see a visual representation of the predicted 3D structure, along with estimates of how likely each section of the prediction is to be accurate.

Jo Best

Jo has been writing about technology for over 20 years, and has always been fascinated by emerging technologies and innovation. These days, she's particularly interested in the intersection of technology, science, and human health.
