How Google made AlphaFold 3
Google has lifted the lid on how the latest iteration of its protein-structure prediction model, AlphaFold, was built.
The third version of AlphaFold was released earlier this year and, according to Google, it “predicts the structure and interactions of all of life’s molecules with unprecedented accuracy”.
The AI model now predicts the shapes of proteins as well as other biomolecules such as DNA, RNA and ligands – small molecules that bind to proteins and other molecules and affect them in all manner of ways. In humans, for example, ligands can act on cells to trigger our sense of smell or signal our brains to release neurotransmitters that affect mood.
For the third version of AlphaFold, Google has teamed up with Isomorphic Labs to put the model to work on drug discovery. Many drugs are ligands, and the pair are hoping that by better understanding the shape of both proteins and ligands, researchers can find new targets for drugs to work on, and also create entirely new types of drugs.
Because AlphaFold 3 needed information on different classes of biomolecules, it was trained on a much larger dataset – and needed a new architecture as a result.
AlphaFold 3 vs AlphaFold 2 (and 1)
AlphaFold 3 uses a diffusion-based generative model (the technique behind many AI image generators) to generate the molecules’ 3D structures, rather than the complex, custom geometry-based module that AlphaFold 2 used.
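To give a feel for what "diffusion" means when the output is a set of atom positions rather than pixels, here is a minimal toy sketch in Python. It is not AlphaFold 3's code: the function names, the noise schedule and the three-atom "molecule" are all invented for illustration, and the denoiser is a hypothetical stand-in that simply hands back the known answer, whereas a real model would use a trained neural network that has to infer the structure from the input sequences.

```python
import random


def add_noise(coords, sigma, rng):
    """Forward process: jitter every atom coordinate with Gaussian noise."""
    return [[c + rng.gauss(0.0, sigma) for c in atom] for atom in coords]


def toy_denoiser(noisy, target):
    """Hypothetical stand-in for a trained network.

    A real denoiser would predict the clean structure from the noisy one;
    this toy version cheats and returns the known target.
    """
    return target


def sample(target, sigmas, rng):
    """Reverse process: start from pure noise and refine it step by step.

    `sigmas` is an assumed noise schedule that decreases to 0.0.
    """
    # Begin with random positions drawn at the highest noise level.
    x = [[rng.gauss(0.0, sigmas[0]) for _ in atom] for atom in target]
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        guess = toy_denoiser(x, target)
        # Keep only the fraction of the remaining error that matches
        # the next (lower) noise level.
        ratio = sigma_next / sigma
        x = [
            [g + ratio * (xi - g) for xi, g in zip(x_atom, g_atom)]
            for x_atom, g_atom in zip(x, guess)
        ]
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    # Three made-up atom positions (x, y, z), purely illustrative.
    molecule = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0]]
    print(add_noise(molecule, 2.0, rng))            # scrambled positions
    print(sample(molecule, [10.0, 5.0, 1.0, 0.0], rng))  # refined positions
```

The point of the sketch is the loop: each pass removes some of the noise, so a random cloud of points is gradually pulled towards a plausible structure.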
AlphaFold was created in 2016 by DeepMind, the AI company Google bought in 2014, which trained the model on 100,000 protein structures and the amino acid sequences that underpin them.
The structure of a protein can be complex. All proteins start off as chains of amino acid molecules linked together. Those chains form simple shapes, such as pleated sheets or helices – basically long coils. From there, they build into more complex 3D shapes. Those shapes are unique to each protein and determine its function and how it interacts with other molecules – medicines, for example.
Pre-AI, however, working out how a protein would look in 3D just by studying its amino acid chain was slow, expensive, and unreliable.
The first release of AlphaFold came in 2018; Google unveiled its successor in 2020. Google reports that AlphaFold 2 has been used by more than two million scientists since then, working on projects including malaria vaccines, antibiotic resistance, and treatment for cancer.
Researchers can access AlphaFold through the AlphaFold Server, where they can input amino acid sequences along with other molecules and see a visual representation of the predicted 3D structure, along with estimates of how likely each section of that structure is to be accurate.