Proteins are the molecular machines that perform nearly every bodily function, including contracting muscles, digesting food, and healing wounds. They also provide structural support by reinforcing the shape of cells and tissues. So how does a protein go from a string of amino acids to a 3D structure with a very precise arrangement of atoms, often within a few milliseconds? This question has been dubbed the ‘protein folding problem’ and is one of the long-standing problems in biology. It matters because the way a protein folds determines its function.
There are four levels of protein structure: primary, secondary, tertiary and quaternary. Primary structure is the simplest level: the amino acid sequence of the protein. Secondary structure refers to local folded structures, such as α-helices and β-pleated sheets, that form within the protein chain. Tertiary structure is the overall 3D shape of a single folded chain. Quaternary structure, the highest level, is the arrangement that results when multiple folded subunits assemble into one complex.
Scientists have spent decades developing laboratory techniques to solve protein structures. The current gold standard is X-ray crystallography, in which a beam of X-rays is fired at a crystallized protein and the diffracted beams reveal the arrangement of its atoms. It is a time-consuming and costly method, and it is limited to stable proteins that form well-ordered crystals. With advances in computational biology, new methods for predicting protein structure have been developed. Until recently, however, these tools did not match the accuracy of experimental methods.
In November 2020, a group called DeepMind announced that it had cracked the protein folding problem. DeepMind is known for developing AlphaGo, the first AI to beat a professional human Go player. (Go is a board game with roughly 10 to the power of 170 possible board configurations, which makes it about a googol, or 10 to the power of 100, times more complex than chess.) The group has since shifted gears and developed a neural network program called AlphaFold that can accurately predict the folded structure of proteins. The system was trained on publicly available data from the Protein Data Bank and other databases.
AlphaFold won first place at the 2020 Critical Assessment of protein Structure Prediction (CASP), a global competition for 3D protein structure prediction. The competition runs over several months: sets of proteins, or parts of proteins called domains, are released, and teams are given a few weeks to submit their best structure predictions. Predictions are scored on a scale of 0 to 100, and a score of around 90 is considered on par with protein structures solved using experimental methods such as X-ray crystallography, nuclear magnetic resonance (NMR), and cryo-electron microscopy. In the 2020 competition, DeepMind’s AlphaFold flexed its deep learning muscles with a median score of 92.4, a record achievement.
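For readers curious about where the 0 to 100 score comes from: CASP's headline metric is the Global Distance Test (GDT_TS), which averages the percentage of a protein's alpha-carbon atoms that land within 1, 2, 4 and 8 ångströms of their experimentally determined positions. The Python sketch below is a simplified illustration only. It assumes the predicted and experimental structures have already been superimposed, whereas the real metric searches over many superpositions, and the function name and toy coordinates are made up for this example.

```python
import math

def gdt_ts(predicted, experimental, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS for two equal-length lists of (x, y, z) positions.

    Assumption: the two structures are already superimposed. The real
    metric also searches for the superposition that maximizes each count.
    """
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("need two non-empty coordinate lists of equal length")

    # Distance between each predicted atom and its experimental counterpart.
    distances = [math.dist(p, q) for p, q in zip(predicted, experimental)]

    # Fraction of atoms that fall within each distance cutoff.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]

    # Average the four fractions and express the result on a 0-100 scale.
    return 100.0 * sum(fractions) / len(cutoffs)

# Toy four-atom example: the atoms are off by 0.5, 1.5, 3.0 and 10.0 ångströms.
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.0, 0.5), (3.8, 0.0, 1.5), (7.6, 0.0, 3.0), (11.4, 0.0, 10.0)]
print(gdt_ts(predicted, experimental))  # prints 56.25
```

In this toy case one atom in four is within 1 Å, two are within 2 Å, and three are within both 4 Å and 8 Å, so the four fractions average to 56.25. A prediction scoring above 90 places nearly every atom close to where experiment finds it.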
So, will AlphaFold turn its eye towards cancer? It has the potential to improve our understanding of cancer and to accelerate the drug discovery process. Experimentally determining the structure of a protein involves a lot of trial and error and typically takes months, if not years. Some proteins cannot be made in sufficient quantity or purity for their structures to be determined. Others are too unstable, or fold incorrectly, under experimental conditions. By expanding the number of known protein structures, AlphaFold could increase our understanding of disease pathways. For example, if protein X binds to protein Y and kick-starts a disease process, knowing their structures allows us to predict how they bind to each other and to design drugs that target the binding site. Proto-oncogenes are genes that, when mutated, can cause normal cells to become cancerous. AlphaFold may help determine how a mutation alters the structure, and hence the function, of the proteins encoded by these genes. A better understanding of how mutations influence structure could be key to developing personalized therapies for patients.
AlphaFold will likely need further development before it can make a significant impact on treating cancer and other diseases. Furthermore, its predictions will need to be validated using experimental methods. Even so, it represents a breakthrough that will improve our understanding of the human body, and it shows the utility of AI in solving complex problems in biology.
Edited by Aileen Fernandez