Researchers at DeepMind have proudly announced a major breakthrough in predicting static folded protein structures with a new program known as AlphaFold 2. Protein folding has been an ongoing problem for researchers since 1972, when Christian Anfinsen speculated in his Nobel Prize acceptance speech that the three-dimensional structure of a given protein should be algorithmically determined by the one-dimensional DNA sequence that describes it. When you hear protein, you might think of muscles and whey powder, but the proteins mentioned here are chains of amino acids that fold into complex shapes. Cells use these proteins for almost everything. Many of the enzymes, antibodies, and hormones inside your body are folded proteins. We’ve discussed why protein folding is important, and we’ve covered recent advancements in the cryo-electron microscopy used to experimentally determine the structure of folded proteins.
The shape of proteins largely controls their function, and if we can predict their shape then we get much closer to predicting how they interact. AlphaFold 2 predicts only the static state; given the sheer number of interactions that can change a protein, dynamic protein structures are still out of reach. Even so, the technical achievement of DeepMind is not to be understated. For a typical protein, there are an estimated 10^300 different configurations.
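To get a feel for where a number like 10^300 comes from, here is a back-of-the-envelope sketch in Python. It is not DeepMind's calculation. It assumes, purely for illustration, that each residue in the chain independently picks one of a fixed number of conformations. The chain length (300), the states per residue (10), and the sampling rate (a trillion configurations per second) are all made-up round numbers.

```python
import math

def log10_configurations(num_residues: int, states_per_residue: int) -> float:
    """Return log10 of states_per_residue ** num_residues.

    Working in log space avoids printing a 300-digit integer.
    """
    return num_residues * math.log10(states_per_residue)

def log10_years_to_enumerate(log10_configs: float, samples_per_second: float) -> float:
    """Return log10 of the years needed to visit every configuration once."""
    seconds_per_year = 365.25 * 24 * 3600
    return log10_configs - math.log10(samples_per_second) - math.log10(seconds_per_year)

# Illustrative assumptions only: 300 residues, 10 conformations each.
exponent = log10_configurations(num_residues=300, states_per_residue=10)

# Hypothetical brute-force search at 1e12 configurations per second.
years_exponent = log10_years_to_enumerate(exponent, samples_per_second=1e12)

print(f"configurations: about 10^{exponent:.0f}")
print(f"brute-force search time: about 10^{years_exponent:.1f} years")
```

Under these assumptions, an exhaustive search would take far longer than the age of the universe, which is why prediction methods have to do something smarter than enumeration.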
Out of the 180 million protein sequences in the Protein database, only 170,000 have had their structures identified. Technologies like the cryo-electron microscope make the process of mapping their structure easier, but it is still complex and tedious to go from sequence to structure. AlphaFold 2 and other folding algorithms are tested against this 170,000-member corpus to determine their accuracy. In 2016, the highest-scoring algorithm had a median global distance test (GDT) score of 40 (on a scale of 0 to 100, with 100 being the best) in the most difficult category (free-modeling). In 2018, AlphaFold made waves by pushing that up to the high 50s. AlphaFold 2 brings that GDT up to 87.
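For readers wondering what a GDT score actually measures, here is a simplified sketch. The commonly used GDT_TS variant averages the percentage of residues whose predicted position falls within 1, 2, 4, and 8 angstroms of the experimentally determined position. The function below assumes the two structures are already superimposed and matched residue for residue; real scoring tools also search over many superpositions to find the best one, which this sketch skips. The function name `gdt_ts` and the toy coordinates are our own illustration, not code from any assessment pipeline.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]

def gdt_ts(predicted: Sequence[Point],
           reference: Sequence[Point],
           cutoffs: Sequence[float] = (1.0, 2.0, 4.0, 8.0)) -> float:
    """Simplified GDT_TS on a 0-100 scale.

    Assumes the structures are already superimposed (no alignment search)
    and that predicted[i] corresponds to reference[i].
    """
    if len(predicted) != len(reference) or not reference:
        raise ValueError("structures must be non-empty and the same length")

    # Per-residue distance between predicted and reference positions.
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]

    # Fraction of residues within each distance cutoff (in angstroms).
    fractions = [sum(d <= c for d in distances) / len(distances)
                 for c in cutoffs]

    # Average the fractions and scale to 0-100.
    return 100.0 * sum(fractions) / len(cutoffs)

# Toy example: four residues along a line, with prediction errors
# of 0.5, 1.5, 3.0, and 10.0 angstroms respectively.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]

score = gdt_ts(predicted, reference)
print(f"GDT_TS = {score:.2f}")  # prints "GDT_TS = 56.25"
```

In this toy case one residue is within 1 angstrom, two within 2, and three within both 4 and 8, so the score is the average of 25, 50, 75, and 75. A perfect prediction scores 100.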
At this point, it is hard to determine what sort of effects this will have on the drug industry, healthcare, and society in general. Research has traditionally proceeded by creating the protein, identifying what it does, and then figuring out its structure. AlphaFold 2 represents an avenue towards doing that whole process completely backward. Whether the next goal is to map all the proteins encoded in the human genome or find new, more effective drug treatments, we’re quite excited to see what becomes of this landmark breakthrough.