I can remember the day in high school when I learned that amino acids were “the building blocks of life.” I was fascinated by the idea that our complex shape, and the shape of other living organisms, was assembled like a little Lego set, constructed to make us who we are. Even then, in the early 1980s, researchers had spent nearly a decade trying to figure out how those amino acids told proteins what form to take. Since then, with increasingly powerful computers and complex algorithms, researchers have applied machine learning techniques to the same biological question.
Google’s (NASDAQ: GOOG)(NASDAQ: GOOGL) DeepMind has just provided an answer, and it is blowing researchers’ minds. For nearly 50 years, scientists have wondered how proteins know what shape to fold into, and how they do so over and over again. In a modeling competition, DeepMind researchers just cracked the code with a model that translates amino acid chains into three-dimensional protein structures. To figure out how this may affect medicine (and investing), it’s important to understand what scientists can do with this new knowledge. What are the downstream effects? Which areas of biological research will be most affected? And which companies stand to win – or lose – the most?
Tens of thousands of proteins exist in humans, and there are billions more in other species, including viruses and bacteria. The way these proteins fold directly determines what they do. In fact, there is a saying in molecular biology that “structure is function.” The folded shape is key to the role a protein plays – such as antibodies that fight infection or insulin that regulates blood sugar. That is why the Critical Assessment of Protein Structure Prediction (CASP) has been held since 1994. It is an event that challenges teams to improve the accuracy of protein structure predictions.
AlphaFold, DeepMind’s winning model, was trained on public data from 170,000 protein structures. Creating the algorithm took 128 high-end cloud computing cores running for several weeks. Ultimately, two-thirds of the model’s predictions scored at a level of accuracy that corresponds to errors smaller than the width of a single atom. DeepMind stood head and shoulders above the other participants in the event, which were mainly academic teams but also included entries from Microsoft (NASDAQ: MSFT) and Chinese internet giant Tencent (OTC: TCEHY).
Why it matters
Most of the medications prescribed today were discovered either by chance or through time-consuming trial-and-error experiments. Understanding how amino acids make proteins twist and fold into their three-dimensional shape will give scientists a better understanding of why each protein does what it does and how signals are sent across cell membranes. This could allow scientists to design drugs that cells will use in a desired way and to understand pathogenic folds, and it could allow drug manufacturers to identify the cause of genetic variations that lead to disease.
In one example from the event, the AlphaFold model delivered the structure of a bacterial protein in just 30 minutes. Researchers at the Max Planck Institute in Germany had been working on that problem for over ten years. Next, the team could begin to address the thousands of unsolved proteins in the human genome and the hundreds of millions of proteins in nature that have not been modeled. This raises the question of when we can all get drugs designed for our own particular biology.
What you should pay attention to
Applications for drug discovery will have to wait for the time being. It’s unclear when and how DeepMind will share its model, and while impressive, it has limitations. For example, the model had difficulty predicting protein complexes, or groups of proteins in which interactions between the proteins can distort their shapes. As more proteins are involved, the number of possible interactions grows so quickly that modeling them becomes nearly impossible. This mathematical limitation – known as combinatorial explosion – is common in advanced modeling, but can eventually be overcome with more computational power. It will be important to address this, as protein-protein interactions are one of the main mechanisms targeted in the search for new drugs.
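To get a feel for how quickly those possibilities pile up, here is a rough back-of-the-envelope sketch in Python. It is not how AlphaFold works. It simply assumes, for illustration, that any set of two or more proteins could form a distinct group that would need to be modeled; the function names are made up for this example.

```python
from math import comb


def possible_pairs(n: int) -> int:
    """Number of distinct protein pairs among n proteins: n choose 2."""
    return comb(n, 2)


def possible_groups(n: int) -> int:
    """Number of distinct groups of two or more proteins among n proteins.

    Illustrative assumption: every subset of size >= 2 counts as one
    candidate group. That is all 2**n subsets, minus the n single-protein
    subsets and the one empty subset.
    """
    return 2**n - n - 1


if __name__ == "__main__":
    # Print how the counts grow as more proteins are considered together.
    for n in (2, 5, 10, 20, 40):
        print(f"{n:>2} proteins: {possible_pairs(n):>4} pairs, "
              f"{possible_groups(n):,} possible groups")
```

Under that assumption, pairs grow manageably, but the number of candidate groups roughly doubles with each added protein: with just 20 proteins there are already more than a million.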
Despite the caveats, the discovery promises to add fuel to the fires of scientific research into the workings of the human body. A better understanding of the translation of amino acids into proteins validates the potential impact of gene editing and of companies such as CRISPR Therapeutics (NASDAQ: CRSP), Intellia Therapeutics (NASDAQ: NTLA), and Editas Medicine (NASDAQ: EDIT). Furthermore, solving this problem should ultimately lead to less trial and error in the lab and make genome sequencing even more important, which benefits Illumina (NASDAQ: ILMN), Thermo Fisher Scientific (NYSE: TMO), and Agilent (NYSE: A). After all, DNA contains the information for making proteins.
The benefits of DeepMind’s discovery will largely remain behind the curtain of research, reaching most of us the way other medical breakthroughs do – in the form of new or better drugs to treat disease. But make no mistake about its importance. A CASP judge, a computational biologist at Columbia University, called it one of the most important breakthroughs of his lifetime. Even the CASP co-founder added, “I never thought I’d see this in my life.” I expect this to be the first salvo in a new battle against human disease. Armed with a deeper understanding of the building blocks of life and once-unimaginable computing power, we will soon be able to look back at our current drug discovery process the way we now look back at treating infections before penicillin was available, or monitoring pregnancy before ultrasound – both advancements made in the 1950s. Seventy years from now, people may marvel at the effort it once took to discover drugs and wonder how we were ever able to develop them with such a random process.