Researchers at the University of Washington have used machine learning algorithms trained to predict protein shapes to help design new proteins.
Proteins are essential to the functioning of all living things. Until now, powerful machine learning algorithms, including AlphaFold and RoseTTAFold, had been trained to predict the detailed shapes of natural proteins from their amino acid sequences. Designing new proteins, however, had remained far more difficult because of their structural complexity. Now, three articles in the journal Science describe a major advance in protein design: machine learning can be used to create protein molecules much more efficiently and quickly than was previously possible. The research could lead to solutions to long-standing problems in medicine, energy, and technology.
“Neural networks are easy to train if you have tons of data, but with proteins, we don’t have as many examples as we would like. We had to go in and determine which features in these molecules are the most important. It was a journey of trial and error,” said project scientist Justas Dauparas, a researcher at the Institute for Protein Design. “Protein structure prediction software is part of the solution, but by itself, it cannot come up with anything new,” Dauparas explained.
To go beyond the proteins found in nature, David Baker, professor of biochemistry at the University of Washington School of Medicine and recipient of a 2021 Breakthrough Prize in Life Sciences, and his team divided the task of protein design into three parts. First, a new protein shape must be created. The team showed that artificial intelligence can generate new protein shapes in two ways, called “hallucination” and “inpainting.” Second, to speed up the process, the team developed a new algorithm for generating amino acid sequences, a software tool called ProteinMPNN. Third, the team used AlphaFold, a tool developed by Alphabet’s DeepMind, to check whether the amino acid sequences they designed were likely to fold into the intended shapes. Baker added that ProteinMPNN is to protein design what AlphaFold was to protein structure prediction.
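For readers who think in code, the three-part workflow can be pictured as a generate-then-filter loop. The Python sketch below is only an illustration under heavy assumptions: every function in it is a hypothetical toy stand-in (a "shape" is just a list of numbers), and none of it reflects the actual interfaces or inner workings of ProteinMPNN, AlphaFold, or the team's hallucination and inpainting methods.

```python
import random
from typing import List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids


def ideal_residue(position_value: float) -> str:
    """Toy rule (assumption): map one backbone position to its best-fitting residue."""
    return AMINO_ACIDS[int(position_value * len(AMINO_ACIDS))]


def generate_backbone(length: int, rng: random.Random) -> List[float]:
    """Step 1 stand-in (the role of hallucination/inpainting): invent a new 'shape'."""
    return [rng.random() for _ in range(length)]


def design_sequence(backbone: List[float], rng: random.Random,
                    error_rate: float = 0.2) -> str:
    """Step 2 stand-in (the role of ProteinMPNN): propose a sequence for the shape."""
    residues = []
    for value in backbone:
        if rng.random() < error_rate:
            residues.append(rng.choice(AMINO_ACIDS))  # simulate an imperfect choice
        else:
            residues.append(ideal_residue(value))
    return "".join(residues)


def fold_confidence(sequence: str, backbone: List[float]) -> float:
    """Step 3 stand-in (the role of AlphaFold): score agreement with the target shape."""
    matches = sum(res == ideal_residue(v) for res, v in zip(sequence, backbone))
    return matches / len(backbone)


def design_pipeline(n_candidates: int, length: int, threshold: float,
                    seed: int = 0) -> List[Tuple[str, float]]:
    """Generate candidates and keep only those that pass the fold check."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_candidates):
        backbone = generate_backbone(length, rng)
        sequence = design_sequence(backbone, rng)
        score = fold_confidence(sequence, backbone)
        if score >= threshold:
            accepted.append((sequence, score))
    return accepted


if __name__ == "__main__":
    for sequence, score in design_pipeline(n_candidates=50, length=30, threshold=0.85):
        print(f"{sequence}  confidence={score:.2f}")
```

The point of the sketch is the division of labor: one component proposes shapes, a second proposes sequences for them, and an independent third filters out designs unlikely to fold as intended.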
“We found that ProteinMPNN-derived proteins are much more likely to fold as intended, and we can create very complex protein assemblies using these methods,” said project scientist Basil Wicky, a research fellow at the Institute for Protein Design.
“This is the very beginning of machine learning in protein design. In the coming months, we will work to improve these tools to create even more dynamic and functional proteins,” Baker said.