At the core of any life-sustaining process is a protein that must first fold itself from a chain of amino acids, its fundamental building blocks, in order to function properly. Sometimes this folding can go wrong, especially if the protein takes too long or lacks assistance.
According to computational studies by researchers at the University of Illinois at Urbana-Champaign and the Heidelberg Institute for Theoretical Studies in Heidelberg, Germany, nature has found ways to increase the folding speeds of proteins throughout much of their evolutionary history. Publishing their results in PLoS Computational Biology, the researchers studied the folding of proteins from hundreds of different genomes, observing an increase in the folding speeds of proteins until a sort of biological “Big Bang” made slower-folding proteins safer to fold.
Their analysis, which covered 92,000 proteins from 929 different genomes, required extensive computing power. However, simulating a protein’s folding from its amino acid sequence alone is still far from practical for a study this extensive. According to Gustavo Caetano-Anollés, professor of bioinformatics in the Department of Crop Sciences at the University of Illinois at Urbana-Champaign, the researchers instead relied on proteins whose structures either have already been solved and are available in the Protein Data Bank or can be inferred by computational methods.
“In the latter case this allows us to study the proteomes of thousands of organisms, provided their proteome complement is available,” Caetano-Anollés wrote in an email to The News-Letter.
Because a protein folds by bringing specific points along its amino acid chain into contact, the spacing of these points reflects how long the protein takes to fold. When the points are spaced farther apart, the protein needs more time to fold than a protein whose contact points lie closer together. Folding speed therefore does not depend on the length of the entire protein alone. This gives rise to a measure called Size-Modified Contact Order, which the researchers used to analyze the folding speeds of the proteins against an evolutionary timeline.
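To make the idea concrete, here is a minimal Python sketch of a contact-order calculation. It is an illustration under stated assumptions, not the researchers' own code: the function names, the toy contact lists and especially the size-scaling exponent are hypothetical placeholders, and the study's exact definition of Size Modified Contact Order may differ.

```python
def relative_contact_order(contacts, length):
    """Average separation along the chain between contacting residues,
    divided by the chain length.

    contacts: list of (i, j) residue positions that touch in the folded protein
    length:   total number of residues in the chain
    """
    if not contacts or length <= 0:
        raise ValueError("need at least one contact and a positive chain length")
    total_separation = sum(abs(j - i) for i, j in contacts)
    return total_separation / (len(contacts) * length)


def size_modified_contact_order(contacts, length, exponent=0.7):
    """Contact order rescaled by chain length.

    ASSUMPTION: the default exponent is a placeholder chosen for illustration;
    it is not a value quoted in the article.
    """
    return relative_contact_order(contacts, length) * length ** exponent


# Toy 10-residue chains (made-up data): one with nearby contacts,
# one with contacts spread far apart along the chain.
nearby_contacts = [(1, 5), (2, 6), (3, 7)]
distant_contacts = [(1, 10), (2, 9), (3, 8)]

print(size_modified_contact_order(nearby_contacts, 10))
print(size_modified_contact_order(distant_contacts, 10))
```

In this sketch, the chain with widely spaced contacts scores higher than the chain with nearby contacts, which corresponds to the slower folding described above.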
The results of their analysis showed that, up until 1.5 billion years ago, proteins evolved to fold more quickly, which often meant they became shorter. At that point, proteins began adopting more complex folds spanning multiple domains, without an increased risk of folding the wrong way. This was accomplished through interactions between parts of larger proteins, referred to as domains, while other proteins called chaperones helped make sure a protein folded properly.
Drawing on his lab’s previous work, published in 2009, Caetano-Anollés was able to trace the rise of key features of proteins that constituted the biological big bang, which coincided with the rise of the more complex eukaryotes and possibly multicellular organisms.
“We also know that proteins that appear before the big bang are mostly involved in core metabolic processes, translation, replication and other house-keeping tasks,” he wrote. “We know that the functions of domains that appear after the big bang are quite modern and involve for example scaffolding proteins needed for multicellular structure and many signaling systems that are specific to eukaryotes.”
In humans, a misfolded protein can lead to the aggregation that is the basis of some neurodegenerative diseases, such as Alzheimer’s disease and Creutzfeldt-Jakob disease. Usually cells can recognize when a protein’s folding goes awry and promptly degrade it, but there are cases when it is difficult for the cell to find these misfolded proteins in time. Caetano-Anollés explained that by understanding the way proteins fold, researchers can eventually learn how to mitigate misfolding.
Proteins also began to adopt more beta structures, a type of fold that typically loses the race to fold quickly to another common structure, the alpha helix. In beta structures, according to Caetano-Anollés, the amino acids linked by hydrogen bonds, one of the forces holding proteins together, lie farther apart along the chain than they do in alpha helices.
Caetano-Anollés expects to address other topics with his colleagues using computational methods, including the bridge between evolutionary biology and molecular dynamics. “However, we are also interested in early evolution of life, the diversification of organisms, the forces behind structure and function, and of course the origin and processes behind the genetic code,” he wrote.