Protein Evolution and Other Musings: October 2011

Coevolution, from hummingbirds to proteins

By Gem on Monday, October 31, 2011

Coevolution (two or more biological objects evolving together) is a common feature of the evolutionary process on all levels from the molecular to the organismal. One of the most beautiful examples is that of hummingbirds and ornithophilous flowers. Hummingbirds feed on the nectar from the flowers, pollinating them in the process. In this mutually beneficial relationship, the plants have evolved flowers that attract the birds with colours that are conspicuous to the bird, and are shaped to perfectly accommodate the bird's beak. This coevolution has happened in a number of hummingbird/plant pairs.

Pic from the Wikipedia article on hummingbirds

For more information on hummingbird/plant coevolution, I direct you to the publications of Ethan Temeles. As usual though, this post will be about proteins, not whole organisms... and it will include my own crude drawings...

Fig 1. Ta-da! Hummingbird/plant coevolution is a nice analogy for protein receptor/ligand coevolution. Circles show residues directly involved in the interaction.

At the molecular level, an example of coevolution is the establishment of receptor-ligand interactions (Fig 1). The receptor protein binding site has evolved in concert with the binding site of the ligand. In Fig 1, variation of the yellow residues in the receptor is correlated with that of the green residues in the ligand. The yellow sites are close together in the structure, but not necessarily neighboring in the sequence. For example, the amino acid sequence backbone of these imaginary proteins might be arranged like this:

Fig 2. Black lines show the amino acid sequence of the protein, within its structural density.

Thus, if the structure of the binding interface is known, it's possible to predict candidate coevolving sites. However, from the sequence alone, it's not so straightforward.

As discussed in a recent paper by Gloor et al. (2010) in MBE (and references within), there are two explanations for how covarying positions come to be (and these are actually the extremes of the distribution of possible mutational effects):

1. Suppressor mutations. These arise when a mutation with a deleterious phenotype is suppressed by another mutation at a different position.
2. Covarions. These are cases when both the original residue and the mutated residue are functionally compatible, but mutation alters the spectrum of amino acids possible at another location.

Covarying sites may occur in the same protein or in different proteins (Figs 3-4).


Fig 3. Stars show between-protein correlated mutations at two interaction sites

In between-protein coevolution, green sites coevolve with yellow sites in our example. But there is also within-protein coevolution among yellow site residues and among green site residues. Imagine, for instance, a change in a green residue that multiple yellow residues interact with at different times (Fig 4). Or perhaps the middle yellow starred residue in Fig 4 mutates, changing the constraints on which residues the neighboring yellow sites can mutate to. Either way, the three yellow sites will covary. Remember that those sites are far away from each other in the sequence. So by showing that these sites covary, we can predict that they are functionally related, even if we don't have a structure.

Fig 4. Correlated mutations can also occur within one protein

Prediction of co-evolving sites can be useful for understanding cases when binding site residues are unconserved in a multiple sequence alignment. It can also be useful for predicting intermolecular interaction sites, and allosteric sites (for example Chen et al., 2006). An allosteric site can remotely affect the evolutionary pressures on a distant site by affecting the structural orientation of the protein (Fig 5).

Fig 5. Correlated mutations among binding site residues and an allosteric site.

Prediction of covarying sites is challenging, not only because the sites may not be clustered together in sequence or structure, but also because observed covariation is a mixture of signal from structural and functional constraints and background noise from shared phylogenetic ancestry and random processes.

There are two classes of methods for predicting covarying sites: tree-aware and tree-unaware. Tree-aware methods search for sites whose covariation cannot be explained by phylogenetic relationships, while tree-unaware methods ignore phylogenetic relationships, instead searching for the covarying sites with the strongest signal. The two classes of methods are discussed in Caporaso et al. (2008), in which it is concluded that tree-unaware methods perform as well as tree-aware ones.
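To make the tree-unaware idea a bit more concrete, here is a minimal sketch of one widely used tree-unaware measure, the mutual information (MI) between two alignment columns. The toy alignment and the function names are made up for illustration, gaps and sequence weighting are ignored, and this is not the code from any of the papers discussed here.

```python
from collections import Counter
from math import log2


def column(alignment, i):
    """Return column i of an alignment given as equal-length strings."""
    return [seq[i] for seq in alignment]


def entropy(symbols):
    """Shannon entropy (in bits) of a list of symbols."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def mutual_information(alignment, i, j):
    """MI(i, j) = H(i) + H(j) - H(i, j), in bits."""
    a, b = column(alignment, i), column(alignment, j)
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))


# Hypothetical toy alignment: columns 0 and 3 always change together,
# column 1 is invariant, and column 2 varies independently of column 0.
toy_alignment = ["AGCT", "AGDT", "KGCE", "KGDE"]

print(mutual_information(toy_alignment, 0, 3))  # 1.0 (perfect covariation)
print(mutual_information(toy_alignment, 0, 2))  # 0.0 (independent columns)
```

Note that nothing here knows about the tree: four sequences that are two pairs of close relatives would give the same score as four independent lineages, which is exactly the background noise problem mentioned above.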

Using a tree-unaware method, Gloor et al. (2010) examine covariation in phosphoglycerate kinase evolution. They identify nonconserved sites that covary, and through mutagenesis show that the sites are important for function and epistatic to each other (mutation in one affects the function of the other). They find that covarying positions are just as diverse within and between clades as noncovarying positions are, and suggest that most covarying positions arise from processes more like the covarion model than the suppressor mutation model.

The importance of covariation in sequence evolution is of interest to people like myself who use patterns of sequence variation to predict protein function. In studying molecular evolution of function, we largely rely on the assumption that the most functionally important positions are those that are conserved over time. Although this is generally the case, it seems that some important sites that are able to covary may slip through the net.
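As a small illustration of how covarying sites can slip through a conservation-based net, the sketch below scores each alignment column by its Shannon entropy and keeps only the low-entropy (conserved) ones. The toy alignment, the function names, and the 0.5-bit cutoff are all arbitrary assumptions chosen for the example, not values from any of the papers.

```python
from collections import Counter
from math import log2


def column_entropy(alignment, i):
    """Shannon entropy (in bits) of column i; 0 means fully conserved."""
    counts = Counter(seq[i] for seq in alignment)
    n = len(alignment)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def conserved_columns(alignment, max_entropy=0.5):
    """Indices of columns whose entropy is at or below the cutoff."""
    length = len(alignment[0])
    return [i for i in range(length)
            if column_entropy(alignment, i) <= max_entropy]


# Hypothetical toy alignment: columns 0 and 3 covary perfectly,
# but neither of them is conserved.
toy_alignment = ["AGCT", "AGDT", "KGCE", "KGDE"]

print(conserved_columns(toy_alignment))  # prints [1]
```

Only the invariant column 1 passes the conservation filter; the perfectly covarying columns 0 and 3 look just as variable as the independently varying column 2.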

Recently, I've been experimenting with the tree-unaware code of Dunn et al. (2008) to find covarying sites. Preliminary results, based on the RelA family, are... confusing. Residues that would be predicted to be interacting from the structure are not flagged up as covarying, while there are many pairs of predicted covarying sites that are physically distant and don't seem likely to be allosteric sites from the structure. It seems that, as with many real-life case studies, real biology is a little bit more complicated than naive sketches like mine would have you believe! Oh well, time to delve a little deeper into the data set...
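For readers wondering what that method does, the Dunn et al. paper describes correcting raw mutual information with an "average product correction" (APC): for each pair of columns, subtract the product of the two columns' mean MI divided by the overall mean MI. Below is a minimal sketch of that idea as I understand it from the paper. It is not the authors' code; the function names and the toy alignment are made up, and gaps, sequence weighting and significance testing are all left out.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(symbols):
    """Shannon entropy (in bits) of a sequence of symbols."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def mi_matrix(alignment):
    """Raw mutual information (bits) for every pair of columns."""
    cols = list(zip(*alignment))
    return {
        (i, j): entropy(cols[i]) + entropy(cols[j])
        - entropy(list(zip(cols[i], cols[j])))
        for i, j in combinations(range(len(cols)), 2)
    }


def mip_matrix(alignment):
    """MI minus the average product correction for every column pair."""
    mi = mi_matrix(alignment)
    if not mi:
        return {}  # fewer than two columns: no pairs to score
    n_cols = len(alignment[0])
    overall_mean = sum(mi.values()) / len(mi)
    if overall_mean == 0:
        return dict(mi)  # no covariation signal at all; nothing to correct
    # Mean MI of each column with every other column.
    col_mean = {
        i: sum(v for pair, v in mi.items() if i in pair) / (n_cols - 1)
        for i in range(n_cols)
    }
    return {
        (i, j): v - col_mean[i] * col_mean[j] / overall_mean
        for (i, j), v in mi.items()
    }


# Hypothetical toy alignment: only columns 0 and 3 covary.
toy_alignment = ["AGCT", "AGDT", "KGCE", "KGDE"]
scores = mip_matrix(toy_alignment)
best_pair = max(scores, key=scores.get)
print(best_pair, round(scores[best_pair], 3))
```

On real alignments the correction is meant to damp down columns that score highly with everything (because of shared ancestry or high entropy), so that pairs with a specific relationship stand out.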

References and further reading:

Caporaso, J., Smit, S., Easton, B., Hunter, L., Huttley, G., & Knight, R. (2008). Detecting coevolution without phylogenetic trees? Tree-ignorant metrics of coevolution perform as well as tree-aware metrics. BMC Evolutionary Biology, 8 (1). DOI: 10.1186/1471-2148-8-327

Codoñer, F. M., & Fares, M. A. (2008). Why should we care about molecular coevolution? Evolutionary Bioinformatics Online, 4, 29-38. PMID: 19204805

Chen, Y. (2006). Evolutionarily Conserved Allosteric Network in the Cys Loop Family of Ligand-gated Ion Channels Revealed by Statistical Covariance Analyses. Journal of Biological Chemistry, 281 (26), 18184-18192. DOI: 10.1074/jbc.M600349200

Dunn, S., Wahl, L., & Gloor, G. (2008). Mutual information without the influence of phylogeny or entropy dramatically improves residue contact prediction. Bioinformatics, 24 (3), 333-340. DOI: 10.1093/bioinformatics/btm604

Gloor, G., Tyagi, G., Abrassart, D., Kingston, A., Fernandes, A., Dunn, S., & Brandl, C. (2010). Functionally Compensating Coevolving Positions Are Neither Homoplasic Nor Conserved in Clades. Molecular Biology and Evolution, 27 (5), 1181-1191. DOI: 10.1093/molbev/msq004

PVC bacteria and the prokaryote to eukaryote transition... maybe not.

By Gem on Tuesday, October 11, 2011

It was an interesting hypothesis, but it seems the evidence for an origin of eukaryotes in the Planctomycetes, Verrucomicrobia, Chlamydiae (PVC) bacterial superphylum, as proposed by Devos and Reynaud in a Science article, doesn't hold up to scrutiny.

In a recent paper by James McInerney et al. in Bioessays, the authors address each of the claimed eukaryote-like features and show that they are all likely to be either analogous (the result of parallel evolution, not shared ancestry), or are the result of horizontal gene transfer (HGT) events. In the words of the authors:

PVC are no more intermediates in the prokaryote-to-eukaryote transition than dragonflies are intermediates in the evolutionary sequence linking bony fish and birds.

The Bioessays paper is an important reminder that for any grand hypotheses about evolution, distinguishing between homologous and analogous characters is critical, as is establishing the direction of inheritance. And by far the best way to address these points is by taking advantage of the mass of genomic data available.