Coevolution (two or more biological objects evolving together) is a common feature of the evolutionary process on all levels from the molecular to the organismal. One of the most beautiful examples is that of hummingbirds and ornithophilous flowers. Hummingbirds feed on the nectar from the flowers, pollinating them in the process. In this mutually beneficial relationship, the plants have evolved flowers that attract the birds with colours that are conspicuous to the bird, and are shaped to perfectly accommodate the bird's beak. This coevolution has happened in a number of hummingbird/plant pairs.
Pic from Wikipedia article on hummingbirds
For more information on hummingbird/plant coevolution, I direct you to the publications of Ethan Temeles. As usual, though, this post will be about proteins rather than whole organisms... and it will include my own crude drawings.
Fig 1. Ta-da! Hummingbird/plant coevolution is a nice analogy for protein receptor/ligand coevolution. Circles show residues directly involved in the interaction.
At the molecular level, an example of coevolution is the establishment of receptor-ligand interactions (Fig 1). The binding site of the receptor protein has evolved in concert with the binding site of the ligand. In Fig 1, variation of the yellow residues in the receptor is correlated with that of the green residues in the ligand. The yellow sites are close together in the structure, but not necessarily neighboring in the sequence. For example, the amino acid sequence backbone of these imaginary proteins might be arranged like this:
Fig 2. Black lines show the amino acid sequence of the protein, within its structural density.
Thus, if the structure of the binding interface is known, it's possible to predict candidate coevolving sites. From the sequence alone, however, it's not so straightforward.
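To make the structure-based route concrete, here is a minimal Python sketch that lists receptor/ligand residue pairs lying within a distance cutoff of each other. Everything in it is an assumption for illustration: the coordinates are made up, each residue is reduced to a single representative point, and the 5 angstrom cutoff and the name `interface_pairs` are arbitrary choices, not taken from any of the papers discussed here.

```python
from math import dist

def interface_pairs(receptor, ligand, cutoff=5.0):
    """Return sorted (receptor_pos, ligand_pos) pairs within `cutoff` angstroms."""
    pairs = []
    for r_pos, r_xyz in receptor.items():
        for l_pos, l_xyz in ligand.items():
            if dist(r_xyz, l_xyz) <= cutoff:
                pairs.append((r_pos, l_pos))
    return sorted(pairs)

# Made-up data: sequence position -> (x, y, z) of one representative atom.
receptor = {
    12: (0.0, 0.0, 0.0),
    57: (3.0, 0.0, 0.0),
    130: (6.0, 0.0, 0.0),
    88: (40.0, 0.0, 0.0),  # far from the interface
}
ligand = {
    5: (0.0, 3.5, 0.0),
    41: (6.0, 3.5, 0.0),
}

candidates = interface_pairs(receptor, ligand)
print(candidates)
```

Note that receptor positions 12, 57 and 130 all turn up as candidates even though they are far apart in the sequence, which is exactly the situation sketched in Fig 2.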
As discussed in a recent paper by Gloor et al. in MBE (and references therein), there are two explanations for how covarying positions come to be (these are really the two extremes of the distribution of possible mutational effects):
1. Suppressor mutations. These arise when a mutation with a deleterious phenotype is suppressed by another mutation at a different position.
2. Covarions. These are cases where both the original residue and the mutated residue are functionally compatible, but the mutation alters the spectrum of amino acids possible at another position.
Covarying sites may occur in the same protein or in different proteins (Figs 3-4).
Fig 3. Stars show between-protein correlated mutations at two interaction sites.
In between-protein coevolution, green sites coevolve with yellow sites in our example. But there is also within-protein coevolution among the yellow site residues and among the green site residues. Imagine, for instance, a change in a green residue that multiple yellow residues interact with at different times (Fig 4). Or perhaps the middle yellow starred residue in Fig 4 mutates, changing the constraints on which residues the neighboring yellow sites can mutate to. Either way, the three yellow sites will covary. Remember that those sites are far away from each other in the sequence. So by showing that these sites covary, we can predict that they are functionally related, even if we don't have a structure.
Fig 4. Correlated mutations can also occur within one protein.
Prediction of coevolving sites can be useful for understanding cases where binding site residues are not conserved in a multiple sequence alignment. It can also be useful for predicting intermolecular interaction sites and allosteric sites (for example, Chen et al., 2006). An allosteric site can remotely affect the evolutionary pressures on a distant site by affecting the structural orientation of the protein (Fig 5).
Fig 5. Correlated mutations among binding site residues and an allosteric site.
Prediction of covarying sites is challenging, not only because the sites may not be clustered together in sequence or structure, but also because the observed covariation is a mixture of signal from structural and functional constraints and background noise from shared phylogenetic ancestry and random processes.
There are two classes of methods for predicting covarying sites: tree-aware and tree-unaware. Tree-aware methods search for sites whose covariation cannot be explained by phylogenetic relationships, while tree-unaware methods ignore phylogenetic relationships and instead search for the covarying sites with the strongest signal. The two classes of methods are discussed in Caporaso et al. (2008), who conclude that tree-unaware methods perform as well as tree-aware ones.
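To give a flavour of what a tree-unaware score looks like, here is a minimal Python sketch of mutual information (MI) between two alignment columns, one commonly used score of this kind. The toy alignment and the function name are made up for illustration, and the sketch ignores gaps, small-sample bias and the corrections that real methods apply, so treat it as a cartoon rather than anyone's published method.

```python
from collections import Counter
from math import log2

def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two equal-length alignment columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi

# Toy alignment (made up): one sequence per row, one site per column.
alignment = ["ADKL", "ADRL", "GEKL", "GERL"]
columns = list(zip(*alignment))

# Columns 0 and 1 always change together; column 2 varies independently
# of column 0; column 3 is fully conserved.
for i, j in [(0, 1), (0, 2), (0, 3)]:
    print(i, j, round(mutual_information(columns[i], columns[j]), 3))
```

Notice that the conserved column scores zero against everything: a site has to vary before it can covary.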
Using a tree-unaware method, Gloor et al. (2010) examine covariation in phosphoglycerate kinase evolution. They identify nonconserved sites that covary, and show through mutagenesis that the sites are important for function and epistatic to each other (a mutation in one affects the function of the other). They find that covarying positions are just as diverse within and between clades as noncovarying positions, and suggest that most covarying positions arise from processes more like the covarion model than the suppressor mutation model.
The importance of covariation in sequence evolution is of interest to people like myself who use patterns of sequence variation to predict protein function. In studying molecular evolution of function, we largely rely on the assumption that the most functionally important positions are those that are conserved over time. Although this is generally the case, it seems that some important sites that are able to covary may slip through the net.
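Here is a toy Python illustration of how such sites slip through a conservation-only net. It scores each column of a made-up alignment by Shannon entropy (0 bits means fully conserved) and then checks whether two variable columns change in lockstep. The alignment, the function name and the "entropy equals zero" definition of conserved are all assumptions chosen to keep the sketch short.

```python
from collections import Counter
from math import log2

def column_entropy(column):
    """Shannon entropy (bits) of one alignment column; 0 means fully conserved."""
    n = len(column)
    return sum((c / n) * log2(n / c) for c in Counter(column).values())

# Toy alignment (made up): one sequence per row, one site per column.
alignment = ["ADKL", "ADRL", "GEKL", "GERL"]
columns = list(zip(*alignment))

entropies = [column_entropy(col) for col in columns]
conserved = [i for i, h in enumerate(entropies) if h < 1e-12]

# Columns 0 and 1 are in lockstep if every residue in one column is always
# seen with the same partner residue in the other (A with D, G with E).
pairs = set(zip(columns[0], columns[1]))
paired = len(pairs) == len(set(columns[0])) == len(set(columns[1]))

print(entropies, conserved, paired)
```

A conservation filter keeps only column 3 here, while columns 0 and 1 are discarded despite changing together perfectly.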
Recently, I've been experimenting with the tree-unaware code of Dunn et al. (2008) to find covarying sites. Preliminary results, based on the RelA family, are... confusing. Residues that would be predicted to interact from the structure are not flagged as covarying, while there are many pairs of predicted covarying sites that are physically distant and, judging from the structure, don't seem likely to be allosteric sites. It seems that, as with many real-life case studies, real biology is a little more complicated than naive sketches like mine would have you believe! Oh well, time to delve a little deeper into the data set...
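For the curious, here is a rough Python sketch of the kind of background correction the title of the Dunn et al. paper refers to, as I understand it: each pair's raw mutual information (MI) is reduced by the product of the two columns' average MI with all other columns, divided by the overall average. This is not their code. The function name and the raw MI values below are made up, and the real method has details that I'm leaving out.

```python
def apc_corrected(mi):
    """Subtract an average-product background term from a symmetric MI matrix.

    `mi` is a square list of lists; the diagonal is ignored.
    Returns a dict mapping (i, j) with i < j to the corrected score.
    """
    n = len(mi)
    # Mean MI of each column with every other column.
    col_mean = [sum(mi[i][j] for j in range(n) if j != i) / (n - 1) for i in range(n)]
    # Mean MI over all distinct column pairs.
    n_pairs = n * (n - 1) / 2
    overall = sum(mi[i][j] for i in range(n) for j in range(i + 1, n)) / n_pairs
    if overall == 0:
        raise ValueError("no mutual information anywhere; nothing to correct")
    return {
        (i, j): mi[i][j] - col_mean[i] * col_mean[j] / overall
        for i in range(n)
        for j in range(i + 1, n)
    }

# Made-up raw MI values for six columns: a background of 0.2 everywhere,
# column 0 scoring 0.6 with every other column, and one specific pair (2, 3)
# that also scores 0.6.
n = 6
raw = [[0.0 if i == j else 0.2 for j in range(n)] for i in range(n)]
for j in range(1, n):
    raw[0][j] = raw[j][0] = 0.6
raw[2][3] = raw[3][2] = 0.6

corrected = apc_corrected(raw)
best_pair = max(corrected, key=corrected.get)
print(best_pair, round(corrected[best_pair], 3))
```

In the raw values, pair (2, 3) is tied with every pair involving the promiscuous column 0. After the correction it stands out on its own, which is the behaviour one hopes for when trying to separate specific covariation from background.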
References and further reading:
Caporaso, J., Smit, S., Easton, B., Hunter, L., Huttley, G., & Knight, R. (2008). Detecting coevolution without phylogenetic trees? Tree-ignorant metrics of coevolution perform as well as tree-aware metrics. BMC Evolutionary Biology, 8 (1). DOI: 10.1186/1471-2148-8-327
Codoñer, F. M., & Fares, M. A. (2008). Why should we care about molecular coevolution? Evolutionary Bioinformatics Online, 4, 29-38. PMID: 19204805
Chen, Y. (2006). Evolutionarily Conserved Allosteric Network in the Cys Loop Family of Ligand-gated Ion Channels Revealed by Statistical Covariance Analyses. Journal of Biological Chemistry, 281 (26), 18184-18192. DOI: 10.1074/jbc.M600349200
Dunn, S., Wahl, L., & Gloor, G. (2007). Mutual information without the influence of phylogeny or entropy dramatically improves residue contact prediction. Bioinformatics, 24 (3), 333-340. DOI: 10.1093/bioinformatics/btm604
Gloor, G., Tyagi, G., Abrassart, D., Kingston, A., Fernandes, A., Dunn, S., & Brandl, C. (2010). Functionally Compensating Coevolving Positions Are Neither Homoplasic Nor Conserved in Clades. Molecular Biology and Evolution, 27 (5), 1181-1191. DOI: 10.1093/molbev/msq004