I saw too many talks today to comprehensively discuss them all. Here are a few that stand out:
Matt Hahn discussed the correlation (or lack thereof) between protein sequence similarity and protein function similarity. Although we have increasingly complex models of sequence evolution (using Markov models, for example), we know almost nothing about how protein function evolves. Matt raised three questions: (1) How fast does protein function evolve? (2) Is the rate of functional evolution correlated with the rate of sequence evolution? (3) Do different types of protein families show different rates of functional evolution? Given the short time limit (15 minutes!), Matt did not conclusively answer any of these questions, but that’s not necessarily a criticism of his talk. His hypothesis was that protein function should evolve more slowly in orthologs and faster in paralogs. To test this hypothesis, Matt gathered protein function annotations from the Gene Ontology Consortium and plotted these data against rates of protein sequence evolution. He observed that (1) function appears to evolve faster in orthologs than in paralogs, and (2) there is no relationship between the rates of sequence evolution and functional evolution. Both results are surprising and difficult to explain. Obviously, Matt’s results depend on the accuracy of the Gene Ontology annotations, which are unlikely to be entirely accurate. I think Matt is asking critically important questions, but I don’t think accurate answers will be found until we develop a different method for classifying and measuring protein function.
Paul Hohenlohe discussed using RAD sequencing on the Illumina Genome Analyzer II to measure genetic variance (as Fst) in stickleback populations. (RAD sequencing was introduced by Selker et al., Genetics 2007.) Sticklebacks are ancestrally saltwater fish with bony armor plates. When sticklebacks colonize freshwater habitats, the colonizing populations lose some or all of their armor. Paul applied RAD sequencing to Alaskan stickleback populations and showed that population structure differs between the saltwater and freshwater populations. Paul’s analysis provides a compelling example of RAD sequencing as a high-throughput method for population genomics.
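For readers who haven’t met Fst before, here is a minimal sketch of the textbook definition for a single biallelic locus, Fst = (Ht - Hs) / Ht, where Ht is the expected heterozygosity of the pooled population and Hs is the average expected heterozygosity within subpopulations. This is not Paul’s pipeline; the function name `fst_biallelic` and the example allele frequencies are made up for illustration, and real RAD data would need an estimator that corrects for sample size.

```python
def fst_biallelic(freqs, sizes=None):
    """Wright's Fst for one biallelic locus.

    freqs: allele frequency of one allele in each subpopulation.
    sizes: optional subpopulation sizes used as weights (equal weights if omitted).
    Hypothetical helper for illustration; no sampling-error correction is applied.
    """
    n = len(freqs)
    if n == 0:
        raise ValueError("need at least one subpopulation")
    if sizes is None:
        weights = [1.0 / n] * n
    else:
        total = float(sum(sizes))
        weights = [s / total for s in sizes]

    # Mean allele frequency across the pooled population.
    p_bar = sum(w * p for w, p in zip(weights, freqs))
    # Expected heterozygosity of the pooled population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Weighted mean expected heterozygosity within subpopulations.
    h_within = sum(w * 2.0 * p * (1.0 - p) for w, p in zip(weights, freqs))

    if h_total == 0.0:
        # Locus is fixed everywhere, so there is no variation to partition.
        return 0.0
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    # Invented frequencies for a marine and a freshwater population.
    print(fst_biallelic([0.2, 0.8]))
```

Values near 0 mean the subpopulations share similar allele frequencies at that locus; values near 1 mean they are close to fixed for different alleles.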
Joe Felsenstein talked about “phylogenetic geometric morphometrics.” Given homologous extant morphologies, each described by a set of identified (x,y) coordinates, Joe first showed geometric techniques to rotate and translate the extant geometries so that they are “aligned,” in a fashion roughly analogous to sequence alignment. Next, given a phylogeny relating the extant morphologies, Joe discussed a model using Brownian motion to infer ancestral forms (i.e., an ancestral set of Cartesian coordinates). I’m not a developmental biologist, so I can’t offer much critique of this method. I’m curious how he plans to deal with missing data, i.e., extant morphologies with (x,y) coordinates that don’t appear in all descendants.
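To make the “rotate and translate” step concrete, here is a rough sketch of a least-squares superimposition of one 2D landmark set onto another. I’m assuming something like an ordinary Procrustes fit without scaling; I don’t know the exact technique Joe uses, and the function names (`centroid`, `align_to_reference`) are my own. It assumes both shapes list the same homologous landmarks in the same order.

```python
import math


def centroid(points):
    """Mean (x, y) of a list of 2D landmarks."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def align_to_reference(shape, reference):
    """Translate and rotate `shape` to best match `reference` (least squares).

    Both arguments are lists of (x, y) landmarks in the same order.
    No scaling or reflection is applied; this is an illustrative assumption.
    """
    if len(shape) != len(reference):
        raise ValueError("shapes must have the same number of landmarks")

    sx, sy = centroid(shape)
    rx, ry = centroid(reference)
    # Translation step: move both centroids to the origin.
    a = [(x - sx, y - sy) for x, y in shape]
    b = [(x - rx, y - ry) for x, y in reference]

    # Rotation step: in 2D the least-squares angle has a closed form.
    dot = sum(ax * bx + ay * by for (ax, ay), (bx, by) in zip(a, b))
    cross = sum(ax * by - ay * bx for (ax, ay), (bx, by) in zip(a, b))
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)

    # Rotate the centered shape, then place it on the reference centroid.
    return [(c * ax - s * ay + rx, s * ax + c * ay + ry) for ax, ay in a]


if __name__ == "__main__":
    reference = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 3.0)]
    # Same outline, rotated by 0.7 radians and shifted.
    c, s = math.cos(0.7), math.sin(0.7)
    moved = [(c * x - s * y + 5.0, s * x + c * y - 2.0) for x, y in reference]
    for point in align_to_reference(moved, reference):
        print(point)
```

With more than two specimens, one would presumably iterate this against a running mean shape, which is where the analogy to multiple sequence alignment gets closer.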
Finally, James Foster talked about “evolutionary computation.” His premise was that any process that demonstrates replication, variation, and selection will necessarily demonstrate evolution. James’ point is that evolution can take place on digital artifacts as well as biological artifacts. He gave several examples of genetic algorithms applied to problems as far-reaching as ML phylogenetic estimation (Zwickl 2006), electronic circuit construction (Koza 1985), and jet engine design (Rechenberg 1966). I totally agree with James’ point that evolutionary computation is useful for solving a wide gamut of problems, but I’m afraid his point fell on many deaf ears at this biologically focused conference.
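The replication, variation, and selection recipe is easy to see in a toy genetic algorithm. The sketch below is my own illustration, not one of James’ examples: it evolves bit strings toward all ones (the classic “OneMax” toy problem), and the parameter values and function names are arbitrary choices.

```python
import random


def fitness(genome):
    """Number of 1 bits; the toy objective being maximized."""
    return sum(genome)


def mutate(genome, rate, rng):
    """Variation: copy the genome, flipping each bit with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in genome]


def evolve(pop_size=30, genome_len=20, generations=50, rate=0.05, seed=0):
    """Run a minimal genetic algorithm and return (population, best-fitness history)."""
    rng = random.Random(seed)
    population = [
        [rng.randint(0, 1) for _ in range(genome_len)] for _ in range(pop_size)
    ]
    history = [max(fitness(g) for g in population)]

    for _ in range(generations):
        # Selection: only the fitter half of the population gets to reproduce.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        # Replication with variation: offspring are mutated copies of parents.
        offspring = [
            mutate(rng.choice(parents), rate, rng)
            for _ in range(pop_size - len(parents))
        ]
        # Parents survive unchanged, so the best fitness never decreases.
        population = parents + offspring
        history.append(max(fitness(g) for g in population))

    return population, history


if __name__ == "__main__":
    _, history = evolve()
    print("best fitness at start:", history[0])
    print("best fitness at end:  ", history[-1])
```

Nothing here knows anything about biology: swap the bit string for a tree topology or a circuit layout and the fitness function for a likelihood or a performance score, and the same loop applies.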
Okay, that’s it for now.