Abstract: Phylogenomics, the inference of phylogenetic trees from genome-scale data, is becoming the rule for resolving difficult parts of the tree of life. Its promise lies in the large amount of information available, which should eliminate stochastic error. However, systematic error, which is due to the limitations of reconstruction methods, is becoming more apparent. Using animal phylogeny as a case study, we illustrate the three most efficient approaches to avoiding the pitfalls of phylogenomics: (1) using dense taxon sampling, (2) using probabilistic methods with complex models of sequence evolution that more accurately detect multiple substitutions, and (3) removing the fastest-evolving part of the data (e.g., species and positions). The analysis of a dataset of 55 animal species and 102 proteins (25712 amino acid positions) shows that inference under a standard site-homogeneous model is sensitive to the long-branch attraction artifact, whereas the site-heterogeneous CAT model is less so. The latter model correctly locates three very fast-evolving species: the appendicularian tunicate Oikopleura, the acoel Convoluta, and the myxozoan Buddenbrockia. Overall, the resulting tree is in excellent agreement with the new animal phylogeny, confirming that “simple” organisms like platyhelminths and nematodes are not necessarily of basal emergence. This further emphasizes the importance of secondary simplification
Henner BRINKMANN, Herve PHILIPPE. Animal phylogeny and large-scale sequencing: progress and pitfalls[J]. Acta Phytotaxonomica Sinica, 2008, 46(3): 274-286.