[Paleopsych] SW: On the Scheme of Animal Phyla

Premise Checker checker at panix.com
Mon May 30 01:41:05 UTC 2005

Evolutionary Biology: On the Scheme of Animal Phyla

    The following points are made by M. Jones and M. Blaxter (Nature 2005
    1) Despite the comforting certainty of textbooks and 150 years of
    argument, the true relationships of the major groups (phyla) of
    animals remain contentious. In the late 1990s, a series of
    controversial papers used molecular evidence to propose a radical
    rearrangement of animal phyla [1-3]. Subsequently, analyses of
    whole-genome sequences from a few species showed strong, apparently
    conclusive, support for an older view [4-6]. New work [7] now provides
    evidence from expanded data sets that supports the newer evolutionary
    tree, and also shows why whole-genome data sets can lead
    phylogeneticists seriously astray.
    2) Traditional trees group together phyla of bilaterally symmetrical
    animals that possess a body cavity lined with mesodermal tissue, the
    "coelom" (for example, the human pleural cavity), as Coelomata. Those
    without a true coelom are classified as Acoelomata (no coelom) and
    Pseudocoelomata (a body cavity not lined by mesoderm). We call this
    tree the A-P-C hypothesis. Under A-P-C, humans are more closely
    related to the fruitfly Drosophila melanogaster than either is to the
    nematode roundworm Caenorhabditis elegans [5,6].
    3) In contrast, the new trees [1-3,7] suggest that the basic division
    in animals is between the Protostomia and Deuterostomia (a distinction
    based on the origin of the mouth during embryo formation). Humans are
    deuterostomes, but because flies and nematodes are both protostomes
    they are more closely related to each other than either is to humans.
    The Protostomia can be divided into two "superphyla": Ecdysozoa
    (animals that undergo ecdysis or moulting, including flies and
    nematodes) and Lophotrochozoa (animals with a feeding structure called
    the lophophore, including snails and earthworms). We call this tree
    the L-E-D hypothesis. In this new tree, the coelom must have arisen
    more than once, or have been lost from some phyla.
    4) Molecular analyses have been divided in their support for these
    competing hypotheses. Trees built using single genes from many species
    tend to support L-E-D, but analyses using many genes from a few
    complete genomes support A-P-C [5,6]. The number of species
    represented in a phylogenetic study can have two effects on tree
    reconstruction. First, without genomes to represent most animal phyla,
    genome-based trees provide no information on the placement of the
    missing taxonomic groups. Current genome studies do not include any
    members of the Lophotrochozoa. Second, and more notably, if a
    species' genome is evolving rapidly, tree reconstruction programs can
    be misled by a phenomenon known as long-branch attraction.
    5) In long-branch attraction, independent but convergent changes
    (homoplasies) on long branches are misconstrued as "shared derived"
    changes, causing artefactual clustering of species with long branches.
    Because these artefacts are systematic, confidence in them grows as
    more data are included, and thus genome-scale analyses are especially
    sensitive to long-branch attraction. Long branches can arise in two
    ways. One is when a distantly related organism is used as an
    "outgroup" to root the tree of the organisms of interest. The other is
    when one organism of interest has a very different, accelerated
    pattern of evolution compared with the rest.
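
    The effect described in point 5 can be illustrated with a minimal
    sketch. It is not the method of reference 7. It assumes a toy model
    with four hypothetical taxa (A, B, C, D), two-state characters, a
    true tree ((A,B),(C,D)), and per-branch change probabilities chosen
    only for illustration, with A and C on long branches. The function
    computes exactly how often each parsimony-informative site pattern
    is expected, so no random simulation is involved.

```python
from itertools import product


def pattern_probabilities(long_p, short_p):
    """Exact per-site probabilities of the three informative patterns.

    Assumed toy model: true tree ((A,B),(C,D)), binary characters,
    A and C on long branches (flip probability long_p), B, D and the
    internal branch short (flip probability short_p).
    """
    flip = {"internal": short_p, "A": long_p, "B": short_p,
            "C": long_p, "D": short_p}
    names = list(flip)
    totals = {"AB|CD": 0.0, "AC|BD": 0.0, "AD|BC": 0.0}

    # Enumerate every combination of "changed / unchanged" per branch.
    for events in product([0, 1], repeat=len(names)):
        changed = dict(zip(names, events))
        prob = 1.0
        for name in names:
            prob *= flip[name] if changed[name] else 1.0 - flip[name]

        left = 0                             # ancestor of A and B
        right = left ^ changed["internal"]   # ancestor of C and D
        a, b = left ^ changed["A"], left ^ changed["B"]
        c, d = right ^ changed["C"], right ^ changed["D"]

        if a == b != c == d:
            totals["AB|CD"] += prob   # supports the true grouping
        elif a == c != b == d:
            totals["AC|BD"] += prob   # wrongly joins the long branches
        elif a == d != b == c:
            totals["AD|BC"] += prob
    return totals


# Illustrative values only: long branches change at 0.4, short at 0.05.
for label, p in pattern_probabilities(long_p=0.4, short_p=0.05).items():
    print(f"{label}: {p:.4f}")
```

    Under these assumed values the pattern that wrongly pairs the two
    long branches (AC|BD) is expected several times more often than the
    pattern that reflects the true tree (AB|CD). Because these are
    per-site expectations, adding more sites widens the absolute margin,
    which is why larger data sets raise confidence in the artefact
    rather than correcting it.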
    References (abridged):
    1. Aguinaldo, A. M. A. et al. Nature 387, 489-493 (1997)
    2. Winnepenninckx, B. et al. Mol. Biol. Evol. 12, 1132-1137 (1995)
    3. Adoutte, A., Balavoine, G., Lartillot, N. & de Rosa, R. Trends
    Genet. 15, 104-108 (1999)
    4. Mushegian, A. R., Garey, J. R., Martin, J. & Liu, L. X. Genome Res.
    8, 590-598 (1998)
    5. Blair, J. E., Ikeo, K., Gojobori, T. & Hedges, S. B. BMC Evol.
    Biol. 2, 7 (2002)
    6. Wolf, Y. I., Rogozin, I. B. & Koonin, E. V. Genome Res. 14, 29-36
    7. Philippe, H., Lartillot, N. & Brinkmann, H. Mol. Biol. Evol. 22,
    1246-1253 (2005)
    Nature http://www.nature.com/nature
    Related Material:
    The following points are made by K.A. Crandall and J.E. Buhay (Science
    2004 306:1144):
    1) Although we have not yet counted the total number of species on our
    planet, biologists in the field of systematics are assembling the
    "Tree of Life" (1,2). The Tree of Life aims to define the phylogenetic
    relationships of all organisms on Earth. Driskell et al. (3) recently
    proposed a computational method for assembling this phylogenetic tree.
    These investigators probed the phylogenetic potential of ~300,000
    protein sequences sampled from the GenBank and Swiss-Prot genetic
    databases. From these data, they generated "supermatrices" and then
    2) Supermatrices are extremely large data sets of amino acid or
    nucleotide sequences (columns in the matrix) for many different taxa
    (rows in the matrix). Driskell et al. (3) constructed a supermatrix of
    185,000 protein sequences for more than 16,000 green plant taxa and
    one of 120,000 sequences for nearly 7500 metazoan taxa. This compares
    with a typical systematics study of, on a good day, four to six
    partial gene sequences for 100 or so taxa. Thus, the potential data
    enrichment that comes with carefully mining genetic databases is
    large. However, this enrichment comes at a cost. Traditional
    phylogenetic studies sequence the same gene regions for all the taxa
    of interest while minimizing the overall amount of missing data. With
    the database supermatrix method, the data overlap is sparse, resulting
    in many empty cells in the supermatrix, but the total data set is
    3) To solve the problem of sparseness, the authors built a
    "supertree" (4). The supertree approach estimates phylogenies for
    subsets of data with good overlap, then combines these subtree
    estimates into a supertree. Driskell et al. (3) took individual gene
    clusters and assembled them into subtrees, and then looked for
    sufficient taxonomic overlap to allow construction of a supertree. For
    example, using 254 genes (2777 sequences and 96,584 sites), the
    authors reduced the green plant supermatrix to 69 taxa from 16,000
    taxa, with an average of 40 genes per taxon and 84% missing sequences!
    This represents one of the largest data sets for phylogeny estimation
    in terms of total nucleotide information; but it is the sparsest in
    terms of the percentage of overlapping data.
    4) Yet even with such sparseness, the authors are still able to
    estimate robust phylogenetic relationships that are congruent with
    those reported using more traditional methods. Computer simulation
    studies (5) recently showed that, contrary to the prevailing view,
    phylogenetic accuracy depends more on having sufficient characters
    (such as amino acids) than on whether data are missing. Clearly,
    building a supertree allows for an abundance of characters even
    though there are many missing entries in the resulting matrix.
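
    The supermatrix layout and the sparseness figures in points 2 and 3
    can be sketched as follows. This is an illustration under stated
    assumptions, not the pipeline of Driskell et al.: the gene and taxon
    names and the toy sequences are hypothetical, "?" is an assumed
    missing-data symbol, and "missing" is counted per taxon-by-gene cell
    (a whole sequence present or absent).

```python
def build_supermatrix(gene_alignments, gap="?"):
    """Concatenate per-gene alignments into one row per taxon.

    gene_alignments maps gene name -> {taxon name: aligned sequence}.
    A taxon with no sequence for a gene gets a run of `gap` characters.
    """
    taxa = sorted({taxon for aln in gene_alignments.values() for taxon in aln})
    rows = {taxon: "" for taxon in taxa}
    for gene in sorted(gene_alignments):
        alignment = gene_alignments[gene]
        width = len(next(iter(alignment.values())))  # rows share one width
        for taxon in taxa:
            rows[taxon] += alignment.get(taxon, gap * width)
    return rows


def missing_fraction(n_taxa, n_genes, n_sequences):
    """Share of taxon-by-gene cells that hold no sequence at all."""
    return 1.0 - n_sequences / (n_taxa * n_genes)


# Hypothetical toy data: three genes, four taxa, patchy coverage.
toy = {
    "geneA": {"taxon1": "MKV", "taxon2": "MKI"},
    "geneB": {"taxon2": "LLP", "taxon3": "LIP", "taxon4": "LLP"},
    "geneC": {"taxon1": "GA", "taxon4": "GS"},
}
supermatrix = build_supermatrix(toy)
toy_missing = missing_fraction(
    len(supermatrix), len(toy), sum(len(aln) for aln in toy.values())
)

# Figures quoted in point 3: 69 taxa, 254 genes, 2777 sequences.
reported_missing = missing_fraction(69, 254, 2777)
genes_per_taxon = 2777 / 69
```

    With the figures quoted in point 3, reported_missing comes to about
    0.84 and genes_per_taxon to about 40, consistent with the 84%
    missing sequences and the average of 40 genes per taxon given
    there.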
    References (abridged):
    1. M. Pagel, Nature 401, 877 (1999)
    2. A new NSF program funds computational approaches for "assembling
    the Tree of Life" (AToL). Total AToL program funding is $13 million
    for fiscal year 2004. NSF, Assembling the Tree of Life: Program
    Solicitation NSF 04-526 (www.nsf.gov/pubs/2004/nsf04526/nsf04526.pdf)
    3. A. C. Driskell et al., Science 306, 1172 (2004)
    4. M. J. Sanderson et al., Trends Ecol. Evol. 13, 105 (1998)
    5. J. Wiens, Syst. Biol. 52, 528 (2003)
    Science http://www.sciencemag.org
    Related Material:
    The following points are made by W. Martin and T. M. Embley (Nature
    2004 431:134):
    1) Charles Darwin (1809-1882) described the evolutionary process in
    terms of trees, with natural variation producing diversity among
    progeny and natural selection shaping that diversity along a series of
    branches over time. But in the microbial world things are different,
    and various schemes have been devised to take both traditional and
    molecular approaches to microbial evolution into account. For example,
    Rivera and Lake(1), based on analysis of whole-genome sequences, call
    for a radical departure from conventional thinking.
    2) Unknown to Darwin, microbes use two mechanisms of natural variation
    that disobey the rules of tree-like evolution: lateral gene transfer
    and endosymbiosis. Lateral gene transfer involves the passage of genes
    among distantly related groups, causing branches in the tree of life
    to exchange bits of their fabric. Endosymbiosis -- one cell living
    within another -- gave rise to the double-membrane-bounded organelles
    of eukaryotic cells: mitochondria (the powerhouses of the cell) and
    chloroplasts. At the endosymbiotic origin of mitochondria, a
    free-living proteobacterium came to reside within an archaebacterially
    related host. This event involved the genetic union of two highly
    divergent cell lineages, causing two deep branches in the tree of life
    to merge outright. To this day, biologists cannot agree on how often
    lateral gene transfer and endosymbiosis have occurred in life's
    history; how significant either is for genome evolution; or how to
    deal with them mathematically in the process of reconstructing
    evolutionary trees. The report by Rivera and Lake(1) bears on all
    three issues: Instead of a tree linking life's three deepest branches
    (eubacteria, archaebacteria and eukaryotes), they uncover a ring.
    3) The ring comes to rest on evolution's sorest spot -- the origin of
    eukaryotes. Biologists fiercely debate the relationships between
    eukaryotes (complex cells that have a nucleus and organelles) and
    prokaryotes (cells that lack both). For a decade, the dominant
    approach has involved another intracellular structure called the
    ribosome, which consists of complexes of RNA and protein, and is
    present in all living organisms. The genes encoding an organism's
    ribosomal RNA (rRNA) are sequenced, and the results compared with
    those for rRNAs from other organisms. The ensuing tree(2) divides life
    into three groups called "domains". The usefulness of rRNA in
    exploring biodiversity within the three domains is unparalleled, but
    the proposal for a natural system of all life based on rRNA alone has
    come increasingly under fire.
    4) Ernst Mayr(3), for example, argued forcefully that the rRNA tree
    errs by showing eukaryotes as sisters to archaebacteria, thereby
    obscuring the obvious natural division between eukaryotes and
    prokaryotes at the level of cell organization. A central concept here
    is that of a tree's "root", which defines its most ancient branch and
    hence the relationships among the deepest-diverging lineages. The
    eukaryote-archaebacteria sister-grouping in the rRNA tree hinges on
    the position of the root. The root was placed on the eubacterial
    branch of the rRNA tree based on phylogenetic studies of genes that
    were duplicated in the common ancestor of all life(2). But the studies
    that advocated this placement of the root on the rRNA tree used, by
    today's standards, overly simple mathematical models and lacked
    rigorous tests for alternative positions(4).
    5) One discrepancy is already apparent in analyses of a key data set
    used to place the root, an ancient pair of related proteins, called
    elongation factors, that are essential for protein synthesis(5).
    Although this data set places the root on the eubacterial branch, it
    also places eukaryotes within the archaebacteria, not as their
    sisters(5). Given the uncertainties of deep phylogenetic trees based
    on single genes(4), a more realistic view is that we still don't know
    where the root on the rRNA tree lies and how its deeper branches
    should be connected.
    References (abridged):
    1. Rivera, M. C. & Lake, J. A. Nature 431, 152-155 (2004)
    2. Woese, C., Kandler, O. & Wheelis, M. L. Proc. Natl Acad. Sci. USA
    87, 4576-4579 (1990)
    3. Mayr, E. Proc. Natl Acad. Sci. USA 95, 9720-9723 (1998)
    4. Penny, D., Hendy, M. D. & Steel, M. A. in Phylogenetic Analysis of
    DNA Sequences (eds Miyamoto, M. M. & Cracraft, J.) 155-183 (Oxford
    Univ. Press, 1991)
    5. Baldauf, S., Palmer, J. D. & Doolittle, W. F. Proc. Natl Acad. Sci.
    USA 93, 7749-7754 (1996)
    Nature http://www.nature.com/nature
