Blog

A diversity of open access DNA barcoding articles

The entire May 2009 Mol Ecol Res “Special Issue on Barcoding Life” is open access, thanks to support from Genome Canada and NSERC. As an aside, Mol Ecol Res publisher Wiley-Blackwell, which puts out over 1400 journals, charges $3000 US per article for open access, as compared to, for example, $1300 in PLoS ONE (all articles open access), and $1200 (plus $70/page) for the open access option in Proc Natl Acad Sci USA. If funders mandate open access for publications based on research they support, then either this differential will disappear, or many manuscripts will migrate to lower cost journals. The special barcoding issue is based on the Canadian Barcode of Life Network Scientific Symposium held at the Royal Ontario Museum in April 2008 and includes 27 articles on topics ranging from methodology to applications in creatures great and small including fungi and plants.

Most DNA barcoding analyses look at DNA identification through the lens of established taxonomy, i.e., how well does sequence data capture the species-level taxonomic categories established by morphologic analysis? In the special issue article “DNA barcoding and the mediocrity of morphology” researchers from York University and University of Guelph look at the comparison the other way around–how well does morphology identify the sorts of specimens that can be distinguished by DNA-based methods, barcoding in particular? In Packer and colleagues’ analysis, morphology comes up short “in numerous important situations such as the association of larvae with adults and discrimination among cryptic species.” Taking an example not entirely at random, the authors analyze a key to Agathidium genus slime mold beetles co-authored by a sometime skeptic of barcoding (Miller and Wheeler, 2005) (this key made popular news as 3 of the newly described beetles were named in tribute to then-current US government officials–A. bushi, A. cheneyi, A. rumsfeldi). As is common in keys to insect identification, the reliance on adult male characters, usually genitalia, means that females and immature forms often cannot be identified to species (for the 3 USG namesakes, the key states “female not examined” and there is no description of immature forms). Again typical of insect keys, there is no documentation of intraspecific variation in diagnostic characters (for A. cheneyi, “the holotype is the only specimen examined of this species”). As a result, Packer and colleagues note “the morphological equivalent of the barcode gap that enables molecular identification of species cannot be calculated using traditional approaches, and the sample size of illustrations upon which measures of intraspecific variation might be estimated usually averages one per species with zero variance.”
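
For readers unfamiliar with the term, the "barcode gap" mentioned above is the separation between the largest distance seen within a species and the smallest distance seen between species. The sketch below shows one simple way to compute it from aligned sequences. It is an illustration only, not the method used by Packer and colleagues: the species labels and sequences are made up, and uncorrected p-distance is an assumed choice of distance measure.

```python
from itertools import combinations


def p_distance(seq1, seq2):
    """Proportion of sites that differ between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(seq1, seq2)) / len(seq1)


def barcode_gap(records):
    """Return (max intraspecific, min interspecific, gap) distances.

    records is a list of (species_label, aligned_sequence) tuples.
    A positive gap means every between-species distance exceeds
    every within-species distance in this data set.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        target = intra if sp1 == sp2 else inter
        target.append(p_distance(s1, s2))
    if not intra or not inter:
        raise ValueError("need at least one within- and one between-species pair")
    max_intra = max(intra)
    min_inter = min(inter)
    return max_intra, min_inter, min_inter - max_intra


# Hypothetical toy data: two made-up species, two specimens each.
toy_records = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "ACGAACGGTC"),
    ("species_B", "ACGAACGGTT"),
]

if __name__ == "__main__":
    print(barcode_gap(toy_records))
```

Note that the function refuses to run when a data set has no within-species pairs, which is exactly the situation the authors describe for keys based on a single specimen per species.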

I hope that future keys for slime mold beetles will include DNA barcode sequences. This will enable anyone, scientists and public alike, with access to a DNA sequencer to identify A. cheneyi adults of both sexes, larvae, fragments in the guts of predators, and perhaps eggs in random leaf litter samples.

Coaxing DNA out of ancient insects and sediments

Deep space telescopes gather light from the early universe, providing pictures of the unimaginably remote past. What about the biological universe–can we peer back in time? Geochemical evidence suggests life on Earth arose about 3.5 billion years ago and fossils reveal what life looked like as far back as 3.0 billion years, and important fossil discoveries across that whole span of time continue to be made. What about DNA? As Carl Woese first realized, DNA sequences of living organisms contain signatures of their evolutionary relationships, and enable reconstructing history as far back as the origin of replication, even before cells and DNA. At the near end of the time scale, recovery of DNA from historical samples can help identify organisms that lived hundreds, thousands, tens of thousands, or even, in a few cases so far, hundreds of thousands of years ago.

In April 2009 PLoS ONE, ten researchers from university centers in Denmark, United Kingdom, United States, Canada, Russia, and New Zealand report on non-destructive recovery of diagnostic DNA from ancient insect specimens. As an aside, PLoS ONE is an important sea change in scientific publishing. First of all, as described on their website, the journal “features reports of original research from all disciplines within science and medicine. By not excluding papers on the basis of subject area, PLoS ONE facilitates the discovery of the connections between papers whether within or between disciplines.” Second, it puts the judgement of importance in the hands of the scientific community where it belongs:

“Too often a journal’s decision to publish a paper is dominated by what the Editor/s think is interesting and will gain greater readership — both of which are subjective judgments and lead to decisions which are frustrating and delay the publication of your work. PLoS ONE will rigorously peer-review your submissions and publish all papers that are judged to be technically sound. Judgments about the importance of any particular paper are then made after publication by the readership (who are the most qualified to determine what is of interest to them).”

This is so sensible it is surprising it has not happened earlier! There is of course a place for journals like Nature and Science, but I expect that a great deal of scientific publishing will migrate to PLoS ONE, with benefits to the authors and the scientific community.  

Back to the paper. Thomsen and colleagues first tested a non-destructive extraction method (Gilbert et al 2007 PLoS ONE 2:e272) on museum beetle specimens. This involves overnight incubation with gentle agitation in a digestion buffer at 55 °C. Remarkably, the specimens emerged none the worse for the wear. The researchers recovered 77-204 bp segments of mtCOI from all 20 beetles, which were collected as early as 1825 (1/3 were over 100 years old). Using a Bayesian approach that generates taxonomic assignments with probability estimates, these short fragments were sufficient for identification to species in most cases; the remainder could be assigned to family or genus level. The researchers then applied this same technique to insect chitin (exoskeleton) fragments preserved in permafrost dating from about 7,000 to over 47,000 years before present (BP). Here only 3 of the 14 (21%) samples (10,000-26,000 y BP) yielded amplifiable DNA, with Bayesian assignments to family or order level. Although the authors appear to have hoped for higher success, this seems pretty remarkable to me. They speculate that destructive sampling might have produced higher yields.

Saving what might be the best for last, Thomsen and colleagues tested non-frozen sediment samples that lacked visible insect parts collected in New Zealand caves and dated 1800 to 3280 years BP. Using a more or less standard extraction protocol developed by some of the authors (Willerslev et al 2003 Science 300:791), 96 bp fragments of COI (1 beetle, 1 butterfly) were recovered from 2 of 3 samples tested. The authors drily note “although the non-frozen sediment DNA approach involves destructive sampling, it has the advantage that the material is the sediment itself, which is usually abundant, and normally not too valuable to process.”

I conclude that if bits of DNA are preserved in ancient dirt then DNA from the past and present must be all around us. Perhaps single molecule sequencing methods will reveal an even greater abundance and diversity of DNA in environmental samples.

The Jack Rabbit of Depression, or Do economic slumps benefit environment?

Some wonder whether the present economic slump will elicit a change of direction and faster progress in reducing environmental harm.  Jesse Ausubel and Paul Waggoner, assisted by Smriti Rao, examined what happened to USA energy use and emissions during the slumps of the 1930s and after World War II.   A short essay, “The Jack Rabbit of Depression, or Do economic slumps benefit environment?” provides our answer.  Two 40-second animations prepared by Smriti show the year-by-year changes from 1920 to 1940 of energy intensity and carbon emissions.

Keeping a steady course rather than darting about was also a theme of Jesse’s address “Natural Gas and the Jack Rabbit” to the Power South Energy Cooperative on 22 January 2009.

Dinoflagellate diversity revealed by DNA

Peering into the vast diversity of life beyond multicellular eukaryotes (animals, plants, and fungi) is dizzying. In March 2009 Applied Environ Microbiol researchers from University of Connecticut assess dinoflagellate diversity with mitochondrial DNA sequencing. Dinoflagellates are unicellular, often photosynthetic, mostly marine plankton characteristically having two flagella and encased in a segmented hardened exterior. Dinoflagellate blooms are the cause of red tides, and dinoflagellate toxins ingested by fish and shellfish are the cause of ciguatera and paralytic shellfish poisoning. For unknown reasons, some species are bioluminescent when mechanically stimulated, producing glowing displays when perturbed by waves, fish, or kayakers, for example.

As a first step toward creating a reference library, Lin and colleagues compiled mtDNA sequences from 49 dinoflagellate species representing six orders (this included 20 COI and 60 cytochrome b sequences; 12 of the latter were newly obtained in this study). As there are about 2500 named dinoflagellate species, this is a sparsely-populated reference library so far. In addition, there were multiple samples from just 5 species, so intraspecific variation is not yet well-studied. As an aside, I note that most of the published and new sequences were derived from strains maintained at Provasoli-Guillard National Center for Culture of Marine Phytoplankton (CCMP). There is no explicit mention of CCMP in the paper or GenBank depositions, although a plankton specialist would probably recognize the source from sample designations. More generally, there is no formal documentation of taxonomic identifications (eg collection sources for cultures or photographs for environmental samples and/or individual who performed identifications). Although this is not unusual in taxonomic papers, it seems to me that identifications should be as well documented as, for example, PCR conditions.

In preparing the reference library, the researchers were unable to develop primers that amplified the barcode region of COI efficiently (ie the primers worked with some species and not others) and instead focused on cytochrome b using a primer pair that amplified a 385 bp segment. The primer difficulty is surprising given that COI is usually more conserved than cyt b (including in dinoflagellates), which should make it easier to design broad-range primers.  

The researchers then analyzed pooled environmental DNA samples prepared by filtering water specimens collected during different months at 3 marine stations in Long Island Sound and at a freshwater retention pond (Mirror Lake) on the University of Connecticut campus. While PCR products from monospecific cultures were sequenced directly, those from environmental samples were first cloned, and then 20 to 50 clones from each water sample were sequenced (total clones analyzed 450). 

Lin and co-workers obtained a large number of distinct haplotypes from the environmental samples; by my inspection of their phylogram nearly all of the clones (>420) were unique. Only a small minority could be assigned to known species or genera. On the technical side, the authors used a complex model of nucleotide substitution (TVM+G) to calculate differences among haplotypes and UPGMA to create trees, so their distance results and trees are not directly comparable to those in most DNA barcoding papers, which use K2P- or p-distances to calculate differences and neighbor-joining to create trees. In any case, according to the authors, the sequence results consistently showed greater diversity than was detected through microscopic analysis, “likely caused by the much higher detection sensitivity of PCR than of microscopic counting and by some genotypes that could not be discriminated morphologically.” The authors conclude “[w]hen a broader cob [cyt b] database becomes available, the taxon-resolving power of this gene would certainly increase.” I hope they or others will also develop efficient primer sets for amplifying COI in addition to cyt b.
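
To make the distance comparison concrete, here is a minimal sketch of the two measures common in barcoding papers: uncorrected p-distance and the Kimura 2-parameter (K2P) distance, d = -(1/2) ln[(1 - 2P - Q) * sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions. The function names and the choice to skip gapped or ambiguous sites are my assumptions for illustration; the sketch does not implement the TVM+G model used by Lin and colleagues.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_sites(seq1, seq2):
    """Count transitions, transversions and compared sites in an aligned pair.

    Sites where either sequence has a gap or a non-ACGT symbol are skipped
    (an assumed convention for this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        compared += 1
        if x == y:
            continue
        both_purines = x in PURINES and y in PURINES
        both_pyrimidines = x in PYRIMIDINES and y in PYRIMIDINES
        if both_purines or both_pyrimidines:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    return transitions, transversions, compared


def p_distance(seq1, seq2):
    """Uncorrected proportion of differing sites."""
    ts, tv, n = classify_sites(seq1, seq2)
    return (ts + tv) / n


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance for an aligned pair."""
    ts, tv, n = classify_sites(seq1, seq2)
    p, q = ts / n, tv / n
    a, b = 1 - 2 * p - q, 1 - 2 * q
    if a <= 0 or b <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(a * math.sqrt(b))


if __name__ == "__main__":
    # Made-up sequences differing by a single transition (A<->G).
    s1, s2 = "AAAAAAAAAA", "GAAAAAAAAA"
    print(p_distance(s1, s2), k2p_distance(s1, s2))
```

For closely related sequences the two measures are nearly equal; K2P grows larger than p-distance as divergence increases, because it corrects for multiple substitutions at the same site.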

Looking ahead, the reference library can be augmented relatively inexpensively by analyzing mtDNA sequences of the 2400 strains at CCMP. However, the mtDNA diversity in this study suggests dozens of new species from just 4 sampling sites around Connecticut, implying the global total of undescribed species is very large. This suggests a need for some sort of “automated species identifier”: a machine approach that would sort samples into individual cells, then photograph, sequence, apply MOTU-type analysis, for example. In the meantime, it may be necessary to work with pooled sequences from environmental samples, as is done for bacterial communities, without attempting to delineate species.
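
As a rough illustration of what "MOTU-type analysis" could mean in practice, the sketch below groups aligned sequences into molecular operational taxonomic units by single-linkage clustering: any two sequences closer than a threshold end up in the same group. Everything here is an assumption for illustration, including the function names, the use of p-distance, and the threshold value; real pipelines differ in all three.

```python
from itertools import combinations


def p_distance(seq1, seq2):
    """Proportion of sites that differ between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(seq1, seq2)) / len(seq1)


def cluster_motus(sequences, threshold=0.03):
    """Single-linkage clustering of aligned sequences.

    Returns a sorted list of clusters, each a list of indices into
    `sequences`. The default threshold is an arbitrary placeholder.
    """
    parent = list(range(len(sequences)))

    def find(i):
        # Follow parent links to the cluster root, shortening the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(sequences)), 2):
        if p_distance(sequences[i], sequences[j]) <= threshold:
            parent[find(i)] = find(j)  # merge the two clusters

    groups = {}
    for i in range(len(sequences)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    # Made-up sequences: the first two differ at one site, the third is distant.
    toy = ["ACGTACGTAC", "ACGTACGTAT", "TTTTTTTTTT"]
    print(cluster_motus(toy, threshold=0.15))
```

The number of clusters returned depends strongly on the threshold, which is one reason pooled environmental data may be easier to summarize without attempting to delineate species.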

Sushi-gate in SOI Magazine

The PHE-sponsored “Sushi-gate” investigation is featured in April 2009 Scripps Oceanographic Institute Magazine. The article, which describes work done by FISHBOL researchers Phil Hastings and Ron Burton, quotes PHE’s Mark Stoeckle and cites his daughter Kate, co-author with her high school classmate Louisa Strauss of a 2008 report on mislabeled fish sold in NYC revealed by DNA barcoding (aka “Sushi-gate”).

DNA sorts out bewildering morphology

DNA helps flag genetically divergent forms that may represent cryptic species and is equally valuable the other way around: in linking morphologically diverse forms that occur within species. In 20 January 2009 Biol Lett, researchers from National Museum of Natural History, Washington, DC; Australian Museum, Sydney; Virginia Institute of Marine Science; University of Tokyo; and Natural History Museum, Tokyo, solve the mystery of “the most extreme example of ontogenetic morphoses and sexual dimorphism in vertebrates.”

Johnson and colleagues examined specimens of small (body size 4-408 mm) deep water (1000-4000 m) fishes thought to represent 3 families in the order Stephanoberyciformes (whalefish and relatives). The authors analyzed morphology and whole mitochondrial genomes from 34 individuals of 16 species including representatives of all 5 whalefish families. They found three whalefish “families” are one: “Mirapinnidae (tapetails), Megalomycteridae (bignose fishes), and Cetomimidae (whalefishes), are larvae, males and females, respectively of a single family Cetomimidae.” These are strange-looking fish–the males, which do not feed as adults, are sustained by enormous livers, and the minute larvae have streamers up to 75 cm. For fun, see deep ocean video of live female whalefish swimming (and narration of the amazed ichthyologists) in supplementary material. Next up is to link the three life stages of each species; here DNA will help along with meristic data (quantitative features such as number of fins or scales).

Why do mitochondria differ among species?

Mitochondria are the power plants of the cell, consuming oxygen and breakdown products of sugars, amino acids, and fatty acids to produce energy as ATP and heat. As originally proposed by Lynn Margulis in 1967, mitochondria, which have their own circular genome and replicate independently of the cell, are derived from an ancient symbiosis of an alpha-proteobacterium related to gram-negative bacteria.

In multicellular animals, most of the 100+ proteins in mitochondria are encoded by nuclear genes. The mitochondrial genome is only about 16 kb (vs about 2000 kb for nearest bacterial relatives Rickettsia sp) and encodes just 13 proteins, all of which function in the electron transport chain, plus 2 ribosomal RNAs and approximately 20 tRNAs. Multiple protein-protein interactions (for example, complex I comprises 34 nuclear-encoded and 7 mitochondrial-encoded proteins) suggest there must be close co-evolution between nuclear and mitochondrial genomes; this might be one of the constraints on mitochondrial variation. Although an enormous amount of information on mitochondrial sequence differences among and within species has been compiled (through the DNA barcoding initiative and other efforts), there is surprisingly little study so far on whether mitochondrial differences among species reflect functional adaptation (although see Ruiz-Pesini et al 2004 Science 303:223, Bayona-Bafaluy et al 2004 Mol Biol Evol 22:716).

In 25 February 2009 Proc R Soc B researchers from University of Groningen, The Netherlands; Max Planck Institute for Ornithology, Germany; and Ohio State University investigate whether mitochondrial differences modulate energy metabolism in birds. As mitochondria consume 90% of respired oxygen, mitochondrial activity presumably determines basal metabolism. Tieleman and colleagues performed crosses among 3 captive-bred populations of stonechats (Saxicola torquata spp.) that differ in basal metabolic rate, which presumably reflects adaptation to different climates: Africa (Kenya, Saxicola torquata axillaris), Asia (Kazakhstan, Saxicola torquata maura), and Europe (Austria, Saxicola torquata rubicola). As an aside, I note that these three taxa are elevated to species status in current world checklists (Clements 2007, IOC Checklist; even 1992 edition of Birds of Europe notes “Siberian race may be a full species.”) This does not change the interpretation of the findings, but it does reflect the confused nature of taxonomic science that even for a group as well studied as birds, publication standards accept this laxity in taxonomic classification. Naming of bacteria in medical studies is more uniformly up to date than for multicellular animals; it seems that animal taxonomists have not found a way to establish a regularly updated consensus. In this regard the IOC Checklist suggests a way forward: “in this global world of wiki-style sharing of knowledge, we invite world birders and ornithologists alike to help us keep the IOC list accurate, vital, and accessible.”

Back to the paper. Tieleman and colleagues “tested for a genetic effect on BMR based on mitochondrial-nuclear coadaptation using hybrids between ancestral populations with high and low BMR (Europe-Africa and Asia-Europe), with different parental configurations (female-high x male-low or female-low x male-high). Hybrids with different parental configurations have on average identical mixtures of nuclear DNA, but differ in mitochondrial DNA because it is inherited only from the mother.”  The researchers found that metabolic rate differed between hybrids with contrasting parental configurations, “providing evidence for the importance of a match between mitochondrial and nuclear genomes to regulate metabolic rate.” So far so good. However, contrary to expectations, in both sets of crosses, metabolic rates in hybrids were more similar to that of the father than the mother! (see adapted figure). This result is a puzzler; it suggests there might be another factor such as genomic imprinting at work. 

Looking at the bigger picture, for those interested in mitochondrial evolution, there is a lot of opportunity: a large and growing database of COI sequences (>500,000 individuals, >50,000 species so far) that is waiting to be analyzed for evidence of purifying or positive selection, for example, or for limits to plasticity in COI amino acid sequence. I wonder if there might be convergent evolution of COI, such that diverse organisms in very cold or very hot environments, for example, might exhibit similar amino acid substitutions.

NAP books from 1983 (Thanks, Google).

Thanks to scanning by Google Books, we post two vintage reports for which Jesse early in his career was the lead scribe: Changing Climate, Report of the Carbon Dioxide Assessment Committee, National Research Council, Washington DC, 1983; and Toward an International Geosphere-Biosphere Program: A Study of Global Change, National Research Council, Washington DC, 1983.

“Changing Climate” was the first comprehensive review of the now popular global warming issue, while “Toward an IGBP” defined what became the Global Change Research Program.