Blog

Mitochondrial DNA’s unique power

The DNA barcode for animals is a 648 base pair (bp) fragment from the 5′ end of the mitochondrial gene cytochrome c oxidase subunit I (COI). Does this relatively short mitochondrial sequence contain enough information to make evolutionary inferences about species limits, or is it more of a rough survey method that needs to be confirmed by more data, including from nuclear genes?

In April 2008 Mol Ecol, researchers from University of Minnesota and American Museum of Natural History, New York, analyze the utility of mitochondrial as compared to nuclear DNA for inferring recent evolutionary history. Zink and Barrowclough first apply population genetic theory and then look at real data from bird species.

Based on mathematical population genetics, they find “mitochondrial loci are generally a more sensitive indicator of population structure than are nuclear loci,” primarily due to much smaller effective population size (Ne) for mitochondrial as compared to nuclear markers, which leads to more rapid sorting of differences among genetically isolated populations. Analysis of real-world data in 45 studies of differences among and within avian species confirms this expectation, ie either the patterning is consistent between mitochondrial and nuclear genes, or there are shallow mtDNA trees which are not yet reflected in nuclear genes. Reanalysis of one study, which appears to show a split in nuclear but not mitochondrial markers, suggests possible misinterpretation. Regarding other factors that could potentially lead to mistaken inferences about species limits based on mitochondrial DNA (NUMTs, sex-biased gene flow, introgression), experimental data suggests these are rarely important, at least in birds. The authors conclude “mtDNA patterns will prove to be robust indicators of population history and species limits.” Nuclear markers ARE important for deep gene trees, for detecting hybrids, and for “quantitative estimates…of rates of population growth and values of gene flow.”  

Regarding length of mitochondrial sequence, this only has to be long enough to capture differences among closely-related species. Most populations that we recognize as species differ from their closest relatives by >1% in mitochondrial coding regions (corresponding to about 0.5 million years or more of reproductive isolation). At this level, even 100 bp is generally sufficient to distinguish most closely-related species, and a 648 bp COI barcode sequence should generally allow resolution of populations/species which have been reproductively isolated for much shorter periods of time. 
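The arithmetic behind this point can be sketched in a few lines. This is only a back-of-the-envelope illustration, not something from the paper: the function names are mine, and it simply multiplies sequence length by percent divergence to get the average number of differing sites one would expect.

```python
def expected_differences(length_bp, divergence):
    """Expected number of differing sites between two aligned sequences."""
    return length_bp * divergence


def divergence_for_one_difference(length_bp):
    """Divergence at which one differing site is expected, on average."""
    return 1.0 / length_bp


# Compare a 100 bp fragment with the 648 bp COI barcode at 1% divergence.
for length_bp in (100, 648):
    diffs = round(expected_differences(length_bp, 0.01), 2)
    threshold_pct = round(100 * divergence_for_one_difference(length_bp), 2)
    print(f"{length_bp} bp: {diffs} expected differences at 1% divergence; "
          f"one difference expected at {threshold_pct}% divergence")
```

At 1% divergence a 100 bp fragment is expected to carry about one difference and the 648 bp barcode about six, which is why the longer sequence can pick up populations that have been separated for a shorter time.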

Identifying the unidentifiable

In July 2008 Wildlife Management, researchers from Smithsonian Institution report on identifying otherwise unidentifiable remnants from bird-aircraft collisions (hereafter birdstrikes). Authors Dove et al point out “birdstrikes are a serious safety hazard and a major expense for the industry”. The US Federal Aviation Administration Wildlife Mitigation site shows about 600 incidents a month over the past year, peaking in late summer and early fall, presumably coincident with fall migration. The Smithsonian Institution has been identifying birdstrike species for military and civil aviation industries since the 1960s, analyzing specimens which range from whole carcasses to bits of feathers, tissue, or blood. Prior to the availability of DNA testing, identifications relied on expert examination of detailed feather morphology with comparisons to Smithsonian’s vast bird specimen collection.

Of 1,715 birdstrike samples sent to Smithsonian Institution during 4 months in fall 2006, 821 (47.9%) contained only blood or tissue. Of these, 554 (67.5%) had amplifiable mtCOI DNA, and 535 (96.6%) of those with DNA led to species-level identifications based on reference sequences in Barcode of Life Database (BOLD). DNA barcoding identified 128 species representing 14 orders of birds, plus 2 bat species. Nineteen cases were deemed inconclusive because the DNA barcode matched a set of 2 or more closely-related species with overlapping barcodes, or the recovered sequence did not meet the authors’ 98% match criterion when compared to BOLD.
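For readers who want to check the numbers, here is a small sketch that recomputes the percentages from the counts quoted above. The variable names are mine; the counts are the ones reported in this post, and the last figure (overall yield from blood or tissue samples) is simply derived from them.

```python
# Counts as quoted in the paragraph above.
total_samples = 1715
blood_or_tissue_only = 821
amplifiable = 554
species_level_ids = 535


def pct(part, whole):
    """Percentage of part in whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


inconclusive = amplifiable - species_level_ids

print("blood/tissue only:", pct(blood_or_tissue_only, total_samples), "%")
print("amplifiable DNA:", pct(amplifiable, blood_or_tissue_only), "%")
print("species-level ID among amplifiable:", pct(species_level_ids, amplifiable), "%")
print("inconclusive cases:", inconclusive)
print("species-level ID among all blood/tissue samples:",
      pct(species_level_ids, blood_or_tissue_only), "%")
```

The last line shows that roughly two thirds of the blood or tissue samples ended up with a species name, so most of the loss comes at the DNA recovery step rather than at the matching step.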

There was much better success recovering DNA from dry samples (70%) than from samples collected with a wet paper towel (about 23%), which had been the standard method, pointing the way toward improving yield of DNA-based ID. The authors conclude with a call for applying “a combination of morphological and molecular methods such as DNA barcoding for efficient, cost-effective birdstrike identifications”.

Just as in CSI television series, DNA-based identification can make possible what would otherwise be impossible; in this case, identifying birds from bits of tissue and blood and making birdstrike identifications available to those without access to Smithsonian’s experts or vast collections. In addition to helping airlines, birdstrike ID will inform our knowledge of bird migration routes. There are many exciting discoveries ahead.

Abiotic Petroleum Bibliography

Our bibliography on abiotic petroleum origins grows bigger and better with the inclusion of another 50+ references to Russian papers, courtesy of Vladimir Kutcherov. Stay tuned for more exciting updates.

DNA identifies invasive parasitic wasp

Like the creatures Sigourney Weaver battles in Alien, parasitoids are organisms whose larvae develop in other species, usually leading to the death of the host. Insect parasitoids are widely used as biological control agents; sometimes these efforts go awry, threatening non-pest species in local ecosystems. Widespread introduction of tachinid fly parasitoid Compsilura concinnata has failed to control Gypsy moth Lymantria dispar outbreaks in eastern US, but has led to dramatic declines in large, showy Silk moths including the beautiful Luna moth Actias luna (Elkinton and Boettner 2004).

About 10% of named insect species are parasitoids, mostly wasps, but recognizing these often minute insects can be tricky. In November 2007 Conservation Genetics, researchers from Czech Academy of Sciences and University of South Bohemia, Czech Republic; Natural History Museum, London; and Imperial College London apply COI DNA sequencing to identify wasps parasitizing Canary Islands Large White butterfly Pieris cheiranthi, which is restricted to the local endemic ecosystem of relict laurel forests. Lozan et al reared 55 P. cheiranthi caterpillars from 2 Canary Island sites, and found half of the larvae from forest margin and none from central forest were parasitized with what appeared to be Cotesia glomerata, native to Europe and introduced elsewhere as a biocontrol agent, although not in the Canary Islands.

3 of 600 C. glomerata-like adult wasps reared from Canary Island White larvae and 2 of 700 C. glomerata reared in Czech Republic from European Large White P. brassicae larvae were analyzed and found to have identical 5′ COI DNA sequences (this is the same region selected as a DNA barcode for animals). The authors conclude that European C. glomerata has been accidentally introduced to Canary Islands and is threatening a local endemic butterfly already under pressure from habitat loss. Without mentioning DNA barcoding by name, the authors conclude with a call for “increased effort to sequence morphological Cotesia spp. from a broad geographical range…enabling the regular testing of species hypotheses…and the incorporation of all life stages using a single character set”. I hope that the authors can join forces and enable their sequences and associated metadata (eg collection location, specimen photographs, voucher information) from this and future Cotesia spp work to be usefully combined with the growing COI barcode database (>415,000 COI barcode records from >41,000 species in BOLD so far, including 514 records from 89 named and provisional Cotesia spp). Looking ahead, routine application of DNA-based identification to parasitoids will help establish host ranges of potential biocontrol agents and detect inadvertent introduction of broad-range parasitoids that damage local ecosystems.

Freshwater fish DNA data debut

In June 2008 PLoS ONE, thirteen researchers from nine Canadian universities, museums, and federal agencies report on mtDNA sequences from 1360 individuals representing 195 (95%) of Canada’s 205 freshwater fish species. Hubert et al follow “best practices” established for DNA barcode records (similar criteria would enhance the value of other genetic reference data as well), namely each sequence is derived from a vouchered specimen and the barcode record includes:

  • “Bi-directional sequences of at least 500 base-pairs from the approved barcode region of COI, containing no ambiguous sites
  • Links to electropherogram trace files available in the NCBI Trace Archive
  • Sequences for the forward and reverse PCR amplification
  • Species names that refer to documented names in a taxonomic publication or other documentation of the species concept used
  • Links to voucher specimens using the approved format of institutional acronym:collection code:catalog ID number”
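To make the checklist above concrete, here is a minimal sketch of how such criteria might be checked in software. It is not BOLD's or GenBank's actual validation code; the `BarcodeRecord` class, its field names, and `check_record` are all hypothetical, and the example values are made up purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class BarcodeRecord:
    # All field names are hypothetical, chosen to mirror the checklist above.
    sequence: str                      # COI barcode region sequence
    bidirectional: bool                # sequenced in both directions?
    trace_links: list = field(default_factory=list)  # links to trace files
    forward_pcr_sequence: str = ""     # forward PCR amplification sequence
    reverse_pcr_sequence: str = ""     # reverse PCR amplification sequence
    name_reference: str = ""           # documentation for the species name
    voucher_id: str = ""               # acronym:collection code:catalog ID


def check_record(rec):
    """Return a list of checklist items the record fails (empty if none)."""
    problems = []
    seq = rec.sequence.upper()
    if not rec.bidirectional:
        problems.append("sequence is not bi-directional")
    if len(seq) < 500:
        problems.append("sequence shorter than 500 bp")
    if set(seq) - set("ACGT"):
        problems.append("sequence contains ambiguous sites")
    if not rec.trace_links:
        problems.append("no trace file links")
    if not (rec.forward_pcr_sequence and rec.reverse_pcr_sequence):
        problems.append("forward/reverse PCR sequences missing")
    if not rec.name_reference:
        problems.append("species name not documented")
    parts = rec.voucher_id.split(":")
    if len(parts) != 3 or not all(p.strip() for p in parts):
        problems.append("voucher ID not in acronym:collection:catalog format")
    return problems


# Made-up example values, for illustration only.
good = BarcodeRecord(
    sequence="ACGT" * 130,
    bidirectional=True,
    trace_links=["trace-1"],
    forward_pcr_sequence="ACGTACGT",
    reverse_pcr_sequence="TGCATGCA",
    name_reference="taxonomic publication",
    voucher_id="INST:COLL:0001",
)
print(check_record(good))
```

A record that passes returns an empty list; anything else comes back as a readable list of what is missing, which is the kind of feedback a submitter would want.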

The researchers analyzed an average of 7.6 specimens/species, with an effort to sample across species ranges. A first pass look at genetic distances among and within Canadian freshwater fish shows results similar to those of other animal groups: average variation within species, 0.3%; average minimum distance between congeneric species (nearest neighbor), 8.3%; species with overlapping mtDNA sequences, 7% (4 species pairs and 1 flock of 5 species; one of the overlapping species pairs represents probable introgression). Five species showed divergent clusters differing by 1-2% in different parts of their geographic ranges, and 2 species showed larger divergences (3%, 7%); some or all of these might represent distinct species.
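For a feel of what "within-species variation" and "nearest neighbor distance" mean in practice, here is a toy sketch. It uses the simple uncorrected proportion of differing sites (p-distance), which is my simplification and not necessarily the distance model the authors used, and the short sequences are invented for illustration rather than taken from any fish.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of sites that differ between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for x, y in zip(a, b) if x != y)
    return diffs / len(a)


def mean_within(seqs):
    """Average pairwise distance among sequences of one species."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def nearest_neighbor(seqs_a, seqs_b):
    """Minimum distance between any sequence of A and any sequence of B."""
    return min(p_distance(a, b) for a in seqs_a for b in seqs_b)


# Invented 10 bp toy sequences for two hypothetical congeneric species.
species_a = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"]
species_b = ["ACGAACGAAC", "ACGAACGAAT"]

print("within A:", round(mean_within(species_a), 3))
print("A to B nearest neighbor:", round(nearest_neighbor(species_a, species_b), 3))
```

When the nearest neighbor distance is much larger than the within-species average, as in the fish data above (8.3% versus 0.3%), a sequence can be assigned to a species with little ambiguity.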

A challenge for science publishing is disseminating the large data sets that are increasingly generated. Restricting publication to only those studies with novel findings can lead to a kind of distortion, sometimes with serious consequences. The bias against negative studies, for example, is one factor contributing to the miscalculation of risks of medicines. As biodiversity genetics moves forward, we need ways to ensure high-quality work, give researchers appropriate academic credit, and disseminate results in a timely manner. PLoS ONE describes itself as “an international, peer-reviewed, open-access, online publication…that welcomes reports on primary research from any scientific discipline.” It seems to me that this sort of forum with a focus on quality rather than novelty is needed as a home for publication of large genetic data sets including DNA barcode records. Making this information available in a timely manner will in turn help drive development of analytic and display tools and enable scientific applications, such as identification of fish eggs and larvae shown above.

First barcode data release paper is published

The first barcode data release paper has just been published by PLoS ONE. These data comply with the BARCODE data standard and the paper includes a table that links data records in GenBank to museum voucher specimens and data in the BOLD workbench database. This paper provides important background to our discussions with PLoS on data publication and community-based management of data curation and publication.

Hubert N, Hanner R, Holm E, Mandrak NE, Taylor E, et al. (2008) Identifying Canadian Freshwater Fishes through DNA Barcodes. PLoS ONE 3(6): e2490. doi:10.1371/journal.pone.0002490

DNA helps sort out really big animals, crowding Ark

How many giraffes were onboard the Ark? Giraffes are classified as a single species, Giraffa camelopardalis, with five to nine subspecies proposed based on regional variation in pelage (coat pattern). In 21 December 2007 BMC Biology (open access), researchers from University of California, Los Angeles; Center for Conservation Research, Omaha Zoo; and Mpala Research Centre, Kenya, investigate genetic variation in giraffes across the African continent.

Using biopsy darts, the authors collected skin specimens from 266 giraffes at 19 localities in West, East, and South Africa. A 654 nucleotide region of mtDNA spanning cytb and control sequences was analyzed, revealing 35 haplotypes, and the remainder of the cytb gene (1709 bp total) was sequenced from one individual from each of the 35 haplotypes. The mtDNA sequences clustered into six reciprocally monophyletic lineages, which corresponded to groupings according to pelage pattern and regional location, and were largely concordant with subspecies designations.  Genetic distances suggested these groups have been reproductively isolated for 0.3 to 1.6 MY, similar to calculated divergence times among other closely-related mammals.

Analysis of 14 nuclear microsatellites from 381 individuals at 18 locations (it is not clear whether these are the same individuals as above) recovered the same six groups and suggested additional genetic subdivisions within some groups. Although at least some of the genetically and pelage-defined clusters have overlapping or adjacent ranges without geographic barriers, only three individuals (0.8%) were identified as hybrids. These findings raise interesting questions about giraffe biology; for example, is there behavioral isolation, perhaps based on visual recognition of pelage patterns?

It is impressive that species can be overlooked in such large, boldly patterned, iconic animals.  Might there be similar divisions within the numerous species of small, brown, rarely seen mammals? Routine DNA analysis of a standardized mtDNA region (aka DNA barcoding) will help discover how finely divided animal biodiversity is.  Wilson and Reeder’s Mammal Species of the World, Third Edition lists 5,419 species, so this appears to be an achievable goal for our mammalian kin (list available online https://nmnhgoph.si.edu/msw/).  I hope the authors include barcode region COI in their next analyses, so their data can be easily combined with other data sets, including the 28,560 mammalian barcode records in BOLD to date. 

Pirelli Prize

The video for the launch of the Encyclopedia of Life earned a 15,000 euro Pirelli International Award for science communication. Thanks to Avenue A/RazorFish and all other members of the team who helped prepare it.

High school students help demonstrate practicality, utility of DNA barcoding

High school students in San Diego are using DNA barcoding to survey life in San Diego Bay, ranging from invasive mussels, to gastropod egg masses on eel grass, zooplankton and endangered species. Under leadership of Dr. Jay Vavra, students developed a simplified protocol for DNA extraction and amplification that can be performed in the high school’s biotechnology laboratory, and successfully identified dried jerky meat from ostrich, turkey, and beef. They have established a collaboration with East African graduate students to apply this approach to identifying bushmeat from endangered species in local African markets.

Just two years ago, in Syst Biol 55: 844, 2006 some taxonomists worried whether DNA barcoding would ever be useful: “The truth is that DNA barcoding will not have any meaningful use for the general public and even when a portable barcoder becomes available it will not lead to any increase in the biological literacy of the man in the street.” Authors Cameron et al might want to visit their local high school!

For high school students DNA barcoding seems as natural as texting. You can analyze DNA to identify species? Sure. You only need a trace sample, like a hair or a bit of dried skin? Sure, just like CSI shows. On the other hand, many identification keys are not practical for most persons who would like to identify what is in their backyard.

Budget Hero

Continuing our interest in Serious Games, we encouraged the creation of an interactive video game about the US federal budget deficit through Sloan and Lounsbery. The game, Budget Hero, is off to a great start in the blogosphere (for example, freakonomics) and more than 40,000 people have completed a play. Congratulations to Michael Skoler, Dave Rejeski, Ben Sawyer, and other Serious Gamers.