Next up, bees

On 15 May 2008 an international assembly of bee experts gathered at York University and announced a new initiative to DNA barcode the world’s bees. Some snippets from news reports:

“According to York biology professor Laurence Packer, who’s leading the group’s efforts, precisely 19,231 different kinds of bees have been identified. But he thinks there might be another 5,000 or more species out there waiting.” (Toronto Star)

“There are two reasons why bee species would benefit from a barcode name tag. “Most of the specimens in museums are not identified, and the ones that are identified are only 60 to 70 percent correctly identified,” says Dr. Packer.” (University Affairs)

Early results, as in the figure from the poster at right, suggest that DNA barcoding will help sort out bee taxonomy and speed discovery of new species, with benefits to society and science. Bees are essential pollinators for many of the world’s plants, including many endangered species, and approximately 1/3 of the world’s food is derived directly or indirectly from bee-pollinated plants.

In common with other Hymenoptera (bees, wasps, ants), bees exhibit haplodiploidy (males haploid, females diploid), many species are eusocial (live in large colonies with single queens), and they have greatly accelerated rates of mitochondrial evolution. Are these factors causally linked? Looking at differences among and within the world’s bees may help provide fundamental insight into mitochondrial biology and evolution.

Revealing tiny food webs

As Bruegel the Elder recognized in 1557, “big fish eat little fish”. Determining exactly what eats what remains a fundamental question in modern ecology and this task is particularly challenging for biologists studying small organisms, which make up the bulk of the biomass in ecosystems. To add interest, a number of these tiny creatures with unknown diets are medically and/or economically important disease vectors.

The Metropolitan Museum of Art www.metmuseum.org

In Mol Ecol Resources May 2008 researchers from University of California, Irvine, Kenya Medical Research Institute, and Kanazawa University, Japan, apply DNA analysis to identify gut contents of larval Anopheles gambiae complex mosquitoes, the major malaria vector in Africa.

Photo Richard C. Russell

Authors Garros, Ngugi, Githeko, Tuno, and Yan collected anopheline larvae near Kisumu in western Kenya, dissected stomach contents of third and fourth instar forms, extracted DNA, and amplified an 800 bp fragment of nuclear 18s rRNA. A separate PCR assay was used to confirm species identity (five were A. gambiae s.s. and 68 were sister species A. arabiensis). According to the authors, 18s rRNA was analyzed rather than COI because “more sequences are available [for 18s than for COI] in databases for plants, fungi, and protists”. I note there are now many research groups working on “plants, fungi, and protists” so it should be possible to achieve greater resolution in this sort of study as the DNA barcode libraries are built up.

The PCR products from gut contents were first screened with a restriction endonuclease known to digest mosquito but not algal 18s. 355 PCR products from eight randomly selected larvae were screened, yielding 14 unique non-mosquito sequences. Best matches in GenBank BLAST results clustered into 3 main clades: green algae (7), fungi (5), and unknown eukaryotes (2). The authors conclude “the method presented in this study may be a promising tool to investigate natural diets of [anopheline] larvae”. Looking ahead, “such studies will not only improve our understanding of Anopheles larval ecology, but also provide fundamental information to facilitate the development of novel larval control tools.”

This study is one demonstration of how routine DNA analysis combined with expanding DNA barcode libraries will help reveal and monitor changes in a multiplicity of tiny food webs. More generally, routine DNA analysis combined with reliable DNA reference libraries opens wide avenues for rapid progress in understanding how the diversity of organisms interact, with benefits to society and science. Continued development of robust, inexpensive methods for analyzing DNA from the various types of biological samples and methods of matching the results to well-curated DNA reference libraries will speed this along.   

Finding frogs with DNA

Knowledge of how species are distributed is essential for understanding evolution and ecology, and monitoring enables detecting invasive species and recognizing effects of biological and physical environmental change. That’s easy to say, but many species are small, secretive, or difficult to distinguish from one another, so mapping species distributions requires enormous human effort and ongoing monitoring requires even more. I venture a guess that we have good monitoring for 10,000 or so plant and animal species, mostly large animals and those plants and animals of economic importance, and static distribution maps for about 100,000 species, out of a total of 1.7 million named species and not counting the projected total of 10 million species that might eventually be recognized when surveying biodiversity approaches completion. 

Just as high-resolution satellite mapping has surpassed most ground surveys in accuracy, speed, and cost, we need efficient technologies that can help detect and monitor species from environmental samples. In 9 April 2008 Early Online Biol Lett researchers from Universite Joseph Fourier and Universite de Savoie, France, and Universita Milano Bicocca, Italy, apply high-sensitivity DNA analysis to detect presence or absence of American Bullfrog Rana catesbeiana, a globally invasive species. They used PCR with species-specific primers to amplify a diagnostic 79 bp fragment of the mitochondrial gene cyt b; these primers did not amplify samples from the 5 locally native Rana species. Three 15 mL water samples were collected from each of 9 ponds (surface area 1000-10,000 m^2) in France, including “three ponds where bullfrogs were present at low density (one to two adults seen, no reproduction), three ponds where bullfrogs were present at high density (dozens of adults and thousands of tadpoles), and three ponds where bullfrogs have never been detected.” Each sample was analyzed 3-5 times, giving 9-15 repeats per pond. R. catesbeiana DNA was never detected in the ponds with no bullfrogs and was detected in water samples from all three high-density ponds, with most (79%) of replications positive. Bullfrog DNA was also detected in all low-density ponds, although fewer of the replications were positive (37%).
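To get a feel for why repeated analysis matters at low density, here is a minimal sketch of the chance that at least one of n replicates comes up positive. It rests on a simplifying assumption of my own, not the authors' analysis: each replicate is treated as independent with the same chance of a positive result, taken from the pooled rates reported above (37% and 79%). The function name is hypothetical.

```python
def prob_detected_at_least_once(p_single: float, n_replicates: int) -> float:
    """Chance that at least one of n independent PCR replicates is positive.

    Assumes every replicate has the same, independent chance p_single of
    testing positive; real replicates from one pond are unlikely to be
    fully independent.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if n_replicates < 0:
        raise ValueError("n_replicates must be non-negative")
    return 1.0 - (1.0 - p_single) ** n_replicates


if __name__ == "__main__":
    # Per-replicate positive rates reported for the two pond classes.
    for label, p in (("low density", 0.37), ("high density", 0.79)):
        for n in (1, 3, 9, 15):
            chance = prob_detected_at_least_once(p, n)
            print(f"{label}: {n:2d} replicates -> {chance:.3f}")
```

Under these assumptions, 9 replicates at a 37% per-replicate rate give roughly a 98% chance of at least one positive, which fits with bullfrog DNA turning up in all three low-density ponds.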

Ficetola et al observe “our approach allows the reliable detection of secretive organisms in wetlands without direct observation.” The authors conclude “The ongoing effort to develop DNA barcodes for identifying species from degraded DNA (Hajibabaei et al 2006; Taberlet et al 2007) will make our approach applicable to more and more plant and animal species…These factors will soon make possible the assessment of the current biodiversity of macro-organisms from environmental samples.”

Like satellite mapping 20 years ago, DNA-based environmental monitoring of biodiversity, aided by growing DNA barcode libraries, is set to expand rapidly.

Detecting aliens with DNA

Alien species sometimes damage native landscapes. In Voyage of the Beagle, in an entry dated September 19, 1832, Darwin describes the spread of an introduced European thistle Cynara cardunculus in Banda Oriental, now Uruguay: “very many (probably several hundred) square miles are covered by one mass of these prickly plants, and are impenetrable by man or beast. Over the undulating plains, where these great beds occur, nothing else can live…I doubt whether any case is on record, of an invasion on so grand a scale of one plant over the aborigines.”

The challenge is to recognize invasive species before they become established. In 11 January 2008 Polar Biology researchers from Stellenbosch University and University of Western Ontario apply DNA barcoding to otherwise unrecognizable moth larvae on sub-Antarctic Marion Island. The indigenous Lepidoptera on Marion Island comprise 2 or 3 flightless moths, and the occasional adult winged moths or butterflies have been assumed to be transients that arrived on fresh produce.

In April 2004, 3 noctuid moth larvae were found in an abandoned Wandering Albatross nest, a common habitat for one of the indigenous moth species. The larvae could be tentatively identified only to genus level and so rearing was attempted, with one larva dying after several months of pupating (as an aside, this is one example of how morphologic identifications can be laborious and/or incomplete, even for experts). The final larva was killed and preserved for DNA study; the COI DNA barcode region was amplified using standard Folmer primers. The Marion Island moth larva barcode clustered with the 40 or so Black Cutworm Agrotis ipsilon sequences in BOLD, and was distinct from COI sequences of the other 18 Agrotis species in BOLD. Agrotis ipsilon is a common pest that feeds on a wide variety of plants. The authors conclude that Agrotis ipsilon is an established alien species with the potential to disrupt local ecosystems and that “steps be taken to eradicate the species from Marion Island.”

It is easy to predict that rapid identification of potential invasive alien species will be a major application of DNA barcoding, with direct economic and ecosystem benefits.

Routine DNA ID as quality control in ecology and evolutionary biology

photo Colorado Division of Wildlife

Just as DNA analysis regularly overturns seemingly solid eyewitness identifications in crime investigations, routine DNA analysis can also help biologists avoid blunders. In 28 August 2007 Mol Ecol, researchers from University of Colorado, New Mexico State University, Pisces Molecular, and Brigham Young University report that over 20 years of restocking efforts in the western US aimed at restoring native populations of endangered greenback cutthroat trout Oncorhynchus clarkii stomias have mostly stocked a non-native, non-endangered subspecies, Colorado River cutthroat trout O. c. pleuriticus. They trace the confusion to repeated introductions beginning in the late 1800s of Colorado River cutthroat trout throughout the native range of greenback cutthroat trout. The authors analyzed mitochondrial (COI, ND2) and nuclear (microsatellites, AFLP) DNA from 365 individuals from 15 locations in 3 major river drainage systems in Colorado and surrounding states. Distinct mtDNA lineages corresponding to each subspecies were corroborated by nuclear microsatellite and AFLP data. For another cautionary tale of repeated misidentification of a widely studied organism, see Siddall and colleagues’ entertaining June 2007 Proc R Soc paper scrutinizing commercially available medicinal leeches sold as Hirudo medicinalis.

How might the future look with routine application of DNA ID as quality control? Incorporating DNA barcode analysis into Tree of Life studies is one useful approach, exemplified by two recent large-scale evolutionary studies published in January and April 2008 Syst Entomol, one on phylogenetic relationships in saturniid silkmoths, and one on higher-level relationships among 12 families in the ‘bombycoid complex’ of Lepidoptera. Both studies analyze COI barcodes of all specimens, “allowing confirmation of their identification for species present in the BOLD reference library and enabling future identifications of organisms whose identity is still pending.”

Encyclopedia of Life opens, another step toward global commons of biodiversity knowledge

On February 27, 2008, Encyclopedia of Life (EOL), “a web page for every species”, officially launched, with over 30,000 species pages, mostly fish so far, and a diversity of links to internet resources including Biodiversity Heritage Library (2.9 million pages digitized). In case you missed it, there is a thrilling, award-winning video. Following the Wikipedia model, EOL users are invited to become “curators” for one or more species pages, and later this year all are invited to submit content (photos, drawings, text, video, for example) for review. For an entertaining brief history of Wikipedia and why it keeps getting better see Nicholson Baker’s review of John Broughton’s Wikipedia: The Missing Manual in March 20, 2008 New York Review of Books.

Most near-sun comets are now discovered by amateurs, using images downloaded from the Solar and Heliospheric Observatory (SOHO), a satellite launched December 2, 1995 as part of an international collaboration between European Space Agency (ESA) and National Aeronautics and Space Administration (NASA). I expect that EOL and other open-access databases will likewise lead to many more persons contributing to biodiversity science.

Solving puzzles of mitochondrial variation within and among species

What limits mitochondrial variation within species? In January 2008 PLoS Biology researchers from Karolinska Institute, Sweden, and University of Newcastle upon Tyne, United Kingdom, report on an ingenious mouse model that shows strong purifying selection acting within a single generation, or even earlier, during embryogenesis. Stewart and colleagues employed “mtDNA mutator” mice, which are homozygous defective for a nuclear gene that encodes a proof-reading subunit of mtDNA polymerase. These mice have increased levels of mtDNA mutations in all tissues, with mutations evenly distributed along all codon positions in mtDNA protein genes, accelerated senescence and “a number of phenotypes associated with mitochondrial diseases.” mtDNA mutator mice were backcrossed to wild-type mice to produce offspring that inherited defective mitochondria but whose nuclear genome is homozygous normal at the mtDNA polymerase locus. They then sequenced entire mitochondrial genomes from 190 of these progeny individuals in N2 to N6 generations (N2 is the first backcross that is homozygous normal at the mtDNA mutator locus). To skip to the conclusion, most of the non-synonymous mitochondrial mutations were eliminated, leaving a pattern of excess synonymous mutations similar to that seen in human populations (which are the largest dataset so far for mitochondrial variation). The authors conclude that the mitochondrial population bottleneck known to occur at oogenesis, which deposits just one or a few mitochondrial genomes per oocyte, means each mitochondrial genome must stand on its own so to speak, with the result that those eggs, embryos, or offspring harboring defective mitochondria will fail to survive. My figure at right tries to illustrate part of this process.

In the same issue, David Rand, Brown University, provides a lucid commentary on Stewart et al’s research putting it in the context of mitochondrial and evolutionary biology, and suggesting next steps. Among others, he notes “the new mouse study also begs new questions about positive selection on mtDNA. …it is interesting that no signature of a selective sweep leading to fixation of a novel mtDNA variant was evident in the data”.

Purifying selection against deleterious mutations enabled by an embryonic bottleneck may save mtDNA from “mutational meltdown”. Now we need to understand more about the positive selection on mtDNA that presumably occurs when species adapt to new environments or diverge. I believe that growing mtDNA databases in the form of COI barcodes from a diversity of organisms with varying size, lifespan, population size, and reproductive strategy, in a diversity of environments including marine, terrestrial, temperate, and tropical regions will help solve this puzzle.

Testing DNA barcodes to help identify biodiversity hotspot plants including endangered and cryptic species

Plants challenge DNA barcoding. It has been difficult to identify candidate barcode regions that amplify readily and also distinguish among closely-related species. In 7 February 2008 PNAS (open access) researchers from University of Johannesburg; University of Costa Rica; Royal Botanic Gardens, Kew; and Imperial College, London, analyze potential barcode regions on specimens collected in plant biodiversity hotspots in Kruger National Park, South Africa, and Costa Rica. They initially tested eight candidate regions identified in earlier studies (coding regions accD, rpoC1, rpoB, ndhJ, ycf5, and matK, and non-coding trnH-psbA). Amplification was done according to earlier studies except that a different set of matK primers was used which appeared to be more effective. All eight regions were examined in 101 specimens representing 32 species of trees, shrubs, and achlorophyllous parasites from South Africa, and in 71 specimens representing 48 species of Costa Rican orchids (in all, 44 species with 2-7 specimens per species, and 36 species with one sample). Based on their analysis, the coding region matK with the new primer set and the non-coding region trnH-psbA were >90% effective in species identification. For reasons I do not understand, the authors favor unweighted pair group method with arithmetic mean (UPGMA) for analyzing genetic clustering, although they tested neighbor-joining, maximum likelihood, maximum parsimony, and Bayesian methods. Given the presumed advantages of a coding region barcode (ease of alignment, greater higher-level phylogenetic signal), Lahaye et al propose the 5′ region of plastid gene matK as a first-pass standard barcode for plants.

The authors then analyzed the 5′ matK barcode in a much larger sample of orchids: 1,566 specimens representing 1,084 Mesoamerican species. It is exciting that this is the largest test of candidate barcode variation within species for plants to date. They report 212 genetic clusters in the UPGMA tree, of which “86 fully matched previously recognized species and a further 25 partially matched taxonomic species…an examination of these clusters reveals cryptic species, which need further taxonomic work”. I am unsure from this short report what “partially matched taxonomic species” are and how many possible cryptic species were identified. I look forward to a more detailed report on the DNA barcodes, morphology, and range distribution of this very large sample of Mesoamerican orchids. A DNA-based method for identifying non-flowering orchids and other plants could help protect many threatened species.
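For readers unfamiliar with UPGMA, here is a minimal sketch of the clustering step in plain Python. It is my own illustration, not the authors' pipeline: the function name, the input format (a dictionary of pairwise distances), and the toy distances in the usage note are all assumptions made for the example. UPGMA repeatedly joins the two closest clusters and treats the distance to a merged cluster as the size-weighted average of the distances to its parts; it implicitly assumes lineages change at roughly the same rate.

```python
from itertools import combinations


def upgma(labels, distances):
    """Cluster labels by UPGMA and return the merges in the order they occur.

    distances maps frozenset({a, b}) to the distance between labels a and b.
    Each merge is (cluster, height), where cluster is a nested tuple and
    height is half the average distance between the two joined groups.
    """
    sizes = {label: 1 for label in labels}  # current clusters and their sizes
    dist = dict(distances)                  # working copy, extended as we merge
    merges = []
    while len(sizes) > 1:
        # Join the pair of current clusters with the smallest average distance.
        a, b = min(combinations(sizes, 2), key=lambda pair: dist[frozenset(pair)])
        joined_at = dist[frozenset((a, b))]
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        new = (a, b)
        # Distance from the new cluster to each remaining cluster is the
        # size-weighted mean of the distances from its two parts.
        for other in sizes:
            dist[frozenset((new, other))] = (
                size_a * dist[frozenset((a, other))]
                + size_b * dist[frozenset((b, other))]
            ) / (size_a + size_b)
        sizes[new] = size_a + size_b
        merges.append((new, joined_at / 2))
    return merges


if __name__ == "__main__":
    # Invented distances: A and B are close, C and D are close,
    # and the two pairs are far apart.
    toy = {
        frozenset(("A", "B")): 0.02, frozenset(("C", "D")): 0.04,
        frozenset(("A", "C")): 0.20, frozenset(("A", "D")): 0.20,
        frozenset(("B", "C")): 0.20, frozenset(("B", "D")): 0.20,
    }
    for cluster, height in upgma(["A", "B", "C", "D"], toy):
        print(cluster, round(height, 3))
```

With the toy input, A joins B first, then C joins D, and the two pairs join last. Counting "genetic clusters" in a tree like this still requires choosing where to cut it, which the sketch does not attempt.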

Deepwater clam mtDNAs map in unexplored sequence territory

Approximately 8,000 – 15,000 species of bivalves (clams, mussels, scallops, oysters, and relatives) are known. According to BOLD Taxonomy Browser www.barcodinglife.org, 620 bivalve species have COI barcode records so far, so this group is relatively unexplored genetically. In September 2007 Zoologica Scripta researchers from University of Bergen, Norway, analyze COI barcode region sequences of 62 deepwater clams dredged in a single offshore region at 69 m to 567 m, morphologically identified as 12 species from 4 genera (Thyasira, Ennucula, Nucula, Yoldiella) representing 3 subclasses of Bivalvia. The COI barcode region was amplified with broad-range primers (Folmer et al 1994). Mean differences within species collected in this single area were small, 0.0 – 0.48%, similar to results in other animal groups, suggesting assignment of specimens to species will be straightforward. This will be helpful in environmental surveys for example, as some species “are infamous for being difficult to determine to species from morphology” and some “remain difficult to identify for the non-expert.” As one example, some Thyasira species are distinguished only by sperm and egg morphology, which is impractical in most circumstances.

https://www.conchology.be/en/home/home.php

mtDNA differences among these bivalves are remarkably large, even among species in the same genus. The differences among congeneric species in this sample (average 22%, range 12-42%) are larger than differences across the entire class Aves (according to my analysis with BOLD software, COI differences among birds in different orders, such as penguins and hummingbirds for example, average 20%, with range 14-28%).
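The percentages quoted here are pairwise sequence differences. As a minimal sketch, the simplest such measure is the uncorrected proportion of differing sites (p-distance) between two aligned sequences. This is an illustration only: the function name and the toy sequences are made up, and BOLD's own tools may apply a substitution model such as Kimura 2-parameter, so their numbers can differ somewhat from a raw p-distance.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap or an ambiguous base
    (anything other than A, C, G, T) are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return differing / compared


if __name__ == "__main__":
    # Toy 8-base fragments differing at one site (not real barcode data).
    print(f"{p_distance('ACGTACGT', 'ACGTACGA'):.1%}")
```

On a full-length barcode, a 22% difference means well over a hundred differing sites between two species placed in the same genus.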

Blastn GenBank searches with these divergent mtDNA sequences showed very limited identity to anything, and the closest matches were short stretches (100-150 nucleotides of the 678-nucleotide full-length barcode sequence) to COI sequences of species outside the phylum Mollusca. (I obtained similar results submitting Thyasira sequences, for example, to the public BOLD Identification Engine at www.barcodinglife.org.) It will be helpful if Mikkelsen et al deposit their sequences along with associated collecting data (voucher specimen information, images, collection locations) in the BOLD database. I look forward to learning more about these bivalves, and whether their remarkably deep differences in mtDNA are associated with deep physiological, ecological, or other biological differences.