Web initiative aims to help clear name confusion

“The first part of knowledge is getting the names right.”   Chinese proverb quoted in Evolution of Insects, Grimaldi and Engel, 2005.

Species names are the primary entrance for accessing biological knowledge about organisms. However, the tangled bank of nomenclature created by 250 years of diverse communities of taxonomic specialists working largely in isolation challenges those seeking knowledge. It can be difficult to know what is already known. Identifying even well-studied backyard organisms, such as North American ants, may require graduate-level training. As taxonomic knowledge moves increasingly onto the web, tools that enable non-specialists and specialists alike to access biological knowledge of organisms are beginning to be developed. In my view, the solution will be a combination of information science tools enabling access to biological literature together with a universal library of standardized genetic sequences, ie DNA barcodes, and simple technologies for barcode sequencing.

An exciting development in taxonomic information science is uBio (Universal Biological Indexer and Organizer, www.ubio.org), “an initiative within the science library community to join international efforts to create and utilize a comprehensive and collaborative catalog of known names of all living (and once-living) organisms. The Taxonomic Name Server (10,699,999 NameBank records so far) catalogs names and classifications to enable tools that can help users find information on living things using any of the names that may be related to an organism.”

The uBio site provides a sophisticated and enjoyable illustrated introduction (excerpt at right) to the variety of challenges in retrieving information using organism names. Another feature is Nomenclator Zoologicus, a searchable list of the names of genera and subgenera in zoology from the tenth edition of Linnaeus 1758 to the end of 2004, developed with the Zoological Society of London. uBio is helping organize and index the Encyclopedia of Life (“a web page for every species”) and the Biodiversity Heritage Library (1.124 million pages digitized and on the web so far).

I close with an example from birds. Some taxonomic confusion reflects the struggle to integrate older works that use outdated taxon names or species limits with modern knowledge. Other discordances reflect lack of consensus among current experts. Given the intensity of scientific study and public interest in birds, it is surprising there is no single authoritative world checklist, especially since most of the differences at the species level reflect minor disagreements about generic assignment, a few cases of splitting/lumping, or differences in spelling. As one step until there is an expert consensus checklist, for those interested in birds, we have prepared an “ABBI Name Lookup” (Excel, 8 MB) file for harmonizing specimen lists that recognizes 2,462 synonyms, alternate spellings, misspellings, and extinct species.
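To make the idea of harmonizing a specimen list concrete, here is a minimal sketch of a name lookup in Python. It is not the ABBI file or its format. The table entries, the function names, and the choice of which name counts as "preferred" are assumptions made purely for illustration.

```python
import re

# Hypothetical lookup table: variant name -> preferred name.
# Entries are illustrative only and are NOT taken from the ABBI file.
SYNONYMS = {
    "carduelis tristis": "Spinus tristis",       # alternate generic assignment
    "dendroica petechia": "Setophaga petechia",  # alternate generic assignment
    "spinus tristus": "Spinus tristis",          # misspelling
}


def normalize(name: str) -> str:
    """Collapse whitespace and case so trivially different spellings match."""
    return re.sub(r"\s+", " ", name.strip()).lower()


def harmonize(names):
    """Map each name to its preferred form and flag names the table does not know."""
    preferred = {normalize(v) for v in SYNONYMS.values()}
    result = []
    for raw in names:
        key = normalize(raw)
        if key in SYNONYMS:
            result.append((raw, SYNONYMS[key], "synonym"))
        elif key in preferred:
            # Restore the usual capitalization of a binomial (genus capitalized).
            result.append((raw, key.capitalize(), "accepted"))
        else:
            result.append((raw, None, "unknown"))
    return result


if __name__ == "__main__":
    for row in harmonize(["Carduelis  tristis", "Spinus tristis", "Foo bar"]):
        print(row)
```

Names the table does not recognize are flagged rather than guessed at, so they can be checked by hand.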

Optimizing PCR primers for amphibian COI sequences

“Amphibians are globally in decline, yet there is still a tremendous amount of unrecognized diversity” observed Vences et al in 2005 Phil Trans R Soc B 360:1859, the first report applying DNA barcoding to amphibian diversity. Vences and colleagues highlighted the pressing need for fast and reliable identification tools, including for eggs and larvae, which are often unrecognizable morphologically.

Here I focus on one technical aspect of DNA barcoding amphibians, namely designing primers that amplify the target sequence from a broad range of species. Previous research had shown remarkable mitochondrial sequence diversity among closely-related amphibians, and even within what appear to be single species, some of which may represent cryptic species. In the 2005 Phil Trans R Soc B paper, researchers used COI primers designed for invertebrates (Folmer et al 1994); surprisingly these “worked in a large proportion of specimens”. They concluded “We support attempts to build up a global and complete cox1 database of [animal] eukaryotes”.

In 2005 Frontiers Zool 2:5 the same group of researchers quantified their PCR amplification success on 38 individuals representing 20 amphibian species. Using a well-established primer set for vertebrate 16S (Palumbi et al 1991), 38 of 38 (100%) samples amplified; with 3 COI primer sets (1 for invertebrates, 2 for birds), 36 of 38 (95%) amplified, although there was only 50-70% success for the individual COI primer pairs. The authors did not attempt to design new primers for amphibians. They concluded “we strongly advocate use of 16S rRNA as standard DNA marker for vertebrates to complement COI”. This seems reasonable, but the advantages of standardizing on a single gene call for an effort to design primers that amplify COI from amphibians before abandoning the field to 16S or some other marker.

In 2007 Mol Ecol Notes Smith and colleagues from University of Guelph analyzed 83 amphibian COI sequences in GenBank to design new primers. The 3′ ends of the forward and reverse primers bind at 1st or 2nd codon position G-C residues, which they found to be highly conserved among amphibian species, and each primer contains three 2-fold degenerate sites. Using this set, they amplified full-length PCR products from 267 of 377 specimens (71%) representing 39 amphibian species (including Triturus vulgaris illustrated at right), and recovered an additional 34 sequences (9%) using a “mini-barcode” primer set designed for butterflies. The authors comment “many of the specimens…which failed to amplify had been fixed in formalin or were collected more than 15 years ago”, so further work to test these primers on fresh material and a diversity of species is needed.
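For readers unfamiliar with degenerate primers: a primer with three 2-fold degenerate sites is really a pool of 2 x 2 x 2 = 8 distinct oligonucleotides. The sketch below, in Python, expands a degenerate primer written with standard IUPAC codes into that pool. The primer sequence is made up for illustration and is not one of the published amphibian primers.

```python
from itertools import product

# IUPAC codes for the four plain bases and the six 2-fold degenerate codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
}


def expand_primer(primer: str):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical primer with three 2-fold degenerate sites (Y, R, W) and a
# G-C 3' end; NOT one of the primers from the paper discussed above.
example = "ACYTTRGGWCATGC"
variants = expand_primer(example)

if __name__ == "__main__":
    print(len(variants))  # 8 oligonucleotides in the pool
    for seq in variants:
        print(seq)
```

Each added degenerate site doubles the pool, which is one reason primer designers keep the number of such sites small.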

Amphibians are an exciting group. A comprehensive amphibian DNA barcode library will likely provide many, many new insights. I believe further work will help establish robust primer sets for amphibian COI sequences. 

Non-invasive DNA recovery leaves tiny specimens intact

Rowley et al Mol Ecol Notes 2007

Reference databases of DNA sequences used for species identification, ie DNA barcode libraries, are most powerful when the morphologic specimens are vouchered in a museum collection. This way, when there are puzzling results, DNA and morphologic specimens can be re-examined. However, to date it has been challenging to recover DNA from small organisms without destroying them in the process.

In Mol Ecol Notes 9 August 2007 researchers from the US Department of Agriculture and the Smithsonian Institution, National Museum of Natural History, describe a uniform protocol for “nondestructive extraction of DNA from terrestrial arthropods” including ticks, spiders, beetles, flies, and bees. Soaking specimens for 1 to 4 h in a guanidinium thiocyanate extraction buffer yielded amplifiable COI DNA from most specimens. Inspection of specimens after extraction, including with phase contrast and scanning electron microscopy, demonstrated preservation of most morphologic characters.

In Mol Ecol Notes 27 June 2007, UK researchers (University College, London, NERC Centre for Ecology and Hydrology, Oxford, and UK Environmental Agency) describe a rapid, non-destructive, chemical-free method for DNA recovery from blackflies, including adult, larval, and pupal forms. Hunter et al report that brief (1 minute) sonication in sterile water yielded 66% success with COI barcode amplification and preserved morphologic details.

These reports are exciting in the methods they describe and in how they highlight the general value of extracting DNA and determining DNA barcode sequences as an integral part of preparing traditional morphologic vouchers. 

Neotropical birds: Argentine researchers speed past halfway point

The Neotropics, comprising southern Mexico, Central America, the Caribbean, and South America, is home to over 4,000 bird species, representing over 40% of world birds. In this post, Pablo Tubaro, Museo Argentino de Ciencias Naturales (MACN), Buenos Aires, Argentina, sends this update on DNA barcoding birds of Argentina:

“This project, which started in December 2005, is a collaboration between MACN and the Biodiversity Institute of Ontario/Canadian Center for DNA Barcoding (BIO/CCDB). In November 2006 the project was boosted by a grant from the Richard Lounsbery Foundation that supports expanded collecting efforts in Argentina, training of Argentine students at CCDB (2 trained so far), and establishment of a DNA laboratory at MACN.

A special feature of this project is that it started literally from scratch. As there were no significant collections of frozen bird tissues with associated vouchers in Argentina, we started by resampling the country from north to south, conducting joint campaigns in collaboration with researchers from several North American institutions including American Museum of Natural History, Cornell University, Louisiana State University, Queen’s University, University of Alaska, and University of Kansas. At present our frozen tissue collection with associated vouchers includes more than 3100 samples and is growing rapidly. We will be doing field work at Iguazu National Park in November and December and aim to have collected 70% of Argentine birds by the year’s end.

Results so far show interspecific and intraspecific levels of divergence in COI sequence are similar to published results with North American birds. In more than 98% of cases, the COI sequences belonging to different species do not overlap. In addition, in 3% of cases Argentine birds show distinct COI sequence clusters, suggesting the possible existence of cryptic species or geographical races that deserve species status. At this moment, four doctoral and post-doctoral fellowships have been requested or are already awarded by the National Research Council of Argentina (CONICET) and the National Science Foundation of Argentina (ANPCyT) to study in depth the phylogeographic structure of some of the interesting cases revealed by our DNA barcode survey.”

Congratulations to Pablo Tubaro and his team on their rapid progress in DNA barcoding Argentine birds, creation of a significant avian tissue and skin collection at MACN, and on recognition of the value of this work by science institutions in Argentina!
 

Taxonomy without borders

341 researchers from 44 countries gathered for the Second International Barcode of Life Conference, held at Academia Sinica, Taipei, Taiwan on 17-21 September 2007 (program, participants, and abstracts at www.dnabarcodes2007.org).

Conference presentations highlighted a thrilling array of progress on diverse scientific and practical fronts since the First International Barcode of Life Conference at The Natural History Museum, London, in February 2005 (London Conference proceedings in themed issue Phil Trans R Soc 360: 2005 available through the Consortium for Barcode of Life (CBOL) website). I found the Taipei conference to be a landmark demonstration of the value to society and science of a standardized, inexpensive approach to identifying species through DNA, ie DNA barcoding. The Economist’s 20 September 2007 piece “Name, rank, and serial number” recaps results so far and looks ahead to near future societal benefits.

Near the close of the conference, David Schindel, Executive Secretary for CBOL, referred to the DNA barcode initiative as “taxonomy without borders”. Just as removing security fences benefits African wildlife, standardized inexpensive technology for species identification, ie DNA barcoding, is helping remove barriers that balkanize taxonomy and limit public access to biological knowledge. The DNA barcode initiative, together with the Encyclopedia of Life, which includes digitizing the world’s taxonomic literature, is creating powerful new ways of seeing biodiversity, with benefits to society and science.

I look forward to a future in which the multiple sectors of taxonomic and biodiversity science are densely linked to each other and to public users.

Adapted from Valdis Krebs, Emergent Online Community

Exploring mitochondrial DNA differences within species

Paul De Barro, CSIRO, Australia, recently posed the question “What is the expected level of mitochondrial variation within species?” The answer may be “almost none”. Results so far with DNA barcoding initiatives show average intraspecific variation in most animal species, whether saturniid moths or sand martins, is on the order of 0.5% or less. Here is my somewhat speculative set of inferences drawn from the finding of low variation within most animal species:

from Joron, Mallet TREE 13:461, 1998

1. Low intraspecific variation implies low effective population size (Ne); according to my back-of-the-envelope math, about 10,000 or so for most animal species. The apparent ceiling on Ne is low enough that census population size and species age, both of which might be expected to be determinants, do not contribute to intraspecific variation.

2. What about species with larger average differences in mitochondrial DNA? Most are mosaics of reproductively-isolated or partially reproductively-isolated populations, some of which might be considered separate species. According to standard models of sequence evolution, it takes tens of thousands of years of reproductive isolation for distinct lineages of mitochondrial DNA to arise; significant morphological, ecological, and behavioral differences considered characteristic of separate species may arise over that length of time as well. 

3. The paradoxical observation of large differences between species (indicating steady change) and small differences within (indicating change is constrained) implies that the pool of variants within a species changes steadily over evolutionary time scales. Like influenza virus, which regularly produces new variants that replace last year’s strains, the DNA sequences within breeding populations are continuously evolving, so that reproductive isolation over a sufficient period of time inevitably leads to genetic divergence. There may be morphologic stasis but there is no genetic stasis.

4. The usual absence of multiple lineages (with say >1% divergence in coding mtDNA) within breeding populations implies selection against hybrids and their offspring.  
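The percentages quoted above are average pairwise differences between aligned COI sequences. As a minimal sketch of how such a figure is computed, the Python below uses the uncorrected proportion of differing sites (p-distance); barcode studies often apply a substitution-model correction such as Kimura 2-parameter instead, so treat this as a simplification. The toy sequences are invented and far shorter than a real 648-bp barcode.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only sites where both sequences have an unambiguous base,
    # skipping gaps and ambiguity codes.
    compared = [(x, y) for x, y in zip(a.upper(), b.upper())
                if x in "ACGT" and y in "ACGT"]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in compared) / len(compared)


def mean_pairwise(seqs):
    """Average p-distance over all pairs of sequences from one species."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Invented 20-bp toy alignment standing in for conspecific barcodes.
toy = [
    "ACGTACGTACGTACGTACGT",
    "ACGTACGTACGTACGTACGA",
    "ACGTACGTACGTACGTACGT",
]

if __name__ == "__main__":
    print(f"mean intraspecific p-distance: {mean_pairwise(toy):.2%}")
```

Running the same function on sequences drawn from two different species gives the interspecific figure, and comparing the two is the basic barcode test.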

Now for some complex real data that challenge this simple model! In Proc R Soc B August 2007, researchers report on “Limited performance of DNA barcoding in a diverse community of tropical butterflies”. Elias and colleagues examined COI barcode region mtDNA sequences in 353 specimens from 57 species of ithomiine butterflies, most from 2 study sites in eastern Ecuador. Ithomiines are a tropical subfamily of approximately 360 species, virtually all of which are part of dizzyingly complex “rings” of Mullerian mimicry (all species distasteful) in which multiple species, some only distantly related, have nearly identical morphology. There is often marked geographic variation within what are considered single species such that different regional forms participate in different rings. For more appreciation, there are gorgeously illustrated research and other sites on ithomiines and other Mullerian mimics.

This exemplary study helps demonstrate the power of analyzing a standardized region, ie DNA barcoding, as its findings can be directly compared to results in other studies. In NJ analysis using the 273 study site specimens, the authors found that 44 of 57 species (77%) formed well-supported (>50%) clusters. When sequences from non-local specimens were added to the analysis, and considering only species with more than one congener and with local and non-local sequences, 28 of 41 species (68%) formed distinct clusters. So one might mark down this group as challenging for the DNA barcode approach to species identification.

One question is whether genetic diversity is more finely divided than current taxonomy recognizes. Differences within species sampled at distant geographic sites were as high as 8.5%, which the authors view as expected variation for tropical species with large census population sizes. Is this correct? Do larger populations support greater mitochondrial variation? According to a report last year by Bazin et al (Science 312:570, April 2006), the answer is no, but this conclusion seems not yet widely embraced. Following Bazin et al and the model outlined above, I suggest the genetically divergent forms reflect reproductively isolated allopatric populations and some might turn out to represent different species.

On the other end, some species had nearly identical COI sequences. Are these young species? The authors helpfully analyzed nuclear gene EF-1 alpha for most specimens and state that the nuclear gene sequence improved species-level identifications compared to mtCOI. On my inspection the published tree shows a similar overlap of EF-1 alpha gene sequences, which together with COI data suggests these are very closely-related young species. Recent work by some of the same authors (Nature 441:868, 14 June 2006) shows new species formation in just 3 generations in related Heliconius butterflies through hybridization, so perhaps there are mechanisms that enable very rapid emergence of distinctive forms within these butterflies. There are presumably swarms of populations within many species that are distinctive in one form or another.

As this study shows, comparing relative and absolute differences in a standardized gene region is a useful approach for exploring the genetics of biodiversity. DNA barcode data sets can help address the question of whether population size influences mitochondrial sequence variation, and in turn the answer will help in understanding the patterning of genetic diversity among and within species. I look forward to more data on ithomiines and their relatives! 

Scanning mosquito barcodes to help solve disease mystery

What limits Japanese encephalitis virus (JEV) to its current range? JEV is a mosquito-transmitted flavivirus related to yellow fever and West Nile viruses that causes approximately 40,000 human cases annually in SE Asia. Although regular epidemics occur in islands off Papua New Guinea as close as 70 km to Australia and the major JEV vector in Papua New Guinea (PNG), Culex annulirostris, is found throughout Australia, there have only been sporadic cases in Australia and the disease has not become established there.

In 29 June 2007 BMC Evol Biol researchers analyzed mitochondrial COI and nuclear ITS 1 sequences in 273 mosquitos identified as Culex annulirostris or its close relatives Cx. palpalis and Cx. sitiens, collected at 30 locations in Australia and Papua New Guinea. Hemmerter et al found that 10% of morphological identifications were incorrect, based on ITS 1 sequences, and there was “100% agreement between the ITS 1 diagnostic and the COI sequence grouping of Culex spp.” Bayesian phylogenetic trees with COI showed “distinct geographically-structured lineages” (ie possible cryptic species) within the vector species Culex annulirostris, and two of the four Cx. annulirostris lineages are restricted to PNG, with a southern limit at the top of Australia’s Cape York peninsula, “which correlates exactly with the current southern limit of JEV activity”. Analysis of blood meals reveals the Australian Cx. annulirostris feed mainly on marsupials (PNG lineages feed on wild pigs, which are the primary JEV reservoir), and laboratory studies indicate Australian Cx. annulirostris is an inefficient vector for JEV. As the authors note, it seems likely these genetically and biologically distinct lineages are different species.

One limitation of this study is that the COI region analyzed does not match the COI barcode region. By my analysis the 538-bp fragment analyzed in this study starts at position 359 in COI. As the defined COI barcode region is 648 bp starting at position 58, there is only 289 bp overlap between the sequences in this study and COI barcodes.  It appears generally straightforward to amplify COI barcodes from insects including mosquitos, so I hope the next study on genetic differences in human disease vectors will amplify the COI barcode region, as that will enable linking the results to the growing DNA barcode library, amplifying the power of the research itself. 

I conclude that routine application of standardized genetic testing, ie DNA barcoding, will help in understanding the distribution of mosquito biodiversity, with implications for human health.

Marine barcode of life initiative joins web panoply

In July 2007 the Marine Barcode of Life initiative (MarBOL) surfaced at www.marinebarcoding.org. MarBOL is “an international initiative to enhance our capacity to identify marine life by utilizing DNA barcoding”. It is an offspring of two parents: the Census of Marine Life (CoML), a ten-year initiative to assess and explain the diversity, distribution, and abundance of marine life in the oceans, and the DNA barcode initiative.

The target list for MarBOL includes the diverse invertebrates that inhabit the oceans, as well as marine mammals, fish, and birds. MarBOL will be compiling barcodes collected through CoML projects, including those focused on marine zooplankton (CMarZ), pelagic animals (TOPP), nearshore environments (NaGISA), reefs (CReefs), continental shelves (COMARGE), seamounts (CenSeam), deep water vents (ChEss), abyssal plains (CeDAMar), Arctic Ocean (ArcOD), Antarctic Ocean (CAML), northern Mid-Atlantic ridge (MAR-ECO), Gulf of Maine (GoMA), northeastern Pacific continental shelf (POST), and perhaps even marine microbes (ICoMM)! The project will also utilize barcodes collected by ongoing barcoding initiatives on fish, birds, and sponges.

Part stands for the whole

A synecdoche is a figure of speech in which a part stands for the whole, or the whole stands for a part. Taking the first, we might consider a DNA barcode as a synecdoche, in which the short barcode gene fragment stands for the whole genome. As in the figure, a COI barcode usually encapsulates the differences found elsewhere in the mitochondrial genome. Because COI barcodes generally capture the discontinuities we recognize as species, we can surmise that differences in this short mitochondrial gene fragment usually reflect differences in the nuclear genome. More study of variation within and among species will help us understand why differences in mitochondrial and nuclear genomes appear inextricably linked.


DNA barcode helps describe new goby, a vertebrate first

In 12 July 2007 Zootaxa, Benjamin Victor, Ocean Science Foundation and Nova Southeastern University, describes a new species of goby, Coryphopterus kuna, from the western Caribbean. Although species descriptions often cite DNA sequence differences as evidence for species status, the sequence data itself is usually not shown. Victor’s work is the first vertebrate species description that includes the holotype mtCOI DNA barcode, a simple step that will enable more persons to identify this fish regardless of life stage (egg, larva, and adult forms of an individual all have the same DNA of course) or whether the specimen is in bits and pieces, as in the stomach contents of a predator for example. (For a look at the strange diversity of fish larvae, see Victor’s web-based photographic guide to larval fishes of the Caribbean.)

The process that leads to taxonomic recognition of new species is often glacially slow. In this case the holotype specimen was collected off the coast of Panama in 1982, twenty-five years ago. Just as the Human Genome Project generated enormous amounts of raw sequence data, genetic explorations of biodiversity, including DNA barcoding, are creating vast amounts of data that outpace the ability of traditional species descriptions to keep up. Making the sequence and specimen data available through public databases in BOLD and GenBank might lead others to find new ways of analyzing biodiversity in addition to the stately process of formal species descriptions.