DNA Barcoding – Page 16 – The Rockefeller University

Evidence

May 26, 2010

What is the evidence that DNA barcoding is a reliable method for species identification?

For this commentary, “DNA barcoding” refers to nucleotide sequencing of PCR-amplified DNA corresponding to an approved barcode region, namely 5′ portion of COI for animals or rbcL + matK for land plants; and “species identification” refers to assigning the name of a known species to a specimen of unknown identity.

Acceptance by scientific community. For identification of known species, I think it is fair to say that DNA testing in general and DNA barcoding in particular are generally accepted in the scientific community as reliable methods. For example, the Canadian Centre for DNA Barcoding website has a compilation of peer-reviewed publications, which includes over 500 articles published since 2003. The primary limitation to identification is whether the relevant species and close relatives have been documented in the databases at the time they are queried. The BOLD database is strongest for multicellular animals (> 1,000,000 records as of May 2010; see chart), particularly arthropods and chordates. For plants, the general principles are the same, but so far there is much less documentation, as plant barcodes were not agreed upon until last year (Hollingsworth et al PNAS May 2009), and there was not a large set of pre-existing data to work with. Nonetheless, DNA barcoding of plants is ready for practical application and is providing immediately useful information (e.g. “DNA barcoding exposes a case of mistaken identity in the fern horticultural trade” Pryer et al, Mol Ecol Resources April 2010). For fungi, from perusing the database it appears that ITS (internal transcribed spacer) and COI are informally accepted as barcodes. For protists and other domains of life, results so far suggest COI will serve as a primary barcode.

Most articles focus on DNA barcoding in a particular group and assess the accuracy of identification in that group. For example, in “DNA barcoding of commercially important salmon and trout species (Oncorhynchus and Salmo) from North America” (J Agricultural Food Chem 57:8379, 2009), Rasmussen and colleagues analyzed more than 1000 samples representing the 7 commercially important salmonid species from 143 sites across western North America, including Alaska and Canada (to capture possible variation within species). The authors found 100% separation of these species by DNA barcoding, i.e., distances among species were always greater than within species.
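The separation criterion described above, that distances among species always exceed distances within species, can be checked mechanically. What follows is a minimal sketch in Python, not the analysis used by Rasmussen and colleagues. It assumes sequences are already aligned to equal length and uses the uncorrected p-distance (the fraction of sites that differ); the function names, species labels, and 20-base sequences are invented for illustration.

```python
from itertools import combinations

def p_distance(a, b):
    """Fraction of aligned sites that differ (uncorrected p-distance)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def barcode_gap(records):
    """records: list of (species, aligned_sequence) pairs.

    Returns (largest within-species distance, smallest between-species distance).
    """
    within, between = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (within if sp1 == sp2 else between).append(p_distance(s1, s2))
    return max(within, default=0.0), min(between, default=float("inf"))

# Invented toy data: real COI barcodes are about 650 bases long.
toy = [
    ("species_A", "ACGTACGTACGTACGTACGT"),
    ("species_A", "ACGTACGTACGTACGTACGA"),
    ("species_B", "ACGTTCGTACCTACGTTCGA"),
    ("species_B", "ACGTTCGTACCTACGTTCGT"),
]

max_within, min_between = barcode_gap(toy)
# Species are "100% separated" in this sense when every between-species
# distance is larger than every within-species distance.
separated = min_between > max_within
print(max_within, min_between, separated)
```

Published studies often use a corrected distance model and tree-based methods rather than raw p-distance; the comparison of within-species and between-species distances is the same idea either way.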

Forensic application. DNA barcoding for species identification has been used in legal cases (e.g. Cohen et al J Food Protection 72: 810, 2009). More general evidence is presented by Dawnay et al in “Validation of the barcoding gene COI for use in forensic genetic species identification” (Forensic Sci International 173:1, 2007). The authors conclude “this study demonstrates that the cytochrome c oxidase I gene enables accurate animal species identification where adequate reference sequence data exists.” As with any laboratory method, quality control and quality assurance (QA/QC) measures are essential (e.g. Morin et al J Heredity 101:1, 2010).

DNA barcode identification was designed to be a simple, straightforward method appropriate for wide use, and the results so far amply bear this out, including its use by high school students (e.g., “FDA pressured to combat rising ‘food fraud’,” Lyndsey Layton, Washington Post March 30, 2010). One aspect that needs work, in my opinion, is better explanation of the algorithms used for matching sequences to the databases and of what the results mean. It still takes an expert to make sense of the data. Although the results are often obvious (e.g., 100% sequence identity to 10 barcode records of “Bos taurus (cow)”), interpretation is context dependent: a 100% match has a different meaning if a “neighboring” species differs by, say, 1%, or if a congeneric species is not documented or is represented by a single record. In my experience, identifications are usually straightforward, including recognizing ambiguous identifications. Nonetheless, for DNA barcoding to have the widest use, including in legal settings, it will be helpful to have better documentation of how we arrive at species diagnoses through DNA barcodes.
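To make the point about context concrete, here is a small Python sketch of how a match report might attach caveats to a best hit. It is an illustration under stated assumptions, not the algorithm used by BOLD or any other database: the function names, the 2% margin, the 3-record minimum, and the toy sequences are all hypothetical.

```python
def identity(a, b):
    """Percent identity between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)

def interpret_match(query, library, min_margin=2.0, min_records=3):
    """library: dict mapping species name -> list of aligned reference sequences.

    Returns (best species, its percent identity, list of caveats).
    The thresholds are arbitrary illustrative values.
    """
    best = {sp: max(identity(query, s) for s in seqs)
            for sp, seqs in library.items()}
    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
    top_species, top_identity = ranked[0]
    caveats = []
    if len(ranked) > 1 and top_identity - ranked[1][1] < min_margin:
        caveats.append("neighboring species is nearly as close")
    if len(library[top_species]) < min_records:
        caveats.append("few reference records for best-matching species")
    return top_species, top_identity, caveats

# Invented toy library of 20-base sequences.
query = "ACGTACGTACGTACGTACGT"
well_documented = {
    "species_X": [query, query, query],
    "species_Y": ["ACGTACGTACGTACGTACGA"],   # 95% identical to the query
}
thinly_documented = {
    "species_X": [query],                    # a single reference record
    "species_Y": ["ACGTACGTACGTACGTACGA"],
}

print(interpret_match(query, well_documented))
print(interpret_match(query, thinly_documented))
```

Both calls return a 100% match to the same species, but only the second carries a caveat. A check like this cannot detect the third situation mentioned above, a close relative that is missing from the library altogether; that requires knowing which species ought to be there.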

Why we need DNA ID

May 10, 2010

a) Culex pipiens, b) Culiseta incidens, c) C. pipiens larvae, d) C. pipiens eggs

Biting insects transmit human and animal diseases, including protozoan (e.g., malaria, leishmania, trypanosoma (sleeping sickness, Chagas disease)), filarial (e.g., onchocerciasis, Guinea worm), and viral (e.g., yellow fever, West Nile, dengue) diseases. Control measures rely on identifying the insects, which generally requires expert training.

There are 174 mosquito species and subspecies in North America (“Identification and Geographical Distribution of the Mosquitoes of North America, North of Mexico,” Richard F. Darsie, Jr. and Ronald A. Ward, University Press of Florida, 2005). Many species bite humans, but only a handful are important disease vectors. It takes an expert to identify Culex pipiens (panel A), which is the major vector for West Nile virus in the eastern U.S., and to distinguish this from other species, for example, Culiseta incidens (panel B), which does not transmit human disease. Even experts are challenged by larvae (C) and eggs (D), and the latter are small and easily overlooked (egg raft size shown in inset). Planning and/or applying control measures is best done before adults emerge, but the early stages are the most difficult to identify.

The reference work cited above includes morphologic keys for identification of adult females and fourth-instar larvae. However, only an expert could make use of these (e.g. “lower mesepimeral setae absent, pale basal band on abdominal tergum II narrowed, or completely interrupted, medially”). If mosquito identification is important for society, then reference DNA barcodes are what is needed, as these enable many more persons to name specimens, regardless of life stage. It does not make sense to rely on reference works for the world’s mosquitos that are incomprehensible to anyone who is not already a mosquito specialist.

Leishmaniasis: DNA helps ID vectors, parasite, control agent

April 25, 2010

Leishmaniasis is a chronic parasitic infection caused by various Leishmania species, kinetoplastid protozoans related to Trypanosoma (the latter includes agents of African sleeping sickness and Chagas disease, suggested as a cause of Charles Darwin’s ill health in late life). Depending on the species involved, leishmaniasis manifests as illness ranging from non-healing cutaneous or mouth ulcers (CL) to sometimes fatal visceral infection (VL). In the Neotropics, 12 species infecting humans have been identified, all associated with CL. Neotropical leishmaniasis is mostly zoonotic (i.e., originates from animal reservoirs as opposed to human-to-human transmission), and the vectors are tiny phlebotomine sand flies, particularly Lutzomyia sp.

In the March 2010 PLoS Neglected Trop Diseases, investigators from Smithsonian Tropical Research Institute (STRI) and Instituto Conmemorativo Gorgas de Estudios de la Salud, Panamá, apply DNA testing to Lutzomyia sand flies collected on Barro Colorado Island, STRI’s island home in the Panama Canal. Aiming to analyze as many species as possible, Azpurua and colleagues selected 435 individuals, which they morphologically identified as representing 16 Lutzomyia and 2 Brumptomyia sand fly species, for further analysis. Over 95% of specimens in the original collection were from one species, Lu. panamensis, so this was not a completely representative sample; nonetheless, “the relative abundances of species collected in this study were significantly correlated to those found in a previous intensive study of sand fly community composition on the [Panama] mainland…that collected over 30,000 Lutzomyia individuals in 35 species.”

To skip to the end, COI barcodes unambiguously assigned all 49 individuals to 18 distinct lineages corresponding to named species, and highlighted 2 genetically divergent individuals that might represent cryptic species. Using primers for ITS-1 (a nuclear locus) and mini-circle DNA (part of the mitochondrial genome), Leishmania were detected in 2 of 5 human-biting species, Lu. trapidoi (13/30 individuals tested, 43.3%) and Lu. gomezi (5/19 individuals tested, 26.3%). By my estimate, taking into account relative abundances of Lutzomyia sp., about 1% of Barro Colorado Island sand flies carry Leishmania. Surprisingly, DNA sequencing identified the parasite as Le. naiffi, a South American species not previously reported in Panama. Finally, using the same set of DNA extracts, the researchers tested for Wolbachia, a rickettsial intracellular insect parasite and candidate biological control agent. Wolbachia were found in 3 of 18 species, including 50% of Lu. trapidoi, the main vector of CL in Panama. As an aside, I note that the presence of Wolbachia apparently did not interfere with discriminating among sand fly species; hypothesized interference from Wolbachia was one of the early worries some expressed about DNA barcoding (e.g., Whitworth Proc Biol Sci 2007).

Standardized DNA testing enables many more persons to identify insects, regardless of life stage, including those that serve as vectors for human diseases. In this report by Azpurua and colleagues, the discovery of a new species of Leishmania for Panama, and possible undescribed Lutzomyia vectors, suggests that wide application of standardized DNA testing will lead to further discoveries relevant to control of human and animal infectious diseases.

Simplified DNA barcode recipe-skip step one

April 15, 2010

It used to be standard practice to shave the area around the incision before surgery, as it was thought that hair harbored bacteria that would cause wound infection. Beginning in the 1970s, doctors found this was unnecessary, and in fact was associated with higher incidence of post-operative infection. This history comes to mind in reading a March 2010 BioTechniques report by researchers from the University of Guelph demonstrating that DNA suitable for PCR and sequencing can be obtained simply by leaving specimens in alcohol overnight!

Of the three steps required to get from a specimen to a DNA barcode, namely DNA isolation, PCR (polymerase chain reaction), and sequencing, the first step is the most labor intensive and hardest to automate. Numerous protocols/kits have been developed to optimize DNA isolation from various types of specimens, such as plant vs animal tissues. As described by the Guelph researchers, “these procedures force cells to release their DNA via physical perturbation and/or chemical treatment, which is then followed by a clean-up procedure in which unwanted cellular components are separated from the DNA.” The researchers “hypothesized that a small amount of DNA leaks from the tissue into the preservation solution (usually ethanol), and that this DNA was amplifiable using a standard PCR protocol.” To start, they analyzed Monte Alban mescal, which is sold with a “worm” (a caterpillar of the agave moth, Hypopta agavis) in each bottle. They evaporated 50 mL mescal, re-dissolved the residue in water, applied this to a Qiagen MinElute spin column, resuspended the product in 50 µL water, and used 2 µL of the resulting solution in a standard 25 µL PCR reaction, with successful amplification and sequencing of a 130-base mini-barcode of COI. This case was presumably challenging as mescal is only 40% ethanol and contains a variety of material that might inhibit PCR. In subsequent tests, 1 mL of 95% ethanol used to preserve specimens was evaporated, the residue resuspended in 30 µL of water without column purification, and 2 µL used for PCR.

By evaporating 1 mL of ethanol in which specimens had been stored overnight (out of 2 mL total ethanol volume) and re-suspending the residue in water, Shokralla and colleagues amplified and sequenced 130-base and 650-base fragments of COI and an 1100-base fragment of 28S rRNA from 25 whole insect specimens (mayflies, caddisflies; 1 gave COI only) and rbcL from 45 plant specimens (0.5 mm leaf samples). They also obtained COI sequences by sampling 1 mL of ethanol solution from 7 insect specimens stored at room temperature for 7 to 10 years. The researchers note this approach could facilitate “high-throughput” analyses, as it involves liquid handling, which is easy to automate, avoids destructive sampling, and could be used even when “there is simply no sample left for further analysis.” They conclude with a caution about “field sampling procedures that include placing mixtures of specimens in an ethanol jar” as this “may increase the chance of cross-contamination.”

The remarkably simple procedure reported by Shokralla and colleagues offers benefits to many persons who want to get DNA barcode identifications. I look forward to applications of this method in research and commercial laboratories, classrooms, and perhaps kitchens!

For accurate census, birds await their barcodes

April 3, 2010

Although birds have been studied in more detail than any other large group of animals, mtDNA continues to reveal many overlooked species, such that named taxa turn out to comprise two or more distinct species. These revisions include some very familiar birds, e.g., Canada Goose, which was recently recognized as comprising two species, Cackling Goose (B. hutchinsii) and Canada Goose (B. canadensis) (A.O.U. Check-list 45th suppl. 2004; for a current example, see Päckert et al Mol Phylogenet Evol 2010). Although such taxonomic revisions reflect a combined analysis of morphological, behavioral (particularly song), geographic range, and DNA information, to my reading mtDNA generally trumps the other data, which mostly serve as corroborating evidence. It is not that mtDNA similarities or differences are important per se; it is that they are strongly predictive of the presence or absence of organismal differences, particularly reproductive isolation, that are the hallmarks of species status. Of the species examined so far (I estimate about 1/3 of the 10,000 world birds), most demonstrate a similar patterning of limited mtDNA differences within species and relatively large differences among species, so we can be confident that this analytic approach will hold up. As an aside, referring to splits of named species as “taxonomic inflation” is misleading, as it suggests the real number of species is already known, which seems no more correct than referring to binary stars resolved with a new telescope as reflecting “stellar inflation.”

As with most animal groups, there are no nuclear genes that regularly distinguish closely related birds. (It seems likely that sequencing entire nuclear genomes will enable discriminating species units, but this is inherently a more costly, less standardizable approach, making it unattractive for routine use.) Differences among closely related species are more or less evenly distributed throughout the mitochondrial genome, so that approximately 500-1000 bp of coding DNA usually contains sufficient information to recognize species and create genus-level phylogenies. Moreover, if species are not distinguished by this length of mtDNA, then additional sequencing usually does not improve the resolution. This might be better stated the other way around, namely, if two sets of birds cannot be distinguished by a relatively short stretch of mitochondrial DNA, then they most likely belong to the same species. Why is this so? The prevailing null hypothesis is that mitochondrial differences are an accidental if useful accompaniment to species status, reflecting genetic drift over the several hundred thousand years of reproductive isolation presumably required for new species to emerge. In a recent essay (Nature 18 Nov 2009), Nick Lane explores the interesting possibility that mitochondrial differences cause speciation, thus producing what they identify.

What barcoding adds to the historical approaches in avian systematics is standardization, so that the same locus is analyzed in every specimen, which speeds completion of species-level avian taxonomy and facilitates large-scale comparisons. Given that most bird species have yet to be analyzed for species-level genetic differences, the standardized approach can save considerable time and money, as the results of independent investigators can be readily merged (e.g. Johnsen et al J Ornithol 2010, Kerr et al Frontiers Zool 2009). Beyond classification, this approach creates a reference library of avian barcodes, which enables identifying unknown specimens, such as from birdstrikes (Marra et al Frontiers Ecol Environ 2009). Our survey of avian frozen tissue collections identified over 315,000 specimens representing over 7,200 species (Stoeckle and Winker, Auk 2009). Based on what is in GenBank, it appears that most of these specimens have never been analyzed for any genetic locus, so a first-pass effort to sequence the COI barcode region will certainly reveal many new species and help resolve higher-level relationships as well. There is an opportunity for a granting agency or foundation to have a large impact on avian systematics for modest cost by supporting barcoding of existing collections, given that the tissue specimens and vouchers, which are the most expensive components of collections, have already been prepared. Looking further ahead, such an effort would also create sets of high-quality DNA extracts, with species identity confirmed by COI barcode, linked to vouchered specimens, that would be a powerful resource for further genetic study.

News about mitochondria

March 21, 2010

Mitochondria are energy-producing organelles, found in nearly every cell in nearly every plant and animal species (some protozoans lack mitochondria). As first demonstrated by Lynn Margulis in 1967 (J Theor Biol 14:225), mitochondria, like chloroplasts, are derived from bacteria, reflecting an ancient symbiosis that pre-dates the divergence of plant and animal species 1 billion years ago. Both organelles have retained a truncated circular genome and replicate independently of the nucleus. The mitochondrial genome in particular has turned out to be exceedingly useful in tracing evolutionary history, as it is present in all eukaryotic organisms, evolves rapidly as compared to nuclear DNA, and does not undergo meiosis and recombination, processes that scramble the evolutionary lineages of nuclear genes. Because it is several orders of magnitude more abundant than nuclear DNA (hundreds to thousands of mitochondria per cell, and 5 to 10 genomes per mitochondrion), it facilitates forensic identification, including DNA barcoding, even with very small or degraded samples. Given its practical and scientific importance, we want to better understand mitochondrial genetics.

A general observation is that mitochondrial DNA is uniform in an individual, presumably reflecting a stringent bottleneck at oogenesis, such that one or a very small number of mitochondria are passed on. However, a number of cases of heteroplasmy (i.e., individuals harboring multiple mitochondrial variants) are reported in humans and other animals. In the 3 March 2010 Nature, researchers from Johns Hopkins University and Case Western Reserve University apply next-generation sequencing to make a high-resolution survey of mitochondrial heteroplasmy in various tissues including cancerous cells. He and colleagues used “two sets of PCR primers, each resulting in amplicons of about 650 base pairs (bp) in length…to cover the mtDNA genome” and the resulting PCR products were sequenced by synthesis in an Illumina GAII. This approach was applied to a sample of normal colonic mucosa, yielding “8.5 million tags that matched the mitochondrial genome.” As a result, “each mtDNA base was sequenced, on average 16,700 times and fewer than 11 bases (0.07% of the 16,569 bp in the mtDNA genome) were represented fewer than 1,000 times.”

To establish a cutoff for artefactual errors due to PCR and/or sequencing, a control comparison with amplified nuclear DNA was performed, which yielded an average of 0.058% (SD 0.057%) mutations per base and a maximum of 0.82% mutations. He and colleagues used a “very conservative assumption that all variants in excess of twice this value (1.6%) represented true heteroplasmies rather than sequencing artefacts.” Now to some results! The researchers detected “28 homoplasmic alleles and 8 heteroplasmic alleles in this sample of normal colonic mucosa.” Here “homoplasmic” refers to differences from the reference human mtDNA sequence (NCBI entry NC_012920). All of the homoplasmic alleles were previously found in normal individuals, so we can set these aside as representing normal variation among human individuals.
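The cutoff logic lends itself to a short illustration. The Python sketch below applies the 1.6% threshold quoted above to the fraction of reads at a site that carry a non-reference base. The 1.6% figure comes from the text; treating fractions within 1.6% of 100% as homoplasmic is my own symmetric assumption for the sketch, and the function name and labels are hypothetical rather than taken from the paper.

```python
ARTEFACT_CUTOFF = 0.016  # 1.6%: twice the 0.82% maximum error seen in the control

def classify_site(variant_fraction, cutoff=ARTEFACT_CUTOFF):
    """Label one site from the fraction of reads carrying a non-reference base.

    'no call'       : variant reads are within the range expected from error
    'heteroplasmic' : variant and reference bases both clearly present
    'homoplasmic'   : essentially all reads carry the variant
                      (the upper bound is an assumption made for this sketch)
    """
    if not 0.0 <= variant_fraction <= 1.0:
        raise ValueError("variant_fraction must be between 0 and 1")
    if variant_fraction <= cutoff:
        return "no call"
    if variant_fraction >= 1.0 - cutoff:
        return "homoplasmic"
    return "heteroplasmic"

# Illustrative fractions: 0.5% sits below the cutoff, 99.9% is fixed,
# and 30% is a clear mixture of two mitochondrial variants.
for fraction in (0.005, 0.30, 0.999):
    print(f"{fraction:.3f} -> {classify_site(fraction)}")
```

With an average depth of 16,700 reads per base, a 1.6% cutoff corresponds to roughly 270 variant reads, which is why such deep coverage matters for detecting low-frequency variants.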

The researchers extended this analysis to other tissues from the same individual; all tissues yielded heteroplasmic mtDNA, and the proportion of individual variants differed strongly among tissues, e.g., the frequency of the most common variant ranged from 7.4% in skeletal muscle to 90.9% in kidney. Surprisingly, 75% of the heteroplasmic variants are already reported in human databases, suggesting a limited pool of variation and/or strong purifying selection. Further evidence for restricted variation is that 67% of heteroplasmic variants were in regions that code for neither protein nor RNA, presumably the control region, which represents less than 10% of the mitochondrial genome.

What is the origin of heteroplasmic mtDNA? Using samples from one kindred, the researchers found identical variants in a mother and her two children (and not in the father), demonstrating that, at least in this case, the heteroplasmic variants were inherited from the mother. The authors go on to analyze mitochondrial heteroplasmy in cancerous tissue, which is interesting but which I will not discuss here.

In terms of species-level identification, the findings add confidence to the established approach using COI mtDNA for animals. This high-resolution study demonstrates that mitochondrial variation within human individuals is a smaller scale version of the variation already known to exist among individuals. As in standard mitochondrial genetics, most of the heteroplasmic variants are maternally inherited. On the other hand, when identification of individuals is important, as in human forensics, mitochondrial heteroplasmy may need to be taken into account, at least on the negative side when apparent mismatches are found. The authors conclude by suggesting “caution in excluding identity on the basis of a single or small number of mismatched alleles when the tissue in evidence (such as sperm) is not the same as the reference tissue of the suspect (such as blood or hair).” Looking ahead, for those interested in exploring mitochondrial heteroplasmy in other species, the DNA barcoding initiative has created a large database of intra-specific variation in diverse species, an essential benchmark for investigating possible within-individual variation.

Note added 22 March 2010: As a thought experiment we can ask: how much of the within-species variability in COI might be due to unrecognized mitochondrial heteroplasmy? In the present study, the average number of heteroplasmic variant sites in one tissue sample was about 5, and, on average, 33% of such sites were in protein- or RNA-coding regions, which represent about 90% of the mitochondrial genome. That gives (5 x 0.33) sites distributed across (16,569 x 0.90) nucleotides in the mitochondrial genome, which works out to about 0.0001 variants per site. For the 650 bp COI barcode region, that corresponds to an average of 0.07 heteroplasmic sites per barcode sequence, or 0.01% variation. So for humans at least, mitochondrial heteroplasmy appears unlikely to contribute significantly to the observed intra-specific variation in COI.
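For readers who want to check or vary the numbers in this thought experiment, the arithmetic can be written out in a few lines of Python. All input values are the ones given in the note; only the variable names are mine.

```python
# Inputs, all taken from the note above.
heteroplasmic_sites_per_sample = 5     # average variant sites in one tissue sample
fraction_in_coding_regions = 0.33      # share of those sites in protein- or RNA-coding DNA
genome_length_bp = 16_569              # human mitochondrial genome
coding_fraction_of_genome = 0.90       # share of the genome that is coding
barcode_length_bp = 650                # COI barcode region

# Expected heteroplasmic variants per coding nucleotide.
variants_per_site = (
    heteroplasmic_sites_per_sample * fraction_in_coding_regions
) / (genome_length_bp * coding_fraction_of_genome)

# Expected heteroplasmic sites within one barcode, and the same as a percentage.
sites_per_barcode = variants_per_site * barcode_length_bp
percent_variation = 100 * sites_per_barcode / barcode_length_bp

print(f"variants per site:  {variants_per_site:.5f}")
print(f"sites per barcode:  {sites_per_barcode:.2f}")
print(f"percent variation:  {percent_variation:.2f}%")
```

The estimate assumes heteroplasmic sites are spread evenly across coding DNA, which is the same simplification the note makes.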

Next-generation DNA barcode application

March 13, 2010

DNA barcoding efficiently identifies species from flies to fish to flowers, including from bits and pieces and other unrecognizable forms: eggs, larvae, seeds, pollen, roots, damaged museum specimens, and even DNA shed into aquatic and terrestrial environments. What else can we do with this new instrument? With the BOLD reference library at >800,000 records from > 68,000 species, DNA barcoding combined with high-throughput sequencing can be a macroscope for studying large-scale patterns in biodiversity.

In the March 2, 2010 Proc Natl Acad Sci USA, researchers from University of Minnesota, National Museum of Natural History, University of Guelph, and University of South Bohemia, Czech Republic, apply DNA barcoding to measure species diversity and distribution in tropical moths and butterflies. In an earlier study (Novotny et al Nature 2007), some of the same researchers had shown surprisingly low beta diversity and little host specialization in herbivorous insects across 75,000 square kilometers in lowland rainforest in Papua New Guinea, an area 1 1/2 times as large as Costa Rica. (Note: alpha diversity is the number of species at a given site; beta diversity refers to differences in species composition among sites.)
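The two diversity terms in the note can be illustrated with a toy calculation. The Python sketch below counts species per site for alpha diversity and uses Jaccard dissimilarity between two sites as one simple measure of beta diversity. Jaccard is my choice for illustration and is not necessarily the index used by Novotny and colleagues; the site lists and species labels are invented.

```python
def alpha_diversity(site_species):
    """Number of distinct species recorded at one site."""
    return len(set(site_species))

def jaccard_dissimilarity(site_a, site_b):
    """1 - (shared species / all species across both sites).

    0.0 means identical species lists; 1.0 means no species in common.
    """
    a, b = set(site_a), set(site_b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

# Invented species lists for two hypothetical sites.
site_1 = ["sp1", "sp2", "sp3", "sp4"]
site_2 = ["sp1", "sp2", "sp3", "sp5"]

alpha_1 = alpha_diversity(site_1)
alpha_2 = alpha_diversity(site_2)
beta_12 = jaccard_dissimilarity(site_1, site_2)
print(alpha_1, alpha_2, beta_12)
```

Low beta diversity, as reported for the New Guinea lowlands, means values like this stay small even between distant sites: the same species keep turning up.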

For the Nature report, the researchers hand-collected 74,184 caterpillars representing 370 species; each caterpillar was tested for food preference in the field, and 25,346 were raised to adults. In the PNAS study, the researchers analyzed COI barcodes of 1,359 individuals representing 28 apparently widespread Lepidoptera species for which they had collected large numbers of individuals at 8 sites across the region (average individuals per species, 49, range 29-80; average sites per species, 6.1, range 3-8; average distance between sites, 160 km, range 59-513 km).

Craft and colleagues found “no universal pattern of population genetic structure among 28 Lepidoptera species in lowland New Guinea.” Although about half of the species showed genetic diversity associated with host plant specialization and/or geographic isolation (some of the variant lineages may represent distinct species), the phylogeographic patterns differed among species and there were a surprising number of widely sympatric species with overlapping diets, a challenge for ecological theory. As the authors note, their results contradict estimates of insect diversity and host specialization in the Americas, and they call for “comparative population genetics of ecological guilds” to enable testing “major hypotheses for the origin and maintenance of species diversity.” Like a new telescope for astronomy, DNA barcoding offers biologists a new instrument for exploring the structure of biodiversity.

The City Ant and The Country Ant: DNA tells the story

March 6, 2010

DNA helps answer questions about the origin of infectious diseases: are cases sporadic events or part of a larger epidemic, such as the recent Salmonella Montevideo outbreak involving at least 245 persons in 44 states, traced to a single importer of crushed red pepper used in salami manufacturing? In a similar way, DNA helps answer questions about the origin of apparently widespread species: are they part of a single outbreak, so to speak, or are they multiple independent populations or species? (This suggests useful connections between phylogeography, the genetic study of populations, and molecular epidemiology of disease.) As with pathogen diagnostics, a minimalist DNA testing approach will help make it feasible to analyze large numbers of specimens.

In the February 2010 PLoS ONE, six researchers from University of North Carolina report on the odorous house ant, Tapinoma sessile (it smells like rotten coconuts when crushed), collected from 47 urban and rural localities across the US. According to the authors, T. sessile is the most common and widely distributed ant in North America, found “from the West coast to the East coast and the deserts nearly all the way to the tundra.” The structure of the 18 colonies examined in detail ranged from a monogynous (single queen) colony in an acorn with 50 workers, to a polygynous colony with 2 queens and 250 workers, to a large, dispersed colony of “several million workers and thousands of queens in and around several buildings on a college campus.” For DNA analysis, 68 individuals were analyzed (1 from each of the 18 colonies, plus 23 collections in natural environments made by entomologists, 26 collections in urban environments mostly provided by pest control professionals, and 1 T. erraticum specimen). Menke and colleagues found 4 distinct genetic groups, corresponding to geographic areas, with 7.5-10% COI sequence differences among groups, and relatively small (0.2-2.3%) differences within groups, a pattern that “may represent multiple species.” Counter to initial expectations, urban ants were genetically similar or identical to non-urban ants within each region, and colony structure was not associated with urban vs natural environment; that is, monogynous and polygynous colonies were found in both environments.

I conclude there is much we don’t know about the commonest, most everyday species, and that DNA barcodes are just the right size for many of the relevant scientific and practical questions. In closing, for a view of complexity of ant life, please see E.O. Wilson’s wonderful short story “Trailhead”, in March 6, 2010, New Yorker, an excerpt from his upcoming book Anthill.

International barcoders get into print

February 26, 2010

Now that the 3rd International Barcode of Life Conference (held in November 2009 in Mexico City with over 350 researchers from 54 countries) is behind us, where to turn for DNA barcode science and organizational news? A bright answer arrived in today’s email: the first issue of the International Barcode of Life (iBOL) Bulletin (download pdf or view online flash version). The 12-page illustrated quarterly iBOL newsletter has a promising diversity of news. To take one example, I learned that some members of the North American Moth Photographers Group (MPG) are submitting their hard-to-identify specimens to Biodiversity Institute of Ontario, thus building up the reference library, and in turn receiving DNA-based identifications! This sort of crowd-sourcing approach to specimen collection could be a big thing for barcoding in particular, and for biodiversity science in general. There are many dedicated, expert non-professionals who are likely to contribute given the right framework.

In terms of citizen participation, the MPG story suggests expanding opportunities for biological research that harnesses the skill and energy of non-professionals, a step beyond the successful BioBlitz model, which still requires a lot of on-site organization. If North American birders can create a comprehensive, regularly updated database documenting migration, i.e. eBird (1 1/2 to 2 million sightings submitted monthly), then there must be a large potential for crowd-sourcing specimen collection, at least for certain organisms. After all, the most expensive part of biodiversity science is often collecting and/or documenting specimens. How to encourage and streamline data collection is suggested by Cornell University’s recently released iPhone app BirdsEye, which displays current local sightings based on the eBird database and the user’s GPS location, with a planned update that will enable birders to instantly update eBird with their own sightings.

The Barcode Bulletin aims to “inform and entertain iBOL collaborators, the global DNA barcoding community and the wider world of biodiversity genomics”; this issue is a promising start.

Medicinal orchids unmasked

February 15, 2010

Herbal products make a compelling case for DNA-based identification–how else to recognize dried bits of roots, leaves, stems, bark, and flowers from a multitude of species? In December 2009 J Nat Med, researchers from Ochanomizu University and Showa Pharmaceutical University, Japan, apply recently agreed-upon standards for DNA barcoding land plants, namely matK and rbcL, to distinguish among Dendrobium species. Dendrobium is a large (about 1200 species) genus of orchids widely distributed through east Asia to Philippines, Australia, and New Zealand. Over 50 Dendrobium species are used in traditional medicines and are thought to have various pharmacologic activities, although the active ingredient(s) are not yet characterized.

Asahina and colleagues analyzed rbcL and matK from 12 samples representing 5 Dendrobium sp. and 3 hybrid cultivars whose genetic histories are uncertain. Single primer sets successfully amplified matK and rbcL from all specimens. The researchers cloned PCR products (and then sequenced at least 3 clones per species), rather than directly sequencing amplified products (rationale for the cloning step is not given). They found that matK, but not rbcL, distinguished among the five species; this is consistent with general observation that rbcL varies less among closely-related species than does matK. Results were similar when 22 matK Dendrobium sp. sequences from GenBank were added to analysis (bringing species total to 6), with one exception; 1 of 11 D. officinale GenBank matK sequences was unique, and in NJ diagram appeared on branch distant from the other 10. In this modest sampling, there was no intra-specific variation in the original 12 samples; some intra-specific differences were noted in 2 species in comparison with GenBank sequences.

This study demonstrates advantages of the DNA barcoding approach for plant identification. Of course, there is already a lot of interest in DNA identification of herbal plants in general and Dendrobium orchids in particular. For example, I found over a dozen articles describing DNA methods for distinguishing Dendrobium sp. However, the methods described are limited to identifying species in this one genus, which means one has to have a pretty good idea what the specimen is before applying DNA testing! This highlights the essential advantage of barcoding: a standardized approach can be applied to any unknown, and makes feasible the creation of a comprehensive reference library.

Looking ahead, we want to know more about intra- and inter-specific variation in plants. In animals, the patterning of mitochondrial variation is quite uniform, with intra-specific << inter-specific variation, such that most species form relatively tight clusters distinct from those of other species in NJ diagrams. Results so far in plants generally show little intraspecific variation in chloroplast genes (including rbcL and matK), but a diversity of distances among closely related species. Assuming these early results are borne out, we then want to know why plants and animals differ. (For more on genetic variation in plants and animals, see Rieseberg et al Nature 2006, Fazekas et al Mol Ecol Res 2009.)

Rockefeller University

Program for the Human Environment

Area of Research: DNA Barcoding