Jesse’s commentary, “A Botanical Macroscope”, about the DNA plant barcode and its implications for the Encyclopedia of Life and related e-Biosphere initiatives, appeared in the 4 August issue of the Proceedings of the National Academy of Sciences (vol. 106, no. 31 12569b).
News
DNA for tardigrades
Tardigrades, commonly called water bears, are tiny (0.1-1.5 mm) water-dwelling invertebrates found in diverse environments. About 1000 species are known. Morphologic identification is difficult and may be limited to certain life stages; some species can be identified only from eggs, for example. Tardigrades can transform into a dormant state with a remarkable ability to withstand extreme drying, cold, and radiation for prolonged periods, making them of interest to researchers studying the biology of tissue repair, aging, and other fields.
The Tardigrade Barcoding Project has just launched its website at www.tardigradebarcoding.org. The project will “provide a set of indispensible tools for the identification of marine, freshwater, and terrestrial tardigrade species, and will greatly aid taxonomists and ecologists. It will also enhance understanding on the evolution, ecology, life-history and extraordinary tolerance of physical extremes for these animals.” I add that COI barcodes are likely to reveal great genetic diversity hidden within morphologically defined species (e.g., Blaxter et al. 2003).
I look forward to learning more about tardigrades!
Mark Stoeckle commentary in The Auk
A global survey of avian tissue resources, co-authored by Mark Stoeckle, appears in the July 2009 issue of The Auk. The survey identifies over 317,000 specimens in 29 collections representing 7,226 species (73% of world birds). This is an outgrowth of work he began in 2005 as part of the All Birds Barcoding Initiative and is the first compilation of genetic resources for any large group of organisms.
"A Global Snapshot of Avian Tissue Collections: State of the Enterprise" (PDF)
Botanists establish DNA barcode for land plants
In this week’s Proc Natl Acad Sci USA, the CBOL Plant Working Group, which included 52 researchers from 25 institutions, announced agreement on a DNA barcode for land plants. The authors tell their story:
“DNA barcoding involves sequencing a standard region of DNA as a tool for species identification. However, there has been no agreement on which region(s) should be used for barcoding land plants. To provide a community recommendation on a standard plant barcode, we have compared the performance of 7 leading candidate plastid DNA regions (atpF–atpH spacer, matK gene, rbcL gene, rpoB gene, rpoC1 gene, psbK–psbI spacer, and trnH–psbA spacer). Based on assessments of recoverability, sequence quality, and levels of species discrimination, we recommend the 2-locus combination of rbcL and matK as the plant barcode. This core 2-locus barcode will provide a universal framework for the routine use of DNA sequence data to identify specimens and contribute toward the discovery of overlooked species of land plants.”
The Working Group concludes: “There is little doubt that the approaches used in plant DNA barcoding will be refined in the future. However, the key foundation step for plant barcoding is in reaching agreement on a standard set of loci to enable large-scale sequencing and the development of a global plant barcoding infrastructure. The broad community agreement presented here, to sequence rbcL and matK as a standard 2-locus barcode, is thus an important step in establishing a centralized plant barcode database as a tool for taxonomy, conservation, and the multitude of other applications that require identification of plant material.”
In the same issue of PNAS, a Commentary by Jesse Ausubel traces the development of DNA barcoding from a 2003 proposal for a standardized DNA-based approach to species identification, using the mitochondrial COI gene for animal species. Adopting COI as a standard was the essential first step, leading to a rapidly growing library that now holds over 620,000 specimens from over 58,000 species, enabling high-school students to become identification experts for store-bought fish items and shedding new light on species diversity. With the publication of this paper, DNA-based identification for land plants is now poised to expand rapidly, with benefits to science and society. Ausubel views the DNA barcoding enterprise as an urgently needed “macroscope” for probing ecological and evolutionary patterns on a broad scale. He concludes with a call “to accept the invitation of the 52 authors led by Hollingsworth to use the standard two-locus barcode of matK and rbcL to join in building a powerful botanical macroscope.”
Census of Marine Life maps an ocean of species
Award-winning journalist Bob Drogin publishes an excellent feature article about the Census of Marine Life in the Los Angeles Times.
Barcoding Nemo
How does one collect tropical reef fish without leaving North America? In the July 2009 PLoS ONE, researchers from the University of Guelph report on genetic diversity in SE Asian tropical reef fish, collected without plane fares or permits. How did they do it? Steinke and colleagues analyzed “dead on arrival” marine fish imported into Canada for the ornamental pet trade from various locations in SE Asia. A total of 1631 specimens representing 391 named species were frozen and imaged on a flatbed scanner, and a muscle tissue sample was taken from each for COI analysis. This is remarkable on several counts. First, the large number of species: according to an FAO report cited by Steinke, “some 800 marine fish species, representing about 5% of all marine taxa, are involved in this trade, with 70% of sales directed to North America,” with estimated revenue of $200-$300 million annually. Second, this study surveys genetic biodiversity in reef fishes, provides a practical method for identification, and at the same time provides insight into what is probably the major threat to their survival. I am reminded of the near extinction of Common egrets in North America in the late 1800s as a result of hunting for plumes for women’s hats. This led to a popular uprising among women of fashion, who pledged not to wear such hats, organized the first “Audubon Societies,” and successfully petitioned for legislative change, saving egrets and many other birds. Nemo and other reef fish may need a similar campaign.
Back to the study: Steinke and colleagues found distinct barcodes among 384/391 (98.2%) species; 9 species displayed 2 or 3 distinct clusters, most of which were allopatric. Review of these potential “splits” revealed possible inappropriate synonymization in several cases. On the other side, 2 pairs and 1 triplet of species (7 species in all) were not distinguished by DNA barcodes using distance. I looked more closely at one of these examples, the butterflyfishes Chaetodon multicinctus and C. punctatofasciatus, to see if there might be diagnostic characters whose signal is swamped by intraspecific variation. As shown in the figure, there are 2 possibly diagnostic differences between the two species. Of course, this sort of analysis only works for known species, but I wonder how many other species pairs/sets with “overlapping” barcodes have diagnostic differences.
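The search for diagnostic characters can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure used for the figure: the function name `diagnostic_positions` is hypothetical, the sequences are invented toy data rather than real Chaetodon barcodes, and it assumes the sequences are already aligned to equal length. A column counts as diagnostic when the bases seen in one species never occur in the other, however much variation there is elsewhere.

```python
def diagnostic_positions(group_a, group_b, ignore="-N?"):
    """Return (position, bases_in_a, bases_in_b) for alignment columns
    where the two groups of aligned sequences share no base.

    Positions are 1-based. Gaps and ambiguity codes listed in `ignore`
    are skipped, so a column needs at least one real base in each group.
    """
    sequences = list(group_a) + list(group_b)
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")

    skipped = set(ignore)
    hits = []
    for i in range(length):
        bases_a = {seq[i].upper() for seq in group_a} - skipped
        bases_b = {seq[i].upper() for seq in group_b} - skipped
        if bases_a and bases_b and bases_a.isdisjoint(bases_b):
            hits.append((i + 1, "".join(sorted(bases_a)), "".join(sorted(bases_b))))
    return hits


# Invented toy alignment: both "species" vary at columns 5 and 10,
# but only column 4 separates them cleanly.
species_a = ["ACGTACGTAC", "ACGTACGTAT", "ACGTTCGTAC"]
species_b = ["ACGAACGTAC", "ACGAACGTAT", "ACGATCGTAC"]
result = diagnostic_positions(species_a, species_b)
print(result)
```

In the toy data the shared variation at columns 5 and 10 is what would blur a distance-based comparison, while the single fixed difference at column 4 is still picked up. With small samples such a difference is only "possibly" diagnostic, since additional specimens could reveal overlap.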
Voucher and collection information in GenBank records
A core tenet of the DNA barcoding initiative, beginning with the first workshops in 2003, is that reference sequences should be linked to vouchered specimens stored in museums, so that data can be re-checked. This also provides visibility to collections. For example, “GenBank DQ433554 Crotophaga ani voucher KU 89123 cytochrome oxidase subunit 1 (COI) gene, partial cds; mitochondrial” contains voucher information in the title and the record itself, at least for those who know “KU” refers to the University of Kansas. The GenBank file contains a “LinkOut” to the BOLD page which spells out the collection name. The GenBank file (and the BOLD record) could also include a “LinkOut” to the museum itself, although I do not find examples of this feature being used.
More generally, is collection information available in GenBank records? Taking birds as an example, there are 475,273 GenBank avian records; eliminating the five most-represented species (Chicken, Turkey, Mallard, Zebra Finch, Fairy Wren) leaves 108,766 sequences, of which about half (48,915) contain the word “voucher.” This sounds promising, but my unscientific sample suggests most entries in the “voucher” field are cryptic designations that do not identify the institution storing the specimen. I tried searching by acronyms for some of the larger collections. Louisiana State University has the largest avian tissue collection in the world with about 40,000 specimens; searching “LSU AND aves[organism] AND voucher” returned only 1,148 records, which seems likely to underrepresent the museum’s contribution. Results for some other large collections were higher but still appear implausibly small considering there are 100,000+ avian GenBank records: Burke Museum (UWBM), 3,318; Field Museum (FMNH), 2,593; American Museum of Natural History (AMNH), 1,994; Smithsonian (USNM), 1,920; University of Kansas (KU), 684.
I conclude that researchers and collections will benefit from following the practices promoted by the DNA barcode initiative for GenBank records, including taking advantage of GenBank’s “LinkOut” feature.
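For anyone wanting to repeat the acronym searches, here is a minimal Python sketch that builds the same query strings and the matching NCBI E-utilities ESearch URLs. It rests on some assumptions: that the searches were run against the GenBank nucleotide database (`nuccore` in E-utilities), and that a simple count is all that is wanted. The helper names `voucher_query` and `esearch_count_url` are my own, and the code only constructs URLs; it does not contact NCBI. Counts retrieved today will differ from the 2009 figures above.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities ESearch service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Collection acronyms mentioned in the post.
ACRONYMS = ["LSU", "UWBM", "FMNH", "AMNH", "USNM", "KU"]


def voucher_query(acronym, organism="aves"):
    """Build a search term of the form used in the post."""
    return f"{acronym} AND {organism}[organism] AND voucher"


def esearch_count_url(term, db="nuccore"):
    """Build an ESearch URL that asks only for the number of matching records.

    db="nuccore" is an assumption about which database was searched.
    """
    params = {"db": db, "term": term, "rettype": "count"}
    return f"{ESEARCH}?{urlencode(params)}"


urls = {acronym: esearch_count_url(voucher_query(acronym)) for acronym in ACRONYMS}
for acronym, url in urls.items():
    print(acronym, url)
```

Short acronyms such as “KU” or “LSU” are searched as free text here, so they can match words outside the voucher field; the counts are best read as rough upper bounds on what each collection contributed.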
www.iBarcode.org: web tools for sequence analysis
In the 16 June 2009 BMC Bioinformatics, researchers from the University of Guelph report on a web platform for DNA barcode analysis, www.iBarcode.org. The site works with aligned barcode files in standard .fas format, such as those produced by MEGA or BOLD. Registration is not required; the site keeps track of files you have uploaded.
According to authors Singer and Hajibabaei, iBarcode is designed to “allow the user to manage their barcode datasets, cull out non-unique sequences, identify haplotypes within a species, and examine the within- to between-species divergences.” iBarcode provides several clever, easy-to-use tools and I look forward to further refinements.
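To make “within- to between-species divergences” concrete, here is a small Python sketch of the comparison. It is not iBarcode’s code, and I do not know which distance model the site uses; this version swaps in the simplest one, the uncorrected p-distance (the fraction of compared sites that differ). The function names and the toy sequences are invented for illustration, and the sequences are assumed to be aligned already.

```python
from itertools import combinations


def p_distance(seq1, seq2):
    """Uncorrected p-distance between two aligned sequences.

    Only columns where both sequences have A, C, G or T are compared,
    so gaps and ambiguity codes are ignored.
    """
    pairs = [
        (x, y)
        for x, y in zip(seq1.upper(), seq2.upper())
        if x in "ACGT" and y in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def divergence_summary(records):
    """Return (largest within-species distance, smallest between-species distance).

    `records` is a list of (species_name, aligned_sequence) tuples.
    The second value is None if only one species is present.
    """
    within, between = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (within if sp1 == sp2 else between).append(p_distance(s1, s2))
    return max(within, default=0.0), min(between, default=None)


# Invented toy data, not real barcodes.
records = [
    ("species_1", "ACGTACGTAC"),
    ("species_1", "ACGTACGTAT"),
    ("species_2", "ACGAACGAAC"),
    ("species_2", "ACGAACGAAC"),
]
max_within, min_between = divergence_summary(records)
print(max_within, min_between)
```

When the smallest between-species distance exceeds the largest within-species distance, as in this toy set, the species are separable by distance alone. The two identical `species_2` sequences are a single haplotype, which is the kind of non-unique sequence the culling tool is meant to collapse.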
Lizard mitochondria converge on snakes–why?
In the 2 June 2009 Proc Natl Acad Sci USA, researchers from 5 American universities report on convergent molecular evolution among agamid lizards and snakes. In constructing a nuclear and mitochondrial DNA phylogeny of squamates (snakes and lizards), Castoe and colleagues noted their data placed agamid lizards as sister to snakes, rather than within the lizard clade Iguania, as supported by prior work including morphology. The apparently aberrant phylogenetic placement was due to similarity among mitochondrial genomes of agamid lizards and snakes; nuclear genes recovered the established tree. Most of the aberrant signals were in first and second codon positions in protein-coding genes, and thus associated with similarity in predicted amino acid sequences among agamids and snakes. These convergent changes were distributed across all 13 mitochondrial protein-coding genes, but were clustered particularly in COXI and ND1.
The authors conclude that there was an ancient adaptive episode in the ancestors of today’s agamid lizards, which led to a snake-like mitochondrial genome. I note this conclusion is based on analyzing just 2 of the more than 350 species in 52 genera in Agamidae. Are these changes universal in Agamidae? There are 2 more complete agamid mitochondrial genomes in GenBank which could be examined; of additional interest would be to see if the same convergent changes are found in the 253 COI sequences from 88 agamid species in 11 genera in BOLD. As in this study, phylogenetic reconstruction usually involves just a few representatives of each lineage, which means that evolutionary patterns may remain invisible. I expect that BOLD will be an increasingly useful resource to expand the scope of phylogenetic studies utilizing mitochondrial DNA.
The conclusion that these findings represent convergent adaptive evolution is strong, yet it is also puzzling, as at first glance there doesn’t seem to be any special morphological or life-style resemblance between snakes and agamids as compared to other lizards. Perhaps we need to keep an open mind for other seemingly unlikely mechanisms, such as eukaryotic horizontal gene transfer.
Poisonous fish revealed
What fish is that you are eating? This question has many possible answers. Unlike meats, which are derived from a handful of species, most of which are farmed, there are numerous fish sold for human consumption, most of which are wild. The US FDA Regulatory Fish Encyclopedia and the Canadian Food Inspection Agency lists of approved fish and shellfish include approximately 1700 and 660 names, respectively. And yet DNA surveys regularly turn up fish in the marketplace that are not on any regulatory list, as well as mislabeling of those that are listed, suggesting we may not know what we are eating or what fish stocks are being harvested.
In addition to economic and environmental impact, mislabeling can have public health implications. In the April 2009 J Food Protection, government and research scientists report on 2 cases of tetrodotoxin poisoning in Chicago, IL resulting from ingestion of soup prepared from mislabeled puffer fish, sold as “monkfish.” Two additional cases were traced to the same supplier and this led to the recall of several thousand pounds of frozen fish. Morphologic examination of leftover parts and DNA testing of the cooked meat implicated Lagocephalus sp., most likely Green roughed-back puffer L. lunaris. Unlike in most other toxic puffer species, tetrodotoxin in L. lunaris is present in muscle as well as organ tissue, making safe preparation impossible. At the time of the study, there were no reference sequences in BOLD for L. lunaris, so the DNA barcode identification was incomplete. It would be of interest to repeat the database searches (as of today GenBank contains 1 L. lunaris COI sequence and the BOLD taxonomy browser lists 2), but for some reason the sequences obtained by the researchers were not published.
DNA testing is the only way to identify many of the fish items in the marketplace. I expect that standardized DNA testing (aka DNA barcoding) will play an increasingly important role in helping protect both consumers and fish.