Labelling specimens helps make natural history museum collections valuable. Once specimens have been carefully examined by an expert, a species label can be applied. Labelled specimens can then be re-found and re-examined. Unsorted specimens lacking species designations are more raw material than scientific resource.
As in all sciences, taxonomic knowledge undergoes continuous revision. Changes in names and in the understanding of species boundaries may mean that specimen labels need to be updated in light of current knowledge. Partly to compensate, there is enormous effort to link current taxonomic understanding to historical species descriptions. Nonetheless, the accuracy of specimen labels may decay over time. In one study, a revision of predatory flies (Euscelidia; Meier and Dikow, Conservation Biol 2004 18:478), 83% of 1383 specimens from 19 collections were found to be incorrectly identified.
DNA sequences offer a simple approach to help keep specimen labels up to date. DNA sequences are an intrinsic, unvarying characteristic of a specimen. A DNA sequence from a standardized locus (i.e., a DNA barcode) can serve as a permanent “name” for a specimen. Results so far with more than 20,000 invertebrate and vertebrate species show it is generally straightforward to use a COI barcode to assign specimens to known species. Future taxonomic revisions may change species names or boundaries, but that will not change the DNA barcodes of specimens or the clustering patterns of barcode sequences. Thus it should be simple to use a specimen’s barcode sequence “name” to search a regularly revised public database for the current species name it corresponds to. A public database of sequences, specimens, and associated data, such as BOLD, can undergo continuous refinement, whereas revising labels in every collection around the world is impractical. Given that assigning specimens to species involves sorting among millions of species names, this approach does not have to give 100% resolution to species level to be valuable. It will be useful both for those species with unique barcode clusters or characters and for those sets of closely related species with overlapping or indistinguishable barcodes, where a barcode still narrows the search to a handful of candidate names.
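The lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how BOLD or any real database works: the `ReferenceLibrary` class, the cluster ids, the species names, the short made-up sequences, and the 2% distance threshold are all hypothetical. The point is that the specimen's barcode stays fixed while the name attached to its cluster can be revised in one place.

```python
from dataclasses import dataclass, field
from typing import Dict, List


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


@dataclass
class ReferenceLibrary:
    """Toy stand-in for a public barcode database (hypothetical structure)."""
    # cluster id -> reference barcode sequences assigned to that cluster
    clusters: Dict[str, List[str]] = field(default_factory=dict)
    # cluster id -> current species name(s); more than one name means the
    # barcode does not separate those species
    names: Dict[str, List[str]] = field(default_factory=dict)

    def current_names(self, barcode: str, max_distance: float = 0.02) -> List[str]:
        """Return the current name(s) of the nearest cluster, or [] if none is close enough.

        The 2% default threshold is an assumption chosen for illustration.
        """
        best_cluster, best_dist = None, None
        for cluster_id, refs in self.clusters.items():
            for ref in refs:
                d = p_distance(barcode, ref)
                if best_dist is None or d < best_dist:
                    best_cluster, best_dist = cluster_id, d
        if best_cluster is None or best_dist > max_distance:
            return []
        return list(self.names.get(best_cluster, []))


# Made-up 10-base sequences; real COI barcodes are several hundred bases long.
library = ReferenceLibrary(
    clusters={"cluster_A": ["ACGTACGTAC"], "cluster_B": ["TTGTACCTAA"]},
    names={
        "cluster_A": ["Genus oldname"],
        "cluster_B": ["Genus alpha", "Genus beta"],  # indistinguishable by barcode
    },
)

specimen_barcode = "ACGTACGTAC"  # recorded once; never needs relabelling
before = library.current_names(specimen_barcode)

# A taxonomic revision renames the species: only the database entry changes.
library.names["cluster_A"] = ["Genus newname"]
after = library.current_names(specimen_barcode)
```

Here `before` and `after` differ only because the database entry was revised; the specimen record itself is untouched. A query that falls in `cluster_B` returns both candidate names, which mirrors the case of closely related species with indistinguishable barcodes.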
Finally, on a related note, I observe that diagnostic keys are unwieldy and not easily amenable to computerization. For example, Dragonflies of North America (Gainesville: Scientific Publishers, 2000) by Needham, Westfall, and May covers 350 species in 939 pages, or about 2.7 pages per species, which by my count is typical for diagnostic keys. At this rate, a world key for the 5500 known odonates would run to roughly 15,000 pages. The recondite language that is required to describe morphologic detail in this and other keys makes the tools inaccessible except to highly trained persons. On the other hand, anyone can interpret a DNA sequence. It is exciting that taxonomists and others are increasingly taking up the challenge of translating taxonomic knowledge into a much more widely accessible format, namely DNA barcode libraries.
