In 2003, Paul Hebert and colleagues proposed a universal identification system employing short DNA sequences as identifiers for animal and plant species. Inspired by the Universal Product Code labels that stores use to track merchandise, he named these short sequences “DNA barcodes.” My colleagues and I set down thoughts inspired by this new name:
“Commercial barcodes and the barcode of life
Jesse Ausubel, Mark Stoeckle, Paul Waggoner
September 2004
Although new methods of sequencing and visualization have displaced the one that produced autoradiographs that show blurry gray stripes of a gel indicating presence or absence of particular bits of DNA, the analogy between the commercial barcode and the barcode of life may be traced to it. However, the power of the analogy comes from other similarities: large capacities to differentiate mind-boggling diversity, ability of digits to distinguish unambiguously, rapidity and economy of identification, ability of parts of the code to distinguish categories, and avoidance of a Tower of Babel by uniformity. We elaborate briefly.
Without the final digit that checks accuracy, the quartets of bars and spaces in the Universal Product Code (UPC) have 10 alternatives at 11 locations, creating an ample 10^11 capacity to identify manufacturers and their products. Instead of operating in quartets, sequences of CATG operate in trios that specify synthesis of an amino acid. Each trio of the four alternative CATG has 4^3 or 64 alternatives. A 600-unit sequence of DNA comprising 200 trios with 64^200 alternatives opens ample capacity to identify millions of species. Such large capacities are needed to differentiate the diversity of an economy or a forest.
Because one product number in a UPC differs from another by discrete, digital steps rather than by the shades of verbal descriptions, the numbers identify the product unambiguously. A barcode of life written as a sequence of CATG along a uniform locality of genomes differs from another by four discrete, unambiguous steps rather than by gradations of words, shapes, and colors. Barcodes gain power because digital beats analogue at making unambiguous distinctions.
Speed and economy also propel use of barcodes. Behind the beep of a UPC scanner lies orchestration that began with the initial conception of bars for numbers a half-century ago. Users and inventors orchestrated optics, electronics, and software to develop miniature, robust equipment that made the barcode an affordable master key to supermarket inventories and suppliers (Swartz 1999). Now that the price of DNA identification of a species has fallen to about $10 (Randhawa 2004), the orchestration can begin to provide a barcode of life. Uniformity fosters frequent use and thus learning and economy.
Product codes can identify products with increasing resolution. At the first level of resolution, the first bars of a UPC on a carton resolve the manufacturer. At the second level, the last bars resolve the product line. Opening the box and reading the serial number would resolve the individual. In analogous manner, extending a DNA barcode through more and more sequences would resolve from kingdoms to species, subspecies, and finally individuals. For our goal, Ockham’s razor prescribes as short a barcode of life as suffices to distinguish species.
Uniformity bestows the universality implied by the U in UPC. Scanners in hardware, grocery, and convenience stores must all call the same light bulb by the same 12 digits. Recently agreement between America and Europe added a thirteenth digit, made uniformity more widespread, and brought universality closer to realization (NY Times 12 July 2004, page C1). The power of standardization, whether in railroad gauges or typewriter keyboards, is one of the strongest lessons of the history of technology.
Finally, the success of a short DNA sequence distinguishing species will rest on reasoning, testing, and agreement, not just an appealing analogy. Reasoning will select a uniform locality on genomes that varies enough but not too much among species, testing will establish whether barcodes of that uniform locality correspond to established binomial names across several species, and then agreement will foster an expanding compilation of matching barcodes and binomial names.”
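The capacity figures in the quoted essay are easy to check with exact integer arithmetic. The short Python sketch below reproduces them. It also shows the "final digit that checks accuracy" using the standard UPC-A check-digit rule, which is my addition and is not spelled out in the essay. The function name `upc_check_digit` and the sample digits are illustrative only.

```python
# Capacity arithmetic from the quoted essay, using Python's exact integers.

# UPC: 10 alternatives at each of 11 locations (check digit excluded).
upc_capacity = 10 ** 11

# DNA: each trio of the four letters C, A, T, G has 4^3 alternatives.
codon_alternatives = 4 ** 3

# A 600-unit sequence read as 200 trios.
barcode_capacity = codon_alternatives ** 200


def upc_check_digit(first11: str) -> int:
    """Return the 12th (check) digit for the first 11 digits of a UPC-A code.

    Standard UPC-A rule (an addition here, not taken from the essay):
    triple the sum of the digits in odd positions (1st, 3rd, ...),
    add the digits in even positions, and pick the digit that brings
    the total up to a multiple of 10.
    """
    if len(first11) != 11 or not first11.isdigit():
        raise ValueError("expected exactly 11 digits")
    digits = [int(ch) for ch in first11]
    total = 3 * sum(digits[0::2]) + sum(digits[1::2])
    return (10 - total % 10) % 10


if __name__ == "__main__":
    print(f"UPC capacity: {upc_capacity:,}")
    print(f"Alternatives per trio: {codon_alternatives}")
    print(f"64^200 has {len(str(barcode_capacity))} decimal digits")
    print(f"Check digit for 03600029145: {upc_check_digit('03600029145')}")
```

Counting trios or counting single letters gives the same total, since 64^200 equals 4^600. Either way the number dwarfs the millions of species to be identified.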

I conclude that genetics is as essential for eukaryotic taxonomy as it is for microbiology. I believe there is no getting around the need to genetically reexamine most or all of the species named in the past 200 years to see whether what we recognize as single and distinct species really are so. If there can be cryptic species in large visible animals such as birds, and
As of October 11, 2008, researchers have deposited 14,594 DNA barcodes in BOLD representing 2,586 avian species, 26% of the world’s 9,933 birds. You can browse taxonomic coverage to date at
How far along are researchers toward mapping COI barcode resolution of avian species? Birds are of particular interest because species limits are generally well-defined, supported by a wealth of morphologic, ecological, behavioral, and other genetic data. Looked at regionally, there is good coverage in northern North America, parts of Central and South America, western Europe, Korea, and New Zealand, so it should be possible to see how well COI barcodes distinguish among local species in these areas. Published studies so far show >95% resolution of named species and have identified genetically divergent clusters which may represent unrecognized cryptic or “hidden” species (
Looked at globally, there is 100% coverage of 104 polytypic genera (having 2 or more species) representing 324 birds, so this should include the sister species and/or “nearest neighbors” for these. There are also 853 monotypic genera (having only 1 species) in world birds, which are likely or known to be genetically divergent from birds in other genera. In addition, there are likely many other sister species or “nearest neighbors” within the remaining 1,982 birds with DNA barcodes so far (for example, BOLD includes 28 of 29 Dendroica spp. wood warblers). It would be interesting to look at the nearest neighbor differences within the global data set. To my knowledge, comparisons among regions with COI barcode data have not been published. My impression based on other avian genetic work is that named taxa in different biogeographic regions are genetically distinct, plus there are many unrecognized genetic divisions within species that range across biogeographic regions. I look forward to trans-regional and global comparisons!
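For readers curious what a nearest neighbor comparison involves, here is a minimal Python sketch. It assumes the barcodes are already aligned to equal length and uses the uncorrected proportion of differing sites (p-distance), which is a simplification; published barcode studies often use model-corrected distances. The species labels and the 10-letter sequences are made-up toy data, not real COI barcodes, and the function names are my own.

```python
# Toy nearest-neighbor comparison among aligned sequences.
# All names and sequences below are hypothetical, for illustration only.


def p_distance(a: str, b: str) -> float:
    """Proportion of sites that differ between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(x != y for x, y in zip(a, b))
    return differences / len(a)


def nearest_neighbors(barcodes: dict) -> dict:
    """Map each label to (closest other label, distance to it)."""
    result = {}
    for name, seq in barcodes.items():
        candidates = [
            (other, p_distance(seq, other_seq))
            for other, other_seq in barcodes.items()
            if other != name
        ]
        # Pick the candidate with the smallest distance.
        result[name] = min(candidates, key=lambda pair: pair[1])
    return result


toy_barcodes = {
    "species_A": "ACGTACGTAC",
    "species_B": "ACGTACGTAT",
    "species_C": "ACGTTCGGAT",
}

if __name__ == "__main__":
    for name, (neighbor, dist) in nearest_neighbors(toy_barcodes).items():
        print(f"{name}: nearest neighbor {neighbor}, p-distance {dist:.2f}")
```

Run on a real data set, the same loop would show how far each species sits from its closest relative, which is the quantity that decides whether a barcode can tell the two apart.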
In
In
In
An article in