Blog

Comparing barcoding performance

Suggested metric, terminology, and standard graphic

How well do barcodes distinguish among species? A standardized, simple quantitative method and terminology for comparing barcoding performance among different data sets will be helpful.

In trying to answer this question, I aim to promote terminology that does not include “error”. In my view, it generally does not make sense to talk about the error rate of barcoding. Barcoding is an instrument akin to a telescope, except that it is designed to resolve species, not stars. A telescope that does not resolve a double star is not wrong; it simply lacks sufficient resolution. Also, the term “error rate” implies that there is an accurate reference standard in species identification. As systematists emphasize, species definitions are hypotheses and frequently undergo revision. Thus, in this view, barcoding performance, effectiveness, and resolution are useful descriptive terms and are more informative than barcoding error rate.

What we want is an approach that quantitatively compares barcoding with current taxonomy. In the future, taxonomy may recognize some of the groups discovered through barcoding as species and may combine some recognized species with overlapping barcodes into single species, while additional sequence data may enable resolution of other species with overlapping barcodes. A starting point is a 2 x 2 table comparing recognized species to distinct barcode groups:

Barcode groups and species

Suggested terminology:

Barcode group (or cluster): the shallowest branch in a neighbor-joining tree that corresponds to one or more recognized species or to a potential split within a recognized species.

Distinct barcodes: a barcode group that corresponds to a recognized species or a potential split within a recognized species. This definition can incorporate whatever criteria are used for recognizing splits (such as criteria that have been used to define provisional species, ESUs).

Barcode resolution: #barcode groups/total #species, in which total #species includes recognized species plus provisional species/ESUs.

This definition of barcode resolution incorporates “partially-resolved” species, so that if, for example, 8 species are resolved into 4 barcode groups, then resolution for that set would be 4/8 = 50%. Alternatively, if the idea of partial resolution is not helpful, resolution could be defined more simply as (a + b)/total #species, that is, (green + yellow)/total #species.
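As a rough illustration, the two definitions of resolution above can be sketched in a few lines of Python. The names `BarcodeCounts`, `resolution_from_groups`, and `simple_resolution` are hypothetical, cell c is assumed here to hold recognized species with overlapping barcodes, and the counts 90, 5, and 5 are invented for the example rather than taken from any data set.

```python
from dataclasses import dataclass


@dataclass
class BarcodeCounts:
    """Hypothetical container for counts from the 2 x 2 table."""
    a: int  # recognized species with distinct barcodes (green)
    b: int  # provisional species/ESUs with distinct barcodes (yellow)
    c: int  # assumption: recognized species with overlapping barcodes

    @property
    def total_species(self) -> int:
        # Recognized species plus provisional species/ESUs.
        return self.a + self.b + self.c


def resolution_from_groups(n_barcode_groups: int, total_species: int) -> float:
    """Barcode resolution = #barcode groups / total #species."""
    if total_species <= 0:
        raise ValueError("total_species must be positive")
    return n_barcode_groups / total_species


def simple_resolution(counts: BarcodeCounts) -> float:
    """Simpler alternative: (a + b) / total #species."""
    if counts.total_species <= 0:
        raise ValueError("total_species must be positive")
    return (counts.a + counts.b) / counts.total_species


# The example from the text: 8 species resolved into 4 barcode groups.
print(f"{resolution_from_groups(4, 8):.0%}")

# Invented counts, for illustration only.
print(f"{simple_resolution(BarcodeCounts(a=90, b=5, c=5)):.0%}")
```

The first definition gives partial credit when several species share a barcode group; the second counts only species whose barcodes are distinct.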

Suggested graphic: Applying this to recent barcode data sets:

Suggested standard graphic comparing barcode performance

Suggested color scheme: As in the table, green (=good!) matches current taxonomy; yellow represents novel species/provisional species/ESUs (yellow like an early bud that lacks chlorophyll); and gray (as in a gray indeterminate zone) represents recognized species with overlapping barcodes. By definition, all potential splits/ESUs have distinct barcodes, so cell d in the 2 x 2 table is blank. As barcode findings are incorporated into taxonomy, I expect that the proportion that is green will increase: the greening of barcoding and taxonomy!

Mark Stoeckle

China’s 2nd MagLev

China confirmed its leadership in new transport technology with an announcement on 13 March 2006 that it will build the world’s 2nd commercial maglev between Shanghai and Hangzhou, 175 km apart. The round trip will take less than 1 hour, fitting the journey into the human daily travel time budget and assuring massive traffic, as we explain in our papers Toward Green Mobility and The Evolution of Transport.

Claire Williams’ Book

Claire Williams’ book Landscapes, Genomics, and Transgenic Conifers, to which we contributed “Foresters and DNA”, is now available from Amazon, where the site features our opening sentence: “The decoding of DNA messages produces magnificent structures, perhaps none more magnificent than a tree…”

New Scientist Magazine

Journalist Fred Pearce interviewed Jesse for New Scientist magazine. The version published in the 28 January 2006 issue of the magazine omits about 500 words of the version we post here. Fred is a good interviewer!

OpenLibrary

Our work on Industrial Ecology of Wood Products and Restoring Forests convinced us years ago that wood demand would remain weak for many decades. On 27 December 2005 the Wall Street Journal reported that demand for paper in offices and businesses in North America fell from 14.3 million metric tons in 1999 to 13 million tons in 2004, a decline of more than 1.7% per year, and dropped 4% through the first 10 months of 2005. To experience what might weaken paper demand for a broad range of consumers, visit the Open Library previewing e-books developed by the wonderfully creative Brewster Kahle.

Elektron: Splicer announcement

To lift enthusiasts of distributed generation to heaven, our 1996 paper Elektron on pp. 157-158 proposed a “splicer”, a multi-purpose household energy appliance of 5 kW, to produce electricity, heat, or whatever you like. Honda and Plug Power report progress toward a kind of splicer with their “Home Energy Station.”

Torrance, CA-based Honda R&D Americas, Inc., together with its partner Plug Power, Inc. of Latham, NY, has launched Home Energy Station III, providing electricity for the home and hydrogen for future fuel cell-powered vehicles (H&FCL July 05). This third-generation system, which operates on natural gas as its primary fuel, is roughly 30% smaller than its predecessor and produces about 25% more power. It’s rated at up to 5 kW. Honda says it’s more energy efficient, hydrogen storage and production capacity is up about 50% via a new improved natural gas reformer, and start-up time has been cut to about one minute. The system will be tested at Honda R&D Americas.