The Barcode Blog

A mostly scientific blog about short DNA sequences for species identification and discovery. I encourage your commentary. -- Mark Stoeckle

How to make an identification machine

Successful automation often involves machines that carry out tasks differently than people do. For example, a Coulter counter (developed by Wallace H. Coulter, an American engineer) analyzes blood cells by measuring the change in electrical impedance as each cell passes through a small aperture, producing a detailed report of red and white cell types faster and more cheaply than does a technician examining a blood smear under a light microscope. As another case, machine identification of commercial products is enabled by a UPC bar code, which represents a product name in a digital format that can be “read” almost instantaneously by a laser scanner. In a similar way, DNA barcoding “reads” the digital code of DNA and associates it with species names in a reference database, opening the door to fully or partly automated identifications. In the 9 September 2010 issue of Nature, scientists from the Natural History Museum in London, Louisiana State University, and the University of Plymouth, UK, propose a different route to automating taxonomic identification, namely, teaching computers to do morphologic pattern recognition. Now that we are on the threshold of “anyone, anywhere, anything” identification with DNA barcoding, this seems a step backward.

I see three major challenges that limit any morphology-based identification system: naming an organism from bits and pieces, recognizing look-alikes and life stages, and the diversity of diagnostic features, which require specialized equipment. DNA, on the other hand, is the same whether from an intact specimen or an unrecognizable stomach fragment, readily distinguishes look-alikes in any life stage, and can be analyzed using the same equipment regardless of specimen. More generally, little scientific insight will have been gained from a system that distinguishes life forms by the multitudinous particulars of appearance, whereas a library of DNA barcodes linked to named specimens offers a broad view of species-level differences across the diversity of life.
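To make the idea of a barcode library linked to named specimens concrete, here is a minimal sketch in Python of how a query sequence might be matched against a reference set. Everything in it is an assumption for illustration: the species names and short sequences are invented (they are not real barcode data), the function names `p_distance` and `identify` are hypothetical, and the 5% cutoff is an arbitrary placeholder rather than a recommended threshold. Real identification engines work on aligned sequences of several hundred base pairs and use more careful matching.

```python
# Toy reference library: species name -> aligned barcode fragment.
# These short sequences are invented for illustration; they are not real data.
REFERENCE_LIBRARY = {
    "Species A": "ACGTACGTACGTACGTACGT",
    "Species B": "ACGTTCGTACCTACGAACGT",
    "Species C": "TTGTACGAACGTTCGTACGA",
}


def p_distance(seq1, seq2):
    """Proportion of aligned sites at which two equal-length sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(a != b for a, b in zip(seq1, seq2))
    return differences / len(seq1)


def identify(query, library, max_distance=0.05):
    """Find the closest reference sequence to the query.

    Returns (species, distance) for the nearest reference, or (None, distance)
    when even the nearest reference is farther away than max_distance.
    The default cutoff is an arbitrary value chosen for this sketch.
    """
    best_name, best_dist = min(
        ((name, p_distance(query, ref)) for name, ref in library.items()),
        key=lambda pair: pair[1],
    )
    if best_dist > max_distance:
        return None, best_dist
    return best_name, best_dist
```

Calling `identify(sequence, REFERENCE_LIBRARY)` with a sequence from an intact specimen or from a tissue fragment runs exactly the same comparison, which is the point made above: the input is the same kind of data regardless of what the specimen looked like. A query that matches nothing closely comes back as `None`, which in a real survey would flag a specimen for closer study.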

According to MacLeod and colleagues, “a [DNA] bar code isn’t useful until the reference species has been identified by experts”. This makes no sense to me. All large barcode surveys of animals, from ants to fish, have revealed hidden genetic divergences, in many cases leading to recognition of new species. In fact, DNA barcoding is a fast way of screening existing collections for unrecognized species. In this same section, as part of discounting a DNA approach, they state “researchers frequently need to identify non-living objects as well as living ones”. I don’t understand how this is an objection, since, for example, DNA barcodes from ancient bone fragments have been used to define species of extinct flightless moa (Lambert et al., J Heredity 2005).

I know from iPhoto’s remarkable ability to recognize individuals that computers are getting better at pattern recognition. Further development focused on taxonomic specimens may lead to useful tools. However, this seems unlikely to lead to a widely applicable automated system. In a study cited by the authors, phytoplankton identifications by 16 marine ecologists were compared with those made by DiCANN, a machine learning system (Culverhouse et al., Marine Ecol Prog Series 2003). The authors of that study conclude what is likely to be generally true about morphology-based identification: “In general, neither human nor machine can be expected to give highly accurate or repeatable labeling of specimens”.

This entry was posted on Thursday, November 4th, 2010 at 3:32 pm and is filed under General.

5 Responses to “How to make an identification machine”

  1. hugo mejia Says:

    I wouldn’t be so radical about morphology-based taxonomies. In the first instance, they precede DNA identification. If it weren’t for them no one would have had the idea of DNA-based taxonomies. In the second instance, morphology lies at the base of whatever identification comes in the future. Third, probably DNA-based taxonomies are additional tools for species identification, but I guess will never make it further from identifying organisms we have never seen in the flesh. I take the stand that we must first SEE the organism, and if not, we might get lost in a barrage of information without getting to know the whole organism, not a small part of it. And last, I guess that such radical attitudes have taxonomists like myself out of a job, simply because morphology-based taxonomies are dubbed as ‘backward’ or ‘outdated’ and therefore receive no funding because the fad is DNA. I myself adore doing DNA identifications for parasites, but first I collect the little worms, study them thoroughly, and then identify them with DNA techniques.

  2. hugo mejia Says:

    And thanks to being out of a job, I cannot read the reference from Nature.

  3. Mark Stoeckle Says:

    Hugo, in case not yet received, a pdf of Nature article was sent. Best, Mark

  4. Mark Stoeckle Says:

    Hugo, Thanks for your comments. We experience the natural world primarily through sight, so morphology will always be the primary way we recognize species. On the other hand, if the goal is to have a machine that can recognize species, then DNA is a better modality. In addition, a standardized DNA approach is likely to provide insights that computerized morphologic keys do not. Best, Mark

  5. Jonathan Geach Says:

    Species can go through many profound “life stages”. For example we use DNA barcoding in our business to identify the species of solid wood products and match back to their origin.

    DNA allows us not only to identify species but also the location that the timber was harvested. This is very useful given legality and sustainability issues associated with tropical hardwoods.

    Okay – I recognize that teak furniture or merbau decking are not exactly what one would consider when thinking about these issues, but I offer it here to demonstrate a really important application for barcoding that is having a profound effect on illegal logging and conservation programmes such as REDD.

    I can imagine morphology will remain important, but I clearly see a future for DNA barcoding to generate very cost-effective species identification in numerous circumstances.


About this site

This web site is an outgrowth of the Taxonomy, DNA, and Barcode of Life meeting held at Banbury Center, Cold Spring Harbor Laboratory, September 9-12, 2003. It is designed and managed by Mark Stoeckle, Perrin Meyer, and Jason Yung at the Program for the Human Environment (PHE) at The Rockefeller University.

About the Program for the Human Environment

The involvement of the Program for the Human Environment in DNA barcoding dates to Jesse Ausubel's attendance in February 2002 at a conference in Nova Scotia organized by the Canadian Center for Marine Biodiversity. At the conference, Paul Hebert presented for the first time his concept of large-scale DNA barcoding for species identification. Impressed by the potential for this technology to address difficult challenges in the Census of Marine Life, Jesse agreed with Paul on encouraging a conference to explore the contribution taxonomy and DNA could make to the Census as well as other large-scale terrestrial efforts. In his capacity as a Program Director of the Sloan Foundation, Jesse turned to the Banbury Conference Center of Cold Spring Harbor Laboratory, whose leader Jan Witkowski prepared a strong proposal to explore both the scientific reliability of barcoding and the processes that might bring it to broad application. Concurrently, PHE researcher Mark Stoeckle began to work with the Hebert lab on analytic studies of barcoding in birds. Our involvement in barcoding now takes three forms: assisting the organizational development of the Consortium for the Barcode of Life and the Barcode of Life Initiative; contributing to the scientific development of the field, especially by studies in birds; and contributing to public understanding of the science and technology of barcoding and its applications through improved visualization techniques and preparation of brochures and other broadly accessible means, including this website. While the Sloan Foundation continues to support CBOL through a grant to the Smithsonian Institution, it does not provide financial support for barcoding research itself or support to the PHE for its research in this field.