Successful automation often involves machines that carry out tasks differently than people do. For example, a Coulter counter (developed by Wallace H. Coulter, an American engineer) analyzes blood cells by the change in electrical impedance each cell causes as it passes through a small aperture, producing a detailed report of red and white cell types faster and more cheaply than a technician examining a blood smear under a light microscope. As another case, machine identification of commercial products is enabled by a UPC bar code, which represents a product name in a digital format that can be “read” almost instantaneously by a laser scanner. In a similar way, DNA barcoding “reads” the digital code of DNA and associates it with species names in a reference database, opening the door to fully or partly automated identifications. In the 9 September 2010 issue of Nature, scientists from the Natural History Museum, London; Louisiana State University; and the University of Plymouth, UK, propose a different route to automating taxonomic identification, namely, teaching computers to do morphologic pattern recognition. Now that we are on the threshold of “anyone, anywhere, anything” identification with DNA barcoding, this seems a step backward.
I see three major challenges that limit any morphology-based identification system: naming an organism from bits and pieces, recognizing look-alikes and life stages, and the diversity of diagnostic features, many of which require specialized equipment to examine. DNA, on the other hand, is the same whether it comes from an intact specimen or an unrecognizable stomach fragment, readily distinguishes look-alikes in any life stage, and can be analyzed using the same equipment regardless of specimen. More generally, little scientific insight will be gained from a system that distinguishes life forms by the multitudinous particulars of appearance, whereas a library of DNA barcodes linked to named specimens offers a broad view of species-level differences across the diversity of life.
According to MacLeod and colleagues, “a [DNA] bar code isn’t useful until the reference species has been identified by experts”. This makes no sense to me. All large barcode surveys of animals, from ants to fish, have revealed hidden genetic divergences, in many cases leading to recognition of new species. In fact, DNA barcoding is a fast way of screening existing collections for unrecognized species. In the same section, as part of discounting a DNA approach, they state “researchers frequently need to identify non-living objects as well as living ones”. I don’t understand how this is an objection, since, for example, DNA barcodes from ancient bone fragments have been used to define species of extinct flightless moa (Lambert et al., J Heredity 2005).
I know from iPhoto’s remarkable ability to recognize individuals that computers are getting better at pattern recognition. Further development focused on taxonomic specimens may lead to useful tools. However, it seems unlikely to produce a widely applicable automated system. In a study cited by the authors, phytoplankton identifications by 16 marine ecologists were compared with those made by DiCANN, a machine learning system (Culverhouse et al., Marine Ecol Prog Series 2003). The authors of that study reach a conclusion that is likely to be generally true of morphology-based identification: “In general, neither human nor machine can be expected to give highly accurate or repeatable labeling of specimens”.










