Plants challenge DNA barcoding. It has been difficult to identify candidate barcode regions that both amplify readily and distinguish among closely related species. In the 7 February 2008 PNAS (open access), researchers from the University of Johannesburg; the University of Costa Rica; Royal Botanic Gardens, Kew; and Imperial College London analyze potential barcode regions in specimens collected in plant biodiversity hotspots in Kruger National Park, South Africa, and Costa Rica. They initially tested eight candidate regions identified in earlier studies (coding regions accD, rpoC1, rpoB, ndhJ, ycf5, and matK, and non-coding trnH-psbA). Amplification followed earlier studies, except that a different set of matK primers, which appeared to be more effective, was used. All eight regions were examined in 101 specimens representing 32 species of trees, shrubs, and achlorophyllous parasites from South Africa, and in 71 specimens representing 48 species of Costa Rican orchids (in all, 44 species with 2 to 7 specimens per species, and 36 species with one specimen). Based on their analysis, the coding region matK with the new primer set and the non-coding region trnH-psbA were >90% effective in species identification. For reasons I do not understand, the authors favor the unweighted pair group method with arithmetic mean (UPGMA) for analyzing genetic clustering, although they also tested neighbor-joining, maximum likelihood, maximum parsimony, and Bayesian methods. Given the presumed advantages of a coding region barcode (ease of alignment, greater higher-level phylogenetic signal), Lahaye et al. propose the 5′ region of the plastid gene matK as a first-pass standard barcode for plants.
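For readers unfamiliar with UPGMA, here is a minimal sketch of the idea in Python. It is not the authors' workflow: the sequences are made up, the specimen names (`sp1_a` and so on) are hypothetical, and the distance used is a simple proportion of differing sites, chosen only to keep the example short. UPGMA repeatedly joins the two closest clusters and defines the distance from the merged cluster to any other as the average over all pairs of members.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of aligned sites at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(names, dist):
    """Cluster by UPGMA; return the tree as nested tuples of names.

    dist is a square matrix (list of lists) of pairwise distances,
    in the same order as names.
    """
    trees = {i: name for i, name in enumerate(names)}  # cluster id -> subtree
    sizes = {i: 1 for i in trees}                      # cluster id -> leaf count
    d = {frozenset((i, j)): dist[i][j] for i, j in combinations(trees, 2)}
    next_id = len(names)
    while len(trees) > 1:
        # Join the two clusters with the smallest average distance.
        a, b = min(combinations(trees, 2), key=lambda pair: d[frozenset(pair)])
        # Distance from the merged cluster to each remaining cluster is the
        # size-weighted mean of the two old distances, which equals the
        # arithmetic mean over all pairs of leaves.
        for c in trees:
            if c not in (a, b):
                d[frozenset((next_id, c))] = (
                    sizes[a] * d[frozenset((a, c))]
                    + sizes[b] * d[frozenset((b, c))]
                ) / (sizes[a] + sizes[b])
        trees[next_id] = (trees.pop(a), trees.pop(b))
        sizes[next_id] = sizes.pop(a) + sizes.pop(b)
        next_id += 1
    return next(iter(trees.values()))


# Made-up aligned fragments: two specimens each of two hypothetical species.
seqs = {
    "sp1_a": "ACGTACGTAC",
    "sp1_b": "ACGTACGTAT",
    "sp2_a": "ACGAACTTGC",
    "sp2_b": "ACGAACTTGT",
}
names = list(seqs)
dist = [[p_distance(seqs[a], seqs[b]) for b in names] for a in names]
tree = upgma(names, dist)
print(tree)
```

In this toy case the two specimens of each made-up species pair up first, and the two pairs join last. A barcode "works" in this clustering sense when specimens of the same named species fall together in the tree, which is the kind of match the authors count.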
The authors then analyzed the 5′ matK barcode in a much larger sample of orchids: 1,566 specimens representing 1,084 Mesoamerican species. Excitingly, this is the largest test to date of within-species variation in a candidate plant barcode. They report 212 genetic clusters in the UPGMA tree, of which “86 fully matched previously recognized species and a further 25 partially matched taxonomic species…an examination of these clusters reveals cryptic species, which need further taxonomic work”. I am unsure from this short report what “partially matched taxonomic species” are, or how many possible cryptic species were identified. I look forward to a more detailed report on the DNA barcodes, morphology, and range distribution of this very large sample of Mesoamerican orchids. A DNA-based method for identifying non-flowering orchids and other plants could help protect many threatened species.
A concise, comprehensive review of plant barcoding results so far, including where Lahaye et al.’s work fits in, appears in this same issue, 26 February PNAS (http://www.pnas.org/cgi/reprint/0800476105v1), by Kress and Erickson of the Smithsonian Institution.