The Barcode Blog

A mostly scientific blog about short DNA sequences for species identification and discovery. I encourage your commentary. -- Mark Stoeckle

Visualizing birds: 4. Diagnostic differences

In 2004, the American Ornithologists’ Union (AOU) (Banks et al. 45th supplement) recognized most of the smaller-bodied forms of Canada Goose (Branta canadensis) as a separate species, Cackling Goose (B. hutchinsii). These birds can be difficult to distinguish in the field, even for banders, because some of the smaller forms of canadensis overlap in size with the larger forms of hutchinsii (see for example David Sibley’s account). Given this morphological overlap, one might also expect a range of genetic differences between the species, with some Canada Geese being very similar to some Cackling Geese.

Using COI as a genetic flashlight, I have been surprised to find that sequence differences between species are generally fixed. Individuals of a species do differ from each other in COI, but they usually differ in ways that do not change the distances between species. There are exceptions, particularly with species that hybridize regularly, and perhaps with very young species, but these are a small minority of the birds analyzed so far. Stated another way, for most species there are no genetically intermediate forms. One important corollary is that early results with small numbers of individuals are likely to be indicative of results with more comprehensive sampling, which is what has been seen with the All Birds Barcoding Initiative (ABBI) to date.

Where’s the data? Here are some illustrations of results so far. For the figure at left, I downloaded one B. canadensis and one B. hutchinsii barcode from the public records section of BOLD (www.barcodinglife.org), and printed the map showing where each specimen was collected (a very useful tool in BOLD). For this and subsequent sequence analyses, I used publicly available MEGA software to highlight all sites at which the two sequences differed. (In MEGA you click “variable” to highlight and then “export highlighted sites to Excel.” I then used Excel’s “conditional formatting” to color the cells according to letter.) These two sequences differed at 13 out of 653 COI positions, 11 of which were at the 3rd codon position (codon position may turn out to be interesting later on).
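The same comparison can be sketched in a few lines of Python. This is only an illustration of the idea, not the MEGA procedure itself: the function name, the `frame_offset` parameter, and the toy sequences are all hypothetical, and the sketch assumes the two sequences are already aligned, the same length, and in frame unless an offset is given.

```python
def differing_sites(seq_a, seq_b, frame_offset=0):
    """Return (position, base_a, base_b, codon_position) for each mismatch.

    Positions are 1-based. frame_offset is the number of bases before the
    first complete codon; this is an assumption that depends on how the
    alignment was trimmed.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = []
    for i, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper())):
        # Skip gaps and ambiguous base calls rather than counting them.
        if a not in "ACGT" or b not in "ACGT":
            continue
        if a != b:
            codon_pos = (i - frame_offset) % 3 + 1
            sites.append((i + 1, a, b, codon_pos))
    return sites


# Toy sequences for illustration only, not real barcodes.
toy_a = "ATGGCATTC"
toy_b = "ATGGCGTTT"
diffs = differing_sites(toy_a, toy_b)
print(diffs)  # [(6, 'A', 'G', 3), (9, 'C', 'T', 3)]
```

With two real barcodes, `len(diffs)` would give the number of differing sites, and counting the entries whose last field is 3 would give the number at the third codon position.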

Now what happens if you analyze a larger number of individuals? For the next illustration, I used the BOLD Taxonomy Browser, navigated to Chordata-Aves-Anseriformes-Branta, downloaded all public sequences, and used MEGA as above to highlight and export all sites that differed among the set (there are several other Branta species; these were deselected for this analysis).

With over 100 individuals for each taxon collected at widely dispersed sites (including some canadensis in Norway and Sweden), variation within both species was observed. Most of this consisted of scattered differences found in one or a few individuals, although there did appear to be a number of canadensis individuals with a shared variant, which might be of interest for further study.

However, the intraspecific variation rarely involved diagnostic sites, with the result that all pairwise comparisons between canadensis and hutchinsii differed at 12 or 13 sites.
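One way to make "diagnostic site" concrete is to call a position diagnostic when the bases seen in one species never occur in the other. The sketch below applies that definition, which is my working assumption rather than a standard taken from MEGA or BOLD, and also reports the range of pairwise differences between the two groups. The function names and toy sequences are hypothetical, and the input is assumed to be aligned, equal-length, uppercase sequences.

```python
from itertools import product


def diagnostic_sites(group_a, group_b):
    """Return 1-based positions where the two groups share no base."""
    sites = []
    for i in range(len(group_a[0])):
        bases_a = {s[i] for s in group_a if s[i] in "ACGT"}
        bases_b = {s[i] for s in group_b if s[i] in "ACGT"}
        # Diagnostic: both groups have data and no base is shared.
        if bases_a and bases_b and bases_a.isdisjoint(bases_b):
            sites.append(i + 1)
    return sites


def pairwise_difference_range(group_a, group_b):
    """Return (min, max) mismatch count over all between-group pairs."""
    counts = [
        sum(1 for x, y in zip(a, b) if x != y and x in "ACGT" and y in "ACGT")
        for a, b in product(group_a, group_b)
    ]
    return min(counts), max(counts)


# Toy alignments for illustration only, not real barcodes.
species_x = ["ACGTAC", "ACGTAC", "ACGTAT"]
species_y = ["ATGTGC", "ATGTGC", "ATGCGC"]

print(diagnostic_sites(species_x, species_y))           # [2, 5]
print(pairwise_difference_range(species_x, species_y))  # (2, 4)
```

In the toy data, positions 4 and 6 vary within one species but are not diagnostic, so they widen the pairwise range without changing the set of fixed differences.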

I close with another, slightly more complex example. There are 5 Catharus thrushes in North America. These are relatively small, drab woodland birds with haunting, ethereal songs (you can listen to Hermit Thrush (C. guttatus) song on the Cornell Laboratory of Ornithology site). One bird, Bicknell’s Thrush (C. bicknelli), was first recognized by the AOU as a species distinct from Gray-cheeked Thrush (C. minimus) in 1998, and distinguishing individuals except by song is difficult even for experts with hand-held birds.

I downloaded all public Catharus barcodes using the BOLD Taxonomy Browser, and analyzed them as described above. In comparing single sequences from the 5 species, pairs of species differed at 6 to 52 sites. With larger sample sizes (12-34/species), some intraspecific variation was observed, particularly in Hermit and Swainson’s Thrushes (see NJ tree at left of larger alignment), but diagnostic differences were mostly unchanged, even for the very closely related minimus-bicknelli-fuscescens group.

[On a separate note, the nature of intraspecific variation might be of interest: a disproportionate number of the variants are singletons (present in one individual in the set) and are codon first or second position substitutions (whereas most interspecific differences among closely related birds are at codon third position). No doubt evolutionary biologists have investigated this previously, but perhaps not with such a large number and diversity of species with multiple individuals analyzed.]
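A rough way to look for this pattern is to scan the sequences of a single species for sites where some base occurs in exactly one individual, and record the codon position of each such site. The sketch below is a hypothetical illustration with made-up sequences. It assumes aligned, equal-length, uppercase input, and that the alignment starts in frame unless `frame_offset` says otherwise.

```python
from collections import Counter


def singleton_sites(sequences, frame_offset=0):
    """Return (position, codon_position) for each singleton site.

    A site counts as a singleton when it is variable and at least one
    base at that site is present in exactly one individual.
    """
    results = []
    for i in range(len(sequences[0])):
        counts = Counter(s[i] for s in sequences if s[i] in "ACGT")
        if len(counts) < 2:
            continue  # invariant site, or no usable data
        if 1 in counts.values():
            results.append((i + 1, (i - frame_offset) % 3 + 1))
    return results


# Toy alignment for illustration only, not real barcodes.
toy_species = [
    "ATGGCA",
    "ATGGCA",
    "ATGGCA",
    "CTGGCA",  # unique base at position 1
    "ATGGTA",  # variant at position 5 shared with the next sequence
    "ATGGTA",
]

singletons = singleton_sites(toy_species)
print(singletons)  # [(1, 1)]

# Tally singleton sites by codon position.
by_codon_position = Counter(codon_pos for _, codon_pos in singletons)
print(dict(by_codon_position))  # {1: 1}
```

Position 5 in the toy data is variable but shared by two individuals, so it is not reported as a singleton.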

These figures help illustrate the nucleotide sequence differences that distinguish species. In the language of evolutionary biology, these sequence differences are diagnostic characters. An NJ tree is a powerful shorthand way of representing these differences. In some situations, analyzing the actual diagnostic characters will be important. It might be a useful exercise for the scientific community to compile and display diagnostic differences on the web, at least for groups in which most or all of the closely related species have been surveyed.

This entry was posted on Saturday, March 26th, 2011 at 10:51 pm and is filed under General.

Contact: mark.stoeckle@rockefeller.edu

About this site

This web site is an outgrowth of the Taxonomy, DNA, and Barcode of Life meeting held at Banbury Center, Cold Spring Harbor Laboratory, September 9-12, 2003. It is designed and managed by Mark Stoeckle, Perrin Meyer, and Jason Yung at the Program for the Human Environment (PHE) at The Rockefeller University.

About the Program for the Human Environment

The involvement of the Program for the Human Environment in DNA barcoding dates to Jesse Ausubel's attendance in February 2002 at a conference in Nova Scotia organized by the Canadian Center for Marine Biodiversity. At the conference, Paul Hebert presented for the first time his concept of large-scale DNA barcoding for species identification. Impressed by the potential for this technology to address difficult challenges in the Census of Marine Life, Jesse agreed with Paul on encouraging a conference to explore the contribution taxonomy and DNA could make to the Census as well as other large-scale terrestrial efforts. In his capacity as a Program Director of the Sloan Foundation, Jesse turned to the Banbury Conference Center of Cold Spring Harbor Laboratory, whose leader Jan Witkowski prepared a strong proposal to explore both the scientific reliability of barcoding and the processes that might bring it to broad application. Concurrently, PHE researcher Mark Stoeckle began to work with the Hebert lab on analytic studies of barcoding in birds. Our involvement in barcoding now takes 3 forms: assisting the organizational development of the Consortium for the Barcode of Life and the Barcode of Life Initiative; contributing to the scientific development of the field, especially by studies in birds; and contributing to public understanding of the science and technology of barcoding and its applications through improved visualization techniques and preparation of brochures and other broadly accessible means, including this website. While the Sloan Foundation continues to support CBOL through a grant to the Smithsonian Institution, it does not provide financial support for barcoding research itself or support to the PHE for its research in this field.