DNA Barcoding – Page 21 – The Rockefeller University

mtDNA recovery from old bones hints at DNA durability, ubiquity

December 23, 2008

In another seeming step towards Jurassic Park, two groups of researchers recovered full-length mitochondrial DNA sequences from 22,000 to 44,000 year-old bones of extinct European and North American bears. Full-length mtDNA has been recovered from similarly ancient specimens, but in those cases frozen tissues preserved in permafrost were used. Both groups used specialized PCR protocols employing several hundred primer pairs designed to recover short fragments, rather than one of the newer sequencing technologies, demonstrating the continued power of DNA amplification.

In 28 July 2008 BMC Evol Biol, a group of 18 researchers led by Johannes Krause, Max Planck Institute, Germany, recovered full-length mtDNA from a 44,000 year-old Ursus spelaeus (European cave bear) bone found in an Austrian cave, and from a 22,000 year-old skull of Arctodus simus (American giant short-faced bear) from Eldorado Creek, Canada. In 11 November 2008 Proc Natl Acad Sci USA, 14 researchers led by Jean-Marc Elalouf, Institut de Biologie et Technologies de Saclay, France, report a full-length U. spelaeus mitochondrial genome from a 32,000 year-old bone from the legendary Chauvet-Pont d’Arc Cave, home to the oldest rock art pictures ever found.

If we found a bone from one of these extinct bears in our backyard, could it be identified by its COI barcode? Submitting the long-ago bears’ COI barcode region sequences (positions 58 to 705) to the BOLD ID engine flags both species as not in the database, with an NJ tree similar to that created by full-length genomes (ie the extinct U. spelaeus is sister to U. arctos (Brown bear) and U. maritimus (Polar bear), and the extinct Arctodus simus is sister to Tremarctos ornatus (Spectacled bear)). Of course it would be difficult to recover a full-length sequence–what about the 130 base pair “mini barcode” proposed for broad-scale biodiversity analysis? This is within the size range (ie < 180 bp) that Elalouf and colleagues report as best for recovery of ancient DNA. Remarkably, the A. simus mini-barcode submitted to the BOLD ID engine gives an NJ tree correctly showing T. ornatus as its sister species, and the U. spelaeus mini-barcode correctly picks out U. arctos and U. maritimus as the most closely related species.

Recovering DNA from ancient bones leads to CSI-like thoughts of where else we might usefully recover DNA for species identification. DNA has been recovered from naturally shed feathers, flakes of seal skin at breathing holes in polar ice, hair and saliva left by predators of sheep, bird faeces, and, turning to the world of commerce, ancient and modern processed leather goods (Long 2007). I look forward to analyses of the many processed foods with what is currently an unverifiable “list of ingredients.”

Some taxonomists worry when DNA barcodes highlight unfinished taxonomy

December 13, 2008

In Cladistics 25 Sept 2007, Steven Trewick from Massey University, New Zealand applies mtDNA to help sort out endemic flightless grasshoppers in genus Sigaus, which are restricted to mountainous alpine habitat on New Zealand’s South Island. Here we might expect a complex pattern of diversification. These are small, terrestrial, flightless, presumably non-vagile (ie don’t travel far) animals in a deeply fragmented habitat. Their habitat lies in New Zealand’s central mountains, the Southern Alps, formed by a geologically recent uplift 5 to 2 million years ago. Like other organisms restricted to elevated mountain terrain, they are effectively living on “sky islands.” In this setting, we might expect a plethora of relatively young species with very narrow ranges, with difficulty determining which forms merit species-level status.

Trewick focused on the Sigaus australis species complex, which includes the apparently widely-distributed S. australis, and 5 sympatric or parapatric species with much narrower ranges (S. childi, S. obelisci, S. homerensis, and 2 undescribed species). Within this complex he analyzed 160 individuals collected at 26 locations (mostly S. australis (136 individuals) and 1-13 individuals for the more restricted species). For mtDNA analysis, an approximately 600 bp region of 12S-16S and about 500 bp of 3′ COI (ie not overlapping COI barcode region!) were examined.

Although the 3′ COI fragment analyzed in this grasshopper paper has been utilized in a number of invertebrate mtDNA studies, it is just one of many mtDNA targets that give essentially equivalent phylogenetic information (eg, in this study COI and 12S-16S gave the same results). The hodgepodge of mtDNA regions analyzed in species-level animal work means that most data cannot be compared or combined. In my view, ALL animal mtDNA studies should include the standard COI barcode (defined relative to the mouse mitochondrial genome as the 648 bp region that starts at position 58 and stops at position 705; https://barcoding.si.edu/PDF/DWG_data_standards-Final.pdf), plus of course any other regions of interest. Standardization on the barcode region ensures long-term usefulness, both as a reference for identification and for comparisons across the diversity of animals. In addition to a defined genic target region, DNA barcode standards have other advantages: records are linked to voucher specimens, list primer sequences, and include bidirectional trace files and quality scores.
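As a rough illustration of what that coordinate definition means in practice, the sketch below slices positions 58 to 705 out of a sequence using 1-based, inclusive coordinates. The helper names (`barcode_region`, `region_length`) and the toy sequence are invented for this example, and it assumes the input is already aligned to the mouse mitochondrial reference so that positions line up; real data would need that alignment step first.

```python
# Coordinates from the text: 1-based, inclusive, relative to the mouse
# mitochondrial genome.
BARCODE_START = 58
BARCODE_END = 705


def region_length(start: int, end: int) -> int:
    """Length of a region given 1-based inclusive coordinates."""
    return end - start + 1


def barcode_region(aligned_seq: str,
                   start: int = BARCODE_START,
                   end: int = BARCODE_END) -> str:
    """Return the slice start..end (1-based, inclusive) of aligned_seq.

    Hypothetical helper: assumes aligned_seq shares the coordinate system
    of the reference, which is NOT true of an arbitrary raw sequence.
    """
    if start < 1 or end < start:
        raise ValueError("coordinates must satisfy 1 <= start <= end")
    if len(aligned_seq) < end:
        raise ValueError("sequence is shorter than the requested end position")
    # Python slices are 0-based and end-exclusive, hence start - 1 and end.
    return aligned_seq[start - 1:end]


# Toy 800-base sequence, invented purely to exercise the slicing.
toy_sequence = "ACGT" * 200
region = barcode_region(toy_sequence)
print(region_length(BARCODE_START, BARCODE_END), len(region))
```

The same arithmetic shows why a 3′ COI fragment cannot stand in for the barcode: a fragment whose coordinates fall entirely beyond position 705 shares no bases with this slice.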

In the present study single-strand conformation polymorphism (SSCP) of a 380 bp 12S fragment was used to screen for differences, and then individuals with different SSCP results were subjected to sequencing, so in the end just 40 of 160 Sigaus sp grasshoppers were sequenced for COI. This also means that there is voucher data in GenBank for just these 40 individuals. Continuing down the DNA barcode standard checklist, primer sequences are not easily accessible (there is a published reference for the primers, but access requires article purchase), it is not stated if bidirectional sequencing was done, and trace files and quality scores are not provided. I hope that future studies on New Zealand orthopterans will include the 5′ COI region and the remaining information, as I believe this will increase their long-term utility both as an identification reference and for comparisons across the diversity of animal life (>520,000 individuals representing >50,000 species in BOLD so far). There is a big opportunity for grasshopper specialists to contribute–the BOLD taxonomy browser contains records for only 191 of the approximately 10,000 species in family Acrididae!

To skip to the conclusion, the sequence analysis gave an entirely different picture than existing morphologic taxonomy. 12S-16S and COI gave identical results: four well-supported geographically-structured clades within the widespread S. australis morphospecies, 3 of which had partly overlapping ranges. The 5 described or proposed species in the complex nested within these clusters, with shared or similar mtDNA haplotypes to S. australis from the same region.

The author concludes that the results show that “haplotype sharing and paraphyly essentially invalidate the DNA barcoding approach.” I disagree. To my reading, the most parsimonious explanation is that 1) morphologic taxonomy has overlooked deeply divergent genetic lineages, which likely represent different species, in S. australis for over 100 years, and 2) a number of morphologically distinctive forms have arisen very recently.

In support of the first point I note that in the April 2008 report “Diversity and taxonomic status of some New Zealand grasshoppers” by the same author and Simon Morris, the authors state: “Attention needs to be given to the spatial distribution of diversity within [S. australis complex]…Further morphological study may support the splitting of one or more of the groups indicated by phylogenetic analysis of mtDNA sequences.”

Regarding point 2, genetic methods including DNA barcoding may not resolve very young species. For Sigaus sp. grasshoppers, nuclear sequence data will help sort out whether these are young species or the products of recent hybridization or introgression.

In this regard, I am struck by the apparent variability in some grasshopper species, as in the color morphs of S. childi shown above. It brings to my mind the extraordinary transformations from solitary grasshoppers to swarming locusts (these are members of the same Acrididae family as Sigaus). Perhaps grasshopper genetics include analogous latent “switches” that might enable relatively rapid evolutionary transformations.

Growing DNA barcode database leaps past 50,000 species

November 29, 2008

The DNA barcode initiative aims to establish a universal identification system for plant and animal species by analyzing a standardized genetic locus (or for plants, a small set of loci). In addition to making analysis cheaper, standardizing on one or a few loci enables a diverse assemblage of researchers to work together to build an interoperable library.

If there were no Human Genome Project, researchers working gene by gene might eventually have decoded the human genome sometime during this century, albeit at a much slower pace using more expensive and less accurate technology. For a genetic library of biodiversity, a concerted effort is essential. The various taxon-specific genetic initiatives, which are typically aimed at reconstructing deep evolutionary history, are too limited in scope (ie number of species and individuals per species analyzed) and too expensive in terms of cost per species to completely catalog animal and plant life. In addition, because different groups analyze different gene regions, it is impossible to stitch together the results into a single database, for instance one that could be used to identify an unknown specimen without knowing beforehand what group it belongs to. The DNA barcoding initiative offers the necessary framework for constructing a genetic reference database for species. In addition, as a large-scale project it should help drive technological improvements analogous to those spawned by the Human Genome Project, which enabled its completion for a fraction of the originally projected cost.

As of today, researchers have deposited 516,134 barcode records from 50,138 species in the Barcode of Life Database (BOLD), www.barcodinglife.org. According to my analysis of GenBank shown in the figure, this puts COI BOLD records far above the totals for any other single gene for animals. Thus five years of a concerted, standardized approach has leapt ahead of 30 years of incremental analysis. If the proof is in the pudding, this to me is a pudding that proves the value of the DNA barcoding initiative. Comparison of the totals indicates that most BOLD COI records are not yet in GenBank, although some aspects are visible through the ID engine and Taxonomy Browser, so there is work to do to help move these fully into the public domain and at the same time ensure appropriate academic credit. Congratulations to all those moving this effort forward.

Everyday DNA

November 25, 2008

GPS devices for civilian use were first introduced in 1982. The TI 4100 from Texas Instruments cost $150,000, weighed 50 lbs, and was in heavy demand among land surveyors (GPS World, December 2004). Thanks to steady improvements in cost, size, and power demand, GPS technology is now a standard feature in cellular phones, meeting such daily needs as finding the nearest coffeeshop. The simplicity of everyday use is undergirded by an enormous investment in technology. In a 1997 report, RAND Corporation estimated approximately $8 billion had been spent to develop, launch, and maintain the 24-satellite system that provides GPS signals, and the ongoing costs were $300 million per year.

The GPS history suggests viewing the current drive to establish a DNA reference library for millions of plant and animal species as infrastructure investment, analogous to the GPS satellite system. It is relatively expensive but once established will enable diverse new applications for society and science. What uses will improvements in DNA sequencing married to a robust DNA barcode library bring?

Food authentication is likely to be one major application, including a wide array of products such as fish, olive oil, and packaged mixtures such as soups and pet food.

Making sense of Mexican microcrustaceans

November 16, 2008

In Hidrobiologica March 2008 researchers from El Colegio de la Frontera Sur, Universidad Autonoma Metropolitana, Iztapalapa, Mexico, describe a new species of Cladocera from temporary pools in a semi-desert region. Cladocera, commonly known as “water fleas,” are minute crustaceans mostly limited to fresh water; Daphnia sp are the best known. Cladocera are of practical importance as water quality indicators.

Similar to that for other invertebrates, the species description for this minute (0.4 mm) crustacean Leberis chihuahuensis comprises about 4 pages of mysterious text and 2 pages of equally enigmatic illustrations. In addition, the DNA barcode of the type specimen is provided, as well as the more usual NJ tree, in this case showing 14% sequence divergence from its sister species L. davidi.

By including both kinds of characters, ie DNA barcode and morphology, Elias-Gutierrez and Valdez-Moreno provide what seems to me a model for any new species description, one that will enable specialists and non-specialists alike to make the most use of their findings.
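The 14% figure quoted above for L. chihuahuensis and L. davidi is a pairwise sequence divergence. As a minimal sketch of what such a number measures, the function below computes the uncorrected proportion of differing sites (p-distance) between two aligned sequences. This is a simplification chosen for clarity: barcode studies often report a model-corrected distance such as Kimura 2-parameter instead, and the post does not say which measure the paper used. The function name and the two toy sequences are invented for illustration and are not real Leberis data.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected pairwise distance between two ALIGNED sequences.

    Sites where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped, so the result is differences / sites actually compared.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue  # uninformative site, not counted either way
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy 10-base alignment (invented): one mismatch at the fifth site.
toy_a = "ACGTACGTAC"
toy_b = "ACGTTCGTAC"
print(f"{100 * p_distance(toy_a, toy_b):.0f}% divergence")
```

On a full 648 bp barcode, 14% divergence by this measure would correspond to roughly 90 differing sites.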

What’s in a name?

November 7, 2008

In 2003, Paul Hebert and colleagues proposed a universal identification system employing short DNA sequences as identifiers for animal and plant species. Inspired by the Universal Product Code labels that stores use to track merchandise, he named these short sequences “DNA barcodes.” My colleagues and I set down thoughts inspired by this new name:

“Commercial barcodes and the barcode of life

Jesse Ausubel, Mark Stoeckle, Paul Waggoner

September 2004

Although new methods of sequencing and visualization have displaced the one that produced autoradiographs that show blurry gray stripes of a gel indicating presence or absence of particular bits of DNA, the analogy between the commercial barcode and the barcode of life may be traced to it. However, the power of the analogy comes from other similarities: large capacities to differentiate mind-boggling diversity, ability of digits to distinguish unambiguously, rapidity and economy of identification, ability of parts of the code to distinguish categories, and avoidance of a Tower of Babel by uniformity. We elaborate briefly.

Without the final digit that checks accuracy, the quartets of bars and spaces in the Universal Product Code (UPC) have 10 alternatives at 11 locations, creating an ample 10^11 capacity to identify manufacturers and their products. Instead of operating in quartets, sequences of CATG operate in trios that specify synthesis of an amino acid. Each trio of the four alternative CATG has 4^3 or 64 alternatives. A 600-unit sequence of DNA comprising 200 trios with 64^200 alternatives opens ample capacity to identify millions of species. Such large capacities are needed to differentiate the diversity of an economy or a forest.

Because one product number in a UPC differs from another by discrete, digital steps rather than by the shades of verbal descriptions, the numbers identify the product–unambiguously. A barcode of life written as a sequence of CATG along a uniform locality of genomes differs from another by four discrete, unambiguous steps rather than by gradations of words, shapes, and colors. Barcodes gain power because digital beats analogue at making unambiguous distinctions.

Speed and economy also propel use of barcodes. Behind the beep of a UPC scanner lies orchestration that began with the initial conception of bars for numbers a half-century ago. Users and inventors orchestrated optics, electronics, and software to develop miniature, robust equipment that made the barcode an affordable master key to supermarket inventories and suppliers (Swartz 1999). Now that the price of DNA identification of a species has fallen to about $10 (Randhawa 2004), the orchestration can begin to provide a barcode of life. Uniformity fosters frequent use and thus learning and economy.

Product codes can identify products with increasing resolution. At the first level of resolution, the first bars of a UPC on a carton resolve the manufacturer. At the second level, the last bars resolve the product line. Opening the box and reading the serial number would resolve the individual. In analogous manner, extending a DNA barcode through more and more sequences would resolve from kingdoms to species, subspecies, and finally individuals. For our goal, Ockham’s razor prescribes as short a barcode of life as suffices to distinguish species.

Uniformity bestows the universality implied by the U in UPC. Scanners in hardware, grocery, and convenience stores must all call the same light bulb by the same 12 digits. Recently agreement between America and Europe added a thirteenth digit, made uniformity more widespread, and brought universality closer to realization (NY Times 12 July 2004, page C1). The power of standardization, whether in railroad gauges or typewriter keyboards, is one of the strongest lessons of the history of technology.

Finally, the success of a short DNA sequence distinguishing species will rest on reasoning, testing, and agreement, not just an appealing analogy. Reasoning will select a uniform locality on genomes that varies enough but not too much among species, testing will establish whether barcodes of that uniform locality correspond to established binomial names across several species, and then agreement will foster an expanding compilation of matching barcodes and binomial names.”
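The capacity arithmetic in the essay quoted above (10 alternatives at 11 positions for the UPC, 64 alternatives per nucleotide trio, 200 trios in a 600-unit sequence) is easy to check with exact integers. The sketch below only restates the essay's own numbers; the variable names are invented, and it counts every possible string of letters, not biologically plausible sequences.

```python
# UPC without its check digit: 10 alternatives at each of 11 positions.
upc_capacity = 10 ** 11

# One trio (codon) of the four letters C, A, T, G.
trio_alternatives = 4 ** 3

# A 600-unit sequence read as 200 trios.
trios = 600 // 3
barcode_capacity = trio_alternatives ** trios

# Counting by single letters gives the same total, since 64^200 = 4^600.
assert barcode_capacity == 4 ** 600

print(f"UPC capacity:        {upc_capacity:,}")
print(f"Alternatives / trio: {trio_alternatives}")
print(f"600-base capacity:   a {len(str(barcode_capacity))}-digit number")
```

Python integers are arbitrary precision, so the 600-base total is computed exactly; its size is reported as a digit count because the number itself is too long to be readable.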

Genetics is essential framework for microbiology, eukaryotes next?

October 24, 2008

Robert Koch (1843-1910), father of medical microbiology, isolated agents of mankind’s major plagues: Vibrio cholerae, Bacillus anthracis (anthrax), and Mycobacterium tuberculosis. He laid down four conditions, “Koch’s postulates”, for establishing that an organism is the agent of disease, and subsequent generations of researchers applied these principles to determine the etiology of a multitude of infectious diseases. One legacy of Koch’s postulates was that isolation of organisms in pure culture became the backbone of diagnostic and research microbiology.

A century later, genetics has replaced culture as the essential framework for exploring microbial life. Metagenomic analysis of environmental samples, including from anatomic sites, has identified an unsuspected plethora of organisms, most of which are unculturable, at least under standard laboratory conditions. Even for organisms that can be grown in the laboratory, genetic detection is often the preferred diagnostic method, including for example detection of HIV, Neisseria gonorrhoeae, and Chlamydia sp. Following Carl Woese’s early lead (PNAS 1977, 74:5088), microbiologists have generally included a standard locus, 16S rRNA, in genetic work, enabling phylogenetic trees spanning the diversity of life, and allowing each new isolate to be analyzed in conjunction with the work of others (as of 24 Oct 2008, 75,257 16S rRNA sequences in GenBank).

Are genetic methods equally necessary for eukaryotes? In October 2008 Mol Ecol researchers from Cardiff University analyze mitochondrial COI differences among nine species of British lumbricid earthworms which were first described between 1758 and 1843, over 150 years ago. Partial COI sequences (a 582 bp segment which overlaps the 648 bp DNA barcode region) from 71 individuals showed 2-5 deeply divergent clusters (average 13-15% sequence difference) in 4 of the 8 multiply-sampled species, and small divergences within each cluster, “indicative of the presence of multiple previously undescribed species”. COI sequences from 270 individuals of one species, Allolobophora chlorotica, collected at 24 British and 5 mainland European sites showed 5 divergent clusters and surprisingly no clear geographic distribution pattern; over half the sites had 2 or more lineages, and one site had 4 lineages. As expected the same clusters were found by comparing another mitochondrial gene, 16S rRNA. Two of the lineages were found only in green color morphs; prior work indicated this form has distinct ecological preferences compared to the pink morph of Allo. chlorotica and that F1 hybrids are sterile, suggesting species status. As an aside, if earthworm specialists find morphological and ecological differences and mating incompatibility, why not designate these as distinct species? As another example, two forms of European corn borer Ostrinia nubilalis are sympatric, genetically distinct, develop on different host plants, have different mating pheromones, and exhibit >95% reproductive isolation, yet are described as “host races” rather than separate species (Science 2005, 308:258). It sometimes seems there is an arbitrary aspect to how species status is awarded, or perhaps the process is slow.

To see if mtDNA clusters were also reflected in nuclear genome, King et al performed AFLP (amplified fragment length polymorphism) mapping on 4-12 individuals from each of the 5 lineages. The nuclear results corresponded exactly to COI clusters except that the 2 green morph forms could not be distinguished, suggesting these are either a single interbreeding species (despite 14% mtCOI sequence difference!) or are young species which have not yet accumulated differences in nuclear DNA. It is hard to see how a 14% sequence difference could accumulate in mtDNA without accompanying nuclear changes, so I wonder if one of the genetic forms might reflect a relatively recent introgression from another earthworm species which has not yet been sequenced. It will be interesting to see whether the two green morph lineages, which were often found together at the same site, show assortative mating or restricted fertility. The authors conclude “extraordinary species-level genetic diversity was revealed among the British earthworms”….”four of nine ecologically generalist earthworms are probably complexes of multiple cryptic species”. And finally “further earthworm research in areas such as ecology and ecotoxicology, should be conducted in the knowledge that there are multiple cryptic species within many earthworm species”.

I conclude that genetics is as essential for eukaryotic taxonomy as for microbiology. I believe there is no getting around the need to genetically reexamine most or all of the species named in the past 200 years to see if what we recognize as single and distinct species are really so. If there can be cryptic species in large visible animals such as birds, and males and females can be given different species names in fish, then there must be many more such oversights among the less easily observed. A standardized approach (ie DNA barcoding) is the most expeditious way forward and will leave a permanent marked trail that can easily be followed by non-experts who wish to identify their specimens. As in bacteria, standardizing on a single locus (ie barcode region COI for animals) enables new work to be seamlessly combined with old, leveraging its value (497,851 barcode records from 48,459 species in BOLD so far). Regarding higher-level evolutionary relationships, I find routine dismissal of mtDNA single-locus trees based on mathematical modeling, but not much effort to see what their potential is. Perhaps translated amino acid sequences and/or GC content can be informative for deeper branches, and nucleotide sequences for family- and generic-level relationships. At the very least, mtDNA trees serve to generate hypotheses, which can be corroborated or disproved by more extensive genetic, morphologic, ecologic, behavioral, or fossil record data.

Building DNA libraries: one-quarter world birds so far

October 12, 2008

As of October 11, 2008 researchers have deposited 14,594 DNA barcodes in BOLD representing 2,586 avian species, 26% of world’s 9,933 birds. You can browse taxonomic coverage to date at All Birds Barcoding Initiative (ABBI) and BOLD taxonomy browser sites. Coverage includes representatives of all 27 orders and 159 families of world birds and nearly half of avian genera [1,014/2,101 (48%)]. To uncover possible hidden diversity, most researchers are sampling species across their geographic ranges rather than focusing on named subspecies, many or most of which appear to represent clinal variation (see for example Zink 2004, Phillimore and Owens 2006).

How far along are researchers toward mapping COI barcode resolution of avian species? Birds are of particular interest because species limits are generally well-defined, supported by a wealth of morphologic, ecological, behavioral, and other genetic data. Looked at regionally, there is good coverage in northern North America, parts of Central and South America, western Europe, Korea, and New Zealand, so it should be possible to see how well COI barcodes distinguish among local species in these areas. Published studies so far show >95% resolution of named species and have identified genetically divergent clusters which may represent unrecognized cryptic or “hidden” species (Vilaca et al 2006, Yoo et al 2006, Nyari 2007, Kerr et al 2007). As an aside, “cryptic” is an awkward term for genetically divergent populations of birds since most of these have diagnostic differences in morphology or behavior; “hidden” is more accurate to my ear. On a separate note, COI surveys have regularly revealed misidentified voucher specimens of birds, suggesting routine application of DNA barcode analysis could enhance quality of avian collections.

Looked at globally, there is 100% coverage of 104 polytypic genera (having 2 or more species) representing 324 birds, so this should include the sister species and/or “nearest neighbors” for these, plus there are 853 monotypic genera (having only 1 species) in world birds, which are likely or known to be genetically divergent from birds in other genera. In addition, there are likely many other sister species or “nearest neighbors” within the remaining 1,982 birds with DNA barcodes so far (for example, BOLD includes 28 of 29 Dendroica sp wood warblers). It would be interesting to look at the nearest neighbor differences within the global data set. To my knowledge, comparisons among regions with COI barcode data have not been published. My impression based on other avian genetic work is that named taxa in different biogeographic regions are genetically distinct, plus there are many unrecognized genetic divisions within species that range across biogeographic regions. I look forward to trans-regional and global comparisons!

New places to find DNA

October 6, 2008

In 29 July 2008 Fish Biology scientists from Macquarie University, Sydney describe successful recovery of mitochondrial DNA from contemporary and historical shark teeth and jaws. After developing the method on 11 recently collected teeth from Gray nurse shark Carcharias taurus and Ornate wobbegong (excellent name!) Orectolobus halei, Ahonen and Stow applied it to 20-40 year old museum specimens, including 5 jaws from 3 species and 19 individual teeth from 2 species. They collected approximately 0.02-0.06 g of “tooth powder” by drilling several small holes into a tooth or jaw; DNA was extracted using a standard silica-based method or the Qiagen DNeasy tissue kit.

The authors are interested in historical population sizes for sharks; following the theory that genetic variation within species is an indicator of population size, they picked the hypervariable control region as their target. As an aside, results so far with mitochondrial surveys including DNA barcoding generally show very low variation within most animal species and no relationship between intraspecific variation and census population size. In any case, a 700 bp fragment of mtDNA control region was amplified with a single pair of primers. The two extraction methods gave similar results. DNA was amplified and sequenced from 100% of the contemporary samples and 15/34 (44%) historical samples. 700 bp is a relatively long sequence to amplify from historical samples, suggesting it may be possible to obtain standard COI barcodes (648 bp) from museum skeletons of sharks and bony fish, which would be particularly useful for those species which are rare or otherwise difficult to collect. A standard set of fish primers (see for example Hubert et al June 2008 PLoS ONE) amplifies the COI barcode region from most fish (more than 5,000 species so far, including representatives of all major divisions of Chondrichthyes (cartilaginous fish) and Osteichthyes (bony fish), both marine and freshwater).

To date most fish specimens are preserved in formaldehyde, which makes routine DNA recovery difficult or impossible. If DNA can be recovered from skeletons, there are many museum specimens that might be used. For example, the American Museum of Natural History Ichthyology Department collection includes over 35,000 fish skeletons as compared to about 2,500 tissue samples so far.

DNA differences first step in describing new spider species

September 28, 2008

In August 2008 Sys Biol (open access!) researchers from East Carolina University apply mtDNA analysis as the necessary first step in defining three new species of trapdoor spider, previously subsumed as a single species Aptostichus atomarius. According to Bond and Stockman, “the genus Aptostichus is species rich, consisting of 30+ species (most undescribed) found predominantly throughout southern California.” I note that the online World Spider Catalog version 9.0, 2008, lists four Aptostichus species, all described between 1891 and 1919, so apparently there is a lot more work to be done, including updating the reference lists. One of the new names is A. stephencolberti, which led to what must be the first appearance of a spider taxonomist on national television (link to TV episode).

The authors describe the challenge for delimiting species in these California trapdoor spiders: “Highly structured, genetically-divergent, yet morphologically homogeneous species (eg nonvagile cryptic species [my note: nonvagile refers to organisms with limited dispersal]), although often ignored or overlooked, provide one of the greatest challenges to delimiting species. Populations, or very small groups of populations constitute divergent genetic lineages but present somewhat of a contradiction because they lack the “requisite” characteristics often used when delimiting species. Morphological approaches to species delimitation in many of these groups grossly oversimplify and underestimate diversity; in short these traditional applications fail if our interests extend beyond what can simply be diagnosed with a visual and/or anthropomorphic-based assessment.”

So on the one hand, these spiders comprise multiple genetically distinct lineages (up to 24% sequence difference in 12S/16S mtDNA) with geographically restricted ranges; on the other hand, they all look more or less alike. How to decide which are species? The authors apply the “cohesion species concept” by asking if the lineages are “genetically and/or ecologically interchangeable.” The authors helpfully provide explicit details of their decision making process. The short version is that genetically distinct, geographically disjunct lineages are counted as separate species, and parapatric or sympatric lineages are counted as different species only if they are NOT “ecologically interchangeable (EI).” EI is calculated from a defined set of ecological and climatic parameters.

Under some criteria, the authors note these spiders could be split into “more than 20” [or even] “~60” groups, which they describe as “an unreasonable number of species-level lineages.” This conjecture may be true; I hope that more scientists apply similarly explicit criteria for species delimitation as described here so we can learn more about how finely divided biodiversity is, in addition to our judgment about what is a “reasonable” number of species. Genetics is a powerful window into biology, of course. In birds the frequency of extra-pair matings (up to 96% pairs and 75% offspring in fairy wrens, for example (Double and Cockburn 2000)) was unsuspected until genetic testing was applied to parents and offspring.

The genetic framework in this study is based on 1300 bp of 12S/16S mtDNA (167 individuals, 75 locations), plus 905 bp nuclear ITS sequence in a subset of 22 individuals. Looking ahead, I hope that in their next study of spider phylogeography the authors include COI as an mtDNA locus (full-length sequence is 1500 bp, so that would likely have given the same phylogenetic signal as 12S/16S); this would enable the authors and others to combine their data with the reference COI DNA barcode databases.

I close with an observation about spider genetic data. To my eye, there are surprisingly few genetic data on spiders so far. A search in GenBank for Order Araneae (spiders) shows 9,445 sequences (representing any gene) from 1,852 species (4.6% world total of 40,432 species (World Spider Catalog)). Looking at mitochondrial genes, there are 2,629 COI sequences from 1,071 species (2.6% world) and 2,268 12S/16S sequences from 1,041 species (2.6% world). Thus it appears that only about 1/40th of world’s spiders have a uniform gene locus deposited in GenBank, and on average, only 2 individuals per species have been sequenced. The Spider Tree of Life project plans to sequence 50 loci (including COI and 12S/16S) from about 500 species, so that will help. I hope that arachnologists will follow the approach in this paper and include a standard genetic locus (most usefully COI) as part of species descriptions and analyze multiple individuals per species. Among other applications, this might help identify currently unidentifiable juvenile forms, like the wind-blown “little aeronaut[s]” that arrived on silk threads in vast numbers on the Beagle when it was sixty miles distant from land, November 1, 1832 (Voyage of the Beagle).
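The coverage figures in the paragraph above follow directly from the GenBank and World Spider Catalog counts it quotes. The short sketch below recomputes them; the dictionary layout and function names are my own, and the only inputs are the numbers given in the text.

```python
WORLD_SPIDER_SPECIES = 40_432  # World Spider Catalog total quoted in the text

# locus -> (sequences in GenBank, species represented), as quoted in the text
genbank_araneae = {
    "any gene": (9_445, 1_852),
    "COI": (2_629, 1_071),
    "12S/16S": (2_268, 1_041),
}


def coverage_percent(species: int, world_total: int = WORLD_SPIDER_SPECIES) -> float:
    """Share of the world's spider species with at least one sequence."""
    return 100 * species / world_total


def sequences_per_species(sequences: int, species: int) -> float:
    """Average number of sequences per represented species."""
    return sequences / species


for locus, (sequences, species) in genbank_araneae.items():
    print(
        f"{locus:>8}: {coverage_percent(species):.1f}% of species, "
        f"{sequences_per_species(sequences, species):.1f} sequences per species"
    )
```

For COI and 12S/16S this gives about 2.6% of species each, which is the "about 1/40th" in the text, with a little over 2 sequences per species.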

Rockefeller University

Program for the Human Environment

Area of Research: DNA Barcoding