Census of Marine Life goes Experimental and Weird

The Census of Marine Creature video, produced by the CoML and National Geographic about the work of the CoML deep-sea teams, has been nominated for a Webby Award, with selection on 8 June 2010. The Webby Award is the leading international award honoring excellence on the Internet. The video features several species found by the Census of Marine Life in the deep sea, including a Piglet Squid and Football Octopod. Webbys are awarded both by an expert jury and by popular polling. To vote for the video, visit https://www.youtube/webby and click “Experimental & Weird”; the Census of Marine Life video is last on the page. Support the Census of Marine Life by voting today.

The Hidden Majority of Marine Life

For a lively summary of the latest news from the Census of Marine Life, including still images and videos, visit the news release about “hard-to-see” creatures. Learn about a carpet of bacteria the size of Greece and 35 elephants of marine microbes for every human.

Leishmaniasis: DNA helps ID vectors, parasite, control agent

Leishmaniasis is a chronic parasitic infection caused by various Leishmania species, kinetoplast protozoans related to Trypanosoma (the latter includes agents of African sleeping sickness and of Chagas disease, which has been suggested as a cause of Charles Darwin’s ill health in later life). Depending on the species involved, leishmaniasis manifests as illness ranging from non-healing cutaneous or mouth ulcers (CL) to sometimes fatal visceral infection (VL). In the Neotropics, 12 species infecting humans have been identified, all associated with CL. Neotropical leishmaniasis is mostly zoonotic (i.e., it originates from animal reservoirs as opposed to human-to-human transmission), and the vectors are tiny phlebotomine sand flies, particularly Lutzomyia sp.

In March 2010 PLoS Neglected Tropical Diseases, investigators from Smithsonian Tropical Research Institute (STRI) and Instituto Conmemorativo Gorgas de Estúdios para la Salud, Panamá, apply DNA testing to Lutzomyia sand flies collected on Barro Colorado Island, STRI’s island home in the Panama Canal. Aiming to analyze as many species as possible, Azpurua and colleagues selected 435 individuals, which they morphologically identified as representing 16 Lutzomyia and 2 Brumptomyia sand fly species, for further analysis. Over 95% of specimens in the original collection were from one species, L. panamensis, so this was not a completely representative sample; nonetheless, “the relative abundances of species collected in this study were significantly correlated to those found in a previous intensive study of sand fly community composition on the [Panama] mainland…that collected over 30,000 Lutzomyia individuals in 35 species.”

To skip to the end, COI barcodes unambiguously assigned all 435 individuals to 18 distinct lineages corresponding to named species, and also highlighted 2 genetically divergent individuals that might represent cryptic species. Using primers for ITS-1 (a nuclear gene) and mini-circle DNA (part of mitochondrial genome), Leishmania were detected in 2 of 5 human-biting species, Lu. trapidoi (13/30 individuals tested, 43.3%) and Lu. gomezi (5/19 individuals tested, 26.3%). By my estimate, taking into account relative abundances of Lutzomyia sp., about 1% of Barro Colorado Island sand flies carry Leishmania. Surprisingly, DNA sequencing identified the parasite as Le. naiffi, a South American species not previously reported in Panama. Finally, using the same set of DNA extracts, the researchers tested for Wolbachia, a rickettsial intracellular insect parasite and candidate biological control agent. Wolbachia were found in 3 of 18 species, including 50% of Lu. trapidoi, the main vector of CL in Panama. As an aside, I note that the presence of Wolbachia apparently did not interfere with discriminating among sand fly species; hypothesized interference from Wolbachia was one of the early worries some expressed about DNA barcoding (e.g., Whitworth et al Proc Biol Sci 2007).
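Because the infection rates above rest on small numbers of tested flies, it may help to see how wide the uncertainty is. The sketch below is my own illustration, not an analysis from the paper: it takes the counts quoted above and computes a 95% Wilson score interval for each species. The function name and the choice of the Wilson method are assumptions made for this example.

```python
import math

def wilson_interval(positives, tested, z=1.96):
    """Return (low, high) Wilson score interval for a binomial proportion."""
    p = positives / tested
    denom = 1 + z**2 / tested
    centre = (p + z**2 / (2 * tested)) / denom
    half = z * math.sqrt(p * (1 - p) / tested + z**2 / (4 * tested**2)) / denom
    return centre - half, centre + half

# Counts quoted in the text: (Leishmania-positive, individuals tested).
counts = {"Lu. trapidoi": (13, 30), "Lu. gomezi": (5, 19)}

for species, (positives, tested) in counts.items():
    low, high = wilson_interval(positives, tested)
    rate = positives / tested
    print(f"{species}: {rate:.1%} positive (95% interval {low:.1%} to {high:.1%})")
```

With samples this small the intervals are broad, so the per-species rates are best read as rough indications rather than precise prevalences.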

Standardized DNA testing enables many more persons to identify insects, regardless of life stage, including those that serve as vectors for human diseases. In this report by Azpurua and colleagues, the discovery of a new species of Leishmania for Panama, and possible undescribed Lutzomyia vectors, suggests that wide application of standardized DNA testing will lead to further discoveries relevant to control of human and animal infectious diseases.

Simplified DNA barcode recipe-skip step one

It used to be standard practice to shave the area around the incision before surgery, as it was thought that hair harbored bacteria that would cause wound infection. Beginning in the 1970s, doctors found this was unnecessary, and in fact was associated with higher incidence of post-operative infection. This history comes to mind in reading a March 2010 BioTechniques report by researchers from University of Guelph demonstrating that DNA suitable for PCR and sequencing can be obtained simply by leaving specimens in alcohol overnight!

Of the three steps required to get from a specimen to a DNA barcode, namely DNA isolation, PCR (polymerase chain reaction), and sequencing, the first step is the most labor intensive and hardest to automate. Numerous protocols/kits have been developed to optimize DNA isolation from various types of specimens, such as plant vs animal tissues. As described by the Guelph researchers, “these procedures force cells to release their DNA via physical perturbation and/or chemical treatment, which is then followed by a clean-up procedure in which unwanted cellular components are separated from the DNA.” The researchers “hypothesized that a small amount of DNA leaks from the tissue into the preservation solution (usually ethanol), and that this DNA was amplifiable using a standard PCR protocol.” To start, they analyzed Monte Alban mescal, which is sold with a “worm” (a caterpillar of the agave moth, Hypopta agavis) in each bottle. They evaporated 50 mL mescal, re-dissolved the residue in water, applied this to a Qiagen MinElute spin column, resuspended the product in 50 µL water, and used 2 µL of the resulting solution in a standard 25 µL PCR reaction, with successful amplification and sequencing of a 130-base mini-barcode of COI. This case was presumably challenging as mescal is only 40% ethanol and contains a variety of material that might inhibit PCR. In subsequent tests, 1 mL of 95% ethanol used to preserve specimens was evaporated, the residue resuspended in 30 µL of water without column purification, and 2 µL used for PCR.

By evaporating 1 mL of ethanol in which specimens had been stored overnight (out of 2 mL total ethanol volume) and re-suspending the residue in water, Shokralla and colleagues amplified and sequenced 130-base and 650-base fragments of COI and an 1100-base fragment of 28S RNA from 25 whole insect specimens (mayflies, caddisflies; 1 gave COI only) and rbcL from 45 plant specimens (0.5 mm leaf samples). They also obtained COI sequences by sampling 1 mL of ethanol solution from 7 insect specimens stored at room temperature for 7 to 10 years. The researchers note this approach could facilitate “high-throughput” analyses, as it involves liquid handling which is easy to automate, avoids destructive sampling, and could be used even when “there is simply no sample left for further analysis.” They conclude with a caution about “field sampling procedures that include placing mixtures of specimens in an ethanol jar” as this “may increase the chance of cross-contamination.”

The remarkably simple procedure reported by Shokralla and colleagues offers benefits to many persons who want to get DNA barcode identifications. I look forward to applications of this method in research and commercial laboratories, classrooms, and perhaps kitchens!

Update-Cardiovascular vs. Cancer

Using more recent data, we updated our analysis of Death and the Human Environment published in 2001 on competition between cancers and cardiovascular conditions as the leading cause of death in the US. We revisited the paper’s Figure 8, which projected cancers overtaking heart diseases about the year 2015.

In the past decade the fraction of deaths from cardiovascular conditions has decreased significantly, from 42% in 1993 to 34% in 2006, as forecast. However, unexpectedly, the fraction of deaths resulting from malignant neoplasms has remained level (~23%). Fitting logistic curves to the updated dataset, we deduce that if the trends remain stable, the crossover is now likely deferred to about 2025.
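To give a feel for where a date in the mid-2020s comes from, here is a back-of-envelope sketch. It is not the fit used in our analysis: it assumes the cardiovascular share follows a simple logistic trend passing through just the two points quoted above (42% in 1993, 34% in 2006) and that the cancer share stays flat at 23%. The function names are made up for this illustration.

```python
import math

def logit(share):
    """Log-odds of a fraction between 0 and 1."""
    return math.log(share / (1 - share))

def crossover_year(year_a, share_a, year_b, share_b, target_share):
    """Year at which a two-point logistic trend reaches target_share.

    A logistic curve is a straight line in logit space, so two points
    fix the slope and the crossing can be solved for directly.
    """
    slope = (logit(share_b) - logit(share_a)) / (year_b - year_a)
    return year_b + (logit(target_share) - logit(share_b)) / slope

# Cardiovascular share: 42% (1993) and 34% (2006); cancer assumed flat at 23%.
year = crossover_year(1993, 0.42, 2006, 0.34, 0.23)
print(f"Two-point logistic estimate of crossover: about {year:.0f}")
```

Even this crude version lands in the mid-2020s, roughly in line with the deferred crossover described above, though a fit to the full dataset is of course the better guide.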

Declining rates of smoking are likely a major contributor to the steady decline in cardiovascular deaths since 1965 and the more recent flattening of cancer deaths. For example, epidemiologists associated with the American Cancer Society recently reported a 30% decline in age-standardized cancer death rates since 1990, resulting “mostly from reductions in tobacco use, increased screening allowing early detection of several cancers, and modest to large improvements in treatment for specific cancers” (Jemal et al PLoS ONE, March 2010). The temporal trends in cardiovascular and cancer deaths may reflect differential effects of stopping smoking, as tobacco cessation has a large, relatively immediate benefit for cardiovascular disease, whereas reductions in cancer are more modest and delayed.

Food Fraud – Washington Post

Our high school DNA study with Brenda Tan and Matt Cost was the lead item in a front-page article on “food fraud” in the Washington Post on March 30, 2010. In addition to detailing the students’ findings, reporter Lyndsey Layton quotes PHE researcher Mark Stoeckle on DNA barcode testing of food, “If it’s simple enough that high school students with some supervision can do it, it moves out of the research application to something you can do regularly.” The WP article was widely reported and discussed in food-related blogs, and prompted a statement from Congressman John Dingell (MI), noting “more evidence of…insufficient authorities and resources to ensure the quality and safety of the U.S. food supply,” and calling for passage of H.R. 2749, the Food Safety Enhancement Act.

For accurate census, birds await their barcodes

Although birds have been studied in more detail than any other large group of animals, mtDNA continues to reveal many overlooked species, such that named taxa turn out to be comprised of two or more distinct species. These revisions include some very familiar birds, e.g., Canada Goose, which was recently recognized as comprising two species, Cackling Goose (Branta hutchinsii) and Canada Goose (B. canadensis) (A.O.U. Check-list 45th suppl. 2004; for a current example, see Päckert et al Mol Phylogenet Evol 2010). Although such taxonomic revisions reflect a combined analysis of morphological, behavioral (particularly song), geographic range, and DNA information, to my reading mtDNA generally trumps the other data, which mostly serve as corroborating evidence. It is not that mtDNA similarities or differences are important per se; it is that they are strongly predictive of the presence or absence of organismal differences, particularly reproductive isolation, that are the hallmarks of species status. Of the species examined so far (I estimate about 1/3 of the 10,000 world birds), most demonstrate a similar patterning of limited mtDNA differences within species and relatively large differences among species, so we can be confident that this analytic approach will hold up. As an aside, referring to splits of named species as “taxonomic inflation” is misleading, as it suggests the real number of species is already known, which seems no more correct than referring to binary stars resolved with a new telescope as reflecting “stellar inflation.”

As with most animal groups, there are no nuclear genes that regularly distinguish closely related birds. (It seems likely that sequencing entire nuclear genomes will enable discriminating species units, but this is inherently a more costly, less standardizable approach, making it unattractive for routine use.) Differences among closely related species are more or less evenly distributed throughout the mitochondrial genome, so that approximately 500-1000 bp of coding DNA usually contains sufficient information to recognize species and create genus-level phylogenies. Moreover, if species are not distinguished by this length of mtDNA then additional sequencing usually does not improve the resolution. This might be better stated the other way around, namely, if two sets of birds cannot be distinguished by a relatively short stretch of mitochondrial DNA, then they most likely belong to the same species. Why is this so? The prevailing null hypothesis is that mitochondrial differences are an accidental if useful accompaniment to species status, reflecting genetic drift over the several hundred thousand years of reproductive isolation presumably required for new species to emerge. In a recent essay (Nature 18 Nov 2009), Nick Lane explores the interesting possibility that mitochondrial differences cause speciation, thus producing what they identify.

What barcoding adds to the historical approaches in avian systematics is standardization, so that the same locus is analyzed in every specimen, which speeds completion of species-level avian taxonomy and facilitates large-scale comparisons. Given that most bird species have yet to be analyzed for species-level genetic differences, the standardized approach can save considerable time and money, as the results of independent investigators can be readily merged (e.g. Johnsen et al J Ornithol 2010, Kerr et al Frontiers Zool 2009). Beyond classification, this approach creates a reference library of avian barcodes, which enables identifying unknown specimens, such as from birdstrikes (Marra et al Frontiers Ecol Environ 2009). Our survey of avian frozen tissue collections identified over 315,000 specimens representing over 7,200 species (Stoeckle and Winker, Auk 2009). Based on what is in GenBank, it appears that most of these specimens have never been analyzed for any genetic locus, so a first-pass effort to sequence the COI barcode region will certainly reveal many new species and help resolve higher-level relationships as well. There is an opportunity for a granting agency or foundation to have a large impact on avian systematics for modest cost by supporting barcoding of existing collections, given that the tissue specimens and vouchers, which are the most expensive components of collections, have already been prepared. Looking further ahead, such an effort would also create sets of high-quality DNA extracts, with species identity confirmed by COI barcode, linked to vouchered specimens, that would be a powerful resource for further genetic study.

News about mitochondria

Mitochondria are energy-producing organelles, found in nearly every cell in nearly every plant and animal species (some protozoans lack mitochondria). As first demonstrated by Lynn Margulis in 1967 (J Theor Biol 14:225), mitochondria, like chloroplasts, are derived from bacteria, reflecting an ancient symbiosis that pre-dates the divergence of plant and animal species 1 billion years ago. Both organelles have retained a truncated circular genome and replicate independently of the nucleus. The mitochondrial genome in particular has turned out to be exceedingly useful in tracing evolutionary history, as it is present in all eukaryotic organisms, evolves rapidly as compared to nuclear DNA, and does not undergo meiosis and recombination, processes that scramble the evolutionary lineages of nuclear genes. Because it is several orders of magnitude more abundant than nuclear DNA (hundreds to thousands of mitochondria per cell, and 5 to 10 genomes per mitochondrion), it facilitates forensic identification, including DNA barcoding, even with very small or degraded samples. Given its practical and scientific importance, we want to better understand mitochondrial genetics.

A general observation is that mitochondrial DNA is uniform in an individual, presumably reflecting a stringent bottleneck at oogenesis, such that one or a very small number of mitochondria are passed on. However, a number of cases of heteroplasmy (i.e., individuals harboring multiple mitochondrial variants) are reported in humans and other animals. In 3 March 2010 Nature, researchers from Johns Hopkins University and Case Western Reserve University apply next-generation sequencing to make a high-resolution survey of mitochondrial heteroplasmy in various tissues including cancerous cells. He and colleagues used “two sets of PCR primers, each resulting in amplicons of about 650 base pairs (bp) in length…to cover the mtDNA genome” and the resulting PCR products were sequenced by synthesis in an Illumina GAII. This approach was applied to a sample of normal colonic mucosa, yielding “8.5 million tags that matched the mitochondrial genome.” As a result, “each mtDNA base was sequenced, on average 16,700 times and fewer than 11 bases (0.07% of the 16,569 bp in the mtDNA genome) were represented fewer than 1,000 times.”
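The coverage figures quoted above can be cross-checked with a little arithmetic. The sketch below is my own check, not a calculation from the paper; it assumes that every matching tag contributes its full length to coverage, and the implied tag length it derives is an inference rather than a reported value.

```python
GENOME_BP = 16_569           # length of the human mtDNA genome
TAGS = 8_500_000             # tags matching the mitochondrial genome
MEAN_DEPTH = 16_700          # average times each base was sequenced
LOW_COVERAGE_BASES = 11      # bases represented fewer than 1,000 times

# Total sequenced bases implied by the mean depth.
total_bases = MEAN_DEPTH * GENOME_BP

# Average tag length implied if all tags count fully toward coverage.
implied_tag_length = total_bases / TAGS

# Share of the genome with fewer than 1,000-fold coverage.
low_coverage_fraction = LOW_COVERAGE_BASES / GENOME_BP

print(f"Implied tag length: about {implied_tag_length:.0f} bases")
print(f"Low-coverage share of genome: {low_coverage_fraction:.2%}")
```

The implied tag length of a few dozen bases is what one would expect from short-read sequencing, and the low-coverage share reproduces the 0.07% in the quotation.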

To establish a cutoff for artefactual errors due to PCR and/or sequencing, a control comparison with amplified nuclear DNA was performed, which yielded an average of 0.058% (SD 0.057%) mutations per base and a maximum of 0.82% mutations. He and colleagues used a “very conservative assumption that all variants in excess of twice this value (1.6%) represented true heteroplasmies rather than sequencing artefacts.” Now to some results! The researchers detected “28 homoplasmic alleles and 8 heteroplasmic alleles in this sample of normal colonic mucosa.” Here “homoplasmic” refers to differences from the reference human mtDNA sequence (NCBI entry NC_012920). All of the homoplasmic alleles were previously found in normal individuals, so we can set these aside as representing normal variation among human individuals.
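The cutoff logic described above can be sketched in a few lines. This is an illustration only, not the authors' pipeline: the lower cutoff (twice the maximum control error) comes from the text, while the upper bound used to call a site homoplasmic is an assumption I have added for symmetry, and the function name is hypothetical.

```python
MAX_CONTROL_ERROR = 0.0082                    # maximum error in nuclear DNA control (0.82%)
HETEROPLASMY_CUTOFF = 2 * MAX_CONTROL_ERROR   # about 1.6%, the cutoff quoted in the text

def classify_site(variant_fraction, cutoff=HETEROPLASMY_CUTOFF):
    """Classify a site by the fraction of reads carrying a non-reference base.

    The upper bound (1 - cutoff) is an assumption of this sketch,
    not a threshold stated in the paper.
    """
    if variant_fraction < cutoff:
        return "reference (variant reads treated as artefact)"
    if variant_fraction > 1 - cutoff:
        return "homoplasmic"
    return "heteroplasmic"

# Made-up fractions chosen to fall on either side of the cutoffs.
for fraction in (0.005, 0.074, 0.909, 0.999):
    print(f"{fraction:.1%}: {classify_site(fraction)}")
```

The point of the conservative cutoff is that a site is only called heteroplasmic when the minority base is well clear of anything the control suggests PCR or sequencing error could produce.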

The researchers extended this analysis to other tissues from the same individual; all tissues yielded heteroplasmic mtDNA, and the proportion of individual variants differed strongly among tissues, e.g., the frequency of the most common variant ranged from 7.4% in skeletal muscle to 90.9% in kidney. Surprisingly, 75% of the heteroplasmic variants are already reported in human databases, suggesting a limited pool of variation and/or strong purifying selection. Further evidence for restricted variation is that 67% of heteroplasmic variants were in regions that code for neither protein nor RNA, presumably the control region, which represents less than 10% of the mitochondrial genome.

What is the origin of heteroplasmic mtDNA? Using samples from one kindred, the researchers found identical variants in a mother and her two children (and not in the father), demonstrating that, at least in this case, the heteroplasmic variants were inherited from the mother. The authors go on to analyze mitochondrial heteroplasmy in cancerous tissue, which is interesting but which I will not discuss here.

In terms of species-level identification, the findings add confidence to the established approach using COI mtDNA for animals. This high-resolution study demonstrates that mitochondrial variation within human individuals is a smaller scale version of the variation already known to exist among individuals. As in standard mitochondrial genetics, most of the heteroplasmic variants are maternally inherited. On the other hand, when identification of individuals is important, as in human forensics, mitochondrial heteroplasmy may need to be taken into account, at least on the negative side when apparent mismatches are found. The authors conclude by suggesting “caution in excluding identity on the basis of a single or small number of mismatched alleles when the tissue in evidence (such as sperm) is not the same as the reference tissue of the suspect (such as blood or hair).” Looking ahead, for those interested in exploring mitochondrial heteroplasmy in other species, the DNA barcoding initiative has created a large database of intra-specific variation in diverse species, an essential benchmark for investigating possible within-individual variation.

Note added 22 March 2010: As a thought experiment we can ask: how much of the within-species variability in COI might be due to unrecognized mitochondrial heteroplasmy? In the present study, the average number of heteroplasmic variant sites in one tissue sample was about 5, and, on average, 33% of such sites were in protein- or RNA-coding regions, which represent about 90% of the mitochondrial genome. That gives (5 x 0.33) sites distributed across (16,569 x 0.90) nucleotides in the mitochondrial genome, which works out to about 0.0001 variants per site. For the 650 bp COI barcode region, that corresponds to an average of 0.07 heteroplasmic sites per barcode sequence, or 0.01% variation. So for humans at least, mitochondrial heteroplasmy appears unlikely to contribute significantly to the observed intra-specific variation in COI.
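For anyone who wants to rerun the thought experiment with different inputs, the arithmetic can be written out as below. The numbers are the rounded figures from the note above; the variable names are simply labels chosen for this sketch.

```python
sites_per_sample = 5               # average heteroplasmic sites per tissue sample
coding_share_of_sites = 0.33       # share of those sites in protein- or RNA-coding regions
genome_bp = 16_569                 # human mitochondrial genome length
coding_share_of_genome = 0.90      # share of genome that is protein- or RNA-coding
barcode_bp = 650                   # length of the COI barcode region

# Heteroplasmic variants per coding nucleotide.
variants_per_site = (sites_per_sample * coding_share_of_sites) / (
    genome_bp * coding_share_of_genome
)

# Expected heteroplasmic sites within one barcode-length stretch.
sites_per_barcode = variants_per_site * barcode_bp

print(f"Variants per coding site: {variants_per_site:.4f}")
print(f"Expected sites per barcode: {sites_per_barcode:.2f}")
print(f"Variation per barcode: {variants_per_site:.2%}")
```

This treats heteroplasmic sites as spread evenly across coding regions, which is a simplification, but it is enough to show that the expected contribution per barcode is well under one site.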