The Barcode Blog

A mostly scientific blog about short DNA sequences for species identification and discovery. I encourage your commentary. -- Mark Stoeckle

Vast microbial genetic diversity found in oceans, stimulating new informatics tools

The biological universe is much larger and more diverse than we thought. In three papers in the March 2007 issue of PLoS Biology, scientists report on a genetic survey of microbial diversity in the world’s oceans. A large collaboration, the Global Ocean Sampling (GOS) expedition, led by Craig Venter, analyzed microbial DNA collected by filtering seawater at 250 sites along a several-thousand-kilometer transect from the North Atlantic, through the Panama Canal, around the Galapagos Islands, ending in the Cocos Islands of the South Pacific. The resulting DNA dataset consisted of 6.3 billion base pairs (twice the size of the human genome), with 85% of the assembled and 57% of the unassembled data unique at a 98% identity cutoff. This extreme diversity prevented assembly of complete genomes, because so many reads had no overlapping partner. A comprehensive dataset of GOS sequences combined with pre-existing databases reveals 6.12 million proteins, nearly doubling the number of known proteins. Some families of microbial proteins discovered in this study, particularly protein kinases, were previously thought to be restricted to eukaryotic organisms. Over 1700 sequence clusters show no identity to known families, implying we are far from knowing the full range of what proteins can do.
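For readers wondering what "unique at a 98% identity cutoff" means in practice, here is a minimal sketch in Python. It is not the GOS pipeline, which used its own alignment and assembly software; the function names `percent_identity` and `is_unique` are made up for illustration, and the sketch assumes the sequences are already aligned to equal length with no gaps.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions that match between two pre-aligned,
    equal-length sequences (assumption: no gap handling)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def is_unique(read: str, references: list[str], cutoff: float = 98.0) -> bool:
    """A read counts as unique here if no reference reaches the cutoff.
    (Hypothetical helper; the 98% default mirrors the figure quoted above.)"""
    return all(percent_identity(read, ref) < cutoff for ref in references)


read = "ACGT" * 25               # a 100-base toy read
one_off = "T" + read[1:]         # differs at 1 position
three_off = "TTT" + read[3:]     # differs at 3 positions

print(percent_identity(read, one_off))
print(is_unique(read, [one_off]))
print(is_unique(read, [three_off]))
```

On a 100-base read, one mismatch still clears the 98% cutoff, so the read would not count as unique, while three mismatches drop it below the cutoff. A strict cutoff like this treats even closely related sequences as distinct, which is part of why so much of the data stayed unassembled.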

How to make sense of all this data? First, more data is needed, namely more complete genomes into which the unassembled fragments can be placed. Second, new analytic tools are needed. A new genomics and informatics group based at the California Institute for Telecommunications and Information Technology in San Diego has built a metagenomics version of GenBank, the Community Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analysis (try saying that 3 times quickly!), which is fortunately known by the acronym CAMERA.

Just as Google and other search engines solved a problem of information overload that did not exist a few years ago, I am confident that CAMERA and other new informatics tools will enable us to view the expanding universe of environmental genomics, including DNA barcode libraries, in ways that will provide new understanding.

This entry was posted on Tuesday, April 3rd, 2007 at 11:51 pm and is filed under General. You can follow any responses to this entry through the RSS 2.0 feed. Both comments and pings are currently closed.

One Response to “Vast microbial genetic diversity found in oceans, stimulating new informatics tools”

  1. PLoS and GOS » Blogging Biodiversity Says:

    […] to the Barcode of Life blog for bringing this one to my attention. The March 2007 edition of the Public Library of Science […]

Contact: mark.stoeckle@rockefeller.edu

About this site

This web site is an outgrowth of the Taxonomy, DNA, and Barcode of Life meeting held at Banbury Center, Cold Spring Harbor Laboratory, September 9-12, 2003. It is designed and managed by Mark Stoeckle, Perrin Meyer, and Jason Yung at the Program for the Human Environment (PHE) at The Rockefeller University.

About the Program for the Human Environment

The involvement of the Program for the Human Environment in DNA barcoding dates to Jesse Ausubel's attendance in February 2002 at a conference in Nova Scotia organized by the Canadian Center for Marine Biodiversity. At the conference, Paul Hebert presented for the first time his concept of large-scale DNA barcoding for species identification. Impressed by the potential for this technology to address difficult challenges in the Census of Marine Life, Jesse agreed with Paul on encouraging a conference to explore the contribution taxonomy and DNA could make to the Census as well as other large-scale terrestrial efforts. In his capacity as a Program Director of the Sloan Foundation, Jesse turned to the Banbury Conference Center of Cold Spring Harbor Laboratory, whose leader Jan Witkowski prepared a strong proposal to explore both the scientific reliability of barcoding and the processes that might bring it to broad application. Concurrently, PHE researcher Mark Stoeckle began to work with the Hebert lab on analytic studies of barcoding in birds. Our involvement in barcoding now takes three forms: assisting the organizational development of the Consortium for the Barcode of Life (CBOL) and the Barcode of Life Initiative; contributing to the scientific development of the field, especially through studies in birds; and contributing to public understanding of the science and technology of barcoding and its applications through improved visualization techniques and the preparation of brochures and other broadly accessible materials, including this website. While the Sloan Foundation continues to support CBOL through a grant to the Smithsonian Institution, it does not provide financial support for barcoding research itself or support to the PHE for its research in this field.