Taxonomy, DNA, and the Barcode of Life: Comments

Email comments to MarkStoeckle@nyc.rr.com.

Public access to tools

Charles Godfray, Sandy Knapp and Ian Gauld are co-editing a volume of the Philosophical Transactions of the Royal Society on "Taxonomy for the 21st Century". They have invited a commentary on the major challenges facing the subject and their potential remedies, based on my experiences lobbying for the field over the years. This is that. Any and all reactions and comments are very, very welcome.

Dan Janzen

26 September 2003

*******************************************************************************************

The space ship lands. He steps out. He points it around. It says "friendly - unfriendly - edible - poisonous - dangerous - living - inanimate". On the next sweep it says "Quercus oleoides - Homo sapiens - Spondias mombin - Solanum nigrum - Crotalus durissus - Morpho peleides - serpentine". This has been in my head since reading science fiction in 9th grade a half a century ago. I am sure it was in the heads of Linnaeus, Alexander the Great, and Timid the Mastodont Stomper. And it has been on the wish list of every other human confronted with the bewildering blizzard of wild biodiversity on the edge, middle and center of society.

Imagine a world where every child's backpack, every farmer's pocket, every doctor's office and every biologist's belt has a gadget the size of a cell phone. For free. Pop off a leg, pluck a tuft of hair, pinch a piece of leaf, swat a mosquito, and stick it in on a tuft of toilet tissue. One minute later the screen says Periplaneta americana, Canis familiaris, Quercus virginiana, or West Nile Virus in Culex pipiens. A chip the size of your thumbnail could carry 30 million species-specific gene sequences and brief collaterals. Push the collateral information button once, the screen offers basic natural history and images for that species - or species complex - for your point on the globe. Push it twice, and you are in dialogue with central for more complex queries. Or, the gadget, through your cell-phone uplink, says "this DNA sequence not previously recorded for your zone, do you wish to provide collateral information in return for 100 identification credits?". Imagine what maps of biodiversity would look like if they could be generated from the sequence identification requests of millions of users.
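The lookup step imagined above can be sketched in a few lines. This is only an illustration under loud assumptions: the three "barcode" strings are invented toy sequences rather than real gene data, the names REFERENCE_LIBRARY, p_distance and identify are hypothetical, and the 0.1 cutoff is arbitrary. It compares a query sequence with each library entry by the proportion of differing positions and either names the closest species or reports that nothing close enough is on record.

```python
# Toy reference library: species name -> short made-up "barcode" string.
# These sequences are invented for illustration; they are not real gene data.
REFERENCE_LIBRARY = {
    "Periplaneta americana": "ACGTACGTACGTACGTACGT",
    "Canis familiaris": "ACGTTCGAACGTTCGAACGA",
    "Quercus virginiana": "TTGTACCTACGAACGTAGGT",
}


def p_distance(a, b):
    """Proportion of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def identify(query, library=REFERENCE_LIBRARY, threshold=0.1):
    """Return (species, distance) for the closest library entry.

    If even the closest entry is farther than `threshold` (an arbitrary,
    assumed cutoff), return (None, distance): the "not previously
    recorded" case.
    """
    best_species, best_distance = min(
        ((name, p_distance(query, seq)) for name, seq in library.items()),
        key=lambda pair: pair[1],
    )
    if best_distance <= threshold:
        return best_species, best_distance
    return None, best_distance


if __name__ == "__main__":
    # One position differs from the Periplaneta toy sequence.
    print(identify("ACGTACGTACGTACGTACGA"))
    # Far from everything in the toy library.
    print(identify("GGGGGGGGGGGGGGGGGGGG"))
```

A real library would hold aligned sequences of a chosen gene region and would need a far more careful matching rule; the point here is only the shape of the query: sequence in, name (or "not recorded") out.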

Such a gadget would allow access to true bioliteracy for all humanity. Such a gadget would be to biodiversity as the printing press was to literacy (and reading glasses, chairs, newspapers, the Library of Congress, and the computer). The blessing of information access through such a gadget is what the taxasphere - the collective intellectual might of taxonomists, museums, collections and their centuries of literature - has within its power to offer society, global society, everyone. Will it? If it does not, wild biodiversity will continue its inexorable decline into the pit under the human heel, and the taxasphere will continue its accelerating slide into the realm of quaint esoterica shared among a very few enthusiasts who love their bugs, ferns and birds.

The gadget requires two things and a third. Thing one is the economic and social selective pressure to engineeringly collapse what today occupies two tabletops of machinery and a technician down to the size of a cell phone, reusable and cheap. This collapse is technically feasible in any one of many industrial centers of the world. $1 million and five bright people. It has not happened in the past decade because no one saw any particular reason to do it. But there is a reason. Real bioliteracy requires on-site real-time hand-held cheap identification of hundreds of thousands of species, even though any one person at any one time may care about only one organism in one place.


Thing two is the global library of partial DNA sequences of a few cleverly selected target genes that among them carry species-specific combinations of nucleotides. Such a library is constructible in two phases running concurrently. The world's great biodiversity collections - museums, herbaria and microbe depositories - have on the shelf, somewhat in order, easily half (if not more) of the species of wild biodiversity encountered daily and consciously by 99% of the world's people. One phase is to quite straightforwardly organize and fund SWAT teams to simultaneously polish the taxonomic organization and extract the DNA samples for target gene species-level sequencing for these shelves of things. The other phase is to simultaneously reinforce the on-going biodiversity inventory of the world, and its taxonomic processing, so as to be able to sequence and characterize the as-yet uncollected wild biodiversity.

Both the in-house taxonomic processing and outdoors inventory must be congruent with the agendas of the taxon- and site-focused primary users, so that as the sequence libraries emerge from the great collections, these same collections are also receiving and taxonomically processing the stream of new material (much of which may be sequenced as collected). The cross-phase potential for mutual and iterative reinforcement between the taxasphere and building the sequence library - and populating its collaterals - is enormous. I cannot overemphasize the necessity for collaterals. A phone number does you no good if there is nothing at the other end.

Thing three is the commercial-entrepreneurial process such that each of the millions of times that the gadget processes a sequence, a penny drops into a bucket that fuels the taxasphere to do what it does best and with such joy, and fuels the conservation community to actually conserve that which is being sequenced in its wild homes. Such a feedback system is imperative to saving the present and future biodiversity Library of Congress, so to speak. The goal is not to support yet another guild of biodiversity administration and consultancies, but rather to deliver bioliteracy to the world. And once people can read, ensure that there still be books to be read.

Thing three is obviously the most difficult, given that Homo sapiens is notorious for not reinvesting its gains, ill-gotten or otherwise, in the raw material source of those gains. Yes, startup capital will be required, but rather than get this from classical venture capitalism, this is a time for the world philanthropic capitalists to focus their energy. And will the gadget user pay a penny an identification once the system is in place? Yes, if "one minute one sequence one name" - and serious amounts of collaterals are available. Will the users feed new collaterals back into the information source to accompany their old or new sequence submitted? Yes, if they get identification credits, and as they see the value of retrieving their own submissions years down the road, to say nothing of the value of examining each other's submissions in real time and across geography.

The blending of these three things within their software glue and matrix is easily attainable in less than a decade with the technological and sociological understanding in hand, for a total budget in the $1-5 billion range. The process can be put in motion as a proof-of-concept for a tiny fraction of this.

The viewpoint in this commentary was inspired by reading Paul Hebert's enthusiasm and foresight in targeting just a part of the DNA sequence of a single gene as a species' "barcode" (Hebert et al. 2003a,b), by recent planning efforts (Stoeckle 2003), and by witnessing the clarity with which a portion of the CO1 gene sequence can discriminate among a large number of species of butterflies and moths, bees, birds and mammals. Simultaneously it has its roots in decades of attempting to process millions of neotropical insects and plants through biodiversity inventory for a multiplicity of agendas.

And I am frustrated by working for a half century in the field, nurtured and guided at long distance by the world's best taxonomists, among hundreds of thousands of species of organisms, most of which are actually known to science yet can be identified in the field at best by only a very few persons. Neither I nor the other millions of wild biodiversity users can carry in their pocket the tens of thousands of pages of taxonomic descriptions, keys and images, and their authors. And even if all were to be collapsed down into a single chip, I still could not connect the beast in hand to its information as I stumble through the mud, rain and green of a 200,000-species-rich patch of Costa Rican rainforest. No one can learn the scientific language to read and hear the taxasphere's collective wisdom and facts for identification at the moment the bug is in hand or the leaf in mouth - even if we have the best access uplink to Google. And if each of us makes the long trek, which each won't, to the doors of any one of the great collections, in a matter of seconds the total taxasphere will be overwhelmed. And the gadget has huge potential for relieving the taxasphere of the drudgery of routine identifications at those places where even today's bioilliterate populace already knows that it needs to know what it is - the farmer's field, ports of entry, doctor's office, environmental monitoring, the kitchen, school science class, etc. Imagine what happens if the environmental monitor can know in a few minutes on-site the hundreds of species of insects, mites, fungi and protista in an environmental sample.


The answer does not lie in better keys, more keys, more images on the web, more web sites, species pages, more descriptions, more phylogenies, more specimens, more anythings. Those are necessary collaterals, but not sufficient. The answer lies in a process that will for the first time connect the collective species-level biodiversity knowledge of the world to any and all users, on the spot, in real time, now. Fast, cheap and on-site single (or very few) gene sequencing has the potential to deliver the species-specific linkage between the species and its human-known collaterals. There is a huge opportunity for the taxasphere to thrust itself into a position of friendly social prominence - just as have education, agriculture, medicine and communication.

We must move wild biodiversity from the category of something to be removed to make room for the extended human genome to a book to be read, and read, and read. To an illiterate people, a library is just neatly stacked firewood. We must move the taxasphere from a "woe is us" mode to "here is what we can offer at society's negotiating table". It is within the technological power of the taxasphere to choose to move into a mutualism with directed molecular biology, miniaturizing engineering and entrepreneurialism. Praise and support taxonomists to be taxonomists and let us all be able to read wild biodiversity. The time is ripe for a barcorder. Godfray (2002) noted "in 10 to 20 years' time it will be simpler to take an individual organism and get enough sequence data to assign it to a 'sequence cluster' (equivalent to species) than to key it down using traditional methods". We do not have to wait 1-2 decades. Please do it now.

Literature cited.

Godfray, H.C.J. 2002. Challenges for taxonomy. Nature 417:17-19.

Hebert, P. D. N., Cywinska, A., Ball, S. L. & deWaard, J. R. 2003a. Biological identifications through DNA barcodes. Proc. R. Soc. Lond. B 270: 313-322.

Hebert, P. D. N., Ratnasingham, S. & deWaard, J. R. 2003b. Barcoding animal life: cytochrome c oxidase subunit 1 divergences among closely related species. Proc. R. Soc. Lond. B 270: supplement O3BL0066.1-4.

Stoeckle, M. 2003. Taxonomy, DNA, and the bar code of life. BioScience 53:796-797.

Posted at 03 Sep 03 12:33 PM
This post is archived at http://phe.rockefeller.edu/BarcodeConference/archives/2003_09.html#000235

Barcoding turtles

On August 08, 2003, Gisella Caccone [adalgisa.caccone@yale.edu] writes:

I started talking with Brad Shaffer about gathering information and support material for a pilot project on turtles. He is quite enthusiastic and agrees with me that they would be an ideal group for our purposes. Moreover, he has in his freezers 2/3 of the recognized taxa and all the connections we will need to accomplish the project in a short time. On top of this, he has already given a lot of thought to this type of project because he recently prepared a PBI grant proposal to arrive at a complete inventory of all turtles on earth using DNA, fossils, and morphology (unfortunately, his project was not among the 4 selected, but it was pretty close; one of the main reasons for the rejection was that the group was too small, which is one of the strengths in this case!). For the PBI proposal he and his co-PI put together a great worldwide group of scientists and field people, ready to help with expertise ranging from morphology, ecology, and collecting to DNA analyses (I was part of it, which is why I thought of turtles as a possible pilot group).

Posted at 13 Aug 03 04:03 PM
This post is archived at http://phe.rockefeller.edu/BarcodeConference/archives/2003_08.html#000234

Barcoding vertebrates

On August 8, 2003, Gisella Caccone [adalgisa.caccone@yale.edu] writes:

I have some thoughts about possible directions for the meeting that I would like to share.

I am thinking that we should arrive at the meeting with some practical plan on how to test whether barcoding animals using an mtDNA fragment and museum collections is feasible. I think that the idea of Museum consortia is great. The way I would go is to develop some "Museum/other institutions consortia" for a variety of groups of organisms. I am starting to hatch an embryonic idea that I would like to develop together with Jim Hanken and the other "Vertebrate-oriented" participants for a vertebrate group. The idea is as follows: we would pick 4 or 5 small groups that are well represented in museum collections, have a good representation in frozen collections (I will explain why below), and are different enough to provide a picture of the general feasibility of the project in a larger group.

For instance, for Vertebrates one could propose to do turtles (only about 300 species recognized, well represented in museum collections, and with a good frozen collection); for amphibians we could propose salamanders; marsupials could be another relatively small group that one can tackle and that also has some public appeal; and then we would choose a relatively small group of fish (sharks?) and birds. Other Consortia could include "Insects and Arthropods", "Mollusks and worms", etc.

The tasks of these consortia will be to answer three main questions (in a relatively short time - 1 year?) that will determine the overall feasibility of the project from a practical point of view: 1- Can we extract DNA from museum-preserved samples (skeletons, skins, formalin/ethanol-preserved tissues, dry-mounted specimens) that is of good enough quality to allow the amplification of 500-600 bp of mitochondrial and nuclear DNA? 2- Are the markers we chose good enough to discriminate among the recognized taxa in that group? 3- Is COI enough, or do we need at least another fragment, possibly nuclear? The samples from the museum collections will allow us to answer question 1; the availability of a frozen tissue collection for that group will allow us to address questions 2 and 3, even if the museum collections do not yield material of good quality.
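Question 2 can be made concrete with a small sketch. It is not a method proposed in this discussion, only one simple criterion assumed for illustration: a marker "discriminates" within a group if every pair of specimens from different species differs by more than any pair of specimens from the same species. The function names, the species labels "A" and "B", and the toy sequences are all hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def discriminates(specimens):
    """Check one simple, assumed criterion for species discrimination.

    `specimens` is a list of (species_label, sequence) pairs. Returns True
    when the smallest between-species distance is larger than the largest
    within-species distance.
    """
    within, between = [], []
    for (sp1, seq1), (sp2, seq2) in combinations(specimens, 2):
        target = within if sp1 == sp2 else between
        target.append(p_distance(seq1, seq2))
    if not between:
        raise ValueError("need specimens from at least two species")
    # With a single specimen per species there are no within-species pairs,
    # so the within-species maximum defaults to 0.
    return min(between) > max(within, default=0.0)


if __name__ == "__main__":
    # Invented toy data: two well-separated "species".
    clear = [
        ("A", "AAAAAAAAAA"), ("A", "AAAAAAAAAT"),
        ("B", "CCCCCCCCCC"), ("B", "CCCCCCCCCG"),
    ]
    # Invented toy data: variation within "A" exceeds its distance to "B".
    muddled = [
        ("A", "AAAAAAAAAA"), ("A", "AAAAATTTTT"),
        ("B", "AAAAAAAAAT"),
    ]
    print(discriminates(clear), discriminates(muddled))
```

Running the same check on a second marker, or on two markers' sequences joined together, is one way the answer to question 3 could be explored on the frozen-tissue samples.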

I think it would be helpful if all of us could give some thought to this idea before we meet again. Think about possible candidate small groups that fit the requirements I described above (small, well known, with good traditional and frozen collections, diverse, and also interesting to a broader public - it never hurts for funding purposes!!) and other markers that we might want to test together with COI.

Mark Stoeckle's response:
>Dear Gisella,
>
>These are very helpful thoughts. This is a good strategy: starting with
>smallish groups for which there is both frozen tissue and routinely
>preserved specimens. As you say, it seems likely to answer the 2 basic
>questions: in each group, does COI discriminate species and can DNA be
>isolated from museum specimens. Also, having a number of researchers
>analyze their own group will help build confidence in the scientific
>benefits and technical methods of barcoding.
>
>Mark
Posted at 13 Aug 03 03:54 PM
This post is archived at http://phe.rockefeller.edu/BarcodeConference/archives/2003_08.html#000233
