How many species are there? One widely cited estimate, now 24 years old, is 1.7 million named species (EO Wilson 1985 Science 230:1227). This estimate is updated in detail in a September 2009 publication from the Australian Government, “Numbers of Living Species in Australia and the World, 2nd edition” by Arthur Chapman (an illustrated, open-access report that can be read online or downloaded as a pdf). According to Chapman’s analysis, there are 1.9 million published species in the world. Approximately 18,000 new species are described each year, 75% of which are invertebrates, 11% vascular plants, and 7% vertebrates. Chapman estimates the true number of world species is about 11 million. The largest uncertainties, where fewer than 10% of species are estimated to have been named, are for fungi, single-celled eukaryotes (protoctista, cyanophyta, chromista), and “prokaryotes”, i.e. eubacteria and archaea.

This overview brings to mind pictures of the distribution of matter and dark matter in the universe. On a large scale, is the “density” of species uniform? For example, given there are about 10,000 bird and about 40,000 fish species, do fish take up 4x as much diversity space? We know on a small scale there are some “high-density” groups of closely related species, like cichlid fishes in Africa, but can we map the distribution of diversity on a larger scale? Large databases of homologous sequences representing diverse species (aka DNA barcodes; as of today, BOLD has over 700,000 records representing over 64,000 species) and new mathematical approaches to calculating diversity from nucleotide sequences (eg Sirovich 2009 PLoS ONE; I am co-author) may help provide a biological macroscope (Ausubel PNAS 2009) for understanding the genetic structure of biodiversity, complementary to the historical view expressed in the Tree of Life.

Marine zooplankton comprise an enormous mass of diverse organisms distributed throughout the world’s oceans from deep waters to the surface. Zooplankton include representatives of at least a dozen phyla, some of which are larval forms of much larger animals, and their diversity and tiny size make identification a challenge. In
Machida and colleagues found evidence for 189 species, only 10 of which could be confidently matched to reference sequences. This report demonstrates that this sort of “kitchen blender” approach, which has previously been applied largely to bacterial and archaeal communities, shows promise for assemblages of eukaryotes, and it reveals that surprisingly few organisms have reference sequences in databases. Identified organisms included several copepods as well as presumably larval forms of 
The horse-chestnut leaf miner moth Cameraria ohridella (link to
Tardigrades, commonly called water bears, are tiny (0.1-1.5 mm) water-dwelling invertebrates found in diverse environments. About 1000 species are known. Morphologic identification is difficult and may be limited to certain life stages–some species can be identified only from eggs, for example. Tardigrades can transform into a dormant state with remarkable ability to withstand extreme drying, cold, and radiation for prolonged periods, making them of interest for persons studying biology of tissue repair, aging and other fields.
“DNA barcoding involves sequencing a standard region of DNA as a tool for species identification. However, there has been no agreement on which region(s) should be used for barcoding land plants. To provide a community recommendation on a standard plant barcode, we have compared the performance of 7 leading candidate plastid DNA regions (atpF–atpH spacer, matK gene, rbcL gene, rpoB gene, rpoC1 gene, psbK–psbI spacer, and trnH–psbA spacer). Based on assessments of recoverability, sequence quality, and levels of species discrimination, we recommend the 2-locus combination of rbcL and matK as the plant barcode. This core 2-locus barcode will provide a universal framework for the routine use of DNA sequence data to identify specimens and contribute toward the discovery of overlooked species of land plants.”
Back to the study: Steinke and colleagues found distinct barcodes among 384/391 (98.2%) species; 9 species displayed 2 or 3 distinct clusters, most of which were allopatric. Review of these potential “splits” revealed possible inappropriate synonymization in several cases. On the other side, 2 pairs and 1 triplet of species were not distinguished by DNA barcodes using distance. I looked more closely at one of these examples, the butterfly fishes Chaetodon multicinctus and C. punctatofasciatus, to see if there might be diagnostic characters whose signal is swamped by intraspecific variation. As shown in the figure, there are 2 possibly diagnostic differences between the species in this pair. Of course, this sort of analysis only works for known species, but I wonder how many other species pairs/sets with “overlapping” barcodes have diagnostic differences.
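For readers who want to try this kind of check themselves, here is a minimal sketch in Python of one way to look for diagnostic characters: scan an alignment column by column and report positions where the two species share no base at all, even though each species varies elsewhere. This is an illustration under stated assumptions, not the method used in the study. The function name `diagnostic_positions` is hypothetical, and the sequences are made-up toy fragments, not real Chaetodon barcodes.

```python
def diagnostic_positions(group_a, group_b, ignore="N-"):
    """Return alignment columns (0-based) where the two groups share no base.

    group_a, group_b: lists of aligned sequences of equal length.
    ignore: characters (ambiguous bases, gaps) left out of the comparison.
    """
    length = len(group_a[0])
    if any(len(s) != length for s in group_a + group_b):
        raise ValueError("sequences must be aligned to the same length")

    hits = []
    for i in range(length):
        bases_a = {s[i] for s in group_a} - set(ignore)
        bases_b = {s[i] for s in group_b} - set(ignore)
        # A column is diagnostic only if both groups have data there
        # and no base is seen in both groups.
        if bases_a and bases_b and bases_a.isdisjoint(bases_b):
            hits.append((i, sorted(bases_a), sorted(bases_b)))
    return hits


# Toy, made-up aligned fragments (NOT real barcode data).
# Both "species" vary within themselves at columns 4 and 7,
# but differ consistently from each other only at column 3.
species_a = ["ACGTACGTAC", "ACGTACGAAC", "ACGTTCGTAC"]
species_b = ["ACGCACGTAC", "ACGCACGAAC", "ACGCTCGTAC"]

result = diagnostic_positions(species_a, species_b)
print(result)
```

In this toy case a distance measure sees as much variation within each group as between them, while the column scan still picks out the one consistent difference. With real data, the result is only as trustworthy as the sampling: a character that looks diagnostic across a handful of individuals may not hold up when more are sequenced.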
In