RockEDU summer students Priyam Shah and Michael Epelman, who just completed high school, teamed with mentor extraordinaire Mark Stoeckle to study the fishes of an NYC Superfund Site, Newtown Creek. Their excellent poster shows that eDNA detected a surprising diversity of fish in Newtown Creek, despite ongoing pollution and sewage overflow. The number and relative abundance of fish species differed among sites, consistent with species habitat preference and pollution tolerance. The data support eDNA as a cost-effective, non-destructive method for monitoring fish populations and assessing habitat restoration efforts in Newtown Creek and other Superfund sites.
Area of Research: DNA Barcoding
eDNA biodiversity survey of Charles River & Boston Harbor
For a 50th Harvard College Reunion Seminar on EO Wilson’s proposal to conserve half Earth, Jesse Ausubel and Mark Stoeckle, assisted by Elizabeth Munnell, conducted a survey of vertebrates in three locations in the Charles River and two in Boston Harbor. The 14 slides on The Charles River and Boston Harbor Then and Now tell a story of remarkable ecological recovery.
New software for visualizing the whole animal kingdom
Swiss bioinformatics wizard Wandrille Duchemin and PHE Guest Investigator David Thaler publish PyKleeBarcode: Enabling representation of the whole animal kingdom in information space in PLoS One. The computational advances in the paper open the way to calculating DNA-relatedness of all animal species, as the figure below for mammals suggests.

Fig 2. A. View of the structure matrix of the mammalian dataset and taxonomic structure of Mammalia. B. Phylogenetic tree structure of the taxonomic groups retrieved from NCBI taxonomy.
The paper builds on the pioneering work done earlier in the PHE by Larry Sirovich and Mark Stoeckle:
L Sirovich, MY Stoeckle, Y Zhang. A scalable method for analysis and display of DNA sequences. PLoS ONE 4 (10): e7051, 2009
L Sirovich, MY Stoeckle, Y Zhang. Structural analysis of biodiversity. PLoS ONE 5 (2): e9266, 2010
MY Stoeckle, C Coffran. TreeParser-Aided Klee Diagrams Display Taxonomic Clusters in DNA Barcode and Nuclear Gene Datasets. Nature Scientific Reports 3: 2635, 2013
Interspecific allometric scaling in eDNA production among northwestern Atlantic bony fishes reflects physiological allometric scaling
Our paper on eDNA as bioassay of Anthropocene published
The new journal based in China, The Innovation, has published the Thaler-Ausubel-Stoeckle paper on Human and domesticated animal environmental DNA as bioassays of the Anthropocene in their “Out of the Box” category, where we like to be. We also post the pdf.
We thank Song Sun and Ke Chen for editorial assistance.
Summary: Human and domesticated animal sequences, commonly detected in environmental DNA (eDNA) metabarcoding studies, are routinely excluded from analysis. Here we suggest that reporting human and domesticated animal eDNA results might open new lines of investigation. For example, the relative abundance of human and domesticated animal eDNA as compared to that of wild vertebrate species might provide an index of human impact on local biota. Such an index could be applied to sites ranging from urban harbors to remote villages, and possibly to analyze historical samples. Various potential sources of contamination complicate the picture, but it should be possible to develop procedures that minimize risk of DNA introduction during collection and processing. Our near-term recommendation is to encourage inclusion of human and domesticated animal data in eDNA publications as an incentive for discovery, to lift quality controls, and to collectively contribute to new vistas that eDNA science might open.
Human and domesticated animal environmental DNA as bioassays of the Anthropocene
12S gene metabarcoding with DNA standard quantifies marine bony fish environmental DNA, identifies threshold for reproducible detection, and overcomes distortion due to amplification of non-fish DNA
Our article on incorporating a known amount of non-fish DNA to allow better quantification of the fish DNA present in a seawater sample appears in the journal Environmental DNA.
Open Access
12S gene metabarcoding with DNA standard quantifies marine bony fish environmental DNA, identifies threshold for reproducible detection, and overcomes distortion due to amplification of non-fish DNA. Mark Y. Stoeckle, Jesse H. Ausubel, Michael Coogan, first published: 08 December 2022, https://doi.org/10.1002/edn3.376
While our paper focuses on fish, we believe the approach of “spiking” samples collected in nature with known amounts of DNA from a species that would not be present in the sample (such as ostrich) offers great promise for increasing the value of a wide range of aquatic DNA studies. The exhibit below shares some of the main points from the paper.
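To make the idea of a spiked standard concrete, here is a minimal sketch of how a known amount of added DNA can calibrate read counts. It is not the calculation from the paper: the function name, the species labels, and all numbers are hypothetical, and the sketch assumes the standard and the target species amplify with equal efficiency, which may not hold in practice.

```python
def estimate_copies(target_reads, standard_reads, standard_copies_added):
    """Scale target reads by the copies-per-read ratio seen for the spiked standard.

    Assumption (not from the paper): standard and target amplify equally well.
    """
    if standard_reads <= 0:
        raise ValueError("standard was not detected; cannot calibrate")
    copies_per_read = standard_copies_added / standard_reads
    return target_reads * copies_per_read


# Hypothetical read counts for one water sample (illustration only)
sample_reads = {
    "ostrich_standard": 2000,
    "Atlantic menhaden": 50000,
    "bay anchovy": 500,
}
standard_copies_added = 1000  # hypothetical number of standard copies spiked in

estimates = {
    species: estimate_copies(
        reads, sample_reads["ostrich_standard"], standard_copies_added
    )
    for species, reads in sample_reads.items()
    if species != "ostrich_standard"
}

for species, copies in estimates.items():
    print(f"{species}: about {copies:.0f} copies")
```

Because every species in a sample is scaled by the same standard, the estimates can be compared across samples even when the total number of reads per sample differs.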

Thaler memo on “Distinguishing contamination from authentic human eDNA”
PHE guest investigator David Thaler summarizes “Ways in which contamination might be distinguished from authentic human eDNA” in a useful draft memo. Meeting this challenge matters greatly for using human eDNA as an assay of the Anthropocene, a subject of a forthcoming paper by Thaler, Ausubel, and Stoeckle.
New code for analyzing eDNA sequences using DADA2 pipeline
For the past year Michael Coogan, now a grad student in marine science at the University of New Hampshire, has helped Mark Stoeckle and PHE with improved software for our eDNA studies. See summary below. A pdf of the R code is available here. If you have questions, please write to Mark.
The goal is to adapt the DADA2 pipeline to Mark Stoeckle’s 12S experiment. Sample sequences will be identified using a 12S reference file containing sequences of 262 unique vertebrates found around New York. The starting point is a set of Illumina-sequenced paired-end fastq files that have been split (or demultiplexed) by sample and which have barcodes/adapters already removed. The end product will be a sequence table, analogous to the ubiquitous “OTU table”, which records the number of times each sample sequence was observed in each sample. The key difference between the output of DADA2 and standard OTU analyses is that DADA2 infers sample sequences exactly rather than clustering sequences into fuzzy OTUs which hide and complicate biological variation.
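For readers unfamiliar with the term, the shape of a sequence table can be sketched in a few lines of Python. This is only an illustration of the output format, not DADA2 itself and not the R code mentioned above: it skips DADA2's error modeling entirely, and the sample names and toy sequences are hypothetical.

```python
from collections import Counter

# Hypothetical, already-merged reads per sample (toy sequences, not real 12S)
reads_by_sample = {
    "site_A": ["ACGTAC", "ACGTAC", "ACGTTC", "GGGTAC"],
    "site_B": ["ACGTTC", "ACGTTC", "GGGTAC"],
}


def build_sequence_table(reads_by_sample):
    """Count each exact sequence in each sample.

    Returns the sorted list of distinct sequences (the columns) and a dict
    mapping each sample to its row of counts in that column order.
    """
    counts = {sample: Counter(reads) for sample, reads in reads_by_sample.items()}
    sequences = sorted({seq for c in counts.values() for seq in c})
    table = {
        sample: [counts[sample][seq] for seq in sequences] for sample in counts
    }
    return sequences, table


sequences, table = build_sequence_table(reads_by_sample)

print("sample", *sequences, sep="\t")
for sample, row in table.items():
    print(sample, *row, sep="\t")
```

Note that the toy sequences ACGTAC and ACGTTC differ by a single base yet keep separate columns. A clustering approach with a similarity threshold could merge them into one OTU, which is the loss of biological variation that exact sequence inference is meant to avoid.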