Reports: AC8

44461-AC8
Testing the Effect of Taxonomic Bias on Estimating Pliocene-Recent Sea Surface Temperatures Using Planktonic Foraminifera

Norman MacLeod, The Natural History Museum

The goals of this research project were to assemble a reference collection of images of the main Pleistocene to Recent planktonic foraminiferal species occurring in the Atlantic Ocean, to use that collection to test for identification inconsistencies, and to gauge the implications of any such inconsistencies for sea surface temperature (SST) estimates. A suite of pre-picked samples was obtained from Dr. Harry Dowsett to serve as the basis for artificial neural net (ANN) training sets. From October 2006 to July 2007, specimens comprising these samples were oriented and imaged using reflected light, single-shot digital photomicrography, composite digital photomicrography, and 3D scanning. Some 3,000 images and scans from 30 species were obtained. These datasets were then submitted to the Digital Image Analysis System (DAISY), an unsupervised ANN based on the plastic self-organizing map (PSOM) combined with an n-tuple classifier. The resulting DAISY models were compared with each other for consistency and with parallel geometric morphometric analyses.
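To illustrate the n-tuple component in general terms, the following is a minimal sketch of a classic n-tuple classifier applied to small binary images. It is not DAISY code and it omits the PSOM stage entirely. The class name `NTupleClassifier`, its parameters, the group labels, and the toy 4 x 4 images are all hypothetical and were chosen only for illustration.

```python
import random


class NTupleClassifier:
    """Toy n-tuple classifier for flattened binary images (illustrative only)."""

    def __init__(self, n_pixels, n=3, n_tuples=16, seed=0):
        rng = random.Random(seed)
        # Each tuple is a fixed random selection of n pixel positions.
        self.tuples = [rng.sample(range(n_pixels), n) for _ in range(n_tuples)]
        # memory[label][k] holds the pixel patterns seen at tuple k for that label.
        self.memory = {}

    def _addresses(self, image):
        # The pattern of pixel values read at each tuple's positions.
        return [tuple(image[i] for i in positions) for positions in self.tuples]

    def train(self, image, label):
        stores = self.memory.setdefault(label, [set() for _ in self.tuples])
        for store, address in zip(stores, self._addresses(image)):
            store.add(address)

    def scores(self, image):
        # A label's score is the number of tuples whose pattern it has seen before.
        addresses = self._addresses(image)
        return {
            label: sum(addr in store for store, addr in zip(stores, addresses))
            for label, stores in self.memory.items()
        }

    def classify(self, image):
        s = self.scores(image)
        return max(s, key=s.get)


# Hypothetical 4 x 4 binary "images", flattened row by row.
left_half = [1, 1, 0, 0] * 4      # left two columns filled
top_half = [1] * 8 + [0] * 8      # top two rows filled

clf = NTupleClassifier(n_pixels=16)
clf.train(left_half, "group_A")
clf.train(top_half, "group_B")

# A copy of the first image with one pixel flipped, to mimic imaging noise.
noisy = list(left_half)
noisy[5] = 0
print(clf.scores(noisy), clf.classify(noisy))
```

Because each tuple samples only a few pixels, a small local change to an image disturbs only the tuples that include the altered pixels, which is one reason this family of classifiers tolerates modest image noise.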

Results show that, while composite images appear more detailed visually, they were unsuitable for ANN analysis because the compositing process introduces artefacts that inflate shape variance estimates. Both single-shot and 3D scan representations yielded sets of discriminant models that were over 90% accurate, as measured by crosstabulation tests, for some large image and scan datasets. However, the overall morphological consistency (estimated by the ANN analysis of the image datasets parsed into their original species groups) was much lower, with only a handful of species being separated into truly unique shape-based groups. This result may be due to (1) an inability to identify these species correctly from single-image views, (2) genuine inconsistency in the original species concepts, and/or (3) inconsistency in the application of these taxonomic concepts to real planktonic foraminiferal faunas. In order to evaluate this issue, a panel of three taxonomy specialists was given access to web-based catalogues of the training set images. Despite agreeing to the design of this part of the project during development of the proposal, these specialists felt uncomfortable making definitive identifications from the image sets. Accordingly, I am now in the process of sending all training-set specimens sequentially to an expanded panel of taxonomic specialists. Owing to the increased time necessary for specimen transfer and the fact that only a single specimen set exists, this phase of the analysis will require an additional 3-6 months to complete. However, this phase will be accomplished at no further expense to ACS. Preliminary quality-control results obtained by my own identification of the training set images suggest the panel of independent taxonomic experts will likely disagree with many of the original identifications.
This is not a reflection on the skills of the original taxonomist, but the expected result based on several previous studies of taxonomic identification consistency among panels of experts. DAISY analysis of the alternative training sets based on my identifications resulted in a dramatic improvement in the DAISY models, most of which achieved > 80 percent separation in a crosstabulation analysis. These models were also generalizable insofar as they could be used to identify other faunas close to the training set sample in time and/or space with high reliability (though aspects of this evaluation also await input from the panel of taxonomy experts). While it is premature to provide an estimate of the effect of taxonomic inconsistency on SST estimates, these results suggest the taxonomic bias will likely be considerable. In addition, the DAISY ANN approach was found to be marginally less accurate than linear discriminant analysis (LDA) of shape coordinates for achieving automated species recognition, but was far more flexible, much faster, and more robust to sample dependencies.
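For readers unfamiliar with crosstabulation tests, the percentages quoted above can be read as the share of specimens that a model assigns to the group under which they were originally identified. The sketch below shows one common way to compute such figures; it is an illustration only, not the procedure used in DAISY, and the labels `sp1` to `sp3`, the assignments, and the function names are invented for the example.

```python
from collections import Counter


def crosstab(true_labels, predicted_labels):
    """Count each (original identification, model assignment) pair."""
    return Counter(zip(true_labels, predicted_labels))


def overall_accuracy(table):
    """Fraction of specimens assigned to their original group."""
    correct = sum(n for (true, pred), n in table.items() if true == pred)
    return correct / sum(table.values())


def per_group_accuracy(table):
    """Fraction of each original group that the model recovered."""
    totals, hits = Counter(), Counter()
    for (true, pred), n in table.items():
        totals[true] += n
        if true == pred:
            hits[true] += n
    return {group: hits[group] / totals[group] for group in totals}


# Invented example: ten specimens in three hypothetical species groups.
original = ["sp1"] * 4 + ["sp2"] * 4 + ["sp3"] * 2
assigned = ["sp1", "sp1", "sp1", "sp2",
            "sp2", "sp2", "sp2", "sp2",
            "sp3", "sp1"]

table = crosstab(original, assigned)
print(overall_accuracy(table))    # 8 of 10 specimens match: 0.8
print(per_group_accuracy(table))  # sp1: 0.75, sp2: 1.0, sp3: 0.5
```

Reporting the per-group figures alongside the overall value matters here, since a high overall score can conceal a small group that is separated poorly.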

A presentation detailing results obtained to date was made at the 4th International Zooplankton Production (IZP) Symposium (Hiroshima, 2007). Another is scheduled for the Geological Society of America 2007 Convention. A technical paper based on the IZP presentation is in preparation for the journal Marine Micropaleontology. At least two additional publications are planned, but must await input from the taxonomy experts. The impact of these papers is likely to be considerable, as paleontologists (and taxonomists in general) have not previously acknowledged the problem of identification consistency in taxonomic data and are largely unfamiliar with technologies that would allow them to address it. This has resulted in an over-optimistic sense of the quality of their basic data.

Ms Kalina Davis benefited greatly from being associated with this project and has now entered a PhD program in the University College London Biology Department, where she intends to use the skills and knowledge gained in her dissertation research. Dr. Jon Krieger has also gained important skills and experience and will work with me to complete and extend the overall investigation. This project has enabled the first medium-scale trial of the DAISY software on an outstanding dataset, generated important new results that will have implications throughout systematics, and demonstrated the potential of this approach to morphological investigation.
