The Darwin Core

Author: Dave Vieglais
Revision: 1.5
Date: 2003-08-21

Overview

Natural history collections and observation data sets both represent sets of observations, with each record detailing the observation of an organism, ideally at a specific geo-temporal location. In the case of collections, the observation is permanent in that the organism was collected from the field and preserved in a curated collection intended to last indefinitely. Collected specimens can be prepared in various ways, and several preparations from a single organism are not unusual (skin, skeleton, and perhaps microscope slides). There may therefore be several records for a single organism, each representing the organism prepared using a different technique, but all referring to a single observation event. Conversely, some collection records represent a collection object that contains many organisms. In ichthyology, for example, the contents of a trawl may be sorted by taxon and lumped into a single collection container. Observation data sets also catalog the observation of an organism at a specific geo-temporal location, but in this case the organism is not collected, and hence the observation record is the only information recorded about the organism. In both cases a taxonomic identification of the organism is attempted, with obvious consequences for the accuracy of identification: a specimen can be made available to several experts, whereas a field observation may rest on a fleeting glimpse of the organism.
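The record relationships described above (several preparation records sharing one observation event, a single lot record covering many organisms, and an observation-only record) can be sketched as a small data model. This is only an illustrative sketch: the class name `Record`, its field names, the helper `records_by_event`, and the sample values are all hypothetical and are not Darwin Core element names or real catalog data.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    """One catalogued record. All field names are illustrative assumptions."""
    record_id: str
    event_id: str                      # the observation event this record refers to
    taxon: str                         # taxonomic identification of the organism
    preparation: Optional[str] = None  # None for an observation-only record
    individual_count: int = 1          # greater than 1 for a lot of many organisms


def records_by_event(records):
    """Group records by the observation event they refer to."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.event_id].append(record)
    return dict(grouped)


# Made-up sample records, for illustration only.
records = [
    # Two preparations of the same organism: two records, one event.
    Record("M-1001", "E1", "Peromyscus maniculatus", preparation="skin"),
    Record("M-1002", "E1", "Peromyscus maniculatus", preparation="skeleton"),
    # A lot: one record standing for many organisms sorted from a trawl.
    Record("F-2001", "E2", "Gadus morhua", preparation="alcohol", individual_count=37),
    # A field observation: nothing collected, so no preparation.
    Record("O-3001", "E3", "Buteo jamaicensis"),
]

grouped = records_by_event(records)
```

Grouping by `event_id` shows why a count of records is not a count of organisms: event `E1` has two records for one organism, while event `E2` has one record for many.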


Each specimen in a collection is irreplaceable, and many provide insight into the previous geographic distribution of taxa and how it may have changed over time. Every specimen is identified by a tag, which may be handwritten. Specimens may be collected in the field under somewhat extreme conditions, sometimes with only vague descriptions of the location from which the specimen was collected. Some natural history collections are quite old, reaching back some 300 years. Many collections are only partially computerized. For those collections that are wholly or partially computerized, there are no standards for the database content, schema, structure or type. Nevertheless, there is a commonality in the content of almost all collection and observation databases which may be exploited to perform ordered search and retrieval from these diverse data sets. The Darwin Core attempts to provide a set of guidelines for addressing this commonality regardless of the underlying mechanism for storing the record content.

The Darwin Core profile provides a list of suggested access points and recommendations for their usage for searching natural history specimen and observation databases. It provides suggestions for stringifying queries such that they are protocol independent. It also provides guidance as to the content, structure and format of records retrieved from an information server supporting the Darwin Core.

Darwin Core Versions

Following the original, there are now several versions of the Darwin Core (DwC) in use. This section attempts to provide a field guide to the various DwCs that have evolved. Figure 1 provides an overview of the phylogeny of the Darwin Core.

http://tsadev.speciesanalyst.net/graphviz/dot.php?dot=http://speciesanalyst.net/docs/dwc/dwcevolution.gdot

Figure 1.

Darwin Core Version 1.0
This is close to the original Darwin Core, and is the first version that was put into active use. This version of the DwC has some deficiencies that should be fairly obvious to anyone who works closely with natural history collections. Nevertheless, it is the most widely used version (as of mid 2003).

Darwin Core Version 2.0

Darwin Core Version 2.0 var. MaNIS
This variety of DwC 2 is being actively deployed by the Mammal Networked Information System.

OBIS Federation Schema