ZooKeys. 2016 Jan 7:(550):207-23.
doi: 10.3897/zookeys.550.9546. eCollection 2016.

The use and limits of scientific names in biological informatics

David Remsen

Abstract

Scientific names serve to label biodiversity information: information related to species. Names, and their underlying taxonomic definitions, however, are unstable and ambiguous. This negatively impacts the utility of names as identifiers and as effective indexing tools in biological informatics where names are commonly utilized for searching, retrieving and integrating information about species. Semiotics provides a general model for describing the relationship between taxon names and taxon concepts. It distinguishes syntactics, which governs relationships among names, from semantics, which represents the relations between those labels and the taxa to which they refer. In the semiotic context, changes in semantics (i.e., taxonomic circumscription) do not consistently result in a corresponding and reflective change in syntax. Further, when syntactic changes do occur, they may be in response to semantic changes or in response to syntactic rules. This lack of consistency in the cardinal relationship between names and taxa places limits on how scientific names may be used in biological informatics in initially anchoring, and in the subsequent retrieval and integration, of relevant biodiversity information. Precision and recall are two measures of relevance. In biological taxonomy, recall is negatively impacted by changes or ambiguity in syntax while precision is negatively impacted when there are changes or ambiguity in semantics. Because changes in syntax are not correlated with changes in semantics, scientific names may be used, singly or conflated into synonymous sets, to improve recall in pattern recognition or search and retrieval. Names cannot be used, however, to improve precision. This is because changes in syntax do not uniquely identify changes in circumscription. These observations place limits on the utility of scientific names within biological informatics applications that rely on names as identifiers for taxa. 
Taxonomic systems and services used to organize and integrate information about taxa must accommodate the inherent semantic ambiguity of scientific names. The capture and articulation of circumscription differences (i.e., multiple taxon concepts) within such systems must be accompanied by distinct concept identifiers that can be used in association with, or in place of, traditional scientific names.
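
The precision and recall argument above can be made concrete with a small worked example. The following is a minimal sketch on invented toy data, not a method from the paper: the record identifiers, the placeholder names "Aus bus" and "Xus bus", and the concept labels are all hypothetical. It assumes "Xus bus" is a synonym of "Aus bus", and that each name has also been applied to a second, differently circumscribed taxon concept.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved record ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical records: (record id, name used as label, concept intended).
records = [
    ("r1", "Aus bus", "concept-A"),
    ("r2", "Aus bus", "concept-A"),
    ("r3", "Xus bus", "concept-A"),  # synonym: syntax differs, same concept
    ("r4", "Aus bus", "concept-B"),  # same syntax, different circumscription
    ("r5", "Xus bus", "concept-B"),
]


def search_by_names(names):
    """Retrieve record ids whose label matches any of the given names."""
    return {rid for rid, name, _ in records if name in names}


def search_by_concept(concept_id):
    """Retrieve record ids tagged with the given concept identifier."""
    return {rid for rid, _, concept in records if concept == concept_id}


# The user wants everything about concept-A: records r1, r2 and r3.
relevant = search_by_concept("concept-A")

single_name = precision_recall(search_by_names({"Aus bus"}), relevant)
synonym_set = precision_recall(search_by_names({"Aus bus", "Xus bus"}), relevant)
by_concept = precision_recall(search_by_concept("concept-A"), relevant)

print("single name :", single_name)
print("synonym set :", synonym_set)
print("concept id  :", by_concept)
```

In this toy data set, searching on the synonym set raises recall from 2/3 to 1 because the record filed under the other name is now found, but precision falls from 2/3 to 3/5 because no combination of names can exclude the records that use the same names for the other concept. Only the search on the concept identifier returns exactly the relevant records.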

Keywords: Taxonomic name services; identifiers; relevance; search and retrieval; taxon concepts.

Figures

Figure 1. Scientific names label information about species.
Figure 2. All information of a species is linked by a name.
Figure 3. The semiotic triangle describes how names communicate meaning.
Figure 4. Precision vs. recall in search results.
Figure 5. A polyseme is a single name referring to more than one overlapping or included concept.
