BMC Evol Biol. 2008 Feb 18:8:50.
doi: 10.1186/1471-2148-8-50.

Mining metadata from unidentified ITS sequences in GenBank: a case study in Inocybe (Basidiomycota)

Martin Ryberg et al. BMC Evol Biol. 2008.

Abstract

Background: The lack of reference sequences from well-identified mycorrhizal fungi often poses a challenge to the inference of taxonomic affiliation of sequences from environmental samples, and many environmental sequences are thus left unidentified. Such unidentified sequences belonging to the widely distributed ectomycorrhizal fungal genus Inocybe (Basidiomycota) were retrieved from GenBank and divided into species that were identified in a phylogenetic context using a reference dataset from an ongoing study of the genus. The sequence metadata of the unidentified Inocybe sequences stored in GenBank, as well as data from the corresponding original papers, were compiled and used to explore the ecology and distribution of the genus. In addition, the relative occurrence of Inocybe was contrasted with that of other mycorrhizal genera.

Results: Most species of Inocybe were found to have less than 3% intraspecific variability in the ITS2 region of the nuclear ribosomal DNA. This cut-off value was used jointly with phylogenetic analysis to delimit unidentified Inocybe sequences and identify them to species level. A total of 177 unidentified Inocybe ITS sequences corresponding to 98 species were recovered, 32% of which were successfully identified to species level in this study. These sequences account for an unexpectedly large proportion of the publicly available unidentified fungal ITS sequences when compared with other mycorrhizal genera. Eight Inocybe species were reported from multiple hosts and some even from hosts forming arbutoid or orchid mycorrhizae. Furthermore, Inocybe sequences have been reported from four continents and in climate zones ranging from cold temperate to equatorial. Out of the 19 species found in more than one study, six were found in both Europe and North America and one was found in both Europe and Japan, indicating that many north temperate species, at least, have a wide distribution.
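
As an illustration of how a 3% cut-off can be applied, the following is a minimal sketch and not the procedure used in the study, which combined the cut-off with phylogenetic analysis. It assumes the sequences are already aligned to equal length, measures difference as the percentage of mismatched positions (skipping gap columns), and groups sequences by single linkage. The function names, the sequence labels, and the toy sequences are all hypothetical.

```python
from itertools import combinations


def percent_difference(a: str, b: str) -> float:
    """Percent of mismatched positions between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    This distance measure is an assumption, not taken from the paper.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    compared = [(x, y) for x, y in zip(a.upper(), b.upper())
                if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no overlapping positions to compare")
    mismatches = sum(1 for x, y in compared if x != y)
    return 100.0 * mismatches / len(compared)


def cluster_by_cutoff(seqs: dict, cutoff: float = 3.0) -> list:
    """Single-linkage grouping of sequence names.

    Two sequences end up in the same group if a chain of pairwise
    differences, each below `cutoff` percent, connects them.
    """
    parent = {name: name for name in seqs}

    def find(name):
        # Follow parent links to the group root, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(seqs, 2):
        if percent_difference(seqs[a], seqs[b]) < cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Toy 100-nt "alignment" (hypothetical data, not real ITS2 sequences).
base = "ACGT" * 25
toy = {
    "ref_A": base,
    "env_1": "C" + base[1:50] + "T" + base[51:],  # 2 mismatches vs ref_A
    "env_2": "T" * 10 + base[10:],                # 8 mismatches vs ref_A
}
groups = cluster_by_cutoff(toy, cutoff=3.0)
print(groups)
```

In this toy case `env_1` differs from `ref_A` at 2% of positions and falls into the same group, whereas `env_2` differs at 8% and forms a group of its own. Single linkage is only one possible grouping rule; a stricter rule such as complete linkage could split groups that single linkage merges.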

Conclusion: Although DNA-based species identification and circumscription are associated with practical and conceptual difficulties, they also offer new possibilities and avenues for research. Metadata assembly holds great potential to synthesize valuable information from community studies for use in a species and taxonomy-oriented framework.

Figures

Figure 1
Colour plate of Inocybe. A. I. geophylla var. lilacina, B. I. whitei, C. I. posterula, D. I. jacobi, E. I. flavella, and F. I. squamata.
Figure 2
Phylograms depicting three species complexes of Inocybe. The figures are derived from one of the most parsimonious trees for the respective alignment group (Additional file 1). These are based on ITS and partial LSU with jackknife support reported above the branches. When only accession numbers are given, the corresponding sequence represents an unidentified GenBank sequence; when both an accession number and a species name are given, the entry corresponds to a fully identified GenBank sequence; and when a voucher is given in parentheses, the sequence corresponds to a sequence added in this study (Additional file 2). a) The I. soluta/I. boltonii complex, b) the I. geophylla, I. posterula, and I. whitei complex (in this study collectively referred to as I. geophylla s. l.), and c) the I. flavella/I. squamata complex.
Figure 3
Graph depicting inter- and intraspecific similarity for Inocybe. The bars represent the number of species within each interval and are based on 105 species for the interspecific similarity (white bars) and 53 species for the intraspecific similarity (black bars).
Figure 4
Schematic illustration of the distribution of unidentified sequences among the major clades of Inocybe. The clades are named according to [23] and the numbers depict the number of unidentified sequences associated with each clade. As no fully identified ITS sequences representing species in the Auritella clade were available, this clade was excluded.
Figure 5
The geographic origins of Inocybe sequences (present in GenBank) that could be geographically assessed. a) North America, b) Europe, c) Asia and Australia. The color coding of the map represents climate according to the Köppen-Geiger climate classification (modified from [28]); main climate A = equatorial, B = arid, C = warm temperate, D = snow, E = polar; precipitation W = desert, S = steppe, f = fully humid, s = summer dry, w = winter dry, m = monsoonal; temperature h = hot arid, k = cold arid, a = hot summer, b = warm summer, c = cool summer, d = extremely continental, F = polar frost, T = polar tundra. Black symbols represent studies with unidentified sequences, white symbols represent studies with fully identified sequences, and grey symbols represent studies with both unidentified and fully identified sequences. Circles represent studies for which the coordinates were given in the paper in which they were published, diamonds represent studies where the location had to be inferred from the locality given in the paper, and squares represent studies where the location was not available beyond the country/state level. The numbers correspond to the number given to each study in Additional file 2.
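
The letter codes in the Figure 5 legend can be read as a small lookup scheme. The sketch below is only an illustration of that reading: the three tables are copied from the legend, while the function name `decode_koppen` and the example codes passed to it are assumptions made here and do not come from the paper.

```python
# Lookup tables transcribed from the Figure 5 legend.
MAIN_CLIMATE = {
    "A": "equatorial", "B": "arid", "C": "warm temperate",
    "D": "snow", "E": "polar",
}
PRECIPITATION = {
    "W": "desert", "S": "steppe", "f": "fully humid",
    "s": "summer dry", "w": "winter dry", "m": "monsoonal",
}
TEMPERATURE = {
    "h": "hot arid", "k": "cold arid", "a": "hot summer",
    "b": "warm summer", "c": "cool summer",
    "d": "extremely continental", "F": "polar frost", "T": "polar tundra",
}


def decode_koppen(code: str) -> list:
    """Translate a Köppen-Geiger class code into the legend's terms.

    The first letter is the main climate. Each following letter is
    looked up as a precipitation letter or a temperature letter; the
    two sets do not overlap because the lookup is case-sensitive.
    """
    if not code or code[0] not in MAIN_CLIMATE:
        raise ValueError(f"unknown main climate in {code!r}")
    parts = [MAIN_CLIMATE[code[0]]]
    for letter in code[1:]:
        if letter in PRECIPITATION:
            parts.append(PRECIPITATION[letter])
        elif letter in TEMPERATURE:
            parts.append(TEMPERATURE[letter])
        else:
            raise ValueError(f"unknown letter {letter!r} in {code!r}")
    return parts


# Example codes chosen for illustration only.
print(decode_koppen("Cfb"))
print(decode_koppen("ET"))
```

Polar classes carry only two letters, which is why the second position is checked against both tables rather than being treated strictly as precipitation.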

References

    1. Bruns TD, Szaro TM, Gardes M, Cullings KW, Pan JJ, Taylor DL, Horton TR, Kretzer A, Garbelotto M, Li Y. A sequence database for the identification of ectomycorrhizal basidiomycetes by phylogenetic analysis. Mol Ecol. 1998;7:257–272. doi: 10.1046/j.1365-294X.1998.00337.x.
    2. Horton TR, Bruns TD. The molecular revolution in ectomycorrhizal ecology: peeking into the black-box. Mol Ecol. 2001;10:1855–1871. doi: 10.1046/j.0962-1083.2001.01333.x.
    3. Bruns TD, Shefferson RP. Evolutionary studies of ectomycorrhizal fungi: recent advances and future directions. Can J Bot. 2004;82:1122–1132. doi: 10.1139/b04-021.
    4. Hillis DM, Dixon MT. Ribosomal DNA: Molecular evolution and phylogenetic inference. Q Rev Biol. 1991;66:411–453. doi: 10.1086/417338.
    5. Hershkovitz MA, Lewis LA. Deep-level diagnostic value of the rDNA-ITS region. Mol Biol Evol. 1996;13:1276–1295.
