Research article, 2022, Peer reviewed, Open access

The curse of the uncultured fungus

Abarenkov, Kessy; Kristiansson, Erik; Ryberg, Martin; Nogal-Prata, Sandra; Gomez-Martinez, Daniela; Stueer-Patowsky, Katrin; Jansson, Tobias; Polme, Sergei; Ghobad-Nejhad, Masoomeh; Corcoll, Natalia; Scharn, Ruud; Sanchez-Garcia, Marisol; Khomich, Maryia; Wurzbacher, Christian; Nilsson, R. Henrik


The international DNA sequence databases abound in fungal sequences not annotated beyond the kingdom level, typically bearing names such as "uncultured fungus". These sequences beget low-resolution mycological results and invite further deposition of similarly poorly annotated entries. What do these sequences represent? This study uses a 767,918-sequence corpus of public full-length fungal sequences to estimate what proportion of the "uncultured fungus" sequences represent truly unidentifiable fungal taxa, and what proportion could have been annotated to a more meaningful taxonomic level at the time of sequence deposition. Our results suggest that more than 70% of these sequences would have been trivial to identify to at least the order/family level at the time of sequence deposition, hinting that factors other than poor availability of relevant reference sequences explain the low-resolution names. We speculate that researchers' perceived lack of time and lack of insight into the ramifications of this problem are the main explanations for the low-resolution names. We were surprised to find that more than a fifth of these sequences seem to have been deposited by mycologists rather than by researchers unfamiliar with the consequences of poorly annotated fungal sequences in molecular repositories. The proportion of these needlessly poorly annotated sequences does not decline over time, suggesting that this problem must not be left unchecked.


Data interoperability; data mining; DNA barcoding; scientific practice; species identification; taxonomic annotation

2022, number: 86, pages: 177-194. Publisher: PENSOFT PUBLISHERS