SLU publication database (SLUpub)

Abstract

DNA-based biodiversity surveys result in massive-scale data, including up to millions of species, most of which are rare. Making the most of such data for inference and prediction requires modeling approaches that can relate species occurrences to environmental and spatial predictors, while incorporating information about their taxonomic or phylogenetic placement. Although the scalability of joint species distribution models to large communities has greatly advanced, incorporating hundreds of thousands of species has not been feasible to date, leading to compromised analyses. Here we present a 'common to rare transfer learning' (CORAL) approach, based on borrowing information from the common species to enable statistically and computationally efficient modeling of both common and rare species. We illustrate that CORAL leads to much improved prediction and inference in the context of DNA metabarcoding data from Madagascar, comprising 255,188 arthropod species detected in 2,874 samples.

Published in

Nature Methods
2025
Publisher: NATURE PORTFOLIO

SLU Authors

UKÄ Subject classification

Ecology
Bioinformatics (Computational Biology)

Publication identifier

  • DOI: https://doi.org/10.1038/s41592-025-02823-y

Permanent link to this page (URI)

https://res.slu.se/id/publ/143875