For decades, scientists have warned that the growing magnitude, intensity, and scale of anthropogenic activities threaten the existence of life on Earth as we know it. Assessments of extinction risk categories provided by the IUCN Red List of Threatened Species enable large-scale analyses of human impacts on the living world and highlight the potential for conservation as well as restoration. In a time- and labor-intensive effort, such assessments have so far been produced manually for more than 147,000 species. However, assessing the extinction risk for a representative set of species, and keeping these assessments up to date, is challenging due to the dynamic nature of threats and the diversity of life on Earth. Of the species that have been assessed, roughly 14% are classified as “Data Deficient”. This means that the available information about the species is inadequate to define its risk of extinction. Not only are there several different reasons why a species can be listed as Data Deficient, but its true extinction risk also remains unknown.
For practitioners it is consequently difficult to account for the conservation importance of these species, because a Data Deficient species could rightfully belong to any of the five extinction risk categories (Least Concern, Near Threatened, Vulnerable, Endangered, and Critically Endangered). As a result, research that uses the IUCN extinction risk categorizations, for instance to assess the biodiversity impacts of different anthropogenic activities, usually either ignores Data Deficient species (thereby neglecting a large share of the species pool) or treats them as a homogeneous group.
In our recent paper published in Communications Biology, we predicted the risk of being threatened by extinction for exactly these species using a machine learning algorithm. To train this model, we linked the known extinction risk categories of thousands of species to globally available data on known stressors (such as climatic conditions, land use, threats from invasive species, pesticide use, and many more). The trained model then estimates whether a Data Deficient species faces conditions (within its geographical distribution) that likely put it at risk of extinction.
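The general idea of learning from assessed species and then scoring unassessed ones can be sketched in a few lines. This is only a minimal illustration and not the model from the paper: it swaps in a plain logistic regression, and the two stressor features, all numeric values, and the function names (`to_label`, `train`, `threat_probability`) are hypothetical. The only assumption taken from the Red List itself is that Vulnerable, Endangered, and Critically Endangered are the categories grouped as "threatened".

```python
import math

# Red List categories conventionally grouped as "threatened".
THREATENED = {"Vulnerable", "Endangered", "Critically Endangered"}


def to_label(category):
    """Collapse a Red List category into a binary target (1 = threatened)."""
    return 1 if category in THREATENED else 0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def threat_probability(weights, bias, stressors):
    """Estimated probability of being threatened, given stressor values."""
    z = sum(w * s for w, s in zip(weights, stressors)) + bias
    return sigmoid(z)


def train(samples, labels, epochs=500, learning_rate=0.5):
    """Fit a logistic regression by batch gradient descent."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for stressors, label in zip(samples, labels):
            error = threat_probability(weights, bias, stressors) - label
            for j, s in enumerate(stressors):
                grad_w[j] += error * s
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Invented toy data, NOT real assessments. Each entry pairs two hypothetical
# stressor scores in [0, 1] (land-use pressure, invasive-species pressure)
# within a species' range with its known Red List category.
assessed = [
    ((0.9, 0.8), "Critically Endangered"),
    ((0.8, 0.7), "Endangered"),
    ((0.7, 0.9), "Vulnerable"),
    ((0.2, 0.1), "Least Concern"),
    ((0.1, 0.3), "Least Concern"),
    ((0.3, 0.2), "Near Threatened"),
]

samples = [stressors for stressors, _ in assessed]
labels = [to_label(category) for _, category in assessed]
weights, bias = train(samples, labels)

# Score two hypothetical Data Deficient species from the stressors in their range.
p_high = threat_probability(weights, bias, (0.85, 0.9))
p_low = threat_probability(weights, bias, (0.1, 0.1))
print(f"high-pressure range: {p_high:.2f}, low-pressure range: {p_low:.2f}")
```

The output is a probability rather than a category, which matches the binary framing used here (threatened or not threatened) rather than an attempt to predict one of the five Red List categories.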
The predictions indicate that Data Deficient species should be of high conservation interest, as they are on average more threatened by extinction than their data-sufficient counterparts. More specifically, 28% of data-sufficient species are threatened by extinction according to the IUCN, compared to a predicted 56% of Data Deficient species. Assessing the true conservation status seems to be particularly urgent for Data Deficient amphibians, of which the classifier predicts 85% to be threatened by extinction, as well as for anthozoans (marine invertebrates including anemones and corals), insects, mammals and reptiles, where more than half of Data Deficient species could be threatened by extinction. The findings further show that Data Deficient species vary greatly in their potential for being threatened. Consequently, re-assessment is more urgent for species in some regions than in others. For example, we find hotspots for Data Deficient marine species predicted to be threatened in Southeast Asia as well as along the East-Atlantic and Mediterranean coastlines (Fig. 1 a). Land-dwelling Data Deficient species that are likely threatened are scattered across all continents and often geographically restricted to smaller ranges, most notably in Central Africa, Madagascar and Southern Asia (Fig. 1 b).
We believe these predictions can contribute to reducing the uncertainty of approaches that rely on the IUCN extinction risk categorizations. In addition, they illustrate the potential that lies in further utilizing big data for conservation, acting as a time-efficient supplement to the manual IUCN Red List assessment procedures. By no means do we suggest replacing the labor-intensive and meticulous Red List assessments conducted by thousands of dedicated experts. The accuracy of our approach naturally depends on these very assessments. However, as global biodiversity continues to decline faster than we can conduct conservation assessments, we are convinced that it is of utmost importance to urgently increase the precision of Red List-derived analyses. The resulting algorithm can help accelerate completion of the IUCN Red List by enabling screenings for necessary updates of data-sufficient species, as well as by pre-assessing species on a large scale. We have translated the model into an online tool that allows anyone to look up the threat probability of any species (provided its geographical distribution is known).
As the IUCN Red List of Threatened Species is, for example, used to measure progress towards the United Nations Sustainable Development Goals, the presented extinction risk scores for Data Deficient species can increase the robustness of future studies that identify conservation priorities for a transition to a more sustainable world.