By Gabriel Muñoz, a PhD student at Concordia University
Life on Earth thrives through its remarkable diversity of species. Different species combinations across the globe shape unique ecosystems, while interactions among them and their environment regulate nutrient flow and genetic exchange. Over the past centuries, scientists around the globe have independently collected data on biodiversity, most of which is now stored as museum specimens and as text in scientific publications. Recent efforts have consolidated much of this data into comprehensive, digitally available datasets (Figure 1). Globally aggregated data allows researchers to uncover large-scale biodiversity patterns and investigate the drivers behind them.
Understanding community assembly, the process driving biodiversity patterns, requires more than mere species counts. Comprehensive knowledge of species traits and species interactions is critical. Despite the efforts in data collection and synthesis, significant knowledge gaps remain. Artificial intelligence offers a powerful tool for generating insights by leveraging the structure and relationships in observed data to fill these gaps.
As part of my PhD thesis, I used machine learning models, trained with data collected from the literature and museum records, to predict synthetic variables representing species’ multitrophic traits (Figure 2) and cross-trophic interactions. These synthetic variables are numerical variables that capture observed as well as unobserved ecological relationships, helping to bridge knowledge gaps and facilitating ecological inferences at large scales from sparse observations. Specifically, I used global datasets on palms and mammals to train random forest models that predict multitrophic traits. I also used neural network models to predict species interactions in the Neotropics, which allowed me to generate synthetic datasets at a continental scale. By integrating these modeled data with maps of species’ geographic ranges, I explored the differences in functional diversity across trophic levels and constructed probabilistic networks for any given region of the Neotropics. Our results can inform conservation efforts and help us understand the potential consequences of global climate change for the structure of seed dispersal networks. Last year, in Vancouver, I had the privilege of presenting my research at the International Biogeography Meeting (IBS), thanks to the support of the QCBS Excellence Award.

By leveraging AI and synthetic data, we can embark on an era of ecological discovery, striving for a comprehensive understanding of Earth’s biodiversity and working towards a sustainable future. However, we must not leave data collection aside: for now, only a few taxa have enough data to train models at a global scale, and data completeness shows strong geographical biases between the northern and southern hemispheres. Globally inclusive collaborative efforts among scientists, natural historians, and data experts will pave the way for a deeper understanding of the intricate dynamics that govern all of Earth’s ecosystems.
About the author: Gabriel Muñoz is a PhD candidate working in the Community Ecology and Biogeography Lab at Concordia University.