DivShift

Exploring Domain-Specific Distribution Shift in Large-Scale, Volunteer-Collected Biodiversity Datasets

If you are here for more information on DivShift, see my main webpage, the AAAI proceedings, the extended paper, the blog post, the dataset, or the code. Below is a reflection on my growth as a science communicator during this project.

DivShift is both my first academic publication and an exciting step into machine learning research for sustainability. In the two and a half years since my last research experience, I had grown as a computer scientist and a communicator and had found a new passion for the intersection of machine learning and biodiversity. During my summer at the geosense lab, I investigated how biases in volunteer-collected biodiversity data (e.g., from iNaturalist) impact deep learning model performance.

I presented this work at a Climate Change workshop at NeurIPS 2024 and at AAAI 2025 (where it won a best paper award). This presented me with a new challenge: communicating my interdisciplinary research to an audience of machine learning experts. The hardest part was bridging the gap between the ecology community, which cares deeply about species distribution and observational data biases, and the machine learning community, which typically focuses on algorithmic innovation and performance metrics.

NeurIPS Video

My NeurIPS presentation, which can be viewed here, challenged me to independently distill the project's dense analyses into crisp diagrams that could quickly convey the significance of volunteer-collected data biases. I begin with the ecological motivation behind the study: the importance of biodiversity monitoring in a rapidly changing climate, and the bias in the data used to build models that assist with that monitoring. Then I move to the machine learning methods, using clear visuals to show the bias in the data and its impact on model performance. I also used slide transitions and shading to focus audience attention as I presented each visual.

AAAI Slide Presentation

The AAAI talk incorporated collaborative improvements from Lauren Gillespie, my mentor and co-author, and other lab members. It showed me how team-based iteration and mentorship on slides and data visualizations can yield more meaningful charts, streamlined layouts, and a coherent color palette. These more advanced charts helped communicate more nuanced results to the audience.

Working on DivShift taught me the importance of audience calibration: building technically rigorous materials for peer reviewers while still presenting an overarching story for interdisciplinary audiences. In my PhD research, I plan to further blend advanced computational methods with ecological insights and to continue this careful attention to audience accessibility.
