AI-powered Bioacoustics with BirdNET

How can computers learn to recognise birds from sounds? The Cornell Lab of Ornithology and the Chemnitz University of Technology are trying to find an answer to this question.

Their research has led to the development and evolution of BirdNET – an artificial neural network that has learned to detect and classify avian sounds through machine learning.

This webinar provides an introduction to BirdNET, how it works and how its use can be scaled to generate huge biodiversity datasets (with a case study from Wilder Sensing).


BirdNET: Advancing Conservation with AI-Powered Sound ID

Dr Stefan Kahl (Cornell Lab of Ornithology and Chemnitz University of Technology)

Learn about how BirdNET was built with Dr Stefan Kahl. He’ll cover some basics about AI for sound ID, present a few case studies that have used BirdNET at scale and then conclude with some thoughts on the future of AI in bioacoustics.

Dr Stefan Kahl is a computer science postdoc at the Cornell Lab of Ornithology and Chemnitz University of Technology in Germany, working on AI algorithms for animal sound ID. He leads the BirdNET team, which develops state-of-the-art tools for anyone interested in large-scale acoustic monitoring.

How did you handle species with multiple distinct call types?

We put all of the files from one species into a folder and told the AI model that this is one species. That has worked relatively well: these models are able to distinguish different call types for the same species, so we can handle calls, songs, all kinds of call types. We know that these feature vectors embed the call types, and we can reconstruct them. So, after you’ve run BirdNET, if you take the embeddings instead of the class scores, you can run a clustering, separate out the different call types and end up with a nice visualisation.
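
As a rough illustration of that clustering workflow, here is a minimal Python sketch using scikit-learn. The random matrix is a stand-in for real BirdNET embeddings (one row per three-second segment), and the cluster count is an assumption you would tune:

    # Cluster per-segment embeddings to pull apart candidate call types.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.decomposition import PCA
    import matplotlib.pyplot as plt

    # Stand-in for exported BirdNET embeddings: (n_segments, embedding_dim).
    embeddings = np.random.default_rng(0).normal(size=(500, 1024))

    n_call_types = 4  # assumption: tune, e.g. with silhouette scores
    labels = KMeans(n_clusters=n_call_types, random_state=0).fit_predict(embeddings)

    # Project to 2-D so the clusters can be inspected visually.
    xy = PCA(n_components=2).fit_transform(embeddings)
    plt.scatter(xy[:, 0], xy[:, 1], c=labels, s=8)
    plt.title("Candidate call-type clusters")
    plt.show()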

Sometimes, if people are interested in something specific like mating calls, they will train a custom classifier, and that works as long as it is a category. If you can categorise it, it can become a model. So, if you want to run a call-type model instead of a species model, you can. One category of call types which is challenging is flight calls: not all species have sufficient flight call data, and flight calls tend to be short. I would exclude those.
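
As a sketch of what such a custom classifier can look like, the snippet below trains a simple linear model on top of fixed-size embeddings, with one class per call-type category. The call-type names, embedding dimension and stand-in data are all hypothetical; in practice each row would be the BirdNET embedding of one labelled clip:

    # Train a lightweight call-type classifier on top of BirdNET embeddings.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    emb_dim = 1024  # assumption: depends on the BirdNET version used
    classes = ["song", "alarm_call", "begging_call"]  # hypothetical categories

    # Stand-in data: 200 labelled clips per call type.
    X = rng.normal(size=(len(classes) * 200, emb_dim))
    y = np.repeat(classes, 200)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    print("held-out accuracy:", clf.score(X_te, y_te))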

How did you settle on the three-second segments for sampling?

We wanted to reduce the input size, as these models are computationally expensive: the bigger the input, the more expensive the model. So we want the smallest spectrogram that still retains all the detail.

During my PhD, I did an empirical study looking at average bird song length, and the literature gave a time of two seconds. I added half a second before and after to allow some wiggle room, and that’s why we chose three seconds. We know some people are using five seconds, and that is usually fine; a longer context window might help with call sequences. Sometimes three seconds is not good enough, because you need a temporal component.
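
For illustration, here is a minimal sketch of that windowing step. The sine wave is a stand-in for a real recording, and the 48 kHz sample rate is an assumption:

    # Chop a recording into the fixed-length windows the model expects.
    import numpy as np

    sr = 48000  # assumption: sample rate in Hz
    audio = np.sin(np.linspace(0, 2 * np.pi * 440 * 10, sr * 10))  # 10 s stand-in

    window = 3 * sr  # 3-second analysis window
    hop = window     # no overlap; shrink hop to overlap consecutive windows
    segments = [audio[i:i + window]
                for i in range(0, len(audio) - window + 1, hop)]
    print(f"{len(segments)} segments of {window} samples each")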

Do recordings need to be added to Xeno-canto or can you access recordings from Merlin and other systems?

Merlin doesn’t currently leverage users’ observations, i.e., Merlin is not collecting data that we can use. We have access to recordings submitted to eBird and the Macaulay Library, though Xeno-canto is a bit quicker to allow non-bird uploads. The best route for users of BirdNET-Pi is BirdWeather (get a device ID and hook it up to the BirdWeather platform).

Is there a way to reduce the number of false positives in BirdNET?

Yes: you can use false positives to train a custom classifier, and then BirdNET will (hopefully) learn to separate target from non-target sounds. So basically, you use those “negative” samples to train the model.
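
One way to act on that advice, sketched below: after retraining with an explicit negative class, discard any detection the classifier assigns to it. The column names, labels and confidence threshold are all hypothetical:

    # Drop detections that a retrained classifier assigns to the negative class.
    import pandas as pd

    detections = pd.DataFrame({
        "label":      ["Target", "Negative", "Target"],
        "confidence": [0.91, 0.88, 0.55],
    })
    kept = detections[(detections["label"] != "Negative")
                      & (detections["confidence"] >= 0.7)]  # assumption: threshold
    print(kept)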

What is needed to scale BirdNET fast?

More data. Plus anecdotal evidence on how people are using it, so we can learn what we need to tackle to make it more useful.

Do you know if anyone is using BirdNET to look at the relationship between anthropogenic noise and species abundance?

Not abundance but disturbance: there’s a project going on in Yellowstone National Park looking at the impact of snowmobiles on bird vocalisation. They found that engine sounds are a significant disturbance and need to be managed, i.e., birds will stop vocalising for extended periods of time, even well after the snowmobiles have passed.

What data augmentation techniques do you use (if any) to expand your training dataset?

MixUp (mixing multiple recordings into one) is the most important and most effective augmentation. We had a student look into augmentations a while ago.
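
A minimal sketch of MixUp for audio, assuming three-second clips at 48 kHz and one-hot labels; the alpha parameter is a tunable assumption:

    # Blend two waveforms and their labels with a Beta-distributed weight.
    import numpy as np

    def mixup(wave_a, label_a, wave_b, label_b, alpha=0.2, rng=np.random.default_rng()):
        lam = rng.beta(alpha, alpha)  # mixing weight in (0, 1)
        mixed_wave = lam * wave_a + (1 - lam) * wave_b
        mixed_label = lam * label_a + (1 - lam) * label_b  # soft target
        return mixed_wave, mixed_label

    # Stand-in clips and one-hot labels for a two-class problem.
    a, b = np.random.randn(144000), np.random.randn(144000)
    wave, label = mixup(a, np.array([1.0, 0.0]), b, np.array([0.0, 1.0]))
    print(label)  # e.g. [0.83 0.17]

In a multi-label detection setting, you might instead take the element-wise maximum of the two label vectors, so that both mixed-in sources count as present.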


A Scalable Platform for Ecological Monitoring

Lorenzo Trojan (Wilder Sensing)

How can we measure the impact of wildlife restoration, assess biodiversity loss, and evaluate the effectiveness of environmental policies? Passive bioacoustic monitoring offers a powerful solution, enabling continuous, large-scale coverage of ecosystems. To fully harness its potential, we need a scalable, robust software platform capable of handling vast audio datasets, detecting key biological sounds using AI, and extracting ecological insights such as species richness and assemblage trends.
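
As a small illustration of one such insight, species richness can be read straight off a detection table as the number of unique species per site. The schema below is a hypothetical stand-in, not the actual Wilder Sensing export format:

    # Species richness per site from a table of detections.
    import pandas as pd

    detections = pd.DataFrame({
        "site":    ["A", "A", "A", "B", "B"],
        "species": ["Eurasian Wren", "Common Blackbird", "Eurasian Wren",
                    "Common Blackbird", "European Robin"],
    })
    print(detections.groupby("site")["species"].nunique())  # A: 2, B: 2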

This presentation explores the challenges of biodiversity monitoring with passive audio recorders, the processing and analysis of large-scale acoustic data, and the technologies that make this approach viable and impactful. We’ll also showcase how the Wilder Sensing Platform is purpose-built to meet these demands—providing researchers, conservationists, and policymakers with an intuitive, scalable, and efficient tool for biodiversity monitoring and ecological surveys.

Dr Lorenzo Trojan is a technologist with a PhD in Astrophysics and over a decade of leadership experience in high-growth tech startups. His expertise spans remote sensing, cloud computing, DevOps, and AI. As CTO of Wilder Sensing, he leads the development of a scalable platform for ecological monitoring, driven by a commitment to innovation, inclusivity, and impact.


Useful links


Wilder Sensing ecoTECH blogs

  1. How Can We Use Sound to Measure Biodiversity: https://biologicalrecording.co.uk/2024/07/09/bioacoustics-1/
  2. Can Passive Acoustic Monitoring of Birds Replace Site Surveys: https://biologicalrecording.co.uk/2024/09/17/bioacoustics-2/
  3. The Wilder Sensing Guide to Mastering Bioacoustic Bird Surveys: https://biologicalrecording.co.uk/2024/11/26/bioacoustics-3/
  4. Bioacoustics for Regenerative Agriculture: https://biologicalrecording.co.uk/2025/03/31/bioacoustics-for-regen-ag/
  5. AI-powered Bioacoustics with BirdNET: https://biologicalrecording.co.uk/2025/07/08/birdnet/

Event partners

This blog was produced by the Biological Recording Company in partnership with Wilder Sensing, Wildlife Acoustics and NHBS.



