Tom Stephenson on the BirdGenie App

Where did the idea for BirdGenie’s technology come from?

TS: I was a musician and played concerts and worked in studios for many years. During this time I became interested in sound design and processing technologies. I continued this interest at Roland Corporation, where I finally got a “real” job, designing multi-channel recorders and mixing consoles. After retiring as director of technology of one of the divisions, I finally had more time for birding.

I had always been interested in vocalizations, often sketching out the shape of songs I was learning. Bird songs are often highly variable from individual to individual across one species. I wanted to find out what unifying elements allowed a member of a species, or us humans, to look beyond those variations and identify a bird as a member of one species.

As part of this study I wrote an article for ABA’s Birding magazine outlining how to identify the songs of different thrasher species, highly variable mimics, using the structure of their songs. Realizing how powerful this kind of analysis could be was a breakthrough for me. I began looking at spectrograms (graphic representations of sound) of many different kinds of bird songs. I was trying to find the unifying structures underlying all of the songs of one species.

I was also comparing the general structure of all songs. Were there any universal constants that could help us notice the key unifying features that make one species’ songs unique? These studies, including basics like the elements, phrases and sections of songs, led to the breakthrough vocabulary and analysis of vocalizations in the book I wrote with Scott Whittle, The Warbler Guide.

My background work in signal processing and audio analysis as a musician and designer led me to think more broadly about how these song criteria could be formalized and stereotyped so they could even be used by a computer to help with song identification. After working on this for some time I filed for a patent on these methods and concepts for identifying animal vocalizations, which was finally granted a few years ago now. Sorry for the long-winded answer!

Did you do the programming yourself?

TS: No. That’s tough work! But I had been thinking about how to implement these ideas for some time. I went to an audio trade show and ran into a friend. He had just started working at a prominent software company that created products with highly sophisticated signal processing. I asked him if their processing technologies might work with my ideas. He thought they would and became very interested in the project.

I went up to their offices in Cambridge, MA, and gave a presentation about the size of the birding marketplace and my strategies. They got excited about these ideas and assigned two engineers to work with me on the project. They first researched all of the academic papers on the topic, and then developed programs to implement those strategies.

The state of the art then, and even now, is black-box machine learning. Basically you feed a computer lots of known examples, which are analyzed using a variety of processing tools. The computer develops a map of features for each type. You then present the computer with a new example and it applies those mapped criteria and tries to make an identification. This system works well for music identification and even voice recognition. However, bird songs are just a lot more variable.
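The train-then-identify workflow described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the software the engineers built: it assumes each song has already been reduced to two made-up numeric features, and it uses a simple nearest-centroid rule. The species labels, feature values, and function names are all hypothetical.

```python
from math import dist

def train_centroids(examples):
    """Build a 'map of features': the average feature vector per species."""
    sums, counts = {}, {}
    for species, features in examples:
        acc = sums.setdefault(species, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[species] = counts.get(species, 0) + 1
    return {s: tuple(v / counts[s] for v in acc) for s, acc in sums.items()}

def identify(centroids, features):
    """Return the species whose learned centroid is closest to the new example."""
    return min(centroids, key=lambda s: dist(centroids[s], features))

# Hypothetical labeled examples: (peak frequency in kHz, song length in seconds)
TRAINING = [
    ("Species A", (4.0, 1.5)),
    ("Species A", (4.4, 1.7)),
    ("Species B", (7.0, 0.8)),
    ("Species B", (7.4, 1.0)),
]

centroids = train_centroids(TRAINING)
print(identify(centroids, (4.1, 1.6)))  # closest to the Species A examples
```

Nothing in this sketch lets a person say "this species always sings at least two sections"; the only way to change its behavior is to supply more examples, which is the limitation described in the following paragraphs.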

We started with warbler songs because I had been working on them for some time and had a good library of examples. After some development time, the program started working very well about 70% of the time. But based on the mistaken identifications it made, I could see the “black box” wasn’t using much of my system. For example, a species whose songs always have at least two sections might be mistaken for a species that only sings one-section songs. When I pointed this out to the programmers they said they could try to “train” the black box with more examples, but they didn’t have any way of seeing the computer’s rules or modifying them manually. That’s why they call it a “black box” system.

I realized that, even though this was state-of-the-art, it wasn’t going to work well with bird songs. I needed a system that allowed for direct injection of the ID criteria I had found to be fundamental to identifying highly variable bird songs. They agreed and the CEO put me in touch with Stephen Pope, who became our programmer for BirdGenie’s engine. He had an extensive background in music technologies, signal processing and computer programming. He immediately understood the need for a new way of designing the engine, using black-box strategies but supplementing them with more types of machine learning, a user-driven rule set and other strategies that could take advantage of my prior work.

A sample of the BirdGenie app.

So is that what makes BirdGenie different from other programs on the market?

TS: Yes, most definitely. BirdGenie’s patented strategies are unique and effective. We use a wide range of rule-based strategies that we can control, modify and improve, which is very different from the standard black-box methods.

When I started working with Stephen my first requirement was that the whole system be transparent. I wanted to be able to “see” what the computer was finding relevant and then modify or add other criteria. So the first program he created was a tool that shows all of the ID criteria and how they are weighted. We then added more than thirty additional criteria that reflect the important structures and underlying features for all bird songs. And in addition to these general rules, we can add more criteria that are specific to each individual species. Some features, like density and ratio of harmonics, or the relationship of silences between elements to element lengths, are not detectable by the human ear, but can be very effective ID criteria.
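As a rough illustration of what a transparent, weighted rule set could look like, here is a minimal Python sketch. It is not BirdGenie’s engine: the criteria, weights, thresholds, species labels, and field names are all invented for the example. It assumes each song has already been measured for its number of sections and for the average lengths of its elements and of the silences between them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Song = Dict[str, float]

@dataclass
class Criterion:
    name: str                       # human-readable description of the rule
    weight: float                   # points awarded when the rule matches
    matches: Callable[[Song], bool]

def silence_ratio(song: Song) -> float:
    """Average silence between elements relative to average element length."""
    return song["mean_silence_s"] / song["mean_element_s"]

# Hypothetical per-species rule sets; every rule and weight is visible and editable.
PROFILES: Dict[str, List[Criterion]] = {
    "Species A": [
        Criterion("at least two sections", 3.0, lambda s: s["sections"] >= 2),
        Criterion("silence/element ratio below 0.5", 1.0,
                  lambda s: silence_ratio(s) < 0.5),
    ],
    "Species B": [
        Criterion("exactly one section", 3.0, lambda s: s["sections"] == 1),
        Criterion("silence/element ratio at least 0.5", 1.0,
                  lambda s: silence_ratio(s) >= 0.5),
    ],
}

def score(song: Song, criteria: List[Criterion]) -> Tuple[float, List[Tuple[str, float]]]:
    """Return the total score plus a per-criterion breakdown for inspection."""
    breakdown = [(c.name, c.weight if c.matches(song) else 0.0) for c in criteria]
    return sum(points for _, points in breakdown), breakdown

def rank(song: Song):
    """Score the song against every profile, best match first."""
    results = [(species, *score(song, criteria)) for species, criteria in PROFILES.items()]
    return sorted(results, key=lambda r: r[1], reverse=True)

song = {"sections": 2, "mean_silence_s": 0.04, "mean_element_s": 0.10}
for species, total, breakdown in rank(song):
    print(species, total, breakdown)
```

Because each candidate comes back with its breakdown, a person can see exactly which criteria fired, change a weight, or add a species-specific rule, which is the kind of control a black box does not offer.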

Stephen’s background at Stanford’s CCRMA, UC Berkeley’s CNMAT and Xerox PARC programs made him uniquely qualified to work on the project. Bird song is actually much more difficult to identify than music or even the human voice. It’s highly variable and that’s why our methods work so well, using the underlying structures and the similarities that allow even us humans to identify a singing bird.

How do you think users will benefit from BirdGenie?

TS: BirdGenie can help users identify almost all of the birds they find in their backyard or local park. People who feed birds, take hikes, or just enjoy walking in their local natural areas are often surrounded by singing birds that can be hard to see. BirdGenie can help them find the identity of these hidden songsters and learn more about the natural bird life around them.

Here’s an example. I was visiting a family that had a small suburban backyard and a couple of bird feeders. They recognized the Blue Jays, Northern Cardinals and Downy Woodpeckers that visited regularly.

But during a June visit I stepped outside and in about fifteen minutes heard over twenty species of birds singing in their yard. We looked them all up and they realized they had seen most of them, at least briefly. The loud singing of one species, the House Wren, had baffled them for years. Once they realized what a great species that is, they went out and bought a small bird house which it now nests in every year.

We’re hoping that being able to identify and learn more about all of their great local birds will allow people to enjoy nature even more than they do now, and maybe even get more involved in conservation.

A screenshot of BirdGenie as it records the bird sounds around its user.

Is BirdGenie kid-friendly?

TS: The better question might be “Is it adult-friendly?!!” Kids are such great adopters of technology that often they’re more fluent with an iPhone or Android than adults. That being said, we spent a lot of time and resources making BirdGenie very easy to use. The screens and user interface are very simple.

That might sound like an obvious thing, but actually it takes a lot of work. Simple is not nearly as easy as confusing! We worked with two different design firms, and have done hours of field testing to make sure everything in the program is simple and intuitive, even for adults!

Beyond helping users learn more about their local birds, does BirdGenie have any other benefits?

TS: Yes, we hope so. Users can choose to share their recordings and IDs with us. This is all anonymous, of course. But these data could be very valuable for research.

For example, right now there is no easy way for a scientist to study how Song Sparrow songs vary across the U.S. and Canada. Of course they could get grant money and spend a year or more traveling from state to state. But with BirdGenie’s shared song data, we could generate a large sample set of Song Sparrow songs all across the U.S.

Once we have it, we will make this information available to researchers, who could then use it to target studies in local song dialects and possibly learn more about how songs are learned, the status and distribution of species during different seasons, and more.

Last question: What is the Match Assist feature in BirdGenie?

TS: Match Assist is a feature unique to BirdGenie. Birds often sing in noisy environments or are difficult to get close enough to for a good recording. Of course BirdGenie uses powerful noise reduction tools. Even if there are people talking or lawn mowers going, BirdGenie can usually isolate the bird song and make the identification.

But in cases of really distant birds, or multiple birds singing at once, we have provided an additional tool to help with the identification process. That’s Match Assist. Using this unique feature, users can choose to answer three or four simple questions about different aspects of a song they recorded. The engine can then use the answers to further isolate a song and make an ID.

For example, one question is whether the bird is singing one sound over and over again, like an American Crow or a Blue Jay, or many different sounds, like a House Wren. This not only can assist in the ID process but also is a way users can learn more about the structure and characteristics of the songs around them.
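One way to picture how answers like this could narrow an identification is the short Python sketch below. It is a guess at the general idea rather than a description of how Match Assist is actually implemented: the trait table, the question key `repeats_one_sound`, and the function name are hypothetical, and only the three species and their repeated-versus-varied behavior come from the example above.

```python
# Hypothetical trait table: does the bird repeat one sound over and over?
CANDIDATE_TRAITS = {
    "American Crow": {"repeats_one_sound": True},
    "Blue Jay": {"repeats_one_sound": True},
    "House Wren": {"repeats_one_sound": False},
}

def narrow_candidates(candidates, answers):
    """Keep candidates whose known traits agree with every answer given.

    Questions the user skipped, or traits that are unknown for a species,
    never rule a candidate out.
    """
    kept = []
    for species in candidates:
        traits = CANDIDATE_TRAITS.get(species, {})
        if all(traits.get(q, a) == a for q, a in answers.items()):
            kept.append(species)
    return kept

candidates = ["American Crow", "Blue Jay", "House Wren"]
print(narrow_candidates(candidates, {"repeats_one_sound": False}))
```

With the answer "many different sounds" only the House Wren remains in this toy list; with no answers at all, every candidate is kept and the recording alone has to decide.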

In many ways, BirdGenie is really an educational tool. It helps users learn more about what birds are singing and living around them, and also helps users become more aware of bird song in general and what makes songs so unique and beautiful.