Intelligent Wildlife Species Detector

Species are auto detected 'in the wild' using machine learning with results transmitted to the cloud

Microphones are used to capture audio data which is then processed using machine learning to identify the animal species, whether it be bird, bat, rodent, whale, dolphin or anything that makes a distinct noise.

The key advantage over existing technology is that the audio data is filtered at source, saving both disk space and human intervention. Previously, recordings could easily run to many hours per day, consuming up to 5 GB per hour of disk space and adversely affecting the zoologist's golfing handicap and social life.

The core ingredients of this project are:

  • Nvidia Jetson Nano development board / Raspberry Pi 4 + 4 GB RAM.
  • Noctua NF-A4x20 5V PWM fan (Nano only)
  • ADC Pi shield for sensing battery and supply voltages.
  • EVM3683-7-QN-01A Evaluation Board for supplying a steady 5 V to the Nano.
  • 5 Inch EDID Capacitive Touch Screen 800x480 HDMI Monitor TFT LCD Display.
  • Dragino LoRa/GPS HAT for transmitting to the 'cloud' (Currently Pi 4 only)
  • 12 V rechargeable battery pack.
  • WavX bioacoustics R software package for wildlife acoustic feature extraction.
  • Random Forest R classification software.
  • In house developed deployment software.
  • Full spectrum (384 kHz sampling rate) audio data.
  • UltraMic 384 usb microphone.
  • Waterproof case Max 004.

What have been the key challenges so far?

  • Choosing the right software. Initially I started off using a package designed for music classification called 'PyAudioAnalysis', which gave options for both Random Forest and then human voice recognition Deep Learning using TensorFlow. Both systems ran OK, but the results were very poor. After some time chatting on this very friendly Facebook group, Bat Call Sound Analysis Workshop, I found a software package written in the R language with a decent tutorial that worked well within a few hours of tweaking. As a rule, if the tutorial is crap, then the software should probably be avoided! The same was true when creating the app with the touchscreen - I found one really good tutorial for GTK 3 + Python, with examples, which set me up for a relatively smooth ride.
  • Finding quality bat data for my country. In theory, there should be numerous databases of full spectrum audio recordings in the UK and France, but when actually trying to download audio files, most of them seem to have been closed down or limited to the more obscure 'social calls'. The only option was to make my own recordings which entailed setting up the device overnight in my local nature reserves, by which I managed to find 7 species of bat. Undoubtedly, the data is the most important part of this project and I spent very many pleasant hours out in the wilderness with the detector and the sounds of these wonderful creatures.
  • Using GTK 3 to produce the app. Whilst Python itself is very well documented on Stack Exchange etc, solving more detailed problems with GTK 3 was hard going. One bug was completely undocumented and took me 3 days to remove! The software is also rather clunky and not particularly user friendly or intuitive. Compared to ordinary programming with Python, GTK was NOT an enjoyable experience, although it's very rewarding to see the app in action.
  • Designing the overall architecture of the app - GTK only covers a very small part of the app - the touch screen display. The rest of it relies on various Bash and Python scripts to interact with the main deployment script which is written in R. Learning the R language was really not a problem as it's a very generic language and only seems to differ in its idiosyncratic use of syntax, just like any other language really. The 'stack' architecture initially started to evolve organically with a lot of trial and error. As a Hacker, I just put it together in a way that seemed logical and did not involve too much work. I'm far too lazy to learn how to build a stack properly or even learn any language properly, but, after giving a presentation to my local university computer department, everybody seemed to agree that that was perfectly OK for product development. Below is a quick sketch of the stack interactions, which will be pure nonsense to most people but is invaluable to remind myself of how it all works:
  • Creating a dynamic barchart - I really wanted to display the results of the bat detection system in the most easy and comprehensive way and the boring old barchart seemed like the way forwards. However, to make it a bit more exciting, I decided to have it update dynamically so that as soon as a bat was detected, the results would appear on...

  • Web Page for Live Updates

    Tegwyn☠Twmffat, 03/31/2020 at 13:12

    Most evenings, currently at about 20.00 UTC, this gadget will be deployed in the wild for rigorous testing and debugging. The barchart updates itself every 3 minutes or so and it's quite fun to see the animals appear during the evening / night.

  • Is it a Bat?

    Tegwyn☠Twmffat, 03/21/2020 at 14:57

    After data augmentation, we get a whole load more spectrogram images, but a lot of them are blank so it's a really good idea to auto delete all the blank ones. This is done through a function called 'Morphology', which basically counts discrete shapes in an image. Very useful !!!

    The image is firstly inverted, then converted to grey scale, then blurred, then a threshold is applied to create a mask, then blended back with the original and finally, the discrete objects are counted. Any image with a count of 1 or more is kept and any image with a count of zero is deleted.
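
The counting step at the end of that chain can be sketched in plain Python. This is only a minimal sketch and not the 'Morphology' function itself: it assumes the image has already been inverted, converted to grey scale and blurred, and arrives as a list of rows of 0 to 255 values. The names `count_objects` and `is_blank` and the threshold of 128 are hypothetical.

```python
from collections import deque


def count_objects(grey, threshold=128):
    """Count 4-connected groups of pixels brighter than `threshold`."""
    height = len(grey)
    width = len(grey[0]) if height else 0
    # The mask marks every pixel that survives the threshold.
    mask = [[pixel > threshold for pixel in row] for row in grey]
    seen = [[False] * width for _ in range(height)]
    objects = 0
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # A new, unvisited bright pixel starts a new object.
            objects += 1
            seen[y][x] = True
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                neighbours = ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1))
                for ny, nx in neighbours:
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return objects


def is_blank(grey, threshold=128):
    """An image with no objects above the threshold is a candidate for deletion."""
    return count_objects(grey, threshold) == 0
```

A flood fill like this is slow on large images; an image library's connected component labelling would do the same job faster, but the keep or delete decision is the same.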

  • Testing Deep Learning

    Tegwyn☠Twmffat, 03/13/2020 at 10:03

    So, about 17,300 spectrograms were trained on the Xavier for about 10 hours. I then fed 177 'unseen' spectrograms into the network with the saved model and each result looked pretty much as above. There were 7 false positives, which was expected, and no obvious signs of over fitting.

  • 17,300 Spectrograms can't be Wrong !

    Tegwyn☠Twmffat, 03/09/2020 at 09:17

    Although the current software stack can automatically generate thousands of spectrograms from almost nothing, each one of them has to be manually checked by eye of human. I tried to train my dog to do this for me, but it just cost me a whole load of German sausage for nothing.

    Here are a few examples of auto generated spectrograms for Daubenton's bat ready for training. Each one takes about 0.5 seconds to check by eye:

  • 4G LTE modem Enabled !!

    Tegwyn☠Twmffat, 03/08/2020 at 12:00

    The Sierra Wireless EM7455 is a high end Cat 6 4G LTE modem that has a whole load of yummy features such as 300 Mbit per sec download and 50 Mbit per sec upload speeds .... Here's the FULL SPEC. In the past I have used some 3G and 2G modems, but only ever got them to work in GPRS mode, which is fine for sending lots of text via some rather insecure methods such as 'get' or 'post' but not good for uploading spectrogram image files to an Amazon server, for example. Another great feature of this device is that it's extremely compact and slots nicely into an M.2 key B connector.

    The device requires a USB carrier board to connect to the Raspberry Pi or Jetson Nano and there are a few possibilities here, although we opted for the Linkwave version and bought a high quality antenna on a 10 metre cable at the same time. Ok, it was expensive, but eventually, the results are worth it as being able to send images quickly means less battery juice being consumed.

    Connecting to the Raspberry Pi was just a matter of installing 'network manager' and creating a modem connection with the correct APN settings. For my network, Three in the UK, the APN was '3Internet' with no password or username. Simple! Getting functionality with the Jetson Nano was a different matter and required doing a live probe on the system drivers being used in the Raspberry Pi using:

    tail -f /var/log/syslog

    ... run in the command line. Eventually I worked out that the most essential driver was qcserial, which is short for 'Qualcomm serial modem', and which then had to be enabled in the Jetson Nano kernel .... So with a fresh 128 GB SD card I flashed the Nano from a host computer using the latest Nvidia SDK Manager package, expanded the file system from 16 GB to 128 GB and started messing with the drivers using these SCRIPTS.

  • View the Bat Detector LIVE in Action

    Tegwyn☠Twmffat, 03/04/2020 at 13:16

    Click for live bat detection results

  • From Random Forest to Inception V3

    Tegwyn☠Twmffat, 02/28/2020 at 14:14

    Now to instigate Deep Learning. As opposed to the Random Forest method, using a convolutional network such as Google's Inception does not require features to be extracted using custom coded algorithms targeted to the animal's echolocation voice. The network works it all out on its own and a good pre-trained model already has a shed load of features defined which can be recycled and applied to new images which are completely unrelated to the old ones. Sounds too good to be true?

    To start with, I was very sceptical about it being able to tell the difference between identical calls at different frequencies, which is important when trying to classify members of the Pipistrelle genus. Basically, calls above 50 kHz are from Soprano Pips and calls under 50 kHz are Common Pips (There are other species in this genus, but not where I live!). So can the network tell the difference? The answer is both yes and no since we are forgetting one major tool at our disposal - data augmentation.

    Data augmentation can take many different forms such as flipping the image either horizontally or vertically or both, to give us 4x more data. Great for photos of dogs, but totally inappropriate for bat calls ! (bats never speak in reverse or upside down). Scaling and cropping are also inappropriate as we need to keep the frequency axis intact. Possibly the only thing we can do is move the main echolocation calls in the time axis ..... so if we have a 20 msec call in a 500 msec audio file we could shift that call sideways in the time frame as much as we wanted. I chose to shift it (about) 64 times with some simple code to create a 'sliding window'. The code uses the ImageMagick 'mogrify' command, called from Bash, which, strangely, only works properly on .jpg images.

    Essentially, it involves 8 of these:

    # Work inside the folder that holds the spectrograms:
    dir=/media/tegwyn/Xavier_SD/dog-breed-identification/build/plecotus_test_spectographs/test
    cd "$dir" || exit 1
    # Convert all .png files to .jpg or else mogrify won't work properly:
    ls -1 *.png | xargs -n 1 bash -c 'convert "$0" "${0%.png}.jpg"'
    # Delete the .png files:
    find . -maxdepth 1 -type f -iname \*.png -delete
    # Now split up all the 0.5 second long files into 8 parts of 680 pixels each:
    for file in *.jpg; do
        fname="${file%.jpg}"
        # Crop the full image down to a 5500x680 strip:
        mogrify -crop 5500x680+220+340 "$fname".jpg
        # Produce images 1 to 8, stepping 680 pixels along the time axis each time:
        for i in 1 2 3 4 5 6 7 8; do
            cp "$fname".jpg "$fname"_"$i".jpg
            mogrify -crop 680x680+$(( (i - 1) * 680 ))+0 "$fname"_"$i".jpg
        done
    done

    The final code is a bit more complicated than this, but not by much!
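
The crop offsets used by the sliding window follow a simple pattern, which can be sketched in Python. This is a minimal sketch under stated assumptions: the 5500 pixel strip and 680 pixel window come from the commands above, while the function name `crop_geometries` and the 76 pixel step in the usage note are hypothetical and not taken from the final code.

```python
def crop_geometries(strip_width=5500, window=680, step=680):
    """Return ImageMagick style WxH+X+Y strings for every window that fits in the strip."""
    geometries = []
    x = 0
    while x + window <= strip_width:
        geometries.append(f"{window}x{window}+{x}+0")
        x += step
    return geometries


if __name__ == "__main__":
    for geometry in crop_geometries():
        print(geometry)
```

With the default 680 pixel step this gives the 8 non-overlapping windows shown above. A smaller step makes the windows overlap and multiplies the data: a step of 76 pixels, for example, fits 64 windows into the same strip.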

    Suddenly we've got about 64x the amount of data. But what do the images contain? What if they contain...

  • What else can this be used for?

    Tegwyn☠Twmffat, 02/25/2020 at 09:45

    Whilst wildlife is obviously intelligent, an intelligent wildlife detector uses so called artificial intelligence to analyse audio recordings and calculate probabilities that a certain species was roaming around in the near vicinity. Most importantly, the software is open source so it can be adjusted to work in any geographical area on different levels. For example, here in the UK, the system is set up to detect my local bat species and I can also select genus or even just animal, in which case it will give results according to whether the audio was of a bat, a rodent or a cricket.

    Obviously not everybody is going to have the skills to hack the software, but to someone who does have the skills, it's actually not that hard to train the system to work on, for example, the local birds in the area. The hard part is actually getting all the recordings and sorting them into the different species.

    So, assuming the software is now tuned on the local animals / birds, the device would now be deployed out in the wild to seamlessly record chunks of audio and analyse each one of them on the spot to try and find something interesting. If the system finds something, even just a snapping twig, it will save the audio file, renaming it to 'rubbish' or such like. If it finds nothing, it will delete the file, saving space on the file storage device. More to the point, if it finds an interesting animal it will rename the file with the confidence of successful classification, the animal name and the date, e.g. 95%_Nattereri_02:07:2020_18:44.wav. This saves a huge amount of time manually sorting through hundreds of GB of data and manually naming files.
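
The renaming scheme can be sketched in a few lines of Python. This is a hedged illustration rather than the deployment script: the function name `classified_name` is hypothetical, and the date is assumed to be written day first, which the example file name alone does not confirm.

```python
from datetime import datetime


def classified_name(species, confidence, when):
    """Build a file name such as 95%_Nattereri_02:07:2020_18:44.wav.

    `confidence` is a probability between 0 and 1.
    Assumption: the date part is day:month:year.
    """
    percent = round(confidence * 100)
    stamp = when.strftime("%d:%m:%Y_%H:%M")
    return f"{percent}%_{species}_{stamp}.wav"


if __name__ == "__main__":
    print(classified_name("Nattereri", 0.95, datetime(2020, 7, 2, 18, 44)))
```

Colons are fine in Linux file names, which is what the Nano and the Pi use, but the files would need renaming before being copied to a Windows machine.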

    The device itself has a lot of other functions, including displaying the data in a dynamic bar-chart on a small touchscreen. There's also the capability to transmit some of the data to the cloud via a radio link or the 4G cell phone network although much of our wildlife has the tendency to inhabit rather remote locations, far away from human communication infrastructure. For example, connection to the cloud is not going to work in the middle of the Amazon rain forest without a more expensive satellite transmitter.

    It is possible to leave the device out in the wilderness for a long period of time but it would need an external battery and solar panel to keep it alive. It would also need to be hidden and protected from theft or tampering, whether that be from human beings or other curious wildlife.

  • It's Pie for Dinner

    Tegwyn☠Twmffat, 02/18/2020 at 10:32

    LoRa is not meant for big data. To get long range transmission with low power, the transmit time can be quite long, about a second in this case, and the TTN has a fair use policy of 30 seconds of airtime per day. For developers, this is a challenge and creating a simple bar chart as above is no easy feat. The data I wanted to transmit is represented in the table, top left, so obviously trying to transmit this in one go is a non-starter. Instead, I transmitted one species at a time, presuming that normally we would not have more than 2 or 3 species being detected in one evening. This worked quite well except, as can be seen in the middle right, there are gaps in the data! This was solved by iterating through the species repeatedly and programming the channel to give precedence to the most recent reading for that species. After getting familiar with MATLAB, this was actually quite fun and eventually I came up with quite a neat bit of code:

    % Channel ID to read data from
    readChannelID = 744655;
    SpeciesIDFieldID = 3;
    AudioEventsFieldID = 4;
    % Channel Read API Key
    % If your channel is private, then enter the read API
    % Key between the '' below:
    readAPIKey = '';
    % Fetch the last 10 points from the Bat Detector channel.
    % Assumption: the arguments after each '...' are reconstructed, not original:
    speciesID = thingSpeakRead(readChannelID,'Fields',SpeciesIDFieldID,'NumPoints',10,...
        'ReadKey',readAPIKey);
    numAudioEvents = thingSpeakRead(readChannelID,'Fields',AudioEventsFieldID,...
        'NumPoints',10,'ReadKey',readAPIKey);
    % Pair up the two columns and keep only the most recent row for each species:
    A = [speciesID numAudioEvents]
    [~,b] = unique(A(:,1),'last')
    C = A(b,:)
    % Extract the first column, species label:
    D = C(:,1)
    % Extract the second column, audio events:
    E = C(:,2)
    labels = string(D)
    x = E
    % This is where the mapping assignment occurs:
    % May want to remove House keys or try and divide by 10 or something.
    numbers_to_decode = {'0','17','26','32','35','71','92'};
    % Assumption: placeholder names; substitute the real labels, one per key above:
    names = {'name_0','name_17','name_26','name_32','name_35','name_71','name_92'};
    M = containers.Map(numbers_to_decode,names)
    k = keys(M) ;
    val = values(M) ;
    % Now get the species name from the numbers using the map
    % (assumption: the loop body is reconstructed, not original):
    z = strings(length(labels),1);
    for i = 1:length(labels)
        z(i) = M(char(labels(i)));
    end
    % Trim the length of the audio events vector to fit that of species:
    len_z = length(z);
    len_x = length(x);
    len_xy = (len_x - len_z);
    % Trim x vector to match z:
    x = x(len_xy +1 : end);

    This was typed into the 'visualisations' section and tested by hitting the 'save and run' button.
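
For anyone more at home in Python than MATLAB, the 'most recent reading wins' step (the unique(...,'last') call above) can be sketched as follows. This is an illustrative equivalent only, with a hypothetical function name; it is not part of the ThingSpeak visualisation.

```python
def latest_per_species(species_ids, audio_events):
    """Keep only the most recent audio event count for each species ID.

    Both lists are in arrival order, oldest first, as read from the channel.
    Returns (species_id, events) pairs sorted by species ID.
    """
    latest = {}
    for species, events in zip(species_ids, audio_events):
        # A later reading for the same species overwrites the earlier one.
        latest[species] = events
    return sorted(latest.items())


if __name__ == "__main__":
    print(latest_per_species([17, 26, 17, 35, 26], [3, 1, 5, 2, 4]))
```

Sorting by species ID mirrors what unique does in MATLAB, so the bars keep a stable order from one refresh to the next.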

    I can't stress enough how incredibly user friendly ThingSpeak is and they give quite a generous 3 million data points and 4 channels before it's necessary to buy a license.
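
The airtime limit and the one-species-at-a-time scheme described in this log boil down to a little arithmetic, sketched here in Python. The figures of 30 seconds per day and about 1 second per uplink come from the text; the function names are hypothetical and the even spacing over 24 hours is an assumption, since the detector is mostly active in the evening and at night.

```python
from itertools import cycle, islice

SECONDS_PER_DAY = 24 * 60 * 60


def max_uplinks_per_day(airtime_s, daily_budget_s=30.0):
    """How many uplinks of `airtime_s` seconds fit inside the daily airtime budget."""
    return int(daily_budget_s // airtime_s)


def min_interval_s(airtime_s, daily_budget_s=30.0):
    """Spacing between uplinks if the budget is spread evenly over 24 hours."""
    return SECONDS_PER_DAY / max_uplinks_per_day(airtime_s, daily_budget_s)


def uplink_schedule(species, uplinks):
    """Iterate through the detected species repeatedly, one species per uplink."""
    return list(islice(cycle(species), uplinks))


if __name__ == "__main__":
    print(max_uplinks_per_day(1.0))   # uplinks available per day
    print(min_interval_s(1.0) / 60)   # minutes between uplinks
    print(uplink_schedule(["17", "26", "35"], 5))
```

At about a second of airtime each, the budget allows roughly 30 uplinks a day, so with 2 or 3 species each one gets refreshed around 10 to 15 times.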

  • IOT: Integration to the Cloud

    Tegwyn☠Twmffat, 02/14/2020 at 13:58

    I followed the Adafruit tutorial to connect to The Things Network (TTN). This will get data through a local gateway if one is in range, but will not store the data or produce fancy graphs:

    First step accomplished: Get data to the TTN!

    Next, create a Payload Format:

    function Decoder(bytes, port) {
      // Decode an uplink message from a buffer
      // (array) of bytes to an object of fields.
      // Each field is a 16-bit value, sent high byte first.
      var decoded = {};
      //if (port === 1)  decoded.temp = (bytes[1] + bytes[0] * 256)/100;
      if (port === 1)  decoded.field1 = (bytes[1] + bytes[0] * 256)/100;
      //if (port === 1)  decoded.humid = (bytes[3] + bytes[2] * 256)/100;
      if (port === 1)  decoded.field2 = (bytes[3] + bytes[2] * 256)/100;
      //if (port === 1)  decoded.batSpecies = (bytes[5] + bytes[4] * 256);
      if (port === 1)  decoded.field3 = (bytes[5] + bytes[4] * 256);
      return decoded;
    }
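
The device has to pack its readings in the same byte order that this decoder expects. The sending code is not shown in this log, so the following Python is only a sketch of a matching encoder, with hypothetical function names: each value takes two bytes, high byte first, and the first two fields are multiplied by 100 before sending.

```python
def encode_payload(field1, field2, species_id):
    """Pack three readings into 6 bytes, two bytes each, high byte first.

    field1 and field2 are scaled by 100 to keep two decimal places;
    species_id is sent as a plain whole number.
    """
    values = (round(field1 * 100), round(field2 * 100), species_id)
    payload = bytearray()
    for value in values:
        if not 0 <= value <= 0xFFFF:
            raise ValueError(f"{value} does not fit in two bytes")
        payload += value.to_bytes(2, "big")
    return bytes(payload)


def decode_payload(data):
    """Python mirror of the TTN Decoder function, useful for checking the encoder."""
    return {
        "field1": (data[1] + data[0] * 256) / 100,
        "field2": (data[3] + data[2] * 256) / 100,
        "field3": data[5] + data[4] * 256,
    }


if __name__ == "__main__":
    packet = encode_payload(21.5, 48.25, 35)
    print(packet.hex(), decode_payload(packet))
```

Six bytes per uplink keeps the airtime short, which matters given the TTN fair use policy.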

    The data stream will now look like this:

    1. Register with ThingSpeak and find 'Create new Channel' to process the data. Their instructions are very good and it's dead easy!
    2. Go back to TTN and find the 'Integrations' tab and add ThingSpeak with the appropriate channel ID and write API key from the new ThingSpeak channel.

    Time to render some pretty graphs:

  • 1
    Install the software
    1. Flash a 128 GB SD card with the latest Jetson Nano image using Balena Etcher or such like.
    2. Boot up the Nano and create a new user called tegwyn.
    3. Open a terminal and run: git clone
    4. Find the file: in ultrasonic_classifier and open it in a text editor.
    5. Follow the instructions within. The whole script could be run from a terminal using:
    6. cd /home/tegwyn/ultrasonic_classifier/ && chmod 775 && bash

      .... Or install each line one by one for better certainty of success.

    7. Find the file: in the ultrasonic_classifier directory and edit the password at line 6 accordingly.
    8. Find the bat icon on the desktop and double click to run the app.
  • 2
    Wire up the Nano
    1. Plug the AB Electronics ADC hat onto the Nano, green terminals facing away from the large black heat sink.
    2. Screw the fan onto the heat exchanger and plug it into the 'fan' socket.
    3. Find R10 on the Monolithic eval board and replace it with a high precision 270 ohm 0806 resistor. Check that this now outputs 5.0 V with a voltmeter.
    4. Wire in the power supply with a 48.7 K resistor from the 12 V battery pack to analog pin 1 on the ADC hat.
    5. Wire the 5 V out to ADC pin 2 via a 3.3 K resistor.
    6. Find J48 on the Nano and attach a jumper.
    7. Connect eval board to Nano via the DC 2.1 mm socket.
    8. Connect the USB and HDMI cables to the touchscreen.
    9. Set the enable switch on the eval board to 'on'.
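
The series resistors in steps 4 and 5 scale the battery and 5 V rail down to something the ADC can read, and the software then has to scale the readings back up. The following Python is a sketch of that arithmetic only. It assumes the ADC input behaves like a single resistance to ground, and the 16.8 K figure used in the example is a placeholder that should be checked against the ADC hat's documentation; the function name is hypothetical.

```python
def source_voltage(pin_voltage, series_ohms, input_ohms):
    """Voltage on the far side of a series resistor feeding an ADC input.

    pin_voltage: voltage seen at the ADC pin.
    series_ohms: the added series resistor (48.7 K or 3.3 K in the steps above).
    input_ohms:  assumed resistance from the ADC pin to ground (placeholder value).
    """
    return pin_voltage * (series_ohms + input_ohms) / input_ohms


if __name__ == "__main__":
    # Example with the placeholder input resistance of 16.8 K:
    print(round(source_voltage(3.0, 48_700, 16_800), 2))
```

In practice the easiest way to calibrate is to measure the battery with a voltmeter and adjust the scaling factor until the reading on the screen agrees.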

jibril wrote 02/28/2020 at 14:33

nice project

Ken Yap wrote 02/08/2020 at 00:03

Hmm, is your instrument intelligent or does it detect intelligent bats? Or both? I wouldn't be surprised if they are intelligent too.

Tegwyn☠Twmffat wrote 02/08/2020 at 09:36

The bats are definitely intelligent, so yes the gadget does both. However, bats can not talk to each other beyond a basic level so can not debate the meaning of life, for example ...... Unless I'm missing something?

Tegwyn☠Twmffat wrote 02/08/2020 at 09:38

..... Of course, the bats are a lot more intelligent than this gadget :)

Dan Maloney wrote 02/07/2020 at 16:46

It's interesting that bats have social calls distinct from their echolocation sounds - didn't know that, but probably should have guessed. Curious how you found out which six bat species you have if you couldn't find a decent database of bat sounds - or did you just determine from the calls that there are six different species yet-to-be-identified?

Interesting work. We used to have bats come out every night around our house, and I loved watching them maneuver about. Always wondered what they were doing up high when the mosquitoes I was told makes up most of their diet would seem to need to stay close to the ground to feast on us mammals.

Tegwyn☠Twmffat wrote 02/07/2020 at 17:38

Hello Dan - Great to hear from you! I bought a couple of books on analysing British and European bats and quizzed the guys on Facebook about some of the more tricky species. Some people are incredibly helpful! As for your bats, different species have different feeding habits - some will feed up high and some even specialise in swooping down over areas of water. I don't know much about USA bats so could not say what they might be :( Generally, as you indicated, insects are attracted to mammals such as cows and their dung so the bats will be associated with cows etc grazing in pastures.
