Elephant AI

a system to prevent human-elephant conflict by detecting elephants using machine vision, and warning humans and/or repelling elephants

The conflict that arises between humans and elephants in countries such as India, Sri Lanka, and Kenya claims many hundreds of human and elephant lives per year. These negative interactions arise when humans meet elephants on their trails, when elephants raid fields for food, and when elephants try to cross railways. Machine vision and automated deterrence can mitigate such conflict.


This is an evolution of my 'Automated Elephant-detection system' that was a semi-finalist in the Hackaday Prize 2016. The current project differs substantially: it makes use of more advanced machine vision techniques, replaces RF communications and village base stations with cellular data (4G/3G/EDGE/GPRS), and adds elephant-deterrence devices so that interaction between humans and elephants is eliminated whenever possible.

So, let's get to the primary goals of Elephant AI:

  • Eliminate contact between humans and elephants
  • Protect elephants from injury and death
  • Protect humans from injury and death

How will the Elephant AI accomplish these goals?

  • Detect elephants as they move along their regular paths. These paths have been used by elephants for many years (perhaps centuries) and often cut through areas now used by humans. Humans will be warned that elephants are moving on the paths so they can stay away or move with caution.
  • Detect elephants as they leave forested areas to raid human crop fields. At this point, elephant deterrence devices will attempt to automatically scare elephants. This will be using sounds of animals they dislike (e.g. bees and tigers, and human voices in the case of Maasai people in Kenya/Tanzania), and perhaps by firing chili balls into the paths of the elephants from compressed air guns.
  • Detect elephants before they stray onto railway lines. This can be done via a combination of machine vision techniques and more low-tech IR (or laser) break-beam sensors. Train drivers can be alerted to slow down and stop before hitting the elephants that are crossing.

Just how bad is it for humans and elephants to interact? This video, shot several months ago, in India, gives some idea. It is really bad indeed. It causes great stress to elephants, and puts both the elephants and humans at risk of injury or death.

That's why Elephant AI wants to take human-elephant interaction out of the equation entirely!


We need a daylight camera (IR-filtered) and a night camera (NoIR, i.e. no IR filter, plus an IR illumination array) since elephants need to be detected 24hrs per day! In my original project I completely forgot about this, then decided to multiplex cameras to one Raspberry Pi. It was actually cheaper and easier to use two Raspberry Pis, each with its own camera. Night-time and daytime classification of elephant images both need their own trained object detector anyway, so I don't think it's such a bad solution (for now).


This is the main part of the project. In my original automated elephant detection project I'd envisaged just comparing histograms!! Or, failing that, I'd try feature-matching with FLANN. Both of these proved to be completely rubbish at detecting elephants! I tried Haar cascades too, but these had lots of false positives and literally took several weeks to train!

I'm currently working with an object detector using Histogram of Oriented Gradients (HOG) and Linear Support Vector Machines (SVM). That's working out well so far. But I'm also looking to try out TensorFlow. That is, using a pre-trained model (e.g. Inception) but with a retrained final layer.
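To give a feel for what a HOG feature is, here is a minimal sketch in plain Python of the histogram for a single cell. It is a simplification and not the project's code: a full HOG descriptor also groups cells into blocks and normalizes them, and the resulting vector is what a linear SVM would classify. The function name, the 9-bin choice, and the toy image are assumptions for illustration.

```python
import math

def cell_orientation_histogram(cell, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations
    (0 to 180 degrees) for one grayscale cell given as a list of rows."""
    height, width = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    # Central differences need a neighbour on each side, so skip the border.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            # Fold the direction into 0..180 so opposite gradients share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist

# Toy cell: brightness rises left to right, so all gradients point along x.
horizontal_ramp = [[10 * x for x in range(8)] for _ in range(8)]
# Toy cell: brightness rises top to bottom, so all gradients point along y.
vertical_ramp = [[10 * y for _ in range(8)] for y in range(8)]

print(cell_orientation_histogram(horizontal_ramp))
print(cell_orientation_histogram(vertical_ramp))
```

The two toy cells put all of their gradient energy into different bins, which is the property that lets a classifier tell edge directions (and so shapes) apart.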

After the object detector gives us a true or false for detection of elephant, the detection device will:

  1. upload the image to a web server (via 4G/3G/GPRS modem)
  2. notify the cell phones on a contact list of the detection (via SMS)
  3. activate deterrence devices

Step one is actually a great thing for education as these images could be shared with schools!
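The three steps above could be wired together roughly like this. This is only a sketch under assumptions: `handle_detection` and the three callables passed to it are hypothetical stand-ins for the real upload, SMS, and deterrence code, none of which is shown in this project yet.

```python
def handle_detection(image_path, elephant_detected, upload, notify, deter):
    """Run the three follow-up actions in order when an elephant is detected.

    upload, notify and deter are caller-supplied callables (hypothetical
    stand-ins for the web upload, the SMS alert and the deterrence hardware).
    Returns the names of the actions that were performed.
    """
    if not elephant_detected:
        return []
    performed = []
    steps = (
        ("upload", lambda: upload(image_path)),
        ("notify", notify),
        ("deter", deter),
    )
    for name, action in steps:
        action()
        performed.append(name)
    return performed

# Usage with dummy actions that only record what happened.
log = []
result = handle_detection(
    "capture_0001.jpg",
    True,
    upload=lambda path: log.append("uploaded " + path),
    notify=lambda: log.append("sms sent"),
    deter=lambda: log.append("deterrent on"),
)
print(result, log)
```

Passing the actions in as callables keeps the decision logic testable on a laptop without a modem or deterrence device attached.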

Image classification update logs:

#3 result for object detector using Histogram of Oriented Gradients (HOG) and Linear Support Vector Machines (SVM)

#4 result for object detector using Histogram of Oriented Gradients (HOG) and Linear Support Vector Machines (SVM)

  • 2 × Raspberry Pi 3 Model B [detection device]
  • 1 × Raspberry Pi Camera Module v2 (8MP) Standard [detection device] daytime usage [£29]
  • 1 × Raspberry Pi Camera Module v2 (8MP) NoIR [detection device] nighttime usage (no IR filter) [£29]

  • #4 result for object detector using Histogram of Oriented Gradients (HOG) and Linear Support Vector Machines (SVM)

    Neil K. Sheridan, 04/15/2017 at 20:18, 0 comments

    Hi, this is the result so far from my larger-scale training run (see here). Unfortunately, it wasn't quite as large-scale as I hoped due to problems with the color-spaces and sizes of images from the caltech-256 dataset.

    The training run entailed the usage of:

    • 175 positive elephant images (half of the 350 that I'd obtained and cropped manually)
    • 2000 (negative) non-elephant images from caltech dataset. These are mostly of animals
    • Hard-negative mining on 50 negative images

    For the elephant images I included front-view, side-view at various angles, rear view (elephant bum), and close-up of elephant faces (range <1m).

    Workflow using m4.4xlarge EC2 instance:

    • Extract features (time = 15 minutes)
    • Train model (time = 120 minutes)
    • Hard-negative mining (time = 120 minutes)
    • Train model adding hard-negatives (time = 120 minutes)
    • Download model (time = 60 minutes)

    Approx cost using m4.4xlarge EC2 instance: $7
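As a quick check of that figure, the stage times from the workflow can be added up and multiplied by the hourly rate. This sketch assumes the $0.998 per hour m4.4xlarge rate quoted in the EC2 setup log, and ignores storage and data transfer charges.

```python
# Stage durations in minutes, copied from the workflow list above.
stage_minutes = {
    "extract features": 15,
    "train model": 120,
    "hard-negative mining": 120,
    "retrain with hard negatives": 120,
    "download model": 60,
}

# Assumed on-demand rate for m4.4xlarge, as quoted in the EC2 setup log.
HOURLY_RATE_USD = 0.998

total_minutes = sum(stage_minutes.values())
total_hours = total_minutes / 60
estimated_cost = total_hours * HOURLY_RATE_USD

print("total time: %d min (%.2f h)" % (total_minutes, total_hours))
print("estimated cost: $%.2f" % estimated_cost)
```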


    So far I've only had time to test the object detector on testing sets of 10 elephant images and 10 non-elephant animal images (animals likely to be present in area: sloth bears, wild pigs, cows, water buffalo, tigers, deer, humans). I was quite pleased with the results really tho! I got 0% false-negatives (i.e. failures to detect elephants when elephants present), and 20% false-positives (i.e. detect elephants when elephants not present).
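For reference, this is how those two percentages fall out of the raw test results. It is a small sketch: the function name is hypothetical, and the boolean lists simply encode the outcome described above (all 10 elephants detected, 2 of the 10 other animals wrongly flagged).

```python
def error_rates(elephant_results, non_elephant_results):
    """Each argument is a list of booleans: True means the detector fired.

    Returns (false_negative_rate, false_positive_rate).
    """
    false_negatives = sum(1 for fired in elephant_results if not fired)
    false_positives = sum(1 for fired in non_elephant_results if fired)
    return (
        false_negatives / len(elephant_results),
        false_positives / len(non_elephant_results),
    )

# Outcome of the small test: every elephant found, two false alarms.
elephants = [True] * 10
non_elephants = [True] * 2 + [False] * 8

fn_rate, fp_rate = error_rates(elephants, non_elephants)
print("false negatives: %.0f%%, false positives: %.0f%%" % (fn_rate * 100, fp_rate * 100))
```

With only 10 images per class, each single image shifts a rate by 10 percentage points, so these numbers are a rough indication only.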

    [Image: Examples of animals in the testing set]

    Interestingly, the false-positives occurred with animals having bums which looked like elephant bums! The water buffalo, which really did look like an elephant bum even to me! And the cow, which looked a bit like one. However, the color was wrong. But this object detector is using grayscale not BGR.

    [Image: false-positive with a water buffalo]

    Next steps:

    • Removal of elephant bums (rear-view) from the training set of positive images. Since elephants are going to trigger PIR when they approach rather than walk away!
    • Removal of close-up elephant faces (<1m range). Since we should have detected them way before they get this close to the camera! If they are this close they are probably going to do something nasty to the camera!!
    • Fix the problems with the negative images that caused issues. I'll have to look through these manually and see what exactly the problems are! Very tedious I expect!
    • So increase negative images up to 4000, and use the full 350 positive images

  • Retraining TensorFlow Inception v3 using TensorFlow-Slim (Part 1)

    Neil K. Sheridan, 04/08/2017 at 19:19, 0 comments

    Hi, I got started with TensorFlow today using TensorFlow-Slim. This is a "lightweight high-level API of TensorFlow (tensorflow.contrib.slim) for defining, training and evaluating complex models in TensorFlow". There's code in the repository here for retraining many of the Convolutional Neural Network (CNN) image classification models. I'll be retraining Inception v3. I'm logging my full protocol here, so you can play/follow-along if you want!

    The maintainers of TensorFlow-Slim are:

    The code in the repository is licensed under (unless otherwise stated).

    Protocol for retraining Inception v3 using the flowers dataset with TensorFlow-Slim:

    1. $ virtualenv --system-site-packages tensorflow
    2. $ source ~/tensorflow/bin/activate # bash, sh, ksh, or zsh
    3. $ pip install --upgrade tensorflow # for Python 2.7 [See for more details]
    4. $ source ~/tensorflow/bin/activate # bash, sh, ksh, or zsh
    5. Validate with a python hello tensorflow:

      import tensorflow as tf
      hello = tf.constant('Hello, TensorFlow!')
      sess = tf.Session()
      print(sess.run(hello))

    6. Create a directory, and install the TF-Slim image models library with:

      $ git clone

      7. $ cd models/slim and create a directory to download the flowers dataset to. This dataset has 5 categories of flowers with 2500 flower images. So mkdir DATASET

      8. Next you need to download the dataset and convert it to TensorFlow's native TFRecord format. Happily, the scripts for this are already written, and we got them from GitHub with the repository. So here we run it:

      $ python \
          --dataset_name=flowers \

      9. The TFRecord files should be all in your DATASET directory now! They actually wrote an .sh to do it all, but I did it myself since I'm using an EC2 machine and I prefer to do it myself anyway.

      10. Make a pre-trained checkpoint directory mkdir PRETRAINEDCHECKPOINTDIR

      11. Get the checkpoint for Inception v3 and put it in the PRETRAINEDCHECKPOINTDIR

      $ wget
      $ tar -xvf inception_v3_2016_08_28.tar.gz
      $ mv inception_v3.ckpt PRETRAINEDCHECKPOINTDIR

      12. Now we mkdir TRAINDIR, and we retrain the final layers of the CNN for 1000 steps using the new flowers dataset!

      python \
        --train_dir=TRAINDIR \
        --dataset_name=flowers \
        --dataset_split_name=train \
        --dataset_dir=DATASET \
        --model_name=inception_v3 \
        --checkpoint_path=PRETRAINEDCHECKPOINTDIR/inception_v3.ckpt \
        --checkpoint_exclude_scopes=InceptionV3/Logits,InceptionV3/AuxLogits \
        --trainable_scopes=InceptionV3/Logits,InceptionV3/AuxLogits \
        --max_number_of_steps=1000 \
        --batch_size=32 \
        --learning_rate=0.01 \
        --learning_rate_decay_type=fixed \
        --save_interval_secs=60 \
        --save_summaries_secs=60 \
        --log_every_n_steps=100 \
        --optimizer=rmsprop \
      I got the following error when I tried on my i5 laptop with 8GB!

      That's a memory error in this context. Anyway, it worked fine on the Amazon EC2 machine. It was an m4.4xlarge with 64GB.

      It was quite slow. It took 1hr to complete the 1000 steps using CPU only. Also you can see in errors: since TensorFlow was installed using pip (I didn't compile it), it isn't using Intel SSE4 SIMD instruction set etc.

      13. So after it finished the 1000 step retraining, we run the evaluation, to see how good it is at classifying those new flowers!

      python \
        --checkpoint_path=TRAINDIR \
        --eval_dir=TRAINDIR \
        --dataset_name=flowers \
        --dataset_split_name=validation \
        --dataset_dir=DATASET \
      Here's what I got!

      So 0.77 accuracy. I'm not sure if that is good or bad!

      14. Next is to retrain for 500 more steps. And then evaluate again.. But I didn't get that far..

      python \

  • Plan for large-scale training run with object detector using Histogram of Oriented Gradients (HOG) and Linear Support Vector Machines (SVM)

    Neil K. Sheridan, 04/07/2017 at 19:30, 0 comments

    In this large-scale training run I'll be using 5177 negative images, and 750 positive elephant-images. I'll attempt hard-negative mining on 500 images. Hard-negative mining is outlined in the following paper 'Object Detection with Discriminatively Trained Part Based Models' (Felzenszwalb et al.).
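In outline, hard-negative mining means running the current detector over images known to contain no elephants, and keeping the windows it wrongly scores as elephants so they can be added to the negative set for retraining. A minimal sketch of that selection step follows. The function name, the `score_fn` callable, and the toy scores are all assumptions; in the real pipeline the score would come from the trained SVM applied to a window's HOG features.

```python
def mine_hard_negatives(negative_windows, score_fn, threshold=0.0, limit=None):
    """Return windows from known-negative images that the current model
    still scores above the threshold (false positives), highest score first.

    score_fn is a caller-supplied callable mapping a window to a score,
    e.g. the SVM decision value (hypothetical here).
    """
    scored = [(score_fn(window), window) for window in negative_windows]
    hard = [(score, window) for score, window in scored if score > threshold]
    hard.sort(key=lambda pair: pair[0], reverse=True)
    windows = [window for _, window in hard]
    return windows if limit is None else windows[:limit]

# Toy example: window names mapped to made-up detector scores.
toy_scores = {
    "buffalo_rear": 1.4,
    "cow_side": 0.3,
    "palm_tree": -0.9,
    "tractor": -1.7,
}
hard_negatives = mine_hard_negatives(list(toy_scores), toy_scores.get)
print(hard_negatives)
```

The mined windows are then appended to the negative features and the model is trained again, which is the "train model adding hard-negatives" stage in the workflow logs.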

    I'll be using the following EC2 instance: r4.16xlarge, with vCPU=64, ECU=195, and Memory GB=488. It's about $5/hr for rental. So hopefully I won't get broken pipe on the SSH session near the end!

    Storage is 300GiB with provisioned IOPS SSD, 15000 IOPS. So that's extra cost I expect!

    Negative images to use:

        • chimps = 110
        • bears = 102
        • blimps = 86
        • bonsai = 122
        • bulldozers = 110
        • cactus = 114
        • camel = 110
        • canoe = 104
        • covered wagon = 97
        • cormorant = 106
        • dog = 103
        • duck = 87
        • elk = 101
        • fern = 110
        • firetruck = 118
        • giraffe = 84
        • goat = 112
        • goose = 110
        • gorilla = 212
        • horse = 270
        • ibis = 120
        • kangaroo = 82
        • leopards = 190
        • llama = 119
        • ostrich = 109
        • owl = 120
        • palm tree = 105
        • people = 209
        • porcupine = 101
        • raccoon = 140
        • skunk = 81
        • snake = 112
        • swan = 115
        • touring bike = 110
        • car side view = 116
        • zebra = 96
        • greyhound = 95
        • toads = 108
        • rhinos (own dataset) = 95
        • [running total: 4591]
        • India cows (own dataset) = 41
        • Sloth Bears (own dataset) = 45
        • buffalo = 250
        • tigers (2) = 250
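The list above can be tallied with a few lines of Python to confirm that it adds up to the 5177 negative images planned for this run. The counts are copied from the list; the dictionary names are just for illustration.

```python
# Counts up to the running total in the list above.
first_group = {
    "chimps": 110, "bears": 102, "blimps": 86, "bonsai": 122,
    "bulldozers": 110, "cactus": 114, "camel": 110, "canoe": 104,
    "covered wagon": 97, "cormorant": 106, "dog": 103, "duck": 87,
    "elk": 101, "fern": 110, "firetruck": 118, "giraffe": 84,
    "goat": 112, "goose": 110, "gorilla": 212, "horse": 270,
    "ibis": 120, "kangaroo": 82, "leopards": 190, "llama": 119,
    "ostrich": 109, "owl": 120, "palm tree": 105, "people": 209,
    "porcupine": 101, "raccoon": 140, "skunk": 81, "snake": 112,
    "swan": 115, "touring bike": 110, "car side view": 116,
    "zebra": 96, "greyhound": 95, "toads": 108,
    "rhinos (own dataset)": 95,
}

# Categories listed after the running total.
second_group = {
    "India cows (own dataset)": 41,
    "Sloth Bears (own dataset)": 45,
    "buffalo": 250,
    "tigers (2)": 250,
}

running_total = sum(first_group.values())
grand_total = running_total + sum(second_group.values())
print(running_total, grand_total)
```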

  • Positive training images: bounding boxes or not?

    Neil K. Sheridan, 04/06/2017 at 20:51, 0 comments

    Now, what I have been doing is putting bounding boxes around the elephants in my positive training images. Then, collecting the xy coords for these in files, which I passed to python, which would cut-out the regions of interest that contained elephants, before passing them to the HOG feature extractor. The problem is the bounding boxes contained other bits of non-elephant things in some cases. Well, they'd always contain grass, sky, rocks, branches, etc.
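The bounding-box step described above amounts to reading four coordinates and slicing out that rectangle. Here is a minimal sketch using a plain list-of-rows image so it runs without any imaging library. The "x1 y1 x2 y2" line format and both function names are assumptions, since the actual coordinate files are not shown.

```python
def parse_box(line):
    """Parse one 'x1 y1 x2 y2' line (hypothetical file format)."""
    x1, y1, x2, y2 = (int(value) for value in line.split())
    return x1, y1, x2, y2

def crop(image, x1, y1, x2, y2):
    """Return the region from (x1, y1) inclusive to (x2, y2) exclusive.

    image is a list of rows, each row a list of pixel values.
    """
    return [row[x1:x2] for row in image[y1:y2]]

# Toy 6x4 grayscale image where each pixel value encodes its position.
image = [[10 * y + x for x in range(6)] for y in range(4)]
roi = crop(image, *parse_box("1 1 4 3"))
print(roi)
```

Whatever sits inside the rectangle goes to the feature extractor, background included, which is exactly the problem with loose boxes noted above.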

    So, I thought, why not cut-out the elephants accurately myself. And just pass these straight to the HOG feature extractor? E.g. cut-out like this:


  • #3 result for object detector using Histogram of Oriented Gradients (HOG) and Linear Support Vector Machines (SVM)

    Neil K. Sheridan, 04/01/2017 at 19:32, 0 comments

      This time I used 1000 negative training images from the caltech256 dataset. I only used parts of dataset containing animals (chimps, llamas, gorillas, kangaroos, horses, elks,..) and some landscapes (not urban) this time. These 1000 again being selected pseudo-randomly from the images I had on storage drive from the dataset. I again used only 64 positive images from the earlier caltech dataset. I used hard-negative mining on 50 images this time. That took around 40 minutes on the EC2 (virtual machine) m4.4xlarge instance I was using.

      The workflow is:

      1. extract features from the positive and negative images (2 minutes)
      2. train object detector (45 minutes)
      3. hard-negative mining (40 minutes)
      4. re-train object detector with the hard-negatives (45 minutes)

      N.B. If you are using EC2 like me, you can end up with broken pipe in SSH session if the client sleeps during long training sessions :-(

      So how would it get on with cows and rhinos this time!? It even detected farmers once last time!

      Much more promising results!

      NO RHINOS DETECTED! *well in this image anyway!

      Elephants still detected!

      Farmers in fields not detected!

      No cows detected in the several images I tested!

      Tapir was unfortunately detected! It does look kind of similar to a baby elephant! Hard one!

      Sloth bear was detected! Not as elephant-like as the tapir!

      Tiger not detected. Yay!

      I didn't undertake any stringent testing protocol to gather a percentage of false-positives and false-negatives at this early stage.

      The different approach this time was to include primarily animal-based negative training images, increase the negative images used from 700 to 1000, and perform hard-negative mining on 50 vs. 10 images.

  • Instructions for using EC2 instance to train object detector [in progress]

    Neil K. Sheridan, 03/31/2017 at 19:58, 3 comments

    To speed things up, I'm making use of the Amazon Elastic Compute Cloud (EC2) virtual machines to train my object detector using HOG and SVMs. Here is the protocol for setting up the virtual machine; in this case I'm using an M4 instance, which has either 2.3 GHz Intel Xeon® E5-2686 v4 (Broadwell) processors or 2.4 GHz Intel Xeon® E5-2676 v3 (Haswell) processors. And in the case of m4.4xlarge it has 16 vCPUs and 64 GiB of memory. This one costs $0.998 per hour to rent.

    1. Choose AMI (machine image: I'm using Ubuntu), choose instance, setup security, setup storage

    2. Launch instance

    3. Download the keypair for this instance (keypair.pem)

    4. Open SSH client on your machine

    5. Set permissions for your keypair: chmod 400 /path/keypair.pem. If the key file's permissions are too open, SSH will reject it

    6. Use SSH client to connect to the instance:

    ssh -i /path/keypair.pem
    7. You can move files between your machine and the instance using scp
    scp -i keypair.pem myfiletotransfer.tar.gz

    ** Don't forget to terminate your instance in the EC2 dashboard when you have finished. I forgot once, and it was left running for 720hrs! Luckily it was free tier, else I would have been in big trouble financially!

    8. Update apt-get (for Ubuntu)

    9. sudo apt-get install build-essential cmake pkg-config

    10. sudo apt-get install libgtk2.0-dev

    11. sudo apt-get install libatlas-base-dev gfortran

    12. sudo apt-get install libboost-all-dev

    13. For PIP, wget and sudo python<version>

    ** you can set up virtualenv and the wrapper if you want

    14. sudo apt-get install python<version here>-dev

    Install all the libraries for python now:

    15. sudo pip install numpy

    16. sudo pip install scipy matplotlib

    17. sudo pip install scikit-learn

    18. sudo pip install -U scikit-image

    19. sudo pip install mahotas imutils Pillow commentjson (for the .json config files if wanted)

    For OpenCV:


    wget -O
    cd opencv-2.4.10


    mkdir build
    cd build

    22. Compile it with make -j<number of cores>


    sudo make install
    sudo ldconfig

    24. The random module ships with Python's standard library, so it needs no separate pip install

    25. sudo pip install progressbar

    26. I think that's everything! It takes about 1hr :-(

    Another instance I am using is the EC2 new deep-learning AMI (image) which has TensorFlow prebuilt and other useful stuff like Python3 and CUDA pre-installed

    If you ran this on the new p2.16xlarge instance, you'd have 8x NVIDIA Tesla K80 Accelerators, each running a pair of NVIDIA GK210 GPUs which would be great for training inception vx architecture in TensorFlow even from scratch! e.g.

    It's about $14.40 per hour to rent that instance however! So it would still be quite costly to train inception from scratch!

  • Rough outline of software flow

    Neil K. Sheridan, 03/26/2017 at 21:11, 0 comments

    Outline of software flow:

    We will capture from the camera for 20s after the PIR sensor reports motion (at one image per second). Then stop capture, and pass the captured images to the object detector. It's not going to be very fast for the object detector to decide! It could easily take 30s to decide on each image, which means 20 images * 30s, which equals 600s! Yeah, that's 10 mins - so this is something I'm a bit worried about. Actually, an i5 6xx w 2 cores took around 10s to decide..
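The worry about processing time is easy to put into numbers. This sketch just restates the arithmetic above for the two per-image times mentioned (30s as a pessimistic guess and the roughly 10s seen on the i5); the constant and function names are for illustration only.

```python
CAPTURE_SECONDS = 20      # capture window after the PIR trigger
IMAGES_PER_SECOND = 1     # capture rate during that window

def backlog_seconds(seconds_per_image):
    """Time to classify every image from one capture window, one at a time."""
    images = CAPTURE_SECONDS * IMAGES_PER_SECOND
    return images * seconds_per_image

pessimistic = backlog_seconds(30)   # guess for the detection device
measured_i5 = backlog_seconds(10)   # based on the ~10s seen on the i5

print("pessimistic: %ds (%.1f min)" % (pessimistic, pessimistic / 60))
print("i5 estimate: %ds (%.1f min)" % (measured_i5, measured_i5 / 60))
```

Capturing fewer images per trigger, or stopping at the first positive detection, would shrink this delay proportionally.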

  • Placement of elephant-detection devices

    Neil K. Sheridan, 03/25/2017 at 20:41, 0 comments

    Here we have an example placement of elephant-detection devices. This would be case-by-case, based on the routes (as shown in green) that local people suspect the elephants are using to launch their crop-eating raids! The devices can be moved as the elephants change their tactics. The devices would also be placed along the established elephant paths (e.g. to watering holes, as shown in red).

    The purpose of detecting elephants along their established paths is to warn local people of their presence, so humans can avoid the area when elephants are moving!

    Here we have an example of placement of the elephant deterrence devices: these play the sounds of bees or tigers in order to scare the elephants before they cross the interface between forest and farmland. These deterrence devices are triggered by a positive sighting of an elephant by a detection device.

    Prior to installation of the elephant detection devices, and deterrence devices, we could conduct a surveillance survey (e.g. 10 days), using aerostats with Ultra HD video, in order to acquire accurate data regarding elephant movement.

  • Second results with object detector using Histogram of Oriented Gradients (HOG) and Linear Support Vector Machines (SVM)

    Neil K. Sheridan, 03/24/2017 at 21:42, 4 comments

    This time I used 700 negative images from the CalTech256 dataset, which included images of animals (chimps, llamas, gorillas, kangaroos, horses, elks, etc.), in addition to the mostly landscapes and urban scenes I'd used from the CalTech101 dataset last time. I actually had around 5300 images in this dataset, so these 700 were selected pseudo-randomly from the set. In addition, I did hard-negative mining. But it kept crashing the machine I was using in OracleVM, so I only did it for 5 images!
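A pseudo-random selection like the one described above can be made reproducible by seeding the generator, and the chosen filenames can be saved so it is possible to check later which categories actually went into training. This is a sketch under assumptions: the function name, the seed, and the made-up filenames are not from the project.

```python
import random

def select_negatives(filenames, count, seed=42):
    """Pick `count` distinct files pseudo-randomly; the same seed always
    gives the same selection. Returns the chosen names in sorted order."""
    rng = random.Random(seed)
    return sorted(rng.sample(filenames, count))

# Hypothetical file list standing in for the ~5300 images on disk.
all_negatives = ["neg_%04d.jpg" % index for index in range(5300)]
chosen = select_negatives(all_negatives, 700)

print(len(chosen), chosen[:3])
```

Writing `chosen` to a text file alongside the trained model would answer questions such as whether any elk images were in the training subset.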

    Next I trained the object detector using the 700 negative images, the positive elephant images (from the CalTech 101 dataset), and the hard-negative images. Would the false-positive rate improve? Would we still detect cows and rhinos?!

    Unfortunately yes, it detected the rhino again!

    And even worse, it detected an elk! The elks were part of the negative training set. Although we don't know if they were used, since we selected 700 negative-training images randomly from a dataset of ~5300.

    At least it didn't detect a tractor!

    And it was still detecting elephants, no false-negatives!

    It really wasn't so bad considering that I only used 700 negative training images, and hard-negative mining on just 5 images!!! The entire training from feature-extraction, thru hard-negative mining, to training the object detector, took around 1hr. It is going to take much longer than that to train an efficacious object detector using HOG and SVM!

  • First results with object detector using Histogram of Oriented Gradients (HOG) and Linear Support Vector Machines (SVM)

    Neil K. Sheridan, 03/23/2017 at 20:28, 0 comments

    So I trained this one using 500 negative images from the Caltech 101 dataset. That is, specifically from the sceneclass13 section. And with 64 positive elephant images from the same dataset.

    Now the sceneclass13 section contains images mostly not containing animals! Not the best choice as we will see!

    In this first test image you can see lots of overlapping bounding boxes on the left! This was prior to applying non-maxima suppression. The same test image on the right, after applying non-maxima suppression, has just one bounding box on the elephant:
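Non-maxima suppression can be sketched in a few lines. The version below is a common greedy variant in plain Python: keep the highest-scoring box, drop any box that overlaps it too much, and repeat. It is an illustration only and may differ from the implementation used for the images here. The 0.3 overlap threshold, the scores, and the box coordinates are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def non_max_suppression(boxes, scores, overlap_threshold=0.3):
    """Greedy NMS: visit boxes from highest to lowest score and keep a box
    only if it does not overlap an already-kept box by more than the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= overlap_threshold for k in kept):
            kept.append(i)
    return [boxes[i] for i in kept]

# Three overlapping detections of one animal plus one separate detection.
boxes = [
    (100, 100, 300, 250),
    (110, 105, 310, 255),
    (95, 98, 295, 245),
    (500, 120, 560, 170),
]
scores = [0.9, 0.8, 0.7, 0.6]
print(non_max_suppression(boxes, scores))
```

The three overlapping boxes collapse to the single highest-scoring one, while the separate box elsewhere in the frame survives.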

    It was pretty good at detecting elephants in random photos I downloaded!

    Unfortunately it also detected rhinos!

    Hey, well rhinos are similar looking to elephants [a bit]! But then it also detected cows too! :-(

    On the bright side, it didn't think cars were elephants!

    So in this first attempt, I made the mistake of using negative images that didn't contain objects similar to elephants i.e. animals! N.B. There was no hard negative mining done, although I doubt it would make much difference considering the negative images mostly contained no animals!

    The next attempt I made was using the Caltech 256 dataset!

    I'll add the python code and dependencies here later..




Neil K. Sheridan wrote 03/26/2017 at 20:21

yes! I'm going to post it later this week! I'm just taking out the bits that aren't relevant so it is easy to follow! 


jessica18 wrote 03/26/2017 at 17:23

can you post the code

