This time I used 700 negative images from the CalTech256 dataset, which added images of animals (chimps, llamas, gorillas, kangaroos, horses, elks, etc.) to the mostly landscape and urban scenes I'd used from the CalTech101 dataset last time. I actually had around 5300 images available, so these 700 were selected pseudo-randomly from the set. In addition, I did hard-negative mining, but it kept crashing the machine I was running in OracleVM, so I only mined 5 images!
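Hard-negative mining boils down to sliding the detection window over images that are known to contain no elephants and keeping every patch the current detector wrongly fires on. A minimal sketch of that loop, with a hypothetical linear `score_window` (raw pixels instead of HOG features, and an assumed 24x24 window, purely to keep it self-contained):

```python
import numpy as np

# Hypothetical stand-in for the trained detector: score = w . features(window).
# Here we just flatten the raw pixels; the real pipeline scores HOG descriptors.
rng = np.random.default_rng(0)
w = rng.normal(size=24 * 24)  # assumed 24x24 detection window

def score_window(window):
    return float(w @ window.ravel())

def mine_hard_negatives(image, threshold=0.0, step=8, win=24):
    """Slide the detection window over a known-negative image and collect
    every patch the current detector wrongly scores above threshold."""
    hard = []
    rows, cols = image.shape
    for y in range(0, rows - win + 1, step):
        for x in range(0, cols - win + 1, step):
            patch = image[y:y + win, x:x + win]
            if score_window(patch) > threshold:  # false positive -> hard negative
                hard.append(patch.copy())
    return hard

# Toy negative image; the real run loops this over the 5 mined images.
negatives = mine_hard_negatives(rng.random((128, 128)))
```

Every patch in `negatives` then gets added to the negative training set for the next round of training, which is what makes these examples "hard": the detector has already demonstrated it gets them wrong.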
Next I trained the object detector using the 700 negative images, the positive elephant images (from the CalTech101 dataset), and the hard-negative images. Would the false-positive rate improve? Would we still detect cows and rhinos?!
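Assembling that training set is just stacking the three groups of feature vectors into one matrix with +1/-1 labels. A sketch with made-up counts and random stand-in features (the positive count and hard-negative count below are assumptions; only the 700-of-~5300 subsample comes from the post):

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in feature vectors; in the real pipeline each row is a HOG descriptor.
all_neg_feats  = rng.random((5300, 1764))  # full negative pool
pos_feats      = rng.random((120, 1764))   # assumed number of elephant positives
hard_neg_feats = rng.random((37, 1764))    # assumed patches mined from 5 images

# Pseudo-random 700-image subsample of the negative pool.
idx = rng.choice(len(all_neg_feats), size=700, replace=False)
neg_feats = all_neg_feats[idx]

# One training matrix: +1 for elephant, -1 for background.
X = np.vstack([pos_feats, neg_feats, hard_neg_feats])
y = np.concatenate([np.ones(len(pos_feats)),
                    -np.ones(len(neg_feats) + len(hard_neg_feats))])
# X, y are what gets handed to the linear SVM trainer.
```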
Unfortunately yes, it detected the rhino again!
And even worse, it detected an elk! Elk images were part of the negative pool, though we don't know whether any were actually used in training, since the 700 negative-training images were drawn randomly from a dataset of ~5300.
At least it didn't detect a tractor!
And it was still detecting elephants: no false negatives!
It really wasn't so bad considering that I only used 700 negative training images, and ran hard-negative mining on just 5 images! The entire pipeline, from feature extraction through hard-negative mining to training the object detector, took around an hour. It's going to take much longer than that to train a truly efficacious object detector using HOG and SVM!
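The feature-extraction step at the front of that pipeline is HOG: gradient magnitudes binned by orientation within small cells. A stripped-down version (no block normalization, which real HOG adds on top) shows the core idea:

```python
import numpy as np

def hog_cells(image, cell=8, bins=9):
    """Simplified HOG-style descriptor: per-cell histograms of gradient
    orientation, weighted by gradient magnitude. Real HOG also normalizes
    over overlapping blocks of cells; that step is skipped here for brevity."""
    gy, gx = np.gradient(image.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180  # unsigned orientation in [0, 180)
    rows, cols = image.shape
    ch, cw = rows // cell, cols // cell
    hist = np.zeros((ch, cw, bins))
    bin_idx = np.minimum((ang / (180 / bins)).astype(int), bins - 1)
    for i in range(ch):
        for j in range(cw):
            m = mag[i*cell:(i+1)*cell, j*cell:(j+1)*cell]
            b = bin_idx[i*cell:(i+1)*cell, j*cell:(j+1)*cell]
            for k in range(bins):
                hist[i, j, k] = m[b == k].sum()
    return hist.ravel()

# A 64x64 window with 8x8 cells and 9 bins gives an 8*8*9 = 576-dim descriptor.
feat = hog_cells(np.random.default_rng(1).random((64, 64)))
```

One descriptor like this per window, for every positive, every sampled negative, and every mined hard negative, is the feature matrix the linear SVM trains on, which is why the feature-extraction stage dominates the wall-clock time as the training set grows.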