
Instructions for using EC2 instance to train object detector [in progress]

A project log for Elephant AI

a system to prevent human-elephant conflict by detecting elephants using machine vision, and warning humans and/or repelling elephants

Neil K. Sheridan, 03/31/2017 at 19:58 (3 comments)

To speed things up, I'm making use of Amazon Elastic Compute Cloud (EC2) virtual machines to train my object detector using HOG and SVMs. Here is the protocol for setting up the virtual machine. In this case I'm using an M4 instance, which has either 2.3 GHz Intel Xeon® E5-2686 v4 (Broadwell) or 2.4 GHz Intel Xeon® E5-2676 v3 (Haswell) processors; the m4.4xlarge size has 16 vCPUs and 64 GiB of memory, and costs $0.998 per hour to rent.

1. Choose an AMI (machine image; I'm using Ubuntu), choose an instance type, set up security, and set up storage

2. Launch instance

3. Download the keypair for this instance (keypair.pem)

4. Open SSH client on your machine

5. Set restrictive permissions on your keypair with chmod 400 /path/keypair.pem; if the permissions are too open, SSH will reject the key

6. Use SSH client to connect to the instance:

ssh -i /path/keypair.pem ubuntu@ec2-xxx-xx-xxx-1.compute-1.amazonaws.com
7. You can move files between your machine and the instance using scp:

scp -i keypair.pem myfiletotransfer.tar.gz ubuntu@ec2-xxx-xx-xxx-1.compute-1.amazonaws.com:
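Bundling lots of training images into a single archive before transferring makes scp much faster than copying files one by one. A small sketch (the directory and file names here are just stand-ins for illustration):

```shell
# Pack a directory of training images into one archive before uploading
mkdir -p images && touch images/elephant1.jpg   # stand-in data for this sketch
tar -czf training-images.tar.gz images/
ls -l training-images.tar.gz
# then upload it as in step 7, or pull results back the other way:
# scp -i keypair.pem ubuntu@ec2-xxx-xx-xxx-1.compute-1.amazonaws.com:results.tar.gz .
```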

** Don't forget to terminate your instance in the EC2 dashboard when you have finished. I forgot once, and an instance was left running for 720 hours! Luckily it was free tier, else I would have been in big trouble financially!
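The damage from a forgotten instance is easy to work out: 720 hours at the m4.4xlarge rate would have come to roughly $719.

```shell
# What 720 hours of a forgotten m4.4xlarge ($0.998/hr) would have cost
awk 'BEGIN { printf "%.2f\n", 0.998 * 720 }'
# prints 718.56
```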

8. Update the package lists (for Ubuntu): sudo apt-get update

9. sudo apt-get install build-essential cmake pkg-config

10. sudo apt-get install libgtk2.0-dev

11. sudo apt-get install libatlas-base-dev gfortran

12. sudo apt-get install libboost-all-dev

13. For PIP, wget https://bootstrap.pypa.io/get-pip.py and sudo python<version> get-pip.py

** you can set up virtualenv and virtualenvwrapper if you want

14. sudo apt-get install python<version here>-dev

Install all the libraries for python now:

15. sudo pip install numpy

16. sudo pip install scipy matplotlib

17. sudo pip install scikit-learn

18. sudo pip install -U scikit-image

19. sudo pip install mahotas imutils Pillow commentjson (commentjson is only needed if you want to use commented .json config files)

For OpenCV:

20.

wget -O opencv-2.4.10.zip http://sourceforge.net/projects/opencvlibrary/files/opencv-unix/2.4.10/opencv-2.4.10.zip/download

unzip opencv-2.4.10.zip

cd opencv-2.4.10

21.

mkdir build
cd build
cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local -D BUILD_NEW_PYTHON_SUPPORT=ON -D INSTALL_C_EXAMPLES=ON -D INSTALL_PYTHON_EXAMPLES=ON  -D BUILD_EXAMPLES=ON ..

22. Compile it with make -j<number of cores>
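Rather than hard-coding the core count, you can let the shell discover it; on an m4.4xlarge, nproc reports the 16 vCPUs:

```shell
# Let the shell discover the core count instead of hard-coding it
CORES=$(nproc)
echo "compiling with ${CORES} parallel jobs"
# make -j"${CORES}"   # run this inside the build directory
```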

23.

sudo make install
sudo ldconfig

24. random doesn't need installing; it's part of the Python standard library

25. sudo pip install progressbar

26. I think that's everything! It takes about 1hr :-(
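Before kicking off a long training run, a quick import check catches a broken install early. Shown here with python3; substitute whichever Python version you installed above:

```shell
# Verify that NumPy (and friends) import cleanly after the setup above
python3 -c "import numpy; print(numpy.__version__)"
```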


Another instance I'm using is the EC2 deep-learning AMI (image), which comes with TensorFlow prebuilt and other useful stuff like Python 3 and CUDA pre-installed: https://aws.amazon.com/marketplace/pp/B01M0AXXQB#product-description

If you ran this on the new p2.16xlarge instance, you'd have 8x NVIDIA Tesla K80 accelerators, each running a pair of NVIDIA GK210 GPUs (https://aws.amazon.com/blogs/aws/new-p2-instance-type-for-amazon-ec2-up-to-16-gpus/), which would be great for training an Inception architecture in TensorFlow, even from scratch! e.g. https://github.com/tensorflow/models/blob/master/inception/inception/inception_train.py

It's about $14.40 per hour to rent that instance, however, so it would still be quite costly to train Inception from scratch!
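To put that in perspective, a hypothetical week-long run (the actual time to train Inception from scratch depends entirely on the setup) would come to:

```shell
# 168 hours (one week) on a p2.16xlarge at $14.40/hr
awk 'BEGIN { printf "$%.2f\n", 14.40 * 168 }'
# prints $2419.20
```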

Discussions

Neil K. Sheridan wrote 04/01/2017 at 20:05 point

Hi Thomas, thanks! Oh, I've never heard of Docker! That does look useful! It sure is tedious downloading, building and installing things each time I use EC2!


Thomas wrote 04/02/2017 at 07:06 point

If software deployment is one of one's duties it's difficult not to hear about Docker these days. Solutions around it are rather mature, and it's easy to "scale out" compute intensive operations. It's important to start with building, and deploying, Docker images automatically, and not to create images by hand. I'm happy to help if you have questions! 


Thomas wrote 04/01/2017 at 19:48 point

Hi Neil, nice writeup! I have no experience with EC2 but I've used GCE in the past. In my experience automating startup and tear-down is worth the trouble (GCE made that rather easy). In the meantime I've also worked a bit with Docker. Did you look into this option, too?
