So last week we got object detection working directly from the image sensor (over MIPI) on the Myriad X. MobileNet-SSD, to be specific.
So how fast is it?
- 25 FPS (40 ms per frame)... when connected to a powerful desktop
According to 'The Big Benchmarking Roundup' (here) that's actually quite good.
But, this is connected to a big powerful computer... how fast is it when used with a Pi?
To find out, we ran it on our prototype of DepthAI for Raspberry Pi Compute Module:
And how did it fare?
- 25 FPS (40 ms per frame)... when connected to a Raspberry Pi Compute Module 3B+
Unlike the NCS2, which sees a drastic drop in FPS when used with the Pi, DepthAI doesn't see any at all.
Why is this?
- The video path is flowing directly from the image sensor into the Myriad X, which then runs neural inference on the image data, and exports the video and neural network results to the Pi.
- So this means the Raspberry Pi doesn't have to deal with the video stream at all; it's not resizing video, not shuffling it from one interface to another, etc. - all of those tasks are done on the Myriad X.
In this way the Myriad X doesn't technically even need to export video to the Pi at all - it could simply output the detected objects and their positions over a serial connection (UART), for example.
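To get a feel for why keeping the video off the host matters, here's a rough back-of-the-envelope comparison of what crosses the link in each case. The frame size (MobileNet-SSD's standard 300x300 RGB input) and the per-detection payload are our illustrative assumptions, not measured numbers:

```python
# Rough bandwidth comparison: streaming raw video frames to the host vs.
# sending only neural-network results (illustrative numbers only).

FPS = 25

# Raw video: assume MobileNet-SSD's 300x300 RGB input, 1 byte per channel.
frame_bytes = 300 * 300 * 3
video_bytes_per_sec = frame_bytes * FPS  # ~6.4 MiB/s

# Detection results: assume up to 10 detections per frame, each packed as
# label + confidence + 4 box coordinates (6 floats, 4 bytes each).
detection_bytes = 10 * 6 * 4
results_bytes_per_sec = detection_bytes * FPS

print(f"video:   {video_bytes_per_sec:,} B/s")
print(f"results: {results_bytes_per_sec:,} B/s")
print(f"ratio:   {video_bytes_per_sec // results_bytes_per_sec}x less data")
```

At these (assumed) sizes the results-only path moves roughly three orders of magnitude less data - easily within reach of a UART.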
Let's compare this to the results of the Myriad used with the Raspberry Pi 3B+ in its NCS form factor, thanks to the data and super-awesome/detailed post courtesy of PyImageSearch here. (Aside: PyImageSearch is the best thing that ever happened to the internet):
So this is a bump from 8.3 FPS with the Pi 3B+ and NCS2 to 25.5 FPS with the Pi 3B+ and DepthAI.
When we set out, we expected DepthAI to be 5x faster than the NCS2 when used with the Raspberry Pi 3B+. It looks like we've landed at roughly 3x faster instead (25.5 / 8.3 ≈ 3.07x)... but we still think we can improve from here!
And it's important to note that with the NCS2, the Pi's CPU is at 220% just to get those 8 FPS, while with DepthAI the Pi's CPU is at 35% at 25 FPS. So this leaves WAY more room for your own code on the Pi!
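Putting the two comparisons into numbers (the FPS and CPU figures are the ones quoted above; the 400% ceiling assumes `top`-style accounting on the Pi 3B+'s four cores):

```python
# Compare NCS2-on-Pi vs DepthAI-on-Pi using the figures quoted above.

ncs2_fps, ncs2_cpu = 8.3, 220.0        # FPS, Pi CPU % (out of 400% on 4 cores)
depthai_fps, depthai_cpu = 25.5, 35.0

speedup = depthai_fps / ncs2_fps
print(f"speedup: {speedup:.2f}x")

# Frames delivered per percent of Pi CPU consumed -- a rough efficiency metric.
print(f"NCS2:    {ncs2_fps / ncs2_cpu:.3f} FPS per CPU%")
print(f"DepthAI: {depthai_fps / depthai_cpu:.3f} FPS per CPU%")

# CPU headroom left for your own code on the 4-core Pi 3B+.
print(f"headroom with DepthAI: {400 - depthai_cpu:.0f}%")
```

So beyond the ~3x raw FPS gain, each frame costs far less host CPU, leaving most of the Pi free for application code.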
And as a note, this isn't using all of the resources of the Myriad X. We're leaving enough room on the Myriad X to perform all the additional calculations of disparity depth and 3D projection in parallel with this object detection. So for folks who only want monocular object detection, we could probably go even faster by dedicating more of the chip to neural inference... but we need to investigate to be sure.
Anyways, we're pretty happy about it:
Now we're off to integrate the depth, filtering/smoothing, 3D projection, etc. code we already have running on the Myriad X with this neural inference code. (And to find out if we indeed left enough room for it!)
Brandon & The Luxonis Team