Hack Chat Transcript, Part 1

An event log for Spatial AI and CV Hack Chat

Computer vision, in depth

Dan Maloney 12/01/2021 at 21:02

Dan Maloney12:00 PM
Greetings all, welcome to the penultimate Hack Chat of 2021! I'm Dan, and as usual I'll be modding today along with Dusan as we welcome Erik Kokalj to the chat. Erik works on spatial AI at Luxonis, and we're going to go into depth about spatial AI and CV.

Dan Maloney12:00 PM
Sorry, I had to...

Erik, are you out there yet?

Lutetium12:00 PM
Hello and welcome!

Matteo Borri12:00 PM

Erik12:01 PM
Hello everyone!

Ethan Waldo12:01 PM

Dan Maloney12:02 PM
Hey, welcome, great to have you today. Can you tell us a little about yourself, and maybe how you came to be working in machine vision?

Erik12:03 PM
Sure, so I am a software engineer with an electrical engineering background. I come from Slovenia, Europe, and I am an open-source enthusiast.

Erik12:04 PM
I started working at Luxonis about a year ago. Prior to that I didn't have much machine vision experience, and at Luxonis I started working on examples, tutorials, technical documentation, technical support, etc.

Erik12:04 PM
So I learned most of it over time doing demos/experiments

Dan Maloney12:05 PM
Sounds like the perfect way to learn, at least for me -- sink or swim, kinda.

Dan Maloney12:07 PM
So I've got a starter question: when you say "spatial AI", is that really just what a Kinect or some similar depth sensor does? Or is there more to it than that?

Erik12:07 PM
yes, exactly:) A lot of "hard" technical stuff is also abstracted by our library, so me, working on demos, didn't need that much machine vision experience

Erik12:08 PM
yes, it's very similar to Kinect. So TL;DR it's combining depth perception + AI, which can be used extensively across many fields

Erik12:09 PM
just copying some use-cases:

- Visual assistance (for visually impaired, or for aiding in fork-lift operation, etc.)

- Aerial / subsea drones (fault detection, AI-based guidance/detection/routing)

- E-scooter & micromobility (not allowing folks to ride rented e-scooters like jerks)

- Cargo/transport/autonomy (fullness, status, navigation, hazard avoidance)

- Sports monitoring (automatically losslessly zooming in on action)

- Smart agriculture (e.g. guiding lasers to kill weeds, pests, or targeting watering)

Riley August12:09 PM
I'm very interested in what the state of the art hardware-wise is on the open source side there.

Dan Maloney12:09 PM
I guess that's where my confusion comes from, really -- there seems like so much you can do with "plain old CV" that doesn't need depth detection. But then again, depth really opens up some interesting doors. Add in the AI component, and it seems really powerful.

Erik12:10 PM
@riley.august most of our baseboards are open source, at least all the ones that don't have the Myriad X (VPU by Intel) on them

Riley August12:10 PM
Ooh. I'll have a look, it's nice to see other companies contributing back to the maker community like that. Depth detection does take a lot of the guesswork out of interpreting a 2D image.

Thomas Shaddack12:10 PM
thought. light field cameras. highly processing-intensive but gives intrinsic 3d image.

Erik12:10 PM
yes, exactly:)

Brandon12:10 PM
And disparity depth is most similar to human vision.

Brandon12:11 PM
And like human vision, it works in all sorts of conditions.

Brandon12:11 PM
Whereas structured light, etc. can self-interfere, have lighting limitations (may not work in sunlight) etc.

Brandon12:11 PM
Whereas disparity depth is passive. Largely works in similar conditions to our heads. :-)

Erik12:12 PM
@Dan Maloney yes, true, e.g. speed estimation, distance between 2 objects, or just to know where something is (for robots)

Dan Maloney12:12 PM
"Structured light" -- is that like lidar or something different?

charliex12:12 PM
how does it perform on specular surfaces

Erik12:13 PM
@Dan Maloney it's active stereo, so usually there's an IR laser (either a dot projector or lines) so disparity depth can be more accurate. It's especially useful for low-interest surfaces (where there aren't many features for disparity matching, e.g. a wall or floor)

Dan Maloney12:13 PM

Thomas Shaddack12:14 PM
the rotating table 3d scanners with a line laser projected onto an object are a rudimentary kind of that. with structured light there are more known-shape lines (or dots) and the object doesn't have to rotate.

Erik12:14 PM
@charliex (googling what that means)

Matteo Borri12:14 PM
can i use that as a lidar for short distances (2 meters) ?

charliex12:15 PM
specular reflection, so like shiny objects

Dan Maloney12:15 PM
Isn't that highly reflective surfaces? Like a mirror?

Brandon12:15 PM
Yes, OAK-D will produce depth at 2 meter range.

Erik12:15 PM
ah reflective. I'm quite sure it wouldn't work

Thomas Shaddack12:15 PM
i saw datasheets for time-of-flight camera sensors. 320x240 or 640x480 i think. pretty amazing resolution.

Thomas Shaddack12:15 PM
lightfield is said to work even on reflective stuff.

Brandon12:15 PM
Stereo neural inference is what should be used for shiny objects @charliex

Brandon12:16 PM


GitHub - luxonis/depthai-experiments: Experimental projects we've done with DepthAI.

depthai_experiments (Chinese documentation) Experimental projects we've done with DepthAI. Experiments can be anything from "here's some code and it works sometimes" to "this is almost a tutorial". The following list isn't exhaustive (as we randomly add experiments and we may forget to update this list): This example demonstrates how to run 3 stage (3-series, 2 parallel) inference on DepthAI using Gen2 Pipeline Builder.

Read this on GitHub

Brandon12:16 PM
It uses AI to locate the object of interest.

charliex12:16 PM
@Thomas Shaddack same principles apply, a lightfield camera still suffers for specular capture if it's using standard imaging sensors

Brandon12:16 PM
For example the object of interest could be a mirror or mirror balls or whatever.

charliex12:17 PM
@Brandon yeah thats what i was wondering if there was an assist

Thomas Shaddack12:17 PM
there is a more accessible ToF sensor, 16x16 pixels. the VL53L1X - may be of interest, though the field of view is, at least without optics, annoyingly narrow.

Brandon12:18 PM
Yes. So with disparity depth the resolution can be much higher for low price as well.

Brandon12:18 PM
For example we do 640x480 depth along with 13MP color in a $149 camera.

charliex12:18 PM
optical flow for the disparity or is there some more clever stuff going on?

Brandon12:18 PM
And 1,280x800 (1MP) depth along with 12MP color camera for $199

Brandon12:19 PM
So the disparity engine is census transform based.

Thomas Shaddack12:19 PM
what depth resolution, approx?

Brandon12:19 PM
Which produces great depth for the power.
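
Editor's sketch of the census-transform idea Brandon mentions: each pixel is replaced by a bit string recording which neighbours are darker than it, and left/right pixels are matched by the Hamming distance between those bit strings. This is a minimal pure-Python illustration, not the DepthAI or Myriad X implementation; the function names, the 3x3 window, and the winner-takes-all search are assumptions made for the example.

```python
def census_transform(image, radius=1):
    """Census code per pixel: one bit per neighbour, set when the
    neighbour is darker than the centre. Border pixels are left at 0."""
    height, width = len(image), len(image[0])
    codes = [[0] * width for _ in range(height)]
    for y in range(radius, height - radius):
        for x in range(radius, width - radius):
            center = image[y][x]
            code = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    if dy == 0 and dx == 0:
                        continue  # skip the centre pixel itself
                    bit = 1 if image[y + dy][x + dx] < center else 0
                    code = (code << 1) | bit
            codes[y][x] = code
    return codes


def hamming(a, b):
    """Number of differing bits between two census codes."""
    return bin(a ^ b).count("1")


def best_disparity(left_codes, right_codes, y, x, max_disparity):
    """Winner-takes-all match: the disparity d whose right-image code at
    (y, x - d) is closest in Hamming distance to the left code at (y, x)."""
    best_d, best_cost = 0, None
    for d in range(0, min(max_disparity, x) + 1):
        cost = hamming(left_codes[y][x], right_codes[y][x - d])
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

Because the codes only compare relative brightness inside the window, the match is insensitive to overall exposure differences between the two cameras, which is one reason this family of methods is cheap to run in hardware.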

Brandon12:19 PM
1,280 x 800 depth resolution.

Thomas Shaddack12:19 PM
i mean in millimeters.

Erik12:20 PM
oh accuracy. So below 3% error (at good conditions)

Brandon12:20 PM

Erik12:20 PM
and for passive stereo that's good lighting and good texture of the surface

Brandon12:20 PM
And here is the block diagram of how it works. And Erik is right, 3% of distance is about what disparity depth can do.
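
Editor's sketch of why stereo accuracy is quoted relative to distance: for a rectified stereo pair, depth is Z = f * B / d (focal length in pixels, baseline in metres, disparity in pixels), and a small disparity error maps to a depth error of roughly Z^2 / (f * B) times that disparity error. The focal length, baseline, and half-pixel error below are illustrative assumptions, not Luxonis specifications, and this is not how the 3% figure quoted in the chat was derived.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_error(depth_m, focal_px, baseline_m, disparity_error_px):
    """First-order depth uncertainty: dZ ~= Z^2 / (f * B) * dd."""
    return depth_m ** 2 * disparity_error_px / (focal_px * baseline_m)


# Assumed, illustrative values (not device specifications).
FOCAL_PX = 880.0
BASELINE_M = 0.075

z = depth_from_disparity(33.0, FOCAL_PX, BASELINE_M)
err = depth_error(z, FOCAL_PX, BASELINE_M, disparity_error_px=0.5)
print(f"depth {z:.2f} m, error about {err * 100:.1f} cm ({err / z:.1%} of distance)")
```

With these assumed numbers, the relative error doubles each time the distance doubles, so the same disparity error matters far more for distant objects than for near ones.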

Matteo Borri12:21 PM

Brandon12:21 PM
We also have a ToF version coming Q1 next year, 1% of distance error.

charliex12:21 PM
thanks for the link

Brandon12:22 PM

Thomas Shaddack12:22 PM
could be pretty handy for forklifts.

Erik12:22 PM
for forklift automation @Thomas Shaddack :)?

Thomas Shaddack12:23 PM
yup, or semiautomation. in the beginning, make sure the driver never runs into something expensive.

Brandon12:23 PM


Erik12:23 PM
that's actually exactly what one of our partners is doing

Brandon12:23 PM
Here is an example of it being used on a forklift as Erik mentioned ^

charliex12:23 PM
definitely interested to see how well it performs, waiting on delivery

Erik12:23 PM
let me find the video of results @charliex

Erik12:24 PM


New video by Erik Kokalj

Read this on Google Photos

charliex12:24 PM
yeah if you've got depth buffers available to see what the errors are like, especially on the edges super interested to see

Dan Maloney12:24 PM
BTW, everyone, @Brandon is from Luxonis too

Erik12:25 PM
these are initial results with TOF

Brandon12:25 PM
Thanks Dan! (And sorry about late-ish join)

Dan Maloney12:25 PM
No worries ;-)

Brandon12:26 PM


OpenCV AI Competition

Courses are (a little) oversubscribed and we apologize for your enrollment delay. As an apology, you will receive a 10% discount on all waitlist course purchases. Current wait time will be sent to you in the confirmation email. Thank you!

Read this on OpenCV

Brandon12:26 PM
More applications here.

ump12:26 PM
@Erik in that video, is the 3D model generated from real time operation from the camera?

Brandon12:26 PM

Brandon12:27 PM
In that example the RGB alignment and colorized point cloud are generated on-camera.
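
Editor's sketch of what a colorized point cloud involves once depth is aligned to the RGB image: each pixel with valid depth is back-projected through a pinhole camera model and tagged with the colour at the same pixel. This is a generic illustration, not the on-camera DepthAI code; the function name and the intrinsics `fx`, `fy`, `cx`, `cy` are assumptions for the example.

```python
def depth_to_points(depth, colors, fx, fy, cx, cy):
    """Back-project an RGB-aligned depth map into (x, y, z, colour) tuples.

    depth  -- rows of depth values in metres (0 or less means no measurement)
    colors -- rows of colour values, same shape as depth
    fx, fy -- focal lengths in pixels; cx, cy -- principal point in pixels
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # skip pixels with no valid depth
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z, colors[v][u]))
    return points
```

The alignment step matters because the lookup `colors[v][u]` is only meaningful when depth pixel (u, v) and colour pixel (u, v) see the same point in the scene.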

Brandon12:27 PM


Brandon12:27 PM
Here's my favorite application ^

Ethan Waldo12:32 PM

Brandon12:33 PM
Any other questions?

Ethan Waldo12:33 PM
*shuffles feet nervously*

Brandon12:33 PM
Ha. Well said.

charliex12:33 PM
mostly RTFMing here

Brandon12:34 PM
Heh. Even better said. There's a ton of flexibility. But we make it so you can get up and running in a couple minutes, usually.

Erik12:34 PM
let me ask a question to Brandon then: why did you start this company/platform?

Brandon12:35 PM


Brandon12:35 PM
WRT getting going.

Brandon12:35 PM
To Erik's question, thanks!

Brandon12:35 PM
So the full story is here:

Brandon12:35 PM


It Works! Working Prototype of Commute Guardian.

Hey guys and gals! So the 'why' of the DepthAI (that satisfyingly rhymes) is we're actually shooting for a final product which we hope will save the lives of people who ride bikes, and help to make bike commuting possible again for many.

Read this on Luxonis

Brandon12:36 PM
But we were working in AR/VR and specifically the perception-in-physical-space bit.

Brandon12:36 PM
When a bunch of folks in our network were hit by distracted drivers.

Brandon12:37 PM
1 killed. 3 mortally wounded, 2 of whom will never walk right again, and the 3rd can no longer program (was an excellent programmer) because of a traumatic brain injury.

Dan Maloney12:37 PM
I'm not sure if this is an appropriate question, but what do you think the killer app for spatial AI is? Is it already out there, or is it a use case that has yet to be explored?

Brandon12:37 PM
Great question.

Brandon12:37 PM
So one of the most important use-cases is life and safety.

Brandon12:37 PM
As spatial AI can perceive the world like a human can.

Brandon12:37 PM
So it can be used to make automated safety systems that were the dreams of science fiction - just 10 years ago.

Brandon12:38 PM
By telling what is going on, if a person, hand, elbow, whatever are in a position of danger.