Greetings all, welcome to the penultimate Hack Chat of 2021! I'm Dan, and as usual I'll be modding today along with Dusan as we welcome Erik Kokalj to the chat. Erik works on spatial AI at Luxonis, and we're going to go into depth about spatial AI and CV.
Sorry, I had to...
Erik, are you out there yet?
Hello and welcome!
Hey, welcome, great to have you today. Can you tell us a little about yourself, and maybe how you came to be working in machine vision?
Sure, so I am a software engineer with an electrical engineering background. I come from Slovenia, Europe, and I am an open-source enthusiast
I started working at Luxonis about a year ago. Prior to that I didn't have much machine vision experience, and at Luxonis I started working on examples, tutorials, technical documentation, technical support etc.
So I learned most of it over time doing demos/experiments
Sounds like the perfect way to learn, at least for me -- sink or swim, kinda.
So I've got a starter question: when you say "spatial AI", is that really just what a Kinect or some similar depth sensor does? Or is there more to it than that?
yes, exactly:) A lot of "hard" technical stuff is also abstracted by our library, so me, working on demos, didn't need that much machine vision experience
yes, it's very similar to Kinect. So TL;DR it's combining depth perception + AI, which can be used extensively across many fields
just copying some use-cases:
- Visual assistance (for visually impaired, or for aiding in fork-lift operation, etc.)
- Aerial / subsea drones (fault detection, AI-based guidance/detection/routing)
- E-scooter & micromobility (not allowing folks to ride rented e-scooters like jerks)
- Cargo/transport/autonomy (fullness, status, navigation, hazard avoidance)
- Sports monitoring (automatically losslessly zooming in on action)
- Smart agriculture (e.g. guiding lasers to kill weeds, pests, or targeting watering)
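To make "depth perception + AI" concrete, here is a minimal sketch of the idea: take a 2D detection box from a neural network, look up the depth inside it, and back-project to a 3D position. This is generic pinhole-camera math, not the DepthAI API. The function name, the intrinsics `fx`, `fy`, `cx`, `cy`, and the convention that depth is in millimeters with 0 meaning "no depth" are all assumptions for illustration.

```python
import numpy as np

def spatial_location(depth_mm, bbox, fx, fy, cx, cy):
    """Return (X, Y, Z) in mm for the centre of a detection box, or None.

    depth_mm : 2D array of depth values in mm (assumed: 0 = invalid pixel)
    bbox     : (x0, y0, x1, y1) pixel box from any object detector
    fx, fy   : focal lengths in pixels (hypothetical intrinsics)
    cx, cy   : principal point in pixels (hypothetical intrinsics)
    """
    x0, y0, x1, y1 = bbox
    roi = depth_mm[y0:y1, x0:x1]
    valid = roi[roi > 0]              # drop pixels with no depth
    if valid.size == 0:
        return None
    z = float(np.median(valid))       # median is robust to background pixels
    u = (x0 + x1) / 2.0               # box centre in pixel coordinates
    v = (y0 + y1) / 2.0
    x = (u - cx) * z / fx             # pinhole back-projection
    y = (v - cy) * z / fy
    return x, y, z

# Usage: a flat wall 2 m away, object box right of the image centre
depth = np.full((480, 640), 2000, dtype=np.uint16)
print(spatial_location(depth, (400, 220, 440, 260), 500.0, 500.0, 320.0, 240.0))
```

The median over the box is one simple choice; averaging or taking a smaller central region would work too, depending on how tight the detector's boxes are.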
I'm very interested in what the state of the art hardware-wise is on the open source side there.
I guess that's where my confusion comes from, really -- there seems like so much you can do with "plain old CV" that doesn't need depth detection. But then again, depth really opens up some interesting doors. Add in the AI component, and it seems really powerful.
@riley.august most of our baseboards are open source, at least all the ones that don't have the Myriad X (VPU by Intel) on them
Ooh. I'll have a look, it's nice to see other companies contributing back to the maker community like that. Depth detection does take a lot of the guesswork out of interpreting a 2D image.
thought. light field cameras. highly processing-intensive but gives intrinsic 3d image.
And disparity depth is most similar to human vision.
And like human vision, it works in all sorts of conditions.
Whereas structured light, etc. can self-interfere, have lighting limitations (may not work in sunlight) etc.
Whereas disparity depth is passive. Largely works in similar conditions to our heads. :-)
@Dan Maloney yes, true, eg. speed estimation, distance between 2 objects, or just to know where something is (for robots)
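Once each object has a 3D position, the two examples just mentioned are a few lines of arithmetic. A minimal sketch, assuming positions are `(X, Y, Z)` tuples in millimeters in the camera frame and that two observations of the same object are `dt_s` seconds apart (the function names and units are assumptions, not anything from a specific library):

```python
import math

def distance_mm(p, q):
    """Euclidean distance between two 3D points given in mm."""
    return math.dist(p, q)

def speed_m_per_s(p_prev, p_now, dt_s):
    """Average speed between two observations of the same object."""
    return math.dist(p_prev, p_now) / 1000.0 / dt_s  # mm -> m, then per second

# Usage: two objects at the same depth, and one object approaching the camera
print(distance_mm((0, 0, 2000), (300, 400, 2000)))
print(speed_m_per_s((0, 0, 2000), (0, 0, 1500), 0.5))
```

In practice the positions are noisy, so a real speed estimate would usually smooth over several frames rather than use just two.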
"Structured light" -- is that like lidar or something different?
how does it perform on specular surfaces
@Dan Maloney it's active stereo, so usually there's an IR laser (either dot projector or lines) so disparity depth can be more accurate, and it's especially useful for low-interest surfaces (where there aren't many features for disparity matching, eg. wall or floor)
the rotating table 3d scanners with a line laser projected onto an object are a rudimentary kind of that. with structured light there are more known-shape lines (or dots) and the object doesn't have to rotate.
@charliex (googling what that means)
can i use that as a lidar for short distances (2 meters) ?
specular reflection, so like shiny objects
Isn't that highly reflective surfaces? Like a mirror?
Yes, OAK-D will produce depth at 2 meter range.
ah reflective. Im quite sure it wouldn't work
i saw datasheets for time-of-flight camera sensors. 320x240 or 640x480 i think. pretty amazing resolution.
lightfield is said to work even on reflective stuff.
depthai_experiments (Chinese documentation also available): Experimental projects we've done with DepthAI. Experiments can be anything from "here's some code and it works sometimes" to "this is almost a tutorial". The following list isn't exhaustive (as we randomly add experiments and we may forget to update this list): This example demonstrates how to run 3 stage (3-series, 2 parallel) inference on DepthAI using Gen2 Pipeline Builder.
It uses AI to locate the object of interest.
@Thomas Shaddack same principles apply, a lightfield camera still suffers with specular capture if it's using standard imaging sensors
For example the object of interest could be a mirror or mirror balls or whatever.
@Brandon yeah thats what i was wondering if there was an assist
there is a more accessible ToF sensor, 16x16 pixels. the VL53L1X - may be of interest, though the field of view is, at least without optics, annoyingly narrow.
Yes. So with disparity depth the resolution can be much higher for low price as well.
For example we do 640x480 depth along with 13MP color in a $149 camera.
optical flow for the disparity or is there some more clever stuff going on?
And 1,280x800 (1MP) depth along with 12MP color camera for $199
So the disparity engine is census transform based.
what depth resolution, approx?
Which produces great depth for the power.
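For anyone who hasn't met the census transform: each pixel is replaced by a bit string saying which neighbours are darker than it, and candidate matches between the left and right images are scored by Hamming distance between those bit strings. Below is a textbook-style NumPy sketch of that idea with plain winner-takes-all matching. It is not the on-device implementation; the window size, the `max_disp` search range, and the lack of any cost aggregation or subpixel step are simplifying assumptions.

```python
import numpy as np

def census_transform(img, radius=2):
    """Bit string per pixel: 1 where a neighbour is darker than the centre.

    radius <= 3 keeps the code within 64 bits (a 7x7 window gives 48 bits).
    """
    img = img.astype(np.int32)
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")
    code = np.zeros((h, w), dtype=np.uint64)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue
            neighbour = padded[radius + dy:radius + dy + h,
                               radius + dx:radius + dx + w]
            code = (code << np.uint64(1)) | (neighbour < img).astype(np.uint64)
    return code

def hamming(a, b):
    """Per-element count of differing bits between two uint64 arrays."""
    x = a ^ b
    count = np.zeros(x.shape, dtype=np.uint8)
    while np.any(x):
        count += (x & np.uint64(1)).astype(np.uint8)
        x = x >> np.uint64(1)
    return count

def disparity_wta(left, right, max_disp=16, radius=2):
    """Pick, per left pixel, the disparity with the lowest census cost."""
    cl = census_transform(left, radius)
    cr = census_transform(right, radius)
    h, w = left.shape
    cost = np.full((max_disp, h, w), 255, dtype=np.uint8)  # 255 = not evaluated
    for d in range(max_disp):
        # left pixel at column x is compared with right pixel at column x - d
        cost[d, :, d:] = hamming(cl[:, d:], cr[:, :w - d])
    return np.argmin(cost, axis=0)

# Usage: a random texture shifted by 4 pixels should mostly come back as 4
rng = np.random.default_rng(0)
left = rng.integers(0, 256, size=(32, 64)).astype(np.uint8)
right = np.roll(left, -4, axis=1)
disp = disparity_wta(left, right)
print(np.median(disp[:, 8:-4]))
```

Because the census code only depends on brightness ordering within the window, it tolerates gain and exposure differences between the two cameras, which is one reason it is popular for hardware stereo.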
1,280 x 800 depth resolution.
i mean in millimeters.
oh accuracy. So below 3% error (in good conditions)
and for passive stereo that's good lighting and good texture of the surface
And here is the block diagram of how it works. And Erik is right, 3% of distance is about what disparity depth can do.
We also have a ToF version coming Q1 next year, 1% of distance error.
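Why disparity accuracy is quoted as a percentage of distance: depth comes from Z = f * B / d (focal length in pixels times baseline, divided by disparity in pixels), so a fixed matching error in pixels turns into a depth error that grows with Z squared, and a relative error that grows linearly with Z. A small sketch of that relationship follows. The focal length, baseline and half-pixel matching error used in the usage lines are made-up illustration values, not the specs of any particular camera.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Z = f * B / d for a rectified stereo pair."""
    return focal_px * baseline_m / disparity_px

def depth_error(depth_m, focal_px, baseline_m, disparity_err_px):
    """First-order depth error: dZ ~= Z^2 / (f * B) * dd."""
    return depth_m ** 2 * disparity_err_px / (focal_px * baseline_m)

# Usage with hypothetical numbers: f = 800 px, B = 7.5 cm, 0.5 px matching error
for z in (1.0, 2.0, 4.0):
    err = depth_error(z, 800.0, 0.075, 0.5)
    print(f"{z:.0f} m: +/- {err * 100:.1f} cm ({err / z * 100:.1f}% of distance)")
```

Doubling the distance doubles the relative error, which is why a single percentage figure only holds over a stated range and under good lighting and texture.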
thanks for the link
could be pretty handy for forklifts.
@Thomas Shaddack :) for forklift automation?
yup, or semiautomation. in the beginning, make sure the driver never runs into something expensive.
that's actually exactly what one of our partners is doing
Here is an example of it being used on a forklift, as Erik mentioned ^
definitely interested to see how well it performs, waiting on delivery
yeah if you've got depth buffers available to see what the errors are like, especially on the edges super interested to see
BTW, everyone, @Brandon is from Luxonis too
these are initial results with TOF
Thanks Dan! (And sorry about late-ish join)
No worries ;-)
More applications here.
@Erik in that video, is the 3D model generated from real time operation from the camera?
In that example the RGB alignment and colorized point cloud are generated on-camera.
Here's my favorite application ^
Any other questions?
*shuffles feet nervously*
Ha. Well said.
mostly RTFMing here
Heh. Even better said. There's a ton of flexibility. But we make it so you can get up and running in a couple minutes, usually.
let me ask a question to Brandon then: why did you start this company/platform?
WRT getting going.
To Erik's question, thanks!
So the full story is here:
But we were working in AR/VR and specifically the perception-in-physical-space bit.
When a bunch of folks in our network were hit by distracted drivers.
1 killed. 3 mortally wounded, 2 of whom will never walk right again, and the 3rd can no longer program (was an excellent programmer) because of a traumatic brain injury.
I'm not sure if this is an appropriate question, but what do you think the killer app for spatial AI is? Is it already out there, or is it a use case that has yet to be explored?
So one of the most important use-cases is life and safety.
As spatial AI can perceive the world like a human can.
So it can be used to make automated safety systems that were the dreams of science fiction - just 10 years ago.
By telling what is going on, if a person, hand, elbow, whatever are in a position of danger.