
Auto tracking camera

A camera that tracks a person & counts reps using *AI*.

The source code: https://github.com/heroineworshiper/countreps

Subject tracking is rapidly becoming the next big thing, 1st on quad copters, then on digital assistants. It's long been a dream to have an autonomous camera operator that tracks a subject. The Facebook Portal was the 1st sign lions saw that the problem was finally cracked. The problem is all existing tracking cameras are operated by services which collect the video & either sell it or report it to government agencies.

Compiling & fixing a machine vision library to run as fast as possible on a certain computer is such a monumental task, it's important to reuse it as much as possible. To simplify the task of a tracking camera, the same code is used to count reps & track a subject. The countreps program was a lot more complicated & consumed most of…

The lion kingdom started getting ideas to make a tracking camera in July 2014.  Quad copter startups were booming & tracking subjects by GPS suddenly caught on, even though it was just a rehash of the worthless results hobbyists were getting in 2008.  The lion kingdom figured it could improve on it with machine vision tracking fiducial markers.

It was terrible.  You can't make a video wearing all those markers & the picture quality wasn't good enough to reliably detect the markers.  To this day, hobbyist tracking cams are all still using chroma keying & LED's.  The lion kingdom would do better.

The next step occurred in Aug 2016 with LIDAR.



That had problems with reflections in windows & couldn't detect tilt.  It could only estimate tilt by the distance of the subject from the camera.

2018 saw an explosion in CNN's for subject tracking.  The key revelation was openpose.  That theoretically allowed a camera to track a whole body or focus in on a head, but it didn't allow differentiating bodies.  The combination of openpose & a 360 camera finally allowed a subject to be tracked in 2 dimensions, in 2019.


The problem was a 360 camera with live output was expensive & cumbersome to get working.  The live video from the 360 camera drove the recording camera & had a long lag.  Cameras which recorded high quality video didn't have a wide enough field of view or fast enough autofocus to track anything.  The tracking camera was offset from the recording camera, creating parallax errors.

Tracking would have to come from the same camera that recorded the video. That would require a wide angle lens, very fast autofocus, & very high sensitivity.  It took another year for cameras to do the job for a reasonable price.





The EOS RP allowed wide angle lenses & had much faster autofocus than previous DSLRs.  Together with a faster laptop, the tracking system was manely doing the job.  Openpose couldn't detect the boundaries of the head, only the eye positions.  That made it point low.  A face tracker running in parallel would refine the tracking, but enough confusing power would cost too much.

The key requirement of openpose seemed to be 8GB of video RAM.  The laptop had 4GB of video RAM & required manes voltage & ice packs to run at full speed, so it was far from portable.

The next step would be tracking a single face in a crowd of other faces.

openpose.mac.tar.xz - Bits for openpose & caffe that were changed for mac. (11.55 kB, 01/04/2019 at 18:38)

countreps.mac.tar.xz - The simplest demo for mac. (1.71 kB, 01/04/2019 at 18:36)

countreps.c - Simplest Linux demo (5.34 kB, 01/02/2019 at 08:31)

Makefile - Simplest Linux makefile (673.00 bytes, 01/02/2019 at 08:31)

  • Tracking robot evolution

    lion mclionhead 11/30/2021 at 07:53

    The phone GUI finally got to the point where it was usable.  It still crashed, took forever to connect, & had a few persistent settings bugs.  It was decided that the size of the phone required full screen video in portrait & landscape mode, so the easiest way to determine tracking mode was by phone orientation.

    Rotating the phone to portrait mode would make the robot track in portrait mode.

    Rotating the phone to landscape mode would make the robot track in landscape mode.

    Then, we have some configuration widgets & pathways to abort the motors.  It was decided to not use corporate widgets so the widgets could overlap video.  Corporate rotation on modern phones is junk, so that too is custom.  All those professional white board programmers can't figure out what to do with the useless notch space in landscape mode.

    The mechanics got some upgrades to speed up the process of switching between portrait & landscape mode.

    The key design was a 2 part knob for a common bolt.

    Some new lens adapters made it a lot more compact, but the lens adapters & bolts would all have to be replaced to attach a fatter lens.

    The original robot required relocating the landscape plate in a complicated process with a socket wrench.  The landscape plate can now be removed to switch to portrait mode with no tools.

    The quest for lower latency led to using mjpeg inside ffmpeg.  That was the only realtime codec.  The only latency is buffering in video4linux.

    On the server, the lowest latency command was:

    ffmpeg -y -f rawvideo -y -pix_fmt bgr24 -r 10 -s:v 1920x1080 -i - -f mjpeg -pix_fmt yuv420p -bufsize 0 -b:v 5M -flush_packets 1 -an - > /tmp/ffmpeg_fifo

    On the client, the lowest latency command was:

    FFmpegKit.executeAsync("-probesize 32 -vcodec mjpeg -y -i " + stdinPath + " -vcodec rawvideo -f rawvideo -flush_packets 1 -pix_fmt rgb24 " + stdoutPath,

    The reads from /tmp/ffmpeg_fifo & the network socket needed to be as small as possible.  Other than that, the network & the pipes added no latency.  The total bytes on the sender & receiver were identical.  The latency was all in ffmpeg's HEVC decoder.   It always lagged 48 frames regardless of the B frames, the GOP size, or the buffer sizes.  A pure I frame stream still lagged 48 frames.  The HEVC decoder is obviously low effort.  A college age lion would have gone into the decoder to optimize the buffering.
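
    As a sketch of what keeping the reads small looks like, here is a minimal forwarding loop in C, assuming ffmpeg is already writing mjpeg to /tmp/ffmpeg_fifo & the phone is connected on sock.  All the names here are made up; the real code is in the countreps source.

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    #define CHUNK 1024  // small chunks keep partial frames moving

    void forward_fifo(int sock)
    {
        unsigned char buffer[CHUNK];
        int fifo = open("/tmp/ffmpeg_fifo", O_RDONLY);
        if (fifo < 0) { perror("open fifo"); return; }
        while (1) {
            // read returns as soon as any data arrives, not when the
            // buffer is full, so small chunks leave nothing waiting
            int bytes = read(fifo, buffer, CHUNK);
            if (bytes <= 0) break;                       // encoder quit
            if (write(sock, buffer, bytes) <= 0) break;  // phone hung up
        }
        close(fifo);
    }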

    The trick with JPEG was limiting the frame rate to 10fps & limiting the frame size to 640x360.  That burned 3 megabits for an acceptable stream.  The HEVC version would have also taken some heroic JNI to get the phone above 640x360.  Concessions were made for the lack of a software budget & the limited need for the tracking camera.
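
    (3 megabits at 10fps works out to roughly 37 kB per frame, which is about right for a 640x360 JPEG at moderate quality.)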

  • Death of IR remotes & return of wireless GUI's

    lion mclionhead 11/22/2021 at 03:53

    The journey began with a new servo output/IR input board.  Sadly, after dumping a boatload of money on the LCD panel, HDMI cables, & spinning this board, it became quite clear that the last IR remote in the apartment wouldn't work.  It was a voice remote for a Comca$t/Xfinity Flex, which would require a lot of hacking to send IR, if not a completely new board.

    The idea returned to using a phone as the user input & if the phone is the user input, it might as well be the GUI, so the plan headed back to a 2nd attempt at a wireless phone GUI.  It didn't go so well years ago with repcounter.  Wifi dropped out constantly.  Even an H264 codec couldn't fit video over the intermittent connection.

    Eliminating the large LCD & IR remote is essential for portable gear.  Gopros somehow made wireless video reliable & there has been some evolution in screen sharing for Android.  Android may slow down programs which aren't getting user input.

    Screen sharing for Android seems to require a full chrome installation on the host & a way to phone home to the goog mothership though.  Their corporate policy is to require a full chrome installation for as little as hello world.

    The best solution for repcounter is still a dedicated LCD panel, camera, & confuser in a single piece assembly.  It doesn't have to be miniaturized & would benefit from an IR remote.  The same jetson could run both programs by detecting what was plugged in.  The user interfaces are different enough for it to not be a horrible replication of effort.

    Anyways, it was rewarding to see yet another home made piece of firmware enumerate as a legit CDC ACM device.  

    The new LCD driver from China had a defective cable.  Pressing it just the right way got it to work.  

    Pressing ahead with the wireless option evolved into sending H.264 frames to a VideoView widget on a phone.  Decoding a live stream in the VideoView requires setting up an RTSP server with ffserver & a video encoder with ffmpeg.  The config file for the server is:

    Port 8090
    BindAddress 0.0.0.0
    RTSPPort 7654
    RTSPBindAddress 0.0.0.0
    MaxClients 40
    MaxBandwidth 10000
    NoDaemon
    
    <Feed feed1.ffm>
    File /tmp/feed1.ffm
    FileMaxSize 2000M
    # Only allow connections from localhost to the feed.
    ACL allow 127.0.0.1
    </Feed>
    
    <Stream mystream.sdp>
    Feed feed1.ffm
    Format rtp
    VideoFrameRate 15
    VideoCodec libx264
    VideoSize 1920x1080
    PixelFormat yuv420p
    VideoBitRate 2000
    VideoGopSize 15
    StartSendOnKey
    NoAudio
    AVOptionVideo flags +global_header
    </Stream>
    
    <Stream status.html>
    Format status
    </Stream>
    
    

    The command to start the ffserver is:

    ./ffserver -f /root/countreps/ffserver.conf

    The command to send data from a webcam to the RTSP server is:

    ./ffmpeg -f v4l2 -i /dev/video1 http://localhost:8090/feed1.ffm

    The command to start the VideoView is:

    video = binding.videoView;
    video.setVideoURI(Uri.parse("rtsp://10.0.0.20:7654/mystream.sdp"));
    video.start();

    The compilation options for ffserver & ffmpeg are:

     ./configure --enable-libx264 --enable-pthreads --enable-gpl --enable-nonfree

    The only version of ffmpeg tested was 3.3.3.  ffserver was dropped from the ffmpeg tree after the 3.4 series, so newer releases can't run this setup at all.

    This yielded horrendous latency & required a few tries to work.  


    The next step was a software decoder using ffmpeg-kit & a raw socket.  On the server side, the encoder requires a named FIFO sink created by mkfifo("/tmp/mpeg_fifo.mp4", 0777);

    ffmpeg -y -f rawvideo -y -pix_fmt bgr24 -r 30 -s:v 1920x1080 -i - -vf scale=640:360 -f hevc -vb 1000k -an - > /tmp/mpeg_fifo.mp4

    That ingests raw RGB frames & encodes downscaled HEVC frames.  The HEVC frames are written to a socket.
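
    As a sketch of the server side plumbing in C, assuming 1920x1080 BGR input frames & the exact command above (the function names are made up; the real code is in the countreps source):

    #include <stdio.h>
    #include <sys/stat.h>

    #define W 1920
    #define H 1080

    // Start ffmpeg with its stdin as the raw frame sink & the named
    // FIFO as the HEVC source.  popen runs the command through sh,
    // so the redirection to the FIFO works as written.
    FILE *start_encoder()
    {
        mkfifo("/tmp/mpeg_fifo.mp4", 0777);
        return popen("ffmpeg -y -f rawvideo -pix_fmt bgr24 -r 30 "
                     "-s:v 1920x1080 -i - -vf scale=640:360 -f hevc "
                     "-vb 1000k -an - > /tmp/mpeg_fifo.mp4", "w");
    }

    // Call once per captured frame.  A 2nd thread drains the FIFO
    // into the network socket.
    void write_frame(FILE *encoder, const unsigned char *bgr)
    {
        fwrite(bgr, 1, W * H * 3, encoder);
        fflush(encoder);  // don't let stdio sit on the frame
    }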

    On the client side, ffmpeg-kit requires creating 2 named pipes.

    String stdinPath = FFmpegKitConfig.registerNewFFmpegPipe(getContext());
    String stdoutPath = FFmpegKitConfig.registerNewFFmpegPipe(getContext());
    


    The named pipes are passed to the ffmpeg-kit invocation.

    FFmpegKit.executeAsync("-probesize...

  • Portable plans

    lion mclionhead 10/17/2021 at 21:24

    The tracking camera isn't useful unless it's portable.  It's been running on an Asus GL502V for years.  It only goes full speed on manes voltage & with ice packs.  NVidia jetson is the only portable hardware specifically mentioned by openpose.  

    The peak openpose configuration seemed to require 8GB of video RAM for a -1x384 netInputSize, although it gave acceptable results for years on 4GB of RAM & a -1x256 netInputSize.  

    Running nvidia-smi with openpose running gave the memory requirement of a -1x256 netInputSize.  

    The Jetson nano has evolved from 2GB to 4GB over the years without any change in model number, which has caused a lot of confusion.  It's based on the Maxwell chip which is what the GTX970M used.  Jetson nano's with 4GB range from $100 to over $200 & are all out of stock.

    https://www.sparkfun.com/products/16271

    https://www.okdo.com/us/p/nvidia-jetson-nano-4gb-development-kit/

    There is another pose estimator for goog coral.

    https://github.com/google-coral/project-posenet

    But a rare comparison shows posenet being nowhere close.

    We don't know if it was processing in realtime or offline with maximum settings, or how much memory was allocated to each program.  Goog coral modules are similarly priced to jetson & equally out of stonk.

    The plan envisions a Jetson module & LCD driver strapped to the back of the last LCD panel.  There will be a webcam on a new pan/tilt module for rep counting & the DSLR on the servocity module for animal tracking.  A new board will convert USB output to servo PWM for camera pointing & convert IR remote signals to USB input for user input.  The servo power will come from the USB cable, by hacking the USB ports on the jetson to output 5A.  It'll all be self contained with the LCD panel.  

    The user interface will be an IR remote with key maps shown on the screen.  The only way it could be debugged in the field would be a phone with a USB cable & some kind of ssh terminal.

    The jetson burns 5V 2.5A itself.  The servos burn 5V.  The LCD panel burns 8-12V.  There need to be 2 BEC's to power everything from either a 12V battery or ATX brick.  The 16V batteries envisioned for the truck wouldn't do much good for the LCD.  The mane thing keeping it a plan instead of reality is of course that pesky 2 year disruption in the transient supply chain.

    Obtaneing the required confusing power has been so problematic, the lion kingdom believes the total confusing power of the world is less than it was 2 years ago, due to lack of new confusers being created & old confusers going out of service, but no-one is actually counting.  If we have gone backwards, it would be the 1st time it ever happened since the bronze age collapse.  

    There was an estimate of the world's confusing power from 1950 to 2018.

    https://incoherency.co.uk/blog/stories/world-computing-power.html

    The lion kingdom's obsolete Ryzen 7 2700x is rated at 59000 MIPS or the entire world's confusing power in 1973.

  • Face tracking dreams

    lion mclionhead 04/04/2021 at 02:45

    There has always been a dream of running a face tracker simultaneously with the pose tracker to refine the composition, but getting enough confusing power has been expensive.

    Impressive demo of face tracking with downloadable software components, though no turnkey package is provided.  The goog coral is a way to get portable image recognition without dedicating an entire confuser just to the tracker.  The NVidia jetson requires porting everything to their ARM platform.

    The DJI/Ryze Tello is a remarkable achievement in indoor navigation by itself.  It would be a good platform for anything that required 3D positioning indoors & is what lions originally envisioned as the ultimate copter, something which was compact & could be thrown in the air to get an automatic selfie.  If only lions could think of a need for it now.  It also has a finite lifespan with brushed motors.

    Frustrating to have battled with inferior cameras, sonar, & IMU's 10 years ago, never to get anywhere close to what the Tello can do.  The accelerometer based localization all modern copters use started taking off right when lions moved on.  No-one really knows how modern quad copters work anymore.  They don't use sonar anymore.  They use low cost lidar modules invented in the last 10 years, which are still extremely expensive as loose modules.

  • Portrait mode with the flash & different lenses

    lion mclionhead 06/04/2020 at 04:21

    This arrangement was the fastest to set up.

    28mm

    17mm.  Then, there was a more spaced arrangement which took longer to set up.

    There were more shadows.  For a single flash, it's better to have it closer to the camera.  The only lens used in practice is the 17mm at an optimum distance from the camera, but the lion kingdom put some effort into making it work with longer lenses & less optimum distances.  In testing, it still gave the most useful results with the 17mm.

    There were 2 different camera elevations.

    The desired material looks better at waist height, but then the flash is farther from the ceiling.  Many algorithms were tried to improve the tilt tracking.  Estimating the head size was required.  The head estimation leads to a different tilt when the subject is looking at the camera vs. looking sideways.

    Other problems are camera motion while shooting & seeing a preview after shooting.  The tracker starts tracking the preview.  A beefed up remote control could allow the lion to freeze the tracker when showing the preview, but the same amount of manual intervention can also clear the preview before the tracker goes too far off.  In practice, the camera usually isn't moving during a photo so the preview doesn't make it move.

    The 17mm has proven to be 1 stop darker than the 28mm & 50mm.  That's why it was only $600.  Forget about discovering that from the adsense funded internet.  F stop doesn't account for light transmission efficiency, so lenses with the same f stop can have wide variations in brightness.

    Then, there was a boring test of video.

  • Replacing the bulb in the 580 EX II

    lion mclionhead 06/02/2020 at 06:42

    The lion kingdom's 580 EX II died after 12 years.  Lions took many indoor photos with it.  

    Then, this arrived.  It behooves humans to get a bulb assembly rather than a bulb.  

    https://www.walmart.com/ip/Canon-Speedlight-580EX-II-flash-reflector-flash-tube-assembly-CY2-4229/525541142

    The bulb is very hard to replace on its own.  There was a starting guide on 

    https://joelgoodman.net/2012/07/19/flash-bulb-repair-canon-580ex-ii/

    It's essential to discharge the capacitor.  It still had 200V after 2 weeks with no batteries.

    There is a discharging hole with electrical contact inside, exposing the capacitor's + terminal.  This must be grounded through a 10k resistor to the flash ground, without touching the resistor or ground while touching the + lead.  The trick is to keep 1 paw behind your back while holding the + lead with your other paw.
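
    For scale, assuming a typical speedlight capacitor on the order of 1000 µF (the service manual would have the real value): the energy stored at 330V is ½CV² ≈ 54 J, & a 10k bleed resistor gives an RC time constant of about 10 seconds, so it takes roughly half a minute (3 time constants) to get from 330V down to a harmless ~16V.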

    A few screws revealed the electronicals.

    The bulb assembly is on a corkscrew drive.  The corkscrew drive moves it to adjust the spread of the beam.

    The 12 year old bulb was cactus.

    4 cables connected to the assembly.

    The old bulb & silicone were liberated, after discovering the bulb was as fragile as paper.

    Then 1 end of the new bulb was soldered in before inserting it back into the enclosure.  

    This was the wrong way to insert the silicone. 

    The lion kingdom did what it could with the silicone on 1st.  The soldered end went back into the assembly.  The unsoldered end received its silicone 1st, then wire, & finally heat shrink.  The heat shrink was too long, but if the sharper turns break the wire, there's more wire from an old LCD backlight in the apartment.

    Based on the challenge of getting the silicone on, all 3 wires clearly need to be desoldered from the PCB 1st.  The wires should be soldered to the bulb without the silicone.  Then, the heat shrink should be put on.  Then, the silicone needs to be fed around the wires before soldering the wires back on the PCB.  The assembly probably doesn't need to be taken off the corkscrew drive if you have the right tweezers.

    The lenses only go on 1 way.

    Reassembling the 4 wires showed how the 580 EX II wasn't designed at all for manufacturability.  They wanted the best possible flash, no matter how expensive it was.  


    Then, the deed was done, showing what a luxurious flash it was compared to a cheap flash from 40 years ago.  

  • Tracking 2 people

    lion mclionhead 05/28/2020 at 07:11



    This was some rare footage of 2 people doing random, intimate movements from a stationary camera.  Showing it on a large monitor & having the camera track the monitor might be good enough to test 2 people.


    Automated tracking cam with 2 subjects was a disaster. Most often, 1 animal covered part of the other animal, causing it to put the head in the top of the frame. When they moved fast, they caused a sympathetic oscillation. Setting up the test was quite involved.

    Eventually, the corner cases were whittled down.

    Openpose lost tracking when they were horizontal, but it didn't know to look to the right for the missing head either.

    When both were visible, it tended to look down.  This may be from the lack of tracking the head outline.

    When both were standing up but too different in height to fit in the frame, it tracked 1 or the other.

    A tough composition with partially visible bodies & 1 head much closer than the other made it place the smaller head where the bigger head was supposed to appear & place the bigger head at the bottom.

    Another tough one with 2 partially visible bodies.

    When both were standing & close in height, it tracked well.  Since it tracks heads, the key is to keep the heads close in height.  The saving grace was that no matter how bad the tracking got, it never permanently lost the subjects.

  • Hacking a flash battery pack to use a lipo

    lion mclionhead 05/25/2020 at 22:52

    The flash needs to be externally mounted to keep the tracking mount as light as possible.  Also, in a high pressure photo shoot, the flash should be powered externally.  After Ken Rockwell extolled his frustrations with battery powered flashes https://www.kenrockwell.com/tech/strobes.htm, it behooved the lion kingdom to give the 580 EX II an extra boost.

    The lion kingdom invested in a cheap JJC battery pack, for just the cost of a boost converter & a cable, only to find the JJC's are sold with 1 connector type for 1 camera.  The goog reported the cables can't be bought separately.  

    https://www.amazon.com/JJC-BP-CA1-External-600EX-RT-YN600EX-RT/dp/B01GUNLQLW

    So in its haste, the lion kingdom spent $18 on a used cable from fleebay, which ended up broken.

    In the meantime, the goog updated its search results 5 days later to yield a brand new $12 cable.

    https://www.amazon.com/Connecting-Replacement-JJC-Recycling-YN600EX-RT/dp/B01G8PMZ12

    The total cost of the 1 hung lo JJC ended up more than a high end battery pack, not unlike how bang good soldering irons end up costing as much as a JBC with all the accessories.  It wasn't the 1st time lions were ripped off by the fact that goog can take 5 days to perform a novel search.

    Since 2011, a drawing has circulated around the internet with the 580 EX II pinout, but nothing useful on the voltages.  Fortunately, there actually is a servicing manual for the flash.

    https://www.manualslib.com/download/379083/Canon-580exii.html

    The control signal is 0-5V with 5V being the on signal.

    The external battery pack directly feeds the mane capacitor through some diodes.  The mane capacitor normally charges to 330V (page 24) but the status LED turns green when it's 213V & red when it's 268V (page 25).  The flash MCU resets if the voltage goes above 350V.  The external battery pack boosts what the internal charger already does, but doesn't have to completely charge the capacitor.


    Interior after modifications to use a LIPO.  Manely, the 3 battery terminals have been replaced.

    Turning to the JJC battery pack, the Nikon cable has a 100k resistor from K1' to K & a 100k resistor from GND to K2.  PH+ is the high voltage.  The resistors are probably selecting the voltage.  CNT is the control signal from the flash. 

    The Canon cable has no resistors.  All the K pins are floating.  Only PH+, GND, & CNT are connected.

    A quick test with the Nikon resistors showed it makes an unstable 290-320 volts.



    All the external battery packs use 2 boost converters in parallel.  Each boost converter runs on 6V.  If they have 8 AA's, they run both boost converters.  If they have 4 AA's, they run 1 boost converter.

    The battery pack has 3 taps: 12V at B+, 6V at BM, GND at G.  It can run on 6V from B+ to BM, or 6V from BM to G.  To run it on a 12V lipo from B+ to G, a regulator has to supply the 6V tap to get both boost converters to fire.  The lion kingdom whacked on an LM7806.





    The JJC has 2 5A current limiting fuses going to the batteries.  It's essential to leave these in place with their heatsinks.

    The JJC was modified with the 6V regulator & current limiting fuses in the battery compartment.  In testing, the JJC drew 5A to charge the flash.  Combined with the fresh batteries in the flash, it supplied more power than the flash could use without destroying itself.  Running on a lipo, the 580 EX II is basically a very expensive manes powered strobe.


    The 6V regulator got momentarily hot, but the flash couldn't draw enough power to keep it hot.  It was time to see if it did anything useful.

    BM with the LM7806 when recharging

    BM without the LM7806 when recharging.  

    BM without the LM7806 when idle.

    Its peak current is 2A while the boost converters are running 5A in series.  It's obviously making a prettier waveform when idle...


  • Portrait mode & HDMI tapping

    lion mclionhead 05/17/2020 at 19:10

    The tracker needs to support still photos in portrait mode.  Portrait mode is never used for video, at least by these animals.  A few days of struggle yielded this arrangement for portrait mode. The servo mount may actually be designed to go into this configuration. 

    It mechanically jams before smashing the camera. Still photos would be the 1st 30 minutes & video the 2nd 30 minutes of a model shoot, since it takes a long time to set up portrait mode.

    News flash: openpose can't handle rotations.  It can't detect a lion standing on its head.  It starts falling over with just 90 degree rotations. The video has to be rotated before being fed to openpose.
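
    A brute force rotation is cheap enough at these frame sizes.  A minimal sketch in C, assuming packed BGR frames (the real tracker may well do this with a library or on the GPU instead):

    // Rotate a packed BGR frame 90 degrees clockwise before handing
    // it to openpose.  src is w x h; dst must be allocated h x w.
    void rotate90(const unsigned char *src, unsigned char *dst,
                  int w, int h)
    {
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                for (int c = 0; c < 3; c++)
                    // source pixel (x, y) lands at column h - 1 - y,
                    // row x of the rotated frame
                    dst[(x * h + (h - 1 - y)) * 3 + c] =
                        src[(y * w + x) * 3 + c];
    }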

    https://www.amazon.com/gp/product/B0876VWFH7

    Also installed were the HDMI video feed & the remote shutter control.  These 2 bits achieved their final form.  Absolutely none of the EOS RP's computer connections ended up useful.  Note the gears gained a barrier from the cables.

    News flash: the EOS RP can't record video when it's outputting a clean HDMI signal.  The reason is clean HDMI causes it to show a parallel view on the LCD while sending a clean signal to HDMI.  It uses a 2nd channel that would normally be used for recording video & there's no way to turn it off. 

    Your only recording option when outputting clean HDMI is on the laptop.  Alas, the $50 HDMI board outputs real crummy JPEG's at 29.97fps or 8 bit YUYV at 5fps.  It's limited by USB 2.0 bandwidth, no good for pirating video or any recording.

    The mighty Tamron 17-35mm was the next piece.  It was the lion kingdom's 1st new lens in 12 years.  The lion kingdom relied on a 15mm fisheye for its entire wide angle career.  The 15mm was over $500 when it was new.  It was discontinued in 2011 & used ones are now being sold for $350.  Its purchase was inspired by a demo photo taken with the 15mm on an EOS 1DS.

    Defishing the 15mm in software gave decent results for still photos, but less so for video.  There will never be a hack to defish the 15mm in the camera.

    With the HDMI tap, it could finally take pictures & record video through the camera.  The tracker did its best to make some portraits.  The tracking movement caused motion blur.  Large deadband is key for freezing the camera movement.  Portrait mode still needs faster horizontal movement, because it has less horizontal room.

    Openpose lacks a way to detect the outline of an animal.  It only detects eyes, so the size of the head has to be estimated by the shoulder position.  It gets inaccurate if the animal bends over.  Openpose has proven about as good at detecting a head as a dedicated face tracker.
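
    A minimal sketch of that estimate, assuming openpose hands over eye & shoulder positions in screen coordinates (the 0.5 factor is a made up placeholder; the real constants live in the tracker's calibration):

    typedef struct { float x, y; } point_t;

    // Estimate the top of the head from the eyes & a shoulder, since
    // openpose only reports eyes.  Screen Y grows downward.
    float head_top(point_t eye, point_t shoulder)
    {
        float eye_to_shoulder = shoulder.y - eye.y;
        // assume the head extends above the eyes by about half the
        // eye to shoulder distance (hypothetical factor)
        return eye.y - eye_to_shoulder * 0.5f;
    }

    This is also why it goes wrong when the animal bends over: the vertical eye to shoulder distance collapses even though the head hasn't changed size.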

    The tracker has options for different lenses.  Longer lenses make it hunt. 50mm has been the limit for these servos.   Adding deadbands reduces the hunting but makes it less accurate.  It's definitely going to require a large padding from the frame edges.  For talking heads, the subject definitely needs to be standing up for the tracker to estimate the head size.
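
    The deadband itself is just a clamp on the error before it reaches the servo.  A hypothetical sketch:

    #include <math.h>

    // Ignore errors smaller than the deadband so the servo freezes
    // instead of hunting.  Bigger deadband = less hunting but a
    // sloppier composition.
    float apply_deadband(float error, float deadband)
    {
        if (fabsf(error) < deadband) return 0;
        // subtract the deadband so the output stays continuous at
        // the edge instead of jumping
        return error > 0 ? error - deadband : error + deadband;
    }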


    A corner case is if the entire body is in frame, but the lower section is obstructed.  The tracker could assume the lower section is out of frame & tilt down until the head is on top.  In practice, openpose seems to create a placeholder when the legs are obstructed.

  • Tilt tracking

    lion mclionhead 05/14/2020 at 19:36

    Tilt tracking was a long, hard process but after 4 years, it finally surrendered.  The best solution ended up dividing the 25 body parts from openpose into 4 vertical zones.  Depending on which zones are visible, it tilts to track the head zone, all the zones, or just tilts up hoping to find the head.  The trick is dividing the body into more than 2 zones.  That allows a key state where the head is visible but only some zones below the head are visible.
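
    A minimal sketch of the zone logic, assuming openpose's 25 part model with normalized coordinates (the part-to-zone table & the tilt-up step are placeholders; the real values are in the countreps source):

    #define PARTS 25
    #define ZONES 4
    #define HEAD  0   // zone 0 is the head zone

    typedef struct { float x, y; int valid; } part_t;

    // hypothetical table assigning each openpose part to a zone
    extern const int part_zone[PARTS];

    // Returns the Y error to feed the tilt servo.
    float tilt_error(const part_t *parts, float target_y)
    {
        int visible[ZONES] = { 0 };
        float head_y = 0, all_y = 0;
        int head_n = 0, all_n = 0, zones = 0;

        for (int i = 0; i < PARTS; i++) {
            if (!parts[i].valid) continue;
            visible[part_zone[i]] = 1;
            all_y += parts[i].y;  all_n++;
            if (part_zone[i] == HEAD) { head_y += parts[i].y; head_n++; }
        }
        for (int i = 0; i < ZONES; i++) zones += visible[i];

        if (!head_n) return -0.1f;            // no head: tilt up & hope
        if (zones == ZONES)
            return all_y / all_n - target_y;  // whole body: track all zones
        return head_y / head_n - target_y;    // partial body: compose on the head
    }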

    The composition deteriorates as the subject gets closer, but it does a good job tracking the subject even with only the viewfinder.  It tries to adjust the head position based on the head size, but head size can only be an estimation.

    It supports 3 lenses, but each lens requires different calibration factors, especially the relation between head size & head position.  The narrower the lens, the fewer body parts it sees & the more it just tracks the head.  The narrower the lens, the slower it needs to track, since the servos overshoot.  Openpose becomes less effective as fewer body parts are visible.  Since only the widest lens will ever be used in practice, only the widest lens is dialed in.

    The servos & autofocus are real noisy.  It still can't record anything.

    All this is conjecture, since the mane application is with 2 humans & there's no easy way to test it with 2 humans.  With 2 humans, it's supposed to use the tallest human for tilt & the average of all the humans for pan.

    Pan continues to simply center on the average X position of all the detected body parts.   

    There is a case for supporting a narrow lens for portraits & talking heads.  It would need a deadband to reduce the oscillation & the deadband would require the head to be closer to the center.  A face tracker rather than openpose would be required for a talking head.
