
Auto tracking camera

A camera that tracks a person & counts reps using *AI*.

The source code: https://github.com/heroineworshiper/countreps

Rapidly becoming the next big thing, 1st with subject tracking on quad copters, then subject tracking on digital assistants.  It's long been a dream to have an autonomous camera operator that tracks a subject.  The Facebook Portal was the 1st sign lions saw that the problem was finally cracked.  The problem is all existing tracking cameras are operated by services which collect the video & either sell it or report it to government agencies.

Compiling & fixing a machine vision library to run as fast as possible on a certain computer is such a monumental task, it's important to reuse it as much as possible.  To simplify the task of a tracking camera, the same code is used to count reps & track a subject.  The countreps program was a lot more complicated & consumed most of…

The lion kingdom started getting ideas to make a tracking camera in July 2014.  Quad copter startups were booming & tracking subjects by GPS suddenly caught on, even though it was just a rehash of the worthless results hobbyists were getting in 2008.  The lion kingdom figured it could improve on it with machine vision tracking fiducial markers.

It was terrible.  You can't make a video wearing all those markers & the picture quality wasn't good enough to reliably detect the markers.  To this day, hobbyist tracking cams are all still using chroma keying & LED's.  The lion kingdom would do better.

The next step occurred in Aug 2016 with LIDAR.



That had problems with reflections in windows & couldn't detect tilt.  It could only estimate tilt by the distance of the subject from the camera.

2018 saw an explosion in CNN's for subject tracking.  The key revelation was openpose.  That theoretically allowed a camera to track a whole body or focus in on a head, but it didn't allow differentiating bodies.  The combination of openpose & a 360 camera finally allowed a subject to be tracked in 2 dimensions, in 2019.


The problem was a 360 camera with live output was expensive & cumbersome to get working.  The live video from the 360 camera drove the recording camera & had a long lag.  Cameras which recorded high quality video didn't have a wide enough field of view or fast enough autofocus to track anything.  The tracking camera was offset from the recording camera, creating parallax errors.

Tracking would have to come from the same camera that recorded the video. That would require a wide angle lens, very fast autofocus, & very high sensitivity.  It took another year for cameras to do the job for a reasonable price.





The EOS RP allowed wide angle lenses & had much faster autofocus than previous DSLRs.  Together with a faster laptop, the tracking system was manely doing the job.  Openpose couldn't detect the boundaries of the head, only the eye positions.  That made it point low.  A face tracker running in parallel would refine the tracking, but enough confusing power would cost too much.

The key requirement of openpose seemed to be 8GB of video RAM.  The laptop had only 4GB of video RAM & required manes voltage & ice packs to run at full speed, so it was far from portable.

The next step would be tracking a single face in a crowd of other faces.

openpose.mac.tar.xz - Bits for openpose & caffe that were changed for mac. (x-xz, 11.55 kB, 01/04/2019 at 18:38)

countreps.mac.tar.xz - The simplest demo for mac. Obsolete. (x-xz, 1.71 kB, 01/04/2019 at 18:36)

countreps.c - Simplest Linux demo. Obsolete. (x-csrc, 5.34 kB, 01/02/2019 at 08:31)

Makefile - Simplest Linux makefile. Obsolete. (makefile, 673 bytes, 01/02/2019 at 08:31)

  • Canon Powershot Pick, portable confusing options

    lion mclionhead 07/16/2022 at 18:28

    Canon's entry in the tracking camera space is the Powershot Pick, a very low quality OBSBOT webcam clone.  Kind of a reversal of roles that Canon makes toys while DJI makes the only trackers which support a DSLR.  

    It uses the same low power face tracking & object tracking as the others. 

    Embedded GPUs have gone the way of the dinosaur, but there was a guy who got a single board confuser to support a PCI card GPU.  The board was a $220 ODYSSEY - X86J4105800:

    https://www.seeedstudio.com/ODYSSEY-X86J4105800-p-4445.html

    with a $35 M2 - PCI-E adapter:

    https://www.amazon.com/ADT-Link-Extender-Graphics-Adapter-PCI-Express/dp/B07YDH8KW9

    The PCI-E adapter has a flex cable allowing the GPU card & confuser to be in line.

    The GPU was a $270 GTX 1650 with 4GB.

    https://www.amazon.com/MSI-NVIDIA-GeForce-DL-DVI-D-PCI-Express/dp/B07Y87ZRYQ/

    The power supply was a $20 12V to 150W mini ITX unit:

    https://www.amazon.com/RGEEK-24pin-Input-Output-Switch/dp/B07WDG49S8

    He couldn't get any benchmarks to work & didn't try CUDA.  The lion kingdom would try a $30 250W version:

    https://www.amazon.com/RGEEK-24pin-Input-Output-Switch/dp/B07PPYWGNH

    The stock drivers & games seemed to work.  All in, it was at least $600 with no guarantee CUDA will work.  A vintage jetson can actually be bought for under $300.

    https://www.seeedstudio.com/Jetson-10-1-H0-p-5335.html

    There are more expensive single board confusers which support GPUs via an M2 adapter, the $650  NUC-1260P

    https://www.newegg.com/asrock-nuc-box-1260p/p/N82E16856158081

    A PCI card is a lot more confusing power than the lion kingdom's ancient laptop, but the system wouldn't be as portable as either the laptop or vintage jetson.  

    There are some less portable mini PC's which use 19V laptop power supplies.  The $600 elitemini B550 

    https://store.minisforum.com/products/b550

    conceptually has a purpose built dock for a PCI GPU but it's not manufactured.  Of course, the lion kingdom can't afford any of these options, but the portable confusing power is out there.

  • The last temptation of christ

    lion mclionhead 03/12/2022 at 08:50

    Trying it with some sacrilegious videos playing on a monitor, the mane point is it falls apart when 2 humans are close together in a scene.  There's a lot more noise in the positions.  The accuracy is definitely worse than when tracking a single lion standing up.  It would be constantly moving around like an unstabilized phone cam.

    The windowing algorithm creates a lot of noise & causes the hits to move around even when the video is paused.

    Normally, it only detects 1 of the 2 humans or it detects both humans as a single very noisy hit.  It probably detects only 1 human when they're together & the hit oscillates between the 2 humans.

    The most desirable horizontal body positions aren't detected at all.  It's a bigger problem in practical video because most of the poses are horizontal.

    Lowpass filtering & optical flow could suppress some of the noise, but the servos already provide a lot of lowpass filtering.  The lion kingdom has long dreamed of running a high fidelity model at 1fps & scanning the other frames with optical flow.  This would entail a 1 second lag for that high fidelity model.  Every second, after getting an inference, it would have to use optical flow to fill in 1 second of frames to catch up to the present.  It might be doable with low res images.
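
    A minimal sketch of that idea, assuming OpenCV & ignoring the 1 second catch-up bookkeeping: run a placeholder slow_pose_model() on roughly 1 frame per second & drag its keypoints through the in-between frames with pyramidal Lucas-Kanade optical flow.  The model, intervals, & thresholds here are stand-ins, not the countreps code.

    import cv2
    import numpy as np

    def slow_pose_model(frame):
        # placeholder for the 1fps high fidelity model; returns Nx2 keypoints
        h, w = frame.shape[:2]
        return np.array([[w / 2.0, h / 2.0]], dtype=np.float32)

    cap = cv2.VideoCapture(0)                       # any low res video source
    lk_params = dict(winSize=(21, 21), maxLevel=3)
    prev_gray = None
    points = None
    frame_count = 0

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if frame_count % 10 == 0 or points is None:  # slow model, ~1fps at 10fps capture
            points = slow_pose_model(frame).reshape(-1, 1, 2)
        else:
            # drag the last known keypoints forward with optical flow
            points, status, err = cv2.calcOpticalFlowPyrLK(
                prev_gray, gray, points, None, **lk_params)
        prev_gray = gray
        frame_count += 1
        # points now feeds the same pan/tilt math the full model would feed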

    Face tracking might give better results, but it would lose the ability to center on the body.

    The Intel Movidius is the only embedded neural engine still made.  It's a lot more expensive than the equivalent amount of computing power 3 years ago, but that might be a fair price if computing power is permanently degraded.

    The only useful benchmark lions could find showed it getting 12fps on an undisclosed efficientdet model.  It can be implied from the words "up to 12fps" in his sales pitch that it's the fastest efficientdet model.  That's 50% faster than the fastest efficientdet in software, not really enough to improve upon the quality seen. 

    All GPU systems are all or nothing.  They can't split a single model between the CPU & the GPU, but they could allow a 2nd face detection model or optical flow to run concurrently on the CPU.  That would give more like a 250% increase.

  • Best results with raspberry pi 4

    lion mclionhead 03/09/2022 at 03:18

    The best rule with efficientdet_lite0 has been hard coding the number of animals in frame & making sure that number is always visible so it doesn't get false positives.  Tracking just the highest score would oscillate between 2 animals.  Since lions are poor, testing would require either printing out a 2nd animal or just testing with 1 animal & living with errors.

    Efficientdet hasn't been accurate enough to allow any deadband.  The mane rule has been to keep the top of the box 7% from the top of the frame with no deadband.  If the total height of the box is under 75% of the screen height, center the entire box in the frame regardless of head position.

    Minimum distance from the top must be greater than the deadband or it won't tilt up.  Another problem is if the box is under 75% but the head is above the frame, it won't tilt up in some cases.  That might be a matter of deadband.  If the difference between 75% & 7% is too low, the box will oscillate between being 75% of the screen height & 7% from the top.

    Most of the time, the box is the full frame height, so it uses the head rule.  To get under 75% of the screen height, the error rate has to go up.
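
    A sketch of the rule above as a single function.  The 7% & 75% numbers come from the log; the sign convention & the idea of returning a pixel error for the servo loop are assumptions, not the actual countreps implementation.

    def tilt_error(box_top, box_bottom, frame_h, top_margin=0.07, full_frac=0.75):
        # returns a pixel error; positive = the subject sits too low in the
        # frame (image y grows downward), so the camera should tilt down
        box_h = box_bottom - box_top
        if box_h < full_frac * frame_h:
            # box under 75% of the screen height: center the entire box
            box_center = (box_top + box_bottom) / 2.0
            return box_center - frame_h / 2.0
        # otherwise keep the top of the box 7% from the top of the frame
        return box_top - top_margin * frame_h

    # example: a full height box whose top sits at 20% of a 1080 line frame
    # tilt_error(216, 1080, 1080) -> 140.4 -> tilt down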

    Ideally, efficientdet could be trained on videos which have been labeled with COCOv5, but all the videos with the desired body positions are shot in sleazy hotel rooms & they're very addicting to edit.  The neural net needs a wide variety of backgrounds besides hotel rooms to isolate the human forms.

    5 hours on the GTX 970M yielded a new model trained just on hotel videos.  It was a total failure compared to the COCO training set.

    Animals have long dreamed of software that could synthesize unlimited amounts of a certain type of video.  That's basically why there are so many attempts at making synthetic videos which are so bad, they're called neural network dream generators.  The best we can do is software that points the camera.

  • Simplified pose tracker dreams

    lion mclionhead 02/26/2022 at 10:35

    The lion kingdom believes the pose tracker can run fast enough on a raspberry pi if it's reduced to tracking just 4 keypoints.  It currently just uses head, neck, hip, & foot zones to drive camera tilt.  These are fusions of 3-6 keypoints.  1 way is to reduce the movenet_multipose model itself to 4 keypoints.  Another way is to try training efficientdet_lite on the 4 objects for the 4 keypoints.  

    The mane point of the 4 keypoints is tracking when the complete body isn't visible.  A full body bounding box can't show if the head or the feet are visible & can't detect anything if it's too close.
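
    For reference, a sketch of fusing the 17 COCO keypoints into the 4 zones.  The exact groupings & score threshold are guesses; the real ones live in the countreps source.

    import numpy as np

    # COCO-17 keypoint indexes
    ZONES = {
        "head": [0, 1, 2, 3, 4],    # nose, eyes, ears
        "neck": [5, 6],             # shoulders
        "hip":  [11, 12],           # hips
        "foot": [15, 16],           # ankles
    }

    def fuse_zones(keypoints, scores, min_score=0.3):
        # keypoints: (17, 2) array of x, y; scores: (17,) confidences
        # returns {zone: (x, y)} for zones with at least 1 confident keypoint
        zones = {}
        for name, idx in ZONES.items():
            good = [i for i in idx if scores[i] >= min_score]
            if good:
                zones[name] = keypoints[good].mean(axis=0)
        return zones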

    Sadly, there's no source code for training the fastest model, movenet_multipose.  There's source code for another pose estimator based on the COCO2017 dataset:

    https://github.com/scnuhealthy/Tensorflow_PersonLab

    Key variables are defined in config.py: NUM_KP, NUM_EDGES, KEYPOINTS

    1 keypoint applies to both sides.  The other 16 have a left & right side which seem to be unnecessary.

    The mane problem is consolidating multiple keypoints into 1.  It's looking for a single point for every keypoint & COCO defines a single point for every keypoint instead of a box.  Maybe it can accept an average point in the middle of multiple keypoints or it can accept the same keypoint applying to multiple joints in different images.

    It's still a mystery to lions how the neural network goes from recognizing objects to organizing the hits into an array of coordinates.  There's not much in model.py.  The magic seems to happen in resnet_v2_101, a pretrained image classification model which presumably knows about the hierarchy of keypoints & persons.  The results of the image classification get organized by a conv2d model.  This seems to be a very large, slow process, not a substitute for movenet.

    Defining tensorflow models seems to be a basic skill every high school student knows but lions missed out on, yet it's also like decoding mp3's, a skill that everyone once knew how to do but now is automated.

    The lion kingdom observed when efficientdet was trained on everything above the hips, it tended to detect only that.  If efficientdet was trained just above shoulders & below shoulders, it might have enough objects to control pitch.  It currently just has 3 states: head visible but not feet, feet visible but not head, both visible.  It can probably be reduced to head & body visible or just body visible.  If the head is visible, put it in the top 3rd.  If the head is invisible but the body is visible, tilt up.

    The mane problem is training this dual object tracker.  Openpose could maybe label all the heads & bodies.  It's pretty bad at detecting sideways bodies.  The head has to be labeled from all angles, which rules out a face detector.

    There's a slight chance tracking would work with a general body detector.  Fuse the 2 largest objects.  It would tilt up if the object top was above a certain row.  Tilt down if the object bottom was above a certain row & the object top was below a certain row.
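
    A sketch of that fusion rule.  The threshold rows are left as assumed parameters & the up/down convention assumes image y grows downward.

    def fuse_and_tilt(boxes, frame_h, up_row=0.1, down_row=0.9, top_row=0.25):
        # boxes: list of (x1, y1, x2, y2) from the general body detector
        if not boxes:
            return 0                          # nothing detected: hold still
        # fuse the 2 largest objects into 1 bounding box
        largest = sorted(boxes, key=lambda b: (b[2] - b[0]) * (b[3] - b[1]))[-2:]
        top = min(b[1] for b in largest)
        bottom = max(b[3] for b in largest)
        if top < up_row * frame_h:
            return 1                          # object top above the row: tilt up
        if bottom < down_row * frame_h and top > top_row * frame_h:
            return -1                         # object rides high in frame: tilt down
        return 0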

    The mane problem is the general body detector tiles the image to increase the input layer's resolution.  It can't tile the image in portrait mode.

    There was an attempt to base the tracker on efficientdet_lite0 tracking a person.  It was bad.  The problem is since there are a lot more possible positions than running, it detects a lot of false positives.  It might have to be trained using unholy videos.  Another problem is the tilt tends to oscillate.  Finally, it's become clear that the model needs to see the viewfinder & the phone is too small.  It might require porting to Android 4 to run on the lion kingdom's obsolete tablet or direct HDMI on the pi with a trackpad interface.

    The best results with the person detector came from just keeping the top of the box 10% below the top of the frame...


  • Faster pose tracker

    lion mclionhead 02/05/2022 at 04:25

    There was a rumor that running a raspberry pi in 64 bit mode would make pose tracking run as fast as a GPU.  Sadly, raspberry pi production joined GPU production in valhalla so these notes won't be applicable to many animals.

    Traditionally, 64 bit mode was slower because it entails moving a lot more data around.  That plagued IA64.  It might have an advantage in executing more parallel floating point operations.

    The journey began by downloading ubunt 21 for raspberry pi 4.  It requires 4GB RAM & 16 GB of SD card.  It required a keyboard, mouse, & monitor for installation, which the lion kingdom didn't have.  There's no minimal ubunt like there is for raspbian.

    There were some notes about enabling the UART console & bypassing the installation GUI.

    https://limesdr.ru/en/2020/10/17/rpi4-headless-ubuntu/

    but nothing useful for logging in.  The only way to log in was to edit /etc/shadow on another confuser, copying the password hash of a known password to the root password field.

    Next came disabling some programs.  There might be a need to run X someday to test an LCD panel, so lions don't delete that anymore.

    mv /usr/sbin/NetworkManager /usr/sbin/NetworkManager.bak

    mv /usr/sbin/ModemManager /usr/sbin/ModemManager.bak

    mv /usr/lib/xorg /usr/lib/xorg.bak

    mv /sbin/dhclient /sbin/dhclient.bak

    swapoff -a; rm /swapfile


    Then create some network scripts.

    ip addr add 10.0.0.16/24 dev eth0

    ip link set eth0 up

    ip route add default via 10.0.0.1

    echo nameserver 75.75.75.75 > /etc/resolv.conf         

    There's a new dance to make this run in /etc/rc.local

    https://www.linuxbabe.com/linux-server/how-to-enable-etcrc-local-with-systemd

    Most importantly, the default runlevel must be changed from oem-config.target to multi-user.target

    systemctl set-default multi-user.target

    Then install some sane programs as fast as possible.

    apt update

    apt install net-tools

    apt install openssh-server

    rm /var/cache/debconf/* if it dies because of some locking bug.

    The instructions for setting up pose tracking were in 

    https://community.element14.com/challenges-projects/element14-presents/w/documents/27468/episode-536-bonus-video---pose-detection-code

    The movenet_lightning.tflite model ran at 8fps & detected just 1 animal at a time.  posenet.tflite ran at 5fps.  movenet_multipose.tflite crashed.  There was an alternative multipose model on

    https://tfhub.dev/google/lite-model/movenet/multipose/lightning/tflite/float16/1

    This one worked at 3fps.

    The command for changing models was

    python3 pose_estimation.py --model movenet_multipose

    It wasn't close enough for rep counting, but just might be good enough for camera tracking.  Overclocking might get it up to 4fps.  As amazingly efficient as it is in getting the most out of the CPU, it's still too slow for the ultimate goal of photographing 2 people.  It's only good enough for tracking a single lion.  The next step would be fabricating a compact cooling system to overclock it & running face recognition.
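
    For animals wanting to reproduce the 3fps number, a minimal benchmark sketch assuming the tflite_runtime package & the multipose lightning float16 .tflite from the link above.  The file name, input size, & thread count are assumptions.

    import time
    import numpy as np
    from tflite_runtime.interpreter import Interpreter

    interp = Interpreter(model_path="movenet_multipose_lightning.tflite",
                         num_threads=4)
    inp = interp.get_input_details()[0]
    # the multipose model takes a dynamic input size; sides must be multiples of 32
    interp.resize_tensor_input(inp["index"], [1, 256, 320, 3])
    interp.allocate_tensors()
    inp = interp.get_input_details()[0]
    out = interp.get_output_details()[0]

    frame = np.zeros(inp["shape"], dtype=inp["dtype"])   # stand-in camera frame
    start = time.time()
    for i in range(30):
        interp.set_tensor(inp["index"], frame)
        interp.invoke()
        poses = interp.get_tensor(out["index"])          # (1, 6, 56): up to 6 people
    print("fps:", 30 / (time.time() - start))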




  • Commercial tracking options

    lion mclionhead 12/07/2021 at 07:09

    A home made tracker by now is surely a huge waste compared to a DJI robot.  Maybe it wasn't when it began in 2019.  It's hard to believe a junk laptop which finally allowed pose tracking at a reasonable speed didn't arrive until June 2019, so lions have only had 2.5 years of tracking of any serious quality.  3 years of total pose tracking including the low quality tablet camera.  The final tablet source code was lost despite only dating back to June 2019.

    Despite inflation, the cost of webcam quality video with tracking plummeted since 2019 & we have the 1st decent look at the user interface.

    https://www.amazon.com/OBSBOT-AI-Powered-Omni-Directional-90-Degree-Correction/dp/B08NPGNMV8

    Fortunately, the picture quality is still horrible.  There are tracking speed options & a composition option to change how high the head is in the frame.  There's a differentiation between resetting the gimbal & starting tracking.  That's quite difficult on the servocity because it has no center PWM.  Apple introduced a tracking feature called centerstage in 2021 which crops the full phone cam in on a face. 

    These all track the face instead of the full body.  No-one knows how well they track 2 faces.

    The DJI RS2 is the only tracker which uses a DSLR.  Now that more detailed reviews abound, it's more limited than expected.  It tracks a feature rather than a body or a face.  It's not reliable enough to track a deforming feature for 30 minutes, unattended.  It receives servo commands from a phone where the tracking is performed.   The phone receives video from an HDMI transmitter sold separately.

    The only thing making the gimbal relevant is the proprietary wireless protocol between the phone & the motors.  The phone app could use any video source to control any 2 servos, but is locked into using just a DJI transmitter & a DJI gimbal.  It's not clear if the HDMI transmitter requires the gimbal to function.

    The stabilization of the gimbal is worthless if it's always mounted on a tripod, but its stabilization is equivalent to the lion kingdom's hacked Feiyu.  They all rely on camera stabilization to achieve perfect smoothness.  It would be a true waste to buy a DJI RS2 just for tracking, although lions could use the gimbal for photographing goats.

    Tracking a mane as well as a full body is the holy grail.  The confuser would need to run 2 models simultaneously.  

    There was an interesting bug where the stdin & stdout handlers for ffmpeg held the same mutex.  It deadlocked since ffmpeg couldn't empty stdin without writing to stdout.  The easy solution was using a file descriptor instead of a file pointer to access the streams.  That didn't require the mutex.
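
    The actual fix lives in the C code, but the shape of the problem is worth a sketch.  The safe pattern is to service the encoder's stdin & stdout independently so neither side waits on the other; a rough Python illustration with select & non-blocking writes (the ffmpeg arguments & frame sizes are stand-ins):

    import os
    import select
    import subprocess

    # stand-in encoder: raw frames on stdin, compressed stream on stdout
    proc = subprocess.Popen(
        ["ffmpeg", "-f", "rawvideo", "-pix_fmt", "bgr24", "-s:v", "640x360",
         "-r", "10", "-i", "-", "-f", "mjpeg", "-an", "-"],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    in_fd = proc.stdin.fileno()
    out_fd = proc.stdout.fileno()
    os.set_blocking(in_fd, False)               # partial writes instead of blocking
    pending = b"\x00" * (640 * 360 * 3) * 30    # 30 dummy frames to encode
    encoded = b""

    while True:
        want_write = [in_fd] if pending else []
        rlist, wlist, _ = select.select([out_fd], want_write, [])
        if out_fd in rlist:
            chunk = os.read(out_fd, 65536)      # always drain the encoder output
            if not chunk:
                break                           # encoder closed its output
            encoded += chunk
        if in_fd in wlist:
            n = os.write(in_fd, pending)        # feed whatever input fits
            pending = pending[n:]
            if not pending:
                proc.stdin.close()              # EOF lets ffmpeg flush & exit
    proc.wait()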

    To handle gimbal centering, it saves the last manually dialed position as the center.  Going in & out of tracking mode doesn't reset the center & it doesn't rewind the camera to the center.  It's essential to not have anything else change to go between portrait & landscape mode.  The user can press a center button when it's not tracking.

    Some portrait mode tests showed how much junk is required to feed the flash & remote control.  It needs a better place to install the flash & a place to install the phone.  The wireless GUI allowed some mane shots without looking at the laptop.  There might be a case for manual pointing with the GUI, but carrying around the phone & the camera remote would be a big deal.

    The openpose model couldn't track a lion bending over with his mane untied.  It could track the mane tied up, but it ran away, tilting straight up, when the mane was let down.  Thus, limits on PWM were still needed.

    This alignment made it appear to tilt up looking for a head.

    That was about all that could be done without a jetson nano.  The rep counter is still trending towards a dedicated...


  • Tracking robot evolution

    lion mclionhead 11/30/2021 at 07:53

    The phone GUI finally got to the point where it was usable.  It crashed, took forever to get connected, & had a few persistent settings bugs.  It was decided that the size of the phone required full screen video in portrait & landscape mode, so the easiest way to determine tracking mode was by phone orientation.

    Rotating the phone to portrait mode would make the robot track in portrait mode.

    Rotating the phone to landscape mode would make the robot track in landscape mode.

    Then, we have some configuration widgets & pathways to abort the motors.  It was decided to not use corporate widgets so the widgets could overlap video.  Corporate rotation on modern phones is junk, so that too is custom.  All those professional white board programmers can't figure out what to do with the useless notch space in landscape mode.

    The mechanics got some upgrades to speed up the process of switching between portrait & landscape mode.

    The key design was a 2 part knob for a common bolt.

    Some new lens adapters made it a lot more compact, but the lens adapters & bolts would all have to be replaced to attach a fatter lens.

    The original robot required relocating the landscape plate in a complicated process, with a socket wrench.   The landscape plate now can be removed to switch to portrait mode with no tools.

    The quest for lower latency led to using mjpeg inside ffmpeg.  That was the only realtime codec.  The only latency is buffering in video4linux.

    On the server, the lowest latency command was:

    ffmpeg -y -f rawvideo -y -pix_fmt bgr24 -r 10 -s:v 1920x1080 -i - -f mjpeg -pix_fmt yuv420p -bufsize 0 -b:v 5M -flush_packets 1 -an - > /tmp/ffmpeg_fifo

    On the client, the lowest latency command was:

    FFmpegKit.executeAsync("-probesize 32 -vcodec mjpeg -y -i " + stdinPath + " -vcodec rawvideo -f rawvideo -flush_packets 1 -pix_fmt rgb24 " + stdoutPath,

    The reads from /tmp/ffmpeg_fifo & the network socket needed to be as small as possible.  Other than that, the network & the pipes added no latency.  The total bytes on the sender & receiver were identical.  The latency was all in ffmpeg's HEVC decoder.   It always lagged 48 frames regardless of the B frames, the GOP size, or the buffer sizes.  A pure I frame stream still lagged 48 frames.  The HEVC decoder is obviously low effort.  A college age lion would have gone into the decoder to optimize the buffering.

    The trick with JPEG was limiting the frame rate to 10fps & limiting the frame size to 640x360.  That burned 3 megabits for an acceptable stream.  The HEVC version would have also taken some heroic JNI to get the phone above 640x360.  Concessions were made for the lack of a software budget & the limited need for the tracking camera.
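
    Not from the repo, but for anyone rebuilding this pipe: a naive sketch of pulling individual JPEG frames back out of an MJPEG byte stream by scanning for the JPEG start/end markers, with deliberately small reads.  It assumes plain baseline MJPEG without embedded thumbnails.

    SOI, EOI = b"\xff\xd8", b"\xff\xd9"         # JPEG start & end of image markers

    def mjpeg_frames(stream, chunk_size=4096):
        # yields complete JPEG frames from an MJPEG byte stream
        buf = b""
        while True:
            chunk = stream.read(chunk_size)     # small reads keep latency low
            if not chunk:
                return
            buf += chunk
            while True:
                start = buf.find(SOI)
                end = buf.find(EOI, start + 2) if start >= 0 else -1
                if start < 0 or end < 0:
                    break
                yield buf[start:end + 2]        # 1 complete JPEG frame
                buf = buf[end + 2:]

    # usage: for jpg in mjpeg_frames(open("/tmp/ffmpeg_fifo", "rb")): ...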

  • Death of IR remotes & return of wireless GUI's

    lion mclionhead 11/22/2021 at 03:53

    The journey began with a new servo output/IR input board.  Sadly, after dumping a boatload of money on the LCD panel, HDMI cables, & spinning this board, it became quite clear that the last IR remote in the apartment wouldn't work.  It was a voice remote for a Comca$t/Xfinity Flex which would require a lot of hacking to send IR, if not a completely new board.

    The idea returned to using a phone as the user input & if the phone is the user input, it might as well be the GUI, so the plan headed back to a 2nd attempt at a wireless phone GUI.  It didn't go so well years ago with repcounter.  Wifi dropped out constantly.  Even an H264 codec couldn't fit video over the intermittent connection.

    Eliminating the large LCD & IR remote is essential for portable gear.  Gopros somehow made wireless video reliable & there has been some evolution in screen sharing for Android.  Android may slow down programs which aren't getting user input.

    Screen sharing for Android seems to require a full chrome installation on the host & a way to phone home to the goog mothership though.  Their corporate policy is to require a full chrome installation for as little as hello world.

    The best solution for repcounter is still a dedicated LCD panel, camera, & confuser in a single piece assembly.  It doesn't have to be miniaturized & would benefit from an IR remote.  The same jetson could run both programs by detecting what was plugged in.  The user interfaces are different enough for it to not be a horrible replication of effort.

    Anyways, it was rewarding to see yet another home made piece of firmware enumerate as a legit CDC ACM device.  

    The new LCD driver from China had a defective cable.  Pressing it just the right way got it to work.  

    Pressing ahead with the wireless option evolved into sending H.264 frames to a VideoView widget on a phone.  Decoding a live stream in the VideoView requires setting up an RTSP server with ffserver & a video encoder with ffmpeg.  The config file for the server is

    Port 8090
    BindAddress 0.0.0.0
    RTSPPort 7654
    RTSPBindAddress 0.0.0.0
    MaxClients 40
    MaxBandwidth 10000
    NoDaemon
    
    <Feed feed1.ffm>
    File /tmp/feed1.ffm
    FileMaxSize 2000M
    # Only allow connections from localhost to the feed.
    ACL allow 127.0.0.1
    </Feed>
    
    <Stream mystream.sdp>
    Feed feed1.ffm
    Format rtp
    VideoFrameRate 15
    VideoCodec libx264
    VideoSize 1920x1080
    PixelFormat yuv420p
    VideoBitRate 2000
    VideoGopSize 15
    StartSendOnKey
    NoAudio
    AVOptionVideo flags +global_header
    </Stream>
    
    <Stream status.html>
    Format status
    </Stream>
    
    

    The command to start the ffserver is:

    ./ffserver -f /root/countreps/ffserver.conf

    The command to send data from a webcam to the RTSP server is:

    ./ffmpeg -f v4l2 -i /dev/video1 http://localhost:8090/feed1.ffm

    The command to start the VideoView is

    video = binding.videoView;

    video.setVideoURI(Uri.parse("rtsp://10.0.0.20:7654/mystream.sdp"));

    video.start();

    The compilation options for ffserver & ffmpeg are:

     ./configure --enable-libx264 --enable-pthreads --enable-gpl --enable-nonfree

    The only version of ffmpeg tested was 3.3.3.

    This yielded horrendous latency & required a few tries to work.  


    The next step was a software decoder using ffmpeg-kit & a raw socket.  On the server side, the encoder requires a named FIFO sink created by mkfifo("/tmp/mpeg_fifo.mp4", 0777);

    ffmpeg -y -f rawvideo -y -pix_fmt bgr24 -r 30 -s:v 1920x1080 -i - -vf scale=640:360 -f hevc -vb 1000k -an - > /tmp/mpeg_fifo.mp4

    That ingests raw RGB frames & encodes downscaled HEVC frames.  The HEVC frames are written to a socket.

    On the client side, ffmpeg-kit requires creating 2 named pipes.

    String stdinPath = FFmpegKitConfig.registerNewFFmpegPipe(getContext());
    String stdoutPath = FFmpegKitConfig.registerNewFFmpegPipe(getContext());
    


    The named pipes are passed to the ffmpeg-kit invocation.

    FFmpegKit.executeAsync("-probesize...

  • Portable plans

    lion mclionhead 10/17/2021 at 21:24

    The tracking camera isn't useful unless it's portable.  It's been running on an Asus GL502V for years.  It only goes full speed on manes voltage & with ice packs.  NVidia jetson is the only portable hardware specifically mentioned by openpose.  

    The peak openpose configuration seemed to require 8GB of video RAM for a -1x384 netInputSize, although it gave acceptable results for years on 4GB of RAM & a -1x256 netInputSize.  

    Running nvidia-smi with openpose running gave the memory requirement of a -1x256 netInputSize.  

    The Jetson nano has evolved from 2GB to 4GB over the years without any change in model number, which has caused a lot of confusion.  It's based on the Maxwell chip which is what the GTX970M used.  Jetson nano's with 4GB range from $100 to over $200 & are all out of stock.

    https://www.sparkfun.com/products/16271

    https://www.okdo.com/us/p/nvidia-jetson-nano-4gb-development-kit/

    There is another pose estimator for goog coral.

    https://github.com/google-coral/project-posenet

    But a rare comparison shows posenet being nowhere close.

    We don't know if it was processing in realtime or offline with maximum settings, how much memory was allocated to each program.  Goog coral modules are similarly priced to jetson & equally out of stonk.

    The plan envisions a Jetson module & LCD driver strapped to the back of the last LCD panel.  There will be a webcam on a new pan/tilt module for rep counting & the DSLR on the servocity module for animal tracking.  A new board will convert USB output to servo PWM for camera pointing & convert IR remote signals to USB input for user input.  The servo power will come from the USB cable, by hacking the USB ports on the jetson to output 5A.  It'll all be self contained with the LCD panel.  

    The user interface will be an IR remote with key maps shown on the screen.  The only way it could be debugged in the field would be a phone with a USB cable & some kind of ssh terminal.

    The jetson burns 5V 2.5A itself.  The servos burn 5V.  The LCD panel burns 8-12V.  There need to be 2 BEC's to power everything from either a 12V battery or ATX brick.  The 16V batteries envisioned for the truck wouldn't do much good for the LCD.  The mane thing keeping it a plan instead of reality is of course that pesky 2 year disruption in the transient supply chain.

    Obtaneing the required confusing power has been so problematic, the lion kingdom believes the total confusing power of the world is less than it was 2 years ago, due to lack of new confusers being created & old confusers going out of service, but no-one is actually counting.  If we have gone backwards, it would be the 1st time it ever happened since the bronze age collapse.  

    There was an estimate of the world's confusing power from 1950 to 2018.

    https://incoherency.co.uk/blog/stories/world-computing-power.html

    The lion kingdom's obsolete Ryzen 7 2700x is rated at 59000 MIPS or the entire world's confusing power in 1973.

  • Face tracking dreams

    lion mclionhead 04/04/2021 at 02:45

    There has always been a dream of running a face tracker simultaneously with the pose tracker to refine the composition, but getting enough confusing power has been expensive.

    Impressive demo of face tracking with downloadable software components, though no turnkey package is provided.  The goog coral is a way to get portable image recognition without dedicating an entire confuser just to the tracker.  The NVidia jetson requires porting everything to their ARM platform.

    The DJI/Ryze Tello is a remarkable achievement in indoor navigation by itself.  It would be a good platform for anything that required 3D positioning indoors & is what lions originally envisioned as the ultimate copter, something which was compact & could be thrown in the air to get an automatic selfie.  If only lions could think of a need for it now.  It also has a finite lifespan with brushed motors.

    Frustrating to have battled with inferior cameras, sonar, & IMU's 10 years ago, never to get anywhere close to what the Tello can do.  The accelerometer based localization all modern copters use started taking off right when lions moved on.  No-one really knows how modern quad copters work anymore.  They don't use sonar anymore.  They use low cost lidar modules invented in the last 10 years, which are extremely expensive as loose modules.

