The decision was made to switch to a face tracker instead of manually pointing the pole cam. Manual pointing would still override the face tracker if the remote control was powered on, but manual pointing was expected to disappear. The lion kingdom found itself running more often without powering the servo, since carrying around the 2nd paw controller was a hassle. The servo was manely used for pointing the camera at the lion instead of viewing anything else. The most desirable shots were tracking the lion during a turn, but this wasn't possible with manual pointing.
The ideal tracker would run a pose tracker & face tracker. This can run reasonably well on a 4GB GPU. The face tracker would pick out the lion & the pose tracker would track any visible body part when the face wasn't visible. Unfortunately, the embedded GPUs of the past were all discontinued so the lion kingdom settled on just the opencv face tracker on a raspberry pi 4B.
Opencv face tracking is split into face detection & face recognition. Face recognition goes at 3fps while face detection goes at 7fps, so just the largest face is tracked. There is an optical flow step which keeps tracking the same face for a short time if a larger face appears. Optical flow works better at higher frame rates, but the best the pi can do is 7.8fps. Further optimization for slower modern confusers involved scaling down the video to 640x360.
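The "track the largest face, but stay on the same face for a short time if a larger one appears" rule can be sketched in a few lines. This is only a rough illustration & not the lion kingdom's actual code: it swaps the optical flow step for a simple box overlap (IoU) test, & the names `LargestFaceTracker`, `hold_frames` & `min_iou` are made up for the example. Boxes are assumed to be `(x, y, w, h)` tuples in the scaled-down 640x360 frame.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# (x, y, w, h): the rectangle layout OpenCV's cascade detector returns
Box = Tuple[int, int, int, int]


def area(box: Box) -> int:
    return box[2] * box[3]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; 0.0 when they don't overlap."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = min(ax + aw, bx + bw) - max(ax, bx)
    ih = min(ay + ah, by + bh) - max(ay, by)
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    return inter / (area(a) + area(b) - inter)


@dataclass
class LargestFaceTracker:
    hold_frames: int = 5   # hypothetical: frames to stay on the current face
    min_iou: float = 0.3   # hypothetical: overlap needed to call it the same face
    current: Optional[Box] = None
    challenger_count: int = 0

    def update(self, faces: List[Box]) -> Optional[Box]:
        """Feed one frame of detections, get back the box to point at."""
        if not faces:
            # Nothing detected: drop the track (this is the single frame
            # dropout case that a real optical flow step would paper over).
            self.current = None
            self.challenger_count = 0
            return None

        largest = max(faces, key=area)
        if self.current is None:
            self.current = largest
            self.challenger_count = 0
            return self.current

        # Stand-in for optical flow: the detection overlapping the old box most.
        same = max(faces, key=lambda f: iou(f, self.current))
        if iou(same, self.current) < self.min_iou:
            # The old face is gone, so fall back to the largest face.
            self.current = largest
            self.challenger_count = 0
        elif same == largest:
            self.current = same
            self.challenger_count = 0
        else:
            # A larger face showed up; only switch after it persists.
            self.challenger_count += 1
            if self.challenger_count > self.hold_frames:
                self.current = largest
                self.challenger_count = 0
            else:
                self.current = same
        return self.current
```

With `hold_frames=2`, a larger face has to persist for 3 consecutive frames before the tracker jumps to it. At 7fps that's under half a second, so the hold time would have to be tuned against how long bystanders stay in frame.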
The gopro 7 delays its HDMI feed by 2 seconds, so a webcam is dedicated to face tracking. It's lighter than having an HDMI cable go to the gopro, but it's very cockeyed. Some experimentation with the webcam height & angle is required.
An enclosure is required to keep other junk from smashing the face tracker & motor driver.
The mane problem with this is the evil USB connector glitching.
The raspberry pi runs an access point & the phone runs an app to control it. There's a text file with low level configuration parameters.
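The format of the text file isn't described here, so the following is only a guess at how such a file might be read: 1 `KEY VALUE` pair per line with `#` comments. The keys `SCAN_W`, `SCAN_H` & `DEADBAND`, their defaults & the function `parse_settings` are all hypothetical & not taken from the real program.

```python
from typing import Dict

# Hypothetical defaults; none of these names come from the actual project.
DEFAULTS: Dict[str, float] = {
    "SCAN_W": 640.0,    # width of the frame handed to the face detector
    "SCAN_H": 360.0,    # height of the frame handed to the face detector
    "DEADBAND": 5.0,    # made-up pixel error below which the servo holds still
}


def parse_settings(text: str) -> Dict[str, float]:
    """Parse 'KEY VALUE' lines, ignoring comments & malformed lines."""
    settings = dict(DEFAULTS)
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # strip trailing comments
        if not line:
            continue
        parts = line.split()
        if len(parts) != 2:
            continue                          # not a KEY VALUE pair
        key, value = parts
        try:
            settings[key] = float(value)
        except ValueError:
            continue                          # value wasn't a number
    return settings
```

Falling back to defaults on a bad line means a typo made while editing in the field leaves the tracker running instead of crashing it.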
The cockeyed webcam & low resolution make face tracking pretty bad in the corners. The face detector is not rotation invariant & there's not enough resolution to do an equirectangular projection. The bottom center would have to get smaller or the top corners would have to be cropped instead of cockeyed. Frame rate could be sacrificed to get more resolution. There's also the matter of how much a lion should invest in replicating what surely some Chinese robot can do.
Practical tests could mostly track in ideal lighting & even in some hard backlit lighting. Backlit lighting was less robust. It was so common for face detection to drop out for a single frame that optical flow was manely useless.
The mane problems were indian burial grounds causing spurious face detection.
Cases where the lion wasn't detected while another face was, on top of indian burial ground interference. Face recognition would be a big hit, if there were embedded GPUs.
Cases where it should have detected but missed abounded. These might be from rotation.
Less ideal lighting, but still rotated. The L & R positions were worthless because the amount of rotation caused by a face on the left & right made it always drop.
Power consumption was 330mAh/mile without any speaker or headlights.
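The mAh/mile figure turns into a range estimate by dividing battery capacity by it. A minimal sketch follows. The 330 comes from the measurement above, but the battery capacity in the usage line & the `usable_fraction` derating are placeholders, not measured values.

```python
def range_miles(capacity_mah: float,
                mah_per_mile: float = 330.0,
                usable_fraction: float = 1.0) -> float:
    """Estimated range in miles for a given battery capacity.

    mah_per_mile defaults to the measured 330mAh/mile without speaker or
    headlights. usable_fraction is a hypothetical derating for however much
    of the rated capacity the battery really delivers.
    """
    if mah_per_mile <= 0:
        raise ValueError("mah_per_mile must be positive")
    return capacity_mah * usable_fraction / mah_per_mile


# Hypothetical 5000mAh battery, derated to 80% usable capacity.
print(f"{range_miles(5000, usable_fraction=0.8):.1f} miles")
```

A speaker or headlights would raise the mAh/mile figure & shorten the estimate accordingly.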
There are alternatives to the opencv face tracker. While embedded GPUs are no more, Intel is producing a USB stick.
YOLO can detect an entire body at 5fps with the USB stick.
Face recognition with the USB stick goes at 6fps.
Changing vision algorithms is like writing an mp3 player in 1995. It's entirely locked into 1 piece of hardware & there's no abstraction.