There is a success story of how one technology overcame similar limitations. Half a century ago, computers and robots lived in the basements of big corporations. Both technologies occupied large rooms, cost as much as a house, consumed a lot of electricity and could be operated only by experts. Downscaling has made chips faster, more efficient and more affordable. Now battery-powered, easy-to-use smart devices are everywhere. The square-cube law predicts similar benefits for robotic arms: an arm scaled to half size weighs about 8 times less, puts roughly 16 times less gravity load on its joints and has even lower rotational inertia. This leads to smaller motors, lower power usage, lower cost and increased safety.
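The scaling argument can be checked with a few lines of arithmetic. The sketch below assumes the arm is scaled uniformly in every dimension and keeps the same material density, which is a simplification; the function name is made up for illustration.

```python
def scaling_factors(scale: float) -> dict:
    """How key quantities change when every dimension is multiplied by `scale`.

    Assumes uniform scaling and constant material density.
    """
    return {
        "mass": scale ** 3,            # volume, and therefore mass
        "gravity_torque": scale ** 4,  # mass times lever arm
        "inertia": scale ** 5,         # mass times lever arm squared
    }


half = scaling_factors(0.5)
for name, factor in half.items():
    print(f"{name}: {1 / factor:.0f} times smaller")
```

Under these assumptions the factor of 16 is the drop in the gravity torque that the motors have to hold, while rotational inertia falls faster still.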

Benefits of downscaling a harvesting robot:

Unit cost - Small robot hands earn their hardware cost back in days.

Transportability and mobility - A collapsible robot frame fits through doors and onto a passenger car, and is easily deployed in the field. Small robotic hands do not need a massive frame for stabilisation.

Easy to produce - We can produce 100 robotic arms per week without running into component shortages.

Low power - The robot can be powered by solar panels.

Safety - Small robot arms are unable to hurt people. A 20 kg robot driving at 1 cm/s does not pose a threat.

As a drawback, small arms cannot reach the tops of apple trees.
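The safety figure for the driving platform can be put into numbers. This is a rough sketch using only the 20 kg and 1 cm/s values from the list above; it covers the driving motion only and says nothing about pinching or arm movements. The function names are illustrative.

```python
G = 9.81  # gravitational acceleration in m/s^2


def kinetic_energy_joules(mass_kg: float, speed_m_s: float) -> float:
    """Kinetic energy of a body moving in a straight line."""
    return 0.5 * mass_kg * speed_m_s ** 2


def equivalent_drop_height_m(speed_m_s: float) -> float:
    """Height a body would have to fall from to reach the given speed."""
    return speed_m_s ** 2 / (2 * G)


energy = kinetic_energy_joules(20.0, 0.01)
height = equivalent_drop_height_m(0.01)
print(f"kinetic energy: {energy * 1000:.1f} mJ")
print(f"equivalent drop height: {height * 1e6:.1f} micrometres")
```

The platform carries about a millijoule of kinetic energy, the same as if it had been dropped from a few micrometres.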

Most CNC machines and welding robots are designed around stiffness and precision. These technologies have issues with gentle movements and are not cost-effective enough for agriculture. 3D printers are designed to handle tens of millions of light-load movements, and they are a bargain compared to the alternatives. Belts and other moving parts have to be covered so that they do not catch leaves. Stepper motors also consume a considerable amount of power.

I started with standard servos to keep the part count to a minimum, but they offered no speed or force control. This eventually led to robotic servos.

Electric motors do not like constant load, so fighting gravity should be avoided as much as possible.

There are 3 gripping families: picking multiple fruits, picking a single fruit and grabbing by the stem. Stem picking is the slowest, but gives the longest shelf life. A multi-fruit picker is the fastest, but selective picking from clusters is problematic.

Colour detection worked for detecting single strawberries in good light conditions. Shadows and overexposure caused problems. Clusters of strawberries are also a challenge: it is hard to tell where one strawberry ends and another starts.
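A minimal sketch of red colour detection shows why lighting matters. It is not the code used on the robot: the thresholds, the function names and the four sample pixels are assumptions chosen for illustration, and a real pipeline would run on whole camera frames.

```python
import colorsys


def is_red(r, g, b, min_saturation=0.5, min_value=0.3, hue_tolerance=0.05):
    """Classify one RGB pixel (0-255 per channel) as strawberry red."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    # Red sits at both ends of the hue circle, so check both sides.
    near_red = h <= hue_tolerance or h >= 1 - hue_tolerance
    return near_red and s >= min_saturation and v >= min_value


def red_mask(image):
    """Return a True/False mask with the same shape as the image."""
    return [[is_red(*pixel) for pixel in row] for row in image]


# Hypothetical sample pixels, not measurements from the field.
image = [
    [(200, 30, 40), (40, 160, 50)],   # sunlit strawberry, green leaf
    [(60, 10, 12), (250, 235, 235)],  # strawberry in shadow, overexposed strawberry
]
mask = red_mask(image)
print(mask)
```

With these thresholds only the sunlit strawberry passes. The shaded pixel is too dark and the overexposed one is too washed out, even though both have a red hue.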

A neural network understands that not every red object is a strawberry. It also works well with clusters, distinguishing individual strawberries and their parts. This comes in handy when picking by the stem. As a drawback, neural networks are 100 times slower than red colour detection. Running the neural network on a GPU makes it as fast as red object detection.

Around 1000 pictures are needed for training the neural network model. Good model performance comes from good quality pictures. Most of the pictures should be of smartphone quality and taken in the actual field. The rest should cover the variation in light level, motion blur and other aspects of picture quality. They should also cover variation in the background: the gripper and other visible parts of the robot, or a human hand.

Good quality pictures with good coverage of the different variations perform better than augmentations. Some diseases of the fruit are rare: they occur once every ten years and it's hard to get more pictures. In that case augmentations make sense.
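For the rare cases where augmentation is worth it, the two variations mentioned earlier, light level and motion blur, are simple to synthesise. The sketch below works on a tiny grayscale image stored as nested lists so that it runs without extra libraries; the function names and the sample image are illustrative, and a real training pipeline would more likely use an image library.

```python
def adjust_brightness(image, factor):
    """Scale every pixel by `factor`, clipping at the 8-bit maximum."""
    return [[min(255, round(p * factor)) for p in row] for row in image]


def horizontal_motion_blur(image, length):
    """Average each pixel with up to `length - 1` pixels to its left."""
    blurred = []
    for row in image:
        new_row = []
        for x in range(len(row)):
            window = row[max(0, x - length + 1): x + 1]
            new_row.append(round(sum(window) / len(window)))
        blurred.append(new_row)
    return blurred


# One row with a bright object in the middle (made-up values).
image = [[0, 0, 200, 200, 0, 0]]
darker = adjust_brightness(image, 0.5)
smeared = horizontal_motion_blur(image, 3)
print(darker)
print(smeared)
```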

How good a camera does the robot need for picking? Small frame sizes (640x480) use little energy and little computational power and are good enough for ripeness and quality detection.
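The saving from a small frame size is easy to estimate. The sketch below compares the raw data rate of 640x480 with Full HD; the 30 frames per second, the 3 bytes per pixel and the Full HD comparison are assumptions for illustration, and compressed camera streams will be smaller.

```python
def raw_bytes_per_second(width, height, fps, bytes_per_pixel=3):
    """Uncompressed data rate of a video stream."""
    return width * height * bytes_per_pixel * fps


vga = raw_bytes_per_second(640, 480, fps=30)
full_hd = raw_bytes_per_second(1920, 1080, fps=30)
print(f"640x480:   {vga / 1e6:.1f} MB/s")
print(f"1920x1080: {full_hd / 1e6:.1f} MB/s")
print(f"ratio:     {full_hd / vga:.2f}")
```

Every pixel has to be moved and processed, so the Full HD stream costs close to seven times as much before any detection has run.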

Automatic focusing causes issues. Higher leaves are closer to the camera and steal the focus from the strawberries on the ground. Fixed focus or manually adjustable cameras work better. Many webcams show a white picture if used in direct sunlight. Moving between shade and direct sunlight can also make the robot lose a few seconds while the camera auto-adjusts its white balance. A skirt around the robot limits blinding daylight and keeps out rain.

The USB webcams and inspection cameras I tested have considerable delays. This causes problems when the robot arm is moving fast: by the time the camera picture shows the arm in the right place, the arm has already overshot the position. This also leads to jerky movements when the arm is closing in, because it is constantly overcorrecting.
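The size of the overshoot is simply arm speed multiplied by camera delay. The sketch below uses an assumed delay of 150 ms and two assumed arm speeds, since no measured values are given here. It also shows one possible workaround, predicting where the arm is now from where the camera last saw it; the function names are made up for illustration.

```python
def overshoot_m(arm_speed_m_s: float, camera_delay_s: float) -> float:
    """Distance the arm travels while the camera picture is in transit."""
    return arm_speed_m_s * camera_delay_s


def predicted_position_m(seen_position_m, arm_speed_m_s, camera_delay_s):
    """Estimate the current position from a delayed camera observation."""
    return seen_position_m + arm_speed_m_s * camera_delay_s


CAMERA_DELAY_S = 0.15  # assumed, measure it for the actual camera

for speed in (0.05, 0.5):  # assumed slow and fast arm speeds in m/s
    error_mm = overshoot_m(speed, CAMERA_DELAY_S) * 1000
    print(f"{speed} m/s -> {error_mm:.1f} mm behind the real position")
```

With these numbers the picture lags the arm by about 7.5 mm at the slow speed and about 75 mm at the fast one, which is why slowing down near the target reduces the overcorrection.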

A strawberry half a meter away looks sharp from a slow-moving platform. A strawberry an inch away gets foggy because of motion blur. More light lets the camera use shorter shutter times, which reduces the fogginess. Another workaround is to use a global shutter camera.
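The distance effect follows from a simple pinhole camera model: for motion parallel to the image plane, blur in pixels is focal length times speed times exposure time, divided by distance. The sketch below is an estimate under that model; the 600 pixel focal length, the 5 cm/s platform speed and the exposure times are assumptions, not measurements.

```python
def blur_pixels(speed_m_s, exposure_s, distance_m, focal_length_px=600.0):
    """Motion blur length in pixels under a pinhole camera model."""
    return focal_length_px * speed_m_s * exposure_s / distance_m


SPEED_M_S = 0.05  # assumed platform speed
INCH_M = 0.0254

far = blur_pixels(SPEED_M_S, 1 / 30, 0.5)
near = blur_pixels(SPEED_M_S, 1 / 30, INCH_M)
near_short_exposure = blur_pixels(SPEED_M_S, 1 / 500, INCH_M)

print(f"half a meter away, 1/30 s: {far:.1f} px")
print(f"an inch away, 1/30 s:      {near:.1f} px")
print(f"an inch away, 1/500 s:     {near_short_exposure:.1f} px")
```

Under these assumptions the blur grows from about 2 pixels to almost 40 when the strawberry is an inch away, and a much shorter exposure brings it back to a couple of pixels.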

Here is a strawberry picking robot video from the summer of 2022: