I've been working a lot on the software for converting a focus stack into a depth map. Writing in Clojure and testing with focus stack images found online, I managed to produce images representing depth maps of the objects in the photographs. But the method that I'm currently using to calculate the depth of each pixel is to take the index of the layer with the greatest focus as the depth value, and so far that has produced poor results. The depth maps are noisy and inaccurate.
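A minimal sketch of that greatest-focus method, written in Python rather than the Clojure used for the project. The function name `depth_by_argmax` and the nested-list layout `focus_stack[layer][y][x]` are assumptions made for illustration, and the focus measure itself is assumed to have been computed already.

```python
def depth_by_argmax(focus_stack):
    """Return a depth map holding, for each pixel, the index of the sharpest layer.

    focus_stack[k][y][x] is assumed to be a focus measure (higher = sharper)
    for layer k at pixel (y, x). This layout is hypothetical.
    """
    layers = len(focus_stack)
    height = len(focus_stack[0])
    width = len(focus_stack[0][0])
    depth = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Focus values through the stack at this one pixel.
            column = [focus_stack[k][y][x] for k in range(layers)]
            # Index of the largest value; ties resolve to the first layer.
            depth[y][x] = max(range(layers), key=column.__getitem__)
    return depth


# Three layers of a 1x2 image.
stack = [
    [[0.1, 0.9]],
    [[0.8, 0.2]],
    [[0.3, 0.1]],
]
print(depth_by_argmax(stack))
```

Because the result can only ever be a whole layer index, the depth resolution is limited to the layer spacing, and a single noisy focus value can move a pixel's depth by several layers.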
My plan was actually to use a different approach, where a Gaussian curve is fitted to the focus values in each pixel's stack and the depth is taken as the position of the peak of the curve. However, after implementing that algorithm I found it to be far too slow. It would have taken two weeks to process the simple focus stack that I was testing it with, and I expect to need to use it for focus stacks 100x to 1000x thicker. I tried again with an approximation that took the log of the values in each pixel stack and fitted a parabola, since the log of a Gaussian is a parabola. It was much faster but still too slow, and the results were very poor. I believe that the results were poor because of the small number of images in the focus stack I was testing with. But the speed of the algorithm is a much bigger issue.
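The log-parabola approximation can be sketched as follows, again in Python rather than Clojure. This is one possible closed-form version under stated assumptions, not necessarily how my implementation does it: the name `log_parabola_peak`, the `eps` clamp for non-positive values, and the use of layer indices 0, 1, 2, ... as positions are all hypothetical. It does a least-squares fit of ln f = a*k^2 + b*k + c by solving the 3x3 normal equations with Cramer's rule, then returns the vertex -b / (2a).

```python
import math


def log_parabola_peak(values, eps=1e-12):
    """Estimate the peak position of a Gaussian-like focus curve.

    values[k] is the focus measure at layer index k. Returns a fractional
    layer index, or None if there is no downward-opening parabola to use.
    """
    n = len(values)
    if n < 3:
        return None
    # Clamp to eps so the log is defined (an assumption, not from the post).
    ys = [math.log(max(v, eps)) for v in values]

    # Sums needed for the normal equations of a quadratic least-squares fit.
    s0 = float(n)
    s1 = s2 = s3 = s4 = 0.0
    t0 = t1 = t2 = 0.0
    for k, y in enumerate(ys):
        k2 = k * k
        s1 += k
        s2 += k2
        s3 += k2 * k
        s4 += k2 * k2
        t0 += y
        t1 += k * y
        t2 += k2 * y

    # Cramer's rule on [[s4, s3, s2], [s3, s2, s1], [s2, s1, s0]].
    det = (s4 * (s2 * s0 - s1 * s1)
           - s3 * (s3 * s0 - s1 * s2)
           + s2 * (s3 * s1 - s2 * s2))
    if det == 0:
        return None
    det_a = (t2 * (s2 * s0 - s1 * s1)
             - s3 * (t1 * s0 - s1 * t0)
             + s2 * (t1 * s1 - s2 * t0))
    det_b = (s4 * (t1 * s0 - s1 * t0)
             - t2 * (s3 * s0 - s1 * s2)
             + s2 * (s3 * t0 - s2 * t1))
    a = det_a / det
    b = det_b / det
    if a >= 0:
        # Flat or upward-opening parabola: no peak to report.
        return None
    return -b / (2.0 * a)


# Synthetic check: samples of a Gaussian centred between layers 2 and 3.
samples = [math.exp(-((k - 2.3) ** 2) / (2 * 1.5 ** 2)) for k in range(6)]
print(log_parabola_peak(samples))
```

No iterative optimiser is involved, so the cost per pixel is a single pass over the stack. The fit is only exact when the focus values really do follow a Gaussian; with few layers or noisy values, the log can amplify errors in the low-focus tails.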
I wanted to try the parabola fit approximation on higher quality data to see if that would improve the results by giving me focus stacks actually resembling Gaussian curves. So I set up the scanner hardware that I built to do a "dumb" run. The motor controller isn't ready for action yet, but I can still apply power directly and move the z stage. I was planning to run the motor slowly while shooting a video, and then extract the frames from the video to produce a reasonable-quality focus stack for testing. But while I was fiddling with the machine I realized that it has a fatal flaw.
In order for the naive algorithm that I'm implementing to work, each pixel must stay correlated to a specific point on the object being scanned. So the flaw should be obvious: when the camera moves relative to the object, the perspective projection changes the object's apparent size, and the correspondence between each pixel and its point on the object is lost. Either the projection that the camera makes should be orthographic instead of perspective, or the position of the focal plane should be changed in the optics instead of by physically moving the object to be scanned.
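To make the geometry concrete, here is a small Python sketch using an idealised pinhole model. It is an illustration only: the function names, the focal length of 50, and the distances are made-up numbers, not measurements from my scanner.

```python
def perspective_x(X, Z, focal_length):
    """Image coordinate of a point at lateral offset X and distance Z (pinhole model)."""
    return focal_length * X / Z


def orthographic_x(X, Z, scale=1.0):
    """Image coordinate under orthographic projection; the distance Z has no effect."""
    return scale * X


# The same object point, before and after the stage moves it 20 units closer.
before = perspective_x(10.0, 100.0, 50.0)
after = perspective_x(10.0, 80.0, 50.0)
print(before, after)

# Under orthographic projection the point stays on the same pixel.
print(orthographic_x(10.0, 100.0), orthographic_x(10.0, 80.0))
```

Under the pinhole model only a point on the optical axis (X = 0) keeps its image position as Z changes; every other point drifts outward as the object approaches, so a pixel's stack ends up sampling different points on the object.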
I am annoyed that I didn't catch this flaw while working out the design or doing research. But the fix is obvious. I'll switch to using a stepper motor to change the focus of the webcam directly.