After receiving a suggestion to use the autocorrelation of the signal as a means of detecting pitch, I thought I'd try to compare it with my current, FFT-based approach.
I used a simple test to measure the accuracy of the FFT method: simulate some ideal signals (pure sine waves), run the algorithm, and see how close the detected pitches are to the actual frequencies. The program produced the following output:
Freq detected: 219.410355 Hz; note is 0, error is -4.507992 cents
Freq detected: 233.667511 Hz; note is 1, error is 4.361612 cents
Freq detected: 246.473740 Hz; note is 2, error is -3.266289 cents
Freq detected: 261.698944 Hz; note is 3, error is 0.455947 cents
Freq detected: 277.329346 Hz; note is 4, error is 0.932582 cents
Freq detected: 293.225464 Hz; note is 5, error is -2.559881 cents
Freq detected: 311.743774 Hz; note is 6, error is 3.412675 cents
Freq detected: 329.036469 Hz; note is 7, error is -3.115676 cents
Freq detected: 348.626007 Hz; note is 8, error is -2.993082 cents
Freq detected: 370.542023 Hz; note is 9, error is 2.581400 cents
Freq detected: 391.442657 Hz; note is 10, error is -2.460130 cents
Freq detected: 414.754150 Hz; note is 11, error is -2.274323 cents
Freq detected: 440.689392 Hz; note is 12, error is 2.710940 cents
Freq detected: 465.703400 Hz; note is 13, error is -1.694892 cents
Mean error: 2.142857 cents
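Notes 0 ... 13 correspond to A3 ... Bb4. The maximum error, in this ideal case, is roughly 4.5 cents, which is not TOO bad given the limited number of FFT bins. One caveat on the printed mean: averaging the absolute values of the 14 errors listed gives about 2.67 cents, and 2.142857 is exactly 30/14, which is what you get if each error is truncated to an integer before summing. So the mean line probably understates things a little.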
For the autocorrelation approach, I don't have hard figures, as I have yet to get a reliable algorithm working in C. What I have done is everything up to generating the autocorrelation coefficients for different time delays. I then output that data to a text file, loaded it into MATLAB, and semi-manually found the fundamental frequency that way.
I wasn't pleased with the results: the estimate was almost never within 1 Hz of the true pitch, and was frequently off by 3-4 Hz. And this was after tweaking the sampling rate and sample window length to whatever values I wanted. Over the A3 to Bb4 range used above, 3-4 Hz works out to anywhere from about 11 cents (3 Hz at Bb4) to about 31 cents (4 Hz at A3), so I don't see this as being usable, at least not in its current state.
For reference, I used the algorithm described on this site. If anybody has a better version or other suggestions, post away in the comments section!