
DDR3 woes

A project log for Coven: mini-ITX cluster computer

Open source mini-ITX cluster computer

kmod, 08/02/2014 at 11:31

As I mentioned in the last log, progress is currently stopped because the DRAM doesn't work. There's a heckuva lot that goes into getting that to work, and consequently a heckuva lot that could be going wrong:

1) the soldering could be bad (they're 0.8mm-pitch BGAs)

2) the board layout could be bad

3) the DRAM IC could be bad / incompatible with the processor

4) I could be trying to use the DRAM controller in the wrong way

5) There could be an issue in the rest of the board

etc.

So I bought an Olimex A13 board, which should let me bisect the issue: I know its layout is good, since it runs. I got my modified U-Boot onto the board and it seems to run perfectly, so #2, #4, and #5 should be non-issues for this particular test board. Then I tried replacing the DRAM IC with one of the ones that I'm trying to use (a 1Gbit Alliance Memory part).

And... I haven't been able to get it to run again.  I guess that's somewhat good news: it means that there's an issue with either #1 or #3, which seem easier to debug than, for example, #2.  I'm not sure how to debug #3, except to put the same exact IC that Olimex used and see if I can reflow that and get it to work.  Hynix memory is surprisingly difficult to obtain in the USA (I think they must have some sort of export restrictions because they explicitly won't sell it to you if you're in the US), so I bought some from aliexpress.com, which should hopefully come in the nearish future.
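The elimination so far can be written down as a quick sketch. This is purely illustrative Python: the `CAUSES` table and the `remaining` helper are names made up for this log, not part of any tool I'm using, and the "ruled out" set just encodes my assumption that a stock board running my U-Boot clears #2, #4, and #5.

```python
# Candidate root causes, numbered as in the list above.
CAUSES = {
    1: "bad soldering (0.8mm-pitch BGA)",
    2: "bad board layout",
    3: "DRAM IC bad or incompatible with the processor",
    4: "DRAM controller used the wrong way",
    5: "issue elsewhere on the board",
}


def remaining(causes, ruled_out):
    """Return the causes that have not been ruled out, keyed by number."""
    return {n: desc for n, desc in causes.items() if n not in ruled_out}


# Assumption: the unmodified Olimex A13 board ran my modified U-Boot,
# so on that board layout (#2), controller usage (#4), and the rest of
# the board (#5) are taken to be fine.
ruled_out_by_olimex = {2, 4, 5}

suspects = remaining(CAUSES, ruled_out_by_olimex)
for n, desc in sorted(suspects.items()):
    print(f"#{n}: {desc}")
```

Swapping in the same Hynix part that Olimex used would be the next split: if that works after my reflow, #1 drops out and #3 is left; if it doesn't, #1 stays the prime suspect.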

In the meantime, I'm currently going forward with the theory that my reflow is bad. I'm using a hot air gun, and mechanically things seem to be going pretty well, but electrically not so much. When I remove the ICs after attempting to attach them, it looks like the balls didn't really reflow that much. I also notice that DDR3 ICs have an extra protrusion on the bottom of the package that provides a minimum height clearance from the PCB. I'm not quite sure what the reasoning is behind this, but I've seen it from the three different manufacturers I've bought from, so it's probably for a good reason. Anyway, my current theory is that I need to be using solder paste (I've been trying to just use flux, which has worked for me with my other BGAs) to accommodate the taller seating height -- the balls might only barely be making contact with the pads.

I tried this once, and didn't have any greater success, but I'm not sure that the test was conclusive. The solder paste application wasn't very good, I'm not sure about the alignment, and to be cheap I tried re-using an IC that I had previously removed, which I thought would be ok since the balls weren't deformed at all (in itself a bad sign). I'm going to give it another go tomorrow (getting the solder paste in there is actually quite difficult and time consuming, since the rest of the board is already assembled and gets in the way), and if that doesn't work... I don't know. I have another project ( http://hackaday.io/project/2204-Fray-Trace ) that has DDR3 memory attached to an FPGA, which might provide a better test bed, so I might switch to debugging on that. Though that might end up being complicated by the fact that it's not a proven layout or assembly.

So in conclusion, when a problem has multiple possible root causes, and it's not possible to directly measure any of them (an x-ray machine would be nice right about now), debugging is agonizingly difficult. I'm still going forward with the theory that it's an issue with my BGA assembly process, since I've narrowed it down to that or a DRAM IC incompatibility. Hopefully those Hynix ICs arrive soon so that I can test those, and narrow it down further -- and then actually solve the problem.
