12VHPWR WatchDog to save NVidia GPUs

Instead of whining about the power of my GPU and NVidia's design issues, I decided to implement a thermal watchdog on the connectors instead.

This is a very quick one: an Arduino with a few 4.7k resistors and 100k thermistors.
On the Windows side, a Python script running with administrator rights checks the temperature of the three thermistors and logs it. Once at least one of them passes 100 degrees Celsius, the PC automatically shuts down, regardless of data loss. Way better than destroying $2000 or more in hardware.
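
A minimal sketch of that shutdown decision, for illustration only. The function names and the strict "greater than" comparison are assumptions and not taken from the project's script; `shutdown /s /f /t 0` is the standard Windows command for an immediate forced shutdown.

```python
import subprocess

SHUTDOWN_LIMIT_C = 100.0  # shutdown threshold from the project description


def should_shut_down(temps_c, limit_c=SHUTDOWN_LIMIT_C):
    """Return True as soon as at least one sensor exceeds the limit."""
    return any(t > limit_c for t in temps_c)


def force_shutdown():
    """Power off immediately; open applications are closed without saving."""
    # /s = shut down, /f = force applications to close, /t 0 = no delay
    subprocess.run(["shutdown", "/s", "/f", "/t", "0"], check=True)


def check_and_act(temps_c, shutdown=force_shutdown):
    """Run the shutdown action if any reading is over the limit."""
    if should_shut_down(temps_c):
        shutdown()
        return True
    return False
```

Passing the shutdown action in as a parameter makes it possible to dry-run the logic with a harmless stand-in before trusting it with a real power-off.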

The project is ongoing. 

Known issues:

  • Needs extensive testing!
  • The temperature shown in the tray icon is currently the maximum recorded during the active session, not the current maximum across all eight sensors
  • Currently, only A0 to A3 are read and transmitted. The new firmware is ready for testing but not on GitHub yet

As far as schematics go, it could not be simpler:

I used an Arduino Nano. A0-A7 are read by the firmware. A 4.7k resistor goes from each pin to +5V.

The thermistor is connected directly between the pin and GND, so it forms a voltage divider with the resistor. That's it. I use the Steinhart-Hart equation to calculate the temperature and dump it onto the serial port.
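
The conversion can be sketched as follows, written in Python for readability even though the real calculation runs in the Arduino firmware. This uses the Beta form, a common simplification of the Steinhart-Hart equation, not necessarily what the firmware does. The Beta value of 3950 and the 25 °C nominal temperature are assumptions (check the thermistor's datasheet); the 4.7k pull-up, the 100k thermistor, and the Nano's 10-bit ADC follow from the description above.

```python
import math

R_PULLUP = 4700.0       # 4.7k resistor from the pin to +5V
R_NOMINAL = 100_000.0   # 100k thermistor, assumed to be rated at 25 °C
T_NOMINAL_K = 298.15    # 25 °C in kelvin (assumption)
BETA = 3950.0           # ASSUMPTION: typical Beta value, check the datasheet
ADC_MAX = 1023          # full-scale reading of the Nano's 10-bit ADC


def adc_to_resistance(adc):
    """Thermistor resistance from a raw reading (pull-up to +5V, thermistor to GND)."""
    if not 0 < adc < ADC_MAX:
        raise ValueError("reading at a rail: thermistor shorted or disconnected")
    # V_pin / V_cc = R_t / (R_t + R_pullup)  =>  R_t = R_pullup * adc / (ADC_MAX - adc)
    return R_PULLUP * adc / (ADC_MAX - adc)


def resistance_to_celsius(resistance):
    """Beta equation: 1/T = 1/T0 + ln(R/R0) / B, with T in kelvin."""
    inv_t = 1.0 / T_NOMINAL_K + math.log(resistance / R_NOMINAL) / BETA
    return 1.0 / inv_t - 273.15


def adc_to_celsius(adc):
    """Convenience wrapper: raw ADC reading to degrees Celsius."""
    return resistance_to_celsius(adc_to_resistance(adc))
```

A reading stuck at 0 or 1023 means the divider is shorted or open, so the sketch raises an error there instead of returning a nonsensical temperature.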

  • JLCPCB FTW

    Timo Birnschein, 03/13/2025 at 14:35, 0 comments

    I had a super simple board made by JLCPCB and it came out great. Hard to believe that they ship these for $3.50 in quantities of five.

  • Rewritten Code Base

    Timo Birnschein, 03/13/2025 at 14:32, 0 comments

    Rebuilt the code. It now supports 8 channels, of which I am currently using three on the 5090 side and three on the PSU side. The PSU side is notably cooler than the card, but since the temperatures are all even, it's good.

    I also added multiple stages of alarms that start talking at 80C, get really annoying at 90C, and shut down the PC at 100C.
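
    The staged thresholds could be expressed roughly like this. The stage names and the "greater than or equal" comparison are hypothetical; only the 80C, 90C, and 100C values come from the log entry.

```python
# Thresholds in °C from the log entry, checked from hottest to coolest.
# The stage names are hypothetical labels for illustration.
STAGES = [
    (100.0, "shutdown"),  # shut down the PC
    (90.0, "alarm"),      # the really annoying stage
    (80.0, "warning"),    # spoken warnings start
]


def alarm_stage(temps_c):
    """Return the alarm stage for the hottest of the given readings."""
    hottest = max(temps_c)
    for threshold, stage in STAGES:
        if hottest >= threshold:
            return stage
    return "ok"
```

    Keeping the thresholds in one sorted table means a new stage can be added without touching the lookup logic.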

    The app also no longer crashes or stalls, and adding it to the Task Scheduler in Windows 11 is pretty straightforward. I like to see the temperatures pop up in the tray icon.

    I also got a super simple board made by JLCPCB, which came out great. $3.50 shipped for five boards is just unbeatable.

  • Dynamic Tray Icon, Error Handling, Spoken Warnings

    Timo Birnschein, 02/28/2025 at 15:34, 0 comments

    I added some more functionality to the Python code.

    Change Log 2025/02/28:

    • The tray icon is now dynamic: it shows the highest temperature currently measured and changes its background color as the temperature rises above 65C, 80C, and beyond.
    • The script now also automatically reopens the last used port if it is still available and starts streaming the data immediately. No need to open the port manually each time.
    • A watchdog checks that the connection is stable; if it is not, it shows a message box and gives a spoken warning to check the connections before continuing.
    • Better overall error handling, plus checks for malformed packets so that the graphing code no longer stalls and the script is more stable overall.
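
    A check for malformed packets might look something like the sketch below. The wire format is an assumption: one already-decoded text line per packet, holding one comma-separated temperature per channel. The project's actual serial format is not documented here, and the default channel count should be adjusted to match the firmware in use.

```python
import math

NUM_CHANNELS = 8  # ASSUMPTION: adjust to the number of channels the firmware sends


def parse_packet(line, num_channels=NUM_CHANNELS):
    """Return a list of temperatures, or None if the packet is malformed."""
    fields = line.strip().split(",")
    if len(fields) != num_channels:
        return None  # truncated or merged line
    try:
        temps = [float(field) for field in fields]
    except ValueError:
        return None  # non-numeric garbage in at least one field
    if not all(math.isfinite(t) for t in temps):
        return None  # reject nan/inf so they never reach the graph
    return temps
```

    Returning `None` instead of raising lets the read loop skip a bad line and carry on with the next one.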


Discussions

Stefan Misch wrote 03/03/2025 at 18:45

There must be a way to connect the Arduino directly to the power switch pins of the front panel connector. Then you can shut down the machine directly without any software running on the machine.

Or, if you find the right relay, make it control the power going into the PC directly. :D

Timo Birnschein wrote 03/03/2025 at 20:53

Yes, that is indeed extremely easy and was also my first choice! But I tried it in software first, and so far it works fine.
If I can, I'd prefer a proper shutdown. But you do bring up a great point. I could check on the Arduino side whether the PC actually shut down and, if not, hold the power button for 5 seconds to force a BIOS power-off. That should do it.