Thursday, February 10, 2011

First entry: Dealing with a dead computer

Dead Computer
So after a blackout last week I found my 18-month-old HP Pavilion m9650f in a comatose state. The computer was stuck in an infinite cycle of 4 seconds On - 4 seconds Off. All fans (power supply, CPU, graphics card), drives and LEDs seemed to work, but inexplicably the computer would turn itself off after 4 seconds only to restart 4 seconds later. No video signal. No beeping sounds.

Troubleshooting
So I started the typical procedure: open the computer case and fumble with the graphics card, hard drives and any and all components. I cleared the BIOS CMOS memory and replaced the depleted CMOS battery (which was supposed to last 7 years). Nothing.

Is it the BIOS?
Only when removing the memory DIMMs did I notice a change. With fewer DIMMs the cycle was faster, and with no DIMMs at all I could hear some beeping. That told me the BIOS initial test was running, and that the BIOS was either purposely turning the computer off or was defective.

So I removed a jumper (yellow arrow) on the board marked rom_recovery, which sits on an SPI programming header for the BIOS EEPROM memory. Partial success! This stopped the On/Off cycle. But nothing else happened.
So I suspected the BIOS was corrupted. I could download the BIOS firmware from the HP website, but programming it on a non-bootable computer would require some specialized equipment that I don't have. Researching the subject I found anything from expensive commercial equipment to DIY projects. The prospect seemed too daunting for me and I wasn't even sure it was the BIOS.

Or the capacitor?
During all that fumbling I had noticed that a capacitor (red arrow) had a little bulge, but with my limited knowledge of electronics I didn't think it was important. Out of frustration I finally googled the subject, and it turns out bulging capacitors have been a well-known cause of motherboard failure for years. My "Truckee" motherboard in particular (I had version 1.01, bad sign!) is well known to have all kinds of issues, although mine had been working flawlessly for 18 months.

Calling support
Replacing a capacitor on a motherboard is completely out of my depth, so I finally decided to call HP support and have it repaired. After verifying that my one-year warranty had expired and that I hadn't purchased the extended warranty (which I never, ever do), the support rep wanted me to buy a one-incident phone support contract for $50 or a one-year phone support contract for $100. At this point I was sure there was nothing that could solve this problem over the phone, so I declined.

I asked if they had a service center in my city where I could drop off the computer and pick it up later, but they don't offer that possibility. I was quoted an estimate of between $250 and $350, and they were going to send a box to ship the computer via UPS. So I agreed, but getting this started over the phone took an unusually long time, most of which I spent waiting on hold and occasionally giving some information. In the final steps they transferred me to a supervisor, but by then I was already late for a meeting, so I told him I would call again later. On the second call I had to go over the whole process again with another rep. This time I made sure to ask them to replace the motherboard, not just the capacitor, but then the call was dropped. At this point I realized that if all that was required was to replace the motherboard, maybe I could do it myself.

Replacing the motherboard
I had never done it before, but I've always been tempted to build my own computer. After all, how hard could it be? By then I had practically disassembled my PC and I had all the components already; the only thing I needed was a new motherboard. After some research I chose the X58M from MSI, which had the closest specs to my old board (I almost went for the ASUS Sabertooth until I realized that my old board was in the MicroATX format).

Four days later (including a weekend) I got the new board and went to Fry's to get some thermal paste for the CPU and possibly the Northbridge heatsink (several reviewers complained that the Northbridge runs hot, and some just reapplied thermal paste to correct it). I dutifully disconnected and labeled every cable from the old board. I was surprised how easy it was to move the Intel Core i7 920 CPU from the old board to the new one. From a previous unpleasant experience with an old Pentium 4 I was expecting this to be very difficult, having to align countless pins with their holes. It turns out there are no pins on the CPU, just contact pads that are held in place with a pressure latch. Nice.

Heatsink snag
Next I had to install the old heatsink and cooler fan assembly on top of the CPU. Here I hit my first snag. Although the holes to mount the heatsink were in the same place, the new motherboard expected a heatsink with hooks that clip into the holes, while the heatsink I had was screwed to the CPU socket assembly. My solution? Take the socket assembly from the old motherboard and move it to the new one. Easy (or so I thought).

After this I screwed the new motherboard into the case and reconnected all the cables (except the front panel). I should have done some tests with a partially assembled setup instead of connecting everything, but ever the optimist (the impatient, really), I said let's go for it.

Shorted motherboard
So I turn on the computer and I see a blue flash and everything goes dead. No beeps, no fans, no nothing. Uh oh! What could it be? Try again, same thing. Start disconnecting things, keep trying, same result. Disconnect everything except the motherboard and CPU, try again, blue flash, dead.

Go to the MSI troubleshooting page. They recommend testing "...the motherboard outside of the case to verify that the motherboard is not shorting to the case". Aha! That sounds like a likely cause; I knew I should have done some testing before. Take out the motherboard, try again, same result. Take out the memory, disconnect the CPU fan, try again, same.

At this point I got to thinking: the only thing that was not "kosher" was the CPU socket assembly that I took from the old motherboard. So I unscrewed the heatsink, removed the CPU, removed the socket assembly, and sure enough the back plate of the socket assembly was bigger than the original and was making contact with the pins of three capacitors.

Going rustic
So my options were: order a new heatsink and cooler, find some compatible screws (unlikely) or buy a drill. It was 1am, so I went to Walmart and got some safety goggles, a new drill, some drill bits and a Dremel accessory. Two hours later (those back plates are made of solid metal) I had a rustic-looking back plate with a 1 inch by 1/4 inch section removed. This time I tested thoroughly and everything was working fine. An hour later I had a fully functioning computer again (or so I thought; that story is in the next post).
