Crashing Computer: Recovering from Disaster Recovery Plans

A couple weeks ago, I rebuilt my old computer with a new motherboard, purchased for disaster recovery in case my original mobo died, but I had some problems with crashing. The problems seemed to be related to RAM. Some configurations crashed quickly and often, and others were pretty stable, but nothing was as stable as the system I was hoping to retire. I bought new RAM, mainly because I needed more than the 4GB that seemed to be working. (I got one configuration that didn’t crash.)

I tested the new RAM in different slots, increasing the voltage, slowing the timings, using the timings stored in the RAM’s own configuration, using defaults, using auto configuration, and trying a few other settings, and it just wasn’t holding stable. It was more stable than the G.Skill stick, but it wasn’t as stable as the Corsair. Crashes were frequent enough to be a problem.

I finally turned to the motherboard documentation, and found out that the RAM I’d purchased wasn’t on the manufacturer’s tested RAM list. So I got an RMA to return it to Newegg, and bought one of the models that had been tested.

By today, though, I was tired, and needed a stable computer to work with, so I swapped in the old motherboard.

I’ve never had this many problems before. Dozens of times, I’ve purchased RAM based only on the nominal performance and timings, so it would match the installed memory, and it has worked.

My conclusion was that the motherboard, not the RAM, is the problem. The person who sold the board to me, wangrp, probably had some problems with it, was already out of the warranty or return period, and sold it on eBay. I’m going to sell it as not working or for parts, and try to recover $20 or so.

So, the old motherboard, with its nonfunctional AHCI chip, is working fine. I have another 8GB of RAM coming in, which I really don’t need, but it’ll be nice to have when I’m running a bunch of VMs. Ultimately, I still need to find a backup motherboard.

Disaster Recovery Lessons Learned

The intention of this motherboard purchase was to have a backup computer for when the current motherboard fails.

The entire purchase needed to be planned out better. Because I didn’t have a second PC, my data disaster recovery had to be retested before buying the motherboard. The backups and restoration were fine, but it took time to sit down and do tests. It takes a few hours, even with an acceptable backup system.

When parts arrived, I should have built the new system immediately, so it could be returned if defective.

This entire process of just getting to a functioning computer was probably five hours because of the required DR tests.  It’s not a short task.  It’s really an entire day of work, and there is no “multitasking” to make it happen faster. If your return period is 14 days, that’s really only 10 work days, and if you add in delivery times, maybe 8 days to complete the build and test.
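The return-window arithmetic above can be sketched in a few lines of Python. This is only a rough illustration, and it makes some assumptions that aren’t from any store’s actual policy: the window is counted from the order date, weekends aren’t available for building, and `shipping_days` is a made-up calendar-day delivery estimate. The function name `usable_workdays` is just for this example.

```python
from datetime import date, timedelta


def usable_workdays(order_date, return_days=14, shipping_days=2):
    """Count weekdays between the parts arriving and the return window closing.

    Assumptions (hypothetical, adjust to the real policy):
    - the return window starts on the order date
    - shipping_days is in calendar days
    - only Monday through Friday count as build/test days
    """
    arrival = order_date + timedelta(days=shipping_days)
    deadline = order_date + timedelta(days=return_days)  # first day that is too late

    count = 0
    day = arrival
    while day < deadline:
        if day.weekday() < 5:  # 0-4 are Monday-Friday
            count += 1
        day += timedelta(days=1)
    return count


if __name__ == "__main__":
    order = date(2024, 1, 1)  # a Monday, picked only as an example
    print(usable_workdays(order, 14, 0))  # no shipping delay
    print(usable_workdays(order, 14, 2))  # two days in transit
```

For an order placed on a Monday, this gives 10 work days with no shipping delay and 8 with two days in transit. It doesn’t subtract the time needed to ship the part back, so the real margin is even thinner.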

So, all preparatory work should be done before purchasing anything.

These small logistical problems add up to eliminate eBay as a source for parts, unless you can quickly test and return them.

I should have ordered parts to build a second computer. I had a spare disk. A power supply would have added $50 to the cost, but the second computer could have been running RAM tests while the original computer was being used for making money.

With enough parts, the backup testing wouldn’t have been so important, because the functioning computer wouldn’t be disassembled.

If I’m going to buy from eBay again, I’ll need a better testbed with more spare parts.

In the Real World, this Sucks

In a business environment, this pace of recovery is unacceptable. The downtime is just too long. The only reasonable solution would be a same-day replacement.

If we assume that RAM and hard drives go bad, the only real “solution” is to replace them every few years, and replace the entire computer every five years. Additionally, because we need some “burn in” time to weed out the bad RAM and hard disks, maybe a month of lead time to break in the machine is in order.

So, the right solution, which ultimately costs less money, is to have planned upgrades, perform burn-ins, pre-install everything, and swap machines in quickly.

Takeaways:

  • Don’t buy critical parts on eBay.
  • Plan out the build process ahead of time.
  • Build immediately, and return the parts right away if they don’t work.
