While doing a routine health check, I realized that one of my servers (the one running a hobby project that my friend and I built) has been working non-stop for almost two years.
I assembled the server myself. A 3U rack-mount chassis bought directly from the factory (a real bargain!). Two redundant PSUs. An Intel dual-socket motherboard. Two 2.8GHz Xeon CPUs with hyper-threading. 2GB of RAM (yes... RAM was quite expensive at that time, so that was what my friend and I could afford :P). Two RAID arrays: 2 x 80GB in RAID 1 for the OS and database, plus 8 x 200GB in RAID 5 for application data.
I spent nearly a month assembling the hardware and fine-tuning the OS kernel (Linux) and the RAID/DB/cache settings. I still can't forget the noise produced by the two PSUs and 10 hard disks, especially since I had to do it all in my bedroom and keep the machine running 24x7 for stress tests!
Actually, the server first went live more than two years ago. But about six months after it was deployed to the data center, it suddenly crashed one day. After a few hours of diagnostics, I concluded that one of the SATA ports had failed! It then took me another five to six hours to rebuild the RAID, perform some tests and restart the application.
The service was resumed about two years ago, around midnight. Luckily, it has been running smoothly ever since. This also means that I haven't been to the data center for two years! :P
1 comment:
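"While doing a routine health check" ... I was so happy when I read this line, because I thought you had gone for a body check! (I had the window open really small while reading this blog.) But then ... it turns out only the machines got a health check, and their owner didn't ... =_= ~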