Thursday, December 31, 2009

Dell Server 1800 + Linux + Rebuilding drive + reboot = Failed Array?

So I came in to swap some backups on Monday, even though I was supposed to be out on vacation, and heard a loud alarm coming from the server room. After searching around for a while I found it was our Dell PowerEdge 1800, which had been sent to us a couple of years ago, before my time, with a vendor app loaded on it. Well, after restarting, the alarm stopped. YAY. Good. Well, not quite. When the server got to the point in the boot where it discovers the RAID array, it alarmed again, stating that the array had a missing member or was rebuilding.

So I hit Ctrl+A to hop into the utility. After looking around I could see that one of the drives was not showing up, which added up, since while it was loading I could hear some putting and clicking sounds that drives shouldn't make. I contacted the vendor and they sent out a replacement drive. After putting that in and getting back into the utility, I had it search for new drives and it started the rebuild. Since this was a critical server to have up, I went ahead and exited the utility so that it could boot into the OS.

Now I knew that it wouldn't perform as fast, but at least it would be up and rebuilding in the background. Well, as Linux started to boot it said that the journal was dirty (or something to that effect). To me, a Windows guy, it sounded pretty much like a check disk, so I stuck around to keep an eye on it. At 67.5% it locked up, gave me some errors about EXT3-FS yada yada (sorry, I didn’t have anything to write the errors down on when they came up), and then rebooted. I knew that wasn't good.

Well, after coming back up it told me that the array had now failed. :( Nooooooo. It always happens on vacations or weekends. After trying various things with the vendor's support, they had me force the RAID online. We rebooted and the Linux boot loader came up. YAY. Then came some more errors, including one along the lines of "kernel failed to init." Not good. So now I wait for the install media for the vendor application so it can be set up again, at which point we will restore as much data as possible.

Moral of the story: you may just want to stay in the utility and let it rebuild, even though it may take a while.

I use Microsoft products for the most part but dabble in open source software when the need arises, e.g. setting up a Clonezilla server for a large school district for imaging.

