In any multi-device redundant array: Number of devices that can fail simultaneously while the array keeps running: N. Number of devices that happen to fail spontaneously: N + 1.
no subject
Date: 2009-03-26 06:22 pm (UTC)
Speaking of redundancy and the failure thereof (though, in this tale, not due to the hardware):
I heard from a coworker about this guy who had a bunch of disks in an array, hot-swappable and RAIDed, all that good stuff. However, he had no understanding of how the system worked. So, a drive went dead, its little red light came on, and he pulled it out of the array. However, he noted that the empty hole in the array looked, well, messy. Should be over at one end, see. So he moved one of the remaining drives. Of course, to do this, he had to first pull out one of the remaining drives. Now two drives down. Not one of the RAID configs that will tolerate that. Array now hosed.
Oy vey. Keep everyone with aesthetic motives away from servers!
no subject
Date: 2009-03-26 08:01 pm (UTC)
The good news is that a nice fellow from London says, "We look after everything" for this particular array. It's the old "two lightbulbs at once" jobbie, and yes, tomorrow is an epic restore from backups...
Meh, it's only cardiology. People can use their backup hearts or something.
no subject
Date: 2009-03-30 02:57 am (UTC)
I.e., the failures that occur simply indicate the failsafe system is working. Thus we should rejoice in our increasing safety!
no subject
Date: 2009-03-30 09:09 am (UTC)