What's your disk MTBF?

A typical disk has a quoted MTBF of a million hours or so (that's for SCSI; IDE may be somewhat less). A million hours is about 114 years, so let's call it 100 years to allow for a mix of IDE and SCSI drives.

Now, I have about 1,000 drives, of a wide range of types and ages - IDE, SCSI, FC-AL - some dating from the late 1990s. Based on the MTBF and the number of drives (1,000 drives divided by a 100-year MTBF), I would expect to see 10 disk failures a year - or almost one each month.
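For anyone who wants to plug in their own numbers, here is a minimal sketch of that back-of-the-envelope sum. It assumes failures are spread evenly over time, which is what a flat MTBF figure implies. The function and variable names are just for illustration.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def expected_failures_per_year(num_drives, mtbf_years):
    """Expected annual failures for a fleet, assuming a constant failure rate."""
    return num_drives / mtbf_years


# The quoted million-hour MTBF, converted to years (a little over 114).
quoted_mtbf_years = 1_000_000 / HOURS_PER_YEAR

# The rounded-down figure used in the post: 1,000 drives, 100-year MTBF.
fleet_failures = expected_failures_per_year(1000, 100)

print(f"Quoted MTBF in years: {quoted_mtbf_years:.1f}")
print(f"Expected failures per year: {fleet_failures:.1f}")
print(f"Expected failures per month: {fleet_failures / 12:.2f}")
```

Rounding the MTBF down to 100 years makes the estimate slightly pessimistic, which seems fair for a mixed bag of drives.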

I'm not actually seeing anything like that failure rate. It's much lower. I had a disk fail earlier in the week, but that's rare. My guess is that I'm seeing only about a third of the expected rate of failure.

That's on my main systems anyway. Those are in a controlled environment - stable power, A/C provides constant temperature (not as cool as I would like, but stable). They spin for years without being provoked. By contrast, I have a 12-disk D1000 array that sat unused in a cupboard for the best part of a year before being rescued and thrown together for beta testing ZFS, and so far 4 of the 12 disks in that have failed in the last year. That's a far higher rate than the rated MTBF would suggest: 12 disks at a 100-year MTBF should produce about 0.12 failures a year, or roughly one every eight years.
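To put a number on how far out of line that array is, here is a rough sketch comparing observed failures against what the MTBF predicts. It uses the same rounded 100-year MTBF as above, and the helper name is made up for illustration.

```python
def failure_ratio(observed_failures, num_drives, mtbf_years, years=1.0):
    """Observed failures divided by the number the rated MTBF predicts."""
    expected = num_drives / mtbf_years * years
    return observed_failures / expected


# The D1000 array: 4 of 12 disks failed in roughly one year.
d1000_ratio = failure_ratio(observed_failures=4, num_drives=12, mtbf_years=100)

print(f"D1000 failures vs. expected: about {d1000_ratio:.0f}x")
```

A sample of 12 disks is far too small to draw statistics from, so treat the ratio as an indication of scale rather than a measurement.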

My conclusion would be - and this should be well known to everyone anyway - that if you take care of your kit and give it a nice stable environment in which to operate, it will reward you with excellent reliability. Beat it up and treat it like rubbish, and it will refuse to play nice.

