andygates: (Default)
[personal profile] andygates
We've got a big data store. It's currently built as a single max-size 2Tb RAID 5 array over 11 disks. It performs like a three-legged dog, and each of the 300Gb disks has about 75Gb unused (because using that space would take the volume over the 2Tb SCSI RAID limit).

We need to rebuild it. We've been advised that volume sizes over 1.5Tb are dogs under SCSI RAID, so we're keen not to do that. Our options are:

* RAID 10, 10 disks+hotswap, total volume space 1.5Tb, good write performance.

* RAID 5, 7 disks+hotswap, total volume 1.6Tb, but it feels wasteful of all these lovely disks.

* Two smaller RAID 5 volumes, 4+hotswap and 5+hotswap, giving us .9 and 1.2 Tb respectively. The volumes should behave better, because they're smaller and on fewer disks, but will this just move the bottleneck up to the servers' RAID controller?
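
For reference, a rough sketch of the capacity arithmetic behind those options, assuming the nominal 300Gb disks and counting hot spares separately. The nominal sums land a little above the figures quoted, which presumably reflects the controller's base-2 accounting and formatting overhead.

    # Rough capacity sums for the three options, using nominal 300Gb disks
    # and leaving hot spares out of the counts.  Real usable space comes out
    # somewhat lower once base-2 accounting and formatting take their cut.
    DISK_GB = 300

    def raid10_gb(disks):
        # Half the disks hold mirror copies.
        return disks // 2 * DISK_GB

    def raid5_gb(disks):
        # One disk's worth of capacity goes to parity.
        return (disks - 1) * DISK_GB

    print("RAID 10, 10 disks:      %4d Gb" % raid10_gb(10))                  # 1500
    print("RAID 5,   7 disks:      %4d Gb" % raid5_gb(7))                    # 1800
    print("2x RAID 5, 4 + 5 disks: %3d Gb + %4d Gb" % (raid5_gb(4), raid5_gb(5)))  # 900 + 1200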

Gurus, your wisdom is much appreciated!

Date: 2006-10-19 02:00 pm (UTC)
From: [identity profile] thudthwacker.livejournal.com
What is the array being used for, and how much of the available capacity is currently being taken up?

Date: 2006-10-19 03:16 pm (UTC)
From: [identity profile] andygates.livejournal.com
General file storage, just under 1Tb, and it's about 2:1 write:read from the activity monitors.

Date: 2006-10-19 03:46 pm (UTC)
From: [identity profile] thudthwacker.livejournal.com
My personal tendency would be to go for the RAID 10 setup. Overall performance is better, and you don't have to worry about the possibility of misestimating what should go on which of the two smaller RAID volumes and having to shuffle stuff around.

In the interests of full disclosure, I'm irrationally touchy about having to move stuff between volumes. This is a holdover from several years back when we had an old SGI Challenge with a pile of differently-sized SCSI drives, and would regularly have to spend hours moving user groups off one and onto another when they ran out of space.

"...on the other hand..."

Date: 2006-10-19 03:50 pm (UTC)
From: [identity profile] thudthwacker.livejournal.com
Of course, you get over half a TB more space if you set up two volumes, and that's not inconsiderable.

So, if you're worried that the controller will bottleneck if it has to handle two volumes, you can set up the 1.2TB volume first and put everything on it (since you're under a TB of use now), and check performance. You then set up the second volume and port some stuff over to it, and look for performance degradation. If it looks good, you can port more stuff over. If it looks bad, you can always do RAID 10. (He said cavalierly, as if he were the one who was going to be spending several hours shoveling bits around.)
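
If it helps, a crude way to sanity-check a volume before and after porting data onto it is just to time a large sequential write. This is only a sketch (the path and sizes are made up, and the page cache will flatter short runs), not a proper benchmark; something like bonnie++ or iozone will give far better numbers, but this is enough to spot a gross regression.

    # A crude sequential-write timer for a candidate volume.  TEST_FILE is a
    # hypothetical path - point it at the new volume's mount point.
    import os
    import time

    TEST_FILE = "/mnt/newvolume/throughput_test.bin"   # hypothetical mount point
    BLOCK = 1024 * 1024     # 1 MiB per write
    BLOCKS = 1024           # 1 GiB in total

    buf = os.urandom(BLOCK)
    start = time.time()
    with open(TEST_FILE, "wb") as f:
        for _ in range(BLOCKS):
            f.write(buf)
        f.flush()
        os.fsync(f.fileno())   # force the data out to the array, not just the cache
    elapsed = time.time() - start
    os.remove(TEST_FILE)

    print("wrote %d MiB in %.1f s  (%.1f MiB/s)" % (BLOCKS, elapsed, BLOCKS / elapsed))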

Date: 2006-10-19 03:53 pm (UTC)
From: [identity profile] gedhrel.livejournal.com
For large RAID arrays on fileservers, we tend to use 4+1 groups of disks. The point of this is that (1) a rebuild is relatively painless; (2) compared to, say, 14+1, the odds of getting two failures in one raid-5 group are that much smaller (it still happens, alas).

In the absence of any specifics I'd suggest 2x (4+1 raid 5) and a pool spare. Amalgamate the resulting volumes however you like (at the server level if necessary). Raid-5 is fine (some would say great) for a stock fileserver. Depending on the size of the files in question, you might want to drop the stripe size to give the array a chance to do whole-stripe writes - a lot depends on the capabilities of your raid kit (what is it?) where that is concerned.
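
To make the whole-stripe point concrete, here's a toy calculation of how the per-disk chunk size maps onto the full-stripe write size for a 4+1 group. The chunk sizes are illustrative, and terminology varies by controller (some call the per-disk chunk the "stripe size", others mean the full stripe), so check what your kit actually reports.

    # Per-disk chunk size vs full-stripe write size on a 4+1 RAID-5 group.
    # Aim the full-stripe size at your typical write size.
    DATA_DISKS = 4    # a 4+1 group: four data disks plus one parity

    for chunk_kib in (16, 32, 64, 128, 256):
        full_stripe_kib = chunk_kib * DATA_DISKS
        print("chunk %3d KiB  ->  full stripe %4d KiB" % (chunk_kib, full_stripe_kib))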

Date: 2006-10-19 03:55 pm (UTC)
From: [identity profile] gedhrel.livejournal.com
Incidentally, for the 500GB disks in our high-density kit we've tended to allocate 2 pool spares per 14-disk row, mostly because of the pain associated with a physical replacement.

Date: 2006-10-19 04:08 pm (UTC)
From: [identity profile] gedhrel.livejournal.com
Incidentally, the raid-5 performance issue comes down to the fatness of your reads and writes. For large reads, having lots of disks is great because you can stream data off all of them in one go. But for file service you need stripe-sized writes for this to work well, because otherwise a small write means reading the touched disks plus parity* before doing the corresponding writes.

So you want your stripe size down. That means (1) think about using fewer disks per raid-5; (2) look at the average write sizes - I think your fileserver should be able to tell you this - and see if you can get your stripe size to match.

* there's a simple XOR trick that means you don't need to read the whole stripe in this case.
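
A toy illustration of that XOR trick, with a 4+1 stripe simulated in a few bytes: updating one data block only needs the old copy of that block and the old parity, so a sub-stripe write costs two reads and two writes, while a full-stripe write needs no reads at all because parity can be computed from the new data alone.

    # The XOR trick from the footnote:
    #     new_parity = old_parity XOR old_data XOR new_data
    import os

    def xor(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    CHUNK = 16                                      # toy chunk size in bytes
    data = [os.urandom(CHUNK) for _ in range(4)]    # the stripe's four data blocks
    parity = data[0]
    for block in data[1:]:
        parity = xor(parity, block)

    # Small write: replace block 2, updating parity without reading blocks 0, 1, 3.
    new_block = os.urandom(CHUNK)
    parity = xor(xor(parity, data[2]), new_block)   # old parity ^ old data ^ new data
    data[2] = new_block

    # Sanity check: parity still equals the XOR of all current data blocks.
    check = data[0]
    for block in data[1:]:
        check = xor(check, block)
    assert check == parity
    print("parity updated with 2 reads + 2 writes; check passed")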

Date: 2006-10-19 04:09 pm (UTC)
From: [identity profile] thudthwacker.livejournal.com
Amalgamate the resulting volumes however you like (at the server level if necessary).

[smacks self in forehead]

I've even had my coffee.

Date: 2006-10-19 05:52 pm (UTC)
From: [identity profile] andygates.livejournal.com
Guys, you're both golden, and I'll wave this at my coworker in the morning.

It's been a while

Date: 2006-10-19 09:11 pm (UTC)
From: [identity profile] carl42nz.livejournal.com
Where do you think your bottleneck is?

With heavy reads, there's latency while the disk seeks to a stripe position for reading (i.e. more stripes means more chance of seek latency per read) - this may have been reduced by on-drive hardware improvements. There's a similar consideration for writes, but I can't recall it.

If data is moving to and from the same devices, can they be split physically and rejoined logically? That separates the load across channels.

What are the main constraints? Price? Physical size/cable size? Speed?

On the old superserver range I used to work on there were some optimal configurations - certain channels and certain raid patterns that the controller handled better. The high-level raid 5 & 10 were a hassle, as I recall, because of the extra in-channel bandwidth consumed by parity and drive commands: the parity has to be calculated (fast), then there's the drive latency (a moderate probability, but a significant delay of many microseconds), then the commands placed onto the channel lose you 10%-20% of the bandwidth straight away. Also, big disks sometimes make for poor administration, which increases the chance of latency and puts bandwidth hogs on the same highway as smaller, higher-priority requests.
