(speaking about RAID 5 being better than RAID 10 for sequential read/write.)
Yeah, and that's all I'm interested in. So what I said at first still holds. It's what is most important for most of us here.
Are you sure? I mean, what makes you so sure your actual workload is significantly more sequential than random? Or maybe you don't care about what happens on your drives and just want to be sure they deliver the best sequential I/O, even if it kills your random I/O (that's a legitimate choice).
Before you read the following, I want you to understand that I don't intend to deliver the absolute truth below. It's just a bunch of remarks, questions, and a few attempts to answer those questions. Any constructive criticism is very welcome.
Focusing on sequential access (and if you are right, that's what most people here do) is like focusing on megapixels when choosing a digital camera. It's important, but it's not enough. And maybe it's not even the most important thing.
I would like to make a quick comparison (data taken from storagereview.com).
Let's take one SATA HDD and one SSD:
Code:
- Western Digital Caviar Black WD1001FALS (1000 GB SATA)
Average Random Access Time (Read) 12.2 ms
Average Random Access Time (Write) 13.2 ms
Maximum Transfer Rate 111.0 MB/sec
IOMeter File Server - 1 I/O 89 IO/sec
Code:
- MTRON MSP-SATA7035-64 (64 GB SATA)
Average Random Access Time (Read) 0.1 ms
Average Random Access Time (Write) 6.8 ms
Maximum Transfer Rate 108.0 MB/sec
IOMeter File Server - 1 I/O 688 IO/sec
OK, it's not the best SSD around, but recent SSDs don't outperform this one by a factor of 5, so it will do.
Compared to HDDs, SSDs don't offer a very good "Maximum Transfer Rate". They are good, but not really better. And sequential performance is bound to this transfer rate.
Compared to HDDs, SSDs do offer a *very* good "Average Random Access Time" and a very good number of I/Os per second: about 120 times faster for random reads, and nearly 8 times the I/O rate in the file-server test. And random access performance is bound to access time and I/Os per second.
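These ratios fall straight out of the two spec blocks above; a quick sketch (values copied from the quoted storagereview.com numbers, nothing measured by me):

```python
# Figures copied from the two "Code:" blocks quoted above (storagereview.com).
hdd = {"read_ms": 12.2, "write_ms": 13.2, "mb_s": 111.0, "iops": 89}
ssd = {"read_ms": 0.1, "write_ms": 6.8, "mb_s": 108.0, "iops": 688}

print(f"Random read latency:  {hdd['read_ms'] / ssd['read_ms']:.0f}x better on SSD")   # 122x
print(f"Random write latency: {hdd['write_ms'] / ssd['write_ms']:.1f}x better on SSD") # 1.9x
print(f"File-server IOPS:     {ssd['iops'] / hdd['iops']:.1f}x more on SSD")           # 7.7x
print(f"Sequential transfer:  {ssd['mb_s'] / hdd['mb_s']:.2f}x (slightly slower!)")    # 0.97x
```

The asymmetry is the whole point: two orders of magnitude on random reads, essentially a tie on sequential transfer.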
You'll have to admit, most people here want SSDs because they think SSDs deliver really good performance (and they are right).
I don't think people want to pay ~$3 per GB for an SSD that delivers between 110 and 200 MB/s maximum transfer rate when one or two HDDs will deliver the same at only ~$0.1 per GB.
So why do people love their SSDs, if not for the random access performance boost they deliver?
If they really feel the difference (and they do), that's because random access is everywhere on the file system, and it happens during every workload.
Sometimes, under particular workloads, random access "disappears" under a huge sequential transfer (big file duplication/transfer), but it's still there, under the hood. That's why Apple introduced auto-defragmentation in Mac OS X, and that's why ZFS optimizes writes so that it can flush them sequentially. Everybody tries to reduce random access, because it's what kills the performance of your drives.
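The idea behind the ZFS trick can be sketched in a few lines. This is a toy illustration of write coalescing, not ZFS's actual implementation: buffer incoming random writes, then flush them in offset order so the disk sees one mostly sequential pass.

```python
# Toy sketch of write coalescing (the principle behind ZFS-style batched
# flushes, not its real code): collect random writes, flush sorted by offset.
def coalesce_writes(writes):
    """writes: list of (offset, data) pairs arriving in random order.
    Returns the flush order: deduplicated writes, sorted by offset."""
    buffered = {}
    for offset, data in writes:
        buffered[offset] = data          # a later write to the same offset wins
    return sorted(buffered.items())      # flush sequentially, lowest offset first

incoming = [(4096, b"c"), (0, b"a"), (8192, b"d"), (0, b"A")]
print(coalesce_writes(incoming))   # [(0, b'A'), (4096, b'c'), (8192, b'd')]
```

Three scattered writes (one of them overwritten while still buffered) become one short sequential sweep, which is exactly the access pattern a spinning disk is good at.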
In the real world, it's a matter of balance between sequential and random access (and it's quite complicated to separate one from the other), but random access is definitely of paramount importance.
I'm in the process of collecting data about my own I/O using DTrace tools, during my regular daily workload (client side, not server side). So far, it shows that unless you play with very big files (media post-production, big content duplication...), most daily usage is more random than sequential (email, browsing, chat, coding, usenet). You can take a look at a first test here:
http://patpro.net/~patpro/iopattern2.eps
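For the curious, the kind of classification an iopattern-style DTrace script performs is simple to sketch. This is my own illustrative approximation, with made-up sample events, not the actual script: an I/O counts as "sequential" if it starts exactly where the previous one ended, "random" otherwise.

```python
# Rough sketch of iopattern-style classification (illustrative only).
# events: list of (offset, size) pairs in arrival order, e.g. from a
# block I/O trace. An I/O is sequential if it begins where the previous
# one ended; everything else (including the first I/O) counts as random.
def classify(events):
    seq = rand = 0
    prev_end = None
    for offset, size in events:
        if prev_end is not None and offset == prev_end:
            seq += 1
        else:
            rand += 1
        prev_end = offset + size
    return seq, rand

trace = [(0, 4096), (4096, 4096), (65536, 4096), (8192, 4096)]
print(classify(trace))   # (1, 3): one sequential I/O, three random ones
```

Run that over a real trace and you get exactly the sequential/random split I'm plotting above; on my desktop workloads the random bucket dominates.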
regards