CylonGlitch:
What you say is correct; however, the big drop in performance is related to what I just wrote in my previous post. Now I must really leave, can't do anything until tomorrow!
What I wrote was only relevant to SSD devices, not HDD. I did a quick search for this and found nothing, as I mentioned. Please link the websites you are referring to. I have no performance loss on my SSD devices. The drive still needs to perform a read-modify-write every time you do a write, unless the write covers the complete block. Some technologies or controllers may write zeros to the block before writing the new block back. But in that case I wouldn't expect the controller to first detect whether the block is blank, because that check would itself take quite a while inside the controller: it has to examine every bit to verify it is unset (either 1 or 0, depending on the storage technology) before it can skip writing a blank block back.
Let's say that the tech requires a blank block when writing. In the SSDs that I worked with this isn't the case, but let's assume it is here. From the controller side, you get a command to write X bytes to a block; here is what the flow of operations would look like:
1) Read Block
2) Merge the block that was read with the data being written (however many bytes are being written)
3) If the block that was read is all 0's, go to step #4; otherwise write a block of 0's (in this case to initialize the block)
4) Write the new block from step #2.
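To make the four steps concrete, here is a minimal Python sketch of that flow. It is only an illustration of the argument, not real controller firmware: the Block class, BLOCK_SIZE and write_bytes are hypothetical names I made up, and the block is tiny so it is easy to follow. The write counter is there only to show when step 3 costs an extra write.

```python
BLOCK_SIZE = 16  # hypothetical block size in bytes, kept tiny for illustration


class Block:
    """Hypothetical stand-in for one storage block; counts physical writes."""

    def __init__(self, data=None):
        self.data = bytearray(data if data is not None else bytes(BLOCK_SIZE))
        self.writes = 0

    def read(self):
        return bytes(self.data)

    def write(self, data):
        assert len(data) == BLOCK_SIZE
        self.data[:] = data
        self.writes += 1


def write_bytes(block, offset, payload):
    """Partial write using the four-step flow, with the blank check in step 3."""
    assert offset + len(payload) <= BLOCK_SIZE
    old = block.read()                                # 1) read block
    merged = bytearray(old)
    merged[offset:offset + len(payload)] = payload    # 2) merge in the new bytes
    if any(old):                                      # 3) not all 0's: initialize first
        block.write(bytes(BLOCK_SIZE))
    block.write(bytes(merged))                        # 4) write the merged block
    return bytes(merged)
```

With this sketch, a partial write to a blank block costs one write, while the same write to a block that already holds data costs two. That difference is what skipping step 3 would save.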
What you are claiming is that step #3 can be skipped if the block is already 0'ed. (Note: I am saying 0's for the sake of understanding.)
What I claim is that steps #2 and #3 can happen in parallel. Step #2 is the long pole here: merging X bytes together takes quite a bit of time, whereas one full block write is actually quite quick. Thus, as a developer, you would code the controller to do both operations at the same time and increase throughput to the chip. Internally, the controller just has a preset block hard coded and dumps it to any block that needs it; it's quite easy to do. In that case the check of whether every byte in the block that was just read is blank, which in itself takes a good amount of time, can be skipped entirely if you just assume that you ALWAYS have to write a fixed block back. Now, with the technology being so fast and small, if it is possible to OR all of the bits of a block together and find out in one clock cycle whether the write needs to go through, then that could be useful, but it would require significant hardware to do this.
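Here is a rough Python sketch of the two alternatives I'm describing, under the same assumption that a write needs a blank block. All names (ZERO_BLOCK, write_always_init, is_blank) are hypothetical. The thread pool is only a stand-in for the controller running the merge and the initializing write side by side; real hardware would use separate data paths, and this says nothing about actual timing.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from operator import or_

BLOCK_SIZE = 16                 # hypothetical block size in bytes
ZERO_BLOCK = bytes(BLOCK_SIZE)  # the preset block "hard coded" in the controller


def merge(old, offset, payload):
    """Step 2: overlay the new bytes on the block that was read."""
    merged = bytearray(old)
    merged[offset:offset + len(payload)] = payload
    return bytes(merged)


def write_always_init(medium, offset, payload):
    """Steps 2 and 3 overlapped; step 3 is unconditional, so no blank check."""
    assert offset + len(payload) <= BLOCK_SIZE
    old = bytes(medium)                                       # 1) read block

    def init_block():
        medium[:] = ZERO_BLOCK                                # 3) always dump the preset block

    with ThreadPoolExecutor(max_workers=2) as pool:
        merge_job = pool.submit(merge, old, offset, payload)  # 2) merge, concurrently
        init_job = pool.submit(init_block)
        merged = merge_job.result()
        init_job.result()
    medium[:] = merged                                        # 4) write the merged block
    return merged


def is_blank(block):
    """The alternative: OR every byte together; the result is 0 only if all bits are 0."""
    return reduce(or_, block, 0) == 0
```

In software is_blank walks the block byte by byte, which is the slow check I would avoid. Doing the same OR across the whole block in one clock cycle is the part that would need the extra hardware.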
Again, please point me to the articles you're reading; I would like to see them.