Intel 320 SSD write performance – contd.

I wrote about Intel 320 SSD write performance before, but I was not satisfied with those results.

For some reason I was getting different write performance from the Intel 320 SSD each time, so I decided to look into this in more detail.

So let’s run the same experiment as in the previous post: sysbench fileio random writes on different file sizes, from 10GiB to 140GiB in 10GiB steps. I use the ext4 filesystem, and I reformat the filesystem before each increase in file size.
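The exact invocations are in the scripts linked at the end of this post; a minimal sketch of this kind of run, with assumed option values for run time and request count, looks like this:

# prepare the test files (here 10GiB total; repeated for each size up to 140GiB)
sysbench --test=fileio --file-total-size=10G prepare
# random write workload against the prepared files
sysbench --test=fileio --file-total-size=10G --file-test-mode=rndwr --max-time=600 --max-requests=0 run
# remove the test files before reformatting for the next size
sysbench --test=fileio --file-total-size=10G cleanup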

The results are pretty much as in the previous post: the throughput drops as the file size increases:

However, this is where it gets interesting. When we run the same iterations again, the results look like this:

As you can see, the second time around the throughput is much worse, even on medium-sized files. Beyond a 50GiB file size, throughput drops below 40MiB/sec, and this despite the fact that I reformat the filesystem before each run.

This leads me to the conclusion that write performance on the Intel 320 SSD degrades over time, and in fact is quite unpredictable at any given point in time. A filesystem format does not help; only a secure erase procedure returns the drive to its initial state. For reference, these are the commands for that procedure:

# set a temporary user password (required before the ATA secure erase can be issued)
hdparm --user-master u --security-set-pass Eins /dev/sd$i
# issue the ATA SECURITY ERASE UNIT command using that password
hdparm --user-master u --security-erase Eins /dev/sd$i
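Note that these commands will be rejected if the drive is in the “frozen” security state (many BIOSes freeze security at boot); the current state is shown in the Security section of the hdparm identify output:

# look for "not frozen" in the Security section
hdparm -I /dev/sd$i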

Discussing this problem with engineers who work with Intel 320 SSD drives, I was advised to use artificial over-provisioning of about 20%. Basically, we create a partition that takes up only 80% of the drive’s space.
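For illustration, here is one way to create such a partition; this is a sketch assuming the drive is /dev/sdb with a GPT label, not necessarily how it was done for these tests:

# relabel the drive (this destroys all data on /dev/sdb)
parted -s /dev/sdb mklabel gpt
# create one partition covering only the first 80% of the drive,
# leaving the last 20% untouched as extra spare area for the controller
parted -s /dev/sdb mkpart primary 0% 80%
# create the filesystem on the partition as usual
mkfs.ext4 /dev/sdb1

The key point is that the reserved 20% is never written by the filesystem, so the controller always has clean blocks to work with.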

So let’s try this. The experiment is the same as before, with the difference that I use a 120GB partition and the maximum file size is 110GiB.

You can see that the throughput in the first iteration is basically the same as with the full drive, but the second iteration performs much better: throughput never drops below 40MiB/sec and stays at about the 50MiB/sec level.

So I think this advice to use over-provisioning is worth considering if you want some protection against this degradation and want to maintain throughput at a certain level.

As always, you can find the raw results and the scripts used on our Benchmarks Launchpad.



6 thoughts on “Intel 320 SSD write performance – contd.”

  1. tobi

    You might be getting different results each time because the SSD is garbage collecting in the background. The more time you let pass between test runs, the more time the SSD has to optimize itself.

  2. Will Smith

    Have you tried varying the amount of time between 1st and 2nd runs?

    My hypothesis is that the SSD needs time to “recover” i.e. the garbage collection is running at low priority and only does its thing when you give the SSD a “pause”.

    Maybe there’s justification for leaving servers on overnight, or even, in extreme production environments, having a live-SSD and a recovering-SSD and alternating (perhaps each day?)

  3. John Laur

    The GC algorithm can only erase blocks it is very sure do not contain data. How can it know this? Well, if a block is completely zeroed it might have a clue; or it can know something about the filesystem. Does Intel’s GC algorithm know about NTFS? Probably. How about Ext4? Maybe. How about XFS? Doubtful. But how can we know and plan for this when the algorithm is not well documented?

    When the drive is formatted only a small amount of data is changed. To the SSD controller the drive is still full of blocks of allocated data from the previous run, so it’s not surprising in the least to see this behavior. The filesystem may not allocate files into exactly the same blocks on the second run as it did the first, so the amount of “free” erased blocks that the SSD can use for performance advantage is reduced.

    But by secure erasing and repartitioning with a smaller partition size you can ensure that a certain fixed set of blocks is never written at all. This in effect gives the user the ability to tune the reserved allocation of flash and limit the effect of the ‘write wall’ that is so well documented here. In fact, simply keeping your filesystem X% full may not be enough (and without a driver, controller, disk, etc. that supports TRIM all the way down the stack it will never be enough) – artificially limiting the partition size (or volume size on a RAID array) in this manner will be necessary.

    You also have to be very careful if you are building arrays with SSDs, as the build algorithm will have a tendency to write blocks to a large portion of each drive. (A RAID-1 resilver will “fill” an SSD.) So the real solution is to be able to tune the amount of reserved flash on the hardware (iodrive) or at least be able to purchase drives with more native reserved space (most Sandforce drives).
