On Benchmarks on SSD

Getting meaningful performance results on SSD storage is not an easy task. Let’s see why.
Here is a graph from a sysbench fileio random write benchmark with 4 threads. The results were taken on a PCI-E SSD card (I do not want to name the vendor here, as the problem is the same for any card).

The benchmark starts on a newly formatted card. For some period (fresh period A) you see a line with a high result, which at some point drops (point B), and after a recovery period the card settles into a steady state (state C).

What happens here? As you may know, an SSD runs garbage collection internally, and point B is the moment when the garbage collector starts its work. You can read more on this topic on the Write amplification wiki page.

So, as you can see, it is important to know what state the card was in when the benchmark was running. Apparently many manufacturers like to put the result from fresh period A into the device specification, while I think steady state C is more important for end users. So in my further results I will point out what state the card was in during the benchmark.

However, this makes running a benchmark on an SSD trickier. It is similar to benchmarking a database, but upside down: a database just after start is in a “cold state” and is slow, so you need enough warmup and should only take results in the hot state, when all internal buffers are filled and populated. An SSD, in contrast, is at its fastest when fresh and only later slows down to its real performance.
Well, you may say, just put the card into steady state C and run the benchmark. But that is only part of the problem.

The next issue comes from the TRIM command. TRIM is sent to the device when a file is deleted, and it allows the SSD controller to mark all space belonging to that file as free and reuse it immediately. Not all devices support TRIM: for example, the first generation of Intel SSD cards did not support it, while G2 does.
So why is TRIM a problem for benchmarks? Basically, if you delete all files, it returns the card to fresh state A. Many benchmark scenarios (including my initial sysbench fileio scripts) create files at the start of the benchmark and remove them afterward. A similar issue arises when you restore a database from backup, run a benchmark, and remove the files. So it may happen that a single run covers all states A->B->C, and the final result is pretty much useless. The conclusion: if you want to see results from the steady state, you should make sure your benchmark actually reaches it and stays in it.
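
One way to check which state a run ended up in is to look at the throughput samples themselves. The sketch below is only an illustration, not the method used for the results in this post. It assumes you have one throughput number per reporting interval and that the run ends in steady state C. The `samples` list is made-up data, and the function name and the `tail` and `tolerance` parameters are hypothetical.

```python
from statistics import mean


def steady_state_start(samples, tail=5, tolerance=0.10):
    """Return the index where the final plateau (state C) begins.

    The mean of the last `tail` samples is taken as the reference level.
    Walking backwards from the end, the plateau extends as long as each
    sample stays within `tolerance` (a fraction) of that reference.
    Assumption: the run really does end in steady state.
    """
    if len(samples) < tail:
        raise ValueError("need at least `tail` samples")
    reference = mean(samples[-tail:])
    start = len(samples)
    for i in range(len(samples) - 1, -1, -1):
        if abs(samples[i] - reference) > tolerance * reference:
            break  # first sample (from the end) that leaves the plateau
        start = i
    return start


# Made-up MB/s per interval: fresh period A, drop at B, recovery, state C.
samples = [400, 405, 398, 402, 150, 120, 180, 210, 225, 230, 228, 232, 229, 231]

start = steady_state_start(samples)  # 7
steady = samples[start:]
print(f"steady state starts at interval {start}")
print(f"steady-state mean: {mean(steady):.1f} MB/s, fresh peak: {max(samples)} MB/s")
```

Only the samples from `start` onward should go into the reported number. If `start` comes back as 0 on a freshly formatted card, the run was probably too short and never left period A.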

Speaking of benchmark results, there is another trick from vendors that I want to bring to your attention. Quite often you can see something like this in a specification from an imaginary Vendor X:

  • Read: Up to 520 MB/s
  • Write: Up to 480 MB/s
  • Sustained Write: Up to 420 MB/s
  • Random Write 4KB : 70,000 IOPS

The good thing here is that the vendor lists both maximal write (most likely from state A) and sustained write (I guess from state C).
However, if you multiply 4KB * 70,000 IOPS, you get 280,000KB/s, or about 273MB/s, which is quite far from the declared 480MB/s for writes.
What is the trick here? The maximal throughput in MB/s is what you get with a big block size, say 64K or 128K, while the maximal throughput in IOPS is what you get with a small block size, 4K in this case.

So when you read “Write: Up to 480 MB/s, Random Write 4KB: 70,000 IOPS”, you should know that the 480MB/s was obtained with a big block size, and for a 4KB block size you should expect only about 273MB/s (and most likely in fresh state A).
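
The conversion between the two kinds of numbers is simple enough to script when reading a spec sheet. This is a small sketch using the Vendor X figures from above. It uses binary units (1 MB = 1024 KB), and the 128K block size for the 480 MB/s figure is only an assumption, since the imaginary spec does not say which block size was used. The function names are made up for the example.

```python
KB_PER_MB = 1024  # binary units; vendors may use 1000 instead


def iops_to_mb_per_s(iops, block_size_kb):
    """Throughput in MB/s implied by an IOPS figure at a given block size."""
    return iops * block_size_kb / KB_PER_MB


def mb_per_s_to_iops(mb_per_s, block_size_kb):
    """IOPS implied by a MB/s figure at a given block size."""
    return mb_per_s * KB_PER_MB / block_size_kb


# "Random Write 4KB: 70,000 IOPS" expressed as throughput.
random_write_mb = iops_to_mb_per_s(70_000, 4)

# "Write: Up to 480 MB/s" expressed as IOPS, ASSUMING 128K blocks.
big_block_iops = mb_per_s_to_iops(480, 128)

print(f"70,000 IOPS at 4K  = {random_write_mb:.1f} MB/s")
print(f"480 MB/s at 128K   = {big_block_iops:.0f} IOPS")
```

The two headline numbers describe two different workloads, so neither one can be derived from the other.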

As the SSD market evolves, we will see more and more benchmark results, so be ready to read them carefully.


SLC and MLC

All modern solid state drives use NAND memory based on SLC (single-level cell) or MLC (multi-level cell) technology.
Without going into physical details: an SLC cell stores 1 bit of information, while an MLC cell stores more. The most popular MLC option is 2 bits per cell, and there is movement toward 3 bits.

This gives us the following characteristics:

  • SLC provides less capacity
  • SLC is more expensive
  • SLC is known to be the better quality chip; it fails less often than MLC

Along with that, there is also a limit on the number of write operations. SLC can handle about 100,000 write cycles, while MLC handles about 10,000 (the numbers are rough and change as the technology improves).

No wonder that vendors very quickly came up with the following separation:

  • SLC for enterprise market ( servers )
  • MLC for consumer market ( desktops, workstations, laptops)

An obvious example here is Intel SSD cards: the X25-E (SLC) is sold as an enterprise-level card, and the X25-M (MLC) is sold for the mass market. Another example, showing the difference in capacity and price:

  • FusionIO 160GB SLC card price $6,308.99
  • FusionIO 320GB MLC card price $6,729.99

That is, for nearly the same price the MLC card comes with double the capacity.

However, with increasing capacity the difference between MLC and SLC is getting fuzzier. For MLC the most critical part is the software (firmware) algorithm that ensures uniform usage of the available NAND chips, and with bigger capacity this is much easier to implement.
This problem of handling lifetime and managing write cycles for MLC opened the way for hardware solutions like the SandForce controller, and recently Anobit announced that its “Memory Signal Processing (MSP™) technology enables MLC-based solutions at SLC-grade reliability and performance”.

The increasing capacity of MLC devices is also important. For example, if we take 10,000 writes vs 100,000 writes, then to provide the same lifetime MLC would need about 10x more capacity, and that does not seem to be a problem. I expect that soon we will see MLC cards with 1600GB, which ideally will have the same lifetime as SLC 160GB cards.
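
The 10x figure follows from a very rough model: the total amount of data a card can absorb before wearing out is capacity times write cycles. The sketch below spells that out with the round numbers from this post. It is an idealized upper bound that assumes perfect wear leveling and no write amplification, and the function name is made up for the example.

```python
def total_write_volume_tb(capacity_gb, write_cycles):
    """Idealized upper bound on data written before wear-out, in TB.

    Assumes every cell is worn evenly (perfect wear leveling) and that
    each host write costs exactly one NAND write (no write amplification).
    Uses 1 TB = 1000 GB.
    """
    return capacity_gb * write_cycles / 1000


slc_160 = total_write_volume_tb(160, 100_000)   # SLC, ~100,000 cycles
mlc_160 = total_write_volume_tb(160, 10_000)    # MLC at the same capacity
mlc_1600 = total_write_volume_tb(1600, 10_000)  # MLC with 10x the capacity

print(f"SLC  160GB: {slc_160:,.0f} TB")
print(f"MLC  160GB: {mlc_160:,.0f} TB")
print(f"MLC 1600GB: {mlc_1600:,.0f} TB")
```

In this model the 1600GB MLC card matches the 160GB SLC card only if the workload writes the same amount of data per day to both. Real cards will fall short of these totals because of write amplification.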

In this respect it is interesting to see that Intel has announced that its enterprise line of SSD cards will be based on eMLC (enterprise MLC), where each cell has a lifetime of 30,000 writes, with a maximal capacity of 400GB.

So it seems the market is gradually moving in the “MLC is ready for enterprise” direction, and it sounds like a good option for getting devices with high capacity at a reasonable price in the near future.



Starting new blog

This blog is a Percona project and is intended to provide information on SSD performance related topics.
