MLC SSD card lifetime and write amplification

As MLC-based SSD cards gain popularity, there is growing concern about how long they can survive. An MLC NAND module can handle 5,000-10,000 erase cycles, after which it becomes unusable, so an SSD card based on MLC NAND has a limited lifetime. There are a lot of misconceptions about how long such a card can last, so I want to show some calculations to shed light on this question.

As a baseline I will take the Virident FlashMAX M1400 (1.4TB) card. Virident guarantees 15PB (PB as in petabytes) of writes on this card.
15PB sounds impressive, but how many years does it correspond to? Of course it depends on your workload, and mainly on how write-intensive it is. But there are some facts that can help you estimate.

On Linux you can look into the /proc/diskstats file, which shows something like:

 251       0 vgca0 30273954 0 968968610 416767 122670649 0 8492649856 19260417 0 19677184 220200747

where 8492649856 is the number of sectors written since boot (a sector is 512 bytes).
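
As a rough illustration of reading that counter, here is a minimal Python sketch that pulls the sectors-written field out of a /proc/diskstats line and converts it to bytes. The function names are my own, not part of any tool, and the sketch assumes the standard field layout (sectors written is the 10th whitespace-separated field) and the 512-byte sector size mentioned above.

```python
SECTOR_BYTES = 512  # sector size used by /proc/diskstats, as noted above


def bytes_written(diskstats_line: str) -> int:
    """Return bytes written according to one /proc/diskstats line."""
    fields = diskstats_line.split()
    # Layout: major, minor, device name, then the counters;
    # sectors written is the 10th field overall (index 9).
    return int(fields[9]) * SECTOR_BYTES


def bytes_written_for(device: str, path: str = "/proc/diskstats") -> int:
    """Look up one device (e.g. "vgca0") in the live diskstats file."""
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) > 9 and fields[2] == device:
                return bytes_written(line)
    raise LookupError(f"device {device!r} not found in {path}")


# The sample line shown above
sample = (" 251       0 vgca0 30273954 0 968968610 416767 "
          "122670649 0 8492649856 19260417 0 19677184 220200747")
print(bytes_written(sample))  # roughly 4.35TB written since boot
```

Calling bytes_written_for("vgca0") twice, some time apart, and subtracting the two values gives the bytes written in that interval as seen by the OS.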

You might say that we can sample /proc/diskstats at a 1-hour interval to see how many bytes we write per hour, and calculate the potential lifetime that way.
This is only partially correct. There is a factor called write amplification, which is described very well on Wikipedia; basically, SSD cards, due to their internal organization, write more data than they receive from the application.
Usually write amplification is equal or very close to 1 (meaning there is no overhead) for sequential writes, and it reaches its maximum for fully random writes. That value can be 2-5 or more and depends on many factors, such as the used capacity and the space reserved for over-provisioning.

Basically it means you should look at the card's own statistics to get the exact number of bytes written.
For the Virident FlashMAX it is

vgc-monitor -d /dev/vgca  | grep writes
                                 379835046150144 (379.84TB) (writes)

With this information, let's take a look at what lifetime we can expect under a tpcc-mysql workload.
I ran 32 user threads against a 5000W dataset (about 500GB of data on disk) for 1 hour.

After 1 hour, /proc/diskstats shows 984,442,441,728 bytes written, which is 984.44GB, and the Virident stat shows 1,125,653,692,416 bytes written, which is 1,125.65GB.
This allows us to calculate the write amplification factor, which in our case is
1,125,653,692,416 / 984,442,441,728 = 1.143. This looks very decent, but remember we use only 500GB out of 1400GB, and the factor will grow as we fill up more space.
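
The same division can be sketched in a few lines of Python. The two byte counts are the ones measured above; the function name is only illustrative.

```python
def write_amplification(device_bytes: int, host_bytes: int) -> float:
    """Bytes the card actually wrote divided by bytes the OS sent to it."""
    if host_bytes <= 0:
        raise ValueError("host_bytes must be positive")
    return device_bytes / host_bytes


host_bytes = 984_442_441_728      # from /proc/diskstats after 1 hour
device_bytes = 1_125_653_692_416  # from the card's own statistics

waf = write_amplification(device_bytes, host_bytes)
print(f"write amplification: {waf:.3f}")
```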

Please note that we put quite an intensive write load on the card during this hour.
MySQL handled 25,000 updates/sec, 20,000 inserts/sec and 1,500 deletes/sec, which corresponds to
a write throughput of 273.45MB/sec from MySQL to disk.

This lets us calculate the lifetime of the card if we ran such a workload 24/7 non-stop:
15PB (of total writes) / 1125.65GB (per hour) = 13,325.634 hours = 555.23 days = 1.52 years
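
Here is the same arithmetic as a small Python sketch. It assumes decimal units (1PB = 10^15 bytes), which matches how the card reports terabytes above, and uses the exact byte count rather than the rounded 1125.65GB, so the hours figure differs slightly in the decimals.

```python
ENDURANCE_BYTES = 15 * 10**15  # 15PB guaranteed writes, decimal units assumed
device_bytes_per_hour = 1_125_653_692_416  # measured on the card in 1 hour

hours = ENDURANCE_BYTES / device_bytes_per_hour
days = hours / 24
years = days / 365

print(f"{hours:,.1f} hours = {days:.2f} days = {years:.2f} years")
```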

That is, under a non-stop tpcc-mysql workload we may expect the card to last 1.52 years. However, in real production you do not have a uniform load every hour, so you may want to base your estimate on daily or weekly stats.

Unfortunately there is no easy way to predict this number until you start running the workload on the SSD.
You can take a look at /proc/diskstats, but
1. There is a write amplification factor which you do not know, and
2. Throughput on a regular RAID is much lower than on an SSD, and you do not know what your throughput will be when you put the workload on the SSD.
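
Since both inputs are unknown in advance, one option is to bracket the estimate instead of trusting a single number. The sketch below is only an illustration: it takes the host write rate measured in the test above and tries a few write amplification values from the 1 to 5 range mentioned earlier. The candidate values and the function name are my own assumptions, and 15PB is again treated as 15 * 10^15 bytes.

```python
ENDURANCE_BYTES = 15 * 10**15  # 15PB guaranteed writes, decimal units assumed
HOURS_PER_YEAR = 24 * 365


def lifetime_years(host_bytes_per_hour: float, waf: float) -> float:
    """Estimated card lifetime for a given host write rate and amplification."""
    device_bytes_per_hour = host_bytes_per_hour * waf
    return ENDURANCE_BYTES / device_bytes_per_hour / HOURS_PER_YEAR


measured = 984_442_441_728  # host bytes per hour from the tpcc-mysql run

# Hypothetical amplification factors, from best case to heavy random writes
for waf in (1.0, 1.143, 2.0, 5.0):
    print(f"WAF {waf:>5}: {lifetime_years(measured, waf):.2f} years")
```

Replacing `measured` with the hourly write volume from your current storage gives a first guess, keeping in mind that the same workload may write faster once it runs on an SSD.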


4 Responses to MLC SSD card lifetime and write amplification

  1. malkor13 says:

    Great article, makes you think whether SSD is really worth it in a production environment...
    Surely the price is not there yet... I would assume that the disk you tested should last about 2-3 years on a medium workload.

  2. Phil Robins says:

    I think that even in most production environments – 1125GB per hour 24/7 is way beyond most applications – many systems are read heavy as opposed to write. Of course the trick is to only use MLC in such environments where such heavy writes would not occur.

  3. alec matusis says:

    I cannot find lifetime write allowance for FusionIO duo MLC cards. From what their sales rep told us, it’s 2x the card capacity daily writes for 10 years. Is this spec available anywhere?
