On the flip side, new concepts have been introduced in order to optimize the performance, life span and reliability of these novel devices as well. One such concept is the TRIM operation.
Layout of an SSD
SSDs are blazingly fast and are getting faster and cheaper every year. Their reliability has also improved quite a bit since their inception. However, SSDs are still not as reliable as magnetic media, nor are they as durable as a hard disk. In fact, the underlying read-write mechanisms are very different from what one sees inside an HDD.
To understand the problems an SSD suffers from, and why we need the TRIM operation to overcome those problems, let’s look at the structure of the SSD first. Data is typically stored in groups of cells called pages, each usually 4KB in size. The pages are then grouped into clusters of 128 pages, called blocks, so each block is 512KB on most SSDs.
You can read data from a page that contains some information, or you can write data to pages that are clean (with no preexisting data in them, just a series of 1s). However, you can’t overwrite data on a 4KB page that has already been written to without first erasing the entire 512KB block that the page belongs to.
This is a consequence of the fact that the voltages required to flip a 0 to 1 are often much higher than the reverse. The excess voltage can potentially flip bits on the adjacent cells and corrupt data.
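To make the page and block rules above concrete, here is a minimal sketch of a toy flash block in Python. The class and method names (Block, program, erase) are made up for illustration and do not correspond to any real SSD firmware interface. The sizes are the typical figures quoted above.

```python
PAGE_SIZE_KB = 4
PAGES_PER_BLOCK = 128
BLOCK_SIZE_KB = PAGE_SIZE_KB * PAGES_PER_BLOCK  # 128 pages of 4KB each


class Block:
    """Toy flash block: a page can be written once, then only erased as part of the block."""

    def __init__(self):
        # None stands for a clean (erased, all 1s) page.
        self.pages = [None] * PAGES_PER_BLOCK

    def read(self, index):
        return self.pages[index]

    def program(self, index, data):
        # Writing is only allowed on a clean page.
        if self.pages[index] is not None:
            raise ValueError("page already written; erase the whole block first")
        self.pages[index] = data

    def erase(self):
        # Erasing always affects every page in the block.
        self.pages = [None] * PAGES_PER_BLOCK


block = Block()
block.program(0, "hello")
try:
    block.program(0, "world")  # in-place overwrite attempt
    overwrite_allowed = True
except ValueError:
    overwrite_allowed = False
```

In this sketch the second call to program() fails, which is the whole point: the only way to reuse page 0 is to erase all 128 pages of the block.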
Deletion Operation and the Performance Degradation of an SSD
When data is said to be ‘deleted’ by the OS, the SSD merely marks all the corresponding pages as invalid, rather than deleting the data. This is quite similar to what happens inside an HDD as well: the sectors are marked as free rather than being physically zeroed out. This makes the deletion operation much faster.
In case of HDDs, this works just fine. When new data needs to be written, you can overwrite the old data on a freed sector without any issues or worries about the surrounding sectors. HDDs can modify data in-place.
In the case of an SSD, this is not so simple. Let’s say that you modify a file and that corresponds to a change of a single 4KB page. When you try to modify a 4KB page in an SSD, the entire content of its block, the whole 512KB of it, needs to be read into a cache (the cache can be built into the SSD or it can be the system’s main memory). Then the block needs to be erased, and only then can you write the new data to your target 4KB page. You will also have to write back the remaining unmodified 508KB of data that you copied to your cache.
This adds to the phenomenon of Write Amplification, where each write operation gets amplified into a read-modify-write operation on a chunk of data that is much larger than the data that actually needs to be put in place.
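The read-modify-write cycle described above can be sketched in a few lines of Python. This is only an illustration under the worst-case assumption that the target block is completely full. The function name read_modify_write is hypothetical, and a block is modelled as a plain list of pages.

```python
PAGE_SIZE_KB = 4
PAGES_PER_BLOCK = 128


def read_modify_write(block_pages, index, new_data):
    """Update one page of a full block; return (new_block, kb_written_to_flash)."""
    cache = list(block_pages)            # 1. read the whole block into a cache
    cache[index] = new_data              # 2. modify the target page in the cache
    erased = [None] * len(block_pages)   # 3. erase the block (all pages clean)
    for i, data in enumerate(cache):     # 4. write every page back
        erased[i] = data
    kb_written = len(cache) * PAGE_SIZE_KB
    return erased, kb_written


old_block = [f"page-{i}" for i in range(PAGES_PER_BLOCK)]
new_block, kb_written = read_modify_write(old_block, 5, "edited")

# The user asked for 4KB to be written, but the flash saw a full block.
write_amplification = kb_written / PAGE_SIZE_KB
print(kb_written, write_amplification)  # 512 128.0
```

With these numbers a 4KB change costs 512KB of flash writes, a factor of 128. Real drives rarely hit this worst case on every write, but it shows why the effect matters.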
Initially, this amplification doesn’t show up. Your SSD performs very well in the beginning. Eventually, as blocks get filled up, the inevitable point is reached where more and more write operations start involving the expensive read-modify-write operations. The user starts noticing that the SSD is not performing as well as it initially did.
SSD controllers also try to make sure that the data is spread out throughout the disk, so that all dies get equal levels of wear. This is important because flash memory cells tend to wear out quickly, and therefore if we continuously use only the first few thousand blocks, ignoring the rest of the SSD, those few blocks will get worn out soon. Spreading data across multiple dies also improves your performance, as you can read or write data in parallel.
However, now that the writes are spread out, there is a greater chance that any given block already contains a written page, so more writes end up needing the read-modify-write cycle. This further accelerates the degradation process.
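The idea of spreading wear evenly can be illustrated with a very small Python sketch. It assumes the controller keeps an erase counter per block and always picks the least-worn one. That is a simplification for illustration, not a description of how any particular controller works, and pick_block is a hypothetical name.

```python
def pick_block(erase_counts):
    """Return the index of the least-worn block (lowest index wins ties)."""
    return min(range(len(erase_counts)), key=lambda i: erase_counts[i])


# Eight blocks, all fresh.
levelled = [0] * 8
for _ in range(80):
    target = pick_block(levelled)
    levelled[target] += 1   # each write costs the chosen block one erase cycle

# Without wear levelling, the same 80 writes all land on the first block.
naive = [0] * 8
for _ in range(80):
    naive[0] += 1
```

After 80 writes the levelled list holds 10 erase cycles per block, while the naive list has all 80 on block 0 and none anywhere else.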
TRIM Command and Freeing of the Blocks
The TRIM command minimizes performance degradation by periodically trimming the invalid pages. For example, Windows 10 TRIMs your SSD once every week. All the data that has been marked as deleted by the OS actually gets cleaned out of the memory cells by the SSD controller when that operation is run. Yes, it still has to go through the read-modify-write operation, but it happens only once a week and can be scheduled for the hours when your system is mostly idle.
The next time you want to write to a page, it is actually empty and ready for a direct write operation!
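Here is a rough Python sketch of what that clean-up achieves, as described above. Each page is in one of three states (clean, valid, invalid), and the blocks are shortened to 4 pages to keep the example readable. The function names are hypothetical, and a real controller does far more bookkeeping than this.

```python
CLEAN, VALID, INVALID = "clean", "valid", "invalid"


def trim(blocks):
    """Reclaim invalid pages block by block; return how many pages were reclaimed."""
    reclaimed = 0
    for block in blocks:
        stale = block.count(INVALID)
        if stale == 0:
            continue  # nothing to clean, so leave this block alone
        # Conceptually: read the valid pages, erase the block, write the
        # valid pages back. The invalid pages come out clean.
        for i, state in enumerate(block):
            if state == INVALID:
                block[i] = CLEAN
        reclaimed += stale
    return reclaimed


def can_write_directly(block, index):
    """A page accepts a direct write only if it is clean."""
    return block[index] == CLEAN


blocks = [
    [VALID, INVALID, INVALID, VALID],  # two pages 'deleted' by the OS
    [VALID, VALID, VALID, VALID],      # untouched block
]
before = can_write_directly(blocks[0], 1)
reclaimed = trim(blocks)
after = can_write_directly(blocks[0], 1)
```

Before the trim, page 1 of the first block cannot take a direct write. Afterwards it can, and the block with no invalid pages was never touched.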
The actual frequency of the TRIM command depends on the kind of system you are running. Databases tend to do a lot of I/O and would thus require more frequent trimming. However, if you do it too frequently, the database operations will slow down for the period when TRIM is running. It is the job of a system architect to find the right schedule and frequency.
The TRIM command is very useful in delaying the performance degradation of your device. It helps maintain the average performance of your device. But that’s only on average.
Suppose you are working with a text document and are constantly writing to the file, editing things out and saving so you don’t lose any progress. The pages storing the document’s data will still need to go through the excruciating read-modify-write cycle, because TRIM is not a service that’s constantly optimizing your SSD. Even if it did run as a service, the performance impact would still be visible, because it is built into the very mechanics of an SSD’s operation.
Also, running TRIM too often can reduce the longevity of your storage, since all those erase and write cycles wear out the cells, eventually rendering the data stored within them read-only.
Despite all the shortcomings of an SSD it still packs massive performance benefits when compared against a traditional hard disk drive. As the market share for these magical devices grows, more research and engineering efforts will be directed towards bettering the underlying technology.
Operating system vendors, SSD chip manufacturers and the people who write all the complex firmware logic come together to give us this awesome device. TRIM is but one of the many layers of complexity that are packed in there.