Linked by Thom Holwerda on Thu 27th Sep 2012 19:36 UTC
I bought a brand new iMac on Tuesday. I'm pretty sure this will come as a surprise to some, so I figured I might as well offer some background information about this choice - maybe it'll help other people who are also pondering what to buy as their next computer.
Thread beginning with comment 536980

I've got a few issues with your calculation. Obviously "ideal wear levelling" doesn't exist generically: what's ideal for one access pattern is non-ideal for another. And in fact a 128GB SSD is likely composed of at least 8 NAND chips which, for performance reasons, run in parallel and may not take part in drive-wide wear leveling. So dropping those assumptions could cut your estimate by a factor of 8 or more.
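As a rough sketch of that factor-of-8 argument (the 128 GB capacity, 8-die layout, and 3000 P/E cycles are figures assumed for illustration, not specs of any particular drive):

```python
# Endurance estimate: drive-wide ("ideal") wear leveling vs. wear
# leveling confined to each NAND die. All figures assumed for illustration.
GB = 10**9
capacity = 128 * GB   # drive capacity
dies = 8              # independent NAND chips
pe_cycles = 3000      # rated program/erase cycles per cell

ideal_writes = capacity * pe_cycles              # spread across the whole drive
per_die_writes = (capacity // dies) * pe_cycles  # hot data pinned to one die

print(ideal_writes // GB, "GB writable (ideal)")      # 384000 GB
print(per_die_writes // GB, "GB writable (one die)")  # 48000 GB
```

If the hot data never leaves one die, the usable write budget shrinks by exactly the number of dies.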

Of course, there's also write amplification to factor in.

But given how modern SSDs work (basically as log-structured devices), write amplification should be quite low (approaching 1), and wear leveling will actually be close to ideal once static wear leveling is employed.

We are also talking about reliability on its own, but in real devices reliability is one of many conflicting goals: performance, capacity, cost, dimensions, etc. The point I'm trying to make is that it's not safe to make assumptions. Even if the MTBF were 100% accurate, it only describes a curve with a multitude of failure points; even with a 3-5 year MTBF, a drive can still fail within a few months. I'm just recommending that those with write-heavy data loads take extra precautions against data loss with flash drives.

My point was that for most people, you're unlikely to hit the FLASH p/e limit even with 3000 p/e cycles. Firmware issues are more likely to toast your data than physical FLASH errors, which, I admit, have been a problem with the early generations of drives. But firmware is getting better and the market more mature.
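To put rough numbers on "most people": assuming a 128 GB drive, 3000 P/E cycles, 20 GB of host writes per day, and a write amplification of 2 (all illustrative assumptions, not measurements):

```python
# Back-of-envelope lifespan estimate; every figure here is assumed.
capacity_gb = 128          # drive capacity
pe_cycles = 3000           # rated P/E cycles per cell
daily_writes_gb = 20       # assumed desktop workload (host writes)
write_amplification = 2.0  # assumed; log-structured firmware often does better

total_writes_gb = capacity_gb * pe_cycles  # ideal-leveling write budget
days = total_writes_gb / (daily_writes_gb * write_amplification)
print(round(days / 365, 1), "years")  # ~26.3 years at these assumed rates
```

Even with the pessimistic amplification factor, a typical desktop workload sits decades away from the P/E limit; the picture only changes under sustained heavy-write loads.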


quackalist:

Was reading a review of the latest Samsung SSDs, and for all the talk/tech about how well it functioned, the one thing that caught my attention was the 5-year guarantee! Compare that to the risible guarantees that now come with mechanical HDs.


Alfman:

"My point was that for most people, you're unlikely to hit the FLASH p/e limit even with 3000 p/e cycles."

I don't disagree: for many people with typical usage, flash will outlive the device, but it all depends on your write patterns. I stand by my opinion that if you regularly update very large datasets, the lifespan of these newer 3K P/E NAND chips can be consumed quickly. Mind you, I'm not trying to imply that everyone is at such a high risk. The most troubling common use case that's been mentioned is using NAND for swap "just in case a badly behaving app exceeds available memory"; well, that badly behaving app could be trashing the NAND's cells for no good reason.

That might not seem so bad if you've only got a small swap file. However, the flash controller will be busy re-provisioning healthy, under-utilized cells (where static files reside) and replacing them with highly active swap pages. This is done to extend the average lifetime overall, but it implies even more writes than the swapping alone, and it puts the static data at additional risk by moving it to older cells. We mustn't overlook the controller's own writes for its page tables, either. Unlike an HDD, where you'd hear the disks clicking like crazy, with an SSD you might not even notice.
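A toy model of why that re-provisioning matters. The swap size, thrash rate, and the 20% overhead for background copies are all assumed figures, chosen only to show the shape of the trade-off:

```python
# Toy model: a runaway app rewriting swap pages. All figures assumed.
pe_cycles = 3000
swap_gb = 4                  # small dedicated swap area
thrash_gb_per_day = 100      # runaway app churning swap pages
drive_gb = 128

# Without static wear leveling, the writes stay pinned to the swap region:
days_swap_only = swap_gb * pe_cycles / thrash_gb_per_day
# With static leveling, the controller spreads wear drive-wide, at the
# cost of extra background copy traffic (modelled crudely as +20%):
days_spread = drive_gb * pe_cycles / (thrash_gb_per_day * 1.2)
print(days_swap_only, "days vs", round(days_spread), "days")
```

The controller buys a much longer average lifetime, but only by generating additional writes of its own and by rotating static data onto the most-worn cells.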

I think people may not realise just how flaky these MLC NAND chips can be. Not to scare anyone, but just to provide some insight, here is a screenshot of the NAND dump from the last flash case I worked on, which was one of the more popular brands.

Here we read in 5 pages, each page being read 4 times and repeated on screen. (We're only seeing the first 100 or so bytes of the full ~6K page.) The colors highlight inconsistent read errors on each page (write errors aren't shown). This is perfectly normal in newer chips, where the controller's ECC is designed to compensate for on the order of 30 bit errors per 1KB. It's fine as long as the errors don't exceed what the ECC was engineered for.

Nevertheless, due to the probabilistic nature of some of these bits when read, it's non-trivial to deterministically count how many bits are bad by reading a page once, as the controller does. It's therefore conceivable that the controller will write data to a page *thinking* it's still correctable via ECC, only to find the data actually contained more bit errors than its ECC could compensate for. Of course the engineers should anticipate this and mark the page bad even when there are correction bits to spare, but unless the flash is heavily over-provisioned, the controller has to be conservative so as not to prematurely use up all its spares: a delicate balance resulting from the conflicting goals brought up earlier.

Anyway, I hope the main takeaway is: enjoy the real benefits of SSDs, but please remember to keep a backup.


WereCatf:

I was thinking about this last night, and it would be an exceedingly interesting project to try to measure what kinds of workloads yield what kinds of results. It's easy enough to automate such testing under Linux: all you need to do is collect enough information about the different kinds of workloads, accelerate them, and then, when the drive goes bust, do the math on how much real-world time it would've taken. I'm actually formulating a way of collecting that information right now, and how to apply it to a benchmark; too bad that I have no financial means of buying a bunch of SSDs and burning through them until they die.
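A minimal sketch of the replay half of such a harness, assuming the collected workload has been reduced to a trace of (offset, size) writes. It targets a scratch file (`wear_test.img` is a made-up name) rather than a real device; pointing it at a raw block device would require root and destroy its contents:

```python
# Replay a recorded write pattern against a target as fast as possible,
# counting total bytes written. All names and sizes here are assumed.
import os, random

random.seed(0)
target = "wear_test.img"                     # hypothetical scratch target
pattern = [(random.randrange(0, 1 << 20), 4096) for _ in range(256)]

fd = os.open(target, os.O_RDWR | os.O_CREAT, 0o600)
os.ftruncate(fd, 2 << 20)                    # 2 MiB scratch area
total = 0
for _ in range(10):                          # replay the trace 10 times
    for offset, size in pattern:
        total += os.pwrite(fd, os.urandom(size), offset)
os.close(fd)
print(total, "bytes written")                # 10 * 256 * 4096 bytes
```

A real harness would also have to bypass the page cache (e.g. `O_DIRECT`) and periodically poll the drive's SMART data to see wear indicators move, but the replay loop is the core of the acceleration idea.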
