Linked by malxau on Tue 17th Jan 2012 13:34 UTC
Along with Storage Spaces coming in Windows 8, ReFS forms the foundation of storage on Windows for the next decade or more. Key features of the two when used together include: metadata integrity with checksums; integrity streams providing optional user data integrity; an allocate-on-write transactional model for robust disk updates; large volume, file, and directory sizes; storage pooling and virtualization making file system creation and management easy; data striping for performance and redundancy for fault tolerance; disk scrubbing for protection against latent disk errors; resiliency to corruptions with "salvage" for maximum volume availability in all cases; and shared storage pools across machines for additional failure tolerance and load balancing.
RE[2]: liking the direction
by malxau on Fri 20th Jan 2012 15:23 UTC in reply to "RE: liking the direction"

"It seems that MS ReFS only has checksums for metadata, so the data itself might still be corrupted. ReFS exposes an API so a vendor can use checksums on data, but that is not default behavior."


From the article:
"In addition, we have added an option where the contents of a file are check-summed as well. When this option, known as “integrity streams,” is enabled, ReFS always writes the file changes to a location different from the original one. This allocate-on-write technique ensures that pre-existing data is not lost due to the new write. The checksum update is done atomically with the data write, so that if power is lost during the write, we always have a consistently verifiable version of the file available whereby corruptions can be detected authoritatively..."

"By default, when the /i switch is not specified, the behavior that the system chooses depends on whether the volume resides on a mirrored space. On a mirrored space, integrity is enabled because we expect the benefits to significantly outweigh the costs. Applications can always override this programmatically for individual files."
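To make the allocate-on-write part of that concrete, here is a minimal sketch in C. None of this is ReFS code: the extent_desc type, the sum64 checksum, and the in-memory atomic publish are all invented to illustrate the shape of the technique. New data goes to fresh space, gets checksummed, and only then does the (location, checksum) pair become visible, in a single step.

    /* Conceptual sketch of allocate-on-write with an atomic checksum
     * update; invented names, not ReFS internals.  The point: because
     * the data pointer and checksum are published together, a crash
     * mid-update leaves the old, self-consistent version in place. */
    #include <stdatomic.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    typedef struct {
        uint8_t  *blocks;     /* where this version's data lives */
        size_t    len;
        uint64_t  checksum;   /* checksum of that data */
    } extent_desc;

    /* "Metadata" points at the current descriptor; readers load it atomically. */
    static _Atomic(extent_desc *) current;

    /* Toy checksum standing in for the real algorithm. */
    static uint64_t sum64(const uint8_t *b, size_t n)
    {
        uint64_t h = 0;
        for (size_t i = 0; i < n; i++) h = h * 131 + b[i];
        return h;
    }

    static void update_file(const uint8_t *newdata, size_t len)
    {
        /* 1. Allocate fresh space; never overwrite the live copy. */
        extent_desc *d = malloc(sizeof(*d));
        d->blocks = malloc(len);
        d->len = len;
        memcpy(d->blocks, newdata, len);

        /* 2. Checksum the new data before it becomes visible. */
        d->checksum = sum64(d->blocks, len);

        /* 3. Single atomic publish: pointer and checksum switch
         *    together.  A crash before this line loses nothing. */
        extent_desc *old = atomic_exchange(&current, d);

        /* 4. Only now is the old allocation free to reclaim. */
        if (old) { free(old->blocks); free(old); }
    }

    int main(void)
    {
        const uint8_t v1[] = "version one", v2[] = "version two";
        update_file(v1, sizeof(v1));
        update_file(v2, sizeof(v2));  /* v1's space reclaimed only after publish */
        return 0;
    }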
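And for the per-file programmatic override the article mentions, a minimal user-mode sketch. This assumes the FSCTL_SET_INTEGRITY_INFORMATION control code and FSCTL_SET_INTEGRITY_INFORMATION_BUFFER structure as they appear in later Windows SDK headers (winioctl.h); the file path is made up, and in practice ReFS may only allow changing the setting on empty files or directories, so treat this as illustrative rather than definitive:

    /* Sketch: requesting integrity streams on a single file from user
     * mode via FSCTL_SET_INTEGRITY_INFORMATION (later SDK headers).
     * The path below is hypothetical. */
    #include <windows.h>
    #include <winioctl.h>
    #include <stdio.h>

    int main(void)
    {
        FSCTL_SET_INTEGRITY_INFORMATION_BUFFER info;
        DWORD bytes;
        HANDLE file = CreateFileW(L"D:\\data\\important.bin",
                                  GENERIC_READ | GENERIC_WRITE,
                                  0, NULL, OPEN_EXISTING,
                                  FILE_ATTRIBUTE_NORMAL, NULL);
        if (file == INVALID_HANDLE_VALUE) {
            fprintf(stderr, "open failed: %lu\n", GetLastError());
            return 1;
        }

        ZeroMemory(&info, sizeof(info));
        info.ChecksumAlgorithm = CHECKSUM_TYPE_CRC64;  /* request data checksums */

        /* May fail on a non-empty file; integrity state is easiest to
         * set at creation time or on the parent directory. */
        if (!DeviceIoControl(file, FSCTL_SET_INTEGRITY_INFORMATION,
                             &info, sizeof(info), NULL, 0, &bytes, NULL)) {
            fprintf(stderr, "FSCTL failed: %lu\n", GetLastError());
        }

        CloseHandle(file);
        return 0;
    }

A matching FSCTL_GET_INTEGRITY_INFORMATION exists to query the current state of a file.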


It's the default behavior when redundancy on spaces is present. When that is true, we can use the data checksum to find a good copy if one exists and another is bad. Without redundancy, all we can do with the checksum is prevent bad data from going to applications, i.e., start failing requests. Since applications will not always deal with that gracefully, the benefit without redundancy is much more limited: it only helps where failure really is better than incorrect data.
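Roughly, the redundant read path behaves like this sketch (conceptual only, not ReFS internals; the two-way in-memory mirror, the toy FNV-style checksum, and the helper names are all invented):

    /* Conceptual sketch of checksum-directed repair on a mirrored space. */
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define COPIES 2      /* two-way mirror */
    #define BLOCK  16

    /* In-memory stand-in for a mirrored block on disk. */
    static uint8_t mirror[COPIES][BLOCK];

    /* Toy checksum standing in for the real algorithm. */
    static uint64_t toy_sum(const uint8_t *b, size_t n)
    {
        uint64_t h = 1469598103934665603ULL;
        for (size_t i = 0; i < n; i++) { h ^= b[i]; h *= 1099511628211ULL; }
        return h;
    }

    static bool read_copy(int c, uint8_t *out) { memcpy(out, mirror[c], BLOCK); return true; }
    static void heal_copy(int c, const uint8_t *good) { memcpy(mirror[c], good, BLOCK); }

    /* Return a verified block, healing bad copies along the way;
     * if no copy matches the checksum stored in metadata, fail the
     * read rather than hand corrupt data to the caller. */
    static bool read_block(uint64_t stored, uint8_t *out)
    {
        uint8_t buf[BLOCK];
        int bad[COPIES], nbad = 0;

        for (int c = 0; c < COPIES; c++) {
            if (read_copy(c, buf) && toy_sum(buf, BLOCK) == stored) {
                for (int i = 0; i < nbad; i++) heal_copy(bad[i], buf);
                memcpy(out, buf, BLOCK);
                return true;
            }
            bad[nbad++] = c;
        }
        return false;  /* every copy was bad */
    }

    int main(void)
    {
        uint8_t data[BLOCK];
        memset(mirror[0], 0xAB, BLOCK);
        memset(mirror[1], 0xAB, BLOCK);
        uint64_t stored = toy_sum(mirror[0], BLOCK);  /* checksum kept in metadata */
        mirror[0][3] ^= 0xFF;                         /* silently corrupt copy 0 */

        if (read_block(stored, data))
            printf("read ok, copy 0 healed: %d\n",
                   memcmp(mirror[0], mirror[1], BLOCK) == 0);
        return 0;
    }

The key point is the fallthrough: with a second copy, a checksum mismatch becomes a repair; without one, the same mismatch can only become a failed read.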

I realize many people might want to get religious on this point, but seriously, watch what happens when reads are failed underneath applications first. In particular, consider what happens when a read for a block of code fails, then consider the fraction of that code that would actually have been executed.

- M // ReFS co-author
