Data Deduplication on Windows Servers

Microsoft has published information on how useful data deduplication is on Windows servers. It is particularly useful for file shares, virtual desktops, ISO images (disk images of optical discs), and backup snapshots. Microsoft estimates that servers using data deduplication save almost 50% of storage space on average.

An administrator can enable data deduplication by selecting a simple option in Server Manager.

After deduplication is performed, users and their applications do not notice that duplicate data has been removed. The deduplication process works by breaking files into chunks and identifying which of those chunks are unique, so that each unique chunk needs to be stored only once.

During optimization, the original 'file streams' are replaced with reparse points that indicate which chunks make up each file.
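
The chunk-and-pointer idea can be illustrated with a small Python sketch. This is not how Windows implements the feature: the tiny fixed chunk size, the use of SHA-256 as a chunk identifier, and the names optimize and read_file are all assumptions made for illustration, and a plain list of chunk IDs stands in for a reparse point.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical, unrealistically small so the example is easy to follow


def split_into_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Cut the data into consecutive pieces of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def optimize(files: dict[str, bytes]) -> tuple[dict[str, bytes], dict[str, list[str]]]:
    """Store each unique chunk once and replace every file with a list of chunk IDs."""
    chunk_store: dict[str, bytes] = {}
    pointers: dict[str, list[str]] = {}
    for name, data in files.items():
        refs = []
        for chunk in split_into_chunks(data):
            digest = hashlib.sha256(chunk).hexdigest()
            # A chunk that is already in the store is not stored a second time.
            chunk_store.setdefault(digest, chunk)
            refs.append(digest)
        pointers[name] = refs  # stand-in for the reparse point
    return chunk_store, pointers


def read_file(name: str, chunk_store: dict[str, bytes],
              pointers: dict[str, list[str]]) -> bytes:
    """Rebuild a file's contents from the chunks its pointer list names."""
    return b"".join(chunk_store[digest] for digest in pointers[name])


files = {"a.txt": b"AAAABBBBCCCC", "b.txt": b"AAAABBBBDDDD"}
store, ptrs = optimize(files)
```

In this example the two files share their first two chunks, so the store holds four chunks rather than six, while read_file still returns each file's original bytes. That is why users and applications see no difference.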

The garbage collection process removes chunks that are no longer needed after files have been modified or deleted. Integrity scrubbing can be used to detect corrupted data in the chunk store and, where possible, repair it. Backup copies are kept of chunks that are referenced more than 100 times.
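
The three maintenance tasks can be sketched in Python as follows. This is a simplified model rather than the Windows implementation: it assumes the chunk store is a dictionary keyed by each chunk's SHA-256 digest and that every file is represented by a list of chunk IDs. The function names are hypothetical, and only the threshold of 100 references comes from the description above.

```python
import hashlib
from collections import Counter


def chunk_id(data: bytes) -> str:
    """Hypothetical chunk identifier: the SHA-256 digest of the chunk's bytes."""
    return hashlib.sha256(data).hexdigest()


def collect_garbage(chunk_store: dict[str, bytes],
                    pointers: dict[str, list[str]]) -> list[str]:
    """Delete chunks that no file references any more; return the removed IDs."""
    referenced = {cid for refs in pointers.values() for cid in refs}
    unreferenced = [cid for cid in chunk_store if cid not in referenced]
    for cid in unreferenced:
        del chunk_store[cid]
    return unreferenced


def find_corrupted(chunk_store: dict[str, bytes]) -> list[str]:
    """Scrubbing check: report chunks whose bytes no longer match their ID."""
    return [cid for cid, data in chunk_store.items() if chunk_id(data) != cid]


def hot_chunks(pointers: dict[str, list[str]], threshold: int = 100) -> set[str]:
    """Return the chunks referenced more than `threshold` times (backup candidates)."""
    counts = Counter(cid for refs in pointers.values() for cid in refs)
    return {cid for cid, n in counts.items() if n > threshold}
```

In this model a deleted file simply disappears from the pointer table, and its chunks are removed only if no other file still references them. The sketch only detects corruption; repairing a damaged chunk would require a second copy, which is one reason for keeping backups of heavily referenced chunks.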