Windows Storage Spaces

A single command wiped all my data:

Get-VirtualDisk | Remove-VirtualDisk -Confirm:$false

I blame Claude (an AI chatbot). It told me to enter this into PowerShell.😭
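In hindsight, the command was dangerous because it piped every virtual disk on the system into the delete, with confirmation suppressed. A safer pattern, sketched below with a hypothetical pool name, scopes the pipeline to one pool and dry-runs it first with -WhatIf:

```powershell
# Preview the deletion -- -WhatIf lists what WOULD be removed, but removes nothing
Get-StoragePool -FriendlyName "ScratchPool" |
    Get-VirtualDisk |
    Remove-VirtualDisk -WhatIf

# Only after checking the preview, run it for real (and leave the
# confirmation prompt enabled rather than passing -Confirm:$false)
Get-StoragePool -FriendlyName "ScratchPool" |
    Get-VirtualDisk |
    Remove-VirtualDisk
```

Scoping by pool name means a typo fails loudly (no pool found) instead of silently matching everything.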

My disaster recovery story

I use Windows Storage Spaces (WSS) to manage my data storage. I had two pools, and I had transferred the data from one of them to an external drive so I could rebuild that pool in a different configuration. I use AI chatbots to bounce ideas around and get second opinions, and one was helping me make adjustments with PowerShell commands. When I was ready to clear one of my pools, I didn’t look closely before entering that command. It wiped both pools. The pool I didn’t want wiped was my Library pool, which contained all my family photos and videos, work files, and plenty of other important data.

I spent time researching various partition recovery tools to see if I could reconstruct the array. I have successfully restored formatted and deleted partitions with such tools before, but a WSS array proved infeasible to recover. I could recover data file by file, but that was not a practical recovery method.

Fortunately, I had a backup of all my files. See my tech guide about having backups.😌 I did lose some mostly unimportant data, but it wasn’t a big deal.

Many would argue there are a lot of cons to using WSS compared to other RAID systems, such as those built on Linux or hardware RAID solutions. However, I feel the benefits outweigh the downsides.

The advantages of WSS include its integration as a native Windows drive, allowing the built-in Windows File Explorer search to function intuitively for basic users. File contents are indexed, enabling searches by photo metadata, PDF/Word/Excel content, and general file patterns without requiring third-party search tools. When files are on a networked computer, only the host machine maintains the index, so client systems don’t need to index network locations. Additionally, WSS doesn’t require specialized hardware RAID cards and is included free with Windows.

Since all my drives (except my boot drive) were wiped, I am using this as an opportunity to do more extensive research and testing. I’m quite surprised how little information there is online about the WSS system. I’ve spent a lot of time trying various configurations, testing different drive counts, column counts, interleave sizes, and AUS (allocation unit size) cluster sizes. If you know me, you won’t be surprised that I made a big colour-coded Excel spreadsheet with extensive data analysis so I could figure out the optimal way to configure my arrays for maximum performance.
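For reference, the column count and interleave are set when the virtual disk is created. A sketch of one test configuration, with hypothetical pool and disk names:

```powershell
# Create a parity space with an explicit column count and interleave.
# Pool and disk names are placeholders -- adjust to your setup.
New-VirtualDisk -StoragePoolFriendlyName "LibraryPool" `
    -FriendlyName "Library" `
    -ResiliencySettingName Parity `
    -NumberOfColumns 4 `
    -Interleave 128KB `
    -UseMaximumSize
```

Repeating this with different -NumberOfColumns and -Interleave values (destroying and recreating the disk each time) is how the configurations below were compared.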

My testing methodology

I run CrystalDiskMark and compare those scores against “real world” tests I devised: timing the copy of a single 100GB file from my NVMe drive to the array, then timing the read of that same file back. I also timed copying all the files within C:\Windows\servicing\LCU, which happens to be hundreds of thousands of small files (roughly 290,000 files totalling 3.2GB); this gave me a sense of writing tiny files against larger interleave sizes. It’s technically not random I/O, but it gives me a sense of “real world” performance. Then I timed the read of all those files back.
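The timing itself is simple to script. A minimal sketch, assuming the array is mounted as D: and the paths below are placeholders:

```powershell
# Hypothetical paths: source file on the NVMe drive, destination on the array
$src = "C:\bench\test100GB.bin"
$dst = "D:\bench\test100GB.bin"

# Time the 100GB sequential write to the array
$write = Measure-Command { Copy-Item $src $dst }
"Large-file write: {0:N1} s" -f $write.TotalSeconds

# Time the small-file workload (hundreds of thousands of tiny files)
$small = Measure-Command {
    Copy-Item "C:\Windows\servicing\LCU" "D:\bench\LCU" -Recurse
}
"Small-file copy: {0:N1} s" -f $small.TotalSeconds
```

Measure-Command returns a TimeSpan, which makes it easy to drop the results straight into a spreadsheet.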

My findings:

  • I learned that Windows caches recently read file data in “free” system RAM, so to test read speed fairly, I would briefly run Prime95 for a second to max out memory usage and flush any file caching before running subsequent tests.
  • I assumed that column counts of a power of two plus one (3, 5, 9), which align the data portion of each stripe to a power-of-two size, would bring noticeable performance improvements. On the contrary, four drives in four columns performed about 48% faster than three of the same drives, even when the AUS matched the stripe’s data size.
  • You can use fewer columns than disks; parity and data still rotate among all the disks. This just reduces space efficiency, and performance scales down proportionally, so it doesn’t make sense to use fewer columns.
  • Interleave size makes the biggest difference in performance. My 4x16TB array seems to do best with a 128KB interleave; my 4x4TB array works best with 16KB. Higher and lower values trade off reads vs. writes and sustained vs. random performance, so you should test these yourself before committing your data to the array; you won’t be able to change the interleave without recreating the array and recopying everything.
  • Formatting with the smallest AUS carries negligible performance penalties on random and sequential reads and writes, while larger values can be detrimental even on sequential transfers. Thus, I would always format with the smallest AUS to minimize wasted space. It’s not uncommon to have a million files on your drives, and each file wastes half a cluster on average, so a 16KB-cluster NTFS format wastes about 8GB of storage space. Cutting that down to the 4KB minimum means you’ll have an extra 6GB of storage space.
  • The major downside to WSS is parity calculation stalls. Once in a while, typically after heavy writes, the whole storage pool stalls for exactly one minute, which freezes any program accessing that data with no recourse but to wait. It is not predictable, you can’t tune how long or when these parity calculation stalls happen, and there doesn’t seem to be a way to invoke one manually. On parity arrays it happens during both reads and writes.
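To act on the AUS finding above, the allocation unit can be set explicitly at format time; the drive letter below is a placeholder, and the comments redo the slack-space arithmetic:

```powershell
# Format the Storage Spaces volume with the minimum 4KB AUS.
# Drive letter is a placeholder -- this erases the volume!
Format-Volume -DriveLetter D -FileSystem NTFS -AllocationUnitSize 4096

# Sanity-check the slack-space estimate: each file wastes
# half a cluster on average, so for one million files:
$files = 1e6
"16KB AUS wastes ~{0:N1} GB" -f ($files * 16KB / 2 / 1GB)
"4KB AUS wastes  ~{0:N1} GB" -f ($files * 4KB / 2 / 1GB)
```

Note that NTFS AUS is a property of the format, separate from the array’s interleave, so it can be changed later by reformatting without recreating the virtual disk (though the data on the volume is still lost).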

tl;dr summary:

  • An even vs. odd number of drives makes no noticeable performance difference.
  • Match the column count to drive count.
  • Interleave size has the biggest impact on performance. Test a variety to find best performance.
  • Use the smallest AUS size for best efficiency.
  • You’ll have to live with the occasional one-minute parity calculation stalls.
