Issue

Christopher
Technical Support
StableBit DrivePool
2.2.0.691 - 2.2.0.798
Windows 10 (64 bit)
Public
Alex

The original trace doesn't decode with either .691 or .798 for Windows 10 x64. For traces, I need the exact version number and OS / architecture from when the trace was taken.

As for measurement, I've added a number of features to dpcmd to help troubleshoot these kinds of issues. I don't like dealing with "it looks like the measuring is off"; we should be dealing with hard numbers.

Let me explain how "Other" is computed. For the used space on the pool, StableBit DrivePool reports the sum of all file sizes on the pool. It's as simple as that. Everything else that the underlying filesystem reports as used space is "Other".

Other space = (Sum of used space on all pool parts) - (Sum of all file sizes on the pool)
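
As a rough illustration of the formula above, here is a minimal Python sketch. The function name and all of the byte counts are made up for the example; they are not taken from dpcmd or from any real pool.

```python
GIB = 1024 ** 3

def other_space(used_per_pool_part, pooled_file_sizes):
    """Other = (sum of used space on all pool parts) - (sum of all file sizes on the pool)."""
    return sum(used_per_pool_part) - sum(pooled_file_sizes)

# Hypothetical figures: two pool parts and three pooled files.
used = [500 * GIB, 750 * GIB]              # used space reported by each underlying volume
files = [400 * GIB, 300 * GIB, 520 * GIB]  # sizes of the files stored on the pool

print(other_space(used, files) / GIB)  # 30.0 (GiB reported as "Other")
```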

So what this is saying is that if the underlying filesystem reports more used space than we've accounted for from the file data, then that's reported as "Other". There can be plenty of perfectly legitimate candidates for "Other":
  • Other data stored on the pool part volume that's not part of the pool.
  • Slack space due to filesystem cluster size choice (e.g. 4K clusters cannot store less than 4K of data).
  • Directory indexes.
  • Alternate data streams.
  • Other filesystem metadata, USN log, allocation bitmap, ACLs, etc...
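
To put a number on the slack space item in the list above, here is a small Python sketch. It assumes every file is allocated in whole clusters, which is a simplification (NTFS can keep very small files inside the MFT record, for example). The function names and file sizes are hypothetical.

```python
def allocated_size(file_size, cluster_size=4096):
    """Round a file size up to a whole number of clusters (simplified model)."""
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size

def total_slack(file_sizes, cluster_size=4096):
    """Bytes allocated on disk beyond the actual file data."""
    return sum(allocated_size(s, cluster_size) - s for s in file_sizes)

# Hypothetical file sizes in bytes.
sizes = [1, 4096, 4097, 10000]
print(total_slack(sizes))  # 10478
```

With many small files, this kind of rounding alone can add up to a noticeable amount of "Other".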
There's one additional special category of "Other", which is not immediately obvious:
  • Data written to any currently open files on the pool (which happens to change the file size).
This last category is due to how StableBit DrivePool updates its real-time file measurement data. That is to say, it only does so when a file is closed. This way, it doesn't have to query the actual file size from the disk (we use the underlying filesystem's data structures), and it doesn't have to worry about the file size changing while it's updating the statistics (because the file is already closed).

This means that if an application opened a file and then wrote a huge amount of data to it, the measured size wouldn't update in the UI until that file was closed. This is normally not a problem, because the primary use of measurement data is to let the balancing algorithm decide when and how to balance the pool. Since we can't move open files anyway, having it work this way makes sense here. EXE files and memory mapped files may experience an additional delay in measurement updates, due to how Windows handles them (section objects, mapping, etc...).
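
The effect of updating measurement only on close can be shown with a toy model in Python. This is not how StableBit DrivePool is implemented; the class and its methods are invented purely to illustrate why data written to an open file shows up as "Other" until the file is closed. All other sources of "Other" are ignored here.

```python
class PoolModel:
    """Toy model: measured sizes are refreshed only when a file is closed."""

    def __init__(self):
        self.on_disk = {}   # path -> bytes the filesystem reports as used
        self.measured = {}  # path -> size recorded at the last close

    def write(self, path, nbytes):
        # Writing grows the file on disk but does not touch the measurement.
        self.on_disk[path] = self.on_disk.get(path, 0) + nbytes

    def close(self, path):
        # Closing the file brings the measurement up to date.
        self.measured[path] = self.on_disk.get(path, 0)

    def other(self):
        return sum(self.on_disk.values()) - sum(self.measured.values())

pool = PoolModel()
pool.write("backup.img", 5_000_000)
print(pool.other())  # 5000000 (file still open, so the new data counts as "Other")
pool.close("backup.img")
print(pool.other())  # 0 (measurement caught up on close)
```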

----

Now, with all of that said, how would you confirm whether measurement is incorrect (with the new dpcmd updates in .827)?
  1. Close all of the files on the pool, and make sure that the processes that had them open don't reopen them during this test. Use this to help you:
    dpcmd list-open-files p:

    If you have any files that are open on the pool, the process / PID that is keeping those files open will be listed.

    You will need to close all open files for this test only. Normally, measurement works in real-time with existing open files. Do not use force-close-open-files. That only closes the file parts and not the files on the pool, which is not what we want here.

  2. Run:
    dpcmd check-pool-fileparts p:

    This will scan the entire pool and give you summary statistics at the end. This summary is the sum of all of the files and streams on the pool.

  3. Finally, run:
    dpcmd list-poolparts p: 1

    This will give you the pool measurement data from StableBit DrivePool in textual format.
By comparing the output from step 2 (under file sizes) and step 3 (under file parts), you will be able to tell if there are any discrepancies.

Do not include Stream sizes from Step 2 (those are considered "Other").

Do not include directory parts (and their streams) from Step 3 (those are considered "Other" as well).
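
Once you have the two totals, the comparison itself is simple arithmetic. Here is a minimal Python sketch that takes the two numbers typed in by hand (it does not parse dpcmd output, and I'm not assuming anything about the output format). The function name, the sample numbers, and the 1% tolerance for "closely match" are assumptions for the example.

```python
def compare_measurement(scanned_file_bytes, measured_file_part_bytes, tolerance=0.01):
    """Compare the Step 2 file size total with the Step 3 file part total.

    Returns (difference in bytes, relative difference, whether they closely match).
    """
    diff = measured_file_part_bytes - scanned_file_bytes
    relative = abs(diff) / max(scanned_file_bytes, 1)  # avoid dividing by zero on an empty pool
    return diff, relative, relative <= tolerance

# Hypothetical totals copied by hand from Step 2 and Step 3.
diff, relative, ok = compare_measurement(1_000_000_000, 1_000_500_000)
print(diff, relative, ok)  # 500000 0.0005 True
```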

If the numbers from Step 2 and Step 3 closely match and you're still seeing a lot of "Other", then you should be asking, "why is the underlying filesystem reporting more used space than is actually used by the pooled files?". As I've stated above, this is perfectly normal and expected. But if that discrepancy is huge, then you can use additional tools (like fsutil fsinfo ntfsinfo d:) to look into the situation.

The volume size information from dpcmd list-poolparts p: 1 comes from the underlying filesystem and not StableBit DrivePool. It's shown here for convenience.