Code Change Request

# 27334


Christopher
Technical Support
StableBit CloudDrive
1.0.0.802
Windows 10 (64 bit)
Public
Alex

* [Issue #27334] For Google Drive and Box, an authoritative local storage engine is required for chunk IDs.
    - Drives using the legacy RAM-based storage engine (ChunkId_Persistent = false) will refuse to mount.
    - All existing non-authoritative databases will be converted to authoritative databases by indexing all of the chunks stored at the provider. This is a one-time process and may take a few minutes, depending on the size of the drive.
    - For Google Drive, we cannot rely on Google's ability to search for a file by name, because those results have proven to be incorrect at times.
Public
Alex

Ok, this makes perfect sense. I think we can finally fix this once and for all. Google's directory listing sometimes simply excludes files/folders from the list when querying by name (kind of ironic, because Google is a search engine company).

This explains both issues:
  • When downloading a chunk, if we don't know whether a file ID exists for that chunk, we query for it using Google's directory listing API (with a filter on the filename). If this query doesn't find anything, we continue as if the chunk doesn't exist and satisfy the read request with nulls (see the sketch after this list).

    So I think the solution is to stop relying on Google's filename-based querying for locating chunks. Right now, if you create a new Google Drive cloud drive, you will already be using an authoritative local database for chunk ID queries, so those drives should not be susceptible to this issue. Only legacy drives, or drives that have opted into the RAM chunk ID helper by modifying advanced settings, are susceptible.
  • For the double CloudPart issue, that's caused by the same thing. When a cloud drive mounts, it attempts to look up the file ID of the CloudPart folder that corresponds to the UID of the drive, once again using the Google directory listing API. If what we're seeing here is right, that query sometimes comes back empty and we end up not finding the CloudPart folder. In builds prior to .786 we would actually recreate the CloudPart folder, which caused two identical CloudPart folders to exist at the provider, effectively splitting the cloud drive's data. The fix in .786 prevents that from happening; instead, the user gets a cloud drive mount error (which they can retry).
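
To make the failure mode in #1 concrete, here's a minimal sketch of that legacy lookup path. The `Provider`/`RemoteFile` types, the `list_files_by_name` call, and the chunk naming scheme are hypothetical stand-ins, not CloudDrive's actual code:

```python
from dataclasses import dataclass
from typing import List, Protocol

@dataclass
class RemoteFile:
    name: str
    file_id: str

class Provider(Protocol):
    def list_files_by_name(self, name: str) -> List[RemoteFile]: ...
    def download(self, file_id: str) -> bytes: ...

def read_chunk_legacy(provider: Provider, chunk_id: int, chunk_size: int) -> bytes:
    """Legacy path: resolve the chunk's file ID with a name-filtered
    directory listing, then download it."""
    matches = provider.list_files_by_name(f"{chunk_id:016x}")  # hypothetical naming scheme
    if not matches:
        # Google's name-filtered listing sometimes omits files that do exist.
        # This code can't tell "missing" apart from "omitted", so it treats the
        # chunk as never written and returns nulls -- silently wrong data.
        return bytes(chunk_size)
    return provider.download(matches[0].file_id)
```
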
As far as fixing this:

For #1, I will write code to migrate everyone to an authoritative chunk ID database on the first drive mount. This might take a few minutes, but it's a one-time process and it should fix this once and for all.
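
As a rough sketch of what that migration could look like (hypothetical names throughout; assuming the provider can enumerate a folder's full contents without name filters):

```python
from typing import Optional

def parse_chunk_id(file_name: str) -> Optional[int]:
    """Hypothetical helper: recover a chunk ID from a chunk file's name."""
    try:
        return int(file_name, 16)
    except ValueError:
        return None  # not a chunk file; skip it

def migrate_to_authoritative(provider, cloud_part_folder_id: str, db) -> None:
    """One-time migration: build an authoritative chunk-ID -> file-ID map
    by enumerating every file stored under the drive's CloudPart folder."""
    # A full (paged) enumeration avoids name-filtered queries entirely, so
    # the listing isn't subject to the omission problem described above.
    for f in provider.list_all_files(cloud_part_folder_id):
        chunk_id = parse_chunk_id(f.name)
        if chunk_id is not None:
            db.put(chunk_id, f.file_id)
    db.mark_authoritative()
    # From here on, a chunk that's absent from the database is definitively
    # absent; no name-filtered provider query is ever consulted again.
```
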

And #2 should already be fixed as of .786.
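
For reference, the .786 behavior change amounts to something like this (again a hedged sketch with hypothetical names; the actual code isn't shown in this ticket):

```python
class MountError(Exception):
    """Surfaced to the user, who can simply retry the mount."""

def find_cloud_part_folder(provider, drive_uid: str):
    """Look up the CloudPart folder for this drive's UID via the
    provider's name-filtered folder listing."""
    matches = provider.list_folders_by_name(f"CloudPart-{drive_uid}")  # hypothetical naming
    if not matches:
        # Pre-.786: a new CloudPart folder was created here. If the listing
        # had merely omitted the existing folder, that produced two identical
        # CloudPart folders and split the drive's data between them.
        # Post-.786: fail the mount instead and let the user retry.
        raise MountError(f"CloudPart folder for drive {drive_uid} not found")
    return matches[0]
```
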

"If this is related to the Chunk ID system, wouldn't deleting the database "fix" the issue? "

No, that will in fact cause the issue. Deleting the database forces the chunk ID helper to rebuild it, querying Google for each chunk ID as needed; since those queries are sometimes unreliable, you're effectively building an incorrect database.
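
In sketch form, the problem with that rebuild is that each unreliable answer gets cached as fact (hypothetical names again):

```python
def lookup_chunk(provider, db, chunk_id: int):
    """Lazy rebuild after the database was deleted: answer each lookup
    with a name-filtered query and cache whatever comes back."""
    if db.contains(chunk_id):
        return db.get(chunk_id)
    matches = provider.list_files_by_name(f"{chunk_id:016x}")
    file_id = matches[0].file_id if matches else None
    db.put(chunk_id, file_id)  # a spurious "not found" is now recorded as fact
    return file_id
```
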

The way to prevent this issue right now is to never delete the database of an existing drive, because it contains the only valid record of all of that drive's chunk IDs.