Before posting, and to avoid disappointment, please read the following:

  • This forum is not for 2BrightSparks to provide technical support. It's primarily for users to help other users. Do not expect 2BrightSparks to answer any question posted to this forum.
  • If you find a bug in any of our software, please submit a support ticket. It does not matter if you are using our freeware, a beta version or you haven't yet purchased the software. We want to know about any and all bugs so we can fix them as soon as possible. We usually need more information and details from you to reproduce bugs and that is better done via a support ticket and not this forum.

Cloud scanning with large number of files

For technical support visit https://support.2brightsparks.com/
angerthosenear
Newbie
Posts: 3
Joined: Fri Aug 16, 2019 2:47 pm

Cloud scanning with large number of files

Post by angerthosenear »

I'm backing up a folder we use for archival purposes. This gets about 11k new files per week. Backing up to S3, so far it is about 400k files in S3.

We want to keep the past 7 days locally in case we need to inspect any of these files; the rest are moved to S3 (and deleted locally).

The profile is set up and doing that just fine; however, it must rescan the entire S3 bucket every time. The file and folder decisions are both set so that Source overwrites S3 (with move the file enabled), and files that exist on Source but not on S3 are set to Move the file. Everything else is set to Do nothing.

From the docs, since I'm using a move from Source to Destination, Fast Backup cannot be used. However, with this quantity of files, scanning only the objects that are in Source against the Destination would be significantly faster (11k files to check versus all 400k in S3). Basically, an "Only scan Source against Destination" type of option for users with massive numbers of Destination files.
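For a rough sense of the request counts involved, here is a back-of-envelope sketch assuming S3's default ListObjectsV2 page size of 1,000 keys and the file counts from this thread (these numbers are illustrative, not from SyncBackPro's internals):

```python
def list_requests(total_objects: int, page_size: int = 1000) -> int:
    """LIST pages needed to enumerate a bucket of this size."""
    return -(-total_objects // page_size)  # ceiling division

full_listing = list_requests(400_000)  # pages to enumerate everything in S3
head_checks = 11_000                   # one per-key lookup per new local file
```

Interestingly, a full listing is only about 400 LIST requests, so in raw API-call terms it is cheaper than 11k per-key lookups; the cost of the full scan is that all 400k returned keys must be transferred and compared on every run, whereas a source-driven scan would only ever touch the ~11k candidate files.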

Would changing the "What to do if the same file has been changed on Source and Amazon S3" to Do nothing fix it? The local files will always be new and never changed.

The only alternative I can think of is to make this purely a backup operation (to enable Fast Backup) and then run a PowerShell script in the Post-Run step to delete the files. This doesn't seem ideal, since it may delete a file whose copy failed.
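The failed-copy concern in the post-run idea can be addressed by deleting only files that some record confirms were copied. This is a minimal sketch (in Python rather than PowerShell, for illustration) assuming a hypothetical "uploaded.log" that lists one confirmed-uploaded relative path per line; a file whose upload failed never appears in the log, so it survives for the next run:

```python
from pathlib import Path
import tempfile

def purge_uploaded(local_root: Path, success_log: Path) -> list[str]:
    """Delete files the log confirms were uploaded; return their names."""
    removed = []
    for line in success_log.read_text().splitlines():
        name = line.strip()
        target = local_root / name
        if name and target.is_file():
            target.unlink()
            removed.append(name)
    return removed

# Tiny self-contained demo: two files, only one confirmed uploaded.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.txt").write_text("archived")
    (root / "b.txt").write_text("upload failed")
    log = root / "uploaded.log"
    log.write_text("a.txt\n")          # only a.txt was confirmed
    removed = purge_uploaded(root, log)
    survivor = (root / "b.txt").exists()
```

How the success log gets produced is the open question; SyncBackPro's own log or export would have to supply it, which is worth raising in a support ticket.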

Any insight would be great. Thanks!
ZaphodBeeblebrox
Newbie
Posts: 1
Joined: Mon Feb 27, 2023 7:40 pm

Re: Cloud scanning with large number of files

Post by ZaphodBeeblebrox »

I am experiencing a similar issue. I am connecting to a SharePoint container with a large number of files. My backup job type in SyncBack is Mirror.

I'm seeing over 2 days of time spent scanning the SharePoint container and retrieving "Delta Items".

Not sure if this is caused by a similar mechanism, but it is a massively long time.
Swapna
Expert
Posts: 1023
Joined: Mon Apr 13, 2015 6:22 am

Re: Cloud scanning with large number of files

Post by Swapna »

Hi,
angerthosenear wrote:
Would changing the "What to do if the same file has been changed on Source and Amazon S3" to Do nothing fix it? The local files will always be new and never changed.
Sorry, the option "What to do if the same file has been changed on Source and Amazon S3" does not improve the scan time of a profile run. Please increase the "Number of scanning threads to use (too many will degrade performance)" under Modify profile > Expert > Cloud > Advanced page and see if it can improve performance.

As stated in the help file, the option "Number of scanning threads to use" can significantly improve performance, but if too many threads are used it can also significantly reduce performance (by overloading the network and CPU, and by increasing memory usage).
ZaphodBeeblebrox wrote:
I am connecting to a SharePoint container with a large number of files. My backup job type in SyncBack is Mirror. I'm seeing over 2 days of time spent scanning the SharePoint container and retrieving "Delta Items".
Tick the option "Do not use the delta API" under Modify profile > Expert > Cloud > Advanced page to disable the use of the Delta API, save the settings, and run the profile.

Thank you.
angerthosenear
Newbie
Posts: 3
Joined: Fri Aug 16, 2019 2:47 pm

Re: Cloud scanning with large number of files

Post by angerthosenear »

Swapna wrote:
Fri Mar 03, 2023 4:34 am
Hi,
Would changing the "What to do if the same file has been changed on Source and Amazon S3" to Do nothing fix it? The local files will always be new and never changed.
Sorry, the option "What to do if the same file has been changed on Source and Amazon S3" does not improve the scan time of a profile run. Please increase the "Number of scanning threads to use (too many will degrade performance)" under Modify profile > Expert > Cloud > Advanced page and see if it can improve performance.

As stated in the help file the option "Number of scanning threads to use" can significantly improve performance, but if too many threads are used then it can also significantly reduce performance (by overloading the network, CPU, and increasing memory usage).
This isn't changeable; it's fixed at 1. However, if I disable "Retrieve a list of all the files and folders then filter (faster in most situations)", then I can edit the number of scanning threads. The docs say: "If your cloud storage system has tens of thousands of files on it then you may need to disable this option as SyncBackPro may use a large amount of CPU time retrieving the list." But they don't say anything about speed, or whether this would even help. I can't imagine it would make anything faster versus requesting a full list.

I do notice in the log, under S3, that "Waiting for parallel threads" is red and shows 3 mins 13 sec (in our most recent run), so that setting may help. But fetching the full file list still feels like a rather large ask: a lot of S3 API calls whose results are always going to be discarded anyway.

We have hundreds of thousands of files, not just tens of thousands. Crossing the half million mark soon. Locally we have about 10k files to be scanned which is fast enough.

The closest thing to what we want is something like Fast Backup (since we never need to scan the cloud for differences) that then just purges the files that were copied. Even easier, in our case it will always be new files, never modified files that need updating.
Swapna
Expert
Posts: 1023
Joined: Mon Apr 13, 2015 6:22 am

Re: Cloud scanning with large number of files

Post by Swapna »

Hi,

Please send a support ticket with your profile configuration to [email protected] for assistance. Do not post (or link to) your profile on this open forum.

Thank you