
Is 650 GB per day between hard disks connected by 1 GBit ethernet ok? #579

Open
svenha opened this issue Jan 4, 2024 · 4 comments


svenha commented Jan 4, 2024

Two 8 TB HDDs, connected over 1 GBit Ethernet, fast CPUs on both sides. The initial backup took 11.5 days for 7.5 TB of data (speed: 650 GB per day). btrbk is version 0.32.6, package from Ubuntu 22.04.

I wonder whether this is a typical speed for a full btrbk backup, or whether I have a hardware problem, or ...
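For scale, a quick back-of-envelope calculation shows how far the reported rate is below what the link could carry (figures taken from the report above; 110 MB/s is an assumed practical ceiling for 1 GBit Ethernet):

```python
# Observed: 7.5 TB transferred in 11.5 days (figures from the report above).
TB = 1e12
observed_bytes = 7.5 * TB
seconds = 11.5 * 24 * 3600
observed_rate = observed_bytes / seconds / 1e6  # MB/s

line_rate = 110.0  # MB/s, assumed practical 1 GBit Ethernet maximum

print(f"observed:    {observed_rate:.1f} MB/s")
print(f"utilisation: {observed_rate / line_rate:.1%} of line rate")
# observed: ~7.5 MB/s, i.e. well under 10% of what the link can carry
```

So the network itself is unlikely to be the ceiling here; something else (per-file overhead, disk seeks, settings) is dominating.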


svenha commented Jan 17, 2024

OK. It seems I have hit some limitations of my mainboard.

Aside from this, I am still interested in which backup speeds others achieve in HDD setups.

@tbone2k-git
I am just a casual visitor looking into the btrbk project for the first time, but since no one has answered your question, I will.. o)

I think it totally depends!

Transfer from HDD to HDD in general gets slower if:

  • lots of small files, the smaller the slower
  • lots of metadata on these files (permissions, extended attributes, preserved date/time stamps etc.)
  • concurrent access (operating system or applications are reading / writing while the backup is running)
  • encryption (if your CPU / hardware has en-/decryption support it helps big time, may not even be noticeable in the best case)
  • the source / target HDD is getting full (access times will drop off, can easily cut general performance in half)
  • fragmentation of files (performance drops)
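The small-file penalty in particular can be illustrated with a toy model: assume each file costs a fixed per-file overhead (seeks plus metadata round trips; the 10 ms and 200 MB/s below are illustrative guesses, not measurements) on top of streaming its bytes:

```python
# Toy model (assumed numbers): effective throughput collapses as the
# fixed per-file overhead starts to dominate the streaming time.
def effective_mb_s(file_mb: float,
                   stream_mb_s: float = 200.0,   # assumed sequential HDD rate
                   overhead_s: float = 0.01) -> float:  # assumed 10 ms/file
    """Effective MB/s when copying many files of `file_mb` megabytes each."""
    per_file_time = file_mb / stream_mb_s + overhead_s
    return file_mb / per_file_time

for size_mb in (1000, 50, 5, 0.05):
    print(f"{size_mb:>7} MB files -> {effective_mb_s(size_mb):6.1f} MB/s")
# 1000 MB files stay near the streaming rate; 50 KB files crawl at ~5 MB/s
```

The exact numbers are made up, but the shape matches the experience below: big files run at near-native speed, tiny files at a few MB/s.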

Since you are copying over the network, things like

  • network topology
  • link speed
  • NIC settings (buffers, jumbo packets etc.)

also come into play.

If there is RAID in the mix somewhere, even more things need to work as expected. An outdated or bad driver for your NIC or RAID controller can make all the difference, and so can the MTU or buffer settings of your network card. If SMB is involved, or a mix of Linux and Windows machines, transfer speeds can also be slow because of bad Samba / smbd / client settings.

I can tell from experience that when things are slow, they can be dead slow, like 1.5 MB/s from HDD to HDD over a 1 GBit network for any file size. In that case, checking NIC settings and SMB is where I would start. But before you benchmark transfer speeds in a backup or file-copy situation, you should make sure generic network transfer speeds are ok (with tools like "netio", for example).
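Measuring raw TCP throughput in isolation, before involving disks or backup tools, is the key idea here. A real test would run netio or iperf3 between the two machines; the sketch below only demonstrates the principle against the local loopback interface:

```python
# Minimal TCP throughput check in the spirit of netio/iperf3, but local
# only: one thread drains a socket while the main thread pushes bytes.
import socket
import threading
import time

def drain(server_sock: socket.socket) -> None:
    """Accept one connection and read until the peer closes it."""
    conn, _ = server_sock.accept()
    with conn:
        while conn.recv(1 << 16):
            pass

srv = socket.socket()
srv.bind(("127.0.0.1", 0))   # any free port on loopback
srv.listen(1)
threading.Thread(target=drain, args=(srv,), daemon=True).start()

payload = b"\0" * (1 << 16)          # 64 KiB chunks
chunks = 64
total = chunks * len(payload)        # 4 MiB in total

start = time.monotonic()
with socket.create_connection(srv.getsockname()) as client:
    for _ in range(chunks):
        client.sendall(payload)
elapsed = time.monotonic() - start

print(f"loopback: {total / elapsed / 1e6:.0f} MB/s over {total} bytes")
```

Loopback numbers are far above any real NIC, of course; the point is only to separate "the network path is slow" from "the copy tool is slow".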

If you have slow transfer speeds from A to B, but not between A and C or B and C, things can get complex: some driver, protocol or setting mismatch will kill the performance for one specific connection only, even though everything works fine and fast elsewhere and when tested on its own. Updating drivers or the OS, changing the network card or switching to another SMB version can help.

What I would expect with non-fragmented, basically empty HDDs:
If you copy a single 10 GB file from HDD to HDD over 1 GBit, it should transfer at around 110 MB/s (max network speed).
If you copy a single 10 GB file from HDD to HDD locally, it should transfer at around 200-300 MB/s (max read/write speed of the HDDs).

Small files (anything less than 50 MB, for example) and copying additional metadata for each file will slow things down noticeably.
Very small files (anything less than 5 MB) will slow things down drastically. Very, very small files (anything less than 50 KB) will slow things down to almost a full stop (not really, but compared to the bigger files it is basically no progress for hours.. o).

Maybe this is of help to you, have a nice day. o)


svenha commented Oct 9, 2024

@tbone2k-git Thanks for your ideas and comments. I think my main problem might be that I have many small files from text archives etc., several million of them.

@tbone2k-git

Yes, millions of small files are a problem, at least for an initial backup. Once the files are copied over and you only "sync" or "mirror" them to the destination, it should be quicker. o)

If those files don't change any more, you could put them into an archive of some kind (zip, rar, whatever), so they can be copied in "one block", which should happen much faster.
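The archiving idea can be sketched with Python's standard tarfile module (all paths and file names below are made up for illustration):

```python
# Sketch: bundle many small files into one archive so they move as a
# single large sequential stream instead of millions of tiny transfers.
import pathlib
import tarfile
import tempfile

src = pathlib.Path(tempfile.mkdtemp())
for i in range(100):                  # stand-in for millions of tiny files
    (src / f"note_{i}.txt").write_text("example payload\n")

archive = src / "notes.tar.gz"
with tarfile.open(archive, "w:gz") as tar:
    for f in sorted(src.glob("note_*.txt")):
        tar.add(f, arcname=f.name)    # store relative names only

print(archive.stat().st_size, "bytes in one transferable block")
```

The copy then pays the per-file overhead once, for the archive, instead of once per tiny file.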

You can probably still test whether your transfer speeds are normal with some big files, just to make sure all the hardware components and the software involved are performing as expected. As mentioned, you should see around 110 MB/s over a 1 GBit network with big files.

I sync 2-3 million files and around 14 TB regularly, with 50-500 GB of new data in each run. It takes around 30 minutes to determine which files are new or changed and which got deleted (with full log and counting the bytes to be done etc.), then it starts deleting obsolete files in the destination, and finally it starts to copy the new or changed files over the 1 GBit network as well, which takes a varying amount of time (0-3 h) depending on the number of files and overall size in bytes.

This is in a Windows environment using Robocopy.exe though. Robocopy.exe is a tool Microsoft provides with every Windows installation; it uses multithreading and some more tricks to copy / sync files including all metadata quite fast. Not sure how things compare in Linux land. I am still trying to find out what options I have on Linux for doing what I do in the Windows world with its NTFS-based snapshots (also called "Shadow Copies" or "Previous Versions").
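On Linux, a crude analogue of Robocopy's multithreaded copy can be sketched with a thread pool (the helper name and paths below are hypothetical; rsync and cp -a are the usual single-threaded tools, and parallelism mostly helps when per-file latency, not raw bandwidth, is the bottleneck):

```python
# Sketch: copy a directory tree with several worker threads, preserving
# metadata via shutil.copy2 (roughly what Robocopy's /MT mode does).
import pathlib
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor

def copy_tree_parallel(src: pathlib.Path, dst: pathlib.Path,
                       workers: int = 8) -> int:
    """Copy all regular files under src to dst using a thread pool."""
    files = [p for p in src.rglob("*") if p.is_file()]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for f in files:
            target = dst / f.relative_to(src)
            target.parent.mkdir(parents=True, exist_ok=True)
            pool.submit(shutil.copy2, f, target)  # copy2 keeps metadata
    return len(files)

# Tiny demo with throwaway directories:
src = pathlib.Path(tempfile.mkdtemp())
dst = pathlib.Path(tempfile.mkdtemp())
(src / "a").mkdir()
(src / "a" / "x.txt").write_text("hi")
print(copy_tree_parallel(src, dst), "files copied")
```

On a single spinning HDD, extra threads can even hurt by adding seeks, so this is worth benchmarking rather than assuming.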

It's kind of a bummer that the btrfs snapshot / subvolume feature is not that easy to access in Linux. There are some tools flying around, like "Btrfs Assistant" and "snapper", but no real OS integration it seems. I cannot list and navigate between snapshots of a drive or folder with any of the available file managers, for example; I also cannot enable compression or encryption without using the terminal, which I can do, but it should be one click away on the properties page of a folder if you ask me. o)
