s3fs is significantly slower than boto3 for large file uploads #900
Comments
In file-like mode (using The non-file upload-from-disk method is
and already has a larger 50MB block (called |
@martindurant Thanks for the reply! Increasing block_size does boost speed by up to 50% in my tests, but it's still about twice as slow as boto3, which I believe benefits from better concurrency handling. The put_file() method looks useful, but it seems to work only with file paths. Is there an fsspec-based variant that could, for example, support pickle.dump() directly? My main goal is to unify I/O operations with upath + fsspec, including S3.
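One possible workaround for the path-only limitation, as a minimal sketch rather than anything s3fs ships: pickle to a local temporary file first, then hand that path to `put_file()`. The helper name `dump_then_put` and the `LocalCopyFS` stand-in are hypothetical and exist only so the example runs without S3 access; the only filesystem method assumed is `put_file(lpath, rpath)`.

```python
import os
import pickle
import tempfile


def dump_then_put(obj, fs, remote_path):
    """Pickle obj to a local temp file, then upload it by path via fs.put_file."""
    fd, local_path = tempfile.mkstemp(suffix=".pkl")
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)
        # The upload happens from disk, so the filesystem can pick its own chunking.
        fs.put_file(local_path, remote_path)
    finally:
        os.remove(local_path)


class LocalCopyFS:
    """Hypothetical stand-in for a filesystem object; it only mimics put_file."""

    def put_file(self, lpath, rpath):
        with open(lpath, "rb") as src, open(rpath, "wb") as dst:
            dst.write(src.read())
```

With a real `s3fs.S3FileSystem()` instance passed as `fs`, `remote_path` would be a bucket/key string. The trade-off is an extra local write, which may or may not be acceptable depending on disk speed and object size.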
Would you care to test with #901? It's worth pointing out that s3transfer and maybe boto3 use threads and/or processes for parallelism, which matters in low-latency situations where the CPU time for stream compression might be significant. s3fs is single-threaded.
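To illustrate the kind of thread-based parallelism mentioned above, here is a rough sketch that splits a buffer into fixed-size parts and sends them through a thread pool. It is not how s3transfer is actually implemented; `upload_part` is a hypothetical callable supplied by the caller, and the function names are made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_parts(data, part_size):
    """Cut a bytes object into consecutive chunks of at most part_size bytes."""
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]


def upload_parts_concurrently(data, part_size, upload_part, max_workers=4):
    """Call upload_part(part_number, chunk) for every chunk using worker threads.

    upload_part is a placeholder for whatever actually sends one part.
    """
    parts = split_into_parts(data, part_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves input order, so results line up with part numbers.
        return list(pool.map(upload_part, range(1, len(parts) + 1), parts))
```

Threads help here mainly when each part spends most of its time waiting on the network, since several requests can then be in flight at once.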
I was trying to match your code of writing the whole contents of a file.
(where you could have
I also started #901 for you to speed test.
Hi, with pickle.dumps(), it calls fsspec's write() interface, so I think we only need to make a change to line to enable the new block size. Meanwhile, in my testing I find that max_concurrency has no impact on performance... Here is my testing code:
With a 1GB object, the default 5MB block size gives me 40MB/s, a 50MB block gives 70MB/s, and a 500MB block gives 90MB/s. But boto3 gives 160MB/s.
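For a sense of scale, the arithmetic behind those numbers can be sketched as follows. This assumes "1GB" means 1024 MB and that each flushed block turns into one upload request, which is an assumption about the buffered write path and not something confirmed in this thread.

```python
import math

OBJECT_MB = 1024  # assumption: "1GB" taken as 1024 MB


def part_count(object_mb, block_mb):
    """Number of blocks needed to cover the object at a given block size."""
    return math.ceil(object_mb / block_mb)


def seconds_at(object_mb, mb_per_s):
    """Wall-clock time to move the object at a steady throughput."""
    return object_mb / mb_per_s


# (block size in MB, measured throughput in MB/s) as reported above
for block_mb, rate in [(5, 40), (50, 70), (500, 90)]:
    parts = part_count(OBJECT_MB, block_mb)
    secs = seconds_at(OBJECT_MB, rate)
    print(f"block={block_mb}MB parts={parts} time={secs:.1f}s")

print(f"boto3 at 160MB/s: {seconds_at(OBJECT_MB, 160):.1f}s")
```

Under these assumptions the block count drops from 205 to 21 to 3, while the measured throughput only roughly doubles, which is consistent with per-request overhead being just one part of the gap to boto3.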
I've noticed that while uploading large files (greater than 1GB), s3fs.write() performs around three times slower than the boto3.upload_file() API.
Is this slower performance expected when using s3fs, and are there any configurations or optimizations that could improve its upload speed?