GCS chunked request overwrites existing file instead of appending to it #144

Open
mtbohm opened this issue Jul 17, 2024 · 2 comments

mtbohm commented Jul 17, 2024

We are using UpChunk to upload files into a Google Cloud Storage (GCS) bucket, using a signed URL generated by an external backend. For small files this works perfectly, and the uploaded files appear in the bucket. However, when the file size exceeds the configured chunk size, the library makes several requests to the GCS upload URL (as expected), but once the chunked requests are done, only the last chunk is stored in the bucket. It seems that each PUT request containing a single chunk overwrites the previous one.

To illustrate, we set the chunk size to 4 MiB:

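import * as UpChunk from '@mux/upchunk';

const upload = UpChunk.createUpload({
  endpoint: res.signedUrl, // signed URL returned by our backend
  file: file,
  chunkSize: 4096, // in kB, i.e. 4 MiB
});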

We then upload a file of about 5.2 MiB. The first chunk's PUT has these request headers:

[Screenshot 2024-07-17 at 23 27 34: request headers of the first chunk's PUT]

And the second chunk has:

[Screenshot 2024-07-17 at 23 27 43: request headers of the second chunk's PUT]

But after both requests have completed, the file in the GCS bucket looks like this:

[Screenshot 2024-07-17 at 23 27 52: the resulting object in the GCS bucket]

We've confirmed that if you look in GCS after the first request is done but before the second one completes, the object is 4 MiB in size. This suggests that the second PUT overwrites the object rather than appending to it.
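Since the screenshots are not reproduced in text form, here is a minimal sketch of the byte ranges the two requests would be expected to cover. It assumes each chunk PUT carries a `Content-Range: bytes start-end/total` header and that `chunkSize` is expressed in kB; `contentRanges` is a hypothetical helper for illustration, not part of UpChunk, and the exact file size is made up to approximate 5.2 MiB.

```javascript
// Hypothetical helper (not part of UpChunk): lists the Content-Range value
// each chunk request would be expected to carry.
// chunkSizeKb mirrors the chunkSize option above, assumed to be in kB.
function contentRanges(fileSizeBytes, chunkSizeKb) {
  const chunkBytes = chunkSizeKb * 1024;
  const ranges = [];
  for (let start = 0; start < fileSizeBytes; start += chunkBytes) {
    // The end offset is inclusive, so subtract one.
    const end = Math.min(start + chunkBytes, fileSizeBytes) - 1;
    ranges.push(`bytes ${start}-${end}/${fileSizeBytes}`);
  }
  return ranges;
}

// Illustrative size only: roughly 5.2 MiB.
const fileSize = 5452595;
console.log(contentRanges(fileSize, 4096));
// [ 'bytes 0-4194303/5452595', 'bytes 4194304-5452594/5452595' ]
```

If the stored object ends up matching only the second range in size (about 1.2 MiB), that would be consistent with the second PUT replacing the first.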

We have gone through both Google Cloud's documentation and UpChunk's docs, and found nothing that indicates what might be causing this or how to configure the upload to append instead of overwrite. Any thoughts would be greatly appreciated!

Contributor

mmcc commented Jul 17, 2024

Hi there! This is a weird one I'll confess I haven't seen... This might be totally off base, but can you confirm which GCS upload type you're using? This library assumes you're using Resumable Uploads, and a mismatch there is the only thing that immediately comes to mind for what could be going on here.
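One quick way to check for such a mismatch, sketched here under an assumption: GCS resumable session URIs, as returned when a resumable upload is initiated, include an `upload_id` query parameter, whereas a plain signed object URL does not. `looksLikeSessionUri` is a hypothetical helper and both URLs in the example are made up.

```javascript
// Hypothetical helper: returns true if the URL carries an upload_id query
// parameter, which (by assumption) marks a GCS resumable session URI.
function looksLikeSessionUri(url) {
  return new URL(url).searchParams.has('upload_id');
}

// Made-up URLs for illustration only.
const plainSigned =
  'https://storage.googleapis.com/example-bucket/object?X-Goog-Algorithm=GOOG4-RSA-SHA256&uploadType=resumable';
const sessionUri =
  'https://storage.googleapis.com/example-bucket/object?upload_id=EXAMPLE_ID';

console.log(looksLikeSessionUri(plainSigned)); // false
console.log(looksLikeSessionUri(sessionUri)); // true
```

A URL that fails this check is probably being treated by GCS as a target for a simple single-request upload, where each PUT writes a whole new object.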

Author

mtbohm commented Jul 18, 2024

Indeed, we are using resumable uploads. The signed URL that we pass to the createUpload function has this format (some values changed and obscured):

https://storage.googleapis.com/BUCKET_NAME_OBSCURED/tnq7h8crJn-KayBIouGq4?X-Goog-Algorithm=GOOG4-RSA-SHA256&X-Goog-Credential=assets%40svc-acc-obscured.iam.gserviceaccount.com%2F20240718%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20240718T095653Z&X-Goog-Expires=599&X-Goog-Signature=956b59830e8d16d84c2ff813d700178d7da906dd9f27f1da3efd38bc705a96b68731dbea0b5dcc2a66adcc6fb67ceb432f147ceb4a4e3f02addbca82021357ee99efadf0de789de5ab41219aa96eb63566296c2f71959ddbb1ca528266c83a8e1d6f5fb6586f34e2b12ee4255690bfb652ad59b205f7c27e0ed79ceec1087b0f3809f8356849138685b3e497f875da0ebdb95cdc8b6f89cdb6e58a208c41952dca499530ad0cba2db808321dc4ede19ebf2490a9415ee3fe0f113eb77a59051c01d7a0ba298b7e699897b2113fbb1555fa63a8849ee2dc9ff84d19bac94e94b653614d4aa5250746253d42470cd63ba00e77a89a5fae92d6755018b1de3c9a49&X-Goog-SignedHeaders=content-type%3Bhost&uploadType=resumable

We can also confirm that uploadType=resumable was included in the POST request to GCS that generates this URL (using the GCS Go client library). So from what I can tell, the URL is formatted correctly and the necessary parameters are there.
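One possibility, not confirmed in this thread, is that the URL shown above is a signed object URL rather than a resumable session URI: it has no `upload_id` parameter, so each PUT to it may be handled as a complete single-request upload. The sketch below shows how a session could be started from a signed URL, assuming the backend signs a separate URL for the POST method with the `x-goog-resumable` header included in the signed headers. `startResumableSession` and `signedPostUrl` are hypothetical names, and `fetchImpl` is only a parameter to make the function easy to stub.

```javascript
// Sketch only. Assumes signedPostUrl was signed for the POST method with
// x-goog-resumable among the signed headers; the URL format shown above
// (signing only content-type and host) would not satisfy this.
async function startResumableSession(signedPostUrl, contentType, fetchImpl = fetch) {
  const res = await fetchImpl(signedPostUrl, {
    method: 'POST',
    headers: {
      'x-goog-resumable': 'start',
      // Must match the content type used when the URL was signed.
      'Content-Type': contentType,
    },
  });
  if (res.status !== 201) {
    throw new Error(`Expected 201 Created, got ${res.status}`);
  }
  // The session URI for the chunk PUTs is returned in the Location header.
  const sessionUri = res.headers.get('Location');
  if (!sessionUri) {
    throw new Error('No Location header in response');
  }
  return sessionUri;
}
```

The returned session URI, rather than the signed URL itself, would then be passed as `endpoint` to `UpChunk.createUpload`. In a browser, reading the `Location` header likely also requires the bucket's CORS configuration to expose it.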
