Commit

Ensuring that empty S3 files are not moved between S3 Buckets when a Work is published (#1591)
jrgriffiniii authored Nov 1, 2023
1 parent 7d13ef2 commit 02022a6
Showing 3 changed files with 150 additions and 75 deletions.
10 changes: 10 additions & 0 deletions app/services/s3_query_service.rb
@@ -209,6 +209,16 @@ def data_profile
  def publish_files(current_user)
    source_bucket = S3QueryService.pre_curation_config[:bucket]
    target_bucket = S3QueryService.post_curation_config[:bucket]
    empty_files = client_s3_empty_files(reload: true, bucket_name: source_bucket)
    # Do not move the empty files; instead, note the presence of each one in
    # the provenance log.
    unless empty_files.empty?
      empty_files.each do |empty_file|
        message = "Warning: Attempted to publish empty S3 file #{empty_file.filename}."
        WorkActivity.add_work_activity(model.id, message, current_user.id, activity_type: WorkActivity::SYSTEM)
      end
    end

    files = client_s3_files(reload: true, bucket_name: source_bucket)
    snapshot = ApprovedUploadSnapshot.new(work: model)
    snapshot.store_files(files, current_user:)
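
The hunk above records a warning for each empty file. The code that actually keeps those files out of the move is not visible here. As a minimal sketch of the overall idea, and not the project's implementation, the following splits a file list into publishable and empty entries and builds the same warning text. The `file_entry` struct and the `partition_for_publication` helper are hypothetical names introduced for illustration; the real objects returned by `client_s3_files` may look different.

```ruby
# Hypothetical stand-in for the file objects used in the diff (assumption).
file_entry = Struct.new(:filename, :size)

# Splits files into publishable and empty ones, and builds one warning
# message per empty file, mirroring the message format used in the diff.
def partition_for_publication(files)
  empty, publishable = files.partition { |file| file.size.zero? }
  warnings = empty.map do |file|
    "Warning: Attempted to publish empty S3 file #{file.filename}."
  end
  [publishable, warnings]
end

files = [
  file_entry.new("10-34770/pe9w-x904/SCoData_combined_v1_2020-07_README.txt", 10_759),
  file_entry.new("10-34770/pe9w-x904/SCoData_combined_v1_2020-07_README.empty.txt", 0)
]
publishable, warnings = partition_for_publication(files)
```

`Enumerable#partition` makes a single pass and returns both groups, so the warning list and the list of files to move cannot drift apart.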
36 changes: 36 additions & 0 deletions spec/fixtures/s3_list_bucket_empty_result.xml
@@ -0,0 +1,36 @@
<?xml version="1.0" encoding="UTF-8"?>
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>pdc-describe-test1</Name>
  <Prefix>10-34770/pe9w-x904</Prefix>
  <KeyCount>4</KeyCount>
  <MaxKeys>1000</MaxKeys>
  <IsTruncated>false</IsTruncated>
  <Contents>
    <Key>10-34770/pe9w-x904/SCoData_combined_v1_2020-07_README.txt</Key>
    <LastModified>2022-04-21T18:29:40.000Z</LastModified>
    <ETag>&quot;008eec11c39e7038409739c0160a793a&quot;</ETag>
    <Size>10759</Size>
    <StorageClass>STANDARD</StorageClass>
  </Contents>
  <Contents>
    <Key>10-34770/pe9w-x904/SCoData_combined_v1_2020-07_README.empty.txt</Key>
    <LastModified>2022-04-21T18:29:40.000Z</LastModified>
    <ETag>&quot;008eec11c39e7038409739c0160a793b&quot;</ETag>
    <Size>0</Size>
    <StorageClass>STANDARD</StorageClass>
  </Contents>
  <Contents>
    <Key>10-34770/pe9w-x904/SCoData_combined_v1_2020-07_datapackage.json</Key>
    <LastModified>2022-04-21T18:30:07.000Z</LastModified>
    <ETag>&quot;7bd3d4339c034ebc663b990657714688&quot;</ETag>
    <Size>12739</Size>
    <StorageClass>STANDARD</StorageClass>
  </Contents>
  <Contents>
    <Key>10-34770/pe9w-x904/foo/</Key>
    <LastModified>2022-04-21T18:29:40.000Z</LastModified>
    <ETag>&quot;008eec11c39e7038409739c0160a793c&quot;</ETag>
    <Size>0</Size>
    <StorageClass>STANDARD</StorageClass>
  </Contents>
</ListBucketResult>
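
This fixture contains two zero-byte entries: a real empty file (`...README.empty.txt`) and a key ending in `/` (`foo/`), which looks like a directory placeholder. The diff does not show how `client_s3_empty_files` treats the placeholder, so the sketch below only illustrates how the two kinds of entries could be told apart when reading such a listing. It uses REXML from Ruby's bundled gems on a trimmed copy of the fixture; `zero_byte_keys` is a hypothetical helper, not part of the project.

```ruby
require "rexml/document"

# Trimmed copy of the fixture: one regular file, one empty file, one "foo/" key.
listing = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
    <Contents>
      <Key>10-34770/pe9w-x904/SCoData_combined_v1_2020-07_README.txt</Key>
      <Size>10759</Size>
    </Contents>
    <Contents>
      <Key>10-34770/pe9w-x904/SCoData_combined_v1_2020-07_README.empty.txt</Key>
      <Size>0</Size>
    </Contents>
    <Contents>
      <Key>10-34770/pe9w-x904/foo/</Key>
      <Size>0</Size>
    </Contents>
  </ListBucketResult>
XML

# Returns the keys of all <Contents> entries whose <Size> is zero.
# Elements are matched by local name, so the default namespace is ignored.
def zero_byte_keys(xml)
  doc = REXML::Document.new(xml)
  contents = doc.root.elements.to_a.select { |el| el.name == "Contents" }
  contents.each_with_object([]) do |entry, keys|
    fields = entry.elements.to_a.to_h { |el| [el.name, el.text] }
    keys << fields["Key"] if fields["Size"].to_i.zero?
  end
end

zero_keys = zero_byte_keys(listing)
# Keys ending in "/" are treated here as directory placeholders (assumption).
placeholders, empty_files = zero_keys.partition { |key| key.end_with?("/") }
```

Under these assumptions, a spec built on the fixture could check that only the `.empty.txt` key produces a warning, or that both zero-byte keys do, depending on the behavior the service intends.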