Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File to upload exceeds max size #107

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

say-yawn
Copy link
Contributor

@say-yawn say-yawn commented Nov 11, 2019

About this PR

As a fix to #104 gracefully exit if the file is bigger than the max size.

Acceptance Criteria

  • If the file to upload exceeds the maximum file size that is downloadable by Firefox gracefully exit
  • Shavar list creation should continue to upload the rest of the lists

Practical test

Test shavar-list-creationg gracefully exits if the file is bigger than the max size on Firefox

  1. Go to shavar_list_creation.ini and add the section:
[google-whitelist]
entity_url=https://raw.githubusercontent.com/mozilla-services/shavar-prod-lists/create-seperate-entitylist-for-google/google-entitylist.json
output=google-trackwhite-digest256
s3_key=entity/google-trackwhite-digest256
  1. Go to constants.py and edit the WHITELIST_SECTIONS to include "google-whitelist"
  2. Run shavar-list-creation by executing the command python ./lists2safebrowsing.py
  3. Check that you get the following message at least once
    !!! google-trackwhite-digest256 NOT UPLOADED. File size is 1434496 which exceeded max size 1048576 !!!
  4. Check that the google-trackwhite-digest256 is NOT uploaded to S3

Dependent on chunk size increase on Firefox

As a fix to #104 gracefully exit if the file is bigger than the max size
@englehardt englehardt self-requested a review December 11, 2019 18:02
if file_size >= MAX_CHUNK_SIZE:
exceeded_max_size_msg = ('!!! {} NOT UPLOADED. '
'File size is {} which exceeded max size {} !!!')
print(exceeded_max_size_msg.format(
Copy link
Contributor

@englehardt englehardt Dec 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to use print instead of throwing an error? With print we will still upload lists which don't exceed the limit, but what if we may related changes across multiple lists. Let's say we add new domains to the blocklist which should have corresponding entitylist updates, but the entity list exceeds the chunksize. We'd upload the new blocklist but not the new entitylist, which might lead to breakage. Thus. it seems safer to throw an error. Are there any downsides to throwing an error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No downsides I could think of. Though now I think about this, we should probably make the size be dependent on the version as well right? Since Nightly (73) will have the increased chunk size while the Beta and Release will still have the older max chunk size.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's not too much work that would be safest.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants