-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: perform a single hash validation of wheel files #8027
feat: perform a single hash validation of wheel files #8027
Conversation
I do not like the amount of pypa/installer that is being duplicated by poetry already, and this isn't helping! Are you sure you wouldn't prefer to fix up your MR in that project - I see that the pipeline is currently failing - and wait for it to be released? Then just use pypa/installer instead of re-implementing it? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have copied my comments from the other PR so they will not be forgotten.
I agree that we should prefer a solution in installer
if possible. Another approach to avoid calculating the hashes twice might be to add a "use hashes from RECORD
file" parameter to the install()
function that can be set if validation was successful.
By the way, what's the performance penalty for calculating the hashes twice for a big wheel like torch?
computed_record = next( | ||
record for _, record in records if record.path == item.filename | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't always work. I tested a large project and got a StopIteration
for some packages.
You can reproduce it by running poetry add fonttools
for example.
What's even worth: The package is installed, but the RECORD
file is not written.
poetry run pip uninstall fonttools
now throws ERROR: Cannot uninstall fonttools 4.39.4, RECORD file not found. Hint: The package was installed by Poetry 1.6.0.dev0.
and poetry run pip install fonttools
gives Requirement already satisfied
.
You can only "manually" uninstall the package.
self._validate_hash_and_size(records) | ||
return super().finalize_installation(scheme, record_file_path, records) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self._validate_hash_and_size(records) | |
return super().finalize_installation(scheme, record_file_path, records) | |
try: | |
self._validate_hash_and_size(records) | |
finally: | |
return super().finalize_installation(scheme, record_file_path, records) |
I wonder if we should do something like that to make sure the RECORD
file is written despite of bugs in _validate_hash_and_size()
. However, when I tried this, the exception was swallowed, which not good either. Thus, we would have to put more effort here...
It's almost double. Here's the benchmark while installing https://download.pytorch.org/whl/cu118/torch-2.0.0%2Bcu118-cp39-cp39-linux_x86_64.whl (2.1 GB) performed on Poetry 1.5.1 and pypa/installer@5f281b8 (ie with the hash in-memory computation fix)
cc @Secrus as now maintainer of I agree this is not the best place to fix it, it was a quick hack in case we wanted to keep the feature in 1.5.1. I'll open an issue on |
This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Re-introduce wheel content file validation in a more performant way, aka computing (and validating) hash once while installing the wheel (and verifying the wheel at the end of the installation process).
Related to #7983 and #7987