Don't replace underscores with hyphens in distribution name #921
Open
navrkald wants to merge 3 commits into pypa:main from navrkald:bugfix/dont_replace_underscores_with_hyphens_in_distribution_name
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Fixed replacing underscores by hyphens in package names. |
It's more that this comes from elsewhere and is intended to match behavior with PyPI (the server we care most about). This ostensibly breaks that compatibility.
I disagree. Please check https://packaging.python.org/en/latest/guides/distributing-packages-using-setuptools/#name
So underscores and hyphens shouldn't matter in PyPI, and it should be backward compatible.
Also, this PR is just a correct implementation of PEP 508.
I think there has been some recent work on this, and possibly some changes. I want to dig into that before reviewing. That said, the linked guide is outdated; https://setuptools.pypa.io/en/stable/ is the authoritative documentation for setuptools.
After some initial research, I don't know what's the right thing to do, and it seems like there's ongoing discussion about package/distribution name normalization:
`_safe_name`: Namespace packages with dots in them fail to upload to PyPI
(cc @pfmoore @dstufft @ewdurbin as contributors to the above discussions)
It seems like `_safe_name` is attempting to implement normalization defined in PEP 503.
For additional context, the issue that led to this PR states:
I think the important question here is: if Twine doesn't convert underscores to hyphens, will that be compatible with PyPI?
Additionally, it makes me a little nervous that Twine has its own normalization function. Has this logic been centralized anywhere?
I can't answer that, I'm afraid. But I do think that PyPI and twine should agree on either allowing unmodified distribution names (as defined in the project metadata) or using the same normalisation (and IMO PEP 503 is the main contender for a "universal" normalisation rule these days). On the other hand, PyPI should have access to the un-normalised name, so it can respect the project author's wishes in the UI and similar places. I don't have context for this issue, but if the field we're discussing here is the only place that the project name is passed to PyPI, I'd say that means that it should be passed unmodified.
My impression is that PyPI has much more difficult compatibility considerations, though, so for practical reasons it might be necessary to "follow what PyPI allows" for the short term. I'd view that as a temporary measure, though.
Packaging has `packaging.utils.canonicalize_name` which (as far as I know) implements PEP 503 normalisation. But the PEP 503 rule is designed to be easy to copy and paste, so I'd consider replicating it to avoid a dependency on `packaging` as perfectly acceptable.
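For reference, the PEP 503 rule mentioned above is small enough to replicate. A minimal sketch follows; the helper names `normalize_pep503` and `same_project` are made up for illustration and are not part of Twine or `packaging`. It shows how a server could compare names in normalized form while the client still sends the name unmodified.

```python
import re


def normalize_pep503(name: str) -> str:
    """Apply the PEP 503 rule: collapse runs of '-', '_' and '.' into a
    single '-', then lowercase the result."""
    return re.sub(r"[-_.]+", "-", name).lower()


def same_project(a: str, b: str) -> bool:
    """Hypothetical helper: two names refer to the same project if their
    normalized forms match."""
    return normalize_pep503(a) == normalize_pep503(b)


if __name__ == "__main__":
    # The metadata name is left untouched; only the comparison is normalized.
    print(normalize_pep503("my_package"))           # my-package
    print(normalize_pep503("My.Package"))           # my-package
    print(same_project("my_package", "My-Package"))  # True
```

Under this rule `my_package` and `my-package` are equivalent for lookup purposes, which is why sending the unmodified name should not, in principle, create a second project.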
Also, I agree that Twine shouldn't normalize filenames (which it doesn't currently do).
Ahh, I see, I missed it switched back to effectively using `safe_name()`.
So, the problem is still roughly the same, just being caused by the fact that `safe_name` normalizes `_` to `-`.
You should not have to do any normalization of the name for PyPI.
Where PyPI wants things to be normalized it normalizes it itself, where it doesn't, it doesn't.
I think there was a time when that wasn't the case, and you had to do some normalization. Unfortunately, #70 and the linked issue #47 don't provide enough information for me to remember specifics.
If it were me, I'd be inclined to try removing all normalization from the `name` field and see if PyPI complains. I'm pretty sure it won't [1], but if it does then that would be a PyPI bug IMO.
Footnotes
[1] Assuming that the unnormalized name of the project matches the filename.
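To illustrate the behavior described above, here is a rough sketch of a `safe_name`-style substitution. It assumes the function replaces every run of characters other than ASCII letters, digits, and `.` with a single `-`; the name `safe_name_sketch` is hypothetical, and this is not presented as Twine's actual code.

```python
import re


def safe_name_sketch(name: str) -> str:
    """Assumed behavior: any run of characters outside A-Z, a-z, 0-9 and '.'
    is replaced with a single '-'. Case and dots are left alone."""
    return re.sub(r"[^A-Za-z0-9.]+", "-", name)


if __name__ == "__main__":
    # An underscore in the metadata name becomes a hyphen in the sent name.
    print(safe_name_sketch("my_package"))      # my-package
    # Dots survive, so dotted namespace-style names are unchanged.
    print(safe_name_sketch("zope.interface"))  # zope.interface
```

With a substitution like this, the `name` field sent to the server no longer matches the name declared in the project metadata whenever that name contains an underscore, which is the mismatch this PR is about.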
While I agree that twine should upload the file with the name as given (i.e., not normalise) I'm a bit concerned about the idea that neither twine nor the index server is validating that the files being uploaded are correct - where "correct" implies "normalised". But I guess twine can take a "trust the user" attitude, and servers might be prepared to accept unnormalised names. I'm not sure how the rest of the ecosystem would handle unnormalised wheel names (sdist names are still a bit of a mess so "legacy" forgives a lot there). I'm going to say that's not my problem though 🙂
It's a hair-splitting difference, but twine takes a "trust the build system" approach, followed by trusting the repository server to return a semi-helpful error response that we can pass on to the user.
Genuinely, I'm not certain twine needs to exist. It provides very little benefit at this point. Most of what it does for upload can now be achieved with the standard library (historically not the case) on most (if not all) of the supported versions. Most of the checks it has are simplistic and could be done by a build system to verify its output during a finalisation stage (where it would provide more value and immediate feedback to the user closer to where the issue is).
There was a dream of twine becoming a build-then-upload tool and way of hiding every build backend but neither Brian nor I have made that happen. Speaking for myself, I certainly don't have the time or desire for that. With the number of build backends, it would likely be a nightmare to paper over all their APIs (assuming they're supporting an API for us to use) and we'd have to work with whatever existing config file they're using, section, etc. Just doesn't seem worth the effort.
There's a standard API for it, so you wouldn't need to paper over anything, just use the standard APIs... that being said, there's already a project doing that (https://github.com/pypa/build) now so adding another one seems silly.