UX for constraining build-time and runtime dependencies #29
I have let this sink in a bit more and thought about it. The only slight worry I have with this approach is that the wheels will not necessarily be reproducible. I don't think it's a big issue, as anyone serious about reproducible wheels should be reconstructing the original build environment anyway, and this won't be an issue there. Still, I feel very slightly uneasy about the built artifacts being highly dependent on the environment that ends up being resolved by PEP 517 builders. Just something worth noting IMO. Another thing: I think we should be adding a local tag to the version (eg. ...). The main design decision here would be how to specify the locked version; I would really like to gather some feedback from other people here. A similar problem, and its solution, can be found at https://www.python.org/dev/peps/pep-0440/#compatible-release. My initial proposal would be:
```toml
[tool.mesonpy.pin-build-dependencies]
numpy = '*.*.*'
```

And the tricky part, the patterns: all the examples would pin the first three version numbers (eg. for ...). Anyway, none of this is trivial to implement, and there isn't much difference in difficulty either; I think regex might be a bit easier? So, I think we should go with what's easier for users, unless that proves problematic when implementing. That said, it is worth pointing out that if we are constraining the dependencies on the fly, we need to put ...
```toml
[project]
...
dynamic = [
    'dependencies',
]

[tool.mesonpy]
dependencies = [
    'numpy~=3.2.1',
]

[tool.mesonpy.pin-build-dependencies]
numpy = '*.*.*'
```
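As an illustration of the pattern idea above, here is a minimal sketch of how a `*.*.*` pattern could be expanded against the version found in the build environment (the `expand_pin` helper and the `==` pinning semantics are my assumptions, not part of the proposal):

```python
from packaging.version import Version

def expand_pin(name: str, pattern: str, installed: str) -> str:
    """Hypothetical: each '*' consumes one release component of the
    version that was present in the build environment."""
    parts = Version(installed).release
    pinned = '.'.join(str(c) for c in parts[:pattern.count('*')])
    return f'{name}=={pinned}'

# With numpy 1.24.2 installed at build time:
print(expand_pin('numpy', '*.*.*', '1.24.2'))  # numpy==1.24.2
print(expand_pin('numpy', '*.*', '1.24.2'))    # numpy==1.24
```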
Agreed, need to have a reproducible environment for building. I think this topic is something we're going to have to teach people a bit about anyway (I've started some write-ups). It is absolutely untrue that PEP 517 will give you reproducible builds when you have compiled extensions. The illusion that they are reproducible is more harmful than helpful imho.
Hmm, I'm not sure I like messing with version strings; that seems unjustified (and what do you get when you have, say, five such dependencies?). A project like SciPy will already have a complex version string - from the latest nightly: scipy-...
Interesting - yes, there's a lot of design flexibility here. I have no clear opinion yet. Note that conda-forge implements this in a reasonable form, looking at its ...
Ah, good point. That may be a problem - aren't other tools supposed to be reading that ...?
I looked at this again, and want to point out that ... Your example ...
(this is a drive-by comment, sorry!) Regarding 4, a couple of ideas... One is to go the route of mirroring the existing style from conda-build, where you are "calling" a function.
Alternatively, lean into structured TOML data:
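A hypothetical shape for such structured data might be the following (the table layout and key names are purely illustrative, not an agreed-upon spec):

```toml
[tool.mesonpy.pin-build-dependencies]
# Hypothetical structured form: name the strategy instead of encoding
# it in a pattern string.
numpy = { strategy = 'compatible', level = 'minor' }  # e.g. -> numpy~=1.24
torch = { strategy = 'exact' }                        # e.g. -> torch==2.0.1
```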
Well, depends on how/when you think those two should be composed. Basically, I'm imagining something like...
(a) If it's OK to have an environment variable that is always set during release... that'll let you have dynamic dependencies via a ...

(b) It's also reasonable to skip dynamic build dependencies and instead rely upon the build "flow" difference between ... This, of course, means that you're assuming that all sdist builds are eventually-public, and I reckon that's reasonable.

IIUC, a strategy for pinning runtime numpy based on the build environment, plus a way to ensure oldest-supported-numpy is added as a build dependency for release builds, would be sufficient to get the behaviours you want?

FWIW, you could even squish all of this into a single table...
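Going back to idea (a), a minimal sketch of what an environment-variable gate could look like; the `RELEASE_BUILD` variable name and the helper are hypothetical, not actual meson-python behavior:

```python
import os

def build_requirements(loose: list[str], pinned: list[str]) -> list[str]:
    # Hypothetical gate: release CI exports RELEASE_BUILD=1, so release
    # builds get the pinned set while everyday dev builds stay loose.
    return pinned if os.environ.get('RELEASE_BUILD') else loose

print(build_requirements(['numpy>=1.19.5'], ['oldest-supported-numpy']))
```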
What you're describing here feels to me like a poor man's substitute for an SBOM for describing the build environment. There's no need to try and squish that into the local version label IMO. It might be reasonable to just dump the relevant build environment information into a file in the sdist/wheel, and use that... instead of trying to squish things into the local version label. And, if you want the protections that a local version label provides against a PyPI upload, put a sentinel value in there when you generate the non-release variants.
Thanks for the input @pradyunsg!
TIL there's such a thing. I don't have a real preference here - both syntax choices work, and are about equally readable. I'm not quite sure what
I'd prefer to avoid that, because it's going to be one of those things people forget whenever they build outside of the release CI jobs.
I like this option better. A little heretical for a PyPA/Pip dev to suggest it perhaps, but hey, I'll take it :)
Yes, I believe it is. That said, it's an additional improvement over having to make manual changes in release branches, and it's currently of secondary importance compared to
We partly do that anyway, for example the detected BLAS/LAPACK version gets written to a generated ...
I guess I used constraint instead of ...
Ah, this makes sense now, thanks. And it's another choice to make. We don't just need
or it could be done separately like I originally suggested:
I think in the end all of these are roughly equivalent; it's mostly a matter of preferred syntax (what's easiest to read & parse). I suspect structured TOML is best from a parsing perspective.
Okay, after spending quite some time with it, I think a viable option would be to follow the normal requirement specification format, but using `@` placeholders for the build-time version components:

```toml
[tool.mesonpy.binary-runtime-constrains]
numpy = '>=@.@.@'    # for 1.20.3, it would add >=1.20.3 to the requirements
pythran = '~=@.@.@'  # also equivalent to '>=@.@.@,==@.@.*'; for 0.10.0 it would add >=0.10.0,==0.10.*
torch = { strategy='exact' }  # this would match the exact same version
```

Can y'all try to break this approach?
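To make that concrete, here is a rough sketch of how an `@` template could be expanded against the version present at build time (the `expand_at_template` helper is my own illustration, not part of the proposal):

```python
from packaging.version import Version

def expand_at_template(template: str, built_against: str) -> str:
    """Hypothetical: substitute successive '@' placeholders with the
    release components of the version used during the build."""
    parts = iter(Version(built_against).release)
    return ''.join(str(next(parts)) if ch == '@' else ch for ch in template)

print(expand_at_template('>=@.@.@', '1.20.3'))  # >=1.20.3
print(expand_at_template('~=@.@', '0.10.0'))    # ~=0.10
```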
I've tried to understand what exactly the problem we are trying to solve is; I've read all the comments, but I'm not completely sure of the problem formulation. A way to derive the runtime dependencies from the versions of the dependencies at the time of compilation is required. However, I'm not sure I understand all the details.

Let's assume we have a package A that exports some C API. Package A follows some rules for API and ABI compatibility. Let's say that minor versions (...)

```toml
[build-system]
requires = [
    'A ~= X'
]
```

If at the moment of compilation package A is available in version X.Y.Z, the resulting binaries of package B are compatible only with versions X.Y.*, thus we would like to have the equivalent of

```toml
[project]
dependencies = [
    'A ~= X.Y'
]
```

encoded in the wheels. This seems reasonable. However, what if we want to release a version of B compatible with A version X.Y+1? We need to recompile it using this version of A. However, because the dependency versions are not recorded in the wheel tags, the wheels for B compatible with A versions X.Y and X.Y+1 are identical. The only way I see to solve this is to have a specific version of B compatible with a specific ABI-compatible version of A. This requires having the same version specification in ...
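As a concrete instance of the narrowing described above (version numbers invented for illustration): if A is at 3.4.2 when B is built under an `A ~= X` style build requirement, the wheel would end up carrying:

```toml
[project]
dependencies = [
    'A ~= 3.4',  # derived at build time from the installed A == 3.4.2
]
```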
No, it'd have to be
Again, it'd need to be
Yes, you'd need to make sure you build in an environment with
The wheel names are identical if we don't add an extra identifier, but the wheels themselves would be different, and contain different metadata. Installers will refuse to install a ...
Yes, thank you. The syntax of the
Sure, but what do you do with a wheel with the same file name but different metadata? You cannot upload it to PyPI, and I think any other wheel archive would reasonably not allow replacing an existing version with a different binary. If you cannot redistribute the wheel, is this relevant only for local installations? Namely, we need to make sure that when A version X.Y+1 is installed, a compatible version of B is also installed.
You can add a local version identifier to differentiate it; that will give you a different wheel name, allowing you to upload it. Uploading different wheels with the same name is also something PyPI could potentially support in the future. PyPI does not support it yet, but PEP 658 provides an API for installers to look directly at the metadata before downloading the wheel. Right now, installers will just download wheels until they find a compatible one.
Quick first question: what happens with version specifiers like ...?
Nothing. I haven't decided what to do when you try to match one of these modifiers and one is not present, but my initial instinct would be to error out.
Distributing wheels with the same version but different content is IMHO a very bad idea. You could add an identifier to the file name but not to the version stored in the wheel metadata, but that seems an even worse idea to me. However, supporting locally built wheels is necessary. What kind of operations on the version identifiers do we need to support? Which formats of version identifiers do we need to support? I think we need to support all version identifier formats allowed by PEP 440, and we need to be able to replace the build version with an arbitrary string, with the possibility to drop version identifier components at the end and replace them with ...
The problem with this is that if you have version ...
I disagree; there's a point where you simply cannot provide all the compatibility details in the name. This is already true btw with
You can add an identifier to the version, the local identifier field (eg.
I don't know. Something that's good enough for most people, and provides an escape hatch for the ones it isn't.
Yes, and I think that's the main issue with this approach, but I think we'll have issues with these kinds of matching semantics with other approaches too. There are a couple of things we can do to fix this; I am struggling to decide what I think would be better, but perhaps we should just backfill the pre, post, and dev release fields based on the operator?
Sure, but if you are touching it up outside the build tool, then you can just as well adjust the dependencies manually, and we don't need to be providing infrastructure for this. And if you are updating the
The solution is to provide a slightly more complex language. A first attempt could be something based on the Python string formatting language, using the fields of the version object:

```toml
[tool.meson-python.dynamic]
dependencies = [
    'numpy >= {v}',
    'foobar ~= {v.major}.{v.minor}',
    'insane == {v.major}.*.{v.micro}',
]
```

This is also easier to implement.
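A minimal sketch of what this substitution could amount to, assuming (as the examples suggest) that `{v}` is bound to a `packaging.version.Version` for the dependency found at build time:

```pycon
>>> from packaging.version import Version
>>> v = Version('1.24.2')  # hypothetical numpy version in the build env
>>> ['numpy >= {v}'.format(v=v), 'foobar ~= {v.major}.{v.minor}'.format(v=v)]
['numpy >= 1.24.2', 'foobar ~= 1.24']
```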
You only used
I understand the issue. I don't understand what the proper solution would be: independent of the way in which you arrive at it, what would the requirement string need to look like in this case?
Well, the main issue with that now is that if the upstream releases a new version with a different number of release parts, our constraint significantly changes its meaning:

```pycon
>>> from packaging.specifiers import Specifier
>>> from packaging.version import Version
>>> def match(version, template, check):
...     v, c = Version(version), Version(check)
...     s = Specifier(template.format(v=v))
...     print(f'Does `{c}` match `{s}` (generated from `{template}`)? {c in s}')
...
>>> match('0.10.0', '~={v}', '0.10.1')
Does `0.10.1` match `~=0.10.0` (generated from `~={v}`)? True
>>> match('0.10.0.1', '~={v}', '0.10.1')
Does `0.10.1` match `~=0.10.0.1` (generated from `~={v}`)? False
```

Remember that ...
As we proposed, if the dependency is on a pre or dev version, we simply do not generate any constraint. My initial proposal was pinning it, but Ralf was worried that that would break development workflows, so we decided to just not generate any constraint. This is the behavior right now with any build backend anyway, so it shouldn't be a big issue.
Sure, but that's what the user asked for. How would you fix that using any other form of templating? You cannot know what the user really meant. I understand there are limitations, but I don't see how you lift these limitations with a different form of templating. The only thing you can do is adapt the expansion of the templating to work in some special cases, but it would necessarily break in others.
See the reply I just posted.
That doesn't say what the ... Also, you can turn that question around: any project may have dev or pre versions, now or in the future. How can you then use anything beyond ...?
It applies to the last component.
Use it to match what?
This is by design: dev releases aren't meant to be compared, and pre releases are only supposed to match when they're directly compared against other pre releases. Otherwise, IMO we should cut our losses and just not generate constraints for dev or pre releases (we would output a message mentioning this to the user, though).
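The matching behavior being referenced here can be checked with `packaging` (a quick illustration of PEP 440's pre-release handling, not code from the thread):

```pycon
>>> from packaging.specifiers import SpecifierSet
>>> '1.0rc1' in SpecifierSet('>=0.9')     # pre-releases excluded by default
False
>>> '1.0rc1' in SpecifierSet('>=1.0rc0')  # allowed when the specifier names one
True
```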
See https://numpy.org/devdocs/dev/depending_on_numpy.html#adding-a-dependency-on-numpy for NumPy. SciPy does have a C API, mainly ... There are a lot more projects to which this applies, I'm sure - pretty much anything with a C or C++ API, I'd think:
I'm sure there's more.
I think this is probably the only sane thing to do, given the constraints. I thought you had an issue with the templating form suggested, not with generating or not generating requirements. A possible addition to consider is the use of markers to constrain the generation of the pins to some conditions. For example:

```toml
build-time-pins = [
    'pythran ~= {v}; is_stablerelease',  # or some alternative spelling of this
]
```

where a property like

```python
@property
def is_stablerelease(self):
    # a final release: neither a dev release nor a pre-release
    return not self.is_devrelease and not self.is_prerelease
```

could be added.
I think it should be invisible/automatic rather than done via markers. This is pretty much always needed, and it's too easy to forget.
Having the requirement emitted would make sense if the requirement is ...
For which resolvers? I had a look in the pip source code. AFAICT, pre-releases are always allowed to match in pip.
Then so it would if it was
All of them; this is the default behavior mandated by the PEP. From https://peps.python.org/pep-0440/#handling-of-pre-releases:
But I realize maybe I wasn't very clear: pre-releases are accepted if they are already installed, but by default they are not allowed when installing. Doing anything other than pinning isn't very meaningful here, as you likely won't want the same constraint for pre-releases anyway. I don't think this is a big deal, as people should not be publishing wheels built against pre-releases.
Well, both, kinda 😅. They are intertwined: the constraint generation for dev and pre releases was definitely the main point, but I also have some thoughts about your syntax which I did not go into, to avoid bikeshedding, as it is tied to the first point. From my reply above:
I think we are now all in agreement that skipping constraints on dev and pre releases is at least a reasonable way to fix the issues regarding constraints against dev and pre releases. Adopting that into our design means we now only need to match the release section of the version, which can also simplify the required syntax. I don't want to come off harsh here, but considering that, I think the
I am happy to discuss alternative syntax; my main point here really is that we can avoid ...
@henryiii, I know this is a bit involved, so it's totally fine if you don't want to or can't get into it right now, but this is something we would maybe like to turn into a PEP later, so it'd be good to hear your feedback as a scikit-build maintainer. Having to move the dependencies field to a tool-specific field sucks, so a PEP standardizing this could help with that. IMO it could also make sense to relax PEP 621's requirement, to allow this case.
I would like to add something here:
Also, I opened https://discuss.python.org/t/relaxing-or-clarifying-pep-621-requirements-regarding-dynamic-dependencies/24752 regarding allowing us to keep ...
Thanks! I commented there. This will be important to settle; the PEPs here are very unhelpful. What I had in mind was simply taking the existing ... We could still consider ignoring the PEP's assumption/requirement if we can't think of any practical downside of doing so. This is already being done today by any ad-hoc approach to applying these constraints, and I'm not aware of problems with that.
I've tested @dnicolodi's implementation in gh-319 with SciPy. It worked smoothly and built a wheel with the below diff on SciPy's `pyproject.toml`:

```diff
diff --git a/pyproject.toml b/pyproject.toml
index 8f776b7d3..d828076cb 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -63,11 +63,6 @@ maintainers = [
 # release branches, see:
 # https://scipy.github.io/devdocs/dev/core-dev/index.html#version-ranges-for-numpy-and-other-dependencies
 requires-python = ">=3.9"
-dependencies = [
-    # TODO: update to "pin-compatible" once possible, see
-    # https://github.com/FFY00/meson-python/issues/29
-    "numpy>=1.21.6",
-]
 readme = "README.rst"
 classifiers = [
     "Development Status :: 5 - Production/Stable",
@@ -88,7 +83,7 @@ classifiers = [
     "Operating System :: Unix",
     "Operating System :: MacOS",
 ]
-dynamic = ['version']
+dynamic = ['version', 'dependencies']

 [project.optional-dependencies]
 test = [
@@ -131,6 +126,15 @@ source = "https://github.com/scipy/scipy"
 download = "https://github.com/scipy/scipy/releases"
 tracker = "https://github.com/scipy/scipy/issues"

+[tool.meson-python]
+# base dependencies, used for the sdist
+dependencies = ["numpy>=1.21.6"]
+
+# additional requirements based on the versions of the dependencies
+# used during the build of the wheels, used for the wheels
+build-time-pins = ['numpy >= {v.major}.{v.minor}']
+
```

Unfortunately I do have a semantics question to raise, so let's deal with that before discussing syntax. The metadata generated is: ...
Which makes sense given the implementation... but isn't quite what we were aiming for, I suspect (right?). Ideally we want to replace ... I can also see an upside though - it makes quite explicit what is happening there, and there won't be issues if the installed version is too old (EDIT: ...). Thoughts?
PEP 508 is maybe not the most informative; PEP 440 is a bit nicer to read. Anyhow, my reading of these PEPs is that the comma separating expressions in version specifiers is to be read like an "AND", see https://peps.python.org/pep-0440/#version-specifiers. Therefore, things like ...

The reason why I decided to implement things like this is simplicity: the static dependency specifiers can be arbitrarily complex, and replacing only one part of them is not trivial. It would require doing some algebra with version numbers, and I would happily avoid that unless strictly required. I cannot guarantee that there is no tool out there with a bug that prevents it from interpreting these requirements in the right way, but that would indeed be a bug, and I expect it to be very unlikely.
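The "comma means AND" reading is easy to verify with `packaging` (an illustration only; the `>=1.24` bound stands in for a build-time-derived pin):

```pycon
>>> from packaging.specifiers import SpecifierSet
>>> s = SpecifierSet('>=1.21.6,>=1.24')  # sdist bound plus build-time bound
>>> '1.22.0' in s
False
>>> '1.24.1' in s
True
```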
I'm not sure I understand. Why would the first form error out? A 1.21.6 version would satisfy both inequalities. I don't see why there should be a preferred ordering for the inequalities. Is there a typo somewhere?
Makes sense - I agree with the choice. Any potential surprise can be addressed with docs.
No, there's no typo or ordering. What I meant there was that the sdist req is ...
So okay, we're all good semantics-wise then, I think. Before I try to go through this whole thread again regarding syntax, or test/review in more detail, let me check my understanding of the state of things:
I don't find it intuitive. Actually, I still haven't understood how you say "the complete version identifier of the installed package" with this syntax: it cannot be ...
API? The only thing the user needs to know is that there is an object with some fields. Also the
There are infinite ways for users to screw up metadata syntax, and even more ways in which version requirements can be wrong. I don't think these are relevant concerns, especially considering that we are discussing an advanced feature and not something used by inexperienced programmers.
The problem with thread safety is limited to the
Previously in the discussion you suggested allowing users to provide, via an entry point, a function to call to generate the build-time version requirements. Now you are concerned that users may provide a nonsensical string. How do the two things go together? I think that a user-provided function has many more ways to do the wrong thing than the very simple substitution function I'm proposing. Also, how is it more difficult to keep users from using nonsensical strings with string formatting than with the proposed ...

Your objections to the string formatting approach are essentially that the user may use it incorrectly. I don't think that this should be the main concern. Eliminating this concern requires writing a non-trivial parser for the ... I find the ... If you don't want to use simple string formatting, we could consider the use of template strings: https://docs.python.org/3/library/string.html#template-strings
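For reference, a quick sketch of what the template-strings variant could look like. Note that `string.Template` placeholders cannot contain dots, so flat names like `$major` (invented here) would be needed instead of `{v.major}`:

```python
from string import Template

from packaging.version import Version

v = Version('1.24.2')  # hypothetical build-time version
req = Template('numpy >= $major.$minor').substitute(major=v.major, minor=v.minor)
print(req)  # numpy >= 1.24
```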
@dnicolodi @FFY00, you both presented your arguments, and I think the differences are fairly minor compared to the similarities and the benefits of getting this merged. You clearly each have your preferences, but in the end we have to go with a single syntax. How about we let other maintainers choose their preferred syntax? I can spend a bit of time summarizing. And I also think @henryiii's opinion is important if ...
Is this still something being worked on? Trying to figure out what stuff I can trim from my "keep an eye out for this" list. :)
@pradyunsg There is a prototype in #319, but I have the feeling there isn't yet consensus on what the syntax for this feature should look like.
I'm still interested in this (but, long TODO list, and this isn't quite near the top). Also, the PEP that @henryiii posted a draft version of on Discourse, which included "dependency narrowing", would be useful to have, because that would make use of this feature quite a bit easier (no need to move anything to a meson-python tool-specific version in ...).
Since it seems hard to come up with a syntax that covers all edge cases, what about the following very general approach, inspired by setuptools' dynamic metadata? From a user's perspective, it would look something like:

```toml
[project]
dynamic = [
    "dependencies",
    "version",
]

[tool.meson-python.dynamic]
dependencies = { attr = "myapp.helper.dependencies" }
version = { attr = "myapp.helper.version" }
```

and then in ...
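Under this convention, `myapp/helper.py` would then simply expose plain attributes for the backend to import at build time; the module path and contents below are invented for illustration:

```python
# myapp/helper.py -- module referenced by the `attr` keys above (hypothetical).
import numpy

version = '1.2.3'
dependencies = [
    # derived from the environment the wheel is being built in
    f'numpy >= {numpy.__version__}',
]
```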
@tobiasdiez Please note that the ...
There is a practical UX problem with `pyproject.toml` that we need to solve. A short description would be:

- every package using the `numpy` C API has to deal with API and ABI compatibility (see the numpy docs on that for details);
- during development: `numpy >= 1.19.5` (both a build and runtime dep);
- for releases: `oldest-supported-numpy` (build dep metapackage, contains `==` pins per platform) and `numpy>='buildtime_dep',<1.25.0  # should be N+3`, where `N` is the current minor version.

That is obviously a complex set of dependencies to express in `pyproject.toml`, and we can't have two sets in there at the same time. The solution @FFY00 and I just discussed is:

- on the `main` branch, use the loose development dependencies (`numpy>=1.19.5`);
- add support in `mesonpy` for the various kinds of pinnings people need for releases and API/ABI compatibility.

A rough sketch of how (4) could look: ...

Note that this feature is needed for some other packages than just NumPy too (basically every package offering a C/C++ API has to deal with this), but NumPy alone is important enough - every package containing even a single Cython extension which uses NumPy has to deal with this. However, it's probably still a bit too specific to want native support in `pyproject.toml` via a standard that all tools must implement. Hence the choice to put this in `mesonpy`.