-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove nan_as_null
parameter for DataFrame protocol
#226
Conversation
This could be considered a breaking change - not sure if this can just be merged? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be considered a breaking change - not sure if this can just be merged?
I think it can be, as long as no implementation uses it yet. Which is covered largely by the discussion in gh-125 and in pola-rs/polars#10267 I believe.
However, to make extra sure, let's open PRs that remove the keyword where needed to implementers or ping the relevant authors. So far we covered:
- pandas
- Polars
- PyArrow
In addition, we know of:
- Modin (Cc @anmyachev as author of FEAT-#0000: enable interchange protocol for Series modin-project/modin#6361)
- Vaex (implementation at https://github.com/vaexio/vaex/blob/15245cf4332d4423ac58bd737aee27d911a1b252/packages/vaex-core/vaex/dataframe_protocol.py#L720)
- static-frame (implementation at https://github.com/static-frame/static-frame/blob/159039d4ec05001990393142cca7928ca04a011a/static_frame/core/protocol_dfi.py#L276)
They all have the keyword in their signatures, so even if it raises when a non-default value is passed in, it can't just be removed from the signature of __dataframe__
in the implementations, or it's going to break other libraries calling it with nan_as_null=nan_as_null
.
So perhaps we change the description in the docs here first to "DO NOT USE", then remove it from the consumers calling __dataframe__
, and only later on remove it from the signature of __dataframe__
?
Sounds like the right approach. I just opened a PR with a warning - feel free to edit the text however you fit: |
With gh-228 merged, let's close this PR. Thanks @stinodego! |
We'll still need to remove the parameter once the consumers have been updated, but up to you! |
Yes for sure - we can reopen this one later. It's just easier to not have PRs in the queue that aren't mergeable for quite a while. |
Closes #125
We discussed this today, figured I'd make a PR!