io: remove pandas based io for lp writing, use polars instead #366

Open. Wants to merge 1 commit into base: master
1 change: 1 addition & 0 deletions doc/release_notes.rst
@@ -5,6 +5,7 @@ Upcoming Version
 ----------------
 
 * When writing out an LP file, large variables and constraints are now chunked to avoid memory issues. This is especially useful for large models with constraints with many terms. The chunk size can be set with the `slice_size` argument in the `solve` function.
+* To achieve better performance, LP file writing now uses the `polars` package by default. Setting `io_api` to `lp-polars` is therefore deprecated, as the standard `io_api=lp` uses the `polars` package. Users should see no difference from this change other than faster LP file writing. The previous `pandas`-based implementation was removed.
 
 Version 0.3.15
 --------------
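The chunking described in the first release note can be pictured with a small standalone sketch. This is not linopy's implementation: `iter_slices` and `write_chunked` are hypothetical helpers, and only the name `slice_size` is taken from the release note. The sketch assumes the rows to write are already available as a sequence of strings.

```python
import io
from typing import Iterator, Sequence


def iter_slices(n_rows: int, slice_size: int) -> Iterator[slice]:
    """Yield consecutive slices covering range(n_rows) in chunks of slice_size."""
    if slice_size <= 0:
        raise ValueError("slice_size must be positive")
    for start in range(0, n_rows, slice_size):
        yield slice(start, min(start + slice_size, n_rows))


def write_chunked(lines: Sequence[str], out: io.TextIOBase, slice_size: int) -> int:
    """Write lines to out one chunk at a time; return the number of chunks."""
    n_chunks = 0
    for sl in iter_slices(len(lines), slice_size):
        # Only one chunk is joined into a single string at a time,
        # which bounds the size of the temporary string.
        out.write("\n".join(lines[sl]) + "\n")
        n_chunks += 1
    return n_chunks


# Hypothetical constraint rows, written in chunks of two.
buffer = io.StringIO()
constraint_lines = [f"c{i}: +1 x{i} <= 10" for i in range(5)]
n_chunks = write_chunked(constraint_lines, buffer, slice_size=2)
```

A smaller chunk size lowers the size of each temporary string at the cost of more write calls; the written output is the same either way.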
21 changes: 16 additions & 5 deletions linopy/common.py
@@ -340,13 +340,24 @@ def check_has_nulls_polars(df: pl.DataFrame, name: str = "") -> None:
 
     Raises:
     ------
-    ValueError: If the DataFrame contains null values,
-    a ValueError is raised with a message indicating the name of the constraint and the fields containing null values.
+    ValueError: If the DataFrame contains null values or NaN values,
+    a ValueError is raised with a message indicating the name of the constraint and the fields containing nulls.
     """
-    has_nulls = df.select(pl.col("*").is_null().any())
-    null_columns = [col for col in has_nulls.columns if has_nulls[col][0]]
+    null_check = df.select(
+        [
+            (
+                pl.col(col).is_null()
+                | (pl.col(col).is_nan() if dtype == pl.Float64 else False)
+            )
+            .any()
+            .alias(col)
+            for col, dtype in zip(df.columns, df.dtypes)
+        ]
+    )
+
+    null_columns = [col for col in null_check.columns if null_check[col][0]]
     if null_columns:
-        raise ValueError(f"{name} contains nan's in field(s) {null_columns}")
+        raise ValueError(f"{name} contains null/nan values in field(s) {null_columns}")
 
 
 def filter_nulls_polars(df: pl.DataFrame) -> pl.DataFrame:
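The change to `check_has_nulls_polars` rests on the fact that a missing value and a floating-point NaN are different things, so a null-only test can let NaN through. The standalone sketch below mirrors that logic without polars, using plain Python lists as stand-in columns. `columns_with_missing` and `check_no_missing` are hypothetical names, `None` stands in for a null, and the NaN test is applied only to float values, loosely analogous to the dtype guard in the diff.

```python
import math
from typing import Any, List, Mapping, Sequence


def columns_with_missing(columns: Mapping[str, Sequence[Any]]) -> List[str]:
    """Return the names of columns that contain None or a float NaN."""
    bad = []
    for col_name, values in columns.items():
        has_missing = any(
            # None plays the role of a null; NaN is an ordinary float value
            # and needs its own test, applied to floats only.
            v is None or (isinstance(v, float) and math.isnan(v))
            for v in values
        )
        if has_missing:
            bad.append(col_name)
    return bad


def check_no_missing(columns: Mapping[str, Sequence[Any]], name: str = "") -> None:
    """Raise ValueError naming every column that contains None or NaN."""
    bad = columns_with_missing(columns)
    if bad:
        raise ValueError(f"{name} contains null/nan values in field(s) {bad}")


# Hypothetical data: one column with a None, one with a NaN, one clean.
data = {
    "labels": [1, 2, None],
    "coeffs": [1.0, float("nan")],
    "vars": [0, 1],
}
flagged = columns_with_missing(data)
```

Here a check for `None` alone would flag only `labels`; the extra NaN test is what also flags `coeffs`.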