Merge pull request #35 from pydiverse/pre-commit
Improve pre-commit setup
finn-rudolph authored Nov 27, 2024
2 parents 07cbc49 + 185bb5b commit 61949ee
Showing 23 changed files with 1,258 additions and 5,971 deletions.
2 changes: 1 addition & 1 deletion .github/CODEOWNERS
@@ -1 +1 @@
* @pydiverse/code-owners
* @pydiverse/code-owners
2 changes: 1 addition & 1 deletion .github/scripts/docker_compose_ready.sh
@@ -2,7 +2,7 @@
# This script checks if all the services defined in our docker compose file
# are up and running.

set -e
set -e
set -o pipefail

running_services=$(docker compose ps --services --status running)
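The rest of the script is collapsed in this view, so the following is only a minimal Python sketch of the check its header comment describes: compare the services defined in the compose file with the ones reported as running. It assumes the defined list could be obtained with `docker compose config --services`; the helper name `missing_services` and the service names are illustrative, not taken from the script.

```python
from __future__ import annotations


def missing_services(defined_output: str, running_output: str) -> list[str]:
    """Return the defined services that are not reported as running.

    Both arguments are newline-separated service names, as printed by
    `docker compose config --services` (assumed source of the defined list)
    and `docker compose ps --services --status running`.
    """
    defined = {line.strip() for line in defined_output.splitlines() if line.strip()}
    running = {line.strip() for line in running_output.splitlines() if line.strip()}
    return sorted(defined - running)


# Illustrative outputs; the service names are hypothetical.
defined_output = "postgres\nmssql\n"
running_output = "postgres\n"

not_ready = missing_services(defined_output, running_output)
print(not_ready)
```

An empty result would mean every defined service is running; a CI step could poll until that is the case or a timeout expires.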
19 changes: 9 additions & 10 deletions .github/workflows/tests.yml
@@ -11,19 +11,18 @@ on:
jobs:
lint:
name: Pre-commit Checks
timeout-minutes: 30
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- name: Checkout branch
uses: actions/checkout@v4

- name: Setup Pixi
uses: prefix-dev/[email protected]
with:
environments: py310

- name: Linting - Run pre-commit checks
run: pixi run postinstall && pixi run pre-commit run -a --color=always --show-diff-on-failure
uses: actions/checkout@eef61447b9ff4aafe5dcd4e0bbf5d482be7e7871 # v4.2.1
- name: Set up pixi
uses: prefix-dev/setup-pixi@ba3bb36eb2066252b2363392b7739741bb777659 # v0.8.1
- name: Install repository
# needed for generate-col-ops hook
run: pixi run postinstall
- name: pre-commit
run: pixi run pre-commit run -a --color=always --show-diff-on-failure

test:
name: pytest
63 changes: 58 additions & 5 deletions .pre-commit-config.yaml
@@ -1,15 +1,68 @@
exclude: ^.pixi$
repos:
- repo: local
hooks:
# ensure pixi environments are up to date
# workaround for https://github.com/prefix-dev/pixi/issues/1482
- id: pixi-install
name: pixi-install
entry: pixi install -e default
language: system
always_run: true
require_serial: true
pass_filenames: false
- id: generate-col-ops
name: generate-col-ops
language: system
entry: python generate_col_ops.py
entry: pixi run python generate_col_ops.py
types: [python]
pass_filenames: false

- repo: https://github.com/charliermarsh/ruff-pre-commit
rev: v0.5.7
hooks:
# ruff
- id: ruff
name: ruff
entry: pixi run ruff check --fix --exit-non-zero-on-fix --force-exclude
language: system
types_or: [python, pyi]
require_serial: true
- id: ruff-format
name: ruff-format
entry: pixi run ruff format --force-exclude
language: system
types_or: [python, pyi]
require_serial: true
# mypy
# - id: mypy
# name: mypy
# entry: pixi run mypy
# language: system
# types: [python]
# require_serial: true
# taplo
- id: taplo
name: taplo
entry: pixi run taplo format
language: system
types: [toml]
# pre-commit-hooks
- id: trailing-whitespace-fixer
name: trailing-whitespace-fixer
entry: pixi run trailing-whitespace-fixer
language: system
types: [text]
- id: end-of-file-fixer
name: end-of-file-fixer
entry: pixi run end-of-file-fixer
language: system
types: [text]
- id: check-merge-conflict
name: check-merge-conflict
entry: pixi run check-merge-conflict --assume-in-merge
language: system
types: [text]
# typos
- id: typos
name: typos
entry: pixi run typos --force-exclude
language: system
types: [text]
require_serial: true
2 changes: 1 addition & 1 deletion .readthedocs.yaml
@@ -14,4 +14,4 @@ build:
sphinx:
configuration: docs/source/conf.py
formats:
- pdf
- pdf
2 changes: 1 addition & 1 deletion README.md
@@ -49,7 +49,7 @@ pixi run pytest --postgres --mssql
## Testing db2 functionality

For running @pytest.mark.ibm_db2 tests, you need to spin up a docker container without `docker compose` since it needs
the `--priviledged` option which `docker compose` does not offer.
the `--privileged` option which `docker compose` does not offer.

```bash
docker run -h db2server --name db2server --restart=always --detach --privileged=true -p 50000:50000 --env-file docker_db2.env_list -v /Docker:/database ibmcom/db2
2 changes: 1 addition & 1 deletion docs/source/index.md
@@ -28,4 +28,4 @@ reference/api
:hidden:
changelog
license
license
2 changes: 1 addition & 1 deletion docs/source/license.md
@@ -2,4 +2,4 @@

```{literalinclude} ../../LICENSE
:language: none
```
```
7,053 changes: 1,156 additions & 5,897 deletions pixi.lock

Large diffs are not rendered by default.

3 changes: 3 additions & 0 deletions pixi.toml
@@ -17,6 +17,9 @@ hatchling = "*"
[feature.dev.dependencies]
ruff = ">=0.5.6"
pre-commit = ">=3"
pre-commit-hooks = "*"
taplo = "*"
typos = "*"
pixi-pycharm = ">=0.0.6"
pytest = ">=7.1.2"
pytest-xdist = ">=2.5.0"
3 changes: 3 additions & 0 deletions pyproject.toml
@@ -58,3 +58,6 @@ required-imports = ["from __future__ import annotations"]
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.typos.default.extend-words]
nd = "nd"
2 changes: 1 addition & 1 deletion pytest.ini
@@ -9,4 +9,4 @@ markers =
mssql: a test that requires mssql (run docker-compose up)
ibm_db2: a test that requires ibm_db2 (see README.md)
skip_backends
skip_backends
4 changes: 2 additions & 2 deletions src/pydiverse/transform/_internal/backend/polars.py
@@ -132,7 +132,7 @@ def compile_col_expr(expr: ColExpr, name_in_df: dict[UUID, str]) -> pl.Expr:
*[compile_order(order, name_in_df) for order in arrange], strict=True
)

# The following `if` block is absolutely unecessary and just an optimization.
# The following `if` block is absolutely unnecessary and just an optimization.
# Otherwise, `over` would be used for sorting, but we cannot pass descending /
# nulls_last there and the required workaround is probably slower than polars`s
# native `sort_by`.
@@ -167,7 +167,7 @@ def compile_col_expr(expr: ColExpr, name_in_df: dict[UUID, str]) -> pl.Expr:
if partition_by:
# when doing sort_by -> over in polars, for whatever reason the
# `nulls_last` argument is ignored. thus when both a grouping and an
# arrangment are specified, we manually add the descending and
# arrangement are specified, we manually add the descending and
# nulls_last markers to the ordering.
if arrange:
order_by = merge_desc_nulls_last(order_by, descending, nulls_last)
12 changes: 6 additions & 6 deletions src/pydiverse/transform/_internal/backend/sql.py
@@ -682,24 +682,24 @@ def dedup_order_by(
# the user to come up with dummy names that are not required later anymore. It has
# to be done before a join so that all column references in the join subtrees remain
# valid.
def create_aliases(nd: AstNode, num_occurences: dict[str, int]) -> dict[str, int]:
def create_aliases(nd: AstNode, num_occurrences: dict[str, int]) -> dict[str, int]:
if isinstance(nd, verbs.Verb):
num_occurences = create_aliases(nd.child, num_occurences)
num_occurrences = create_aliases(nd.child, num_occurrences)

if isinstance(nd, verbs.Join):
num_occurences = create_aliases(nd.right, num_occurences)
num_occurrences = create_aliases(nd.right, num_occurrences)

elif isinstance(nd, TableImpl):
if cnt := num_occurences.get(nd.table.name):
if cnt := num_occurrences.get(nd.table.name):
nd.table = nd.table.alias(f"{nd.table.name}_{cnt}")
else:
cnt = 0
num_occurences[nd.table.name] = cnt + 1
num_occurrences[nd.table.name] = cnt + 1

else:
raise AssertionError

return num_occurences
return num_occurrences


def get_engine(nd: AstNode) -> sqa.Engine:
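The rename leaves the logic of `create_aliases` unchanged: the first occurrence of a table keeps its name, and every later occurrence gets a numbered alias. A minimal, self-contained sketch of that counting scheme follows. The `Table`, `Verb` and `Join` classes are simplified stand-ins for `TableImpl`, `verbs.Verb` and `verbs.Join` (assumptions for illustration, not the library's classes), and the alias is stored as a plain string instead of calling SQLAlchemy's `alias`.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Table:
    # Stand-in for TableImpl: only the name and the assigned alias are kept.
    name: str
    alias: str | None = None


@dataclass
class Verb:
    child: Table | Verb


@dataclass
class Join(Verb):
    right: Table | Verb


def create_aliases(nd: Table | Verb, num_occurrences: dict[str, int]) -> dict[str, int]:
    if isinstance(nd, Verb):
        num_occurrences = create_aliases(nd.child, num_occurrences)
        if isinstance(nd, Join):
            num_occurrences = create_aliases(nd.right, num_occurrences)
    elif isinstance(nd, Table):
        if cnt := num_occurrences.get(nd.name):
            # Seen before: disambiguate with the number of earlier occurrences.
            nd.alias = f"{nd.name}_{cnt}"
        else:
            cnt = 0
        num_occurrences[nd.name] = cnt + 1
    else:
        raise AssertionError
    return num_occurrences


# A self-join of "t" followed by a join with "u" (hypothetical table names).
left, right, other = Table("t"), Table("t"), Table("u")
tree = Join(child=Join(child=left, right=right), right=other)
counts = create_aliases(tree, {})
print(left.alias, right.alias, other.alias, counts)
```

Because the left subtree is visited first, the leftmost occurrence of a table is the one that keeps its original name.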
2 changes: 1 addition & 1 deletion src/pydiverse/transform/_internal/ops/signature.py
@@ -119,7 +119,7 @@ def best_match(self, sig: Sequence[Dtype]) -> tuple[list[Dtype], Any] | None:
]


# retunrs the index of the signature in `candidates` that matches best
# returns the index of the signature in `candidates` that matches best
def best_signature_match(
sig: Sequence[Dtype], candidates: Sequence[Sequence[Dtype]]
) -> int:
3 changes: 2 additions & 1 deletion src/pydiverse/transform/_internal/tree/col_expr.py
@@ -1114,7 +1114,8 @@ def dtype(self):
self._dtype = copy.copy(
common_ancestors[
signature.best_signature_match(
val_types, [[anc] * len(val_types) for anc in common_ancestors]
val_types,
[[ancestor] * len(val_types) for ancestor in common_ancestors],
)
]
)
2 changes: 2 additions & 0 deletions src/pydiverse/transform/extended.py
@@ -1,3 +1,5 @@
# ruff: noqa: A004

from __future__ import annotations

from ._internal.pipe.functions import (
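The new `# ruff: noqa: A004` line presumably silences ruff's builtin-import-shadowing rule, since this module re-exports verbs such as `filter` whose names collide with Python builtins. The sketch below illustrates the effect for code that star-imports such a name. The `filter` function is a hypothetical stand-in defined in place, not the library's verb, and ruff would report an in-module definition like this under a different rule than A004.

```python
import builtins


def filter(predicate_description: str) -> str:
    # Hypothetical stand-in for an imported verb named `filter`.
    return f"filter verb: {predicate_description}"


# In this module the bare name now refers to the verb, not the builtin.
shadowed = filter is not builtins.filter
verb_result = filter("x > 1")

# The builtin stays reachable through the `builtins` module.
evens = list(builtins.filter(lambda n: n % 2 == 0, range(6)))
print(shadowed, verb_result, evens)
```

The test modules changed in this commit accept that trade-off: after `from pydiverse.transform.extended import *`, a bare `filter` refers to the verb.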
2 changes: 1 addition & 1 deletion tests/test_backend_equivalence/test_dtypes.py
@@ -1,6 +1,6 @@
from __future__ import annotations

from pydiverse.transform._internal.pipe.verbs import alias, filter, inner_join, mutate
from pydiverse.transform.extended import *
from tests.util.assertion import assert_result_equal


6 changes: 1 addition & 5 deletions tests/test_backend_equivalence/test_filter.py
@@ -1,10 +1,6 @@
from __future__ import annotations

from pydiverse.transform import C
from pydiverse.transform._internal.pipe.verbs import (
filter,
mutate,
)
from pydiverse.transform.extended import *
from tests.util import assert_result_equal


18 changes: 4 additions & 14 deletions tests/test_backend_equivalence/test_group_by.py
@@ -2,18 +2,8 @@

import pytest

from pydiverse.transform import C
from pydiverse.transform._internal.pipe import functions
from pydiverse.transform._internal.pipe.verbs import (
arrange,
filter,
group_by,
join,
left_join,
mutate,
select,
ungroup,
)
import pydiverse.transform as pdt
from pydiverse.transform.extended import *
from tests.util import assert_result_equal


@@ -48,12 +38,12 @@ def test_mutate(df3, df4):
>> ungroup()
>> left_join(u, t.col2 == u.col2)
>> mutate(
x=functions.row_number(
x=pdt.row_number(
arrange=[u.col4.nulls_last(), t.col4.nulls_first()],
partition_by=[t.col1],
),
p=t.col1 * u.col4,
y=functions.rank(
y=pdt.rank(
arrange=[(t.col1 * u.col4).nulls_last().nulls_first().nulls_last()]
),
),
6 changes: 1 addition & 5 deletions tests/test_backend_equivalence/test_ops/test_ops_datetime.py
@@ -2,11 +2,7 @@

from datetime import datetime

from pydiverse.transform import C
from pydiverse.transform._internal.pipe.verbs import (
filter,
mutate,
)
from pydiverse.transform.extended import *
from tests.util import assert_result_equal


6 changes: 1 addition & 5 deletions tests/test_backend_equivalence/test_ops/test_ops_string.py
@@ -1,10 +1,6 @@
from __future__ import annotations

from pydiverse.transform import C
from pydiverse.transform._internal.pipe.verbs import (
filter,
mutate,
)
from pydiverse.transform.extended import *
from tests.util import assert_result_equal


13 changes: 1 addition & 12 deletions tests/test_backend_equivalence/test_slice_head.py
@@ -1,18 +1,7 @@
from __future__ import annotations

import pydiverse.transform as pdt
from pydiverse.transform import C
from pydiverse.transform._internal.pipe.verbs import (
alias,
arrange,
filter,
group_by,
left_join,
mutate,
select,
slice_head,
summarize,
)
from pydiverse.transform.extended import *
from tests.util import assert_result_equal

