Document Kedro compatibility with workflow tools like uv, Hatch, PDM, Rye, Poetry #3974

Open
galenseilis opened this issue Jun 30, 2024 · 6 comments
Labels
Component: Documentation 📄 Issue/PR for markdown and API documentation

Comments

@galenseilis

Description

I'm sometimes frustrated when managing project dependencies and virtual environments, especially as project complexity grows. Traditional tools like pip and venv can be cumbersome and lack advanced features for dependency resolution (although pip is better than it used to be), version management, and project configuration. This often leads to conflicts and inefficiencies.

I would like Kedro to support modern package managers such as Hatch, PDM, Rye, or Poetry. These tools offer robust dependency management, streamlined environment setup, and enhanced configuration capabilities that can greatly improve the developer experience and productivity.

While it's possible to manually configure these package managers alongside Kedro, native support would ensure seamless integration and reduce the overhead associated with maintaining separate configurations. This would also help standardize the development workflow across teams.

Although I don't think it is for everyone yet, I have 'really' enjoyed using Rye.

Context

This change is important to me because it simplifies dependency management, reduces configuration overhead, and enhances the overall developer experience. By using modern package managers like Hatch, PDM, Rye, or Poetry with Kedro, I can:

  • Easily create and manage virtual environments (e.g. Rye's Shims are amazing), ensuring consistent project setup across different machines and team members.
  • Automatically resolve and manage dependencies, minimizing conflicts and versioning issues.
  • Utilize advanced configuration options provided by these package managers, leading to more maintainable and scalable project setups. For example, many of these tools support some type of plugin mechanism so that custom CLI commands can be added. I'm not sure if/how that would interact with Kedro's plugin system.

For example, PDM's (and others) built-in support for CLI tools that modify pyproject.toml simplifies dependency declaration and management, while its lockfile ensures reproducibility. This is particularly beneficial in collaborative environments where consistency is crucial.

How it can benefit other users:

  • Other users will benefit from faster setup and less time spent troubleshooting dependency issues.
  • Standardizing on modern package managers promotes consistency and best practices across projects. An initial Kedro setup will be consistent because of Kedro's templating based on Cookiecutter, but after that packaging can drift.
  • Users can take advantage of advanced features like dependency groups, easier publishing, and more powerful environment management. (e.g. see here).

Overall, integrating support for these package managers would align Kedro with modern Python development practices and significantly enhance its usability for a broad range of users.

Possible Implementation

None

Possible Alternatives

  • One alternative is to stay with the current state.
  • Another alternative is that Kedro develops its own package management system based on its existing plugin mechanism.
@galenseilis galenseilis added the Issue: Feature Request New feature or improvement to existing feature label Jun 30, 2024
@astrojuanlu
Member

Hi @galenseilis, thanks for opening this issue!

Luckily, Kedro is already compatible with all PEP 621-compliant tools, and also with Poetry. I have personally enjoyed using PDM for most of my personal Kedro projects for a while.
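
As a rough illustration of what "PEP 621-compliant" means in practice, the sketch below classifies a pyproject.toml by where it declares its project metadata: the standard [project] table, or the [tool.poetry] table used by Poetry releases before 2.0. The function name describe_metadata and the "spaceflights-poetry" project are hypothetical, introduced only for this example, and none of this is part of Kedro's own API.

```python
try:
    import tomllib  # standard library from Python 3.11 onwards
except ModuleNotFoundError:  # older interpreters: pip install tomli
    import tomli as tomllib


def describe_metadata(pyproject_text: str) -> str:
    """Classify where a pyproject.toml declares its project metadata."""
    data = tomllib.loads(pyproject_text)
    # PEP 621 puts core metadata in the top-level [project] table.
    if "name" in data.get("project", {}):
        return "pep621"
    # Poetry before 2.0 keeps the same information under [tool.poetry].
    if "name" in data.get("tool", {}).get("poetry", {}):
        return "poetry"
    return "unknown"


# A PEP 621 layout, as produced by tools like PDM or Hatch.
PDM_STYLE = """
[project]
name = "spaceflights-pdm"
version = "0.1.0"

[build-system]
requires = ["pdm-backend"]
build-backend = "pdm.backend"
"""

# Hypothetical project using the older Poetry-specific layout.
POETRY_STYLE = """
[tool.poetry]
name = "spaceflights-poetry"
version = "0.1.0"
"""

print(describe_metadata(PDM_STYLE))
print(describe_metadata(POETRY_STYLE))
```

The check only looks at where the project name lives, so it is a quick heuristic rather than a full PEP 621 validation.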

There are two ways to go about this:

  1. Start from an existing project created with a normal kedro new (and therefore based on our official starters, which use setuptools at the time of writing), then either change the [build-system] table manually or use your desired workflow tool (for example pdm init, as explained in https://pdm-project.org/en/stable/usage/project/#import-the-project-from-other-package-managers).
  2. Create your own Kedro starter that uses your desired workflow tool instead of setuptools.
       • If you're in a hurry, you can initialise your project using your workflow tool of choice and then use my kedro-init plugin: https://pypi.org/project/kedro-init
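
For the first option, changing the [build-system] table by hand amounts to replacing two keys. The sketch below does that replacement on the text of a pyproject.toml. The helper swap_build_system and the BACKENDS mapping are hypothetical names for this example, and the regular expression is a simplification that assumes table headers start at the beginning of a line.

```python
import re

# Backend declarations as documented by each build backend; extend as needed.
BACKENDS = {
    "setuptools": ('["setuptools"]', "setuptools.build_meta"),
    "hatchling": ('["hatchling"]', "hatchling.build"),
    "pdm-backend": ('["pdm-backend"]', "pdm.backend"),
}

# The [build-system] header plus everything up to the next table header or end of file.
BUILD_SYSTEM_TABLE = re.compile(
    r"^\[build-system\]\n.*?(?=^\[|\Z)", re.MULTILINE | re.DOTALL
)


def swap_build_system(pyproject_text: str, backend: str) -> str:
    """Return pyproject_text with its [build-system] table set to `backend`."""
    requires, build_backend = BACKENDS[backend]
    new_table = (
        "[build-system]\n"
        f"requires = {requires}\n"
        f'build-backend = "{build_backend}"\n'
    )
    if not BUILD_SYSTEM_TABLE.search(pyproject_text):
        # No existing table: append one at the end of the file.
        return pyproject_text.rstrip("\n") + "\n\n" + new_table
    # A function replacement avoids any backslash-escape handling in new_table.
    return BUILD_SYSTEM_TABLE.sub(lambda _: new_table + "\n", pyproject_text, count=1)


BEFORE = """[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"

[project]
name = "spaceflights"
"""

print(swap_build_system(BEFORE, "hatchling"))
```

This only touches the [build-system] table. Any backend-specific configuration elsewhere in the file, such as tables under [tool.setuptools], would still need to be migrated by hand.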

If going for the latter (initialising the project with your workflow tool and then running kedro-init), the workflow would be:

$ pdm init
Creating a pyproject.toml for PDM...
Please enter the Python interpreter to use
 0. [email protected] (/Users/juan_cano/Projects/QuantumBlackLabs/kedro-init/.tmp/.venv/bin/python)
 1. [email protected] (/Users/juan_cano/Projects/QuantumBlackLabs/kedro-init/.tmp/.venv/bin/python3.12)
 2. [email protected] (/Users/juan_cano/Projects/QuantumBlackLabs/tmp/ml_observability_course/.venv/bin/python3.11)
 3. [email protected] (/usr/bin/python3)
 4. [email protected] (/opt/homebrew/Cellar/[email protected]/3.12.3/Frameworks/Python.framework/Versions/3.12/bin/python3.12)
Please select (0): 
Project name (tmp): spaceflights-pdm
Project version (0.1.0): 
Do you want to build this project for distribution(such as wheel)?
If yes, it will be installed by default when running `pdm install`. [y/n] (n): y
Project description (): Spaceflights Kedro project, using PDM
Which build backend to use?
0. pdm-backend
1. setuptools
2. flit-core
3. hatchling
Please select (0): 
License(SPDX name) (MIT): None
Author name (Juan Luis Cano Rodríguez): 
Author email ([email protected]): 
Python requires('*' to allow any) (>=3.12): >=3.9
Project is initialized successfully
$ tree
.
├── README.md
├── __pycache__
├── pyproject.toml
├── src
│   └── spaceflights_pdm
│       └── __init__.py
└── tests
    └── __init__.py

5 directories, 4 files
$ cat pyproject.toml 
[project]
name = "spaceflights-pdm"
version = "0.1.0"
description = "Spaceflights Kedro project, using PDM"
authors = [
    {name = "Juan Luis Cano Rodríguez", email = "[email protected]"},
]
dependencies = []
requires-python = ">=3.9"
readme = "README.md"
license = {text = "None"}

[build-system]
requires = ["pdm-backend"]
build-backend = "pdm.backend"


[tool.pdm]
distribution = true

then:

$ kedro-init .
[02:16:47] Looking for existing package directories                                                                                                       cli.py:25
[02:16:53] Initialising config directories                                                                                                                cli.py:25
           Creating modules                                                                                                                               cli.py:25
           🔶 Kedro project successfully initialised!

And you're all set!

$ kedro registry list
- __default__
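
A quick way to double-check the result without running Kedro is to look for the [tool.kedro] table in pyproject.toml. The sketch below assumes the project follows the convention of the standard Kedro starter template, where that table carries package_name, project_name and kedro_init_version. Whether kedro-init writes exactly these keys is an assumption here, and the version string in the example is made up.

```python
try:
    import tomllib  # standard library from Python 3.11 onwards
except ModuleNotFoundError:  # older interpreters: pip install tomli
    import tomli as tomllib

# Assumed key names, based on the standard Kedro starter template.
EXPECTED_KEYS = ("package_name", "project_name", "kedro_init_version")


def missing_kedro_keys(pyproject_text: str) -> list:
    """List the expected [tool.kedro] keys that are absent from the file."""
    kedro_table = tomllib.loads(pyproject_text).get("tool", {}).get("kedro", {})
    return [key for key in EXPECTED_KEYS if key not in kedro_table]


# Hypothetical pyproject.toml after initialisation; the version is illustrative.
EXAMPLE = """
[project]
name = "spaceflights-pdm"

[tool.kedro]
package_name = "spaceflights_pdm"
project_name = "spaceflights-pdm"
kedro_init_version = "0.19.6"
"""

print(missing_kedro_keys(EXAMPLE))
```

An empty list means all the assumed keys are present. Running kedro registry list, as above, remains the more direct confirmation.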

We encourage the community to create Poetry, PDM, Rye starters (and give my kedro-init a try if you so desire).

Probably evolving our official starters to use an alternative workflow tool isn't going to happen any time soon (until, of course, The One Tool Everybody Uses emerges 😉) so in principle I would say there is not much else for us to do, except perhaps document this better.

What do you think @galenseilis?

@astrojuanlu astrojuanlu added the Community Issue/PR opened by the open-source community label Jul 1, 2024
@astrojuanlu astrojuanlu changed the title from "✨ Include or implement a package management system" to "Document Kedro compatibility with workflow tools like Hatch, PDM, Rye, Poetry" Jul 1, 2024
@galenseilis
Author

This makes sense to me! I agree with your conclusion that documenting the compatibility with these tools, where applicable, is the way forward. :)

@galenseilis
Author

galenseilis commented Jul 1, 2024

I did not encounter any major issues with setting up Kedro with Rye.

https://galenseilis.github.io/posts/kedro-init-rye/

@astrojuanlu
Member

With #4116 we're taking on the necessary changes so that the default template is totally compatible with workflow tools (Hatch, PDM, Rye, Poetry, and lastly uv). So I'm relabeling this as a documentation issue.

@astrojuanlu astrojuanlu added Component: Documentation 📄 Issue/PR for markdown and API documentation and removed Issue: Feature Request New feature or improvement to existing feature labels Nov 8, 2024
@astrojuanlu astrojuanlu changed the title from "Document Kedro compatibility with workflow tools like Hatch, PDM, Rye, Poetry" to "Document Kedro compatibility with workflow tools like uv, Hatch, PDM, Rye, Poetry" Nov 28, 2024
@astrojuanlu
Member

#4116 has been fixed. Now Kedro starters, and Kedro projects in general, are fully PEP 621 compatible and can be used with all modern project management tools (including Poetry 2.0 whenever it's out).

Now the question is... Should we document all of them? Or should we favour a specific one?

@galenseilis
Author

I cannot seem to come up with a good answer for you. I personally have these preferences: uv > pdm > rye = poetry > hatch. On the other hand, I'm sure that others have their own preferences.

@merelcht merelcht removed the Community Issue/PR opened by the open-source community label Dec 10, 2024