Poetry Support for Kedro Projects #1722
Comments
Thanks for raising the issue. It would be great if you could provide some kind of In general, a "Kedro Project" itself is the top directory. Do you currently have a workaround? Let's say your new project is called
|
Sure On my way. Will share a git rep soon. |
@DataPsycho |
First of all, thank you @DataPsycho for writing a very detailed README that is easy to follow. I think @deepyaman's approach is preferable.
With the structure that you provided, basically you need to copy everything inside Here are the potential alternatives I am thinking about: Starting from a fresh project with
|
Hi, there are more files to move around.
After I get that structure, I have to move the following:
I always have to start with Poetry first. Using Poetry, I have to add Kedro as a package in the project's virtual environment; only then am I able to use Kedro. The reverse is not what a Poetry user would do: create a venv, install Poetry in it and activate it, then create a new project with Kedro, go inside the project, and initialize Poetry in the repo, which creates another new Poetry venv. The old venv then has no use. |
@DataPsycho Is there any difference if you just go inside I don't quite understand why it necessarily creates another Poetry environment. I may test it out tomorrow. |
It's fine to do the copy-pasting. But Kedro is a package with a CLI, whereas Poetry is an environment and package management system. How do I use Kedro from the start without Poetry or Pipenv?
If I install Kedro in the base python image:
But now I am locked to that Kedro version and cannot move between versions: I have to create all my projects with the same Kedro version. So this is a no-go for me. If I want to use Poetry:
To be able to use Kedro, I must first create a virtual environment and install Kedro in it. But Poetry is responsible for creating the virtual environment and adding Kedro to it. So if I want to use Kedro to initialize a project, I need a virtualenv with Kedro, but then after initializing the project, when I cd into the project and initialize Poetry with |
@DataPsycho I agree this is not the smoothest experience. I just want to mention that your
|
OK. Then we can close the feature request, I guess. Thanks for your support and the time you have spent. @deepyaman's idea was great. I will see if I have time to create a new starter for a Poetry-like project structure. For now we can close it. I will close it by tomorrow if you have nothing to add. Thanks |
@DataPsycho You can find more info how you can extend it with |
A new starter might be added for poetry/PIPENV. |
I'm reopening this because I think it's a very good topic and I'd be interested in hearing from other users about it 🙂 It's been mentioned several times before by different people, but we've never had thoughts collected together in one place, so let's start doing that here! In the past we've also wondered whether we should switch to using Poetry. Currently we support a Some previous related issues (there are probably others too): From these and other conversations I know the following users have independently shown interest in Kedro + Poetry. There's also been interest within QB, though I'm not sure exactly who. So I definitely think there's some significant interest in this. @datajoely do you know anyone else here? |
Carlos Bareto, but I don't know his GitHub handle |
TBH, I like the idea of adding support for Poetry in Kedro projects. I think the main advantages of Poetry are:
|
I agree with @arnaldog12. Also, I integrated my current project with Poetry. If you want, I can share that as a poetry starter. |
Much appreciate the initiative. Happy to share any knowledge needed which I have already tried to develop the starter template. |
One note for posterity on using Poetry with Kedro projects--there was a fix that's especially relevant to Kedro projects added in Poetry 1.2.0b3. Before this, you need to make sure to define any extras like |
So if I'm new to poetry and new to kedro, and I've installed poetry |
If there is still no better way, follow what I describe in this thread above: it is what I had to do to make Kedro compatible with Poetry |
I've read both this and the closed issue I haven't found any mention of the relationship between |
you can run |
For folks subscribed to this old issue: we're (1) modernizing the way Kedro projects are structured, to make them look more similar to normal Python libraries https://github.com/kedro-org/kedro/milestone/36 and (2) looking into ways to initialize a Kedro project in an existing directory #2512. Our idea though is to favor PEP 621 compliant |
Today I found a project that uses Poetry + Kedro: https://github.com/madziejm/project-fontr People subscribed to this issue, could you have a look and let us know what else can we do to better support this use case? Otherwise I'm voting to close the issue. |
Hi! Great that you are working on it, and hopefully Poetry moves to PEP 621 soon :) For people wondering how to use a conda-poetry-kedro setup for now, I use it as follows:
conda create --name myenv
conda activate myenv
conda install -c conda-forge poetry
pip install kedro
poetry add "black~=22.0"
poetry add "flake8>=3.7.9,<5.0"
poetry add "ipython>=7.31.1, <8.0; python_version < '3.8'"
poetry add "ipython~=8.10; python_version >= '3.8'"
poetry add "isort~=5.0"
poetry add "jupyter~=1.0"
poetry add "jupyterlab~=3.0"
poetry add "kedro~=0.18.13"
poetry add "kedro-datasets[pandas.CSVDataSet, pandas.ExcelDataSet, pandas.ParquetDataSet]~=1.0"
poetry add "kedro-telemetry~=0.2.0"
poetry add "kedro-viz~=6.0"
poetry add "nbstripout~=0.4"
poetry add "pytest-cov~=3.0"
poetry add "pytest-mock>=1.7.1, <2.0"
poetry add "pytest~=7.2"
poetry add "scikit-learn~=1.0"
I use
You should end up with a pyproject.toml looking like this (see .txt), which you can then use in the future to init your poetry env directly using |
Hi @ac-willeke Thanks for the tips. But, doesn't using |
Many use conda to specify python version within the virtual env, another option is pyenv |
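A minimal sketch of the pyenv route mentioned above, assuming both pyenv and Poetry are already installed and that the current directory holds a Poetry project. The function name `setup_python_env` and the version number in the usage comment are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of pyenv, Poetry, or Kedro): pin a Python
# version with pyenv, then point Poetry's virtual environment at it.
setup_python_env() {
    py_version="$1"                              # e.g. 3.10.13

    # Install the interpreter unless pyenv already has it.
    pyenv install --skip-existing "$py_version"

    # Record the version in .python-version for this directory.
    pyenv local "$py_version"

    # Tell Poetry to build its environment from that interpreter.
    poetry env use "$(pyenv which python)"

    # Install the dependencies declared in pyproject.toml.
    poetry install
}

# Usage (run inside the project directory):
#   setup_python_env 3.10.13
```

With this split, pyenv only selects the interpreter, while Poetry stays the single tool that creates the environment and resolves packages.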
Hi! Yes, I agree conda/poetry is redundant. I used to combine the two in projects with libraries that are not easily installed using poetry. For example, python bindings for gdal (dependent on C++) are not that easy to install if you don't have admin rights. So then I would start my project with conda, install gdal, install all other packages using poetry (as I like the clean structure of poetry). But I recently moved to gdal images from docker, so then you can use solely poetry as a package manager :) So maybe my example above with the conda/poetry env was not the best, sorry for that! |
Did some experiments today and I confirm Kedro supports Poetry. Or Poetry supports Kedro, depending on how you want to look at it. Starting point:
Then added the necessary files (for example using https://github.com/astrojuanlu/kedro-init):
Now everything works:
I don't think there's anything else we'll do for now. I'm closing this issue, feel free to keep commenting if you disagree. |
Hey @astrojuanlu , I wasn't able to pip install the |
Hello @GuiMarthe , If there's traction and interest I will consider publishing it to PyPI. Voice your interest here or opening an issue on https://github.com/astrojuanlu/kedro-init/issues |
@astrojuanlu Thank you for Kedro, and for your explanations on what is needed to use Poetry! As I did want to use a Poetry managed virtual environment, and prefer not to depend on an experimental repo, I experimented myself and found an acceptable workflow, that I documented (for myself) in this little Guide to use Kedro with Poetry. I hope it may be helpful to others too. Please let me know if I missed anything and feel free to use/share this if you deem it useful. Also, please let me know if you decide to properly support Poetry initialization. |
Thanks for sharing @ourownstory ! Your writeup reminded me that kedro-init doesn't account for the new project tools of Kedro 0.19. I might give it another pass and publish a 0.1 version to PyPI :) |
Great, thank you! I hope this may lead to the eventual integration to the main package similar to OP's suggestion? |
For that, let's continue the conversation in #681 |
Description
The way Kedro initiates a new project and creates the folder structure does not go well with Poetry. Usually I would create a Poetry environment before doing anything else and then install all my required packages one by one. After I create a Poetry environment and add the kedro package, the pyproject.toml looks as follows:
poetry new --src KedroPoetry
Let's run the demo pytest to see if everything works.
This goes well:
Now it's time to add a Kedro project:
kedro new
The command completely ignored the current pyproject.toml file. And as there is already a src folder, it did not add the project inside src; instead it created a directory at the root, outside of src. Now there is no kedro setup section in pyproject.toml, so the kedro CLI will complain about a broken setup.
Context
As Poetry provides one of the modern approaches to packaging Python projects, it would be good to have direct support for a Poetry-like project structure in Kedro, or at least a hackable way out.
Possible Implementation
There could be a new flag in the CLI to initialize a Kedro project when there is already a pyproject.toml file and a project set up for Poetry.
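For reference, the "kedro setup section" mentioned in the description is a `[tool.kedro]` table in pyproject.toml. The sketch below appends one to an existing Poetry file if it is missing. It works on a stand-in file in a temporary directory, and the values (`kedropoetry`, `KedroPoetry`, `0.18.13`) are assumptions for illustration. The key names are written from memory of 0.18-era projects and should be checked against the Kedro version in use. This table alone is not a full Kedro project: the package under src still needs the files Kedro expects.

```shell
#!/bin/sh
# Sketch: add a [tool.kedro] table to an existing Poetry pyproject.toml.
set -eu

workdir="$(mktemp -d)"
pyproject="$workdir/pyproject.toml"

# Stand-in for the file that `poetry new --src KedroPoetry` would create
# (contents are illustrative, not real Poetry output).
cat > "$pyproject" <<'EOF'
[tool.poetry]
name = "kedropoetry"
version = "0.1.0"
description = ""
authors = []
EOF

# Append the Kedro table only if it is not already present.
if ! grep -q '^\[tool\.kedro\]' "$pyproject"; then
    cat >> "$pyproject" <<'EOF'

[tool.kedro]
# Assumed values; package_name must match the package directory under src.
package_name = "kedropoetry"
project_name = "KedroPoetry"
# Assumption: newer Kedro releases may expect kedro_init_version instead.
project_version = "0.18.13"
EOF
fi

cat "$pyproject"
```

Because of the `grep` guard, running the append step a second time leaves the file unchanged.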