Make Partitioned Dataset Lazy Saving example more robust #3052
Comments
Thank you for raising this issue, it's very well written. Our team will have a look shortly.
From my understanding, this is not a Kedro problem. It's how lambda variable scoping works in Python: a lambda looks up the variables it references when it is called, not when it is defined. See this example
This Stack Overflow thread explains it better: https://crawler.algolia.com/admin/crawlers/189d20ee-337e-4498-8a4c-61238789942e/overview
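The example originally linked here is not preserved, so the following is only a minimal stand-in written in plain Python (no Kedro involved; the names `callbacks` and `late_bound` are made up for illustration). It shows that every lambda created in a loop reads the loop variable at call time:

```python
# Each lambda refers to the variable `i`, not to the value `i` had
# when the lambda was created.
callbacks = [lambda: i for i in range(3)]

# By the time the lambdas are called, the loop has finished and `i`
# holds its last value, so every call returns the same result.
late_bound = [cb() for cb in callbacks]
print(late_bound)
```

All three calls return the last value of `i`, which is the same symptom as "the same value is saved for all keys".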
In that case, I think this is mainly a matter of documenting the right approach in the docs. wdyt?
I have marked this as a documentation effort. I suggest that whoever picks up this ticket adds a note section warning about lambda scope and checks whether our docs example can be improved. Maybe also add this to an FAQ.
This is something to tackle as part of #2941
Partitioned Dataset Lazy Saving
Problem
When using partitioned datasets with lazy saving in Kedro and following the current documentation example, the key-to-value mapping breaks: the same value is saved under every key.
Expected Behavior
Without lazy saving (that is, without wrapping each value in a lambda), everything works as expected, with keys and values correctly associated.
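The difference between the two behaviours can be sketched without Kedro. The snippet below is an illustration only: `raw`, `eager_partitions` and `lazy_partitions_buggy` are hypothetical names, and the dictionaries merely mimic the shape a node would return for a partitioned dataset (partition key mapped to data, or to a callable that produces the data).

```python
from typing import Callable, Dict, List

# Hypothetical input: partition key -> values for that partition.
raw: Dict[str, List[int]] = {"part_a": [1, 2], "part_b": [3, 4], "part_c": [5, 6]}


def eager_partitions(data: Dict[str, List[int]]) -> Dict[str, List[int]]:
    # Values are computed immediately, so each key keeps its own data.
    return {key: [v * 10 for v in values] for key, values in data.items()}


def lazy_partitions_buggy(data: Dict[str, List[int]]) -> Dict[str, Callable[[], List[int]]]:
    # Each lambda closes over the variable `values`; when the callables are
    # finally invoked, `values` refers to the last partition only.
    return {key: (lambda: [v * 10 for v in values]) for key, values in data.items()}


eager = eager_partitions(raw)
# Simulate what happens at save time: every callable is invoked later.
lazy = {key: load() for key, load in lazy_partitions_buggy(raw).items()}

print(eager)
print(lazy)
```

In the eager result each key maps to its own data, while in the lazy result every key maps to the data of the last partition.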
Example
Example dataset
Example catalog
Example node
Run node
Observe bug
Fix
Give the lambda arguments that carry the actual value(s), so that each lambda binds its own value when it is defined rather than looking it up when it is called. This addresses the lambda scoping issue.
In our example, this means the following:
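The original snippet is not preserved here, so what follows is a hedged sketch of the usual way to apply this fix, not the exact code from the issue. It assumes the fix means binding the value through a default argument. The names `raw`, `scale`, `lazy_partitions_fixed` and `lazy_partitions_partial` are hypothetical, and `functools.partial` is shown as an alternative that is not mentioned in the issue.

```python
from functools import partial
from typing import Callable, Dict, List

# Hypothetical input: partition key -> values for that partition.
raw: Dict[str, List[int]] = {"part_a": [1, 2], "part_b": [3, 4], "part_c": [5, 6]}


def scale(values: List[int]) -> List[int]:
    # Stand-in for whatever work should be deferred until save time.
    return [v * 10 for v in values]


def lazy_partitions_fixed(data: Dict[str, List[int]]) -> Dict[str, Callable[[], List[int]]]:
    # `values=values` is evaluated when the lambda is defined, so each
    # lambda keeps the data of its own partition.
    return {key: (lambda values=values: scale(values)) for key, values in data.items()}


def lazy_partitions_partial(data: Dict[str, List[int]]) -> Dict[str, Callable[[], List[int]]]:
    # Alternative: functools.partial also binds the argument immediately.
    return {key: partial(scale, values) for key, values in data.items()}


fixed = {key: load() for key, load in lazy_partitions_fixed(raw).items()}
via_partial = {key: load() for key, load in lazy_partitions_partial(raw).items()}

print(fixed)
print(via_partial)
```

Both variants still return zero-argument callables, so the computation stays deferred until the callables are invoked, but each key now maps to its own data.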
Environment
Resources
Slack discussion 🤗