[DataCatalog]: Add functionality to search datasets in the catalog #3917
Comments
Also search by kind - if I wanted to find all Parquet files today I'd have to get very creative. Retrieving paths associated with those would be super complicated. |
@stephkaiser do you remember if we already opened an issue or discussion about the "metadata table view"? (cc @rashidakanchwala for when you're back) |
Isn't it what IIRC if you do e.g. |
Related #3312 |
@astrojuanlu we currently don't have an issue for this, I believe it was an idea we discussed when discussing this issue kedro-org/kedro-viz#1635 |
Notice that |
But
this is a bit more advanced, I'd say |
When you say "search datasets in the catalog", what workflow are you talking about? Is this inside a notebook, on the CLI, directly in the IDE or on Kedro-Viz? Each of these user flows might have a different preferred solution. |
After the discussion at backlog grooming, we've decided to:
|
(Just dropping a comment now for when this is ready to be tackled properly) When implementing the dict-like interface in #4218, several possibilities were proposed for filtering on the values with a regex, listed here in the order they appeared in the PR:
Option 1 was considered interesting overall, but several of us expressed concerns that it is not consistent with the standard dict interface, which would hurt discoverability. Option 2 tends to be the leading choice, but option 3 came back a couple of minutes before merging after @idanov's comment that it may be confusing because it's not clear what we are filtering on. I think we did not think enough about the implications and I want to reconsider it before the official release of
I think it's worth rediscussing / voting on this specific method, and not considering #3931 done at this point. PS: Overall this new implementation is 🔥, I just want to speak now instead " |
Thanks for the summary, I do agree with you 🙂 For now, we agreed to think about the renaming when we work on this ticket. We wanted the name to be aligned with the filtering we will suggest, so we decided to keep the old name in #4218 |
Description
Users struggle to find datasets within the catalog, particularly when dealing with a large number of datasets. They express the need for search features to facilitate dataset discovery.
Context
"As a user in my list object, I can filter by name but I can't filter by what. So it would be good to be able to say give me all the sql datasets and then the names of the tables that are attached."
Comment from @astrojuanlu: Kedro Viz has an item in their roadmap to include a table view of all the metadata, which could help with this.
Possible Implementation
Integrate search functionality into the catalog, enabling users to search for datasets based on keywords, patterns and by kind. Include support for regex search to accommodate users with advanced search requirements.
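To make the proposal a bit more concrete, here is a minimal sketch of what searching by name pattern and by kind could look like. It is not the Kedro API: `FakeDataset`, `search_catalog`, the `kind` and `location` attributes, and the sample entries are all hypothetical stand-ins, and the catalog is modelled as a plain dict.

```python
import re
from dataclasses import dataclass


@dataclass
class FakeDataset:
    """Hypothetical stand-in for a catalog entry; not a Kedro class."""
    kind: str      # illustrative type string
    location: str  # file path or table name attached to the dataset


def search_catalog(catalog, name_regex=None, kind_regex=None):
    """Return the entries whose name and kind match the given regexes.

    A filter that is None is skipped, so either one can be used alone.
    """
    name_pat = re.compile(name_regex) if name_regex else None
    # Kind matching is case-insensitive so "sql" finds "SQLTableDataset".
    kind_pat = re.compile(kind_regex, re.IGNORECASE) if kind_regex else None
    return {
        name: ds
        for name, ds in catalog.items()
        if (name_pat is None or name_pat.search(name))
        and (kind_pat is None or kind_pat.search(ds.kind))
    }


# Hypothetical sample catalog, assumed for illustration only.
catalog = {
    "companies": FakeDataset("pandas.CSVDataset", "data/01_raw/companies.csv"),
    "shuttles": FakeDataset("pandas.ParquetDataset", "data/02_intermediate/shuttles.parquet"),
    "reviews_table": FakeDataset("pandas.SQLTableDataset", "reviews"),
    "model_input": FakeDataset("pandas.ParquetDataset", "data/03_primary/model_input.parquet"),
}

# "Find all Parquet files" and the paths associated with them.
parquet = search_catalog(catalog, kind_regex=r"parquet")
parquet_paths = {name: ds.location for name, ds in parquet.items()}

# "Give me all the sql datasets and the names of the tables attached."
sql_tables = {name: ds.location for name, ds in search_catalog(catalog, kind_regex=r"sql").items()}
```

Keeping the name and kind filters as separate keyword arguments is one way to address the concern raised above that it should be clear what is being filtered on; the final method name and signature are still open.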