Pydata Work Prioritization #145

Closed
6 of 13 tasks
RNKuhns opened this issue Mar 8, 2023 · 7 comments
Labels
Planning Project planning

Comments

@RNKuhns
Contributor

RNKuhns commented Mar 8, 2023

This is a meta issue to prioritize our work for Pydata Seattle 2023.

High-level tasks for the presentation:

High-level tasks for new features/maintenance ahead of Pydata Seattle 2023. These are divided into "must", "should" and "nice-to-have":

Must:

  • sktime and skbase initial integration is completed. Targeting mid-April release of sktime v0.18.0
  • Create an "example" repository that includes a stylized package that uses skbase (this will also be useful for additional testing of our lookup, testing and validation functionality)
  • Local (class) configuration interface
  • [ENH] Refactor skbase.base._meta and add unit tests #106

Should:

  • [ENH] Implements sklearn style pretty printing directly in skbase #150
  • Add persistence (re-work of the sktime persistence approach to make sure it works more generally)
  • Upgrade documentation, with emphasis on landing page
  • Redo governance section of documentation
  • Create functionality roadmap (borrow from HackMD and design document and add other ideas)
  • Improve documentation of the skbase.testing module

Nice-to-have:

  • [ENH] Add global config interface #149
  • Example showing how to use skbase to create a scikit-learn compliant class (we need to investigate whether this is possible; if their checks are "duck-typed", then it should be)
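
As a rough illustration of what "duck-typed" means in the last item, here is a minimal sketch in plain Python. It does not use skbase or scikit-learn: `DuckEstimator` and `duck_clone` are hypothetical names made up for this example, and whether scikit-learn's own compliance checks behave this way is exactly the open question above.

```python
import inspect


class DuckEstimator:
    """Hypothetical estimator that inherits from no scikit-learn base class."""

    def __init__(self, alpha=1.0, fit_intercept=True):
        # Store constructor arguments unchanged under the same names.
        self.alpha = alpha
        self.fit_intercept = fit_intercept

    def get_params(self, deep=True):
        # Read the parameter names off the __init__ signature.
        signature = inspect.signature(type(self).__init__)
        names = [name for name in signature.parameters if name != "self"]
        return {name: getattr(self, name) for name in names}

    def set_params(self, **params):
        valid = self.get_params()
        for name, value in params.items():
            if name not in valid:
                raise ValueError(f"Invalid parameter {name!r}")
            setattr(self, name, value)
        return self


def duck_clone(estimator):
    """Hypothetical duck-typed copy: relies only on get_params, not on isinstance."""
    return type(estimator)(**estimator.get_params())


est = DuckEstimator(alpha=0.5)
copy = duck_clone(est.set_params(fit_intercept=False))
print(copy.get_params())
```

A check written like `duck_clone` only asks for the methods it needs, so any class that provides them passes regardless of its base class. A check written with `isinstance` against a specific base class would reject it.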
@fkiraly fkiraly added the Planning Project planning label Mar 8, 2023
@fkiraly
Contributor

fkiraly commented Mar 8, 2023

seems like you dropped from the call?

My thoughts on priorities:

  1. "cook show" style tutorial creation, as described here: [DOC] tutorial/workshop for pydata Seattle 2023 #141
    consists of two parts: (1a) jupyter notebooks for demo; (1b) multiple stages of "package being built" for the cook show

  2. sktime/skbase integration. It is realistic to publish an skbase-based sktime version 0.18.0 by mid-April. The test framework refactor fails for mysterious reasons, but it is not actually needed for rebasing the core module, as the existing sktime tests are robust enough.

  3. optional bonus feature: persistence. This has been implemented in sktime by myself and @achieveordie, and is likely a nice feature we could deliver by then in skbase (lateral transfer)

  4. optional proof-of-concept: sklearn with skbase. Small version: extension template based on skbase that passes sklearn interface contract tests. Big version: PR to sklearn fork that replaces the base class and refactors the tag system to use skbase's, etc.

@fkiraly
Contributor

fkiraly commented Mar 10, 2023

from call with Ryan - which features would be great to create that we can showcase?
These are "should" or "nice to have" (tbd), "must" is the actual material based on status quo.

  • config (local) - PR exists - should
  • global config - may need some thought - nice to have
  • pretty printing (html and text) - seems easy - should
  • persistence - in sktime, needs translation - nice to have
  • documentation of using the testing module? if it is presented (not just an afterthought), should; otherwise nice to have
  • landing page upgrade! - must have
  • moving documentation from design issues and hackmd into issues or official doc page - should

@RNKuhns
Contributor Author

RNKuhns commented Mar 10, 2023

@fkiraly I've updated my initial issue comment based on the notes you recorded above in our call today. I'm open to adjusting the checklists as needed (let me know if something ended up somewhere you didn't expect).

@fkiraly
Contributor

fkiraly commented Mar 11, 2023

Looks reasonable - the presentation itself is obviously the absolute highest priority, but I do not see it in the list. I guess it is understood, but it may be useful to list it there.

@RNKuhns
Contributor Author

RNKuhns commented Mar 12, 2023

I was using that first check-box, which links to the other issue, for the presentation. I figured there would be enough to stay on top of that we'd keep it separate in that issue.

@RNKuhns
Contributor Author

RNKuhns commented Mar 12, 2023

@fkiraly, regarding the example repository: we had talked about skbase moving to its own GH organization. I've got https://github.com/scikit-base set up to eventually be that (skbase was taken). Do you want me to add the example repository there, or do we want to put it under sktime to start and move it over later?

@fkiraly
Contributor

fkiraly commented Apr 28, 2023

pydata presentation has now happened.

@fkiraly fkiraly closed this as completed Apr 28, 2023
2 participants