Document strategies to use more samples #129

Open
noahho opened this issue Jan 13, 2025 · 3 comments
Labels
documentation Improvements or additions to documentation good first issue Good for newcomers help wanted Extra attention is needed

Comments

@noahho
Collaborator

noahho commented Jan 13, 2025

Document the dataset constraints for TabPFN in more detail, and document strategies for using more samples, e.g. a subsampling ensemble or SklearnBasedRandomForestTabPFN from tabpfn-extensions (https://github.com/PriorLabs/tabpfn-extensions/blob/dbc3f5da25821135602fdc4d95cc8c217afbc3b0/src/tabpfn_extensions/rf_pfn/SklearnBasedRandomForestTabPFN.py#L106).
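
To make the subsampling-ensemble idea concrete, here is a minimal sketch. It is an illustration only, not the tabpfn-extensions implementation: the function names, the `max_rows` and `n_members` defaults, and the duck-typed `make_estimator` factory are all assumptions. It fits one estimator per random row subsample and averages the predicted class probabilities, relying only on the scikit-learn conventions `fit`, `predict_proba`, and `classes_`.

```python
import random


def subsample_indices(n_rows, max_rows, n_members, seed=0):
    """Draw `n_members` random row-index subsets, each of size at most `max_rows`."""
    rng = random.Random(seed)
    size = min(n_rows, max_rows)
    # Sampling without replacement inside each subset; subsets may overlap.
    return [sorted(rng.sample(range(n_rows), size)) for _ in range(n_members)]


def fit_subsample_ensemble(make_estimator, X, y, max_rows=10_000, n_members=4, seed=0):
    """Fit one estimator per random subsample of the training rows.

    `make_estimator` is a hypothetical zero-argument factory that returns
    an unfitted, scikit-learn-style classifier.
    """
    members = []
    for idx in subsample_indices(len(X), max_rows, n_members, seed):
        est = make_estimator()
        est.fit([X[i] for i in idx], [y[i] for i in idx])
        members.append(est)
    return members


def predict_proba_ensemble(members, X, classes):
    """Average class probabilities over all members, aligned to `classes`.

    A subsample can miss a rare class, so each member's probability columns
    are mapped through its own `classes_` attribute before averaging.
    """
    col = {c: j for j, c in enumerate(classes)}
    avg = [[0.0] * len(classes) for _ in range(len(X))]
    for est in members:
        proba = est.predict_proba(X)
        for i, row in enumerate(proba):
            for c, p in zip(est.classes_, row):
                avg[i][col[c]] += float(p) / len(members)
    return avg
```

With TabPFN, `make_estimator` could presumably be the `TabPFNClassifier` class itself, so that every member stays at or below the row limit. Whether plain averaging is the best way to combine the members is a design choice that the documentation would need to discuss.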

@noahho noahho added documentation Improvements or additions to documentation good first issue Good for newcomers help wanted Extra attention is needed labels Jan 13, 2025
@jokus-pokus

Where can I read about the constraints?

@Daniel-KK-world

Yes, where can we read about the constraints?

@LennartPurucker
Collaborator

Some information about the pretraining limits is currently documented here: https://github.com/PriorLabs/TabPFN/blob/main/src/tabpfn/classifier.py#L219-L238

The current pretraining limits are:

  - 10_000 samples/rows
  - 500 features/columns
  - 10 classes; this limit cannot be ignored, and the model will raise an error if it is used with more classes.

These limits are not shown in the docs, so it would be good to add them there as well.
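
As a rough illustration of what the docs could show, the limits listed above can be written as a small pre-flight check. This is a hypothetical helper, not TabPFN's own validation code: the constant names and the function `check_pretraining_limits` are made up for this sketch, and the numbers are simply the ones quoted in this comment, which may change between versions.

```python
# Limits as quoted in this thread; they may differ between TabPFN versions.
MAX_SAMPLES = 10_000
MAX_FEATURES = 500
MAX_CLASSES = 10


def check_pretraining_limits(n_samples, n_features, n_classes):
    """Return a list of warning strings; raise ValueError for too many classes.

    Hypothetical helper: the class limit is treated as a hard error, while
    the row and column limits only produce warnings.
    """
    if n_classes > MAX_CLASSES:
        raise ValueError(
            f"{n_classes} classes exceeds the limit of {MAX_CLASSES} classes"
        )
    warnings = []
    if n_samples > MAX_SAMPLES:
        warnings.append(
            f"{n_samples} samples exceeds the pretraining limit of {MAX_SAMPLES}"
        )
    if n_features > MAX_FEATURES:
        warnings.append(
            f"{n_features} features exceeds the pretraining limit of {MAX_FEATURES}"
        )
    return warnings
```

For example, `check_pretraining_limits(20_000, 600, 2)` returns two warnings, while passing 11 classes raises a `ValueError`.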

Otherwise, as a starting point, I recommend reading the User Guide in our paper.
