Simplify data setup #634
This might be helpful: https://github.com/fatiando/pooch
I appreciate the suggestion; I think this is a good idea.
100% agree with @liadomide.
@i-Zaak what exactly do you have in mind: publishing with pooch as well, and so keeping the data replicated in two places, or replacing Zenodo entirely?
I'd use pooch to fetch the zip file from Zenodo, verify the checksum, and unzip. I recently started using it for the EBRAINS datasets:

    _ = pooch.retrieve(
        url='https://object.cscs.ch/v1/AUTH_227176556f3c4bb38df9feea4b91200c/hbp-d000059_Atlas_based_HCP_connectomes_v1.1_pub/200-Schaefer17Networks.zip',
        known_hash='5086f4b3405acff84ffe132cee17c67a90000a3fae98da50d4e14fb55d7f5d57',
        path='some/path/where/the/data/lives',
        processor=pooch.Unzip(extract_dir='.'),
    )
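For readers who want to see what the fetch, verify, unzip sequence amounts to, here is a minimal standard-library sketch. It is not how pooch is implemented and not code from this PR; `fetch_and_unzip`, its parameters, and the `data.zip` cache name are hypothetical names chosen for illustration.

```python
import hashlib
import urllib.request
import zipfile
from pathlib import Path


def fetch_and_unzip(url, known_hash, dest_dir, archive_name="data.zip"):
    """Download `url`, check its SHA-256 against `known_hash`, then unzip.

    Hypothetical helper for illustration; not part of TVB or pooch.
    Returns the sorted list of member names found in the archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / archive_name

    # Download only when the archive is not already cached locally.
    if not archive.exists():
        with urllib.request.urlopen(url) as response, open(archive, "wb") as out:
            while chunk := response.read(1 << 20):
                out.write(chunk)

    # Verify the checksum before trusting the archive contents.
    digest = hashlib.sha256(archive.read_bytes()).hexdigest()
    if digest != known_hash.lower():
        archive.unlink()  # drop the bad file so a retry downloads again
        raise ValueError(f"checksum mismatch: expected {known_hash}, got {digest}")

    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
        return sorted(zf.namelist())
```

Called with the URL and hash from the comment above, this would leave the extracted files next to the cached archive. pooch additionally handles details this sketch ignores, so it remains the more practical choice.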
I'm on board with pooch, but I'd like to get #633 done first so that it's easier to add the dependency.
In the longer term, it could be worthwhile to keep Zenodo for the DOI but make the primary source a DataLad repository that stores significant demo datasets, simulated data, etc., to help jumpstart users' projects. For instance, having 100 HCP subjects preprocessed, with some basic simulations done, could be a huge benefit. In the short term, having an easy fetcher for our Zenodo data and also for EBRAINS is very nice.
We could start incorporating various data sources in the TVB data (DataLad, Zenodo, EBRAINS), similarly to how MNE handles datasets: https://mne.tools/stable/overview/datasets_index.html
In the end, there should be minimal data contained in the
Another candidate to be merged in: https://gitlab.ebrains.eu/fousekjan/tvb-ebrains-data
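One way to picture a multi-source setup like this is a small registry that maps dataset names to a backend, a URL, and an optional checksum. The sketch below is purely illustrative: `DatasetSource`, `register`, `by_backend`, the dataset names, and the `example.org` URLs are all hypothetical placeholders, not existing TVB or MNE APIs and not real dataset locations.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DatasetSource:
    """Where one dataset lives (hypothetical structure for illustration)."""
    name: str
    backend: str                  # e.g. "zenodo", "ebrains", "datalad"
    url: str                      # placeholder URLs only in this sketch
    sha256: Optional[str] = None  # None when no checksum is published


REGISTRY: Dict[str, DatasetSource] = {}


def register(source: DatasetSource) -> None:
    """Add a source, refusing silent overwrites of an existing name."""
    if source.name in REGISTRY:
        raise ValueError(f"dataset {source.name!r} is already registered")
    REGISTRY[source.name] = source


def by_backend(backend: str) -> List[str]:
    """Return the names of all datasets served by the given backend."""
    return sorted(s.name for s in REGISTRY.values() if s.backend == backend)


# Placeholder entries; real names, URLs and hashes would come from the project.
register(DatasetSource("demo-default", "zenodo", "https://example.org/demo.zip"))
register(DatasetSource("hcp-connectomes", "ebrains", "https://example.org/hcp.zip"))
```

A fetcher could then look up a dataset by name and dispatch on `backend`, so the installed package itself would only need to carry this metadata.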
@liadomide do you think this PR is still relevant, given the work in #691? (By the way, do you have a link to the issue TVB-1999? I cannot find it at req.thevirtualbrain.org.)
@maedoc here is the link for TVB-1999: https://tvb-projects.atlassian.net/browse/TVB-1999
Closing in favor of #691.
This adds a simple way to get the demo dataset from Zenodo without the manual wget/unzip steps, and without the data going missing when TVB is installed from the pip package. It's currently in the library part but could live anywhere. @liadomide, thoughts?