WIP: Move dataset cloud from GCS to Zenodo #134

Open
wants to merge 8 commits into
base: master
from

Conversation

raphaelreinauer
Collaborator

@raphaelreinauer raphaelreinauer commented Apr 14, 2024

This PR moves the dataset cloud from Google Cloud Storage (GCS) to Zenodo. Zenodo is a more accessible, open platform for hosting and sharing datasets. It is free of charge and does not depend on a GCS account, which could be deactivated. This ensures long-term availability of the dataset cloud.

To add a new dataset, follow these steps:

  1. Make sure you have a Zenodo account and obtain an access token.
  2. Use the DatasetUploader class to upload the dataset files to Zenodo, providing the required metadata and file paths.
  3. After a successful upload, a configuration file for the dataset is created automatically.
  4. Commit the generated configuration file to the repository as part of the PR.
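
For orientation, the steps above roughly correspond to the following calls against Zenodo's REST deposit API. This is only a minimal sketch and not the `DatasetUploader` from this PR: the function names (`make_request`, `send`, `upload_dataset`) and the config keys (`record_id`, `doi`, `files`) are assumptions made for illustration, and the actual class may differ in both interface and file format.

```python
import json
import os
import urllib.request

# Zenodo's REST endpoint; https://sandbox.zenodo.org/api is the test instance.
ZENODO_API = "https://zenodo.org/api"


def make_request(method, url, token, body=None, content_type="application/json"):
    """Build an authenticated request without sending it."""
    headers = {"Authorization": f"Bearer {token}"}
    if body is not None:
        headers["Content-Type"] = content_type
    return urllib.request.Request(url, data=body, headers=headers, method=method)


def send(request):
    """Send a request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def upload_dataset(token, metadata, file_paths, config_path, send=send):
    """Hypothetical stand-in for the upload steps; not the PR's DatasetUploader."""
    depositions = f"{ZENODO_API}/deposit/depositions"

    # Create an empty deposition; the response carries its id and a file bucket URL.
    deposition = send(make_request("POST", depositions, token, body=b"{}"))
    deposition_url = f"{depositions}/{deposition['id']}"

    # Upload each file into the bucket (read fully into memory for brevity).
    for path in file_paths:
        with open(path, "rb") as handle:
            payload = handle.read()
        file_url = f"{deposition['links']['bucket']}/{os.path.basename(path)}"
        send(make_request("PUT", file_url, token, payload, "application/octet-stream"))

    # Attach the metadata, then publish so the files become publicly readable.
    body = json.dumps({"metadata": metadata}).encode("utf-8")
    send(make_request("PUT", deposition_url, token, body))
    published = send(make_request("POST", f"{deposition_url}/actions/publish", token))

    # Write the configuration file that gets committed; it holds no token.
    config = {
        "record_id": published["id"],
        "doi": published.get("doi"),
        "files": [os.path.basename(path) for path in file_paths],
    }
    with open(config_path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2)
    return config
```

The `metadata` argument would be a dictionary with Zenodo's deposit fields such as `title`, `upload_type` (for example `"dataset"`), `description`, and `creators`. Passing `send` as a parameter keeps the network call replaceable, which makes it possible to try the flow against the sandbox or a fake before publishing anything.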

Because the dataset configuration file is committed to the repository, everyone can access and use the dataset, even without a Zenodo access token. The configuration file contains the information needed to download the dataset from Zenodo.
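
To illustrate why no token is needed on the consumer side, here is a minimal sketch of reading such a configuration file and fetching a file anonymously. The config layout (a JSON file with a `record_id` key) and the helper names `load_config`, `file_url`, and `download_file` are assumptions for illustration and may not match the format this PR generates; the URL pattern is the public file link Zenodo exposes for published records.

```python
import json
import shutil
import urllib.parse
import urllib.request


def load_config(path):
    """Read a committed dataset configuration file (hypothetical JSON layout)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def file_url(config, filename):
    """Public download URL for one file of a published Zenodo record."""
    quoted = urllib.parse.quote(filename)
    return f"https://zenodo.org/records/{config['record_id']}/files/{quoted}?download=1"


def download_file(config, filename, destination):
    """Fetch one file anonymously; no access token is sent."""
    url = file_url(config, filename)
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)
```

Only the uploader ever handles a token; published records are publicly readable, so the committed file needs nothing secret in it.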

This change simplifies adding new datasets and makes them more easily available to all users.

@@ -54,7 +54,7 @@
    pass
    print("Using TPU!")
except ModuleNotFoundError:
    print("No TPUs...")
    pass

Why?

Collaborator Author

It's still WIP; I'll let you know once it's ready to review.

@raphaelreinauer changed the title from "WIP: Use zenodo instead of gcs for DatasetCloud" to "WIP: Move dataset cloud from GCS to Zenodo" on Apr 14, 2024
2 participants