Make Benchmark Dataset Accessible to the public #189

Open
lauravchrobak opened this issue Dec 12, 2024 · 0 comments
Labels
back-end Involves the back-end (API/DB)

Comments

lauravchrobak commented Dec 12, 2024

See the design document [here](https://docs.google.com/document/d/1M7_vqzkbqjLzA4HY_nB6v_vOVTUSziqUrNrsoe-axAw/edit?tab=t.0#heading=h.tarajnnzekgy).

Design Decision: How to access benchmark externally
Problem: Right now our benchmark lives on Google Drive, which works for internal training but is not accessible externally. Furthermore, due to FNDB licensing, we cannot publish the benchmark.

Solution: Instead, we can use the FNDB's history-tracking capability so that anyone can download the defined benchmark. History tracking means we can query the FNDB as of a fixed point in time.

Features:

  • Provide a CSV of the train/test/val images, as generated by the current implementation (see above), in a public GitHub repo
  • train/test/val splits are tagged in the database, per our current implementation
  • A script uses the FN API to download those images and annotations as of a specific date
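
As a rough illustration of the script described above, here is a minimal sketch of the first two steps: reading the split CSV and preparing one dated query per image. The CSV columns (`image_id`, `split`, `url`), the query parameter names (`image_id`, `as_of`), the example URLs, and the snapshot date are all assumptions made for illustration. The real FNDB API and CSV layout may differ, and the actual download call is left out because its interface is not specified here.

```python
import csv
import io
from collections import defaultdict
from datetime import date

# Hypothetical CSV layout: one row per image with its split assignment.
SAMPLE_CSV = """image_id,split,url
img_001,train,https://example.org/images/img_001.jpg
img_002,train,https://example.org/images/img_002.jpg
img_003,val,https://example.org/images/img_003.jpg
img_004,test,https://example.org/images/img_004.jpg
"""


def load_splits(csv_text):
    """Group image rows by split name ('train', 'val', 'test')."""
    splits = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        splits[row["split"]].append(row)
    return dict(splits)


def build_request(image_id, as_of):
    """Build query parameters for one image as of a fixed date.

    The keys 'image_id' and 'as_of' are placeholders; the real API
    may name these parameters differently.
    """
    return {"image_id": image_id, "as_of": as_of.isoformat()}


BENCHMARK_DATE = date(2024, 12, 12)  # placeholder snapshot date

splits = load_splits(SAMPLE_CSV)
requests_by_split = {
    name: [build_request(row["image_id"], BENCHMARK_DATE) for row in rows]
    for name, rows in splits.items()
}
```

Pinning a single snapshot date for every request is what would make the benchmark reproducible: every user queries the database as of the same moment, regardless of when they run the script.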

Notes:

  • Take-away from the tech meeting: do all of the following: (1) provide a Python script to download the benchmark, (2) provide a list of image URLs so that folks can cross-check what is missing (inevitably some images won't be available), and (3) provide FNDB tags for easy download
  • Since we aren’t hosting all this data, inevitably some of the images will disappear. Internally we can always use a saved benchmark version for development but a perfect recreation won’t be possible externally.
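
The cross-check mentioned in the notes could look something like the sketch below: compare the published list of image IDs against the files that were actually downloaded and report the gaps. The function name `find_missing`, the one-file-per-image naming scheme (`<image_id>.jpg`), and the sample IDs are assumptions for illustration only.

```python
import tempfile
from pathlib import Path


def find_missing(expected_ids, download_dir, suffix=".jpg"):
    """Return the expected image IDs with no matching file in download_dir."""
    present = {p.stem for p in Path(download_dir).glob(f"*{suffix}")}
    return sorted(set(expected_ids) - present)


# Small demonstration: a temporary directory stands in for a real download.
with tempfile.TemporaryDirectory() as tmp:
    for image_id in ("img_001", "img_003"):
        (Path(tmp) / f"{image_id}.jpg").write_bytes(b"")
    missing = find_missing(["img_001", "img_002", "img_003"], tmp)
```

Reporting the missing IDs alongside results would let external users state exactly which subset of the benchmark they evaluated on.
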

lauravchrobak added the back-end label Dec 12, 2024