Before running the main scripts, you have to preprocess your dataset into the format expected by webdataset.
Details about this format can be found in examples/sample_configs/README.md
Information about the datasets used is given below.
Use the commands below to download and preprocess a UCR dataset (ElectricDevices and InsectSound are shown; other UCR datasets follow the same pattern):
cd /home/dev
wget https://timeseriesclassification.com/Downloads/ElectricDevices.zip
python examples/dataset_preprocess/ucr_prepr.py --ucr_zip=ElectricDevices.zip --save_folder=/data/ElectricDevices
rm ElectricDevices.zip
wget https://timeseriesclassification.com/Downloads/InsectSound.zip
python examples/dataset_preprocess/ucr_prepr.py --ucr_zip=InsectSound.zip --save_folder=/data/InsectSound
rm InsectSound.zip
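Since the same three steps repeat for every dataset, they can be wrapped in a small shell function. The sketch below is one possible way to do that. It assumes each UCR archive is served at https://timeseriesclassification.com/Downloads/<name>.zip, as in the two examples above. The names ucr_url, prepare_ucr, and DRY_RUN are hypothetical helpers for illustration and are not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the repository).

# Build the download URL, assuming the <name>.zip pattern used above.
ucr_url() {
    echo "https://timeseriesclassification.com/Downloads/$1.zip"
}

# Download, preprocess, and clean up one UCR dataset.
# With DRY_RUN=1 the commands are printed instead of executed.
prepare_ucr() {
    local name="$1"
    local run=""
    if [ "${DRY_RUN:-0}" = "1" ]; then run="echo"; fi
    $run wget "$(ucr_url "$name")" &&
    $run python examples/dataset_preprocess/ucr_prepr.py --ucr_zip="$name.zip" --save_folder="/data/$name" &&
    $run rm "$name.zip"
}

# Example usage (run from /home/dev, as in the commands above):
#   for name in ElectricDevices InsectSound; do prepare_ucr "$name"; done
```

Setting DRY_RUN=1 before calling prepare_ucr prints the three commands without downloading anything, which is a cheap way to check the paths first.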
The dataset is already in the required format.
Download the Amex dataset. Create the /data/amex directory and move train.parquet and test.parquet into it.
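The directory setup for Amex could look like the following. This is a minimal sketch that assumes train.parquet and test.parquet have already been downloaded to a local directory. The stage_amex function and its two arguments are hypothetical and not part of the repository.

```shell
# Hypothetical helper: move the Amex parquet files into the target directory.
# $1 = directory holding the downloaded files (default: current directory)
# $2 = destination directory (default: /data/amex)
stage_amex() {
    local src="${1:-.}"
    local dest="${2:-/data/amex}"
    mkdir -p "$dest" &&
    mv "$src/train.parquet" "$src/test.parquet" "$dest/"
}

# Example usage, with the files in the current directory:
#   stage_amex
```

Because mkdir -p is used, the call is safe to repeat if the directory already exists.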
The dataset is taken from this hackathon, but the links for downloading the data and submitting results do not work.
TBA