Dataset preprocessing

Before running the main scripts, you have to convert your dataset to the format expected by webdataset.
Details about this format can be found in examples/sample_configs/README.md. Information about the datasets used is given below.

Requires pre-processing

UCR datasets (ElectricDevices and InsectSound)

Use the commands below to download and preprocess a UCR dataset (shown here for ElectricDevices and InsectSound):

cd /home/dev
wget https://timeseriesclassification.com/Downloads/ElectricDevices.zip
python examples/dataset_preprocess/ucr_prepr.py --ucr_zip=ElectricDevices.zip --save_folder=/data/ElectricDevices
rm ElectricDevices.zip
wget https://timeseriesclassification.com/Downloads/InsectSound.zip
python examples/dataset_preprocess/ucr_prepr.py --ucr_zip=InsectSound.zip --save_folder=/data/InsectSound
rm InsectSound.zip
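
The two command sequences above differ only in the dataset name, so they can be wrapped in a small helper. The following is a minimal sketch, not part of the repository: `run`, `preprocess_ucr` and the `DRY_RUN` variable are hypothetical names introduced here. It assumes that any other archive is served as `<name>.zip` under the same Downloads URL and that ucr_prepr.py takes the same two flags for it; neither has been checked beyond the two datasets above.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Hypothetical helper: download, preprocess and clean up one UCR archive.
# $1 = dataset name, e.g. ElectricDevices or InsectSound.
# The body runs in a subshell "( ... )" so that `name` does not leak into
# the calling shell; the `&&` chain stops at the first failing step.
preprocess_ucr() (
    name="$1"
    run wget "https://timeseriesclassification.com/Downloads/${name}.zip" &&
    run python examples/dataset_preprocess/ucr_prepr.py "--ucr_zip=${name}.zip" "--save_folder=/data/${name}" &&
    run rm "${name}.zip"
)
```

After `cd /home/dev`, calling `preprocess_ucr ElectricDevices` and then `preprocess_ucr InsectSound` runs the same steps as the commands above. Setting `DRY_RUN=1` first only prints the commands, which is a cheap way to check the paths before downloading anything.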

Does not require pre-processing

Amex

The dataset is already in the required format.
Download the Amex dataset, create the /data/amex directory, and move train.parquet and test.parquet into it.
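
The Amex setup step can be scripted along these lines. This is a minimal sketch: `stage_amex` is a hypothetical helper, not part of the repository, and it assumes the two parquet files have already been downloaded into a local directory that is passed as the first argument. The optional second argument is an addition for convenience and defaults to /data/amex.

```shell
# Hypothetical helper: move the downloaded Amex files into the dataset folder.
# $1 = directory containing train.parquet and test.parquet (assumed downloaded)
# $2 = target directory (optional, defaults to /data/amex)
# The body runs in a subshell "( ... )", so `exit 1` only ends the helper.
stage_amex() (
    src="$1"
    dest="${2:-/data/amex}"
    # Check that both files are present before moving anything.
    for f in train.parquet test.parquet; do
        if [ ! -f "$src/$f" ]; then
            echo "missing $src/$f" >&2
            exit 1
        fi
    done
    mkdir -p "$dest" &&
    mv "$src/train.parquet" "$src/test.parquet" "$dest/"
)
```

For example, `stage_amex ~/Downloads` moves both files from ~/Downloads into /data/amex, and returns a non-zero status without moving anything if either file is missing.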

Alpha

The dataset is taken from this hackathon, but the links for downloading the data and submitting results no longer work.

VTB

TBA