Trouble Training and Evaluating Architectures due to Data Directory Structure Confusion #35

Jonayet10 · 2024-07-23T03:31:56Z

I am having trouble reproducing the results of this codebase. The README file and the code do not make clear how the input data directory should be structured. Could you clarify exactly how the data should be laid out under the root directory before training a model? Below is what I have done so far in my efforts to reproduce the results of your experiments:

I am using Google Colab for all of this, and I fixed several incompatible dependency versions in the requirements.txt file.
I downloaded the CelebA dataset from the Google Drive link in your README file, specifically CelebA/Img/img_align_celeba.zip.
I then unzipped this file and wrote a Python script to split the data into img_align_celeba_splits/train, img_align_celeba_splits/test, and img_align_celeba_splits/val based on the splits/ folder in your codebase. The train, test, and val folders each contain male and female folders with the jpg images inside. I assigned images to male or female based on your celeba/CelebA_demographics.txt file.
I also edited the user/configs/config_user_celeba.yaml file, replacing comet_api_key and comet_workspace with my own and setting default_val_root, default_train_root, and default_test_root to the corresponding split directories.
Then, after successfully running ‘bash scripts/create_configs_celeba.sh’, I get an error when running ‘bash scripts/experiments_default_celeba.sh’. The error is raised in the make_dataset function of src/utils/data_utils_balanced.py: ValueError: ‘class_to_index’ must have at least one entry to collect any samples.
Based on inspection of your code in the src/utils/data_utils_balanced.py file, I thought the data directory might need to be structured like .../train/1/img1.jpg, .../test/2/img2.jpg, etc. (Note that the jpg file names are unchanged from when I unzipped the CelebA/Img/img_align_celeba.zip file, and my data directory, img_align_celeba_splits, now has train, test, and val folders in which each image sits in its own folder.) After this change, running ‘bash scripts/experiments_default_celeba.sh’ gives the error: File "/usr/lib/python3.10/random.py", line 482, in sample raise ValueError("Sample larger than population or is negative") ValueError: Sample larger than population or is negative
To get past this error, I set the min_num parameter to 1 instead of args.min_num_images when creating instances of the ImageFolderWithProtectedAttributes class in the src/utils/data_utils_balanced.py file, and I commented out ‘instances_additional[dem] = random.sample(instances_additional[dem], k=num_additional_images_to_keep[dem])’ in that file.
After this, running ‘bash scripts/experiments_default_celeba.sh’ gave me an error saying that ‘optimizer’ is not defined as a key in the config dictionary in src/train/fairness_train_celeba.py, so I manually added optimizer: Adam and lr: 0.0005 to the config dictionary. I am not sure why I had to do this. The config file created by running ‘bash scripts/create_configs_celeba.sh’ is loaded into the options variable in src/train/fairness_train_celeba.py, and it contains lr and optimizer, but nothing seems to be done with them after the file is loaded?
After these changes, training starts when I run ‘bash scripts/experiments_default_celeba.sh’, but I then get the error ‘Error processing file: celeba_configs/configs_default/coat_lite_small/config_coat_lite_small_CosFace_Adam.yaml’.
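For anyone hitting the same "Sample larger than population or is negative" error: it is the standard behaviour of Python's random.sample when k exceeds the number of available items, which is what happens when a class folder holds fewer images than the loader tries to keep. Below is a minimal sketch that reproduces it and shows one possible guard. The helper name sample_at_most is hypothetical and not part of the repository, and capping k may not match what the authors intend.

```python
import random


def sample_at_most(population, k, seed=None):
    """Return up to k items sampled without replacement.

    Hypothetical helper, not part of the repository: it caps k at the
    population size so random.sample never receives k > len(population).
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    items = list(population)
    rng = random.Random(seed)
    return rng.sample(items, min(k, len(items)))


if __name__ == "__main__":
    images = ["000001.jpg"]  # stands in for a folder holding a single image

    try:
        random.sample(images, k=3)  # asks for more items than exist
    except ValueError as err:
        print("random.sample failed:", err)

    # The capped version returns whatever is available instead of raising.
    print(sample_at_most(images, k=3))
```

This only hides the symptom. If the loader expects many images per class folder, a one-image-per-folder layout is probably not what it was written for.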

In conclusion, how exactly should the data directory be formatted? Right now I have had to change things like setting the min_num parameter to 1 to fit my current layout, and even after that I continue to have issues training and evaluating the architectures. The file celeba_configs/configs_default/coat_lite_small/config_coat_lite_small_CosFace_Adam.yaml exists and looks fine, by the way.
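To make the question concrete, here is a minimal sketch of the first layout I tried (split/group/image) together with a per-folder image count. The function names and the two dictionaries split_of and group_of are hypothetical: they stand in for whatever is parsed from the splits/ folder and celeba/CelebA_demographics.txt, whose file formats I am not assuming here. This is not necessarily the layout the repository expects.

```python
import shutil
from collections import Counter
from pathlib import Path


def build_split_tree(src_dir, dst_root, split_of, group_of):
    """Copy images into dst_root/<split>/<group>/<filename>.

    split_of and group_of are hypothetical dicts mapping a file name to
    its split ("train", "val", "test") and its group ("male", "female").
    Files missing from group_of or from src_dir are skipped.
    """
    src_dir, dst_root = Path(src_dir), Path(dst_root)
    counts = Counter()
    for name, split in split_of.items():
        group = group_of.get(name)
        src = src_dir / name
        if group is None or not src.is_file():
            continue
        target_dir = dst_root / split / group
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target_dir / name)
        counts[(split, group)] += 1
    return counts


def count_images(root, suffix=".jpg"):
    """Count images per <split>/<subfolder> directly under root."""
    counts = {}
    for split_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        for sub in sorted(p for p in split_dir.iterdir() if p.is_dir()):
            n = sum(1 for f in sub.iterdir() if f.suffix.lower() == suffix)
            counts[f"{split_dir.name}/{sub.name}"] = n
    return counts
```

Running count_images on img_align_celeba_splits shows how many images each class folder holds, which can be compared against args.min_num_images to see whether a folder is too small for the loader.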
