Question about the 'sherbet_pretrain' parameters #2

Open
1245505490 opened this issue Jun 23, 2024 · 1 comment

Comments

@1245505490

Thank you for sharing the code.
In the pre-training phase, the parameters are saved to 'sherbet_pretrain'. Why are these saved parameters not actually loaded from 'sherbet_pretrain' and used in the subsequent model training?

@LuChang-CS
Owner

Hi, in main/train.py, the pre-trained weights are saved through the sherbet_pretrain model:

sherbet_pretrain = Sherbet(sherbet_feature, pretrain_model_conf, hyper_params)
sherbet_pretrain.save_weights(op_conf['pretrain_path'])

As you can see, because sherbet_pretrain wraps sherbet_feature, the saved weights include the weights of sherbet_feature.
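The mechanism can be illustrated with a small dependency-free sketch. The class and attribute names below (FeatureExtractor, Model, get_weights) are illustrative stand-ins, not the repository's actual Sherbet API; the point is only that a checkpoint taken from the wrapper model contains the nested sub-model's weights.

```python
class FeatureExtractor:
    """Stands in for sherbet_feature: holds its own weights."""
    def __init__(self):
        self.weights = {"embedding": 0.0}


class Model:
    """Stands in for Sherbet: wraps a shared feature extractor plus a head."""
    def __init__(self, feature):
        self.feature = feature            # shared sub-model, not a copy
        self.head = {"classifier": 0.0}

    def get_weights(self):
        # Like Keras save_weights: nested sub-model weights are included.
        return {"feature": dict(self.feature.weights),
                "head": dict(self.head)}


feature = FeatureExtractor()
pretrain = Model(feature)

# Pretraining updates the shared feature extractor...
feature.weights["embedding"] = 0.42

# ...so a checkpoint taken from the wrapper contains those weights too.
checkpoint = pretrain.get_weights()
print(checkpoint["feature"]["embedding"])  # 0.42
```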

In training, all weights, including those of sherbet_feature, are loaded by the following code:

if op_conf['from_pretrain']:
    sherbet_pretrain = Sherbet(sherbet_feature, pretrain_model_conf, hyper_params)
    sherbet_pretrain.load_weights(op_conf['pretrain_path'])

Then, the sherbet model used for training is created with the sherbet_feature that now holds the pre-trained weights:

sherbet = Sherbet(sherbet_feature, model_conf, hyper_params)

This way, only the weights of the feature extraction part are loaded. Note that, because the feature extractor is shared, the weights of sherbet_feature (and therefore of sherbet_pretrain) will be updated by the subsequent training. That is fine as long as you do not continue to use sherbet_pretrain for other purposes afterwards.
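The load-and-share pattern, including the caveat above, can be sketched with the same kind of toy classes (again, hypothetical names, not the repository's API): loading the checkpoint into the pretrain wrapper fills in the shared feature extractor, so the downstream model starts from those weights; conversely, fine-tuning the downstream model mutates the extractor seen by the pretrain wrapper, because it is the same object.

```python
class FeatureExtractor:
    """Stands in for sherbet_feature."""
    def __init__(self):
        self.weights = {"embedding": 0.0}


class Model:
    """Stands in for Sherbet."""
    def __init__(self, feature):
        self.feature = feature            # shared sub-model, not a copy
        self.head = {"classifier": 0.0}

    def set_weights(self, ckpt):
        # Like Keras load_weights: fills the nested sub-model in place.
        self.feature.weights.update(ckpt["feature"])
        self.head.update(ckpt["head"])


feature = FeatureExtractor()
pretrain = Model(feature)   # plays the role of sherbet_pretrain
sherbet = Model(feature)    # downstream model shares the same feature

# Loading the checkpoint into the pretrain wrapper fills the shared
# feature extractor, so the downstream model picks it up "for free".
pretrain.set_weights({"feature": {"embedding": 0.42},
                      "head": {"classifier": 0.0}})
print(sherbet.feature.weights["embedding"])   # 0.42

# The caveat: fine-tuning the downstream model also changes the feature
# weights seen through the pretrain wrapper, since it is the same object.
sherbet.feature.weights["embedding"] = 0.99
print(pretrain.feature.weights["embedding"])  # 0.99
```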
