Question about the 'sherbet_pretrain' parameters #2

Open
1245505490 opened this issue Jun 23, 2024 · 1 comment

Comments

@1245505490

Thank you for sharing the code.
In the pre-training phase, the parameters are saved to 'sherbet_pretrain'. Why are these saved parameters not actually loaded from 'sherbet_pretrain' and used in the subsequent model training?

@LuChang-CS
Owner

Hi, in main/train.py, the pre-trained weights are saved through the sherbet_pretrain model:

sherbet_pretrain = Sherbet(sherbet_feature, pretrain_model_conf, hyper_params)
sherbet_pretrain.save_weights(op_conf['pretrain_path'])

As you can see, because sherbet_pretrain wraps sherbet_feature, the saved weights include the weights of sherbet_feature.
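The mechanism can be illustrated with a small dependency-free sketch. The class and attribute names below (FeatureExtractor, Model, get_weights) are illustrative stand-ins, not the repository's actual Sherbet API; the point is only that a checkpoint taken from the wrapper model contains the nested sub-model's weights.

```python
class FeatureExtractor:
    """Stands in for sherbet_feature: holds its own weights."""
    def __init__(self):
        self.weights = {"embedding": 0.0}


class Model:
    """Stands in for Sherbet: wraps a shared feature extractor plus a head."""
    def __init__(self, feature):
        self.feature = feature            # shared sub-model, not a copy
        self.head = {"classifier": 0.0}

    def get_weights(self):
        # Like Keras save_weights: nested sub-model weights are included.
        return {"feature": dict(self.feature.weights),
                "head": dict(self.head)}


feature = FeatureExtractor()
pretrain = Model(feature)

# Pretraining updates the shared feature extractor...
feature.weights["embedding"] = 0.42

# ...so a checkpoint taken from the wrapper contains those weights too.
checkpoint = pretrain.get_weights()
print(checkpoint["feature"]["embedding"])  # 0.42
```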

In training, all weights, including those of sherbet_feature, are loaded by the following code:

if op_conf['from_pretrain']:
    sherbet_pretrain = Sherbet(sherbet_feature, pretrain_model_conf, hyper_params)
    sherbet_pretrain.load_weights(op_conf['pretrain_path'])

Then, the sherbet model used for training is created with the sherbet_feature that now holds the pre-trained weights:

sherbet = Sherbet(sherbet_feature, model_conf, hyper_params)

This way, only the weights of the feature extraction part are loaded. Note that, because the feature extractor is shared, the weights of sherbet_feature (and therefore of sherbet_pretrain) will be updated by the subsequent training. That is fine as long as you do not continue to use sherbet_pretrain for other purposes afterwards.
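The load-and-share pattern, including the caveat above, can be sketched with the same kind of toy classes (again, hypothetical names, not the repository's API): loading the checkpoint into the pretrain wrapper fills in the shared feature extractor, so the downstream model starts from those weights; conversely, fine-tuning the downstream model mutates the extractor seen by the pretrain wrapper, because it is the same object.

```python
class FeatureExtractor:
    """Stands in for sherbet_feature."""
    def __init__(self):
        self.weights = {"embedding": 0.0}


class Model:
    """Stands in for Sherbet."""
    def __init__(self, feature):
        self.feature = feature            # shared sub-model, not a copy
        self.head = {"classifier": 0.0}

    def set_weights(self, ckpt):
        # Like Keras load_weights: fills the nested sub-model in place.
        self.feature.weights.update(ckpt["feature"])
        self.head.update(ckpt["head"])


feature = FeatureExtractor()
pretrain = Model(feature)   # plays the role of sherbet_pretrain
sherbet = Model(feature)    # downstream model shares the same feature

# Loading the checkpoint into the pretrain wrapper fills the shared
# feature extractor, so the downstream model picks it up "for free".
pretrain.set_weights({"feature": {"embedding": 0.42},
                      "head": {"classifier": 0.0}})
print(sherbet.feature.weights["embedding"])   # 0.42

# The caveat: fine-tuning the downstream model also changes the feature
# weights seen through the pretrain wrapper, since it is the same object.
sherbet.feature.weights["embedding"] = 0.99
print(pretrain.feature.weights["embedding"])  # 0.99
```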
