The training result is blank #11
Comments
Are there screenshots of the training process? |
No error
|
Hi @wang674, have you been able to solve this problem? I am encountering a similar issue while fine-tuning the model on a custom dataset. The model produces the expected output until epoch 6, but after that it begins to generate blank outputs. |
The network is sensitive to weight initialisation and learning rate. With a suitable initial learning rate and the default weight initialisation, it works well. |
Hi @hyyuan123 , |
@kanthprashant |
Hello, I had a similar problem. I trained on my own hardware, starting from the weights published in the author's GitHub repository. After many rounds of training, even with the learning rate set very small, the predicted result was still gray and white. Later, I found that my code had two problems. The first is that I saved the model in dictionary form:
torch.save(
    {
        "state_dict": sdafnet.state_dict(),
    },
    "savemodel.pt",
)
but the prediction code loaded the file directly into the network, as if it had been saved with torch.save(net.state_dict(), save_path):
torch.save(sdafnet.state_dict(), "savemodel.pt")
The save format and the load code have to match. The second reason is this:
If you use … The solution is to load the model weights on a single GPU rather than on multiple GPUs. If you need to train using multiple GPUs, you can use … |
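For anyone hitting the same two problems, here is a minimal sketch of a loader that tolerates both checkpoint layouts described above. It rests on assumptions that are not confirmed in this thread: that the multi-GPU issue comes from `nn.DataParallel`, which prefixes every parameter name with `module.`, and the helper names `extract_state_dict`, `strip_module_prefix` and `normalize_checkpoint` are hypothetical, not part of DAFlow or PyTorch. The helpers work on plain dictionaries, so they can be checked without a GPU.

```python
from collections import OrderedDict


def extract_state_dict(checkpoint):
    """Return the weights mapping whether the file was saved as
    {"state_dict": ...} or as a bare state_dict."""
    if isinstance(checkpoint, dict) and "state_dict" in checkpoint:
        return checkpoint["state_dict"]
    return checkpoint


def strip_module_prefix(state_dict, prefix="module."):
    """Drop the prefix that nn.DataParallel adds to parameter names
    (assumed to be the cause of the multi-GPU mismatch)."""
    return OrderedDict(
        (key[len(prefix):] if key.startswith(prefix) else key, value)
        for key, value in state_dict.items()
    )


def normalize_checkpoint(checkpoint):
    """Unwrap the checkpoint, then remove any 'module.' prefixes."""
    return strip_module_prefix(extract_state_dict(checkpoint))


# Intended use with PyTorch (not executed here):
#   checkpoint = torch.load("savemodel.pt", map_location="cpu")
#   sdafnet.load_state_dict(normalize_checkpoint(checkpoint))
```

Keeping `load_state_dict` at its default `strict=True` is probably worth it here: with `strict=False`, keys that do not match are skipped silently, so the network can end up with its initial weights, which may be one way to get gray or blank predictions.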
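Hello, I would like to ask: the model files saved during my training cannot be loaded, the training result is also blank, and the loss has not converged. What could be the problem? |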
The training result is blank.