Some questions about the code #1
Comments
Thanks for your interest in our work.
Thanks for your reply. I've realized I had a misunderstanding about gradient propagation. I'm used to overriding backward for backpropagation, and I overlooked that combining a tensor with tensor.data (where requires_grad=False) can also rescale the gradient.
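For anyone else reading, here is a minimal sketch of the tensor / tensor.data idea described above, written from my own understanding and not copied from the repository. The helper name `scale_gradient` and the factor 0.1 are made up for illustration, and I use `.detach()` in place of `.data`, since it likewise returns a tensor with requires_grad=False.

```python
import torch


def scale_gradient(x: torch.Tensor, scale: float) -> torch.Tensor:
    """Return a tensor equal to x in the forward pass whose gradient
    with respect to x is multiplied by `scale` in the backward pass.

    Hypothetical helper for illustration only.
    """
    scaled = x * scale
    # (x - scaled).detach() carries no gradient, so only the `scaled`
    # term contributes to backward: dy/dx = scale.
    return scaled + (x - scaled).detach()


x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = scale_gradient(x, 0.1)
y.sum().backward()

print(y.detach())  # forward values match x up to floating-point rounding
print(x.grad)      # every entry is the chosen scale factor
```

No custom backward is needed here: the detached term restores the forward value, while autograd only sees the `x * scale` branch.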
I do not understand why the '.sqrt()' function is used. Can you explain it? Thanks!
Hello, thanks for your excellent work!
I have some questions about the code and look forward to your reply.
Q1.
In this code, the update of table_q seems to be invalid. Is there a problem here?
Q2.
During model inference, the scale needs to be calculated from the statistics of the input. Did you try using a fixed, trained scale at inference time?