Describe the bug
When running a toy example with one layer, zero weights, a zero input tensor, and zero target tensors on the Cuda backend with f32, the returned gradients are invalid once the layer exceeds a certain size (the exact limit is unclear).
To Reproduce
Run the supplied test with the NDArray backend.
Observe the gradients; they should all be zero.
Run the supplied test with the Cuda backend using f32.
Observe the gradients; they are not zero.
Decrease the dimensions to 64*64 with output 16 and run with the Cuda backend.
Observe the gradients; they are zero, as expected.
Expected behavior
Gradients are always zero.
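To make the expectation concrete, here is a minimal CPU-only sketch in plain Rust (no Burn, no GPU) of why every gradient should be exactly zero. It rests on assumptions not stated in this report: the layer is a bias-free linear layer, the loss is mean squared error, and "128*128, 64" means 128*128 input features and 64 outputs. The function name `linear_mse_weight_grad` is hypothetical and is not part of any library.

```rust
/// Hand-written weight gradient for a bias-free linear layer y = x * w
/// under a mean-squared-error loss (both are assumptions, see above).
/// Layout is row-major: x is [batch, d_in], w is [d_in, d_out], t is [batch, d_out].
fn linear_mse_weight_grad(
    x: &[f32],
    w: &[f32],
    t: &[f32],
    batch: usize,
    d_in: usize,
    d_out: usize,
) -> Vec<f32> {
    // Forward pass: y = x * w
    let mut y = vec![0.0f32; batch * d_out];
    for b in 0..batch {
        for i in 0..d_in {
            let xv = x[b * d_in + i];
            for o in 0..d_out {
                y[b * d_out + o] += xv * w[i * d_out + o];
            }
        }
    }

    // Backward pass: dL/dy = 2 * (y - t) / (batch * d_out), dL/dw = x^T * dL/dy
    let scale = 2.0 / (batch * d_out) as f32;
    let mut grad = vec![0.0f32; d_in * d_out];
    for b in 0..batch {
        for i in 0..d_in {
            let xv = x[b * d_in + i];
            for o in 0..d_out {
                let dy = scale * (y[b * d_out + o] - t[b * d_out + o]);
                grad[i * d_out + o] += xv * dy;
            }
        }
    }
    grad
}

fn main() {
    // Assumed reading of the failing configuration: 128*128 inputs, 64 outputs.
    let (batch, d_in, d_out) = (2, 128 * 128, 64);
    let x = vec![0.0f32; batch * d_in];
    let w = vec![0.0f32; d_in * d_out];
    let t = vec![0.0f32; batch * d_out];

    let grad = linear_mse_weight_grad(&x, &w, &t, batch, d_in, d_out);
    let all_zero = grad.iter().all(|g| *g == 0.0);
    println!("all gradients zero: {all_zero}");
}
```

With zero inputs, every term of the weight gradient is multiplied by an input value of zero, so the result is zero regardless of layer size. Under these assumptions, any nonzero value from a backend points at the kernel rather than the math.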
Additional context
At @nathanielsimard's suggestion, I tried the Cuda backend with f16, and the gradients were as expected: all zeroes!
Test:
Result when running the above test (Cuda, 128*128, 64):
Result when running with NDArray (NDArray, 128*128, 64):