When I try to run the trainer on a recent installation (CUDA Toolkit 11.0, cuDNN 8.0, PyTorch 1.7.0), I get the following error:
Start training epoch 1
0%| | 0/391 [00:01<?, ?it/s]
Traceback (most recent call last):
File "phd_lab/extract_latent_representations.py", line 26, in <module>
main()
File "/space/conda/user/ulf/envs/phd-lab/lib/python3.8/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/space/conda/user/ulf/envs/phd-lab/lib/python3.8/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/space/conda/user/ulf/envs/phd-lab/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/space/conda/user/ulf/envs/phd-lab/lib/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "phd_lab/extract_latent_representations.py", line 22, in main
main(config_path=Path(config), run_id=run_id, device=device)
File "/home/ulf/projects/github/phd-lab/phd_lab/experiments/main.py", line 81, in __call__
executor(
File "/home/ulf/projects/github/phd-lab/phd_lab/experiments/train_test_executor.py", line 130, in __call__
trainer.train()
File "/home/ulf/projects/github/phd-lab/phd_lab/experiments/trainer.py", line 229, in train
train_metric = self.train_epoch()
File "/home/ulf/projects/github/phd-lab/phd_lab/experiments/trainer.py", line 265, in train_epoch
self._eval_metrics(labels, outputs)
File "/home/ulf/projects/github/phd-lab/phd_lab/experiments/trainer.py", line 170, in _eval_metrics
metric.update(y_true, y_pred)
File "/home/ulf/projects/github/phd-lab/phd_lab/metrics/classification.py", line 35, in update
self.accuracy_accumulator += self._accuracy(y_pred, y_true, (5,))[0]
File "/home/ulf/projects/github/phd-lab/phd_lab/metrics/classification.py", line 25, in _accuracy
correct_k = correct[:k].view(-1).float().sum(0)
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
Downgrading to CUDA Toolkit 10.2, cuDNN 7.6.5, and PyTorch 1.5.1 solves the problem.
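As the error message itself suggests, replacing `.view(-1)` with `.reshape(-1)` on the failing line may avoid the downgrade. The body of `_accuracy` is not shown in the traceback, so the following is only a sketch modeled on the commonly used top-k accuracy helper; the function name `topk_accuracy` and its argument names are assumptions, not the repository's code. The likely cause is that `correct` is derived from a transposed tensor and is therefore not contiguous, which `view` cannot handle but `reshape` can (it copies when needed).

```python
import torch


def topk_accuracy(output, target, topk=(1,)):
    """Return the top-k accuracy in percent for each k in topk (hypothetical helper)."""
    maxk = max(topk)
    batch_size = target.size(0)

    # Indices of the maxk highest scores per sample: (batch, maxk) -> (maxk, batch).
    _, pred = output.topk(maxk, dim=1, largest=True, sorted=True)
    pred = pred.t()
    # True where the prediction at rank i matches the label of the sample.
    correct = pred.eq(target.view(1, -1).expand_as(pred))

    results = []
    for k in topk:
        # reshape() works on non-contiguous tensors; view() may raise the
        # RuntimeError from the traceback above.
        correct_k = correct[:k].reshape(-1).float().sum(0)
        results.append(correct_k.mul_(100.0 / batch_size))
    return results


# Small check: every row scores class 5 highest and class 0 lowest.
logits = torch.arange(6).float().repeat(4, 1)
labels = torch.tensor([5, 1, 0, 0])
acc1, acc5 = topk_accuracy(logits, labels, topk=(1, 5))
```

An equivalent alternative would be `correct[:k].contiguous().view(-1)`. Both forms also run on older PyTorch versions, so the change should not break the 1.5.1 setup.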