Error when trying to train on a custom dataset #6
Comments
Just an update: I have also tried masks with 1 as the positive value and 255 as the negative value, but training also crashes with an error. No matter which combination I try, I fail to train my simple model. I have tried setting the number of classes to both 1 and 2. Obviously, I am missing something here.
I feel this is related to setting up the mask indices correctly for cross entropy. Do you have a similar setup that worked for other segmentation models you have tried? Nothing about the output classes is specific to fine-tuning the DINOv2 model.
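To make the point about mask indices concrete: cross entropy over C classes expects every target pixel to be an integer in [0, C-1] or equal to the configured ignore index, and on a GPU an out-of-range target commonly surfaces as a device-side assert. Below is a minimal sketch of a check that can be run on a loaded mask before training. The helper name `invalid_label_values` is hypothetical, and `ignore_index=255` is an assumption (PyTorch's own default is -100), so use whatever value the training criterion is actually built with.

```python
import numpy as np


def invalid_label_values(mask, num_classes, ignore_index=255):
    """Return label values that are neither valid class indices nor ignore_index.

    Hypothetical helper; ignore_index=255 is an assumption, not the repo's setting.
    """
    values = np.unique(mask)  # sorted unique pixel values
    return [
        int(v)
        for v in values
        if v != ignore_index and not (0 <= v < num_classes)
    ]


# Dummy 2x2 mask: 0 and 1 are valid for two classes, 255 is ignored, 254 is not valid.
mask = np.array([[0, 254], [255, 1]], dtype=np.uint8)
print(invalid_label_values(mask, num_classes=2))  # [254]
```

An empty list means every pixel is either a valid class index or the ignore value. Anything else is a candidate cause for the assert.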
I am not sure what you mean by setup, to be honest. I used the setup you provided for ADE20k, so this should be fine, I guess. I also suspect the masks might be the problem, but I don't know how to solve the issue.
It is hard to diagnose a problem like this. Could you provide a small code snippet to reproduce your problem, maybe with dummy input?
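As an illustration of what a dummy-input check could look like for the IoU symptom reported in this issue, here is a minimal sketch. It assumes the metric is smoothed as (intersection + eps) / (union + eps), which matches the `eps / eps` description in the report but is not taken from the repository's code. The function name `iou` and the value of `eps` are hypothetical.

```python
import numpy as np


def iou(pred, target, eps=1e-6):
    """Smoothed IoU for binary masks (assumed formula, not the repo's metric)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (intersection + eps) / (union + eps)


# Case 1: prediction and target are both all background.
empty = np.zeros((4, 4), dtype=np.uint8)
print(iou(empty, empty))  # 1.0, because the ratio reduces to eps / eps

# Case 2: the target really contains positive pixels.
target = np.zeros((4, 4), dtype=np.uint8)
target[:2, :2] = 1  # 4 positive pixels
pred = np.zeros((4, 4), dtype=np.uint8)
pred[:2, 1:3] = 1   # 4 positive pixels, 2 of them overlap the target
print(iou(pred, target))  # about 0.333 (2 / 6)
```

If the real pipeline reports an IoU of exactly 1 from the first epoch, comparing against a check like this suggests the positive pixels are being lost before the metric is computed.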
I am trying to train a model on a custom dataset which contains only 1 class basically (and the background of course).
I have modified the pipeline to include the new dataset. For some reason I cannot train the model though.
First of all, how should the masks be created? I have tried these combinations:
a) grayscale image with 255 value for positive pixels and 0 for negative
b) grayscale image with 1 value for positive pixels and 0 for negative.
Neither of them seems to work as expected. When training does run, the IoU is 1 from the first epoch. While debugging, I found that no actual mask was passed (the mask seems to contain only negative values), so both the intersection and the union are 0 and only eps remains (`IoU = eps / eps = 1`).
In the other case, training cannot complete: it throws `RuntimeError: CUDA error: device-side assert triggered` in `loss = criterion(logits, masks)`.
a) Should I define 2 classes for the model or 1?
b) The above error does not seem to be thrown if I use `mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)` instead of `mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE) - 1` (but then the IoU issue appears). Why are you subtracting 1 from the masks?
c) How should I create the masks for the model?
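On questions b) and c), a minimal sketch of what the subtraction does to an 8-bit mask, and of one way to turn a 0/255 mask into class indices. `cv2.imread` with `cv2.IMREAD_GRAYSCALE` returns a `uint8` array, and `uint8` arithmetic wraps around, so subtracting 1 turns 0 into 255. For ADE20K-style labels (1..150, with 0 meaning unlabeled) this shift is a common way to map classes to 0..149 and send unlabeled pixels to an ignore value of 255. Whether that is the intent in this repository is an assumption. The helper `to_class_indices` is hypothetical, and the array below stands in for a mask loaded from disk.

```python
import numpy as np


def to_class_indices(mask):
    """Map every non-zero pixel to class 1 and every zero pixel to class 0."""
    return (mask > 0).astype(np.int64)


# Stand-in for cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE) on a 0/255 mask.
raw = np.array([[0, 255], [255, 0]], dtype=np.uint8)

# uint8 arithmetic wraps: background 0 becomes 255, foreground 255 becomes 254.
wrapped = raw - 1
print(wrapped.tolist())  # [[255, 254], [254, 255]]

# Binary remap without the subtraction: background -> 0, foreground -> 1.
labels = to_class_indices(raw)
print(labels.tolist())  # [[0, 1], [1, 0]]
```

With labels in {0, 1}, a two-class head (background and foreground) trained with cross entropy is one consistent setup. After the subtraction, the 254 values would fall outside that range.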