About Visual Prompt Encoder with negative visual prompts #91
During training, negative prompts contrast categories so that the model learns to separate them. For example, with a batch size of 2, you might have categories A and B in Figure 1, and C and D in Figure 2. The prompt embeddings for all four categories (A, B, C, and D) are gathered across the mini-batch. When computing the classification loss, a prompt is positive for objects of its own category and negative for objects of every other category. So B, C, and D are negative prompts for category A, and A, C, and D are negative prompts for category B.
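To make the A/B/C/D example concrete, here is a minimal NumPy sketch of how such targets could be built, not the repository's actual code. The function names, the toy embeddings, and the choice of a sigmoid binary cross-entropy loss are all assumptions made for illustration; the thread does not say which classification loss is used.

```python
import numpy as np

def build_targets(query_labels, prompt_labels):
    """targets[i, j] is 1 when query i and prompt j share a category, else 0."""
    q = np.asarray(query_labels)[:, None]
    p = np.asarray(prompt_labels)[None, :]
    return (q == p).astype(np.float64)

def bce_with_logits(query_emb, prompt_emb, targets):
    """Mean binary cross-entropy over all (query, prompt) pairs (assumed loss)."""
    logits = query_emb @ prompt_emb.T  # similarity of every query to every prompt
    # Numerically stable form of BCE on raw logits.
    loss = np.maximum(logits, 0) - logits * targets + np.log1p(np.exp(-np.abs(logits)))
    return loss.mean()

# Prompts gathered across the mini-batch: A, B from image 1 and C, D from image 2.
prompt_labels = ["A", "B", "C", "D"]
# Hypothetical object queries and their ground-truth categories.
query_labels = ["A", "B", "C", "D", "A"]

targets = build_targets(query_labels, prompt_labels)

# Toy embeddings: one orthogonal direction per category.
prompt_emb = 5.0 * np.eye(4)
aligned_queries = prompt_emb[[0, 1, 2, 3, 0]]     # each query matches its own prompt
misaligned_queries = prompt_emb[[1, 2, 3, 0, 1]]  # each query matches a wrong prompt

loss_aligned = bce_with_logits(aligned_queries, prompt_emb, targets)
loss_misaligned = bce_with_logits(misaligned_queries, prompt_emb, targets)
```

Each row of `targets` has a single 1 (the query's own category) and zeros for the other three prompts, which act as negatives. With these toy embeddings, the loss is lower when every query sits close to its own category's prompt than when it sits close to another category's prompt.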
So, as I understand it, during training every category other than the current one is treated as a negative example?
At inference, using negative prompts is essentially a multi-class classification problem. For example, if you want to detect red apples but the image also contains green apples, providing visual prompts only for red apples might cause the model to detect the green apples as well. In that case, you can provide visual prompts for both red and green apples, which turns the problem into a two-class classification task, and then keep only the bounding boxes classified as red apples.
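The red/green apple case can be sketched as follows. This is a toy illustration under stated assumptions, not the model's real inference path: `select_target_boxes` is a hypothetical helper, the 2-D embeddings are made up, and a dot product stands in for the similarity score.

```python
import numpy as np

def select_target_boxes(box_emb, prompt_emb, target_index=0):
    """Return indices of boxes whose best-matching prompt is the target prompt."""
    scores = box_emb @ prompt_emb.T   # (num_boxes, num_prompts) similarity scores
    assigned = scores.argmax(axis=1)  # each box is assigned to its closest prompt
    return np.flatnonzero(assigned == target_index)

# Hypothetical prompt embeddings: index 0 is the target, index 1 the negative prompt.
red_apple = np.array([1.0, 0.0])
green_apple = np.array([0.0, 1.0])
prompts = np.stack([red_apple, green_apple])

# Hypothetical candidate box embeddings: boxes 0 and 2 look red, box 1 looks green.
boxes = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.7, 0.3]])

kept_with_negative = select_target_boxes(boxes, prompts)         # red vs. green
kept_without_negative = select_target_boxes(boxes, prompts[:1])  # red prompt only
```

With both prompts, only boxes 0 and 2 are kept. With the red prompt alone there is nothing to compete against, so in this toy setup every box, including the green one, ends up assigned to red.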
I understand. I initially thought it was a similar implementation to SAM, which involved multiple iterations of prompts and negative prompts.
Yes, this is the way.
I have another question. When making predictions, if the target object is not present in the image, is it filtered based on the similarity threshold?
Hi, thanks for the great work! I have a question about the paper.
The paper mentions that negative prompts can be used. Could you please explain how negative prompts are implemented in the training strategy?