Poor mask quality at the edge of objects #5

Open
Zailushang211 opened this issue Jan 15, 2025 · 3 comments
@Zailushang211

Hi, thanks a lot for your outstanding work.
I followed the demo but found that the mask quality at object edges is quite poor, which I do not see with SAM2. Did I miss a config option, or do you have any suggestions for improving it?
(image attached)

@HarborYuan
Collaborator

Hi @Zailushang211 ,

Thanks for your interest in our work. One possible reason for this is that the instructions given were not clear enough. For example, giving a description that applies to all three objects above could lead to this situation. Using a larger model or giving a clearer description might alleviate this problem. Also, the image doesn't seem to be of very high quality, which might also affect performance.

Please let me know if you have any other questions.

@HarborYuan self-assigned this on Jan 15, 2025
@Zailushang211
Author

Thanks a lot for the explanation; I will try again. I did use the 8B model, but it made little difference.
As for "a description that applies to all three objects", I think this is a common need in many applications.
As the results show, the model did find the right object but did not segment it cleanly at the edges. The same jagged-edge issue also appears in FastSAM (https://github.com/CASIA-IVA-Lab/FastSAM), so I wonder whether the two are connected.

@HarborYuan
Collaborator

If this happens with both language-instruction input (Sa2VA) and visual-prompt input (FastSAM), it is more likely that the models are simply not robust in this case, due to image clarity or other factors. This is a canteen scene, and the image does not seem very sharp, which may affect performance. For a real-world application, you could try feeding the mask generated by Sa2VA into SAM-2.
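
One way to try the SAM-2 refinement idea is to turn the coarse mask into box and point prompts. The sketch below is only an illustration under assumptions: `mask_to_box` and `mask_to_point` are hypothetical helper names, the 5% padding is an arbitrary choice, and the commented-out SAM-2 call reflects my understanding of the `SAM2ImagePredictor` interface, which should be checked against the SAM-2 repository before use.

```python
import numpy as np


def mask_to_box(mask: np.ndarray, pad_ratio: float = 0.05) -> np.ndarray:
    """Padded [x0, y0, x1, y1] box around the foreground of a binary mask."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask has no foreground pixels")
    h, w = mask.shape
    x0, x1 = xs.min(), xs.max()
    y0, y1 = ys.min(), ys.max()
    # Pad by a fraction of the box size so jagged edges are not cut off.
    pad_x = pad_ratio * (x1 - x0 + 1)
    pad_y = pad_ratio * (y1 - y0 + 1)
    box = np.array([x0 - pad_x, y0 - pad_y, x1 + pad_x, y1 + pad_y],
                   dtype=np.float32)
    # Keep the padded box inside the image.
    box[[0, 2]] = np.clip(box[[0, 2]], 0, w - 1)
    box[[1, 3]] = np.clip(box[[1, 3]], 0, h - 1)
    return box


def mask_to_point(mask: np.ndarray) -> np.ndarray:
    """One (x, y) foreground point: the mask pixel closest to the centroid."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask has no foreground pixels")
    cy, cx = ys.mean(), xs.mean()
    # The centroid itself can fall outside a non-convex mask, so snap to
    # the nearest pixel that is actually foreground.
    idx = np.argmin((ys - cy) ** 2 + (xs - cx) ** 2)
    return np.array([[xs[idx], ys[idx]]], dtype=np.float32)


# Stand-in for a coarse Sa2VA mask (H x W, boolean).
coarse_mask = np.zeros((100, 120), dtype=bool)
coarse_mask[20:60, 30:90] = True

box = mask_to_box(coarse_mask)
point = mask_to_point(coarse_mask)

# Assumed SAM-2 usage (not verified here; check the SAM-2 README):
# from sam2.sam2_image_predictor import SAM2ImagePredictor
# predictor = SAM2ImagePredictor.from_pretrained("facebook/sam2-hiera-large")
# predictor.set_image(image)  # image: H x W x 3 RGB array
# masks, scores, _ = predictor.predict(
#     box=box, point_coords=point, point_labels=np.array([1]),
#     multimask_output=False)
```

Box-plus-point prompting is used here because it only needs the rough location from the coarse mask, leaving SAM-2 to decide the boundary; whether this actually removes the jagged edges on a low-clarity image would need to be tested.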


2 participants