Poor mask quality at the edge of objects #5

Zailushang211 · 2025-01-15T07:17:38Z

Hi, thanks a lot for your outstanding work.
I followed the demo but found that the mask quality at the edges of objects is quite poor, which I do not see with SAM2. Did I miss a config option? Do you have any suggestions for improving it?

HarborYuan · 2025-01-15T07:26:45Z

Hi @Zailushang211,

Thanks for your interest in our work. One possible reason for this is that the instructions given were not clear enough. For example, giving a description that applies to all three objects above could lead to this situation. Using a larger model or giving a clearer description might alleviate this problem. Also, the image doesn't seem to be of very high quality, which might also affect performance.

Please let me know if you have any other questions.

Zailushang211 · 2025-01-16T02:05:27Z

Thanks a lot for your explanation, I will try again. I did use the 8B model, but it made little difference.
As for "a description that applies to all three objects", I think this is a common need in many applications.
As the results show, the model did find the right object but did not segment it cleanly at the edges. A similar jagged-edge issue also appears in FastSAM (https://github.com/CASIA-IVA-Lab/FastSAM), so I wonder whether the two are related.

HarborYuan · 2025-01-16T02:46:53Z

If this happens with both language-instruction input (Sa2VA) and visual-prompt input (FastSAM), it is more likely that the models are not robust in this case because of image clarity or other factors. This is a canteen scene, and the image does not seem very clear, which may affect performance. For a real-world application, you could try feeding the mask generated by Sa2VA into SAM-2.
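
One way to try this, as a rough sketch rather than an official recipe: turn the coarse mask into a padded bounding box and pass that box to SAM-2 as a prompt. The helper name `mask_to_box_prompt` and the `pad_ratio` parameter are hypothetical, and the mask is assumed to be a 2-D binary NumPy array of shape (H, W). The SAM-2 call in the trailing comment assumes the `SAM2ImagePredictor` interface from the SAM-2 repository and is not run here.

```python
import numpy as np


def mask_to_box_prompt(mask: np.ndarray, pad_ratio: float = 0.05) -> np.ndarray:
    """Return a padded XYXY box (pixel coordinates) around the mask foreground.

    Hypothetical helper: `mask` is assumed to be a 2-D binary array (H, W).
    """
    if mask.ndim != 2:
        raise ValueError("expected a 2-D (H, W) mask")
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask has no foreground pixels")

    h, w = mask.shape
    x0, x1 = xs.min(), xs.max()
    y0, y1 = ys.min(), ys.max()

    # Pad the box a little so a coarse mask that clips the true boundary
    # does not constrain the refinement step too tightly.
    pad_x = pad_ratio * (x1 - x0 + 1)
    pad_y = pad_ratio * (y1 - y0 + 1)
    box = np.array(
        [x0 - pad_x, y0 - pad_y, x1 + pad_x, y1 + pad_y], dtype=np.float32
    )

    # Keep the box inside the image.
    box[[0, 2]] = np.clip(box[[0, 2]], 0, w - 1)
    box[[1, 3]] = np.clip(box[[1, 3]], 0, h - 1)
    return box


# Assumed follow-up (not executed here), given a loaded SAM2ImagePredictor
# named `predictor` and an RGB image array `image`:
#   predictor.set_image(image)
#   masks, scores, _ = predictor.predict(box=box, multimask_output=False)
```

Whether this actually removes the jagged edges depends on the image; if the box alone is too loose for crowded scenes, adding a few foreground points sampled from the coarse mask is another option to experiment with.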

HarborYuan self-assigned this Jan 15, 2025
