
Question about Camera Conventions #17

Open
YeZhang0621 opened this issue Jan 6, 2025 · 1 comment
Comments

@YeZhang0621

Hi! Thanks for the great work!

Recently, I've been trying to run MVSplat360 on the DNA-Rendering dataset, and I have some questions:

  1. Should the camera extrinsics stored in the preprocessed torch files be OpenCV-style world-to-camera (w2c) or OpenCV-style camera-to-world (c2w)? The project README says camera-to-world, but in convert_dl3dv.py I saw this code:
for frame in meta_data["frames"]:
        timestamps.append(
            int(os.path.basename(frame["file_path"]).split(".")[0].split("_")[-1])
        )
        camera = [saved_fx, saved_fy, saved_cx, saved_cy, 0.0, 0.0]
        # transform_matrix is in blender c2w, while we need to store opencv w2c matrix here
        opencv_c2w = np.array(frame["transform_matrix"]) @ blender2opencv
        opencv_c2ws.append(opencv_c2w)
        camera.extend(np.linalg.inv(opencv_c2w)[:3].flatten().tolist())
        cameras.append(np.array(camera))

The code inverts the c2w matrix and stores the result, a w2c matrix, in the torch file, which confused me.
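If it helps to make the convention chain concrete, here is a minimal numpy sketch of the two steps in that snippet, using a made-up pose (none of these values come from an actual dataset):

```python
import numpy as np

# Blender/OpenGL -> OpenCV axis change: flip y (down) and z (forward).
blender2opencv = np.diag([1.0, -1.0, -1.0, 1.0])

# A made-up Blender-style c2w pose: a rotation about x plus a translation.
theta = np.deg2rad(30.0)
c2w_blender = np.eye(4)
c2w_blender[:3, :3] = np.array([
    [1.0, 0.0, 0.0],
    [0.0, np.cos(theta), -np.sin(theta)],
    [0.0, np.sin(theta), np.cos(theta)],
])
c2w_blender[:3, 3] = [0.5, -0.2, 2.0]

# Mirror the two steps in convert_dl3dv.py:
opencv_c2w = c2w_blender @ blender2opencv   # c2w in OpenCV axes
opencv_w2c = np.linalg.inv(opencv_c2w)      # w2c: what gets flattened and stored

# Inverting the stored matrix recovers the OpenCV c2w exactly.
assert np.allclose(np.linalg.inv(opencv_w2c), opencv_c2w)
```

So despite the `opencv_c2w` variable name, what the script flattens into `camera` is the inverse, i.e. an OpenCV-style w2c.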

  2. How can I verify that the camera parameters are correctly aligned with the codebase? My approach was to try different transformations of the camera parameters provided by DNA-Rendering, such as flipping the y and z axes, converting c2w to w2c, and combinations of these. However, none of them worked. Here are some results on the DNA-Rendering dataset (using the camera parameters from DNA-Rendering, with no axis flipping, converted to w2c):
    [three result images]

Left is the ground truth, middle is the GSplat output, right is the refined image.

And here are the visualizations of the epipolar lines:
[three epipolar-line visualizations: 0008_01_new_00, 0008_01_new_01, 0008_01_new_02]

Could you help me check whether these results are normal? Am I using the right camera conventions? If not, any idea what could be wrong? Thanks!

@donydchen
Owner

Hi, @YeZhang0621, thanks for your appreciation.

Sorry for the confusion caused. The reason we invert it is that the dataloader assumes a w2c matrix is stored in the torch file (mainly because we directly modified the dataloader from re10k), as shown at

return w2c.inverse(), intrinsics

In other words, the torch file stores w2c, while the code works with c2w.
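A minimal numpy sketch of that round trip (the pose is made up, and `np.linalg.inv` stands in for the `w2c.inverse()` call in the dataloader):

```python
import numpy as np

# A made-up OpenCV-style w2c, as stored in the torch file.
w2c = np.eye(4)
w2c[:3, 3] = [0.1, 0.0, -2.0]

# The re10k-style dataloader inverts it, so the rest of the code sees c2w.
c2w = np.linalg.inv(w2c)

# Sanity check: the translation column of c2w is the camera center in world
# coordinates, which equals -R^T t from the w2c decomposition.
R, t = w2c[:3, :3], w2c[:3, 3]
assert np.allclose(c2w[:3, 3], -R.T @ t)
```

A quick sanity test for your own data: if the translation column of the c2w matrices doesn't trace out plausible camera positions (e.g. a ring around the subject for DNA-Rendering), the stored convention is probably wrong.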

Ideally, the epipolar line in the target image should pass through the same point as in the source view. The problem is that your example has only a small overlap region, making it difficult to find points visible in both views. Try debugging with the nearest views, which have a larger overlap. Cheers.
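One way to check the convention independently of the model is to test the epipolar constraint directly with two synthetic cameras. A minimal sketch, with made-up intrinsics and a small baseline, assuming OpenCV-style w2c extrinsics:

```python
import numpy as np

def skew(v):
    """Cross-product matrix, so skew(a) @ b == np.cross(a, b)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

# Made-up shared intrinsics.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])

# Two w2c extrinsics: identity, and a small baseline along x.
R1, t1 = np.eye(3), np.zeros(3)
R2, t2 = np.eye(3), np.array([-0.3, 0.0, 0.0])

# Relative pose from view 1 to view 2, then the fundamental matrix.
R = R2 @ R1.T
t = t2 - R @ t1
F = np.linalg.inv(K).T @ skew(t) @ R @ np.linalg.inv(K)

# Project a world point into both views (homogeneous pixel coordinates).
X = np.array([0.2, -0.1, 3.0])
x1 = K @ (R1 @ X + t1); x1 /= x1[2]
x2 = K @ (R2 @ X + t2); x2 /= x2[2]

# If the conventions are right, the epipolar constraint holds for every
# correspondence: x2^T F x1 == 0 (up to floating-point error).
assert abs(x2 @ F @ x1) < 1e-8
```

With real DNA-Rendering cameras and a few hand-picked correspondences, a residual far from zero tells you the extrinsics are in the wrong convention before they even reach the model.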
