
Question about Camera Conventions #17

Open
YeZhang0621 opened this issue Jan 6, 2025 · 1 comment
Comments

@YeZhang0621

Hi! Thanks for the great work!

Recently, I've been trying to run MVSplat360 on the DNA-Rendering dataset, and I have some questions:

  1. Should the camera extrinsics stored in the preprocessed torch files be OpenCV-style world-to-camera (w2c) or OpenCV-style camera-to-world (c2w)? The project README says camera-to-world, but in convert_dl3dv.py I saw this code:
for frame in meta_data["frames"]:
        timestamps.append(
            int(os.path.basename(frame["file_path"]).split(".")[0].split("_")[-1])
        )
        camera = [saved_fx, saved_fy, saved_cx, saved_cy, 0.0, 0.0]
        # transform_matrix is in blender c2w, while we need to store opencv w2c matrix here
        opencv_c2w = np.array(frame["transform_matrix"]) @ blender2opencv
        opencv_c2ws.append(opencv_c2w)
        camera.extend(np.linalg.inv(opencv_c2w)[:3].flatten().tolist())
        cameras.append(np.array(camera))

The code inverts the c2w matrix and stores the result, a w2c matrix, in the torch file, which confused me.
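If it helps to make the convention chain concrete, here is a minimal numpy sketch of the two steps in that snippet, using a made-up pose (none of these values come from an actual dataset):

```python
import numpy as np

# Blender/OpenGL -> OpenCV axis change: flip y (down) and z (forward).
blender2opencv = np.diag([1.0, -1.0, -1.0, 1.0])

# A made-up Blender-style c2w pose: a rotation about x plus a translation.
theta = np.deg2rad(30.0)
c2w_blender = np.eye(4)
c2w_blender[:3, :3] = np.array([
    [1.0, 0.0, 0.0],
    [0.0, np.cos(theta), -np.sin(theta)],
    [0.0, np.sin(theta), np.cos(theta)],
])
c2w_blender[:3, 3] = [0.5, -0.2, 2.0]

# Mirror the two steps in convert_dl3dv.py:
opencv_c2w = c2w_blender @ blender2opencv   # c2w in OpenCV axes
opencv_w2c = np.linalg.inv(opencv_c2w)      # w2c: what gets flattened and stored

# Inverting the stored matrix recovers the OpenCV c2w exactly.
assert np.allclose(np.linalg.inv(opencv_w2c), opencv_c2w)
```

So despite the `opencv_c2w` variable name, what the script flattens into `camera` is the inverse, i.e. an OpenCV-style w2c.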

  2. How can I verify that the camera parameters are correctly aligned with the codebase? My approach was to try different transformations of the camera parameters provided by DNA-Rendering, such as flipping the y and z axes, converting c2w to w2c, and combinations of these. However, none of them worked. Here are some results on the DNA-Rendering dataset (using the camera parameters from DNA-Rendering, with no axis flipping, converted to w2c):
    [three result images]

Left is the ground truth, middle is the GSplat output, right is the refined image.

And here are the visualizations of the epipolar lines:
[three epipolar-line visualizations: 0008_01_new_00, 0008_01_new_01, 0008_01_new_02]

Could you help me check whether these results are normal? Am I using the right camera conventions? If not, any idea what could be wrong? Thanks!

@donydchen
Owner

Hi, @YeZhang0621, thanks for your appreciation.

Sorry for the confusion caused. The reason we invert it is that the dataloader assumes a w2c matrix is stored in the torch file (mainly because we directly modified the dataloader from re10k), as shown at

return w2c.inverse(), intrinsics

In other words, the torch file stores w2c, while the code works with c2w.
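A minimal numpy sketch of that round trip (the pose is made up, and `np.linalg.inv` stands in for the `w2c.inverse()` call in the dataloader):

```python
import numpy as np

# A made-up OpenCV-style w2c, as stored in the torch file.
w2c = np.eye(4)
w2c[:3, 3] = [0.1, 0.0, -2.0]

# The re10k-style dataloader inverts it, so the rest of the code sees c2w.
c2w = np.linalg.inv(w2c)

# Sanity check: the translation column of c2w is the camera center in world
# coordinates, which equals -R^T t from the w2c decomposition.
R, t = w2c[:3, :3], w2c[:3, 3]
assert np.allclose(c2w[:3, 3], -R.T @ t)
```

A quick sanity test for your own data: if the translation column of the c2w matrices doesn't trace out plausible camera positions (e.g. a ring around the subject for DNA-Rendering), the stored convention is probably wrong.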

Ideally, the epipolar line in the target image should pass through the same point as in the source view. The problem is that your example has only a small overlap region, making it difficult to find points visible in both views. Try debugging with the nearest views, which have a larger overlap. Cheers.
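One way to check the convention independently of the model is to test the epipolar constraint directly with two synthetic cameras. A minimal sketch, with made-up intrinsics and a small baseline, assuming OpenCV-style w2c extrinsics:

```python
import numpy as np

def skew(v):
    """Cross-product matrix, so skew(a) @ b == np.cross(a, b)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

# Made-up shared intrinsics.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])

# Two w2c extrinsics: identity, and a small baseline along x.
R1, t1 = np.eye(3), np.zeros(3)
R2, t2 = np.eye(3), np.array([-0.3, 0.0, 0.0])

# Relative pose from view 1 to view 2, then the fundamental matrix.
R = R2 @ R1.T
t = t2 - R @ t1
F = np.linalg.inv(K).T @ skew(t) @ R @ np.linalg.inv(K)

# Project a world point into both views (homogeneous pixel coordinates).
X = np.array([0.2, -0.1, 3.0])
x1 = K @ (R1 @ X + t1); x1 /= x1[2]
x2 = K @ (R2 @ X + t2); x2 /= x2[2]

# If the conventions are right, the epipolar constraint holds for every
# correspondence: x2^T F x1 == 0 (up to floating-point error).
assert abs(x2 @ F @ x1) < 1e-8
```

With real DNA-Rendering cameras and a few hand-picked correspondences, a residual far from zero tells you the extrinsics are in the wrong convention before they even reach the model.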
