Issue with reprojection when compared with DUST3R #52

Open
finlay96 opened this issue Sep 30, 2024 · 0 comments

Hi there,

For a downstream task I need to reproject images taken from multiple cameras into another camera, and your work is perfect for that! I have had success doing this with DUST3R; however, when using MAST3R there seem to be pixel-shifting problems.

Using this as a sanity check:

import numpy as np
import matplotlib.pyplot as plt

cam = 1
world2cam = np.linalg.inv(cams2world[cam])  # invert the cam-to-world pose
pts = point_cloud_list[cam]  # (H*W) x 3 world-space points for this camera

# Transform the point cloud into the camera coordinate system
pts_homo = np.hstack([pts, np.ones((pts.shape[0], 1))])  # (H*W) x 4 homogeneous
pts_cam = world2cam @ pts_homo.T  # 4 x N

# Pinhole projection with the estimated focal length, principal point taken as (0, 0)
x = pts_cam[0, :] * focals[cam] / pts_cam[2, :]
y = pts_cam[1, :] * focals[cam] / pts_cam[2, :]

fig = plt.figure()
ax = fig.add_subplot()
ax.plot(x, y)
ax.axis('equal')
plt.show()
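
For clarity, the same check can be written as a small helper with an explicit principal-point term (the helper name and the cx/cy parameters are just illustrative; I am assuming a principal point at (0, 0), i.e. the image centre, which is how I read the plots below):

def project_to_camera(pts_world, cam2world, focal, cx=0.0, cy=0.0):
    """Pinhole projection of world-space points into one camera.

    cx, cy is the principal point; (0, 0) corresponds to the image centre
    in the convention used for the plots below (an assumption).
    """
    world2cam = np.linalg.inv(cam2world)
    pts_homo = np.hstack([pts_world, np.ones((pts_world.shape[0], 1))])
    pts_cam = world2cam @ pts_homo.T  # 4 x N
    x = focal * pts_cam[0, :] / pts_cam[2, :] + cx
    y = focal * pts_cam[1, :] / pts_cam[2, :] + cy
    return x, y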

The plots for DUST3R look like this, with (0, 0) being central:
[Screenshot from 2024-09-30 12-41-36: DUST3R reprojection plot, centred on the origin]

However, for MAST3R we get this, where there is a large shift along both axes, implying a possible error in the camera poses:
[Screenshot from 2024-09-30 12-42-03: MAST3R reprojection plot, noticeably offset from the origin]
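
A rough way to quantify the offset is the centroid of the projected points (rough, since it assumes the point cloud covers the image fairly evenly):

# Rough measure of the shift: centroid of the projected points.
# With a centred principal point this should be near the origin for a
# point cloud that covers the image fairly evenly (an assumption).
print("mean projected x:", x.mean())
print("mean projected y:", y.mean())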

I was wondering if it is known what the reason for this could be, and whether a fix is possible, as I would love to migrate from DUST3R over to MAST3R.

MAST3R Version: Commit hash b1b5578

Files used:
data.zip

Many thanks,
Finlay

Below is the code used to generate the files, as well as the code to load them:

import numpy as np

# USE_DUST3R toggles between the two sets of saved outputs
prefix = "DUST3R" if USE_DUST3R else "MAST3R"
focals = np.load(f"data/{prefix}/focals.npy")
cams2world = np.load(f"data/{prefix}/cams2world.npy")
point_cloud_list = np.load(f"data/{prefix}/points_cloud_list.npy")
rgbimg = np.load(f"data/{prefix}/rgbimg.npy")

if USE_DUST3R:
    # DUST3R saves focals as (N, 1) and points as H x W x 3 per camera,
    # so flatten them to match the MAST3R outputs
    focals = focals[:, 0]
    point_cloud_list = [p.reshape(-1, 3) for p in point_cloud_list]
DUST3R:
from dust3r.model import AsymmetricCroCo3DStereo
from dust3r.image_pairs import make_pairs
from dust3r.inference import inference
from dust3r.cloud_opt import global_aligner, GlobalAlignerMode

weights_path = "dust3r/DUSt3R_ViTLarge_BaseDecoder_512_dpt.pth"
model = AsymmetricCroCo3DStereo.from_pretrained(weights_path).to(device)
schedule = 'cosine'
lr = 0.01
niter = 300
min_conf_thr = 8.0  # default is 3.0

pairs = make_pairs(images, scene_graph='complete', prefilter=None, symmetrize=True)
output = inference(pairs, model, device, batch_size=1)
scene = global_aligner(output, device=device, mode=GlobalAlignerMode.PointCloudOptimizer,
                       min_conf_thr=min_conf_thr)
loss = scene.compute_global_alignment(init="mst", niter=niter, schedule=schedule, lr=lr)

imgs = np.stack(scene.imgs)
focals = scene.get_focals().detach().cpu().numpy()
poses = scene.get_im_poses().detach().cpu().numpy()
pts3d = [pt.detach().cpu().numpy() for pt in scene.get_pts3d()]
conf_masks = [conf.detach().cpu().numpy() for conf in scene.get_masks()]
MAST3R:
from mast3r.model import AsymmetricMASt3R
from mast3r.cloud_opt.sparse_ga import sparse_global_alignment
from mast3r.cloud_opt.tsdf_optimizer import TSDFPostProcess
from dust3r.image_pairs import make_pairs
from dust3r.utils.device import to_numpy

optim_level = "refine"
lr1, lr2 = 0.07, 0.014
niter1, niter2 = 500, 200
min_conf_thr = 1.5
matching_conf_thr = 5.0
clean_depth = True
scenegraph_type = 'complete'
winsize = 1
win_cyclic = False
refid = 0
TSDF_thresh = 0.0
shared_intrinsics = False
if optim_level == 'coarse':
    niter2 = 0

weights_path = "MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth"
model = AsymmetricMASt3R.from_pretrained(weights_path).to(device)

scene_graph_params = [scenegraph_type]
if scenegraph_type in ["swin", "logwin"]:
    scene_graph_params.append(str(winsize))
elif scenegraph_type == "oneref":
    scene_graph_params.append(str(refid))
if scenegraph_type in ["swin", "logwin"] and not win_cyclic:
    scene_graph_params.append('noncyclic')
scene_graph = '-'.join(scene_graph_params)
pairs = make_pairs(images, scene_graph=scene_graph, prefilter=None, symmetrize=True)

scene = sparse_global_alignment(filelist, pairs, "/tmp/mast3r",
                                model, lr1=lr1, niter1=niter1, lr2=lr2, niter2=niter2, device=device,
                                opt_depth='depth' in optim_level, shared_intrinsics=shared_intrinsics,
                                matching_conf_thr=matching_conf_thr)

# 3D point cloud from depthmaps, poses and intrinsics
if TSDF_thresh > 0:
    tsdf = TSDFPostProcess(scene, TSDF_thresh=TSDF_thresh)
    pts3d, _, confs = to_numpy(tsdf.get_dense_pts3d(clean_depth=clean_depth))
else:
    pts3d, _, confs = to_numpy(scene.get_dense_pts3d(clean_depth=clean_depth))
# Confidence masks, computed for either branch above
conf_masks = to_numpy([c > min_conf_thr for c in confs])

focals = scene.get_focals().cpu().numpy()
poses = scene.get_im_poses().cpu().numpy()
imgs = scene.imgs
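
The .npy files in data.zip were produced from the arrays above; the exact saving code is not reproduced here, but it was presumably a set of np.save calls matching the loading snippet, with the prefix being "DUST3R" or "MAST3R" depending on which pipeline was run:

# Sketch of how the outputs were written to disk (assumption: plain np.save
# calls matching the filenames used in the loading snippet above)
np.save(f"data/{prefix}/focals.npy", focals)
np.save(f"data/{prefix}/cams2world.npy", poses)
np.save(f"data/{prefix}/points_cloud_list.npy", np.asarray(pts3d))
np.save(f"data/{prefix}/rgbimg.npy", np.asarray(imgs))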