
Alignment of image and depth map #7

Open

kirkscheper opened this issue Mar 4, 2020 · 9 comments

Comments

@kirkscheper

This is more of a question than an issue.

I was taking a look at the ground truth optical flow but noticed that it doesn't quite line up with the images (or event frames) from the DAVIS.

I tried the h5py ground truth datasets, the precomputed ground truth in the npz files, and ground truth computed using the script in this repo, all with no luck. I also tried rectifying the images using the calibration data from the yaml files.

The image below is an overlay of the ground truth from the h5py file on an undistorted image.

[image: raw_overlay]

This one shows the overlay of an undistorted image with the ground truth computed by this repo.

[image: undistorted_rectified_overlay]

I noticed that the dist_rect images are simply the dist_raw images passed through the same calibration pipeline as the left image, which would suggest that dist_raw should be from the same viewpoint as the left image, but they do not line up.

It seems I am just missing something. Could you explain (and note in the documentation) how I can align these frames correctly? Any help is greatly appreciated.
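
(For reference, a minimal sketch of one way to produce such an overlay with OpenCV; the array names, dtypes, and blending weight are illustrative, not the code used here:)

```python
import cv2
import numpy as np

def overlay_flow(gray, flow, alpha=0.5):
    """Blend an HSV visualization of a flow field over a grayscale frame.

    gray: HxW uint8 image, flow: HxWx2 float32 flow field (assumed aligned).
    """
    fx = np.ascontiguousarray(flow[..., 0])
    fy = np.ascontiguousarray(flow[..., 1])
    mag, ang = cv2.cartToPolar(fx, fy)
    hsv = np.zeros((*gray.shape, 3), dtype=np.uint8)
    hsv[..., 0] = (ang * 90 / np.pi).astype(np.uint8)  # hue = flow direction
    hsv[..., 1] = 255
    hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    flow_bgr = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)
    return cv2.addWeighted(cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR), 1 - alpha,
                           flow_bgr, alpha, 0)
```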

@alexzzhu
Contributor

alexzzhu commented Mar 5, 2020

Thanks for the question! I'm not sure if I completely understand the exact terms here, so I'd like to clarify the terms I'm familiar with. There is an undistorted image, which is generated by removing the lens distortion from the distorted image so that the pinhole projection equation holds. There is then the rectified image, which applies an additional 3D rotation and potential scaling to the image, such that the horizontal stereo assumption holds.
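
(A minimal sketch of that distinction with OpenCV's fisheye module; the calibration numbers below are placeholders, with the real K, D, and rectifying rotation coming from the dataset's yaml files:)

```python
import cv2
import numpy as np

# Placeholder calibration; the real values come from the dataset yaml files.
K = np.array([[226.0, 0.0, 173.0],
              [0.0, 226.0, 130.0],
              [0.0, 0.0, 1.0]])
D = np.array([-0.05, 0.01, 0.0, 0.0])  # fisheye (equidistant) coefficients
size = (346, 260)                      # DAVIS 346 resolution (width, height)

img = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)

# Undistortion only: R = identity, so the viewpoint is unchanged and only
# the lens distortion is removed.
m1, m2 = cv2.fisheye.initUndistortRectifyMap(K, D, np.eye(3), K, size, cv2.CV_32FC1)
undistorted = cv2.remap(img, m1, m2, interpolation=cv2.INTER_LINEAR)

# Rectification: R1 is the stereo rotation (e.g. from cv2.fisheye.stereoRectify),
# which additionally rotates the view so the horizontal stereo assumption holds.
R1 = np.eye(3)  # stand-in; use the rotation from the stereo calibration
r1, r2 = cv2.fisheye.initUndistortRectifyMap(K, D, R1, K, size, cv2.CV_32FC1)
rectified = cv2.remap(img, r1, r2, interpolation=cv2.INTER_LINEAR)
```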

For the ground truth, I believe that the flow was computed for the rectified images, although it has been a while and I might be mistaken. In the images you are showing, are these the rectified images, or the undistorted ones?

@kirkscheper
Author

Apologies, I just noticed a typo in my original comment.

The first image is an overlay of the raw image and the precomputed ground truth provided in the hdf5 file. That ground truth appears to have been made with an older version of the code in this repo, as it seems to be distorted (straight lines are not straight); the previous version of the code doesn't appear to perform any rectification (please correct me if I am wrong). Here the raw depth map from the rosbag and the optical flow line up exactly, so there appears to be no viewpoint change during the ground truth generation.

The second image is an overlay of an image undistorted and rectified using the calibration parameters in the dataset (and the cv2.fisheye functions) and the ground truth computed with the current master of this repo, which appears to use the pre-rectified depth maps from the rosbag and the projection matrix of the left camera.

The depth maps in the rosbag look like they were generated from a slightly different viewpoint than the left camera; it looks like I am just missing a small correction.

For reference, I forked the repo and added a branch which shows what I am doing: https://github.com/kirkscheper/mvsec/blob/4b7199d88c9110e0d0353a889138e6b5e854d0ba/tools/gt_flow/compute_flow.py#L356
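
(For context, a rough sketch of the motion-field computation such a pipeline performs, turning per-pixel depth plus camera velocity into flow; this uses the classic motion-field equations under a static-scene assumption and is not necessarily the repo's exact implementation:)

```python
import numpy as np

def flow_from_depth(depth, K, v, w, dt):
    """Flow induced by camera linear velocity v and angular velocity w over dt,
    given per-pixel depth (static scene assumed).

    depth: HxW array, K: 3x3 intrinsics, v, w: length-3 arrays in the camera frame.
    """
    H, W = depth.shape
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    us, vs = np.meshgrid(np.arange(W), np.arange(H))
    x = (us - cx) / fx  # normalized image coordinates
    y = (vs - cy) / fy
    # Classic motion-field equations (Longuet-Higgins), normalized coordinates:
    xdot = (-v[0] + x * v[2]) / depth + x * y * w[0] - (1 + x**2) * w[1] + y * w[2]
    ydot = (-v[1] + y * v[2]) / depth + (1 + y**2) * w[0] - x * y * w[1] - x * w[2]
    # Back to pixel units, integrated over dt:
    return np.stack([xdot * fx, ydot * fy], axis=-1) * dt
```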

@alexzzhu
Contributor

Hmm, could you share the separate ground truth depth and grayscale images? Also, does this occur throughout the entire bag? It might also just be that there was some smearing in the global map for this time instance.

@kirkscheper
Author

Sorry for the delayed response.

As far as I can see, this offset is persistent throughout the datasets. It seems to depend on the depth and on the position in the scene, which is why I think it is some kind of viewpoint issue (the projected viewpoint of the depth map/ground truth flow is not the same as the left camera's) or a calibration issue (or I am just making an error somewhere along the way).

The documentation states that the lidar frames should be from the viewpoint of the left image, but does the T_cam0_lidar from the camchain-imucam need to be applied somehow?
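
(If it does, a minimal sketch of how that transform would be applied, assuming T_cam0_lidar is the 4x4 homogeneous lidar-to-cam0 transform from the camchain-imucam yaml; whether this is actually needed is exactly the open question:)

```python
import numpy as np

T_cam0_lidar = np.eye(4)  # placeholder; load the real 4x4 matrix from the yaml

def lidar_to_cam0(points_lidar):
    """Transform an Nx3 array of lidar-frame points into the left camera frame."""
    pts_h = np.hstack([points_lidar, np.ones((points_lidar.shape[0], 1))])
    return (T_cam0_lidar @ pts_h.T).T[:, :3]
```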

See the attachment for some images of the ground truth optical flow, raw images, and rectified images for sequence number 1 of the indoor scene.

imgs_sample.zip

As you can imagine, since the DVS only perceives contrast changes, the optical flow can only be accurately estimated at locations with significant contrast, i.e. edges, so having a correctly aligned ground truth is very important.

Thanks in advance for any help you can give me.

@alexzzhu
Contributor

It's expected if you see that the depth map extends beyond the events (e.g. the objects appear fatter in the depth map). This is because of the way we generate the local map, which may be liable to have errors in the pose. However, this extension is typically on the order of only a few pixels, and there are usually no events right beyond the boundaries of each object. Is this what you're seeing? It might be easier to visualize if you could generate a video of this to see if the effect is consistent over time.
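
(A minimal sketch of writing such a visualization out as a video with OpenCV; the frame generator and codec are stand-ins:)

```python
import cv2
import numpy as np

# Stand-in frames; replace with the depth/image overlays described above.
overlay_frames = (np.zeros((260, 346, 3), dtype=np.uint8) for _ in range(90))

writer = cv2.VideoWriter("overlay.avi", cv2.VideoWriter_fourcc(*"MJPG"),
                         30.0, (346, 260))  # size is (width, height)
for frame in overlay_frames:
    writer.write(frame)
writer.release()
```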

@kirkscheper
Author

You can find a video of the flight here: https://youtu.be/73iEJfZGGmw

@kirkscheper
Author

And here's one with the overlay (to make it easier to see): https://youtu.be/dRWscigkEGg

@alexzzhu
Contributor

alexzzhu commented Apr 3, 2020

This seems to be expected, unfortunately. As we accumulate depth points over multiple frames, errors are introduced from the odometry. The typical effect is that objects in the depth/flow appear inflated compared to their original versions in the image/event space. Computing errors only over points with events should alleviate these issues somewhat, as the points immediately beyond an object usually do not contain events.
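
(A minimal sketch of that event-masked evaluation, assuming an event count image aligned with the flow arrays; all names are illustrative:)

```python
import numpy as np

def masked_flow_error(pred_flow, gt_flow, event_count_img):
    """Average endpoint error, computed only at pixels that contain events."""
    mask = event_count_img > 0  # pixels where at least one event fired
    epe = np.linalg.norm(pred_flow - gt_flow, axis=-1)  # per-pixel endpoint error
    return epe[mask].mean()
```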

@JiahangWu


Hi @kirkscheper, when I tried to overlay the events and the depth map, I also found that they are not aligned. Did you solve this problem?
