Feature request: support rendering of color and N-D features simultaneously #529

Open
SuperXia123 opened this issue Jan 6, 2025 · 10 comments

SuperXia123 commented Jan 6, 2025

Support a user-defined feature map as an optional input, which could be used to flexibly encode any desired feature, such as semantics, normals, etc.

The input feature map is a torch tensor of shape [N, C], where N is the number of Gaussians and C is the number of channels, which depends on the user's desired features. The output feature map is [C, H, W], containing the alpha-blended Gaussian features.

The users then decode the output feature map according to their encoding strategy, while the rasterization engine does not care about the meaning of the feature map.
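
For illustration, a minimal self-contained sketch of what this alpha blending would mean for a single pixel (all names below are illustrative; this is not gsplat code):

import torch

# Illustrative only: per-Gaussian features of shape [N, C] are composited
# with the same front-to-back weights used for color blending.
N, C = 5, 16                             # Gaussians covering one pixel, feature channels
features = torch.randn(N, C)             # user-defined per-Gaussian features
alphas = torch.rand(N).clamp(0.0, 0.99)  # effective alpha of each Gaussian at this pixel

# w_i = alpha_i * prod_{j < i} (1 - alpha_j)
transmittance = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alphas[:-1]]), dim=0)
weights = alphas * transmittance         # [N]

pixel_feature = (weights[:, None] * features).sum(dim=0)  # [C], decoded by the user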

A similar implementation of this feature could be found here.

maturk (Collaborator) commented Jan 6, 2025

Rasterizing N-dimensional features is already supported by the gsplat backend; see here.
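
For reference, a minimal sketch of rendering N-D features through the existing rasterization interface by passing them as colors with sh_degree=None (the dummy tensors below are placeholders, not code from this issue):

import torch
from gsplat import rasterization

N, F, W, H = 10_000, 16, 640, 480
device = "cuda"

means = torch.randn(N, 3, device=device) + torch.tensor([0.0, 0.0, 5.0], device=device)
quats = torch.nn.functional.normalize(torch.randn(N, 4, device=device), dim=-1)
scales = torch.rand(N, 3, device=device) * 0.05
opacities = torch.rand(N, device=device)
features = torch.randn(N, F, device=device)         # arbitrary per-Gaussian features
viewmats = torch.eye(4, device=device)[None]        # [1, 4, 4] world-to-camera
Ks = torch.tensor([[300.0, 0.0, W / 2],
                   [0.0, 300.0, H / 2],
                   [0.0, 0.0, 1.0]], device=device)[None]  # [1, 3, 3]

# An [N, F] tensor passed as `colors` with sh_degree=None is rasterized
# directly, without spherical-harmonics evaluation.
renders, alphas, meta = rasterization(
    means=means, quats=quats, scales=scales, opacities=opacities,
    colors=features, viewmats=viewmats, Ks=Ks, width=W, height=H,
    sh_degree=None,
)
# renders: [1, H, W, F]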

SuperXia123 (Author) commented

@maturk Thanks for your response!
If I understand correctly, one cannot render colors and self-defined feature channels simultaneously in a single function call.
If that is true, should I run the rasterization function twice if I want both colors and features to be rasterized?

maturk (Collaborator) commented Jan 6, 2025

Correct, simultaneous rendering in a single forward-pass call is not supported, apart from RGB+depth rendering.

I met a similar problem in my dn-splatter project, where I wanted to render RGB+D and normals with a single forward pass. It's easier to do it twice, unless performance is critical, but that approach is not very elegant.

Happy to discuss this problem further, since it might be a useful thing to support better.
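
For illustration, a minimal sketch of the two-pass workaround described above, written as a helper function; the argument names follow gsplat's rasterization interface as used later in this thread, and everything else is a placeholder:

from gsplat import rasterization


def render_rgb_then_features(means, quats, scales, opacities,
                             sh_coeffs, sh_degree, extra_features,
                             viewmats, Ks, width, height):
    """Run rasterization twice: RGB(+depth) from SHs, then raw N-D features."""
    # Pass 1: RGB plus expected depth, evaluated from spherical harmonics.
    rgbd, _, _ = rasterization(
        means=means, quats=quats, scales=scales, opacities=opacities,
        colors=sh_coeffs,                  # [N, K, 3] SH coefficients
        viewmats=viewmats, Ks=Ks, width=width, height=height,
        sh_degree=sh_degree, render_mode="RGB+ED",
    )
    # Pass 2: user-defined features rasterized directly as "colors".
    feats, _, _ = rasterization(
        means=means, quats=quats, scales=scales, opacities=opacities,
        colors=extra_features,             # [N, F]
        viewmats=viewmats, Ks=Ks, width=width, height=height,
        sh_degree=None,
    )
    return rgbd, feats  # [1, H, W, 4] and [1, H, W, F]


# Both passes project the same Gaussians, so this roughly doubles the cost
# compared to a single concatenated pass.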

SuperXia123 changed the title from "Feature request: support rendering of arbitrary feature map" to "Feature request: support rendering of color and N-D features simultaneously" on Jan 7, 2025
yzslab (Contributor) commented Jan 7, 2025

But I think it is possible.

If you are using rasterization:

  1. Manually call spherical_harmonics to convert your SHs to RGB
  2. Concatenate your RGB and features into a new tensor of shape [N, 3+F], where F is the number of feature dimensions
  3. Call rasterization with the concatenated tensor and sh_degree=None

Example:

import torch
from gsplat import rasterization, spherical_harmonics

camera = ...

# view direction from the camera center to each Gaussian center
dirs = model.get_means() - camera.camera_center
colors = spherical_harmonics(model.active_sh_degree, dirs, model.get_shs())
colors = torch.clamp_min(colors + 0.5, 0.0)

extra_features = ...

# [N, 3+F]: RGB in the first three channels, user-defined features after
input_features = torch.concat([colors, extra_features], dim=-1)

rendered_colors, rendered_alphas, meta = rasterization(
    ...
    colors=input_features,
    sh_degree=None,
    ...
)

rendered_rgbs = rendered_colors[..., :3]
rendered_features = rendered_colors[..., 3:]

Another option is to re-implement your own rasterization pipeline using fully_fused_projection and rasterize_to_pixels, instead of relying on the provided rasterization interface.

SuperXia123 (Author) commented Jan 10, 2025

@yzslab I tried this strategy and it worked. Thanks a lot!

RayYoh commented Jan 17, 2025

Hi @SuperXia123, may I ask if this strategy works well for you?
In my case, I find it difficult to converge.

SuperXia123 (Author) commented

@RayYoh Yes, it works perfectly in my case. My code is as follows (using the original Inria 3DGS naming conventions):

import torch
from gsplat import rasterization, spherical_harmonics
# GaussianModel follows the original Inria 3DGS implementation


def rasterization_gsplat(
        gaussians: GaussianModel,
        cam_intrinsic: torch.Tensor, cam_extrinsic: torch.Tensor,
        cam_center: torch.Tensor,
        img_width: int, img_height: int,
        bg_color: torch.Tensor,
        scaling_modifier=1.0, override_color=None):
   
    means3D = gaussians.get_xyz
    opacity = gaussians.get_opacity
    rotations = gaussians.get_rotation
    scales = gaussians.get_scaling * scaling_modifier

    # simultaneously render colors and user-defined features, see discussion:
    # https://github.com/nerfstudio-project/gsplat/issues/529
    if override_color is not None:
        # manually call the spherical_harmonics to convert SHs to RGB
        colors = spherical_harmonics(
            gaussians.active_sh_degree,
            dirs=means3D - cam_center, coeffs=gaussians.get_features)
        colors = torch.clamp_min(colors + 0.5, 0.0)  # [N, D]
        # add extra features here
        # colors = torch.concat([colors, extra_features], dim=-1)
        sh_degree = None
    else:
        colors = gaussians.get_features  # [N, K, 3]
        sh_degree = gaussians.active_sh_degree

    rendered_images, rendered_alphas, info = rasterization(
        means=means3D,  # [N, 3]
        quats=rotations,  # [N, 4]
        scales=scales,  # [N, 3]
        opacities=opacity.squeeze(-1),  # [N,]
        colors=colors,
        viewmats=cam_extrinsic[None],  # [1, 4, 4]
        Ks=cam_intrinsic[None],  # [1, 3, 3]
        backgrounds=bg_color[None],
        render_mode="RGB+ED",
        width=int(img_width),
        height=int(img_height),
        packed=False,
        sh_degree=sh_degree,
    )

    # [P, H, W, C] -> [C, H, W]; P is the number of image planes (cameras) and
    #     equals 1 for single-image rendering.
    rendered_image = rendered_images[0].permute(2, 0, 1)
    rendered_rgb = rendered_image[:3, :, :]
    rendered_depth = rendered_image[3:4, :, :]
    rendered_alpha = rendered_alphas[0].permute(2, 0, 1)
    radii = info["radii"].squeeze(0)  # [N,]

    try:
        info["means2d"].retain_grad()  # [1, N, 2]
    except Exception:
        pass

    return {"render": rendered_rgb,
            "viewspace_points": info["means2d"],
            "visibility_filter": radii > 0,
            "radii": radii,
            "depth": rendered_depth,
            "opacity": rendered_alpha,
            }
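
For completeness, a hypothetical call of the function above; gaussians, K, w2c, cam_center, and background are placeholders following the Inria 3DGS conventions, not values from this issue:

outputs = rasterization_gsplat(
    gaussians=gaussians,
    cam_intrinsic=K,            # [3, 3]
    cam_extrinsic=w2c,          # [4, 4] world-to-camera
    cam_center=cam_center,      # [3]
    img_width=1600, img_height=1066,
    bg_color=background,        # must match the number of rendered color channels
    override_color=True,        # any non-None value takes the SH-to-RGB (+features) branch
)
rgb = outputs["render"]         # [3, H, W]
depth = outputs["depth"]        # [1, H, W]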

RayYoh commented Jan 20, 2025

Hi @SuperXia123, thanks for your kind reply. I have also made it work well; the problem was caused by AMP training, i.e., fp16. I noticed that you also render the depth map; may I ask whether you used the GT depth map to supervise it?
In my case, directly using the GT depth for supervision makes the loss fail to decrease; normalizing the depth to 0-1 works instead.

SuperXia123 (Author) commented Jan 22, 2025

@RayYoh I used the inverse depth map (which is also in the range 0-1) to supervise geometry, and it converges.

I think using a depth loss directly may be very sensitive to noise, since even a slight inconsistency can introduce an extremely large loss.
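
For illustration, a minimal sketch of the inverse-depth supervision described above, assuming a rendered depth map and a GT depth map in the same units; the eps value and the validity mask are assumptions:

import torch


def inverse_depth_loss(rendered_depth: torch.Tensor,
                       gt_depth: torch.Tensor,
                       eps: float = 1e-6) -> torch.Tensor:
    """L1 loss between inverse depths; both maps are [1, H, W]."""
    valid = gt_depth > eps                      # mask out missing sensor readings
    inv_rendered = 1.0 / (rendered_depth + eps)
    inv_gt = 1.0 / (gt_depth + eps)
    return torch.abs(inv_rendered - inv_gt)[valid].mean()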

RayYoh commented Jan 22, 2025

Hi @SuperXia123, thanks for your kind reply. Is the inverse depth map computed as 1/d from the original depth map? May I ask if there is any codebase that can be referred to for this implementation?
