
Is the gradient of the NCC cost function missing the coefficient of 1/n? #1184

Open · will-zzy opened this issue Dec 6, 2024 · 6 comments

will-zzy commented Dec 6, 2024

Describe the bug
Thank you for your work! I have recently been studying the implementation of surface evolution and hope to port it to CUDA. However, when I derived the gradient of the NCC cost function, I noticed that the dZNCC at libs/MVS/SceneRefine.cpp line 868 is missing a coefficient (although this does not affect the optimization, as it can be absorbed into the optimization step size).

```cpp
const Real dZNCC((Real)imageA(r,c)*invSqrtVAVB - (Real)imageB(r,c)*ZNCCinvVB + imageMeanB(r,c)*ZNCCinvVB - imageMeanA(r,c)*invSqrtVAVB);
```

Nonetheless, from a rigorous perspective, shouldn't dZNCC be multiplied by a factor of 1/n, similar to what is done in line 888?
Thank You!

Screenshots: [image attached]
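The missing factor can be checked numerically. Below is a minimal NumPy sketch (my own, not the OpenMVS code) that computes the ZNCC score and its per-pixel gradient with the 1/n factor included, so it can be compared against finite differences:

```python
import numpy as np

def zncc(A, B):
    """Zero-mean normalized cross-correlation of two equal-size patches."""
    a = A - A.mean()
    b = B - B.mean()
    return (a * b).mean() / np.sqrt((a * a).mean() * (b * b).mean())

def zncc_grad_B(A, B):
    """d ZNCC / d B(r,c) for every pixel; note the leading 1/n factor."""
    n = A.size
    a = A - A.mean()
    b = B - B.mean()
    va = (a * a).mean()
    vb = (b * b).mean()
    z = (a * b).mean() / np.sqrt(va * vb)
    # same bracketed expression as the dZNCC line above, times 1/n
    return (a / np.sqrt(va * vb) - z * b / vb) / n
```

Perturbing a single pixel of B by a small epsilon and comparing the change in zncc(A, B) against the corresponding entry of zncc_grad_B confirms that the analytic gradient only matches when the 1/n factor is present.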

cdcseacave (Owner) commented

Hi @will-zzy, and thank you for spotting this. Indeed, it seems that the division by n is missing. I'm very curious about your work here. Can you elaborate on your goals and the project you are working on? I'm asking because I'm also working on implementing this code in CUDA, and maybe we can join forces.

will-zzy (Author) commented Dec 27, 2024

@cdcseacave
Thank you for your reply! My goal is to combine surface evolution with Gaussian splatting. Since surface evolution based on functional optimization can be easily adapted to Gaussian splatting (if we use mesh geodesics as the metric, the splatting method becomes equivalent to mapping a “Gaussian texture” on the mesh onto the image), this may lead to a method that achieves high-quality geometric reconstruction and real-time realistic rendering.

Therefore, my first goal is to implement the relevant mesh Euler operations on the GPU. It seems there is already a fairly mature approach in graphics, for example: "High-performance simplification of triangular surfaces using a GPU." This paper maintains a GPU version of CGAL's data structures.

My second goal is to incorporate the gradient of the geodesic-based differentiable rendering loss into the gradient flow of the NCC cost. I still need to study the derivations for this. Some references include "Differentiable Rendering using RGBXY Derivatives and Optimal Transport", "Discrete Geodesic Graph (DGG) for Computing Geodesic Distances on Polyhedral Surfaces", and "Geometry Field Splatting with Gaussian Surfels".

cdcseacave (Owner) commented

I like what you are trying to achieve, but I do not see clearly how to connect a mesh to 3DGS. If we represent a mesh as connected triangles, it is very hard and unintuitive to change its topology, which means you would need to introduce the mesh very late in the optimization, when the surface is already very close to the true shape, and find a very accurate way of extracting that surface from 3DGS (to capture the true topology). I think it would be better to use some other way of representing the surface, maybe implicitly like an SDF; however, in that case I do not see clearly how to connect 3DGS to an SDF.

will-zzy (Author) commented Dec 27, 2024

@cdcseacave
In fact, my idea is not about 3D Gaussian ellipsoids. It is more similar to Quadratic Gaussian Splatting (QGS), where 2D Gaussian distributions are defined on surfaces:
[image: QGS_demo_convex]

Imagine that this quadratic paraboloid is made up of triangular meshes. As the triangular meshes change, the 2D Gaussian distributions defined on the surfaces will produce similar effects. Of course, connected triangular faces have higher degrees of freedom and should show more detail.

For example, on the two triangular faces below, a 2D Gaussian ellipse spans across two connected triangles. If the centroid and covariance of the 2D Gaussian distribution are predefined, the Gaussian weight at any point on the triangular mesh can be determined based on the mesh's geodesic distance.
[image: demo_mesh_splatting]
Therefore, this definition method does not require changing the mesh's topology. It is more like 2D Gaussian distributions drifting across the surface formed by the triangular mesh.
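As a toy illustration of this weighting scheme (my own sketch, not code from any of the cited papers), suppose a geodesic solver such as a DGG-style method already provides the 2D coordinates of a surface point after unfolding the mesh along the geodesic into the Gaussian's tangent frame; the weight is then just an ordinary 2D Gaussian evaluation in that frame:

```python
import numpy as np

def geodesic_gaussian_weight(u, mu, Sigma):
    # u:     2D coords of a surface point after unfolding the mesh along
    #        the geodesic into the Gaussian's tangent frame (assumed given)
    # mu:    2D centroid of the Gaussian in that frame
    # Sigma: 2x2 covariance of the Gaussian
    d = np.asarray(u, dtype=float) - np.asarray(mu, dtype=float)
    return float(np.exp(-0.5 * d @ np.linalg.solve(Sigma, d)))
```

Because the distance is measured along the surface rather than in the ambient space, the weight falls off consistently even when the Gaussian spans several faces.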

The original idea of QGS is based on the following theory: for models optimized using photometric consistency errors, we want a representation method where the more realistic the rendering, the higher the quality of geometric reconstruction. If this holds true, we can use various methods to improve rendering quality to enhance geometric reconstruction accuracy.

The contrapositive of this theory is: the further the expressed geometric structure deviates from the real surface, the worse the rendering quality should be. This shows that if a representation method's rendering results are highly sensitive to geometry, then it has a stronger ability to express that "more realistic rendering leads to higher geometric reconstruction quality." Therefore, we are committed to rendering in surfel form. For 3D Gaussian ellipsoids, their strong texture-fitting ability means they can render realistic images even on inaccurate geometry, making it inefficient to optimize the geometric structure using photometric-consistency errors.

On the other hand, the key to Gaussian splatting methods rendering realistic images lies in the volumetric rendering formula, not the shape of the primitives. The volumetric rendering formula depends on the relative positions of the primitives, not their absolute positions, i.e., $C = \sum_i c_i \alpha_i T_i$ with transmittance $T_i = \prod_{j<i} (1 - \alpha_j)$. This means that even if all 2D Gaussian primitives are compressed onto a single plane, realistic images can still be rendered after sufficient optimization.
Additionally, surfel representations can use exact volumetric rendering formulas instead of approximations, as demonstrated in paper "Geometry Field Splatting with Gaussian Surfels".
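The compositing formula depends only on the front-to-back ordering of the primitives along the ray, which a short sketch (my own, per color channel) makes explicit:

```python
def composite(colors, alphas):
    """Front-to-back alpha compositing: C = sum_i c_i * alpha_i * T_i,
    with transmittance T_i = prod_{j<i} (1 - alpha_j).
    colors/alphas are listed in front-to-back order along the ray."""
    C = 0.0
    T = 1.0  # transmittance accumulated so far
    for c, a in zip(colors, alphas):
        C += c * a * T
        T *= 1.0 - a
    return C
```

Note that nothing in the loop uses the primitives' positions, only their order and opacities, which is the point being made above.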

cdcseacave (Owner) commented

Thank you for the details; however, I think I am still missing a link in this chain. I am familiar with the 3DGS space, including QGS, and I'm also pursuing a way to accurately model the surface with 3DGS. 3DGS was a first approach; we can think of 2DGS as a specialization (simplification) of QGS, PGSR as well, etc. But all have in common that they do not model the surface directly, but parametric splats: continuous functions that together form a continuous space. Even QGS uses a TSDF to model the surface after training and extract the mesh.

I still do not understand what the building blocks you propose are, because if they are triangles, I do not see how they are topology agnostic. As soon as you have two connected triangles like above, that is a fixed topology.

will-zzy (Author) commented

@cdcseacave
Thank you for the additional discussion. Indeed, almost all GS methods do not explicitly model a topologically connected triangular mesh. However, in methods like PGSR and MVG-splatting, we can verify that their face elements, once converged, lie on the real surface because the homography-based warp NCC takes effect.

I apologize for not clearly stating the prerequisites. My understanding is that you are concerned about the topology among the primitives, given that existing GS methods typically optimize freely in continuous 3D space, causing most primitives to converge onto or near the actual surface without modeling any topology. During optimization and densification, these primitives do not explicitly interact with each other (e.g., through topology).

In contrast, our primitives differ from most methods. They are not triangular patches or triangles embedded with Gaussians (as in GaMeS), but rather a set of Gaussian distributions defined on a manifold mesh with a fixed topology.

My idea is as follows: we start with an initial mesh $M_0$ (just like in mesh refinement). It need not be highly accurate; it could be derived from a sparse point cloud or from a large model. This mesh has a fixed topology and is assumed to be a manifold surface, without requiring it to be watertight. Topology changes can only occur through mesh refinement operations like decimation or subdivision. (Of course, a sparse point cloud cannot form a complete scene mesh; during optimization, we may add new connected components through mesh growth to gradually model the entire surface.)

We then initialize the centroids $\mu$ and covariances $\Sigma$ of 2D Gaussian distributions on $M_0$. These 2D Gaussians are not bound to a single triangular face, nor is their scale tied to face size (a constraint sometimes seen in editing-based work such as GaMeS or “Mesh-based Gaussian Splatting for Real-time Large-scale Deformation”). Each 2D Gaussian can span multiple triangular faces through geodesic distances.

During optimization, the Gaussian centroids are allowed to move freely across different triangular faces. Because we use geodesic distance as the metric, the Gaussian distributions are completely defined on the manifold mesh. Thus, they do not affect the mesh’s topology but instead rely on the mesh (with its fixed topology at any given time) to fit the surface texture.
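A minimal sketch of how such a centroid might take one optimization step (my own illustration, not from any cited work): store the centroid as barycentric coordinates on its current face and update them; a full implementation would walk across the shared edge into the neighbouring face when a coordinate reaches zero, whereas this toy version simply clamps.

```python
import numpy as np

def step_centroid(bary, grad, lr=0.1):
    # bary: barycentric coords of the Gaussian centroid on its current face
    # grad: gradient of the loss w.r.t. those barycentric coords
    b = np.asarray(bary, dtype=float) - lr * np.asarray(grad, dtype=float)
    b = np.clip(b, 0.0, None)  # toy clamp; a real version crosses the edge here
    return b / b.sum()         # renormalize to stay on the face
```

The mesh connectivity is never modified by such a step; only the Gaussian's position on the (fixed-topology) surface changes.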

Thank you again for your insights. If there is anything imprecise in my idea, please let me know.
