KAIST CS479: Machine Learning for 3D Data
Programming Assignment 3
Instructor: Minhyuk Sung (mhsung [at] kaist.ac.kr)
TA: Seungwoo Yoo (dreamy1534 [at] kaist.ac.kr)
Following the success of Neural Radiance Fields (NeRF) in novel view synthesis using implicit representations, researchers have actively explored adapting similar concepts to other 3D graphics primitives. The most successful among them is Gaussian Splatting (GS), a method based on a point-cloud-like representation known as Gaussian Splats.
Unlike simple 3D points that encode only position, Gaussian Splats store local volumetric information by associating each point with a covariance matrix, modeling a Gaussian distribution in 3D space. These splats can be efficiently rendered by projecting and rasterizing them onto an image plane, enabling real-time applications as demonstrated in the paper.
In this assignment, we will explore the core principles of the Gaussian Splat rendering algorithm by implementing its key components. As in our previous assignment on NeRF, we strongly encourage you to review the paper beforehand or while working on this assignment.
To get started, clone this repository first.
git clone --recursive https://github.com/KAIST-Visual-AI-Group/CS479-Assignment-3DGS
We recommend creating a virtual environment using conda. To create a conda environment, issue the following command:
conda create --name cs479-gs python=3.10
This should create a basic environment with Python 3.10 installed.
Next, activate the environment and install the dependencies using pip:
conda activate cs479-gs
The remaining dependencies, including PyTorch, can be installed with the following commands:
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124
pip install torchmetrics[image]
pip install imageio[ffmpeg]
pip install plyfile tyro==0.6.0 jaxtyping==0.2.36 typeguard==2.13.3
pip install simple-knn/.
Register the project root directory (i.e., gs_renderer) as an environment variable so that the Python interpreter can locate our files:
export PYTHONPATH=.
By default, the configuration is set to render the lego scene. You can select a different scene by modifying Args in render.py. Run the following command to render the scene:
python render.py
For now, running this command will result in an error, as the Gaussian Splat files have not been downloaded yet.
All by-products made during rendering, including images, videos, and evaluation results, will be saved in an experiment directory under outputs/{SCENE NAME}.
This codebase is organized as the following directory tree. We only list the core components for brevity:
gs_renderer
│
├── data <- Directory for data files.
├── src
│ ├── camera.py <- A light-weight data class for storing camera parameters.
│ ├── renderer.py <- Main renderer implementation.
│ ├── scene.py <- A light-weight data class for storing Gaussian Splat parameters.
│ └── sh.py <- A utility for processing Spherical Harmonic coefficients.
├── evaluate.py <- Script for computing evaluation metrics.
├── render.py <- Main script for rendering.
├── render_all.sh <- Shell script for rendering all scenes for evaluation.
└── README.md <- This file.
Download the scene files (data.zip) from here and extract them into the root directory. After extraction, the data directory should be structured as follows:
data
│
├── nerf_synthetic <- Directory containing camera parameters and reference images
│ ├── chair
│ ├── drums
│ ├── lego
│ └── materials
├── chair.ply <- Gaussian splats for "Chair" Scene.
├── drums.ply <- Gaussian splats for "Drums" Scene.
├── lego.ply <- Gaussian splats for "Lego" Scene.
└── materials.ply <- Gaussian splats for "Materials" Scene.
Implement the coordinate transformation from world space to normalized device coordinates (NDC) in the project_ndc method of renderer.py.
Given a homogeneous coordinate $\mathbf{x} = (x, y, z, 1)^\top \in \mathbb{R}^4$, where $(x, y, z)$ is a point in world space, we project it onto the image plane in two steps. First, perform the matrix multiplication $\mathbf{x}' = \mathbf{P} \mathbf{V} \mathbf{x}$, where $\mathbf{V}$ is the world-to-camera transformation matrix and $\mathbf{P}$ is the projection matrix. Then, apply the perspective division $\bar{\mathbf{x}} = \mathbf{x}' / w'$, where $w'$ is the last component of $\mathbf{x}'$, to obtain the NDC coordinates.

Lastly, compute the binary mask indicating the points that are behind the near plane by checking whether the camera-space depth of each point is smaller than the near-plane distance.
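Putting these pieces together, a minimal sketch of the procedure in PyTorch might look as follows. The function name, tensor shapes, and the row-vector matrix convention are assumptions for illustration and may differ from the skeleton:

```python
import torch

def project_ndc_sketch(points, view_matrix, proj_matrix, near=0.01):
    """Hypothetical sketch: transform world-space points to NDC.

    Assumes `points` is (N, 3) and that the (4, 4) matrices right-multiply
    row vectors; the skeleton's actual convention may differ.
    """
    N = points.shape[0]
    ones = torch.ones(N, 1, device=points.device, dtype=points.dtype)
    points_h = torch.cat([points, ones], dim=-1)        # (N, 4) homogeneous

    # World -> camera -> clip space.
    points_clip = points_h @ view_matrix @ proj_matrix  # (N, 4)

    # Perspective division by the w component yields NDC coordinates.
    w = points_clip[:, 3:4]
    points_ndc = points_clip / (w + 1e-8)

    # Mask out points behind the near plane using camera-space depth.
    depth = (points_h @ view_matrix)[:, 2]
    in_frustum = depth > near
    return points_ndc, in_frustum
```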
Implement the projection of the covariance matrix onto the image plane in the compute_cov_2d method of renderer.py. You are only allowed to modify the code inside the block marked with TODO in the method.
After transforming the centers of 3D Gaussian splats to the camera space, compute the Jacobian matrix of the world-to-camera and projective transformations.
Specifically, we can use the Jacobian matrix

$$\mathbf{J} = \begin{pmatrix} f_x / z & 0 & -f_x x / z^2 \\ 0 & f_y / z & -f_y y / z^2 \end{pmatrix},$$

where $(x, y, z)$ is a splat center in camera space and $f_x$, $f_y$ are the focal lengths. In the skeleton, J is initialized with zeros of the correct shape. You need to fill in the correct values in the tensor.

Next, compute the covariance matrix in the image plane by projecting the world-space covariance matrix using the Jacobian matrix:

$$\Sigma' = \mathbf{J} \mathbf{W} \Sigma \mathbf{W}^\top \mathbf{J}^\top,$$

where $\mathbf{W}$ is the rotation component of the world-to-camera transformation and $\Sigma$ is the world-space covariance matrix of the splat.
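As a reference for the shapes involved, here is a minimal, hypothetical sketch of this computation in PyTorch; the function signature and per-splat batching are assumptions, not the skeleton's actual interface:

```python
import torch

def compute_cov_2d_sketch(mean_cam, focal_x, focal_y, cov_world, R_wc):
    """Hypothetical sketch of the covariance projection.

    mean_cam: (N, 3) splat centers in camera space.
    cov_world: (N, 3, 3) world-space covariance matrices.
    R_wc: (3, 3) rotation part of the world-to-camera transform.
    """
    N = mean_cam.shape[0]
    x, y, z = mean_cam[:, 0], mean_cam[:, 1], mean_cam[:, 2]

    # Jacobian of the projective transformation, one (2, 3) block per
    # splat, filled into a zero tensor as the assignment describes.
    J = torch.zeros(N, 2, 3, device=mean_cam.device, dtype=mean_cam.dtype)
    J[:, 0, 0] = focal_x / z
    J[:, 0, 2] = -focal_x * x / (z * z)
    J[:, 1, 1] = focal_y / z
    J[:, 1, 2] = -focal_y * y / (z * z)

    # Sigma' = J W Sigma W^T J^T, a (2, 2) covariance per splat.
    W = R_wc.expand(N, 3, 3)
    cov_2d = J @ W @ cov_world @ W.transpose(1, 2) @ J.transpose(1, 2)
    return cov_2d
```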
Finally, implement the rendering equation for point-based radiance fields in the render method of renderer.py, which computes pixel colors by blending the colors of 2D Gaussian splats stacked on the image plane.
Note that we assume the center coordinates and covariance matrices of 2D Gaussian splats lie in the image space. Due to memory constraints, the renderer divides an image into multiple tiles and processes each tile separately. For each tile, the renderer collects the 2D Gaussian splats projected onto the tile and accumulates the colors of the splats. The provided skeleton already implements this process, and you can use in_mask to identify the splats that should be used for rendering the current tile.
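For intuition, a tile mask of this kind could be formed by keeping splats whose screen-space extent overlaps the tile, as in the hypothetical sketch below (the names and the 3-sigma radius heuristic are illustrative; rely on the mask the skeleton provides):

```python
import torch

def tile_mask_sketch(means_2d, tile_x0, tile_y0, tile_size, radius):
    """Hypothetical sketch: keep splats overlapping a square tile.

    means_2d: (N, 2) splat centers in pixel coordinates.
    radius: (N,) per-splat extent, e.g. roughly 3 standard deviations.
    """
    x, y = means_2d[:, 0], means_2d[:, 1]
    in_x = (x + radius > tile_x0) & (x - radius < tile_x0 + tile_size)
    in_y = (y + radius > tile_y0) & (y - radius < tile_y0 + tile_size)
    return in_x & in_y
```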
Implement the following four steps in the render method at the locations marked as TODO:

1. Sort the Gaussians in ascending order based on their depth.
2. Compute the displacement vector $\mathbf{d}_{i,j} \in \mathbb{R}^2$ between the center of the $i$-th pixel in the current tile and the $j$-th Gaussian splat indicated by in_mask.
3. Compute the Gaussian weight at the pixel center by evaluating the Gaussian distribution at the displacement vector as $\mathbf{w}_{i,j} = \exp ( -\frac{1}{2} \mathbf{d}_{i,j}^\top \Sigma_{j}^{-1} \mathbf{d}_{i,j} )$, where $\Sigma_{j}$ is the covariance matrix of the $j$-th 2D Gaussian splat.
4. Perform alpha blending to accumulate the colors of the splats, using the product of opacities and Gaussian weights from Step 3 to determine the final pixel colors. The color of the $i$-th pixel is computed as:

$$\mathbf{c}_i = \sum_{j} \mathbf{c}_j \alpha_{i,j} \prod_{k < j} \left( 1 - \alpha_{i,k} \right),$$

where $\alpha_{i,j} = o_j \mathbf{w}_{i,j}$ is the product of two factors that encode:

- Proximity: how close the $i$-th pixel is to the splat center in the image space, and
- Opacity: how opaque the splat is.
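The four steps map naturally onto batched tensor operations. The sketch below is a minimal, hypothetical implementation for a single tile; the shapes, names, and numerical-stability constants are assumptions rather than the skeleton's actual code:

```python
import torch

def blend_tile_sketch(depths, means_2d, covs_2d, colors, opacities, pixel_xy):
    """Hypothetical sketch of the four steps for one tile.

    Assumed shapes: depths (M,), means_2d (M, 2), covs_2d (M, 2, 2),
    colors (M, 3), opacities (M,), pixel_xy (P, 2) pixel centers.
    """
    # Step 1: sort splats front-to-back by depth.
    order = torch.argsort(depths)
    means_2d, covs_2d = means_2d[order], covs_2d[order]
    colors, opacities = colors[order], opacities[order]

    # Step 2: displacements d_{i,j} from splat centers to pixel centers.
    d = pixel_xy[:, None, :] - means_2d[None, :, :]        # (P, M, 2)

    # Step 3: Gaussian weights w_{i,j} = exp(-0.5 d^T Sigma^{-1} d).
    cov_inv = torch.linalg.inv(covs_2d)                    # (M, 2, 2)
    maha = torch.einsum("pmi,mij,pmj->pm", d, cov_inv, d)  # (P, M)
    w = torch.exp(-0.5 * maha)

    # Step 4: front-to-back alpha blending with accumulated transmittance.
    alpha = (opacities[None, :] * w).clamp(max=0.99)       # (P, M)
    trans = torch.cumprod(1.0 - alpha + 1e-10, dim=1)
    trans = torch.cat([torch.ones_like(trans[:, :1]), trans[:, :-1]], dim=1)
    pixel_colors = (alpha[..., None] * trans[..., None] * colors[None]).sum(1)
    return pixel_colors                                    # (P, 3)
```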
After completing the tasks above, run the script ./render_all.sh to render and save images for all scenes: chair, lego, materials, and drums.
To evaluate the results, run the following command:
python evaluate.py
This will create a file named metrics.csv in the current directory, which will be used for grading.
For reference, our implementation produces the following metrics:
| Scene | LPIPS (↓) | PSNR (↑) | SSIM (↑) |
|---|---|---|---|
| Chair | 0.037 | 27.043 | 0.953 |
| Lego | 0.047 | 25.723 | 0.939 |
| Materials | 0.043 | 25.014 | 0.937 |
| Drums | 0.079 | 21.583 | 0.896 |
| Average | 0.052 | 24.841 | 0.931 |
💡 For details on grading, refer to the Grading section.
Compile the following files as a ZIP file named {STUDENT_ID}.zip and submit the file via Gradescope.

- The folder gs_renderer that contains every source code file;
- A folder named {STUDENT_ID} with four subdirectories containing the rendered images (.png files) used for evaluation;
- A CSV file named {STUDENT_ID}.csv containing the evaluation metrics from the evaluate.py script.
You will receive a zero score if:
- you do not submit,
- your code is not executable in the Python environment we provided, or
- you modify any code outside of the section marked with TODO.
Plagiarism in any form will also result in a zero score and will be reported to the university.
Your score will incur a 10% deduction for each missing item in the What to Submit section.
Otherwise, you will receive up to 30 points from this assignment that count toward your final grade. Your submissions will be graded based on the average metrics calculated across the four scenes.
| Evaluation Criterion | LPIPS (AVG) (↓) | PSNR (AVG) (↑) | SSIM (AVG) (↑) |
|---|---|---|---|
| Success Condition (100%) | 0.065 | 22.000 | 0.900 |
| Success Condition (50%) | 0.080 | 20.000 | 0.850 |
As shown in the table above, each evaluation metric is assigned up to 10 points. In particular,
- LPIPS (AVG)
  - You will receive 10 points if the reported value is equal to or smaller than the success condition (100%);
  - Otherwise, you will receive 5 points if the reported value is equal to or smaller than the success condition (50%).
- PSNR (AVG)
  - You will receive 10 points if the reported value is equal to or greater than the success condition (100%);
  - Otherwise, you will receive 5 points if the reported value is equal to or greater than the success condition (50%).
- SSIM (AVG)
  - You will receive 10 points if the reported value is equal to or greater than the success condition (100%);
  - Otherwise, you will receive 5 points if the reported value is equal to or greater than the success condition (50%).
- torch-splatting: Our implementation is based on this repository.