V3 and Gaussian Splatting #15
…e simon for wip docs
…raint is arbitrarily set atm
…dition to videos. Fix typos
- Added Support for initial post parameters. Includes training types ("gaussian", "tensorf"), output types ("video", "ply", "splat", "model") - Added initial user defined job configs stored in scene record (JobRecord). Contains sfm_config and nerf_config for corresponding workers - Added from_dict() method in scene.py for dataclasses
…c job config/dispatch
…ed with GoWebServer in near future
…nerf-worker. Colmap white-background detection support.
Add gaussian splatting, and fix typos.
Update guassian requirements, roadmap, and installation
Fix bold and update shields
Also, I tried really hard to keep the commit history for the workers (sfm is the only one that still exists in this version), but I couldn't get it to transfer to their repos (aside from go-web-server, which I started from scratch in its own repo). So there won't be any diff data for the submodules, and little commit history for them, but I tried to make up for it with a lot of in-code and wiki documentation.
Just gonna merge, as we are on an indefinite hiatus until more RCOS attention is gathered, and I'd like to add some more features on my own time.
Hi all, this is the partner PR to frontend, but it's really, really big, so I'm sorry.
Anyhow, here's the writeup, but it's not nearly everything; a lot is documented (1) in the code, (2) in the wiki, and (3) in the issues of this repo and the submodule repos.
Hey All! I said I'd work on this project over the summer, but I really underestimated how hard it is to get a job right now, so prepare for way too many changes for a single PR.
TLDR: My feature bloat had feature bloat.
Reasoning
Over the spring semester, we really tried to replace TensoRF with Gaussian splatting, and got demolished by NVIDIA errors for no reason. So, I ended up getting rid of the entire build system and moving to Docker only (which brought a bunch more pain). With Gaussian splatting we get a bunch of shiny new features, mainly real-time scene rendering (I get 150fps on my desktop and 60ish on the lower-tier rpi laptop). This gave me the reason I needed to finally learn some frontend tech. Overall, I kept changing something, and that led me to something else I could add.
I've tried my best to not override too much work and only make noncontroversial improvements, but there may be some toes I stepped on. If I did, you have my apologies for that.
Changes
With V3 comes an overall restructure of the web server.
Originally, I wrote the new API and features in the Python server, but I wanted to learn Go, and it had a lot of really nice features for a web server (JSON/BSON auto marshal by tagging, easy concurrency, strong HTTP frameworks, better MongoDB integration, and good error handling), so I eventually bit the bullet and tried to do basically a 1-to-1 translation from what I had in Python to Go.
Most of the services you would see in the old webserver are still here, just slightly more decoupled. Refer to the inline comments and the go-web-server wiki for a more in-depth overview of the changes. It is almost all composition by dependency injection (although some interfaces could be created to make it better).
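As a rough sketch of what composition by dependency injection can look like in Go, the example below wires a service from its collaborators through a constructor. Everything here is hypothetical: the SceneStore and JobPublisher interfaces, the method names, and the in-memory fakes are assumptions for illustration and do not mirror the actual go-web-server types. It also shows the kind of interface the note above says could be added.

```go
package main

import (
	"errors"
	"fmt"
)

// SceneStore is a hypothetical interface over a database manager.
type SceneStore interface {
	OwnerOf(sceneID string) (owner string, found bool)
}

// JobPublisher is a hypothetical interface over the message queue service.
type JobPublisher interface {
	Publish(sceneID string) error
}

// ErrNotOwner is returned when a scene is missing or belongs to someone else.
var ErrNotOwner = errors.New("scene does not belong to user")

// ClientService receives its collaborators instead of constructing them,
// so each one can be swapped (for example, with a fake in tests).
type ClientService struct {
	scenes SceneStore
	jobs   JobPublisher
}

func NewClientService(scenes SceneStore, jobs JobPublisher) *ClientService {
	return &ClientService{scenes: scenes, jobs: jobs}
}

// StartJob checks ownership, then hands the job to the publisher.
func (c *ClientService) StartJob(userID, sceneID string) error {
	owner, found := c.scenes.OwnerOf(sceneID)
	if !found || owner != userID {
		return ErrNotOwner
	}
	return c.jobs.Publish(sceneID)
}

// mapStore and recordingPublisher are in-memory fakes for the sketch.
type mapStore map[string]string

func (m mapStore) OwnerOf(id string) (string, bool) {
	owner, ok := m[id]
	return owner, ok
}

type recordingPublisher struct{ published []string }

func (p *recordingPublisher) Publish(id string) error {
	p.published = append(p.published, id)
	return nil
}

func main() {
	pub := &recordingPublisher{}
	svc := NewClientService(mapStore{"scene-1": "alice"}, pub)
	fmt.Println(svc.StartJob("alice", "scene-1")) // owner: job is published
	fmt.Println(svc.StartJob("bob", "scene-1"))   // not the owner: rejected
	fmt.Println(pub.published)
}
```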
- WebServer is a translation of the main HTTP handler. In this version, almost all non-HTTP processing has been factored out into ClientService. It now has a lot more REST API endpoints and can handle JWT tokens for auth. Incoming requests are now automatically validated and have constraint enforcement.
- ClientService now does most of the heavy lifting of REST API request handling. It delegates to the correct database managers and hands new/finished jobs to the AMPQService. Requested resources are now required to belong to the requesting user.
- AMPQService is an object-oriented version of the old thread-based rabbitmqservice. It is now much more tolerant of connection issues and has the same publisher/consumer concurrency as the old system.
API Changes
Overall, there has been an expansion of what external API endpoints are available, including progress reports, scene metadata, user endpoints, more output handles, previews, and more.
Also, responses now carry more descriptive error messages and accurate HTTP status codes.
See API Changes for more.
Worker Communication
Worker communication should be mostly the same as before. However, more information is now communicated between workers to support background colors for better training, dynamic configs, and error tolerance.
See Worker Communication for more.
Worker Services
Since we are a microservice architecture, I wanted to actually have the micro part. All the Dockerfiles should be new and improved, with smaller dependencies, experimental BuildKit caching for much faster image builds (from the second build onwards), and multistage builds to reduce container size. This is also a reason for the switch to Go: the webserver container is about 25MB instead of 7GB, so we can eventually run multiple load-balanced instances.
Also, we were previously handling environment variables a little weirdly. Now, if building with the backend docker compose strategy, the .env file should be in the project root, and all the relevant environment variables should be passed to the services. For local runs, the services can still load their own .env files, but in most cases that will be unnecessary.
The Elephant in the Room (Gaussian Splatting)
The root cause of all this is the new nerf-worker, which is essentially a wrapper around the Inria Gaussian Splatting code (with some improvements). The nerf-worker now has two main services
A user can now train a single scene for up to 30000 iterations (~25 minutes on a 3080 Ti), choose their output types, and save up to 5 selected training iterations for output. One thing to note is that .ply files are rather large (50-500MB), so unless we ever get the school to pay for dedicated S3 or remote storage, nerf-worker and go-web-server will eat at your file system.
You can see the Nerf Worker wiki for more detailed changes.
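The limits described above (up to 30000 iterations, up to 5 saved iterations, and the output types "video", "ply", "splat", "model" listed in the commit notes) could be enforced with a check along these lines. This is only a sketch: the TrainRequest type and its field names are hypothetical, and the rules that iterations start at 1, that saved iterations must not exceed the total, and that at least one output type is required are assumptions, not documented behavior.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	maxIterations     = 30000 // upper bound stated in the PR description
	maxSaveIterations = 5     // upper bound stated in the PR description
)

// validOutputs lists the output types named in the commit notes.
var validOutputs = map[string]bool{"video": true, "ply": true, "splat": true, "model": true}

// TrainRequest is a hypothetical request shape for illustration only.
type TrainRequest struct {
	OutputTypes    []string
	Iterations     int
	SaveIterations []int
}

// Validate returns a descriptive error for the first violated constraint.
func (r TrainRequest) Validate() error {
	if r.Iterations < 1 || r.Iterations > maxIterations {
		return fmt.Errorf("iterations must be between 1 and %d, got %d", maxIterations, r.Iterations)
	}
	if len(r.SaveIterations) > maxSaveIterations {
		return fmt.Errorf("at most %d save iterations allowed, got %d", maxSaveIterations, len(r.SaveIterations))
	}
	for _, it := range r.SaveIterations {
		// Assumption: a saved iteration must fall inside the training run.
		if it < 1 || it > r.Iterations {
			return fmt.Errorf("save iteration %d is outside 1..%d", it, r.Iterations)
		}
	}
	if len(r.OutputTypes) == 0 {
		return errors.New("at least one output type is required")
	}
	for _, o := range r.OutputTypes {
		if !validOutputs[o] {
			return fmt.Errorf("unknown output type %q", o)
		}
	}
	return nil
}

func main() {
	ok := TrainRequest{OutputTypes: []string{"splat", "ply"}, Iterations: 30000, SaveIterations: []int{7000, 30000}}
	bad := TrainRequest{OutputTypes: []string{"gif"}, Iterations: 30000}
	fmt.Println(ok.Validate())
	fmt.Println(bad.Validate())
}
```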
Service Removal / Submoduling
Originally, I wanted to allow the dev to choose between the Flask server and the Go server, as well as between TensoRF and Gaussian Splatting, and I may add that in the future. However, in my opinion go-web-server and Gaussian Splatting have such overwhelming advantages that I have removed the older services in this version. That is a main reason why I created the frontend/backend repositories: so that I do not override a rich commit history for other services and versions.
I did try my best (with a lot of help from ChatGPT) to filter all the vidtonerf commits by file so that I could carry as much of the history as possible from VidToNerf and web-app-react over to backend/frontend, but I could never quite make it work. Instead, I've tried to transfer as many issues as possible from the old repositories and update them all.
Each service is now its own repository and submoduled in the backend repo, allowing for more decoupling of version control, and allowing easier issue identification and personalized wikis. Also, if we ever add GitHub actions, it will be much easier to add to each individual repository.
Some Backend Notes
I have fleshed out a lot of the wiki pages, so please consult them if you have questions. Also, I fully expect nerf-worker to need some finagling before it builds for the team, so please please please reach out to me over Discord for help. The single largest time sink for this project was the month and a half I spent trying to get Gaussian splatting into Docker, so I expect some issues. I did document a lot of my process locally and a little on the nerf-worker wiki, but there are guaranteed to be some pieces that I missed.
Frontend
Typescript
I have transferred all of web-app-react to TypeScript to speed up development (and also because I needed it to make the real-time splat scene viewer).
Tech Stack
web-app-react was moved to Web-App-Vite and then frontend. Originally, this was a Create-React-App (CRA) template, but that comes with webpack as the bundler. This would normally be fine, but (A) much better bundlers exist, and (B) webpack hates Web Worker threads (which were an absolute must for the SplatCloudViewer), so I have moved the application to Vite. You can use either npm or yarn as your node package manager and be fine.
UI/UX
The original React app used react-bootstrap for the UI components, a perfectly fine choice. However, TypeScript has a few hiccups resolving its types, and the red squigglies were annoying. Also, I am anything but good at UI, so I tried switching to a more modern component library, Mantine, but the webpage now looks a tad soulless. A lot of super cool libraries came out this year, like Shadcn, so if design is your forte, please please take the reins and give the page some personality. Also, the About page needs some love.
Improvements/Features
There was a lot added to the webpage
Missing Work
- About page needs a redesign to reflect the new architecture
- Community was never implemented, but no time like the present
Testing Apologies
I always intended to get unit testing done, but during active development I have probably run over 5000 manual functional/system and unit tests, and just kept plugging along. I can vouch for a large amount of the functionality, but cannot guarantee it. It's a total douche-canoe move to leave this to the next team, so I will also be working heavily on testing in the near future.
Relevant PRs
Frontend PR