Zero-shot, text-driven appearance manipulation on multiple views of an object to generate 3D renderings.

animikhaich/3D-Text2LIVE

3D-Text2LIVE (BU GRS CS640 Project 2022)

Zero-shot, text-driven appearance manipulation on multiple views of an object to generate 3D renderings.
Report Bug · Request Feature

Demo GIF


About The Project

This project is part of the Boston University course GRS CS640 (Artificial Intelligence) and builds on Text2LIVE. In particular, it involves the following three papers:

A wide range of editing effects is now available to content creators thanks to extensive research into changing the appearance and style of objects in photographs. However, the majority of research in this field focuses on global editing rather than localized editing. To address this, Bar-Tal et al. (2022) developed an algorithm for localized editing of images using only a text prompt. Given the substantial work being done on 3D objects and the widespread use of 3D models in CAD modeling and video games, the same flexibility and range of editing effects ought to be available in 3D. We therefore propose 3D Text2LIVE, which gives the same degree of creative control over the appearance and style of 3D models as is possible with 2D photographs.

Report and Presentation

Models and Results

The trained models and results are available here: Google Drive

Proposed Architecture

Architecture

Sample Results

Golden Hotdog


Ice Chair


Ship on Fire


Hardware Requirements

We recommend an Nvidia GPU for training the models. Based on our experiments, the following specifications are recommended:

  • Text2LIVE: Nvidia A100 (or any GPU with more than 18 GB of VRAM)
  • NeRF: Nvidia Tesla V100 (or any GPU with at least 11 GB of VRAM)
  • DreamFusion3D: Nvidia Tesla V100 (or any GPU with more than 16 GB of VRAM)
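As a quick way to check a machine against the list above, the following is a minimal sketch and not part of the project's code. The names `RECOMMENDED_VRAM_GB`, `stages_supported`, and `detect_vram_gb` are hypothetical helpers introduced here for illustration. The thresholds are copied from the list, and the sketch assumes PyTorch is the installed framework when querying the GPU; without PyTorch or a CUDA device, detection simply returns `None`.

```python
# Hypothetical helper: compares available VRAM with the recommendations above.
# Each entry is (threshold in GB, strict), where strict=True means the GPU
# must have MORE than the threshold and strict=False means "at least".
RECOMMENDED_VRAM_GB = {
    "Text2LIVE": (18, True),      # more than 18 GB
    "NeRF": (11, False),          # 11 GB or higher
    "DreamFusion3D": (16, True),  # more than 16 GB
}


def stages_supported(vram_gb):
    """Return the stages whose recommended VRAM is met by vram_gb."""
    supported = []
    for stage, (threshold, strict) in RECOMMENDED_VRAM_GB.items():
        ok = vram_gb > threshold if strict else vram_gb >= threshold
        if ok:
            supported.append(stage)
    return supported


def detect_vram_gb(device_index=0):
    """Return total VRAM of a CUDA device in GiB, or None if unavailable.

    Assumes PyTorch; returns None when it is not installed or no CUDA
    device is visible.
    """
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    props = torch.cuda.get_device_properties(device_index)
    return props.total_memory / 1024**3


if __name__ == "__main__":
    vram = detect_vram_gb()
    if vram is None:
        print("No CUDA GPU detected (or PyTorch is not installed).")
    else:
        print(f"Detected {vram:.1f} GiB of VRAM; "
              f"meets recommendations for: {stages_supported(vram)}")
```

Note that the reported total memory is usually slightly below the card's nominal size, so a GPU sitting right at a threshold may show up as falling short of it.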

License

Distributed under the GNU AGPL V3 License. See LICENSE for more information.

Contributors

Animikh Aich

Himanshu Patil

Vedika Srivastava

Acknowledgements
