-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add preliminary EyefulTower dataset support #2672
Add preliminary EyefulTower dataset support #2672
Conversation
4d7442b
to
76bae53
Compare
Also - I developed this on Windows. I had to add the change to disable I still wasn't able to run
|
@VasuAgrawal re: splits I think this is what you want: https://github.com/nerfstudio-project/nerfstudio/blob/main/nerfstudio/data/dataparsers/nerfstudio_dataparser.py#L211 IMO it would be nice if the nerfstudio dataparser used provided dataset-provided splits by default before falling back to "fraction." Happy to make a PR if there are no objections. |
This seems like a good idea to me! |
Agreed with using the filename split as well. :) I think we can also include a default method with default command line arguments, maybe with a method name "nerfacto-eyeful-tower"? I also think it would be cleaner if we move most of the code from "download_data.py" to a new file and only import the necessarily parts to connect it to the CLI. Cool work, excited to get this merged! |
Turns out someone already thought of adding split support: https://github.com/nerfstudio-project/nerfstudio/blob/main/nerfstudio/data/dataparsers/nerfstudio_dataparser.py#L195 - my bad! |
76bae53
to
e96f6db
Compare
aa2bc6b
to
25b6d58
Compare
UpdatesThanks for the help so far everyone! A couple updates:
|
I'll take a crack at splitting the updates into a separate file, and adding some documentation. I'm not sure about the dedicated command yet (i.e. |
Per @ethanweber 's recommendation, I added a new method to |
Hey @VasuAgrawal, one thing we could do is make a super small addition to your other repo https://github.com/facebookresearch/EyefulTower, which would hook the custom nerfacto config as a method. If this is interesting to you, let me know. It would avoid the dangerous precedent with "Method X Dataset" as you said. I think it would involve having two additional files like this...
|
Some more updates:
However, I didn't want users to have to manually check which datasets were fisheye and which ones weren't, and what an appropriate
This is roughly the extent of the fisheye support I want to add in this PR, and it may be too much already due to the aforementioned concerns. There's a few other things that would be really nice to have, e.g. specifying crop radius by degrees instead of pixels (to transparently handle different zooms), allowing for per-camera/per-frame radii, ensuring the sampling doesn't potentially produce out of bounds indices (with a sufficiently large radius, or rectangular image like those in Eyeful, I think this happens right now, but I think they just become redundant values and change the sampling to be non-uniform?). These things can be added in future PRs :).
|
Here's a couple fisheye results, on TensorBoard output also works with fisheye images, but I think there's some bugs in the pipeline somewhere as the renderings around the periphery don't look anywhere as good as those near the center. Not sure what's causing this right now. Also not sure how significant of a problem this is as the visualization the viewer seems to be okay? |
Installed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a pass with some comments. Looks close to being merged.
pyproject.toml
Outdated
# TODO(1480) enable when pycolmap windows wheels are available | ||
# "pycolmap>=0.3.0", # NOTE: pycolmap==0.3.0 is not available on newer python versions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's put this in a different PR, as it's not so related to adding support for EyefulTower.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I can pull it into a separate PR. I put it here since I was doing my development on Windows and this was blocking me from installing the environment, but now that it's done I can easily make a separate PR. Coming soon.
pyproject.toml
Outdated
# NOTE: Disabling projectaria-tools because it doesn't have prebuilt windows wheels | ||
# Syntax comes from here: https://pip.pypa.io/en/stable/reference/requirement-specifiers/ | ||
"projectaria-tools>=1.3.1; sys_platform != 'win32'", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not put this change in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
@@ -62,12 +62,12 @@ def list_images(data: Path) -> List[Path]: | |||
"""Lists all supported images in a directory | |||
|
|||
Args: | |||
data: Path to the directory of images. | |||
data: Path to the directory of images. Nested folders are searched as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should search nested folders by default. Can you make this by default turned off, and make a new parameter search_nested_folders: bool = False
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, can do.
nerfstudio/utils/tensor_dataclass.py
Outdated
else: | ||
# Don't broadcast the remaining fields | ||
new_dict[k] = v |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comes from the support for the fisheye mask radius parameter in the metadata
dict that I added. I discussed it a bit in the discussion in the PR, quoted below:
I'm not super happy with this solution since it doesn't show up in the config output in the command line, and requires updating the _broadcast_dict_fields methods to not overwrite metadata, whose implications I don't understand.
In slightly more detail - it's because when creating the Cameras
object (https://github.com/nerfstudio-project/nerfstudio/blob/main/nerfstudio/data/dataparsers/nerfstudio_dataparser.py#L292-L303), any populated metadata
seems to get dropped due to the call through this function. Specifically, we end up calling this _broadcast_dict_fields
method with dict_ = metadata, batch_shape = batch_shape
, and then iterating through all the keys of the metadata. Since the one key that's currently set (https://github.com/nerfstudio-project/nerfstudio/blob/main/nerfstudio/data/dataparsers/nerfstudio_dataparser.py#L291) is not a torch.Tensor
, TensorDataClass
, or Dict
, it silently falls through the if statement and gets dropped from the new output dictionary. The created Cameras
object then gets a blank metadata
dict instead of one containing fisheye_crop_radius
(and any other metadata keys that might be added in the future).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay got it. I think this is fine and won't have downstream implications. I see there was already support for metadata and "fisheye_crop_radius", so why wasn't this needed before for the base_datamanager.py Cameras inititialization?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prior to this, the metadata
in Cameras
was being overwritten to {}
, which I think was a bug in the implementation. I think the reason it didn't get caught is because the value is (probably?) normally specified as a command line parameter, which takes a different path to populate the fisheye mask radius and doesn't exercise this code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also fixed this here: #2783 (which includes a test)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few more comments.
I think I've addressed all the feedback. The other PRs will probably be in tomorrow-ish. Please take another look when you can and let me know if you'd like any additional changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few more comments, and I think we can get it merged asap. Thanks!
Returns: | ||
Paths to images contained in the directory | ||
""" | ||
allowed_exts = [".jpg", ".jpeg", ".png", ".tif", ".tiff"] + ALLOWED_RAW_EXTS | ||
image_paths = sorted([p for p in data.glob("[!.]*") if p.suffix.lower() in allowed_exts]) | ||
glob = "**/[!.]*" if recursive else "[!.]*" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT but can you rename this to glob_str
in case someone imports glob in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you got it, thanks!
nerfstudio/utils/tensor_dataclass.py
Outdated
else: | ||
# Don't broadcast the remaining fields | ||
new_dict[k] = v |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay got it. I think this is fine and won't have downstream implications. I see there was already support for metadata and "fisheye_crop_radius", so why wasn't this needed before for the base_datamanager.py Cameras inititialization?
I tried it on my end, and I was able to download and train both a perspective and fisheye dataset. Once you merge main and fix the merge conflict and change glob to glob_str, it should be ready @VasuAgrawal. |
I made these changes here https://github.com/nerfstudio-project/nerfstudio/tree/convert-cameras-json, but I didn't know how to push to your fork... |
Thanks @ethanweber ! I just pulled your changes into my fork and pushed. I made a separate fork and did the pull request though there since I didn't think I had write access directly to the Nerfstudio code, but if I do and you'd prefer I do any future commits directly as a nerfstudio branch for easier collaboration, I'm happy to do that. |
Just for fun, I did confirm that the default EyefulTower scene trains with |
Hi @VasuAgrawal, can you change one last thing? If you can remove this comment that would be good. Someone must have beat us to making this update. |
Using a fork worked just fine! Thanks for merging my changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm too!
* Add ability to download EyefulTower dataset * wip before I copy linning's stuff in * Generate per-resolution cameras.xml * Generate transforms.json at download * Fix a couple of quotes * Use official EyefulTower splits for train and val * Disable projectaria-tools on windows * Fix extra imports * Add a new nerfacto method tund for EyefulTower * Split eyefultower download into a separate file * Add some fisheye support for eyeful data * Reformatted imports to not be dumb * Apparently this file was missed when formatting originally * Added 1k resolution scenes * revert method_configs.py to original values * Also add 1k exrs * Address feedback * Revert changes to pyproject.toml, to be added in a later PR * Oops, probably shouldn't have gotten rid of awscli ... * changed glob variable name * Update tensor_dataclass.py --------- Co-authored-by: Ethan Weber <[email protected]> Co-authored-by: Brent Yi <[email protected]>
Summary
This PR adds preliminary support for the EyefulTower dataset. Specifically, the following main functionality:
ns-download-data
script, with selectable dataset (of 11) and resolution (of 4).transforms.json
upon download, allowing each dataset to be loaded into memoryUsage
You can download a single EyefulTower dataset and resolution with basic download command. This will grab the
riverview
scene at 2k JPEG resolution.You can use the flags to download more of the dataset, or just download it all.
Data is downloaded using the
aws s3 sync
command as suggested in the EyefulTower documentation, which means redundant data will not be redownloaded. This makes it easy to start small with the dataset (e.g. running onlyns-download-data eyefultower
) and eventually working your way through the commands until you've downloaded everything.Training
I've tested training on 3 scenes (riverview, office_view2, office1b), all using this command and parameters with the dataset swapped out:
All 3 datasets seemed to finish training and offer good results. See results section below.
Comments / questions / future work
Metashape: Originally, I tried using the
ns-process-data
script to convert from thecameras.xml
files provided with Eyeful into something that nerfstudio understands. This doesn't work as-is though, since the Eyeful datasets use rig constraints, which the metashape processing inprocess_data.py
doesn't understand. I made some progress towards implementing this and generated updatedcameras.xml
files for each of the downscaled datasets (which the original EyefulTower doesn't provide), but hit an issue when generating the transforms (since the exact tree isn't documented by metashape), and decided to just convert the existingcameras.json
like I did. Adding support for rigs would be a great future addition.EXR: I've generated the
transforms.json
file for the EXR images, but I haven't tried using them with any of the nerfstudio pipelines as I don't believe there's currently HDR training support. Assuming that's correct, I'm hoping that offering this dataset integration is a good first step towards adding HDR support in nerfstudio.Fisheye: 5 of the 11 datasets from EyefulTower are taken with the V1 rig, which uses fisheye cameras. I've marked them as
OPENCV_FISHEYE
in thetransforms.json
, but I haven't tested those datasets as I'm not sure how good the fisheye support is. I see that some support was just merged in this PR a few hours ago. Perhaps fisheye support for EyefulTower can be a future PR?Larger datasets: 6 of the 11 datasets from EyefulTower are taken with the V2 rig, and should in theory work fine with this code on a sufficiently powerful machine. On my workstation, though, I ran out of RAM when trying to load one of the larger scenes (apartment), so I stuck to the 3 smaller V2 scenes (riverview, office_view2, office1b). I believe the default dataloader tries to load the entire dataset into memory at once. Are there other dataloaders that can load the data in batches, rather than loading it all at once?
Automatic downscaling: The ns-process script automatically generates downsampled images. The method I took, directly generating the
transforms.json
, does not do that. This seems to work fine, at least for the smaller scenes as described above. I'm not sure exactly where the downsampled images are used, but I did notice that there's a line in the training output which says:Does this indicate that there's automatic downscaling that could be applied at runtime, to perhaps scale down the larger datasets (e.g. apartment) into something smaller, e.g. 1k or 512, such that the entire (downscaled) dataset could fit into memory? This question assumes there's no batched dataloader we could use instead.
Splits: The EyefulTower dataset provides a splits.json file indicating how data should be partitioned into a train and test split. Currently, I don't use this file, as I'm not sure what format nerfstudio expects the splits in, or if there's support at all. If someone could point me in the right direction, I can try to add support. Right now, all the cameras, both train and test, are dumped into the transforms.json.
Subsampled data: In a similar vein to the splits file, as a way to try to reduce the amount of data that's loaded, I also generate a transforms_300.json and transforms_half.json, which (as you might've guessed) contain 300 images and half the dataset, respectively. I couldn't find a flag that would let me use these transforms_*.json files rather than the original transforms.json file in the generated datasets, but I'd love to know if one exists.
Results