Dataset.add does not handle id #24

Open
wuyuanyi135 opened this issue Dec 16, 2019 · 5 comments

Comments

@wuyuanyi135

When Dataset.add(str) is called, the new image is not given a distinct id. Calling Dataset.add(str) again overwrites it, because both images end up with the same id.
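
To illustrate the behaviour, here is a minimal sketch that does not use imantics itself. It assumes the dataset keeps its images in a dict keyed by image id; `ToyImage` and `ToyDataset` are hypothetical stand-ins, not the library's classes, and the default id of 0 is taken from the reports in this thread. Under that assumption, two images with the same id collapse into a single entry.

```python
from dataclasses import dataclass, field


@dataclass
class ToyImage:
    """Hypothetical stand-in for an image; the default id of 0 is an assumption."""
    path: str
    id: int = 0


@dataclass
class ToyDataset:
    """Hypothetical stand-in that indexes images by id (not imantics' actual code)."""
    images: dict = field(default_factory=dict)

    def add(self, image: ToyImage) -> None:
        # A dict keyed by id keeps only the last image stored under each id.
        self.images[image.id] = image


dataset = ToyDataset()
dataset.add(ToyImage("a.png"))
dataset.add(ToyImage("b.png"))  # same default id, so it replaces a.png

print(len(dataset.images))     # 1
print(dataset.images[0].path)  # b.png
```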

@Ira236

Ira236 commented Dec 21, 2019

Same issue :( Did you solve it?

@jsbroks
Owner

jsbroks commented Jan 10, 2020

Could you provide example code?

@saurabheights
Contributor

saurabheights commented Apr 24, 2020

@jsbroks
I'm working on converting the KITTI 2D object detection dataset to COCO format. KITTI has one folder of images and another folder of label .txt files, so each image has a parallel label file with the same name and a different extension.

I load all the images, and each one ends up with id 0. Adding these images to the dataset leaves only a single image in the output when dumping the JSON.

    from imantics import Category, Dataset, Image
    from tqdm import tqdm

    # Helper copied from the Image class to add custom sorting and load only num_samples images.
    # It is not needed to reproduce the issue.
    images = get_images_from_folder(Image, images_root_dir, num_samples)

    # Update the id of each image <- without this, it won't work
    # for index, image in enumerate(images):
    #     image.id = index

    # Create categories <- not needed to reproduce the issue, but relevant to my question later.
    categories = ['Car', 'Van', 'Truck', 'Pedestrian', 'Person_sitting', 'Cyclist', 'Tram', 'Misc', 'DontCare']
    category_objects = [Category(name=name, id=index) for index, name in enumerate(categories)]
    category_name_to_object_map = dict(zip(categories, category_objects))

    # Get the annotation file names, sorted (os.walk doesn't return files sorted by name).
    annotation_files = get_annotations_files(labels_root_dir, num_samples)
    # Read every category and bounding box for each image and add them to that image.
    for file_id, annotation_file in enumerate(annotation_files):
        # Distinct names here avoid shadowing the `categories` list defined above.
        category_names, bounding_boxes = read_annotation_file(annotation_file)
        for category_name, bounding_box in zip(category_names, bounding_boxes):
            images[file_id].add(bounding_box, category_name_to_object_map[category_name])

    # I have tried passing the images directly to the Dataset constructor and adding them in a loop.
    # Both end with a recursion limit error when num_samples is around 7500, i.e. all KITTI images.
    dataset = Dataset('kitti', images=[], id=1, metadata={}, info=info, licenses=licenses)
    print('Adding images to dataset')
    for image in tqdm(images):
        dataset.add(image)

@saurabheights
Contributor

saurabheights commented Apr 24, 2020

So there are three issues in total above:

a. I don't have a clear idea of how to use the library. A small tutorial on how the API is meant to be used would be quite useful, especially since a lot happens in the index operations.

b. The id issue stated above: if one loads multiple images from a folder, each image has the same id, i.e. 0, so each image overwrites the previous one when passed to the dataset. Overwriting seems like valid behaviour to me, since each image should have a unique id. Maybe all that is required is to set an incremental id in the Image.get_images_from_folder method, starting from 0 (or 1).

c. The recursion limit issue.

    Traceback (most recent call last):
      File "/home/sk/workspace/datasets/imantics/imantics/annotation.py", line 193, in index
        self.index(image)
      File "/home/sk/workspace/datasets/imantics/imantics/annotation.py", line 193, in index
        self.index(image)
      File "/home/sk/workspace/datasets/imantics/imantics/annotation.py", line 193, in index
        self.index(image)
      [Previous line repeated 996 more times]
      File "/home/sk/workspace/datasets/imantics/imantics/annotation.py", line 189, in index
        found = annotation_index.get(self.id)
    RecursionError: maximum recursion depth exceeded in comparison
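
The traceback shows `index` calling itself at line 193 after an id lookup at line 189. One possible reading, which is an assumption and has not been checked against the imantics source, is that it retries with the next id on every collision, so a long run of taken ids needs one stack frame per retry. The sketch below reproduces that pattern with hypothetical stand-in functions and compares it with a loop that performs the same search without growing the stack.

```python
import sys


def claim_id_recursive(taken: dict, candidate: int) -> int:
    """Hypothetical recursive probe: one stack frame per occupied id."""
    if taken.get(candidate) is not None:
        return claim_id_recursive(taken, candidate + 1)
    return candidate


def claim_id_iterative(taken: dict, candidate: int) -> int:
    """Same search written as a loop, so the stack depth stays constant."""
    while taken.get(candidate) is not None:
        candidate += 1
    return candidate


# Occupy more consecutive ids than the interpreter allows nested calls.
occupied = sys.getrecursionlimit() + 100
taken = {i: object() for i in range(1, occupied + 1)}

try:
    claim_id_recursive(taken, 1)
    recursive_failed = False
except RecursionError:
    recursive_failed = True

free_id = claim_id_iterative(taken, 1)
print(recursive_failed)         # True
print(free_id == occupied + 1)  # True
```

If the real method does behave this way, giving each object a distinct id up front would avoid the long collision chain in the first place.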

I will look into the recursion issue myself, as it's my top-priority task, but any quick insights would help a lot. P.S. I will move it to a separate issue if that is OK with you?

@amplejoe

amplejoe commented May 3, 2020

I'm having this issue as well: only a single image is added to a dataset, while annotations are indexed correctly. Steps I take:

  1. Create images from path (inside loop):
    image = imantics.Image.from_path('path/to/file')
  2. Add a number of Annotations to said image (which are indexed correctly):
    image.add(imantics_annot)
  3. Create dataset and add images to it (either using the constructor or the add method):
    ds = imantics.Dataset("test", images=image_array)

Current workaround: explicitly set the image id during image creation:

    image = imantics.Image(path='path/to/file', id=idx_number)
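
The same idea can be applied to images that already exist, by renumbering them before they go into the dataset. The sketch below uses a hypothetical `SimpleImage` stand-in rather than `imantics.Image` so that it runs without the library; it only assumes that the image object has a writable `id` attribute, as the commented-out loop earlier in this thread does.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class SimpleImage:
    """Hypothetical stand-in for imantics.Image with a writable id."""
    path: str
    id: int = 0


def assign_unique_ids(images: Iterable[SimpleImage], start: int = 1) -> List[SimpleImage]:
    """Give every image a distinct, consecutive id beginning at `start`."""
    numbered = list(images)
    for image_id, image in enumerate(numbered, start=start):
        image.id = image_id
    return numbered


images = assign_unique_ids(SimpleImage(f"img_{i}.png") for i in range(3))
print([image.id for image in images])  # [1, 2, 3]
```

Starting at 1 rather than 0 is a choice made here for illustration; the thread leaves open which starting value the library should use.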
