Here we provide instructions and code for running evaluations for the tasks in AVDB.
See AOS_random_baseline for an example of using the gym-AVD environment to produce outputs for evaluation. The evaluation expects a single file that holds a model's chosen paths from each starting image to a destination image for the object of interest. The format of the file is as follows:
A JSON file holding a single Dict with Keys=scene-names, Values=
`a Dict with Keys=object-ids, Values=`
`a Dict with Keys=initial-image-name, Values=PATH`
Here PATH is a list of integers, where each integer represents an action taken. The mapping from integers to actions is:
- 0:forward
- 1:backward
- 2:rotate-cw
- 3:rotate-ccw
(we do not allow right and left movement)
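As a rough illustration of the file format described above, the following sketch builds the nested dictionary and writes it to disk. The scene names, object ids, image names, the output filename `model_paths.json`, and the helper `add_path` are all hypothetical placeholders and are not part of the AVD code. Note that JSON object keys are always strings, so object ids are stored as strings here.

```python
import json

# Integer action codes, as listed in the mapping above.
ACTIONS = {0: "forward", 1: "backward", 2: "rotate-cw", 3: "rotate-ccw"}


def add_path(results, scene, object_id, start_image, path):
    """Store one path in the scene -> object id -> start image layout.

    Hypothetical helper for illustration only.
    """
    if any(action not in ACTIONS for action in path):
        raise ValueError(f"unknown action code in {path}")
    # JSON keys must be strings, so the object id is converted explicitly.
    by_object = results.setdefault(scene, {})
    by_image = by_object.setdefault(str(object_id), {})
    by_image[start_image] = list(path)
    return results


results = {}
# Placeholder scene names, object ids, and image names.
add_path(results, "scene_A", 5, "img_0001.jpg", [0, 0, 2, 0])
add_path(results, "scene_A", 5, "img_0002.jpg", [3, 0])
add_path(results, "scene_B", 12, "img_0100.jpg", [2, 2, 0])

# Write the single dict as one JSON file.
with open("model_paths.json", "w") as f:
    json.dump(results, f)
```

The real scene names, object ids, and image names should come from the dataset; AOS_random_baseline shows how the actual outputs are produced.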
The evaluation first checks that each path is valid from the starting image to the destination, so it is fine to output noisy paths.
It then computes 3 evaluation metrics:
- Success rate: how often a destination image is reached
- Average path length of successful episodes
- Average (shortest path length)/(model path length) of successful episodes. This has a max of 1, higher is better, and is useful for comparing across scenes of different sizes.
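The three metrics above can be sketched as follows. This is only an illustration of the arithmetic, not the repository's evaluation code: the function `summarize`, the tuple layout of each episode, and the example numbers are assumptions, and the shortest path lengths are given by hand here rather than computed from a scene. Treating a zero-length successful path as a ratio of 1 is also an assumption.

```python
def summarize(episodes):
    """Compute the three metrics from per-episode records.

    Each episode is a tuple (hypothetical layout):
    (reached_destination, model_path_length, shortest_path_length)
    """
    total = len(episodes)
    successes = [(m, s) for reached, m, s in episodes if reached]
    success_rate = len(successes) / total if total else 0.0
    if not successes:
        return {"success_rate": success_rate,
                "avg_path_length": None,
                "avg_path_ratio": None}

    # Average model path length over successful episodes only.
    avg_path_length = sum(m for m, _ in successes) / len(successes)
    # Shortest/model length ratio; 1.0 means the model path was optimal.
    # Assumption: a zero-length successful path counts as a ratio of 1.
    ratios = [s / m if m > 0 else 1.0 for m, s in successes]
    avg_path_ratio = sum(ratios) / len(ratios)
    return {"success_rate": success_rate,
            "avg_path_length": avg_path_length,
            "avg_path_ratio": avg_path_ratio}


# Made-up episodes: three successes and one failure.
episodes = [(True, 10, 5), (True, 4, 4), (False, 20, 6), (True, 8, 2)]
metrics = summarize(episodes)
print(metrics)
```

With these made-up episodes the success rate is 3/4, and the failed episode contributes to neither the average path length nor the ratio.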
We use the MSCOCO evaluation code to compute the mAP metric for object detection.
- Build the coco evaluation cython code
cd COCO_evaluation/cocoapi/PythonAPI/
make all
cd ../../
- Convert AVD annotations to COCO format yourself, or download the converted files
To download the files:
mkdir Data
cd Data
Download the tar here
tar -xf tdid_gt_boxes.tar
Or to convert yourself:
cd evaluation/
#Update paths in `` with:
#a path to save the annotations, we will call it VAL_GROUND_TRUTH_BOXES
#now update the scene_list in
#to make the test set
#change the path to save the annotations, we will call it TEST_GROUND_TRUTH_BOXES
See COCO_evaluation/
for an example of running the evaluation code for the full object detection and few-shot detection tasks.
Alternatively, see the project target_driven_instance_detection
for an example of an instance detector trained/tested on AVD.