PromptGD - A simple baseline for Language-driven Grasp Detection task.

Introduction

Implementation code for PromptGD, a simple baseline for language-driven grasp detection.

Installation

  • Create environment:
git clone https://github.com/ZQuang2202/PromptGD.git && cd PromptGD
conda create -n grasp python=3.9
conda activate grasp
  • Install packages:
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch
pip install -r requirements.txt
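  • Verify the installation (optional; a quick check that the pinned PyTorch build sees CUDA):
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"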

Methodology

We introduce a novel paradigm that leverages language reasoning via a foundation vision-language model. By learning the semantic relationship between the image and the instruction text, this approach improves performance in open-vocabulary settings over traditional methods.
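
As a rough illustration of this idea (not the repository's actual architecture), the sketch below conditions convolutional grasp features on a frozen CLIP ViT-B/32 text embedding via FiLM-style modulation. All module names, shapes, and the fusion scheme are illustrative assumptions:

import torch
import torch.nn as nn
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)  # frozen vision-language model

class TextConditionedGraspHead(nn.Module):
    """Illustrative head: FiLM-style conditioning of grasp features on text."""
    def __init__(self, feat_channels=128, text_dim=512):
        super().__init__()
        self.film = nn.Linear(text_dim, 2 * feat_channels)          # per-channel scale and shift
        self.quality = nn.Conv2d(feat_channels, 1, kernel_size=1)   # per-pixel grasp-quality map

    def forward(self, visual_feat, text_emb):
        # visual_feat: (B, C, H, W); text_emb: (B, text_dim)
        scale, shift = self.film(text_emb).chunk(2, dim=-1)
        x = visual_feat * (1 + scale[..., None, None]) + shift[..., None, None]
        return torch.sigmoid(self.quality(x))

# Usage: encode an instruction with frozen CLIP and modulate dummy backbone features.
tokens = clip.tokenize(["grasp the mug by its handle"]).to(device)
with torch.no_grad():
    text_emb = clip_model.encode_text(tokens).float()  # (1, 512) for ViT-B/32
head = TextConditionedGraspHead().to(device)
quality_map = head(torch.randn(1, 128, 56, 56, device=device), text_emb)  # (1, 1, 56, 56)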

Datasets

Experiments use the Grasp-Anything dataset; point --dataset-path at the root of your local copy in the commands below.

Training

  • Train the baseline GR-ConvNet model (the optimizer flags across the three commands follow the usual PyTorch pattern; see the sketch after this list):
python3 train_network_baseline.py \
    --network grconvnet3 \
    --use-depth 0 \
    --dataset grasp-anything \
    --dataset-path $path_to_your_grasp_anything_dataset \
    --batch-size 32 \
    --batches-per-epoch 100 \
    --epochs 100 \
    --optim adam \
    --lr 0.001 \
    --lr-step-size 10 \
    --logdir logs/ \
    --seen 1
  • Train GR-ConvNet augmented with instructional text:
python3 train_network_with_clip.py \
    --network grconvnet3 \
    --use-depth 0 \
    --dataset grasp-anything \
    --dataset-path $path_to_your_grasp_anything_dataset \
    --batch-size 8 \
    --batches-per-epoch 600 \
    --epochs 100 \
    --optim adam \
    --lr 0.001 \
    --lr-step-size 10 \
    --logdir logs/ \
    --seen 1
  • Train PromptGD:
python3 train_network_PromptGD.py \
    --clip-version ViT-B/32 \
    --use-depth 0 \
    --dataset grasp-anything \
    --dataset-path $path_to_your_grasp_anything_dataset \
    --batch-size 8 \
    --batches-per-epoch 300 \
    --epochs 100 \
    --lr 0.003 \
    --lr-step-size 5 \
    --logdir logs/prompt_gd \
    --seen 1
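
As referenced above, the optimizer flags follow the conventional PyTorch pattern; a minimal sketch of how --optim adam, --lr, and --lr-step-size would typically be wired together (an assumption about the scripts, not a verbatim excerpt):

import torch

model = torch.nn.Conv2d(3, 1, kernel_size=3, padding=1)               # stand-in for the grasp network
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)            # --optim adam --lr 0.001
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)  # --lr-step-size 10

for epoch in range(100):          # --epochs 100
    for _ in range(100):          # --batches-per-epoch 100
        optimizer.zero_grad()
        loss = model(torch.randn(8, 3, 32, 32)).mean()  # dummy batch in place of real data and loss
        loss.backward()
        optimizer.step()
    scheduler.step()              # decay the learning rate every 10 epochs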

Testing

For testing, apply a similar command to evaluate the different baselines on different datasets:

python3 evaluate.py \
    --network $path_to_your_check_point \
    --input-size 224 \
    --dataset grasp-anything \
    --dataset-path  $path_to_your_grasp_anything_dataset \
    --use-depth 0 \
    --use-rgb 1 \
    --num-workers 8 \
    --n-grasp 1 \
    --iou-threshold 0.25 \
    --iou-eval \
    --seen 0
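
The --iou-eval and --iou-threshold flags point to the standard grasp-rectangle metric: a prediction counts as correct when its rectangle IoU with a ground-truth rectangle exceeds the threshold (0.25 here); the full metric of Kumra et al. additionally checks the grasp-angle difference (commonly under 30 degrees). A minimal sketch of the IoU part, assuming rectangles are 4x2 corner arrays and using shapely for polygon overlap (illustrative, not the repository's evaluation code):

import numpy as np
from shapely.geometry import Polygon

def grasp_iou(rect_a, rect_b):
    """IoU between two grasp rectangles given as 4x2 arrays of corner coordinates."""
    pa, pb = Polygon(rect_a), Polygon(rect_b)
    if not pa.intersects(pb):
        return 0.0
    return pa.intersection(pb).area / pa.union(pb).area

pred = np.array([[0, 0], [40, 0], [40, 20], [0, 20]])
gt = np.array([[5, 0], [45, 0], [45, 20], [5, 20]])
print(grasp_iou(pred, gt) > 0.25)  # True: counted as a successful grasp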

Acknowledgement

The code is developed based on the GR-ConvNet implementation of Kumra et al.
