
Vision-Language Model-based Physical Reasoning for Robot Liquid Perception


🎥 Demo Video

Demo

  • Hardware and materials used in the demo:
    • 6-DoF mobile robot arm with wrist-mounted F/T sensor.
    • High-resolution RGB-D camera on a tripod.
    • Three common household liquids: Peanut Oil, Soy Sauce, and Whiskey.

Install Dependencies

To install necessary dependencies, run:

git clone git@github.com:laiwenq/VLM_liquid_perception.git
pip install -r requirements.txt

We use OpenAI's gpt-4-vision-preview as the backbone LVLM; feel free to substitute your own model.
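As a rough illustration of how a text-plus-image query to gpt-4-vision-preview can be assembled (the helper function name and prompt below are ours, not from this repository):

```python
import base64

def build_vision_messages(prompt: str, image_path: str) -> list:
    """Assemble one chat message carrying text plus a base64-encoded image."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")
    return [{
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }]

# The messages can then be sent with the official client, e.g.:
# from openai import OpenAI
# client = OpenAI()  # reads OPENAI_API_KEY from the environment
# reply = client.chat.completions.create(
#     model="gpt-4-vision-preview",
#     messages=build_vision_messages("Which liquid is this?", "frame.jpg"),
# )
```

Swapping in a different model is then a matter of changing the `model` string (or the client) while keeping the same message structure.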

Evaluation on Liquid Perception and Recognition Tasks

All the code needed for the evaluation is provided in a Jupyter notebook, along with the full prompts and evaluation data so you can try it yourself. Feel free to replace the recorded actions with real robotic actions for an online evaluation!
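One way to swap the notebook's recorded actions for real robot commands is to hide them behind a small interface; the class and method names below are illustrative, not taken from this repository:

```python
from abc import ABC, abstractmethod

class LiquidProbeAction(ABC):
    """Interface for one exploratory action (e.g. a shake or tilt)."""

    @abstractmethod
    def execute(self) -> dict:
        """Run the action and return its sensory observation."""

class RecordedShake(LiquidProbeAction):
    """Offline stand-in: replays a pre-recorded force/torque trace."""

    def __init__(self, ft_trace):
        self.ft_trace = ft_trace

    def execute(self) -> dict:
        # Offline evaluation just returns the logged sensor data.
        return {"action": "shake", "ft_trace": self.ft_trace}

# For an online evaluation, a real subclass would command the 6-DoF arm
# and read the wrist-mounted F/T sensor instead of replaying logged data.
```

With this shape, the rest of the evaluation loop stays unchanged whether the observations come from logs or from live hardware.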
