
Vision-Language Model-based Physical Reasoning for Robot Liquid Perception


🎥 Demo Video

Demo

  • Hardware and materials used in the demo:
    • 6-DoF mobile robot arm with wrist-mounted F/T sensor.
    • High-resolution RGB-D camera on a tripod.
    • Three common household liquids: Peanut Oil, Soy Sauce, and Whiskey.

Install Dependencies

To install necessary dependencies, run:

git clone git@github.com:laiwenq/VLM_liquid_perception.git
pip install -r requirements.txt

We use OpenAI's gpt-4-vision-preview as the backbone LVLM; feel free to substitute your own model.
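As a rough illustration of how a text-plus-image query to gpt-4-vision-preview can be assembled (the helper function name and prompt below are ours, not from this repository):

```python
import base64

def build_vision_messages(prompt: str, image_path: str) -> list:
    """Assemble one chat message carrying text plus a base64-encoded image."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")
    return [{
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }]

# The messages can then be sent with the official client, e.g.:
# from openai import OpenAI
# client = OpenAI()  # reads OPENAI_API_KEY from the environment
# reply = client.chat.completions.create(
#     model="gpt-4-vision-preview",
#     messages=build_vision_messages("Which liquid is this?", "frame.jpg"),
# )
```

Swapping in a different model is then a matter of changing the `model` string (or the client) while keeping the same message structure.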

Evaluation on Liquid Perception and Recognition Tasks

All the code needed for the evaluation is provided in a Jupyter notebook, along with the full prompts and evaluation data so you can try it yourself. Feel free to replace the recorded actions with real robotic actions for an online evaluation!
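One way to swap the notebook's recorded actions for real robot commands is to hide them behind a small interface; the class and method names below are illustrative, not taken from this repository:

```python
from abc import ABC, abstractmethod

class LiquidProbeAction(ABC):
    """Interface for one exploratory action (e.g. a shake or tilt)."""

    @abstractmethod
    def execute(self) -> dict:
        """Run the action and return its sensory observation."""

class RecordedShake(LiquidProbeAction):
    """Offline stand-in: replays a pre-recorded force/torque trace."""

    def __init__(self, ft_trace):
        self.ft_trace = ft_trace

    def execute(self) -> dict:
        # Offline evaluation just returns the logged sensor data.
        return {"action": "shake", "ft_trace": self.ft_trace}

# For an online evaluation, a real subclass would command the 6-DoF arm
# and read the wrist-mounted F/T sensor instead of replaying logged data.
```

With this shape, the rest of the evaluation loop stays unchanged whether the observations come from logs or from live hardware.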
