PaliGemma Android HF

This repository runs inference with the PaliGemma vision-language model from an Android app through the Hugging Face Gradio Client API. Supported tasks include zero-shot object detection, image captioning, and visual question answering.
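Each of the tasks above is selected by the text prompt sent along with the image. The sketch below shows one way to build those prompts, assuming the task prefixes used by the PaliGemma "mix" checkpoints (`caption`, `detect`, `segment`, `answer`). The helper name `build_prompt` and its parameters are hypothetical and not taken from this repository; it is written in Python for brevity, and the same string logic carries over to Kotlin or Java on Android.

```python
def build_prompt(task: str, text: str = "", lang: str = "en") -> str:
    """Build a PaliGemma task prompt (hypothetical helper, not the repository's code).

    task: one of "caption", "detect", "segment", "vqa".
    text: object name(s) for detect/segment, or the question for vqa.
    lang: language code for caption and vqa prompts (assumed default "en").
    """
    task = task.lower()
    if task == "caption":
        return f"caption {lang}"
    if task in ("detect", "segment", "vqa") and not text.strip():
        raise ValueError(f"task '{task}' needs a non-empty text argument")
    if task == "detect":
        # Several classes can be requested in one prompt, separated by " ; ".
        return f"detect {text.strip()}"
    if task == "segment":
        return f"segment {text.strip()}"
    if task == "vqa":
        return f"answer {lang} {text.strip()}"
    raise ValueError(f"unknown task: {task}")
```

For example, `build_prompt("detect", "cat ; dog")` yields the prompt `detect cat ; dog`, which would be sent to the hosted model together with the image.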

Pipeline:

Logo

Demo Outputs:

Visual question-answering, zero-shot object detection, image captioning

Logo
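To draw detection results like those shown above, the model's raw text output has to be turned into pixel boxes. The sketch below assumes the raw PaliGemma detection format: four `<locNNNN>` tokens per object in the order y_min, x_min, y_max, x_max, each a bin index out of 1024, followed by the label, with objects separated by `;`. The function name `parse_detections` is hypothetical, and the endpoint used by this app may already return post-processed boxes, in which case this step is unnecessary.

```python
import re
from typing import List, Tuple

# Four location tokens followed by a label that runs until ';' or the next token.
_DETECTION = re.compile(
    r"<loc(\d{4})><loc(\d{4})><loc(\d{4})><loc(\d{4})>\s*([^;<]+)"
)

Box = Tuple[str, float, float, float, float]  # (label, x_min, y_min, x_max, y_max)


def parse_detections(output: str, width: int, height: int) -> List[Box]:
    """Convert raw PaliGemma detection text into pixel-space boxes.

    Assumes token order y_min, x_min, y_max, x_max with 1024 bins per axis.
    """
    boxes: List[Box] = []
    for match in _DETECTION.finditer(output):
        # Normalise each bin index to the range [0, 1).
        y_min, x_min, y_max, x_max = (int(match.group(i)) / 1024 for i in range(1, 5))
        label = match.group(5).strip()
        # Scale to the original image size, returning x before y.
        boxes.append((label, x_min * width, y_min * height, x_max * width, y_max * height))
    return boxes
```

The boxes are returned in x-first order because that is what most drawing APIs expect, including Android's `Canvas.drawRect(left, top, right, bottom, paint)`.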

Referring Expression Segmentation

Logo