Qualcomm® AI Hub Models

The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) and ready to deploy on Qualcomm® devices.

  • Explore models optimized for on-device deployment of vision, speech, text, and generative AI.
  • View open-source recipes to quantize, optimize, and deploy these models on-device.
  • Browse through performance metrics captured for these models on several devices.
  • Access the models through Hugging Face.
  • Check out sample apps for on-device deployment of AI Hub models.
  • Sign up to run these models on hosted Qualcomm® devices.

Supported host operating systems for the Python package:

  • Linux (x86, ARM)
  • Windows (x86)
  • Windows (ARM, only via x86 Python, not ARM Python)
  • macOS (x86, ARM)

Supported runtimes

Models can be deployed on:

  • Android
  • Windows
  • Linux

Supported compute units

Supported precision

  • Floating point: FP16
  • Integer: INT8 (8-bit weight and activation on select models), INT4 (4-bit weight, 16-bit activation on select models)
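Several entries in the model directory carry a tag such as w8a16 or w8a8 in their package name. Read against the list above, the tag appears to encode weight and activation bit widths. The sketch below parses such a tag; the function name is hypothetical and is not part of qai_hub_models.

```python
import re
from typing import Tuple


def parse_precision_tag(tag: str) -> Tuple[int, int]:
    """Split a tag such as "w8a16" into (weight_bits, activation_bits).

    Hypothetical helper for illustration only.
    """
    match = re.fullmatch(r"w(\d+)a(\d+)", tag)
    if match is None:
        raise ValueError(f"not a precision tag: {tag!r}")
    return int(match.group(1)), int(match.group(2))


weight_bits, activation_bits = parse_precision_tag("w8a16")
print(weight_bits, activation_bits)  # 8 16
```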

Supported chipsets

Select supported devices

  • Samsung Galaxy S21 Series, Galaxy S22 Series, Galaxy S23 Series, Galaxy S24 Series
  • Xiaomi 12, 13
  • Google Pixel 3, 4, 5
  • Snapdragon X Elite CRD (Compute Reference Device)

and many more.

Installation

We currently support Python 3.9, 3.10 (recommended), 3.11, and 3.12. We recommend using a Python virtual environment (miniconda or virtualenv).

NOTE: Many quantized models are supported only with Python 3.10.
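If you want to check the interpreter before installing, a minimal sketch along these lines may help. The version set is copied from the text above; the function name and return strings are illustrative assumptions, not part of the package.

```python
import sys

# Python 3.9 to 3.12 are listed as supported above; 3.10 is recommended.
SUPPORTED_MINORS = {9, 10, 11, 12}
RECOMMENDED_MINOR = 10


def check_python(version=sys.version_info) -> str:
    """Classify a (major, minor, ...) version tuple. Hypothetical helper."""
    major, minor = version[0], version[1]
    if major != 3 or minor not in SUPPORTED_MINORS:
        return "unsupported"
    return "recommended" if minor == RECOMMENDED_MINOR else "supported"


print(check_python())  # result depends on the running interpreter
```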

You can set up a virtualenv using:

python -m venv qai_hub_models_env && source qai_hub_models_env/bin/activate

Once the environment is set up, you can install the base package using:

pip install qai_hub_models

Some models (e.g. YOLOv7) require additional dependencies. You can install those dependencies automatically using:

pip install "qai_hub_models[yolov7]"

Getting Started

Each model comes with the following set of CLI demos:

  • Locally runnable, PyTorch-based CLI demo to validate the model off device.
  • On-device CLI demo that produces a model ready for on-device deployment and runs the model on a hosted Qualcomm® device (needs sign up).

All the models produced by these demos are freely available on Hugging Face or through our website. See the individual model readme files (e.g. YOLOv7) for more details.

Local CLI Demo with PyTorch

All models contain CLI demos that run the model in PyTorch locally with sample input. Demos are optimized for code clarity rather than latency, and run exclusively in PyTorch. Optimal model latency can be achieved with model export via Qualcomm® AI Hub.

python -m qai_hub_models.models.yolov7.demo

For additional details on how to use the demo CLI, use the --help option:

python -m qai_hub_models.models.yolov7.demo --help
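The demo commands above follow the pattern qai_hub_models.models.<model id>.demo. If you want to launch demos from a script, a small sketch could build the command line as follows. The helper name is hypothetical; it only assembles arguments and does not check that the model is installed.

```python
import sys
from typing import List


def demo_command(model_id: str, *extra_args: str) -> List[str]:
    """Build the argv for a model's CLI demo. Hypothetical helper."""
    module = f"qai_hub_models.models.{model_id}.demo"
    # sys.executable keeps the call inside the current virtual environment.
    return [sys.executable, "-m", module, *extra_args]


print(demo_command("yolov7", "--help")[1:])
# To actually run it (requires the package to be installed):
#   import subprocess
#   subprocess.run(demo_command("yolov7", "--help"), check=True)
```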

See the model directory below to explore all other models.


Note that most ML use cases require some pre- and post-processing that is not part of the model itself. A Python reference implementation of this is provided for each model in app.py. Apps load and pre-process model input, run model inference, and post-process model output before returning it to you.

Here is an example of how the PyTorch CLI works for YOLOv7:

from PIL import Image
from qai_hub_models.models.yolov7 import Model as YOLOv7Model
from qai_hub_models.models.yolov7 import App as YOLOv7App
from qai_hub_models.utils.asset_loaders import load_image
from qai_hub_models.models.yolov7.demo import IMAGE_ADDRESS

# Load pre-trained model
torch_model = YOLOv7Model.from_pretrained()

# Load a simple PyTorch based application
app = YOLOv7App(torch_model)
image = load_image(IMAGE_ADDRESS, "yolov7")

# Perform prediction on a sample image
pred_image = app.predict(image)[0]
Image.fromarray(pred_image).show()
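To make the app pattern concrete without any dependencies, here is a toy sketch of the three stages described above (pre-process, inference, post-process). ToyApp and toy_model are hypothetical stand-ins and do not mirror the real classes in app.py.

```python
from typing import Callable, List


class ToyApp:
    """Hypothetical stand-in for an app: pre-process, infer, post-process."""

    def __init__(self, model: Callable[[List[float]], List[float]]):
        self.model = model

    def preprocess(self, pixels: List[int]) -> List[float]:
        # Scale 8-bit pixel values to the range [0, 1].
        return [p / 255.0 for p in pixels]

    def postprocess(self, scores: List[float]) -> int:
        # Return the index of the highest score.
        return max(range(len(scores)), key=scores.__getitem__)

    def predict(self, pixels: List[int]) -> int:
        return self.postprocess(self.model(self.preprocess(pixels)))


def toy_model(x: List[float]) -> List[float]:
    # Stands in for a neural network; it simply echoes its input.
    return list(x)


app = ToyApp(toy_model)
print(app.predict([0, 255, 128]))  # 1
```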

CLI demo to run on hosted Qualcomm® devices

Some models contain CLI demos that run the model on a hosted Qualcomm® device using Qualcomm® AI Hub.

To run the model on a hosted device, sign up for access to Qualcomm® AI Hub. Sign in to Qualcomm® AI Hub with your Qualcomm® ID. Once signed in, navigate to Account -> Settings -> API Token.

With this API token, you can configure your client to run models on cloud-hosted devices.

qai-hub configure --api_token API_TOKEN

Navigate to docs for more information.

The on-device CLI demo performs the following:

  • Exports the model for on-device execution.
  • Profiles the model on-device on a cloud-hosted Qualcomm® device.
  • Runs the model on-device on a cloud-hosted Qualcomm® device and compares accuracy between a local CPU-based PyTorch run and the on-device run.
  • Downloads models (and other required assets) that can be deployed on-device in an Android application.

python -m qai_hub_models.models.yolov7.export
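The accuracy comparison step can be pictured as scoring the on-device output against the local PyTorch output. The text does not say which metric the export script uses; the sketch below assumes peak signal-to-noise ratio (PSNR) on flattened outputs purely as an illustration, and the function name is hypothetical.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], candidate: Sequence[float]) -> float:
    """PSNR in dB of candidate against reference (flattened values)."""
    if len(reference) != len(candidate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    if mse == 0:
        return math.inf  # identical outputs
    peak = max(abs(r) for r in reference)
    if peak == 0:
        return -math.inf  # reference is all zeros but outputs differ
    return 10 * math.log10(peak ** 2 / mse)


local_output = [0.0, 1.0]      # e.g. from the local PyTorch run
device_output = [0.0, 0.9]     # e.g. from the on-device run
print(round(psnr(local_output, device_output), 2))
```

A higher value means the two runs agree more closely; identical outputs give infinity.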

Many models have initialization parameters that allow loading custom weights and checkpoints. See --help for more details:

python -m qai_hub_models.models.yolov7.export --help

How does this export script work?

As described above, the script compiles, optimizes, and runs the model on a cloud-hosted Qualcomm® device. The demo uses Qualcomm® AI Hub's Python APIs.

Qualcomm® AI Hub explained

Here is a simplified example of code that can be used to run the entire model on a cloud-hosted device:

import torch
import qai_hub as hub
from qai_hub_models.models.yolov7 import Model as YOLOv7Model

# Load YOLOv7 in PyTorch
torch_model = YOLOv7Model.from_pretrained()
torch_model.eval()

# Trace the PyTorch model using one data point from each of the provided
# sample inputs, converted to a torch tensor.
example_input = [torch.tensor(data[0]) for name, data in torch_model.sample_inputs().items()]
pt_model = torch.jit.trace(torch_model, example_input)

# Select a device
device = hub.Device("Samsung Galaxy S23")

# Compile model for a specific device
compile_job = hub.submit_compile_job(
    model=pt_model,
    device=device,
    input_specs=torch_model.get_input_spec(),
)

# Get target model to run on a cloud hosted device
target_model = compile_job.get_target_model()

# Profile the previously compiled model on a cloud hosted device
profile_job = hub.submit_profile_job(
    model=target_model,
    device=device,
)

# Perform on-device inference on a cloud hosted device
input_data = torch_model.sample_inputs()
inference_job = hub.submit_inference_job(
    model=target_model,
    device=device,
    inputs=input_data,
)

# Returns the output as dict{name: numpy}
on_device_output = inference_job.download_output_data()
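The tracing step in the listing above takes data[0] from each entry of sample_inputs(). Assuming that method returns a mapping from input name to a list of samples (inferred from the code above, not confirmed here), the selection logic can be sketched with plain Python lists. The helper name and the toy data are hypothetical.

```python
from typing import Dict, List, Sequence


def first_sample_per_input(sample_inputs: Dict[str, Sequence]) -> List:
    """Pick the first sample for each named input, in dictionary order."""
    return [samples[0] for samples in sample_inputs.values()]


# Toy data: two samples for "image", one sample for "mask".
toy_inputs = {"image": [[0.1, 0.2], [0.3, 0.4]], "mask": [[1, 0]]}
print(first_sample_per_input(toy_inputs))  # [[0.1, 0.2], [1, 0]]
```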

Working with source code

You can clone the repository using:

git clone https://github.com/quic/ai-hub-models
cd ai-hub-models
pip install -e .

Before using a model that needs additional dependencies, install them from the repository root:

pip install -e ".[yolov7]"

All models have accuracy and end-to-end tests when applicable. These tests are designed to be run locally and verify that the PyTorch code produces correct results. To run the tests for a model:

python -m pytest --pyargs qai_hub_models.models.yolov7.test

For any issues, please contact us at [email protected].


LICENSE

Qualcomm® AI Hub Models is licensed under BSD-3. See the LICENSE file.


Model Directory

Computer Vision

Model | README | Torch App | Device Export | CLI Demo
Image Classification
ConvNext-Tiny qai_hub_models.models.convnext_tiny ✔️ ✔️ ✔️
ConvNext-Tiny-w8a16-Quantized qai_hub_models.models.convnext_tiny_w8a16_quantized ✔️ ✔️ ✔️
ConvNext-Tiny-w8a8-Quantized qai_hub_models.models.convnext_tiny_w8a8_quantized ✔️ ✔️ ✔️
DenseNet-121 qai_hub_models.models.densenet121 ✔️ ✔️ ✔️
DenseNet-121-Quantized qai_hub_models.models.densenet121_quantized ✔️ ✔️ ✔️
EfficientNet-B0 qai_hub_models.models.efficientnet_b0 ✔️ ✔️ ✔️
GoogLeNet qai_hub_models.models.googlenet ✔️ ✔️ ✔️
GoogLeNetQuantized qai_hub_models.models.googlenet_quantized ✔️ ✔️ ✔️
Inception-v3 qai_hub_models.models.inception_v3 ✔️ ✔️ ✔️
Inception-v3-Quantized qai_hub_models.models.inception_v3_quantized ✔️ ✔️ ✔️
MNASNet05 qai_hub_models.models.mnasnet05 ✔️ ✔️ ✔️
MobileNet-v2 qai_hub_models.models.mobilenet_v2 ✔️ ✔️ ✔️
MobileNet-v2-Quantized qai_hub_models.models.mobilenet_v2_quantized ✔️ ✔️ ✔️
MobileNet-v3-Large qai_hub_models.models.mobilenet_v3_large ✔️ ✔️ ✔️
MobileNet-v3-Large-Quantized qai_hub_models.models.mobilenet_v3_large_quantized ✔️ ✔️ ✔️
MobileNet-v3-Small qai_hub_models.models.mobilenet_v3_small ✔️ ✔️ ✔️
RegNet qai_hub_models.models.regnet ✔️ ✔️ ✔️
RegNetQuantized qai_hub_models.models.regnet_quantized ✔️ ✔️ ✔️
ResNeXt101 qai_hub_models.models.resnext101 ✔️ ✔️ ✔️
ResNeXt101Quantized qai_hub_models.models.resnext101_quantized ✔️ ✔️ ✔️
ResNeXt50 qai_hub_models.models.resnext50 ✔️ ✔️ ✔️
ResNeXt50Quantized qai_hub_models.models.resnext50_quantized ✔️ ✔️ ✔️
ResNet101 qai_hub_models.models.resnet101 ✔️ ✔️ ✔️
ResNet101Quantized qai_hub_models.models.resnet101_quantized ✔️ ✔️ ✔️
ResNet18 qai_hub_models.models.resnet18 ✔️ ✔️ ✔️
ResNet18Quantized qai_hub_models.models.resnet18_quantized ✔️ ✔️ ✔️
ResNet50 qai_hub_models.models.resnet50 ✔️ ✔️ ✔️
ResNet50Quantized qai_hub_models.models.resnet50_quantized ✔️ ✔️ ✔️
Shufflenet-v2 qai_hub_models.models.shufflenet_v2 ✔️ ✔️ ✔️
Shufflenet-v2Quantized qai_hub_models.models.shufflenet_v2_quantized ✔️ ✔️ ✔️
SqueezeNet-1_1 qai_hub_models.models.squeezenet1_1 ✔️ ✔️ ✔️
SqueezeNet-1_1Quantized qai_hub_models.models.squeezenet1_1_quantized ✔️ ✔️ ✔️
Swin-Base qai_hub_models.models.swin_base ✔️ ✔️ ✔️
Swin-Small qai_hub_models.models.swin_small ✔️ ✔️ ✔️
Swin-Tiny qai_hub_models.models.swin_tiny ✔️ ✔️ ✔️
VIT qai_hub_models.models.vit ✔️ ✔️ ✔️
VITQuantized qai_hub_models.models.vit_quantized ✔️ ✔️ ✔️
WideResNet50 qai_hub_models.models.wideresnet50 ✔️ ✔️ ✔️
WideResNet50-Quantized qai_hub_models.models.wideresnet50_quantized ✔️ ✔️ ✔️
Image Editing
AOT-GAN qai_hub_models.models.aotgan ✔️ ✔️ ✔️
LaMa-Dilated qai_hub_models.models.lama_dilated ✔️ ✔️ ✔️
Super Resolution
ESRGAN qai_hub_models.models.esrgan ✔️ ✔️ ✔️
QuickSRNetLarge qai_hub_models.models.quicksrnetlarge ✔️ ✔️ ✔️
QuickSRNetLarge-Quantized qai_hub_models.models.quicksrnetlarge_quantized ✔️ ✔️ ✔️
QuickSRNetMedium qai_hub_models.models.quicksrnetmedium ✔️ ✔️ ✔️
QuickSRNetMedium-Quantized qai_hub_models.models.quicksrnetmedium_quantized ✔️ ✔️ ✔️
QuickSRNetSmall qai_hub_models.models.quicksrnetsmall ✔️ ✔️ ✔️
QuickSRNetSmall-Quantized qai_hub_models.models.quicksrnetsmall_quantized ✔️ ✔️ ✔️
Real-ESRGAN-General-x4v3 qai_hub_models.models.real_esrgan_general_x4v3 ✔️ ✔️ ✔️
Real-ESRGAN-x4plus qai_hub_models.models.real_esrgan_x4plus ✔️ ✔️ ✔️
SESR-M5 qai_hub_models.models.sesr_m5 ✔️ ✔️ ✔️
SESR-M5-Quantized qai_hub_models.models.sesr_m5_quantized ✔️ ✔️ ✔️
XLSR qai_hub_models.models.xlsr ✔️ ✔️ ✔️
XLSR-Quantized qai_hub_models.models.xlsr_quantized ✔️ ✔️ ✔️
Semantic Segmentation
DDRNet23-Slim qai_hub_models.models.ddrnet23_slim ✔️ ✔️ ✔️
DeepLabV3-Plus-MobileNet qai_hub_models.models.deeplabv3_plus_mobilenet ✔️ ✔️ ✔️
DeepLabV3-Plus-MobileNet-Quantized qai_hub_models.models.deeplabv3_plus_mobilenet_quantized ✔️ ✔️ ✔️
DeepLabV3-ResNet50 qai_hub_models.models.deeplabv3_resnet50 ✔️ ✔️ ✔️
FCN-ResNet50 qai_hub_models.models.fcn_resnet50 ✔️ ✔️ ✔️
FCN-ResNet50-Quantized qai_hub_models.models.fcn_resnet50_quantized ✔️ ✔️ ✔️
FFNet-122NS-LowRes qai_hub_models.models.ffnet_122ns_lowres ✔️ ✔️ ✔️
FFNet-40S qai_hub_models.models.ffnet_40s ✔️ ✔️ ✔️
FFNet-40S-Quantized qai_hub_models.models.ffnet_40s_quantized ✔️ ✔️ ✔️
FFNet-54S qai_hub_models.models.ffnet_54s ✔️ ✔️ ✔️
FFNet-54S-Quantized qai_hub_models.models.ffnet_54s_quantized ✔️ ✔️ ✔️
FFNet-78S qai_hub_models.models.ffnet_78s ✔️ ✔️ ✔️
FFNet-78S-LowRes qai_hub_models.models.ffnet_78s_lowres ✔️ ✔️ ✔️
FFNet-78S-Quantized qai_hub_models.models.ffnet_78s_quantized ✔️ ✔️ ✔️
FastSam-S qai_hub_models.models.fastsam_s ✔️ ✔️ ✔️
FastSam-X qai_hub_models.models.fastsam_x ✔️ ✔️ ✔️
MediaPipe-Selfie-Segmentation qai_hub_models.models.mediapipe_selfie ✔️ ✔️ ✔️
SINet qai_hub_models.models.sinet ✔️ ✔️ ✔️
Segment-Anything-Model qai_hub_models.models.sam ✔️ ✔️ ✔️
Unet-Segmentation qai_hub_models.models.unet_segmentation ✔️ ✔️ ✔️
YOLOv8-Segmentation qai_hub_models.models.yolov8_seg ✔️ ✔️ ✔️
Object Detection
DETR-ResNet101 qai_hub_models.models.detr_resnet101 ✔️ ✔️ ✔️
DETR-ResNet101-DC5 qai_hub_models.models.detr_resnet101_dc5 ✔️ ✔️ ✔️
DETR-ResNet50 qai_hub_models.models.detr_resnet50 ✔️ ✔️ ✔️
DETR-ResNet50-DC5 qai_hub_models.models.detr_resnet50_dc5 ✔️ ✔️ ✔️
FaceAttribNet qai_hub_models.models.face_attrib_net ✔️ ✔️ ✔️
FootTrackNet_Quantized qai_hub_models.models.foot_track_net_quantized ✔️ ✔️ ✔️
Lightweight-Face-Detection qai_hub_models.models.face_det_lite ✔️ ✔️ ✔️
MediaPipe-Face-Detection qai_hub_models.models.mediapipe_face ✔️ ✔️ ✔️
MediaPipe-Face-Detection-Quantized qai_hub_models.models.mediapipe_face_quantized ✔️ ✔️ ✔️
MediaPipe-Hand-Detection qai_hub_models.models.mediapipe_hand ✔️ ✔️ ✔️
PPE-Detection qai_hub_models.models.gear_guard_net ✔️ ✔️ ✔️
PPE-Detection-Quantized qai_hub_models.models.gear_guard_net_quantized ✔️ ✔️ ✔️
Person-Foot-Detection qai_hub_models.models.foot_track_net ✔️ ✔️ ✔️
YOLOv11-Detection qai_hub_models.models.yolov11_det ✔️ ✔️ ✔️
YOLOv8-Detection qai_hub_models.models.yolov8_det ✔️ ✔️ ✔️
YOLOv8-Detection-Quantized qai_hub_models.models.yolov8_det_quantized ✔️ ✔️ ✔️
Yolo-NAS qai_hub_models.models.yolonas ✔️ ✔️ ✔️
Yolo-NAS-Quantized qai_hub_models.models.yolonas_quantized ✔️ ✔️ ✔️
Yolo-v6 qai_hub_models.models.yolov6 ✔️ ✔️ ✔️
Yolo-v7 qai_hub_models.models.yolov7 ✔️ ✔️ ✔️
Yolo-v7-Quantized qai_hub_models.models.yolov7_quantized ✔️ ✔️ ✔️
Pose Estimation
Facial-Landmark-Detection qai_hub_models.models.facemap_3dmm ✔️ ✔️ ✔️
HRNetPose qai_hub_models.models.hrnet_pose ✔️ ✔️ ✔️
HRNetPoseQuantized qai_hub_models.models.hrnet_pose_quantized ✔️ ✔️ ✔️
LiteHRNet qai_hub_models.models.litehrnet ✔️ ✔️ ✔️
MediaPipe-Pose-Estimation qai_hub_models.models.mediapipe_pose ✔️ ✔️ ✔️
OpenPose qai_hub_models.models.openpose ✔️ ✔️ ✔️
Posenet-Mobilenet qai_hub_models.models.posenet_mobilenet ✔️ ✔️ ✔️
Posenet-Mobilenet-Quantized qai_hub_models.models.posenet_mobilenet_quantized ✔️ ✔️ ✔️
Depth Estimation
Midas-V2 qai_hub_models.models.midas ✔️ ✔️ ✔️
Midas-V2-Quantized qai_hub_models.models.midas_quantized ✔️ ✔️ ✔️

Audio

Model | README | Torch App | Device Export | CLI Demo
Speech Recognition
HuggingFace-WavLM-Base-Plus qai_hub_models.models.huggingface_wavlm_base_plus ✔️ ✔️ ✔️
Whisper-Base-En qai_hub_models.models.whisper_base_en ✔️ ✔️ ✔️
Whisper-Small-En qai_hub_models.models.whisper_small_en ✔️ ✔️ ✔️
Whisper-Tiny-En qai_hub_models.models.whisper_tiny_en ✔️ ✔️ ✔️

Multimodal

Model | README | Torch App | Device Export | CLI Demo
TrOCR qai_hub_models.models.trocr ✔️ ✔️ ✔️
OpenAI-Clip qai_hub_models.models.openai_clip ✔️ ✔️ ✔️

Generative AI

Model | README | Torch App | Device Export | CLI Demo
Image Generation
ControlNet qai_hub_models.models.controlnet_quantized ✔️ ✔️ ✔️
Riffusion qai_hub_models.models.riffusion_quantized ✔️ ✔️ ✔️
Stable-Diffusion-v1.5 qai_hub_models.models.stable_diffusion_v1_5_quantized ✔️ ✔️ ✔️
Stable-Diffusion-v2.1 qai_hub_models.models.stable_diffusion_v2_1_quantized ✔️ ✔️ ✔️
Text Generation
Baichuan2-7B qai_hub_models.models.baichuan2_7b_quantized ✔️ ✔️ ✔️
IBM-Granite-3B-Code-Instruct qai_hub_models.models.ibm_granite_3b_code_instruct ✔️ ✔️ ✔️
IndusQ-1.1B qai_hub_models.models.indus_1b_quantized ✔️ ✔️ ✔️
JAIS-6p7b-Chat qai_hub_models.models.jais_6p7b_chat_quantized ✔️ ✔️ ✔️
Llama-v2-7B-Chat qai_hub_models.models.llama_v2_7b_chat_quantized ✔️ ✔️ ✔️
Llama-v3-8B-Chat qai_hub_models.models.llama_v3_8b_chat_quantized ✔️ ✔️ ✔️
Llama-v3.1-8B-Chat qai_hub_models.models.llama_v3_1_8b_chat_quantized ✔️ ✔️ ✔️
Llama-v3.2-3B-Chat qai_hub_models.models.llama_v3_2_3b_chat_quantized ✔️ ✔️ ✔️
Mistral-3B qai_hub_models.models.mistral_3b_quantized ✔️ ✔️ ✔️
Mistral-7B-Instruct-v0.3 qai_hub_models.models.mistral_7b_instruct_v0_3_quantized ✔️ ✔️ ✔️
PLaMo-1B qai_hub_models.models.plamo_1b_quantized ✔️ ✔️ ✔️
Qwen2-7B-Instruct qai_hub_models.models.qwen2_7b_instruct_quantized ✔️ ✔️ ✔️
