Inferless

qwen3-embedding-0.6b Public template
A 600M-parameter embedding model covering 100 languages that turns inputs of up to 32k tokens into instruction-aware vectors. <metadata> gpu: A10 | collections: ["HF_Transformers"] </metadata>

Python 0 0 0 0 Updated Jun 23, 2025
devstral-small Public template
An agentic LLM for software engineering tasks that excels at using tools to explore codebases, editing multiple files, and powering software engineering agents. <metadata> gpu: A100 | collections: ["HF_Transformers"] </metadata>

Python 0 0 0 0 Updated Jun 23, 2025
deepseek-r1-qwen3-8b Public template
A distilled 8B-parameter reasoning model that leverages deep chain-of-thought from DeepSeek R1-0528, delivering SOTA open-source performance. <metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>

Python 0 0 0 0 Updated Jun 23, 2025
nanonets-ocr-s Public template
Nanonets-OCR-s turns images or PDFs into structured Markdown, capturing tables, LaTeX, captions, and tags, for fast, human-readable OCR. <metadata> gpu: A10 | collections: ["HF_Transformers"] </metadata>

Python 0 0 0 0 Updated Jun 23, 2025
Open-NotebookLM Public

Python 0 0 0 0 Updated Jun 11, 2025
yolo11m-detect Public

Python 0 1 0 0 Updated May 20, 2025
kokoro Public template
A lightweight 82M-parameter text-to-speech (TTS) model that delivers high-quality voice synthesis. <metadata> gpu: T4 | collections: ["SSE Events"] </metadata>

Python 1 1 0 0 Updated May 19, 2025
qwen3-14b Public template
A 14B model that takes a hybrid approach to problem-solving with two distinct modes: "thinking mode," which enables step-by-step reasoning, and "non-thinking mode," designed for rapid, general-purpose responses. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>

Python 0 1 0 0 Updated May 15, 2025
qwen2.5-omni-7b Public template
An advanced end-to-end multimodal model that processes text, image, audio, and video inputs and generates real-time text and natural speech responses. <metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>

Python 0 0 0 0 Updated May 12, 2025
qwen3-8b Public template
Qwen3-8B is a language model that supports seamless switching between “thinking” mode (for advanced math, coding, and logical inference) and “non-thinking” mode (for fast, natural conversation). <metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>

Python 0 1 0 0 Updated May 12, 2025