| Model | Intro | Tags | Action |
|---|---|---|---|
| chatterbox-tts | Chatterbox TTS is a state-of-the-art, open-source text-to-speech system developed by Resemble AI. It supports zero-shot voice cloning, emotion control, and high-quality audio generation, all MIT-licensed and fully production-ready. View on Hugging Face | | |
| cogito-v1-preview-llama-3B | Cogito-v1-preview-llama-3B is a high-performance "hybrid reasoning" model released by Deep Cogito and built on the Llama 3.2 3B architecture. View on Hugging Face | | |
| cogito-v1-preview-llama-8B | Cogito-v1-preview-llama-8B is an 8-billion parameter hybrid reasoning model released in April 2025 by San Francisco-based startup Deep Cogito. View on Hugging Face | | |
| cogito-v1-preview-qwen-14B | Cogito-v1-preview-qwen-14B is a hybrid reasoning model developed by Deep Cogito, a San Francisco-based startup that emerged from stealth in April 2025. It is built on the Qwen 2.5 architecture but heavily modified to include self-reflection and "deep thinking" capabilities similar to OpenAI’s o1 or DeepSeek-R1. View on Hugging Face | | |
| cogito-v1-preview-qwen-32B | Cogito-v1-preview-qwen-32B (often referred to as Cogito v1 Preview) is a high-performance hybrid reasoning model developed by Deep Cogito. View on Hugging Face | | |
| context-1 | Context-1 is a 20B parameter agentic search model trained to retrieve supporting documents for complex, multi-hop queries. It is designed to be used as a retrieval subagent alongside a frontier reasoning model. View on Hugging Face | | |
| Cydonia-24B-v4.3 | A general-purpose model optimized for strong reasoning, coding, and chat performance. View on Hugging Face | | |
| DeepCoder-14B-Preview | DeepCoder-14B-Preview is a high-performance, open-source code reasoning model. View on Hugging Face | | |
| DeepCoder-1.5B-Preview | DeepCoder-1.5B-Preview is a lightweight yet powerful code-reasoning model released in April 2025 as part of the DeepCoder series by the Agentica team and Together AI. View on Hugging Face | | |
| DeepSeek-Coder-V2-Lite-Instruct | A lightweight coding model designed for efficient code generation and reasoning. View on Hugging Face | | |
| DeepSeek-OCR | DeepSeek-OCR (released in late 2025, with v2 arriving in January 2026) is a specialized multimodal model designed to solve the "token explosion" problem in traditional Document AI. While standard Vision-Language Models (VLMs) often convert a single page into thousands of tokens, DeepSeek-OCR treats OCR as a multimodal compression task, achieving high accuracy with a fraction of the computational cost. View on Hugging Face | | |
| DeepSeek-Prover-V2-7B | DeepSeek-Prover-V2-7B is a specialized, open-source language model released in 2025 that focuses on formal theorem proving in Lean 4. View on Hugging Face | | |
| DeepSeek-R1-Distill-Qwen-14B | DeepSeek-R1-Distill-Qwen-14B is a premier reasoning model from the DeepSeek-R1 family, specifically engineered to deliver frontier-level logic within a compact 14.7-billion parameter frame. View on Hugging Face | | |
| Devstral-Small-2-24B-Instruct-2512 | Devstral is an agentic LLM for software engineering tasks. Devstral Small 2 excels at using tools to explore codebases, editing multiple files, and powering software engineering agents. The model achieves remarkable performance on SWE-bench. View on Hugging Face | | |
| Devstral-Small-2505 | Devstral-Small-2505 is a specialized, agentic large language model released in May 2025 through a collaboration between Mistral AI and All Hands AI. View on Hugging Face | | |
| ERNIE-4.5-21B-A3B-PT | ERNIE-4.5-21B-A3B-PT is a high-efficiency Large Language Model (LLM) developed by Baidu, released as part of their ERNIE 4.5 family in mid-2025. It is specifically designed to balance high-level reasoning capabilities with low computational costs. View on Hugging Face | | |
| FLUX.1-dev | FLUX.1 [dev] is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions. View on Hugging Face | | |
| FLUX.2-klein-9B | FLUX.2-klein-9B is a high-performance, mid-sized text-to-image model that belongs to the next generation of the FLUX family (developed by Black Forest Labs). View on Hugging Face | | |
| gemma-3-12b-it | Gemma-3-12B-IT is a mid-sized, instruction-tuned multimodal model from Google’s Gemma 3 family. View on Hugging Face | | |
| gemma-3-1b-it | A compact instruction-tuned model designed for fast and efficient general-purpose tasks. View on Hugging Face | | |
| gemma-3-27b-it | An instruction-tuned model designed for strong reasoning, coding, and chat performance. View on Hugging Face | | |
| gemma-3-4b-it | Gemma-3-4B-IT (instruction-tuned) is a mid-sized, multimodal model from Google’s Gemma 3 open-weights family, released in March 2025. It represents a significant architectural shift from the Gemma 2 series, moving from a text-only focus to a native vision-language (multimodal) design. View on Hugging Face | | |
| gemma-3n-E4B-it | Gemma-3n-E4B-it is an instruction-tuned model from Google's Gemma 3n family, released in 2025. The family targets efficient multimodal inference on resource-constrained devices. View on Hugging Face | | |
| gemma-4-31B-it | gemma-4-31B-it View on Hugging Face | | |
| gemma-4-31B-it-uncensored-heretic | A 31B Gemma-4 model modified for reduced safety restrictions and more open responses. View on Hugging Face | | |
| gemma-4-31B-it-uncensored-heretic-2GPU | gemma-4-31B-it-uncensored-heretic-2GPU View on Hugging Face | | |
| gemma-4-E2B-it | An efficient Gemma 4 model optimized for strong performance with lower resource usage. View on Hugging Face | | |
| gemma-4-E2B-it_sound | gemma-4-E2B-it with sound support. View on Hugging Face | | |
| gemma-4-E4B-it | gemma-4-E4B-it View on Hugging Face | | |
| Glistening-Gem-31B-v1.0 | Glistening-Gem-31B-v1.0 View on Hugging Face | | |
| GLM-OCR | GLM-OCR is a compact, high-performance multimodal model released in February 2026 by Zhipu AI (Z.ai). It is specifically designed to bridge the gap between traditional OCR (character recognition) and full "Document Understanding" (layout, tables, and reasoning). View on Hugging Face | | |
| GLM-Z1-32B-0414 | GLM-Z1-32B-0414 is a high-performance, open-source reasoning model with 32 billion parameters, released by the zai-org group. View on Hugging Face | | |
| gpt-oss-20b | gpt-oss-20b is an open-weight reasoning model released by OpenAI in August 2025 under the Apache 2.0 license. It is the smaller of the two gpt-oss models and uses a mixture-of-experts architecture with roughly 21B total parameters. View on Hugging Face | | |
| gpt-oss-safeguard-20b | GPT-OSS-Safeguard-20B is an open-weight, safety-focused reasoning model. View on Hugging Face | | |
| Holo3-35B-A3B | Holo3 is our latest generation of large-scale Vision-Language Models (VLMs) specifically optimized for GUI Agents. View on Hugging Face | | |
| Huihui-Qwen3.5-27B-Claude-4.6-Opus-abliterated | Model tuned for Claude-style reasoning with reduced safety restrictions. View on Hugging Face | | |
| Huihui-Qwen3.5-35B-A3B-abliterated | This is an uncensored version of Qwen/Qwen3.5-35B-A3B created with abliteration. View on Hugging Face | | |
| Huihui-Qwen3.6-35B-A3B-abliterated | The Huihui-Qwen3.6-35B-A3B-abliterated (released April 19, 2026) is a specialized variant of Alibaba's latest Qwen3.6 MoE model. View on Hugging Face | | |
| Inferx-bundle-Qwen3.6-35B-A3B-FP8-Qwen3-Embedding-0.6B-Qwen3-Reranker-0.6B | This is a bundle of Qwen3.6-35B-A3B-FP8, Qwen3-Embedding-0.6B, and Qwen3-Reranker-0.6B. View on Hugging Face | | |
| InnerVerse-GLM47Flash-v1 | A fast, reasoning-focused model optimized for efficient inference and strong instruction following. View on Hugging Face | | |
| IntelliAsk-Qwen3-32B-450-Merged | IntelliAsk-Qwen3-32B-450-Merged View on Hugging Face | | |
| InternVL3_5-38B-FP8-Dynamic | InternVL3.5-38B-FP8-Dynamic is a state-of-the-art multimodal large language model (MLLM) optimized for high-efficiency inference. View on Hugging Face | | |
| InternVL3_5-38B-Instruct | InternVL3.5-38B-Instruct is an advanced multimodal large language model (MLLM) released in late 2025 by Shanghai AI Laboratory. View on Hugging Face | | |
| InternVL3_5-8B | InternVL3.5-8B-Instruct is a state-of-the-art multimodal large language model (MLLM) from OpenGVLab (Shanghai AI Lab). View on Hugging Face | | |
| Kimi-Linear-48B-A3B-Instruct-AWQ-8bit | Kimi-Linear-48B-A3B-Instruct-AWQ-8bit is a high-efficiency, long-context model released by Moonshot AI in late 2025. It represents a significant departure from standard Transformer architectures, specifically designed to eliminate the "quadratic bottleneck" that usually slows down long-context processing. View on Hugging Face | | |
| Kimi-VL-A3B-Thinking-2506 | Kimi-VL-A3B-Thinking-2506 is a state-of-the-art vision-language model (VLM) released by Moonshot AI in mid-2025. View on Hugging Face | | |
| L3.3-70B-Loki-V2.0 | Llama-based model tuned for immersive roleplay, storytelling, and strong narrative consistency. View on Hugging Face | | |
| Llama-3.1-8B-Instruct | Llama-3.1-8B-Instruct is the lightweight, instruction-tuned variant of Meta’s Llama 3.1 family. View on Hugging Face | | |
| Llama-3.3-70B-Instruct-AWQ | Llama-3.3-70B-Instruct-AWQ is the 4-bit quantized version of Meta's December 2024 flagship "efficiency" model. View on Hugging Face | | |
| Magistral-Small-2509-AWQ-4bit | Magistral-Small-2509-AWQ-4bit is the 4-bit quantized version of Mistral AI's Magistral Small 1.2. View on Hugging Face | | |
| medgemma-27b-text-it-FP8-Dynamic | MedGemma-27B-Text-IT-FP8-Dynamic is an FP8 Dynamic-quantized derivative of Google’s MedGemma-27B-Text-IT model, optimized for high-throughput inference while preserving strong performance on medical and biomedical instruction-tuned text-only tasks. View on Hugging Face | | |
| Midnight-Miqu-70B-v1.5-FP8-Dynamic | Midnight-Miqu-70B-v1.5 is a high-performance 70B parameter model specifically engineered for creative writing, long-form roleplay, and complex character interactions. It is a "DARE Linear" merge of Midnight-Miqu-v1.0 and Tess-v1.6, designed to retain the legendary prose quality of the original "Miqu" (the leaked Mistral-70B weights) while improving instruction following and world-state tracking. View on Hugging Face | | |
| Ministral-3-14B-Reasoning-2512 | The Ministral-3-14B-Reasoning-2512 (often referred to as part of the "Les Ministraux" family) is one of Mistral AI's most sophisticated "mid-weight" models. It is specifically engineered to bridge the gap between low-latency edge computing and the deep reasoning capabilities typically reserved for massive 70B+ parameter models. View on Hugging Face | | |
| Ministral-3-8B-Instruct-2512-BF16 | Ministral-3-8B-Instruct-2512-BF16 (released in December 2025/January 2026) is the newest "edge-sovereign" multimodal model from Mistral AI. View on Hugging Face | | |
| Mistral-Small-24B-Instruct-2501 | Mistral-Small-24B-Instruct-2501 (often referred to as Mistral Small 3) is a high-efficiency language model released in late January 2025. View on Hugging Face | | |
| Mixtral-8x7B-Instruct-v0.1 | Mixtral-8x7B-Instruct-v0.1 is a high-performance Sparse Mixture-of-Experts (SMoE) model released by Mistral AI. View on Hugging Face | | |
| Molmo2-4B | Molmo2-4B is a highly efficient, small-scale Vision-Language Model (VLM). View on Hugging Face | | |
| Molmo2-8B | A multimodal model optimized for image and video understanding with strong grounding and reasoning capabilities. View on Hugging Face | | |
| Moonlight-16B-A3B | Moonlight-16B-A3B is a high-efficiency Mixture-of-Experts (MoE) language model released in February 2025 by Moonshot AI (the creators of Kimi). It was designed to push the "Pareto frontier": delivering the reasoning power of much larger models while maintaining the inference speed and VRAM footprint of a small model. View on Hugging Face | | |
| NextCoder-14B | NextCoder-14B is a specialized large language model designed for code editing and modification. View on Hugging Face | | |
| NextCoder-7B | NextCoder-7B is a specialized, open-weights large language model (LLM) developed by Microsoft Foundry. View on Hugging Face | | |
| notux-8x7b-v1-AWQ | Notux-8x7b-v1-AWQ is a high-performance, 4-bit quantized version of the Notux-8x7b-v1 model. It combines a state-of-the-art Mixture-of-Experts (MoE) architecture with Activation-aware Weight Quantization (AWQ) for efficient deployment on NVIDIA GPUs. View on Hugging Face | | |
| NVIDIA-Nemotron-3-Nano-30B-A3B-FP8 | Optimized with FP8 for fast, efficient reasoning and chat workloads. View on Hugging Face | | |
| NVIDIA-Nemotron-3-Nano-30B-A3B-FP8-temp | test View on Hugging Face | | |
| NVIDIA-Nemotron-Nano-12B-v2-VL-FP8 | NVIDIA-Nemotron-Nano-12B-v2-VL-FP8 is a cutting-edge multimodal model released by NVIDIA in late 2025. It is specifically engineered for high-throughput, low-latency applications like document intelligence and long-form video understanding. View on Hugging Face | | |
| NVIDIA-Nemotron-Nano-9B-v2 | NVIDIA-Nemotron-Nano-9B-v2 is a 9-billion-parameter hybrid language model designed for high-efficiency reasoning and agentic workflows. View on Hugging Face | | |
| Olmo-3.1-32B-Think-AWQ-4bit | OLMo-3.1-32B-Think-AWQ-4bit is a high-efficiency, reasoning-optimized version of the OLMo 3.1 family. View on Hugging Face | | |
| OpenEuroLLM-Czech-vLLM-GGUF | A Czech-language model optimized for local inference using the GGUF format. View on Hugging Face | | |
| OpenThinker3-7B | OpenThinker3-7B is a state-of-the-art open-source reasoning model. View on Hugging Face | | |
| OpenThinker-Agent-v1 | OpenThinker-Agent-v1 is a state-of-the-art, 8-billion parameter open-source model specifically engineered for terminal automation and software engineering tasks. View on Hugging Face | | |
| Phi-3.5-vision-instruct | Phi-3.5-vision-instruct is a lightweight, multimodal small language model (SLM) released by Microsoft. View on Hugging Face | | |
| Phi-4-mini-reasoning | Phi-4-mini-reasoning is a compact, open-weight reasoning model from Microsoft, designed to bring high-level logical and mathematical "thinking" to small-scale hardware. View on Hugging Face | | |
| Qianfan-OCR | Qianfan-OCR is a 4B-parameter end-to-end document intelligence model developed by the Baidu Qianfan Team. It unifies document parsing, layout analysis, and document understanding within a single vision-language architecture. View on Hugging Face | | |
| Qwen2.5-14B-Instruct | A 14B instruction-tuned model for strong reasoning, coding, and chat tasks. View on Hugging Face | | |
| Qwen2.5-32B-Instruct-AWQ | High-quality instruction-tuned model quantized with AWQ for efficient, lower-memory inference. View on Hugging Face | | |
| Qwen2.5-7B-Instruct | Qwen2.5-7B-Instruct is part of Alibaba Cloud’s Qwen2.5 generation of large language models, released as an evolution of the Qwen2 series. View on Hugging Face | | |
| Qwen2.5-7B-Instruct-Test | A test model for a special image; not for public use. View on Hugging Face | | |
| Qwen2.5-Coder-0.5B | A lightweight coding model for fast, efficient code generation and debugging. View on Hugging Face | | |
| Qwen2.5-Coder-1.5B-Instruct | A lightweight coding model for fast, low-cost code generation and editing. View on Hugging Face | | |
| Qwen2.5-VL-32B-Instruct-AWQ | Qwen2.5-VL-32B-Instruct-AWQ is a high-performance vision-language model optimized for efficient inference. It represents a significant step up in complexity and reasoning from the 7B models, sitting in the "heavyweight" class that typically requires multi-GPU setups or advanced quantization to run at interactive speeds. View on Hugging Face | | |
| Qwen2.5-VL-7B-Instruct | Qwen2.5-VL-7B-Instruct is part of the Qwen2.5-VL iteration of Alibaba Cloud’s vision-language models, released in early 2025. View on Hugging Face | | |
| Qwen2-VL-7B-Instruct | Qwen2-VL-7B-Instruct is a cornerstone model in the second generation of Alibaba's Vision-Language (VL) series. It was released as a major upgrade to the original Qwen-VL. View on Hugging Face | | |
| Qwen3-14B | A 14B general-purpose model designed for strong reasoning, coding, and chat performance. View on Hugging Face | | |
| Qwen3-32B | A general-purpose model designed for strong reasoning, coding, and conversational performance. View on Hugging Face | | |
| Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled | Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled is a specialized reasoning model released in March 2026. It is built on Alibaba's Qwen3.5-27B architecture and fine-tuned using high-density Chain-of-Thought (CoT) distillation from Anthropic’s Claude 4.6 Opus. View on Hugging Face | | |
| Qwen3.5-27B-FP8 | Optimized with FP8 quantization for efficient, high-performance inference. View on Hugging Face | | |
| Qwen3.5-35B-A3B | A model designed for strong reasoning, coding, and agent workloads with improved efficiency. View on Hugging Face | | |
| Qwen3.5-35B-A3B-FP8 | Optimized with FP8 for efficient, high-performance reasoning and chat workloads. View on Hugging Face | | |
| Qwen3.5-35B-A3B-GPTQ-Int4 | Qwen3.5-35B-A3B-GPTQ-Int4 is a GPTQ Int4-quantized version of the Qwen3.5-35B-A3B model. View on Hugging Face | | |
| Qwen3.5-4B | Qwen3.5-4B is a compact, natively multimodal model released by Alibaba Cloud in February 2026. View on Hugging Face | | |
| Qwen3.5-9B | A 9B general-purpose model designed for strong reasoning, coding, and chat performance. View on Hugging Face | | |
| Qwen3.5-9B-AWQ | A 9B Qwen3.5 model quantized with AWQ for efficient, low-memory inference. View on Hugging Face | | |
| Qwen3.5-9B-NVFP4 | A 9B Qwen3.5 model quantized to NVFP4 for ultra-efficient, low-memory inference. View on Hugging Face | | |
| Qwen3.6-27B | Qwen3.6-27B View on Hugging Face | | |
| Qwen3.6-27B-FP8 | Qwen3.6-27B-FP8 View on Hugging Face | | |
| Qwen3.6-35B-A3B | Qwen3.6-35B-A3B (released April 14, 2026) is the first open-weight model of the Qwen3.6 series. View on Hugging Face | | |
| Qwen3.6-35B-A3B-AWQ | QuantTrio/Qwen3.6-35B-A3B-AWQ is a high-performance, 4-bit quantized version of the Qwen3.6-35B-A3B model. View on Hugging Face | | |
| Qwen3.6-35B-A3B-FP8 | Qwen3.6-35B-A3B-FP8 (officially released on April 16, 2026) is the first natively quantized FP8 variant of the Qwen3.6 series. View on Hugging Face | | |
| Qwen3-ASR-1.7B | The Qwen3-ASR family includes Qwen3-ASR-1.7B and Qwen3-ASR-0.6B, which support language identification and ASR for 52 languages and dialects. Both leverage large-scale speech training data and the strong audio understanding capability of their foundation model, Qwen3-Omni. View on Hugging Face | | |
| Qwen3-Coder-30B-A3B-Instruct-1M-GGUF | Qwen3-Coder is available in multiple sizes. Today, we're excited to introduce Qwen3-Coder-30B-A3B-Instruct. This streamlined model maintains impressive performance and efficiency. View on Hugging Face | | |
| Qwen3-Coder-30B-A3B-Instruct-FP8 | Qwen3-Coder-30B-A3B-Instruct-FP8 is a specialized, high-efficiency model released in late 2025/early 2026. It is designed for agentic coding: tasks where the AI acts as an autonomous developer, interacting with environments and tools. View on Hugging Face | | |
| Qwen3-Coder-Next-AWQ-4bit | This is a 4-bit AWQ quantized version of Qwen3-Coder-Next, an open-weight language model designed specifically for coding agents and local development. View on Hugging Face | | |
| Qwen3-Coder-Next-FP8 | Qwen3-Coder-Next-FP8 is an open-weight language model designed specifically for coding agents and local development. View on Hugging Face | | |
| Qwen3-TTS-12Hz-1.7B-Base | A lightweight text-to-speech model designed for efficient, high-quality speech synthesis. View on Hugging Face | | |
| Qwen3-TTS-12Hz-1.7B-CustomVoice | Qwen3-TTS-12Hz-1.7B-CustomVoice is a specialized, highly efficient iteration of Alibaba’s Qwen (Tongyi Qianwen) ecosystem, specifically tuned for Neural Text-to-Speech (TTS). At 1.7 billion parameters, it sits in the "Edge-AI" category: powerful enough to capture human-like prosody and emotion, but small enough to run with extremely low latency on local hardware or mobile devices. View on Hugging Face | | |
| Qwen3-TTS-12Hz-1.7B-VoiceDesign | A lightweight text-to-speech model designed for customizable voice generation. View on Hugging Face | | |
| Qwen3-VL-30B-A3B-Instruct | Qwen3-VL-30B-A3B-Instruct is a state-of-the-art multimodal Large Vision-Language Model (LVLM) released by the Qwen team (Alibaba Cloud) in late 2025. View on Hugging Face | | |
| Qwen3-VL-32B-Instruct-FP8 | Qwen3-VL-32B-Instruct-FP8 is an FP8-quantized, mid-sized multimodal model from Alibaba Cloud (Qwen Team). View on Hugging Face | | |
| Qwen-Image-2512 | Enhanced Human Realism: Qwen-Image-2512 significantly reduces the “AI-generated” look and substantially enhances overall image realism, especially for human subjects. View on Hugging Face | | |
| Qwen/Qwen3-32B-AWQ | A 32B Qwen3 model quantized with AWQ for efficient, high-performance inference. View on Hugging Face | | |
| Qwopus3.5-27B-v3 | Jackrong/Qwopus3.5-27B-v3 is a highly specialized, reasoning-distilled version of the Qwen3.5-27B base model. View on Hugging Face | | |
| rnj-1-instruct-AWQ-8bit | rnj-1-instruct-AWQ-8bit is the 8-bit quantized version of Rnj-1 Instruct, an elite 8-billion parameter agentic coding model released by Essential AI in late 2025. View on Hugging Face | | |
| RolmOCR | RolmOCR is an open-source, high-performance document OCR model developed by Reducto AI as a lighter and faster alternative to Allen Institute for AI's olmOCR. View on Hugging Face | | |
| Seed-OSS-36B-Instruct-AWQ | Seed-OSS-36B-Instruct-AWQ is a 4-bit quantized version of ByteDance’s Seed-OSS-36B, a mid-sized but extremely powerful open-source model released in August 2025. View on Hugging Face | | |
| stable-diffusion-3.5-medium | Stable Diffusion 3.5 Medium (SD 3.5 Medium) is a state-of-the-art text-to-image model released by Stability AI. View on Hugging Face | | |
| Step3-VL-10B | Step3-VL-10B is an open-source multimodal large language model (MLLM) released in January 2026 by StepFun (Stepwise Star). View on Hugging Face | | |
| Strand-Rust-Coder-14B-v1 | The model fine-tunes Qwen2.5-Coder-14B for Rust-specific programming tasks using a 191K-example synthetic dataset built via multi-model generation and peer-reviewed validation. View on Hugging Face | | |
| translategemma-27b-it-FP8-Dynamic | Multilingual translation model optimized with FP8 for fast, memory-efficient inference. View on Hugging Face | | |
| VibeVoice-ASR | A speech recognition model designed for accurate and efficient audio-to-text transcription. View on Hugging Face | | |
| Z-Image-Turbo | Z-Image-Turbo is a 6-billion parameter text-to-image model released by Alibaba's Tongyi Lab (the team behind Qwen) in late 2025. It was specifically engineered to challenge the dominance of larger models like FLUX.1 by prioritizing extreme inference speed and bilingual text rendering without sacrificing photorealism. View on Hugging Face |