InferX Beta Serverless GPU Inference Platform, Built for Agent-Native Workloads

Qwen2.5-14B-Instruct

A 14B instruction-tuned model for strong reasoning, coding, and chat tasks.
Tags: Qwen, text, text2text, coding, reasoning, chat

Qwen/Qwen2.5-14B-Instruct is a 14B-parameter instruction-tuned model from the Qwen2.5 family. It targets reasoning, coding, and conversational tasks, and its default configuration runs on a single GPU, which makes it suitable for agents, assistants, and production deployments.

This public page shows the catalog details for the model; deployment and customization require logging in.

Metadata

Provider: Qwen
Modality: text
API type: text2text
Source:
Created: 2026-04-06 05:40:30 UTC
Updated: 2026-04-13 03:13:42 UTC
Catalog version: 5
Visibility: Published

Specifications

Parameters: 14.00B
MoE: No
Max model length: 32768 tokens
Image: vllm/vllm-openai:v0.16.0
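
The vllm/vllm-openai image runs vLLM's OpenAI-compatible server, so a deployed instance would normally accept chat requests at /v1/chat/completions. The sketch below only builds such a request with the Python standard library. The base URL, the API key, and the use of bearer-token auth are assumptions for illustration; InferX may expose the endpoint differently, and the served model name may not match the catalog name.

```python
import json
import urllib.request

# From the Specifications on this page: prompt plus generated tokens
# must fit within the maximum model length.
MAX_MODEL_LEN = 32768

# Assumption: the deployment serves the model under its catalog name.
MODEL_NAME = "Qwen/Qwen2.5-14B-Instruct"


def build_chat_request(base_url, api_key, messages, max_tokens=512):
    """Build (but do not send) a chat completion request.

    base_url and api_key are hypothetical placeholders supplied by the caller.
    """
    if not 0 < max_tokens < MAX_MODEL_LEN:
        raise ValueError("max_tokens must be positive and below the max model length")

    payload = {
        "model": MODEL_NAME,
        "messages": messages,
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the endpoint accepts a bearer token.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen and parse the JSON body of the response. The max_tokens check is only a coarse guard: it does not count prompt tokens, so a long prompt can still exceed the 32768-token limit.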

Default Deploy Config

GPU count: 1
vRAM: 40000 MB
Summary: 1xGPU, 40000 MB
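
A rough back-of-the-envelope check shows why one 40000 MB GPU can be a workable default for a 14.00B-parameter model. The sketch below assumes 2 bytes per parameter (16-bit weights), treats 1 MB as 10^6 bytes, and ignores activations and runtime overhead. None of these assumptions is stated on this page, so the result is an estimate, not the platform's actual accounting.

```python
def weight_footprint_mb(params_billions, bytes_per_param=2):
    """Approximate weight memory in MB (1 MB = 10**6 bytes).

    bytes_per_param=2 assumes 16-bit weights, which is an assumption here.
    """
    return params_billions * 1e9 * bytes_per_param / 1e6


def headroom_mb(vram_mb, params_billions, bytes_per_param=2, utilization=1.0):
    """Memory left for KV cache and overhead after loading the weights.

    utilization is the fraction of vRAM the server is allowed to use;
    the value actually configured for this deployment is not stated.
    """
    return vram_mb * utilization - weight_footprint_mb(params_billions, bytes_per_param)


if __name__ == "__main__":
    weights = weight_footprint_mb(14.00)
    print(f"weights: {weights:.0f} MB")
    print(f"headroom at full vRAM: {headroom_mb(40000, 14.00):.0f} MB")
    print(f"headroom at 90% vRAM: {headroom_mb(40000, 14.00, utilization=0.9):.0f} MB")
```

Under these assumptions the weights take about 28000 MB, leaving roughly 12000 MB of the 40000 MB for the KV cache and everything else. The headroom shrinks if the server is limited to a fraction of the vRAM.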

Recommended Use Cases

Model Spec