InferX Beta Serverless GPU Inference Platform, Built for Agent-Native Workloads

InnerVerse-GLM47Flash-v1

A fast, reasoning-focused model optimized for efficient inference and strong instruction following
Jrose620 text text2text coding reasoning

Jrose620/InnerVerse-GLM47Flash-v1 is a GLM-based model designed for fast inference and strong reasoning performance, offering solid capabilities in coding, analysis, and conversational tasks, making it suitable for agents, assistants, and production deployments where latency and efficiency matter.

Log in to deploy: this public page shows the catalog model details, but deployment and customization stay behind login.
Log in to deploy

Metadata

Provider
Jrose620
Modality
text
API type
text2text
Source
Created
2026-03-31 16:32:26 UTC
Updated
2026-03-31 16:46:49 UTC
Catalog version
1
Visibility
Published

Specifications

Parameters
MoE
No
Max model length
32768
Image
vllm/vllm-openai:glm5

Default Deploy Config

GPU count
1
vRAM
70000 MB
Summary
1xGPU 70000 MB

Recommended Use Cases

  • Coding assistant

Model Spec