Moonlight-16B-A3B
Moonlight-16B-A3B is a high-efficiency Mixture-of-Experts (MoE) language model released in February 2025 by Moonshot AI (the creators of Kimi). It was designed to push the "Pareto frontier"—delivering the reasoning power of much larger models while maintaining the inference speed and VRAM footprint of a small model
Metadata
Provider
moonshotai
Modality
text
API type
text2text
Source
huggingface /
moonshotai/Moonlight-16B-A3B
Created
2026-04-07 21:23:20 UTC
Updated
2026-04-13 03:13:35 UTC
Catalog version
2
Visibility
Published
Specifications
Parameters
—
MoE
No
Max model length
2000
Image
vllm/vllm-openai:v0.16.0
Default Deploy Config
GPU count
1
vRAM
50000 MB
Summary
1xGPU 50000 MB
Recommended Use Cases
—