InferX Catalog | Moonlight-16B-A3B

Moonlight-16B-A3B

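Moonlight-16B-A3B is a Mixture-of-Experts (MoE) language model released in February 2025 by Moonshot AI (the creators of Kimi). It was designed to push the Pareto frontier of quality versus compute: it aims for the capability of much larger models while activating only about 3B of its 16B parameters per token, which keeps inference speed close to that of a small model. All experts still have to be held in memory, so the VRAM footprint reflects the full parameter count rather than the active subset.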

moonshotai text text2text

Metadata

Provider

moonshotai

Modality

text

API type

text2text

Source

huggingface / moonshotai/Moonlight-16B-A3B

Created

2026-04-07 21:23:20 UTC

Updated

2026-04-13 03:13:35 UTC

Catalog version

Visibility

Published

Specifications

Parameters

—

MoE

Max model length

2000 tokens

Image

vllm/vllm-openai:v0.16.0
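
A minimal sketch of how the limits above affect a request. The image is vLLM's OpenAI-compatible server, which accepts completion requests at /v1/completions, and prompt tokens plus generated tokens must fit within the max model length of 2000. The helper below clamps max_tokens accordingly and builds the request body. The function name, the default of 256 requested tokens, and the assumption that the model is served under its Hugging Face ID are illustrative and not taken from the catalog. The prompt token count has to come from the model's tokenizer.

```python
import json

MAX_MODEL_LEN = 2000  # "Max model length" from the Specifications above (tokens)
MODEL_ID = "moonshotai/Moonlight-16B-A3B"  # assumption: served under its Hugging Face ID


def build_completion_request(prompt, prompt_tokens, wanted_tokens=256):
    """Return a JSON body for POST /v1/completions that fits the context window.

    prompt_tokens is the tokenized length of `prompt`; the caller supplies it
    because counting tokens requires the model's tokenizer.
    """
    budget = MAX_MODEL_LEN - prompt_tokens
    if budget <= 0:
        raise ValueError("prompt alone fills the context window")
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        # Never ask for more tokens than the remaining context allows.
        "max_tokens": min(wanted_tokens, budget),
    }
    return json.dumps(payload)
```

The returned string can be sent with any HTTP client to the deployment's /v1/completions endpoint, for example http://localhost:8000/v1/completions if the container's default port is exposed locally (the host is an assumption).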

Default Deploy Config

GPU count

1

vRAM

50000 MB

Summary

1xGPU 50000 MB
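
The 50000 MB figure can be sanity-checked with rough arithmetic. The sketch below assumes a nominal 16 billion parameters (read off the model name, since the Parameters field above is empty), 2 bytes per parameter for BF16 weights, and decimal megabytes. None of these are stated in the catalog, and the estimate covers weights only.

```python
NOMINAL_PARAMS = 16e9   # assumption: "16B" in the model name
BYTES_PER_PARAM = 2     # assumption: BF16/FP16 weights
VRAM_MB = 50000         # from the Default Deploy Config above
BYTES_PER_MB = 1e6      # assumption: decimal megabytes


def weight_footprint_mb(params=NOMINAL_PARAMS, bytes_per_param=BYTES_PER_PARAM):
    """Approximate memory needed just to hold the weights, in MB."""
    return params * bytes_per_param / BYTES_PER_MB


weights_mb = weight_footprint_mb()
# Whatever is left must cover the KV cache, activations, and runtime overhead.
headroom_mb = VRAM_MB - weights_mb
print(f"weights ~ {weights_mb:.0f} MB, headroom ~ {headroom_mb:.0f} MB")
```

Under these assumptions the weights take about 32000 MB, leaving roughly 18000 MB for everything else. The estimate uses the full parameter count because an MoE model keeps every expert resident in memory even though only a subset is active per token.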

Recommended Use Cases

—

Model Spec