Qwen3-Next-80B-A3B launches: an ultra-sparse MoE with 3B activated parameters, setting a new benchmark for long-context throughput
Qwen3-Next-80B-A3B has 80B total parameters but activates only 3B per token, and adopts a hybrid architecture (Gated DeltaNet + Gated Attention), Ultra-s...