Qwen3-VL Released: Flagship 235B Model Open Source, Instruction/Thinking Versions Available

AI information • Admin • 9/24/2025 • 150 views

Tongyi Qianwen has launched the next-generation visual language model, the Qwen3-VL . The flagship Qwen3-VL-235B-A22B is available in two open-source versions: Instruct and Thinking . Official materials show that Instruct outperforms the Gemini 2.5 Pro on multiple visual benchmarks, while Thinking achieves leading results in multimodal reasoning tasks. The model supports "visual agents" that can interpret buttons, invoke tools, and complete real-world tasks on PC/mobile interfaces; it has performed exceptionally well in benchmarks such as OS World .

This upgrade emphasizes coverage of long context and complex scenarios: It supports over 256KB of context, expandable to 1MB , and can process approximately two hours of video and multi-page PDFs. It also offers OCR in 32 languages (with enhanced robustness against blurry, skewed, and rare characters), and provides more robust performance in 2D/3D spatial understanding, occlusion, and viewpoint reasoning. Regarding the open ecosystem, online conversation (Qwen Chat), API (Alibaba Cloud Model Studio), and Hugging Face/ModelScope weights and demos have all been released simultaneously.

Frequently Asked Questions

Q: Which variants are open sourced this time?

A: Qwen3-VL-235B-A22B Instruction and Thinking , also provides Caption/demonstration resources and reasoning examples.

Q: What can a visual agent do?

A: Read screen elements and hierarchies, understand buttons and forms, and use tool calls to complete tasks on real devices/applications.

Q: How large is the long context supported?

A: It is marked as 256K+ and can be expanded to 1M level, which is suitable for long video and long document scenarios.

Q: What is the coverage of multi-language capabilities?

A: It supports OCR in 32 languages, and its text capabilities are aligned with top general models for cross-language screen reading and comprehension.

Q: How to experience or access?

A: For Qwen Chat, choose qwen3-vl-plus . Alibaba Cloud Model Studio provides the API. Weights and demos are available in Hugging Face/ModelScope.

Qwen3-VL Released: Flagship 235B Model Open Source, Instruction/Thinking Versions Available

Related Articles

Qwen3-Coder Upgrade Release: Improved Terminal Bench Performance, Support for Qwen Code/Claude Code Integration

Qwen3-Max-Instruct/Thinking is now available: Coding and Agent capabilities are significantly enhanced

Kimi K3 officially launched: 2.8 trillion parameters betting on millions of contexts and open weight

Mistral Studio adds prompt version management: enterprise AI is now managing behavioral assets

Recommended Tools

Qwen3-VL Released: Flagship 235B Model Open Source, Instruction/Thinking Versions Available

Related Articles

Qwen3-Coder Upgrade Release: Improved Terminal Bench Performance, Support for Qwen Code/Claude Code Integration

Qwen3-Max-Instruct/Thinking is now available: Coding and Agent capabilities are significantly enhanced

Kimi K3 officially launched: 2.8 trillion parameters betting on millions of contexts and open weight

Mistral Studio adds prompt version management: enterprise AI is now managing behavioral assets

Recommended Tools

Submit AI Tool

Please confirm submission information