Alibaba's Tongyi releases Qwen3-VL-4B and 8B models; the multimodal versions are officially live

The Alibaba Cloud Tongyi Qianwen team has announced two new open-source models in the Qwen3-VL series, Qwen3-VL-4B and Qwen3-VL-8B, both available on GitHub. According to the official introduction, the two models inherit the Qwen3 architecture and are optimized for multimodal tasks that combine images and text. They can interpret images, text, and tables, and they support generative question answering and complex visual reasoning.

According to the technical documentation, Qwen3-VL-4B is designed for lightweight applications and balances performance against deployment cost, while Qwen3-VL-8B offers higher accuracy and stronger visual understanding, making it suitable for scientific research and enterprise-level tasks. The team said community users are free to test the models and provide feedback, and it encourages public sharing of both success and failure cases. The release is seen as a significant expansion of Tongyi's open-source multimodal capabilities.
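
To make the deployment-cost trade-off between the two sizes concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the nominal parameter counts implied by the model names (about 4 billion and 8 billion) and estimates only the memory needed to hold the weights at common precisions. The actual parameter counts, and the extra memory used by activations and the KV cache, are not given in the article, so treat the figures as approximations rather than official requirements. The function and dictionary names are illustrative only.

```python
# Rough weight-memory estimate. Parameter counts are nominal values
# assumed from the model names, not official figures.
NOMINAL_PARAMS = {"Qwen3-VL-4B": 4e9, "Qwen3-VL-8B": 8e9}

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(model: str, dtype: str) -> float:
    """Approximate weight size in GB (10^9 bytes).

    Excludes activations, KV cache, and framework overhead.
    """
    return NOMINAL_PARAMS[model] * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for model in NOMINAL_PARAMS:
        for dtype in BYTES_PER_PARAM:
            size = weight_memory_gb(model, dtype)
            print(f"{model} {dtype}: ~{size:.0f} GB")
```

Under these assumptions, the 4B model's weights take about half the memory of the 8B model's at any given precision, which is the main reason the smaller model suits lightweight deployments.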

Frequently Asked Questions

Q: What type of model is Qwen3-VL?

A: It is Tongyi Qianwen’s multimodal model that can process both image and text inputs.

Q: What new versions are included in this release?

A: Two new open-source models at different parameter scales, Qwen3-VL-4B and Qwen3-VL-8B, have been added.

Q: Where can I get these models?

A: The model code and weight files have been published in the official Qwen GitHub repository.

Q: What are the improvements compared to the previous version?

A: The release mainly improves visual understanding, OCR accuracy, and cross-modal reasoning, and it speeds up inference.

Q: Can it be deployed commercially or locally?

A: Under Qwen's official open-source license, you are free to use the models for research and deployment, provided you comply with its terms.
