Qwen releases the Qwen2.5-VL-32B multimodal model, which outperforms its larger 72B model

PANews|Mar 25, 2025 03:33
According to an announcement from the Qwen team, the Qwen2.5-VL-32B-Instruct model is now officially open source. With 32 billion parameters, it delivers strong performance on tasks such as image understanding, mathematical reasoning, and text generation. The model has been further optimized with reinforcement learning so that its responses align more closely with human preferences, and it surpasses the previously released 72B model on multimodal benchmarks such as MMMU and MathVista. Compared with earlier Qwen2.5-VL models, the 32B version brings three main improvements. Responses aligned with human preferences: the output style has been adjusted to produce more detailed, better-formatted answers that better match human subjective preferences. Mathematical reasoning: accuracy on complex mathematical problems is significantly improved. Fine-grained image understanding and reasoning: stronger accuracy and more fine-grained analysis in tasks such as image parsing, content recognition, and visual logic deduction.
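For readers who want to try the open-sourced weights, below is a minimal sketch of querying the model with an image using the Hugging Face transformers library. It assumes the checkpoint is published as Qwen/Qwen2.5-VL-32B-Instruct, that the installed transformers version ships the Qwen2.5-VL architecture, and that enough GPU memory (or an offloaded/quantized setup) is available for a 32B checkpoint; the image URL is only a placeholder.

```python
# Minimal sketch: ask Qwen2.5-VL-32B-Instruct a question about an image.
# Assumes a transformers version that includes the Qwen2.5-VL architecture
# and the accelerate package for device_map="auto".
import requests
from PIL import Image
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

MODEL_ID = "Qwen/Qwen2.5-VL-32B-Instruct"  # assumed Hugging Face repo name

model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    MODEL_ID, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(MODEL_ID)

# Load an example image (placeholder URL).
image = Image.open(requests.get("https://example.com/chart.png", stream=True).raw)

# Chat-style multimodal message: one image plus a text question.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image and summarize the key numbers."},
        ],
    }
]

# Render the chat template to text, then combine it with the image via the processor.
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[text], images=[image], return_tensors="pt").to(model.device)

# Generate a response and decode only the newly produced tokens.
output_ids = model.generate(**inputs, max_new_tokens=256)
new_tokens = output_ids[:, inputs["input_ids"].shape[1]:]
print(processor.batch_decode(new_tokens, skip_special_tokens=True)[0])
```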