Friday, May 1, 2026

Watch Out ChatGPT—Alibaba’s New Qwen AI Talks, Listens, and Even Watches

Alibaba Cloud’s end-to-end multimodal AI model Qwen2.5-Omni-7B / Photo courtesy of Alibaba

Alibaba Cloud, a subsidiary of Alibaba Group, unveiled its groundbreaking end-to-end multimodal artificial intelligence (AI) model, Qwen2.5-Omni-7B, on Monday.

This cutting-edge model can process diverse input data, including text, images, voice, and video, while delivering real-time text and voice responses.

With 7 billion parameters, Qwen2.5-Omni-7B is a lightweight model suited to building cost-effective AI agents and intelligent voice applications. Its versatility allows for applications across various domains, from providing real-time audio descriptions for the visually impaired to offering cooking guidance and enhancing customer service systems.

An Alibaba Cloud spokesperson stated that the model's architecture allows it to deliver high performance at a reduced cost. Key technologies include the “Thinker-Talker architecture,” which minimizes interference by separating text generation from speech synthesis, and “TMRoPE” (Time-aligned Multimodal RoPE), a position embedding technique that strengthens synchronization between video and audio.

Trained on an extensive pre-training dataset, Qwen2.5-Omni-7B excels in various tasks, such as image-to-text, video-to-text, video-to-speech, and speech-to-text conversions. The model has notably achieved top-tier performance on OmniBench, a benchmark that assesses the integrated processing of visual, auditory, and textual information.

Alibaba Cloud has made the model open-source through popular platforms like Hugging Face and GitHub. It can also be accessed via ModelScope, Alibaba Cloud’s dedicated open-source community.

Alibaba Cloud has open-sourced more than 200 generative AI models in recent years.

Following the initial launch of Qwen2.5 in September 2024, Alibaba Cloud expanded its AI portfolio with Qwen2.5-Max in January 2025. The company has also introduced specialized models such as Qwen2.5-VL, for enhanced visual understanding, and Qwen2.5-1M, for processing extended inputs.
