Monday, March 10, 2025

U.S. vs. China: The Battle for AI Agent Supremacy Heats Up with ByteDance’s UI-TARS

ByteDance UI-TARS introduction video capture

Competition between the United States and China for supremacy in generative artificial intelligence (AI) technology is expanding beyond the AI chatbot market into the AI agent market. 

AI agents are intelligent systems capable of autonomous problem-solving beyond the abilities of conventional chatbots. 

U.S. companies such as OpenAI (backed by Microsoft), Google, and Anthropic have taken the lead in this field. However, ByteDance, TikTok’s parent company, is quickly gaining ground with its new AI model, setting the stage for an intense showdown. 

Industry sources reported on Tuesday that ByteDance unveiled its AI agent, UI-TARS, on January 22. The system is designed to solve problems autonomously by interpreting graphical user interfaces (GUIs) and reasoning about them.

Unlike other models, UI-TARS is said to function in both web browsers and mobile app environments.

ByteDance claimed that UI-TARS outperformed competitors such as GPT-4o and Claude 3.5 Sonnet on VisualWebBench, a benchmark that assesses how well visual AI models understand web pages. 

The company also claims that “Doubao 1.5 Pro,” released on the same day, is more cost-effective than GPT-4o in coding, inference, and Chinese-language processing. Doubao is a popular Chinese chatbot with 60 million monthly active users (MAU).

OpenAI Web Browser AI Agent Operator

OpenAI introduced its web browser-based AI agent, “Operator,” on January 23, just a day after ByteDance’s announcement. 

The timing has been interpreted as a sign that UI-TARS influenced OpenAI’s release schedule, much as OpenAI made o3-mini available free of charge and launched its deep reasoning tool Deep Research immediately after the DeepSeek breakthrough.

Operator uses CUA (Computer-Using Agent), a model built on a modified version of GPT-4o’s vision recognition capability, to identify and sequentially execute commands in a web browser. 

This enables the AI to interact by interpreting images displayed on a computer screen.
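The cycle described above, in which the agent looks at the screen, decides on a command, executes it, and looks again, can be illustrated with a minimal Python sketch. It is not OpenAI’s or ByteDance’s implementation: `FakeBrowser`, `toy_policy`, and `run_agent` are hypothetical names invented for this example, and a short text description stands in for the screenshot a real vision model would receive.

```python
from dataclasses import dataclass


@dataclass
class Action:
    kind: str        # "type", "click", or "done"
    target: str = ""
    text: str = ""


class FakeBrowser:
    """Hypothetical stand-in for a browser; it only records what the agent does."""

    def __init__(self):
        self.fields = {}
        self.submitted = False
        self.log = []

    def screenshot(self) -> str:
        # A real agent would receive pixels; a text label stands in here.
        if self.submitted:
            return "confirmation page"
        if "destination" in self.fields:
            return "form filled"
        return "empty form"

    def execute(self, action: Action) -> None:
        self.log.append(action.kind)
        if action.kind == "type":
            self.fields[action.target] = action.text
        elif action.kind == "click" and action.target == "submit":
            self.submitted = True


def toy_policy(observation: str) -> Action:
    """Hypothetical decision step; a real agent would query a vision model."""
    if observation == "empty form":
        return Action("type", "destination", "Seoul")
    if observation == "form filled":
        return Action("click", "submit")
    return Action("done")


def run_agent(browser: FakeBrowser, policy, max_steps: int = 10) -> int:
    """Observe, decide, act, repeat. Returns the number of actions executed."""
    for step in range(max_steps):
        action = policy(browser.screenshot())
        if action.kind == "done":
            return step
        browser.execute(action)
    return max_steps


browser = FakeBrowser()
steps = run_agent(browser, toy_policy)
print(steps, browser.log, browser.submitted)
```

The `max_steps` cap is a design choice of this sketch: it keeps the loop from running indefinitely if the policy never signals that the task is finished.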

YouTube@OpenAI

Both companies claim their AI agents can autonomously search, recommend, and book travel itineraries. These systems can handle requests for flight bookings, hotel reservations, and Uber rides through voice or text commands. If problems arise, they attempt to resolve them independently before seeking user intervention. 

While traditional large language model (LLM) based AI chatbots primarily produce text outputs, these advanced AI agents are designed to perform actions based on user requests, potentially becoming essential partners in daily life and work. 

An industry insider noted that tech companies are investing heavily in AI technology to secure leadership and market dominance. The insider added that the DeepSeek breakthrough will likely intensify the U.S.-China rivalry in the AI agent market.
