Friday, December 5, 2025

THE AI DEBACLE: Did Trump’s ‘Elon Musk Mistake’ Hand China The Keys To Tech Superpower Status?

 Courtesy of News1

Chinese AI startup Moonshot AI recently unveiled Kimi K2 Thinking, an open-source reasoning LLM, intensifying the AI performance race in a way reminiscent of DeepSeek’s breakthrough earlier this year.

Industry sources reported on Thursday that Moonshot AI claimed Kimi K2 Thinking, released on November 6, outperformed leading models such as OpenAI’s GPT-5 and Anthropic’s Claude Sonnet 4.5 on several benchmarks.

At launch, Kimi K2 Thinking scored 44.9 points on Humanity’s Last Exam (HLE), a benchmark that assesses how closely AI matches human expert performance across specialized fields. This score surpassed OpenAI’s GPT-5 (41.7 points), Claude Sonnet 4.5 (32 points), and DeepSeek V3.2 (20.3 points).

In the BrowseComp web search capability test, Kimi K2 Thinking scored 60.2, outperforming GPT-5 (54.9), Claude Sonnet 4.5 (24.1), and DeepSeek V3.2 (40.1). However, it fell short of GPT-5 and Claude Sonnet 4.5 on the SWE-bench coding benchmark.

CNBC highlighted the model’s cost efficiency, estimating its training cost at just 4.6 million USD, though Moonshot AI denied that this figure was official.

Industry experts see this as another confirmation of the rapid advancement of Chinese AI models, following DeepSeek and Alibaba Group’s Qwen series.

OpenAI responded swiftly, unveiling GPT-5.1, an upgraded version of GPT-5, just a week after Kimi K2 Thinking’s release. The company reported significant performance improvements in math and coding benchmarks.

Additionally, OpenAI introduced a group chat feature for collaborative ChatGPT interactions, piloting it in countries including South Korea, Japan, Taiwan, and New Zealand.

Elon Musk’s xAI entered the fray by releasing Grok 4.1 on Monday.

Grok 4.1 comprises two models: Grok 4.1 (codename: tensor) for immediate responses, and Grok 4.1 Thinking (codename: quasarflux) for deep thinking. Both models briefly topped several benchmarks, outperforming models from competitors such as OpenAI, Anthropic, and Google.

Google quickly countered by launching its next-generation AI model, Gemini 3, touting world-class performance. This latest version comes about eight months after Gemini 2.5’s debut.

Gemini 3 scored 37.4 points on the Humanity’s Last Exam (HLE) benchmark, surpassing GPT-5.1 and Claude Sonnet 4.5.

Google plans to release Gemini 3 Deep Think, a research-focused extended version, soon.

U.S. analysts suggest that Gemini 3’s launch quickly stole the spotlight from Grok 4.1.

Meanwhile, the performance gap between U.S. and Chinese AI models is rapidly narrowing. Stanford University’s Human-Centered AI Institute (HAI) reports that the gap between the top models from China and the U.S. shrank from 103 points in January 2024 to just 23 points in February 2025. On the MMLU benchmark, the gap narrowed from 20 percentage points in 2023 to just 0.3 percentage points by the end of 2024.

DeepSeek disclosed on Tuesday in a peer-reviewed Nature paper that the training cost for its R1 model was only 294,000 USD. That is roughly 0.3% of the more than 100 million USD that OpenAI reportedly spent training its foundation model in 2023.
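
The 0.3% figure can be checked with simple arithmetic. The short Python sketch below uses the two dollar amounts quoted above and assumes the 100 million USD floor as the baseline. Because the reported OpenAI figure is "100 million+", the actual share could be lower. The function and variable names are illustrative only.

```python
# Figures as quoted in the article; names are illustrative.
R1_TRAINING_COST_USD = 294_000
# Assumed floor: the reported OpenAI figure is "100 million+ USD".
OPENAI_TRAINING_COST_USD = 100_000_000


def cost_share_percent(cost: float, baseline: float) -> float:
    """Return cost as a percentage of baseline."""
    return cost / baseline * 100


share = cost_share_percent(R1_TRAINING_COST_USD, OPENAI_TRAINING_COST_USD)
print(f"{share:.3f}%")  # 0.294%, which rounds to the reported 0.3%
```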
