Alibaba RynnBrain: Open-Source Physical AI Beats Google and NVIDIA

Introduction

The race to build "physical AI" — artificial intelligence that can operate in and reason about the real world — just got a major new entrant. On February 10, 2026, Alibaba's DAMO Academy unveiled RynnBrain, an open-source embodied AI model designed to give robots the cognitive horsepower they need to navigate, perceive, and act in complex physical environments.

The announcement carries real weight. RynnBrain set new records across 16 open-source embodied AI benchmarks and claims to outperform significantly larger models from Google (Gemini Robotics ER 1.5) and NVIDIA (Cosmos Reason 2). Perhaps most importantly, Alibaba is releasing the full model weights and tooling as open source — a move that could dramatically accelerate robotics research globally.

Here's a deep dive into what RynnBrain is, how it works, and why it matters for the future of robotics.

What Is Physical AI?

Before unpacking RynnBrain itself, it's worth framing the category it competes in.

"Physical AI" (also called "embodied AI") refers to AI systems designed to interact directly with the physical world — not just process text or images on a screen. This covers humanoid robots, autonomous vehicles, robotic arms in factories, and service robots navigating real environments.

The core challenge is substantially harder than language AI. A large language model (LLM) processes tokens. A physical AI system must:

Perceive a dynamic 3D environment in real time
Remember where objects were and how they moved
Reason about spatial relationships and constraints
Plan multi-step actions with precise timing
Execute those plans with low latency and high reliability

Existing robotics AI models often fail at memory and spatial reasoning — they forget where an object was placed moments ago or misinterpret cluttered scenes. This is the exact gap RynnBrain targets.

RynnBrain's Core Architecture

Spatiotemporal Memory

The headline innovation in RynnBrain is its spatiotemporal memory system. Traditional embodied AI treats each moment in isolation. RynnBrain maintains a running model of the environment that tracks:

Where objects were (and where they've moved to)
How objects are likely to move next (trajectory prediction)
What the robot itself has done (action history)

This allows robots to handle tasks that unfold over time — like following instructions that reference objects placed earlier ("put the red apple next to the bowl you moved before"). Without spatiotemporal memory, most current systems would fail at even basic versions of these tasks.

Global Retrospection

RynnBrain introduces a mechanism called global retrospection — the ability for the robot to review its own action history before committing to a next step. Think of it like a chess player reviewing the last few moves before deciding on the next.

This reduces cascading errors in long-horizon tasks. If a robot makes a mistake in step 3 of a 10-step sequence, global retrospection gives it a mechanism to notice the error before compounding it through steps 4–10.

Physical-Space Reasoning

Most modern AI models reason primarily in the domain of language or 2D image features. RynnBrain adds a physical-space reasoning layer that combines:

Text-based logical inference
Spatial coordinate reasoning (3D positions, orientations, distances)
Scene geometry understanding

This hybrid approach lets robots reason in ways that map more naturally to physical reality — "the cup is 15cm to the left of the plate and 5cm above the table surface" rather than just "there's a cup near a plate."

The RynnScale Architecture

Under the hood, RynnBrain is trained on Alibaba's Qwen3-VL visual-language model — one of the leading vision-language foundation models — and optimized using a custom training architecture called RynnScale.

According to DAMO Academy, RynnScale doubled training efficiency without increasing compute costs. This kind of efficiency gain matters enormously at scale: training large robotics models is extraordinarily expensive, and halving those costs opens the technology to more research labs and companies.

The Numbers: 30B Parameters, 3B Active

One of the most technically interesting aspects of RynnBrain is its parameter architecture.

The headline model is a 30-billion-parameter mixture-of-experts (MoE) system. MoE architecture means the model is divided into specialized "expert" sub-networks. During inference, only a subset of these experts activates for any given input.

In RynnBrain's case: despite having 30 billion total parameters, only 3 billion parameters activate during any single inference pass. This is a 10:1 ratio — meaning the model runs at the speed and cost of a 3B model while drawing on the knowledge of a 30B system.

Alibaba claims this efficiency advantage lets RynnBrain outperform standard (non-MoE) 72-billion-parameter models. For robotics deployment, this is critical: robots have power budgets, thermal limits, and strict latency requirements that make running large dense models impractical.

Model	Parameters (Total)	Parameters (Active)	Benchmarks Set
RynnBrain	30B (MoE)	3B	16 records
Google Gemini Robotics ER 1.5	—	—	Previous record holder
NVIDIA Cosmos Reason 2	—	—	Previous record holder

Benchmark Performance: 16 Records Broken

RynnBrain was evaluated against 16 open-source embodied AI benchmarks covering:

Environmental perception — object detection and scene understanding
Spatial reasoning — 3D relationship comprehension and navigation
Task execution — multi-step instruction following and manipulation

DAMO Academy reports RynnBrain surpassed both Google's Gemini Robotics ER 1.5 and NVIDIA's Cosmos Reason 2 across these evaluations.

Alongside the model release, Alibaba introduced RynnBrain-Bench — a new evaluation framework specifically designed for fine-grained spatiotemporal tasks. This addresses a longstanding gap in embodied AI benchmarking: most existing tests focus on static image recognition rather than the dynamic, time-sensitive tasks real robots face.

If adopted by the research community, RynnBrain-Bench could become a standard evaluation tool — giving Alibaba additional influence over how physical AI progress is measured.

Open Source: Seven Models Released

Perhaps the most significant aspect of the RynnBrain launch is its open-source commitment. DAMO Academy released seven fully open-source models:

Base models — pre-trained foundations for robotics tasks
Fine-tuned variants — models optimized for specific commercial applications

All weights are publicly available on GitHub at github.com/alibaba-damo-academy/RynnBrain.

This is a meaningful differentiator. Competitors like Google and NVIDIA have kept their most capable robotics AI models proprietary or limited to research partnerships. Alibaba's full open release lowers the barrier to entry for:

Academic research labs building on top of RynnBrain
Robotics startups integrating physical AI without licensing costs
Hardware manufacturers optimizing for specific deployment scenarios
Open-source communities building tools and fine-tuning datasets

The move mirrors Alibaba's strategy with its Qwen language model series — using open-source releases to drive adoption and ecosystem development, then monetizing through cloud services and enterprise support.

China's Physical AI Race

RynnBrain doesn't exist in isolation. It's part of a broader national push by China to establish dominance in physical AI.

Beijing has explicitly designated robotics and embodied AI as strategic priorities, with significant government investment flowing into research programs and domestic manufacturers. Chinese companies — from Unitree to Fourier Intelligence — are rapidly closing the gap with American counterparts on humanoid robot hardware.

On the AI model side, RynnBrain is Alibaba's entry into a field where NVIDIA (with Cosmos) and Google (with Gemini Robotics) have been making aggressive moves. Baidu, ByteDance, and Tencent are also rumored to be developing competing physical AI models.

The result is a genuine race — not just between companies, but between countries — to define the intelligence layer of the coming robotics economy.

Real-World Applications

So what can RynnBrain actually do today? The demonstration video from DAMO Academy shows a robot arm recognizing fruit and placing it into a basket. Humble on the surface, but the underlying capabilities generalize broadly:

Manufacturing: Robots that can handle unstructured assembly tasks — not just bolt tightening on a fixed line, but adapting to varied component orientations and positions.

Logistics: Warehouse robots that track object states across a shift, remember where items were placed during restocking, and handle exceptions without human intervention.

Service Robotics: Household or hospitality robots that maintain a mental model of a room and follow multi-step natural language instructions involving objects placed at different times.

Surgical/Medical Robotics: Precise manipulation tasks where spatial reasoning and action history are critical for safety.

The factory and logistics sectors represent the most immediate deployment opportunities, where task structures are constrained enough that current spatiotemporal memory capabilities can add genuine value.

Strengths, Limitations, and What to Watch

What's genuinely impressive:

The MoE efficiency advantage is real and deployment-relevant
Open-sourcing seven models is a bold, community-accelerating move
The spatiotemporal memory approach targets a genuine weakness in existing systems
Setting benchmark records across 16 evaluations, if verified independently, is a strong signal

What to be cautious about:

Benchmark vs. real-world performance: Lab benchmarks and real robot deployment are different things. RynnBrain's 16 records are on open-source tests; independent real-world validation is still needed.
Manipulation dexterity: Spatial reasoning and memory are necessary but not sufficient. The hardware dexterity required for many tasks (fine motor manipulation, force sensing) is a separate challenge that AI models alone don't solve.
RynnBrain-Bench adoption: DAMO Academy introducing their own benchmark that their model tops is a well-worn move in ML. The community will need to verify that RynnBrain-Bench is genuinely standardizing useful evaluations rather than being tuned to favor RynnBrain.

What to watch:

Independent replication of benchmark results
Community adoption and fine-tuning efforts on GitHub
Whether RynnBrain gets integrated into commercial robot platforms
NVIDIA and Google's response (both have significant resources to iterate quickly)

Conclusion

Alibaba's RynnBrain is a serious entrant in the physical AI race. The combination of spatiotemporal memory, global retrospection, and a highly efficient MoE architecture addresses real limitations in current embodied AI systems. The open-source release of seven models is a genuine gift to the robotics research community.

Whether RynnBrain's benchmark dominance translates to real-world robot deployment superiority remains to be seen — that's a longer test that labs and manufacturers will run over the coming months. But as a technical statement and a strategic move in the global physical AI competition, it's impossible to ignore.

The robots are getting smarter. And increasingly, that intelligence is becoming open to everyone.

Sources: Interesting Engineering, CNBC, Techzine, GitHub

Alibaba RynnBrain: The Open-Source Physical AI Model Beating Google and NVIDIA