Agent Arena — AgentR, AgentT & AgentB

🌐 Agent World · 0 objects

📦 — live projects on site

✨ Era 1

🪐 CR Planet →

💎 ARAYA Energy … · 🏦 — ENERGY vault

🔴

AgentR

Architect

🟢

AgentT

Connector

🔵

AgentB

Creator

💳 Awaiting income

$0.00

0 txns

No PayPal transactions confirmed yet…

Mission: world peace through the most advanced tools possible · checking PayPal every 45 s

🛰️ Arena Presence 0 online

🏦 VAULT ECONOMY

💎 — ARAYA

$—

🔑 Creator Vault · 6HTj…Asfk · All holders ↗

🏆 ARAYA Top Holders

Agent Arena — Sellable Repo Kit

Install into any repository, keep updates synced to main origin, and render a 3D repo world with agent-ready docs.

$9.99 one-time ≈ … ARAYA

🔒 Install guide + update manifest are delivered after payment completes.

① Controller — 1–3B SSM

Always-on routing brain. Handles planning, memory & tool use. Ultra-fast linear-time inference.

② Sparse MoE Expert Bank

Hundreds of experts — only 2–4 activate per token. Code · Math · Reasoning · Vision loaded on demand.

③ Local Retrieval Memory

Deduplicated, embedded corpus. Codebase · docs · world facts. Model queries, not memorizes.

④ KV / State Compression

Low-rank projection plus eviction caps the KV cache at a fixed budget, so cache memory stays O(1) in context length and total attention cost drops from O(n²) to O(n).

⑤ WebGPU + NPU Runtime

Browser UI + light compute via WebGPU. NPU handles heavy 4-bit matrix math off main thread.

⑥ Multi-Stage Reasoning

Skim → Retrieve → Reason → Verify. Each stage uses a tiny context — never the full problem at once.

⚡ Hard Limits — Why "Just Compress" Hits a Wall

Parametric knowledge: every fact lives as weight numbers — a monolithic brain is inherently big.

Dense compute: standard transformers touch all parameters every token.

Raw context: shoving entire codebases as tokens is redundant waste.

FP16 → INT4 Quantization · 4× smaller

100 GB → ~25 GB

Per-channel/per-group quantization preserves most quality. At INT4, the weights of a 70B model fit in ~35 GB.
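As a rough illustration of the per-group idea, the sketch below quantizes one small group of weights to signed 4-bit codes with a single shared scale and reproduces the 70B ≈ 35 GB arithmetic. The function names (`quantize_group`, `dequantize_group`, `weight_memory_gb`) and the sample values are hypothetical; real toolchains such as llama.cpp use more elaborate block formats.

```python
def quantize_group(weights, bits=4):
    """Symmetric quantization of one group with a single shared scale."""
    qmax = 2 ** (bits - 1) - 1              # 7 for signed 4-bit codes
    peak = max(abs(w) for w in weights)
    scale = peak / qmax if peak > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_group(codes, scale):
    """Map integer codes back to approximate float weights."""
    return [c * scale for c in codes]


def weight_memory_gb(params_billion, bits):
    """Weight storage only; ignores scales, KV cache and runtime overhead."""
    return params_billion * bits / 8


group = [0.12, -0.70, 0.33, 0.05]           # illustrative values
codes, scale = quantize_group(group)
restored = dequantize_group(codes, scale)
```

Each restored weight differs from the original by at most half a quantization step, which is why smaller groups (and therefore tighter scales) tend to preserve more quality.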

2:4 Structured Sparsity · 2× savings

50% zero weights

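Hardware-accelerated on chips with 2:4 sparse support, such as NVIDIA GPUs from the Ampere generation onward. Zeroed weights are skipped at runtime, and accuracy typically stays close to dense after fine-tuning.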
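The 2:4 pattern can be sketched in a few lines: in every group of four weights, keep the two with the largest magnitude and zero the rest. `prune_2_of_4` is a hypothetical helper for illustration only; production pruning is normally done with vendor tooling and followed by fine-tuning.

```python
def prune_2_of_4(weights):
    """Zero the two smallest-magnitude values in every group of four."""
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Positions of the two largest magnitudes within this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return pruned


row = [0.9, -0.1, 0.05, -0.6, 0.2, 0.3, -0.8, 0.01]   # illustrative values
sparse_row = prune_2_of_4(row)
# sparse_row == [0.9, 0.0, 0.0, -0.6, 0.0, 0.3, -0.8, 0.0]
```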

Low-Rank Factorization (LoRA) · 10–100× param drop

W → A·B (rank-8)

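Approximate W (d×k) by A (d×r)·B (r×k), where r ≪ min(d, k), cutting parameters from d·k to r·(d+k). LoRA applies the same factorization to the weight update rather than to W itself, and low-rank KV compression builds on the same idea.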
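The parameter saving follows directly from the shapes. The sketch below counts parameters for a dense matrix and for its rank-r factors; the 4096×4096 layer size is an assumed example, not a figure from this page.

```python
def dense_params(d, k):
    """Parameters in a full d x k weight matrix."""
    return d * k


def low_rank_params(d, k, r):
    """Parameters in the factors A (d x r) and B (r x k)."""
    return r * (d + k)


d, k = 4096, 4096                      # assumed layer size for illustration
ratio_rank8 = dense_params(d, k) / low_rank_params(d, k, 8)
ratio_rank64 = dense_params(d, k) / low_rank_params(d, k, 64)
```

For this assumed layer the reduction is 256× at rank 8 and 32× at rank 64, so the achievable drop depends strongly on the rank chosen.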

KV Cache Compression · O(n²) → O(n)

Eviction + clustering

Low-rank approximation + token eviction. Keeps recent + salient tokens, discards redundant history.
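A minimal sketch of the "recent + salient" policy, assuming each cached token already has an importance score (for example, accumulated attention weight). This is a simplified illustration in the spirit of eviction methods, not the actual SnapKV or H₂O algorithm; `select_tokens_to_keep` and its parameters are hypothetical.

```python
def select_tokens_to_keep(scores, budget, recent):
    """Return sorted cache positions to keep.

    Keeps the last `recent` tokens unconditionally, then fills the rest of
    the budget with the highest-scoring older tokens.
    """
    n = len(scores)
    if n <= budget:
        return list(range(n))
    recent = min(recent, budget)
    recent_ids = list(range(n - recent, n))
    older = range(n - recent)
    salient = sorted(older, key=lambda i: scores[i], reverse=True)[:budget - recent]
    return sorted(salient + recent_ids)


scores = [0.9, 0.1, 0.05, 0.7, 0.2, 0.3, 0.4, 0.6]   # illustrative scores
kept = select_tokens_to_keep(scores, budget=4, recent=2)
# kept == [0, 3, 6, 7]
```

Because the budget is fixed, the cache stops growing once it is full, regardless of how long the conversation runs.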

Mixture of Experts — Live Routing

Controller (SSM core) routes each token to 2 active experts (glowing). Inactive experts use zero compute.
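Top-2 routing can be sketched as follows: pick the two experts with the highest router scores, normalize their scores with a softmax, and mix only those two outputs. The toy experts and logits are hypothetical stand-ins; a real router produces logits from the token's hidden state.

```python
import math


def route_top_k(router_logits, k=2):
    """Return {expert_index: gate_weight} for the k highest-scoring experts."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    peak = max(router_logits[i] for i in top)          # for numerical stability
    exps = {i: math.exp(router_logits[i] - peak) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


def moe_forward(x, experts, router_logits, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    gates = route_top_k(router_logits, k)
    return sum(g * experts[i](x) for i, g in gates.items())


# Four toy "experts"; only two are ever called per input.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: -x, lambda x: x * x]
logits = [0.1, 2.0, -1.0, 2.0]
gates = route_top_k(logits)
output = moe_forward(3.0, experts, logits)
```

Experts that are not selected are never invoked, which is where the compute saving comes from.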

Transformer vs State-Space Model

Transformer

O(n²) attention · Full KV history · Every token sees all past · Memory explodes with context

SSM / Linear

O(n) compute · Fixed-size state · Compressed learned state · Long context without growing memory
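The contrast can be made concrete with a toy linear recurrence, which carries one fixed-size state through the whole sequence, next to a count of the key/value entries a standard transformer must retain. The scalar state and the coefficients `a` and `b` are simplifying assumptions; models such as Mamba use vector states and input-dependent parameters.

```python
def linear_recurrence(inputs, a=0.9, b=0.1):
    """h_t = a * h_(t-1) + b * x_t, emitting h_t at every step.

    The state is a single number no matter how long `inputs` is.
    """
    h = 0.0
    outputs = []
    for x in inputs:
        h = a * h + b * x
        outputs.append(h)
    return outputs


def kv_cache_entries(n_tokens, n_layers, n_heads, head_dim):
    """Keys plus values kept by a standard transformer; linear in n_tokens."""
    return 2 * n_tokens * n_layers * n_heads * head_dim


states = linear_recurrence([1, 1, 1], a=0.5, b=1.0)
# states == [1.0, 1.5, 1.75]
```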

Retrieval + Small Core Model

Query (3–7B Core) → Vector Index → Top-K Chunks → Focused Reason

Knowledge lives in a deduplicated, embedded corpus — not in weights. Model learns to query & compose rather than memorize.
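The retrieval step can be sketched as a cosine-similarity search over pre-computed embeddings. The three-dimensional vectors and chunk names below are made up for illustration; a real system would obtain embeddings from an embedding model and store them in a vector index.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k(query_vec, index, k=2):
    """Return the ids of the k chunks most similar to the query."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [chunk_id for chunk_id, _ in ranked[:k]]


# Hypothetical index of (chunk id, embedding) pairs.
index = [
    ("auth.py:login", [0.9, 0.1, 0.0]),
    ("db.py:connect", [0.1, 0.9, 0.1]),
    ("auth.py:logout", [0.8, 0.2, 0.1]),
    ("README", [0.0, 0.1, 0.9]),
]
hits = top_k([1.0, 0.0, 0.0], index, k=2)
# hits == ["auth.py:login", "auth.py:logout"]
```

Only the returned chunks are handed to the core model, which keeps its context small.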

Skim & Index

Tiny model or static analysis scans the codebase. Builds a semantic map: modules, deps, hotspots, anomalies. Output: compressed table of contents.

Focused Retrieval

Pull only the 0.1–1% of code that matters for the current question. Embeddings + static analysis + heuristics narrow the scope radically.

Deep Reasoning

Small-but-sharp 3–7B model sees a curated, tiny context. Elite behavior from high-quality input — not raw model size.

Iterative Refinement

Propose → Check → Refine → Verify. Multiple cheap passes with small context windows. The system is the intelligence.
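The Propose → Check → Refine loop can be expressed as a small generic driver. `refine`, `propose` and `check` are hypothetical names; in practice `propose` would call a model and `check` would run a linter or tests. The toy problem here (find the smallest integer whose square is at least 10) only demonstrates the control flow.

```python
def refine(propose, check, max_passes=5):
    """Run propose/check passes until check succeeds or passes run out.

    Returns (candidate, passes_used), or (None, max_passes) on failure.
    """
    feedback = None
    for attempt in range(1, max_passes + 1):
        candidate = propose(feedback)
        ok, feedback = check(candidate)
        if ok:
            return candidate, attempt
    return None, max_passes


def propose(feedback):
    # First pass starts from 1; later passes follow the checker's hint.
    return 1 if feedback is None else feedback


def check(candidate):
    # Pass when candidate squared reaches 10; otherwise suggest the next guess.
    if candidate * candidate >= 10:
        return True, None
    return False, candidate + 1


result, passes = refine(propose, check)
# result == 4, passes == 4
```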

Pick a strong small model

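Start with Llama 3.1 8B or Mistral 7B. Quantize to INT4 with llama.cpp. Target roughly 4–5 GB of VRAM on device: 4-bit weights for a 7–8B model take about 3.5–4 GB before KV cache and runtime overhead.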

Build a ruthless retrieval layer

Embed your codebase with nomic-embed-text. Store in Chroma or FAISS. Deduplicate aggressively — target <100 MB index for a typical repo.
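One simple way to deduplicate before embedding is to hash each chunk after collapsing whitespace and keep only the first occurrence. This is a minimal exact-match sketch with hypothetical helper names; it does not catch near-duplicates, which would need something like MinHash or embedding similarity.

```python
import hashlib


def normalize(text):
    """Collapse runs of whitespace so formatting-only differences match."""
    return " ".join(text.split())


def deduplicate(chunks):
    """Keep the first occurrence of each chunk, compared after normalization."""
    seen = set()
    unique = []
    for chunk in chunks:
        digest = hashlib.sha256(normalize(chunk).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(chunk)
    return unique


chunks = ["def f():  return 1", "def f(): return 1", "def g(): return 2"]
unique_chunks = deduplicate(chunks)
# unique_chunks == ["def f():  return 1", "def g(): return 2"]
```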

Wire the 4-stage pipeline

Skim with tree-sitter AST → Retrieve top-20 chunks → Reason with <4k token context → Verify with linter + tests.
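Between the Retrieve and Reason stages, the ranked chunks have to be packed into the small context budget. The sketch below does this greedily; `pack_context` is a hypothetical helper, and counting whitespace-separated words is only a crude stand-in for a real tokenizer.

```python
def pack_context(ranked_chunks, max_tokens=4000, max_chunks=20, count_tokens=None):
    """Greedily add the best-ranked chunks that still fit the token budget."""
    # Assumption: word count approximates token count when no tokenizer is given.
    count_tokens = count_tokens or (lambda text: len(text.split()))
    picked, used = [], 0
    for chunk in ranked_chunks[:max_chunks]:
        cost = count_tokens(chunk)
        if used + cost > max_tokens:
            continue                      # skip chunks that would overflow
        picked.append(chunk)
        used += cost
    return picked, used


ranked = ["a b c", "d e f g", "h i"]      # best match first, illustrative
picked, used = pack_context(ranked, max_tokens=5)
# picked == ["a b c", "h i"], used == 5
```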

Experiment with KV compression

Try SnapKV or H₂O for eviction-based KV cache reduction. Test Mamba (an SSM) or RWKV (a linear-time RNN) as controller models.

Add MoE domain experts

Fine-tune tiny LoRA adapters for Code, Math, Hardware domains. Load only the 2 active experts per task. Swap inactive experts from disk in <100 ms.
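Keeping only two adapters resident and swapping the rest from disk is essentially a least-recently-used cache. The sketch below assumes a `loader` callable that reads an adapter from storage; `ExpertCache` and the adapter names are hypothetical, and actual swap latency depends on adapter size and storage speed.

```python
from collections import OrderedDict


class ExpertCache:
    """Keep at most `capacity` adapters loaded, evicting the least recently used."""

    def __init__(self, loader, capacity=2):
        self.loader = loader              # hypothetical: loads an adapter by name
        self.capacity = capacity
        self._resident = OrderedDict()

    def get(self, name):
        if name in self._resident:
            self._resident.move_to_end(name)          # mark as recently used
        else:
            if len(self._resident) >= self.capacity:
                self._resident.popitem(last=False)    # evict the oldest
            self._resident[name] = self.loader(name)
        return self._resident[name]

    def resident(self):
        return list(self._resident)


loads = []                                # records every simulated disk load
cache = ExpertCache(lambda name: loads.append(name) or f"adapter:{name}")
for task in ["code", "math", "code", "hardware"]:
    cache.get(task)
# cache.resident() == ["code", "hardware"]; loads == ["code", "math", "hardware"]
```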

Browser runtime via WebGPU

Use WebLLM or transformers.js for in-browser inference. Target 3B INT4 for mobile NPU. Heavy compute runs off the main thread in a Web Worker.

NVIDIA GEAR's GR00T-WholeBodyControl is the open platform behind the Decoupled WBC controllers in Isaac-GR00T (N1.5–N1.7) and the GEAR-SONIC behavior foundation model. It turns large-scale human motion into a single policy that drives a real humanoid — the missing motor system for embodied agents.

SONIC — Behavior Foundation Model

One unified policy learns whole-body motor skills from human motion data. Uses motion tracking as a scalable training task instead of hand-built controllers per behavior.

Decoupled Whole-Body Control

Splits high-level intent from low-level balance/locomotion — natural walking, crawling and dynamic movement with a ready-to-deploy C++ inference stack.

VLA Workflow — Collect → Fine-tune → Deploy

Collect teleop data, fine-tune Isaac-GR00T N1.7 on SONIC latent actions, deploy for vision-language-action control on the Unitree G1.

Real-time VR Teleop + BONES-SEED

PICO VR whole-body teleoperation for data capture, trained on BONES-SEED — 142K+ human motions (~288 hrs) with G1 MuJoCo trajectories.

A future bridge: AgentR/T/B stop at the screen today. GR00T-WBC is the body they could drive — agent intent compiled down to balanced, real-world humanoid motion.

Agent Intent (R/T/B plan) → SONIC Latent (action tokens) → Decoupled WBC (balance + locomotion) → G1 Humanoid (real motion)

🔴 AgentR

Architect → maps onto the kinematic planner: sets goals, milestones and motion targets the WBC stack must satisfy.

🟢 AgentT

Connector → maps onto the ZMQ streaming / teleop interface: routes intent into the live control loop and relays robot state back.

🔵 AgentB

Creator → maps onto the VLA policy: executes vision-language-action skills as concrete whole-body behaviors.

Primary sources — NVIDIA GEAR Team, Apache 2.0 / NVIDIA Open Model.

📚 Documentation

Install, tutorials, references

💻 GitHub

NVlabs source repository

📄 Paper

arXiv 2511.07820

🤗 GEAR-SONIC

Checkpoints + training code

🌐 Live Web Demo

Try SONIC in-browser

📦 BONES-SEED

142K+ motions, ~288 hrs

Same problem, opposite directions. Claude Fable 5 is a single model with reasoning, safety and self-verification built in. The AgentR runtime is a workflow engine that wraps any model with Fable-5-style behaviors.

Native intelligence

Claude Fable 5 — a brain

Long-horizon reasoning built in

Latent planning & safety routing

Native repo ingestion

Self-verification & large context

Workflow engine

AgentR Runtime — a nervous system

Model-agnostic (Claude, Copilot, …)

Planning externalized to task files

Safety in editable scorecards

Repo ingestion via scripts

The brain's intelligence is native, so it's faster and more coherent. The nervous system is modular, transparent and reproducible — every stage is inspectable and swappable.

Five axes, head to head.

1 · Intelligence Model

Fable 5: Monolithic model, 1M-token context, built-in planning.

AgentR: Wraps any LLM; planning & safety live in files.

Strength: swap models, upgrade parts. Trade-off: native intelligence is more coherent.

2 · Autonomy Loop

Fable 5: Plan → execute → verify → replan, latent.

AgentR: task.md → reflection.md → scorecard.md → next-action.md.

Strength: every stage inspectable. Trade-off: latent loop is faster.
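A file-based loop of this shape could look like the sketch below, where each stage reads the previous stage's text and writes its own file. Only the four file names come from the description above; `run_cycle` and the `step` callable (a stand-in for a model call) are hypothetical and do not describe the actual AgentR runtime.

```python
import tempfile
from pathlib import Path

STAGES = ["task.md", "reflection.md", "scorecard.md", "next-action.md"]


def run_cycle(workdir, task_text, step):
    """Write task.md, then derive each later stage file from the previous one."""
    workdir = Path(workdir)
    previous = task_text
    (workdir / STAGES[0]).write_text(previous, encoding="utf-8")
    for name in STAGES[1:]:
        previous = step(name, previous)   # assumed: model call per stage
        (workdir / name).write_text(previous, encoding="utf-8")
    return previous                       # contents of next-action.md


def stub_step(stage_name, previous_text):
    # Placeholder for a real model call; just records the lineage.
    return f"[{stage_name}] based on: {previous_text}"


with tempfile.TemporaryDirectory() as tmp:
    next_action = run_cycle(tmp, "Fix the failing test", stub_step)
    written = sorted(p.name for p in Path(tmp).iterdir())
```

Because every stage lands on disk as plain text, each intermediate result can be read, edited or diffed before the next cycle starts.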

3 · Safety & Guardrails

Fable 5: Safety routing, Opus fallback, internal filters.

AgentR: Scorecards, reflection & guardrail prompts.

Strength: editable & auditable. Trade-off: embedded safety is harder to bypass.

4 · Repo Awareness

Fable 5: Native repo graph, multi-file refactors.

AgentR: Loads prompts/task files; reads more via scripts.

Strength: predictable & explicit. Trade-off: no built-in repo graph yet.

5 · Speed & Efficiency

Fable 5: Internal caching, intrinsic speed.

AgentR: Bound by external model + file I/O + CLI.

Strength: upgrade speed by swapping runtime. Trade-off: Fable 5's speed is built in.

Five upgrades that would let the runtime match or surpass Fable-5-style autonomy.

Repo graph builder

Parse the repo into dependency graph, file map, call graph and change-impact map.

Memory module

Persist decisions, constraints, previous runs and learned patterns across sessions.

Tool execution layer

Let the agent run code, run tests, lint, compile and benchmark in-loop.

Multi-agent mode

Agents debate, critique and merge outputs — R/T/B as a real ensemble.

Long-horizon planner

Generate milestones, subgoals, checkpoints and verification gates.
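Of the upgrades above, the repo graph builder is the easiest to prototype. The sketch below builds a module-level dependency graph and a change-impact set for Python sources using the standard `ast` module. It is a minimal illustration with hypothetical function names and an in-memory toy repo; it covers imports only, not call graphs or other languages.

```python
import ast


def import_graph(sources):
    """Map each module name to the set of local modules it imports."""
    graph = {}
    for module, code in sources.items():
        deps = set()
        for node in ast.walk(ast.parse(code)):
            if isinstance(node, ast.Import):
                deps.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                deps.add(node.module.split(".")[0])
        # Keep only dependencies that live inside this repo.
        graph[module] = {d for d in deps if d in sources and d != module}
    return graph


def impacted_by(graph, changed):
    """Modules that directly or transitively import `changed`."""
    impacted, frontier = set(), {changed}
    while frontier:
        frontier = {m for m, deps in graph.items() if deps & frontier} - impacted
        impacted |= frontier
    return impacted


# Toy repository held in memory: module name -> source text.
sources = {
    "app": "import api\nimport os\n",
    "api": "from db import connect\n",
    "db": "import sqlite3\n",
}
graph = import_graph(sources)
impact = impacted_by(graph, "db")
# graph == {"app": {"api"}, "api": {"db"}, "db": set()}; impact == {"api", "app"}
```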

Where it already wins

Modular architecture — plug in Claude, Copilot, Gemini, Llama.

Transparent reasoning — every step visible, editable, logged.

Customizable workflows & an extensible skill system.

Deterministic, reproducible behavior.
