Summary of "NVIDIA GTC Taipei 2026 Keynote | Live"
Technological concepts & product/strategy highlights
1) “Useful AI” and agentic AI as the new computing wave
- Jensen frames AI’s progression as: generative AI → useful AI → agentic AI.
- Key claim: agentic systems are arriving and can do productive work (not just generate text).
- Agents are described as an architecture that includes:
- LLM(s) for “thinking”
- a harness for orchestration (routing context → plan → tool calls → execution)
- tools (e.g., browsers, databases, compilers, data engines)
- memory (working memory / “KV caching” and long-term retrieval)
- a runtime/security harness that makes the whole system operate safely.
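The five parts listed above can be sketched as a toy agent loop. This is a minimal illustration under stated assumptions, not NVIDIA's design: the `Agent` class, the keyword-based `think` stub (standing in for an LLM), the `calculator` tool, and the allow-list policy are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Agent:
    tools: dict[str, Callable[[str], str]]           # the agent's toolbelt
    allowed: set[str]                                # runtime policy: permitted tools
    memory: list[str] = field(default_factory=list)  # stand-in for long-term memory

    def think(self, intent: str) -> list[tuple[str, str]]:
        """Stand-in for the LLM: map an intent to (tool, argument) steps."""
        plan = []
        if "sum" in intent:
            plan.append(("calculator", intent))
        if "save" in intent:
            plan.append(("database", intent))
        return plan

    def run(self, intent: str) -> list[str]:
        """Harness: plan, check each call against policy, execute, remember."""
        results = []
        for tool, arg in self.think(intent):
            if tool not in self.allowed:
                results.append(f"{tool}: blocked by policy")
                continue
            output = self.tools[tool](arg)
            self.memory.append(f"{tool} -> {output}")
            results.append(output)
        return results


def calculator(text: str) -> str:
    """Toy tool: add up every integer that appears in the text."""
    return str(sum(int(word) for word in text.split() if word.isdigit()))


agent = Agent(
    tools={"calculator": calculator, "database": lambda text: "stored"},
    allowed={"calculator"},
)
print(agent.run("sum 2 3 and save"))
print(agent.memory)
```

Here the policy check plays the role of the runtime/security harness: the database call is planned but never executed because it is not on the allow-list.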
2) Tokens as the economic driver
- “Tokens” are presented as profitable units that drive demand:
- If each token an AI system generates is profitable, companies want to generate more tokens
- This increases compute demand, which (per Jensen) is reflected in ecosystem activity in Taiwan.
- NVIDIA’s repeated metric focus: performance per watt = revenue per watt (compute as revenue).
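The "performance per watt = revenue per watt" framing can be made concrete with a little arithmetic. The sketch below assumes a fixed facility power budget and a flat price per token; the function name and every number are illustrative, not figures from the keynote.

```python
def revenue_per_second(power_budget_w: float,
                       tokens_per_joule: float,
                       usd_per_million_tokens: float) -> float:
    """Revenue rate for a power-limited AI factory (illustrative model only)."""
    # watts * tokens/joule = tokens per second, since 1 W = 1 J/s
    tokens_per_second = power_budget_w * tokens_per_joule
    return tokens_per_second * usd_per_million_tokens / 1_000_000


# Hypothetical numbers: same 1 MW budget, same price, doubled efficiency.
baseline = revenue_per_second(1_000_000, 5, 2.0)
improved = revenue_per_second(1_000_000, 10, 2.0)
print(baseline, improved, improved / baseline)
```

Under these assumptions, doubling tokens per joule doubles revenue from the same power budget, which is the sense in which efficiency is treated as revenue.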
3) New “computing pattern”: from apps to agents
- Traditional model: launch app → click/type → get output.
- New model: express intent, and the AI generates code or uses tools to produce the output.
- Breakthrough emphasized: tool use and tool orchestration, enabling agents to function like software coworkers.
4) NVIDIA CUDA-X libraries as “tools for agents”
- NVIDIA positions its library collection as an agent “toolbelt,” calling them CUDA-X.
- Examples named:
- cuLitho (computational lithography)
- cuOpt (decision optimization)
- cuDSS (direct sparse solvers)
- AI-Q (deep research across documents)
- Aerial (AI RAN)
- PhysicsNeMo (differentiable physics)
- Parabricks (genomics)
- “Skills”/documentation concept: libraries ship with usage “skills” so agents can learn how to use them effectively.
5) Disaggregated + distributed agent runtime (rack/data center orchestration)
- Agents are described as the ultimate disaggregated/distributed computing model, where:
- LLM context/logic runs on GPUs
- tool execution can involve CPUs
- security harness runs on a DPU (BlueField)
- orchestration runs on CPUs
- Memory complexity is highlighted as a major challenge:
- KV caching, compression, retrieval of structured vs unstructured data
- “ontology/relationships” for retrieval
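The placement described above can be written down as a simple lookup. This is only a sketch of the mapping as summarized here; the step labels and the `group_by_processor` helper are hypothetical names, not part of any NVIDIA software.

```python
from collections import defaultdict

# Placement as described in the summary; the keys are illustrative labels.
PLACEMENT = {
    "llm_inference": "GPU",
    "tool_execution": "CPU",
    "orchestration": "CPU",
    "security_harness": "DPU",
}


def group_by_processor(steps: list[str]) -> dict[str, list[str]]:
    """Group an agent's steps by the processor class that would run them."""
    groups: dict[str, list[str]] = defaultdict(list)
    for step in steps:
        groups[PLACEMENT[step]].append(step)
    return dict(groups)


trace = ["orchestration", "security_harness", "llm_inference",
         "tool_execution", "llm_inference"]
print(group_by_processor(trace))
```

Even a single agent request touches all three processor classes, which is why the summary calls the model disaggregated.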
Major product announcements & infrastructure features
6) Vera Rubin: the next-generation agentic supercomputer (full production)
- Vera Rubin is positioned as:
- not just a GPU system
- a multi-rack, pod-scale agent processing system
- Claims include:
- “Vera Rubin in full production”
- supply chain expansion and faster assembly:
- rack assembly time reduced from ~2 hours (the Grace Blackwell figure, per the narration) to 5 minutes
- Architectural components described across a rack/pod:
- Vera NVL72 GPU rack/tray for high-throughput token generation
- Vera CPU rack (liquid-cooled) for orchestrating harnesses/tools and critical-path work
- Vera BlueField-4 for storage processing and security/context memory
- ConnectX-9 / SuperNICs and DPUs
- liquid-cooled bus bars and modular tray design for scalability/resiliency
- a specialized Ethernet switch concept (co-packaged optics mentioned)
7) Vera CPU: CPUs re-designed for agent latency/interaction
- Central message: older CPUs were built for “humans”; agents need nanosecond-level responsiveness.
- Vera is described with four design requirements:
- very high single-thread performance / IPC (instructions per clock)
- high bandwidth per core
- very high total bandwidth (including a “fabric” connecting cores)
- energy efficiency (so CPU capacity scales without stealing token-generator power)
- Specific implementation details mentioned:
- Olympus core architecture for modern data center workloads (Python/tool/sandbox execution)
- LPDDR5X with multi-error correction while maintaining bandwidth
- scalable coherence fabric (monolithic mesh unifying cores)
- “LPDDR5X vs x86” comparison claims:
- ~40% lower peak memory latency vs x86 (per narration)
- agent sandbox performance improvements (stated as 1.8× vs x86 in the narration)
- Real-world workload speedups cited:
- SQL running 3× faster
- real-time stream processing ~6× faster (example: NYSE telemetry)
8) DSX AI Factories: blueprint + simulation + operations stack
The video includes a long product narrative around NVIDIA’s DSX:
- DSX Sim (Omniverse blueprint / digital twin):
- partners design/validate a Vera Rubin AI factory layout
- simulate power/cooling, network design, integration testing, and changes in a digital environment
- DSX OS:
- provisions, operates, monitors, remediates infrastructure
- supports multi-tenant, resilient, AI-ready capacity
- Power/cooling/throughput features highlighted:
- AI factories may overprovision power by up to 40% today; DSX MaxLPS aims to reduce waste.
- DSX MaxLPS lets operators safely deploy more GPUs under the same power budget
- hot liquid cooling around 45°C (claimed water/energy efficiency)
- dynamic power allocation to shift power rack-to-rack (“recover stranded watts”)
- in-rack power smoothing to reduce spikes
- DSX Flex:
- cooperates with the grid using real-time grid signals
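The idea of shifting power rack-to-rack to "recover stranded watts" can be illustrated with a small allocation routine. This is a sketch under stated assumptions, not how DSX works: the `reallocate` function, the rack names, and the kilowatt figures are all hypothetical.

```python
def reallocate(caps: dict[str, int], demand: dict[str, int]) -> dict[str, int]:
    """Shift unused power from racks under their cap to racks that want more.

    The total allocation never exceeds sum(caps), the overall power budget.
    """
    # Start every rack at min(cap, demand); the rest of its cap is stranded.
    alloc = {rack: min(caps[rack], demand[rack]) for rack in caps}
    stranded = sum(caps.values()) - sum(alloc.values())
    # Hand stranded power to racks whose demand exceeds their cap, in name order.
    for rack in sorted(caps):
        extra = min(demand[rack] - alloc[rack], stranded)
        alloc[rack] += extra
        stranded -= extra
    return alloc


# Hypothetical figures in kW: three racks with equal static caps.
caps = {"rack_a": 100, "rack_b": 100, "rack_c": 100}
demand = {"rack_a": 60, "rack_b": 130, "rack_c": 120}
print(reallocate(caps, demand))
```

With static caps, rack_a would leave 40 kW unused while the other two racks are throttled; the dynamic version hands that headroom over without raising the total budget.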
- Economic framing:
- “Compute is revenue” and “lowest-cost tokens”
- DSX aims at faster time to first token / inference / training and higher tokens per watt
- Hardware/software emphasis:
- DSX is portrayed as an end-to-end co-designed stack for maximum profitability and reliability.
9) NVIDIA Agent Toolkit for Enterprise AI (Open models + harness + runtime)
- Jensen lists “4 things” for enterprise agents:
- Models
- Harness (orchestration)
- Tools/skills (e.g., CUDA-X libraries with agent-consumable skills)
- Runtime (secure “operating system” for agents)
- Key runtime/security product:
- OpenShell:
- sandboxing agents under enterprise security policies
- privacy and identity protection emphasized
- positioned as open source and being adopted (Red Hat, Canonical, Microsoft mentioned)
- Named components/harnesses:
- Claude Code and Codex as agent tools
- OpenShell as the runtime, with OpenClaw and Hermes mentioned as additional agent harnesses
10) Chip design example: Cadence + NVIDIA agents + Nemotron
- A “design verification agent” is described as speeding up the RTL verification loop:
- Codex orchestrates
- Nemotron models plus a secure environment built on NVIDIA OpenShell
- uses subagents for RTL generation, testbench creation, regression, debug
- uses Cadence tools (e.g., Xcelium, JasperGold) in loops
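The verification loop described above (generate RTL, build a testbench, run regressions, debug, repeat) can be sketched as a plain control loop. Every callable here is a hypothetical stand-in for a subagent or tool run; none of these names correspond to a real Cadence, Codex, or NVIDIA API.

```python
from typing import Callable


def verification_loop(generate_rtl: Callable[[str], str],
                      build_testbench: Callable[[str], str],
                      run_regression: Callable[[str, str], list[str]],
                      debug: Callable[[str, list[str]], str],
                      spec: str,
                      max_iterations: int = 10) -> tuple[str, int]:
    """Iterate generate -> test -> debug until the regression is clean."""
    rtl = generate_rtl(spec)
    testbench = build_testbench(spec)
    for iteration in range(1, max_iterations + 1):
        failures = run_regression(rtl, testbench)
        if not failures:
            return rtl, iteration
        rtl = debug(rtl, failures)  # debug subagent proposes a fix
    raise RuntimeError("regression still failing after max_iterations")


# Toy stubs: the "design" is a string and each " bug" is one failing test.
rtl, rounds = verification_loop(
    generate_rtl=lambda spec: f"{spec} bug bug",
    build_testbench=lambda spec: f"tb for {spec}",
    run_regression=lambda rtl, tb: ["mismatch"] * rtl.count("bug"),
    debug=lambda rtl, failures: rtl.replace(" bug", "", 1),
    spec="adder",
)
print(rtl, rounds)
```

The iteration cap is a design choice added for the sketch so that a design that never converges fails loudly instead of looping forever.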
- Performance claim:
- verification cycles over 40× faster
- “weeks to hours” (explicitly repeated)
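The two claims above are consistent with each other, as a quick calculation shows. The two-week baseline is an assumed figure chosen for illustration; the keynote summary does not give one.

```python
HOURS_PER_WEEK = 7 * 24


def sped_up_hours(baseline_weeks: float, speedup: float) -> float:
    """Hours a cycle takes after applying a speedup factor."""
    return baseline_weeks * HOURS_PER_WEEK / speedup


# Hypothetical baseline: a two-week cycle at the claimed 40x speedup.
print(sped_up_hours(2, 40))
```

Under that assumption a two-week cycle drops to under nine hours, which matches the "weeks to hours" wording.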
11) Nemotron 3 Ultra (open model) for agent workflows
- Announcement of Nemotron 3 Ultra:
- 5× faster and ~30% cheaper
- hybrid architecture: state space model (SSM) + mixture of experts (MoE)
- claims: open model system includes training data and scripts (“open models at their best”)
- Mentions ongoing work toward Nemotron 4.
“Reviews/guides/tutorials” style content
- The keynote is not a tutorial or review (there are no product scorecards), but it does offer an “implementation blueprint” view:
- DSX is effectively an operational guide: simulate → validate in digital twin → provision/operate with DSX OS.
- The Agent Toolkit is a conceptual guide: models + harness + tools/skills + runtime (OpenShell).
Additional platform directions mentioned (AI beyond the data center)
RTX Spark: reinvention of the PC for agents
- Hardware described: RTX + unified memory + Grace CPU + NVLink
- Includes partner ecosystem (e.g., Microsoft/MediaTek)
- Agent-friendly updates mentioned:
- Adobe Photoshop/Premiere
- MCP server support
Physical AI
- Cosmos 3 (physical-world omni-model): world model/simulator/action-conditioned future prediction
- Alpamayo 2 Super (reasoning autonomous vehicle model): tied to DRIVE Hyperion + Halos OS
- Isaac GR00T humanoid robotics platform:
- humanoid robot reference for research
- simulation/teleoperation/data generation stack
Main speakers / sources
- Jensen Huang (NVIDIA founder & CEO) — primary speaker
- Janine — stage host/introducer (spoken line: “Welcome to the stage…”)
- Additional source elements: VIDEO NARRATION segments (DSX, Vera Rubin, Vera CPU details, Agent Toolkit, Cadence verification, Cosmos/Alpamayo/Isaac GR00T announcements)
Category
Technology