Summary of "NVIDIA GTC Taipei 2026 Keynote | Live"

Technological concepts & product/strategy highlights

1) “Useful AI” and agentic AI as the new computing wave

Jensen frames AI’s progression as: generative AI → useful AI → agentic AI.
Key claim: agentic systems are arriving and can do productive work (not just generate text).
Agents are described as an architecture that includes:
- LLM(s) for “thinking”
- a harness for orchestration (routing context → plan → tool calls → execution)
- tools (e.g., browsers, databases, compilers, data engines)
- memory (working memory / “KV caching” and long-term retrieval)
- a runtime/security harness that makes the whole system operate safely.
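
The components listed above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not NVIDIA's implementation: `toy_model`, `Harness`, `TOOLS`, and the `allowed` policy set are hypothetical names, and the model is a stub that returns a fixed plan.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

Plan = List[Tuple[str, str]]  # ordered (tool name, argument) pairs


def toy_model(intent: str, memory: List[str]) -> Plan:
    """Hypothetical stand-in for an LLM; returns a fixed two-step plan."""
    return [("search", intent), ("summarize", "search results")]


# Hypothetical tools; real agents would call browsers, databases, compilers.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda query: f"results for {query}",
    "summarize": lambda text: f"summary of {text}",
}


@dataclass
class Harness:
    model: Callable[[str, List[str]], Plan]
    tools: Dict[str, Callable[[str], str]]
    allowed: Set[str]  # runtime policy: tools this agent may call
    memory: List[str] = field(default_factory=list)  # working memory

    def run(self, intent: str) -> str:
        plan = self.model(intent, self.memory)  # context -> plan
        result = ""
        for name, arg in plan:  # plan -> tool calls
            if name not in self.allowed:  # policy check before execution
                raise PermissionError(f"tool not permitted: {name}")
            result = self.tools[name](arg)  # execution
            self.memory.append(f"{name}({arg}) -> {result}")
        return result


agent = Harness(toy_model, TOOLS, allowed={"search", "summarize"})
print(agent.run("GPU rack power budgets"))
```

Keeping the policy check inside the harness loop means every tool call passes through it; narrowing `allowed` to `{"search"}` makes the same plan fail with `PermissionError` at the second step.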

2) Tokens as the economic driver

“Tokens” are presented as the profitable unit of output that drives demand:
- The more output AI systems produce, the more tokens companies want to generate
- More tokens mean more compute demand, which (per Jensen) is reflected in ecosystem activity in Taiwan.
NVIDIA’s recurring metric: performance per watt translates into revenue per watt (“compute is revenue”).
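
The link between performance per watt and revenue per watt can be made concrete with a little arithmetic. The sketch below assumes a facility with a fixed power budget and a flat price per token; the function name and all numbers are illustrative, not figures from the keynote.

```python
def revenue_per_second(power_watts: float, tokens_per_joule: float,
                       price_per_token: float) -> float:
    """Revenue rate for a facility running at a fixed power draw."""
    # 1 watt = 1 joule per second, so watts * tokens/joule = tokens/second.
    tokens_per_second = power_watts * tokens_per_joule
    return tokens_per_second * price_per_token


# Illustrative inputs only: a 1 MW budget and $1 per million tokens.
base = revenue_per_second(1_000_000, tokens_per_joule=2.0, price_per_token=1e-6)
better = revenue_per_second(1_000_000, tokens_per_joule=4.0, price_per_token=1e-6)
print(f"{base:.2f} -> {better:.2f} dollars per second")
```

With power held constant, doubling tokens per joule doubles the revenue rate, which is the sense in which efficiency and revenue move together under this simplified model.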

3) New “computing pattern”: from apps to agents

Traditional model: launch app → click/type → get output.
New model: express intent, and the AI generates code or uses tools to produce the output.
Breakthrough emphasized: tool use and tool orchestration, enabling agents to function like software coworkers.

4) NVIDIA CUDA-X libraries as “tools for agents”

NVIDIA positions its library collection as an agent “toolbelt,” calling them CUDA-X.
Examples named:
- cuLitho (computational lithography)
- cuOpt (decision optimization)
- cuDSS (direct sparse solvers)
- AI-Q (deep research across documents)
- Aerial (AI RAN)
- PhysicsNeMo (differentiable physics)
- Parabricks (genomics)
“Skills”/documentation concept: libraries ship with usage “skills” so agents can learn how to use them effectively.

5) Disaggregated + distributed agent runtime (rack/data center orchestration)

Agents are described as the ultimate disaggregated/distributed computing model, where:
- LLM context/logic runs on GPUs
- tool execution can involve CPUs
- security harness runs on a DPU (BlueField)
- orchestration runs on CPUs
Memory complexity is highlighted as a major challenge:
- KV caching, compression, retrieval of structured vs unstructured data
- “ontology/relationships” for retrieval
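
The component-to-processor split described in this section can be written down as a simple lookup table. This is only a sketch of the mapping as summarized here; the `Processor` enum, the `PLACEMENT` keys, and `group_by_processor` are hypothetical names, and the summary says tool execution "can involve" CPUs rather than always running there.

```python
from enum import Enum
from typing import Dict, List


class Processor(Enum):
    GPU = "GPU"
    CPU = "CPU"
    DPU = "DPU (BlueField)"


# Placement as described in the summary; key names are illustrative.
PLACEMENT: Dict[str, Processor] = {
    "llm_context_and_logic": Processor.GPU,
    "tool_execution": Processor.CPU,
    "security_harness": Processor.DPU,
    "orchestration": Processor.CPU,
}


def group_by_processor(placement: Dict[str, Processor]) -> Dict[Processor, List[str]]:
    """Invert the table: which agent components share each processor type."""
    groups: Dict[Processor, List[str]] = {}
    for component, processor in placement.items():
        groups.setdefault(processor, []).append(component)
    return groups


for processor, components in group_by_processor(PLACEMENT).items():
    print(processor.value, "->", ", ".join(components))
```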

Major product announcements & infrastructure features

6) Vera Rubin: the next-generation agentic supercomputer (full production)

Vera Rubin is positioned as:
- not just a GPU system
- a multi-rack, pod-scale agent processing system
Claims include:
- “Vera Rubin in full production”
- supply chain expansion and faster assembly:
  - rack assembly time cut from ~2 hours to 5 minutes (compared with Grace Blackwell, per the narration)
Architectural components described across a rack/pod:
- Vera NVL72 GPU rack/tray for high-throughput token generation
- Vera CPU rack (liquid-cooled) for orchestrating harnesses/tools and critical-path work
- Vera BlueField-4 for storage processing and security/context memory
- ConnectX-9 / SuperNICs and DPUs
- liquid-cooled bus bars and modular tray design for scalability/resiliency
- a specialized Ethernet switch concept (co-packaged optics mentioned)

7) Vera CPU: CPUs re-designed for agent latency/interaction

Central message: older CPUs were built for “humans”; agents need nanosecond-level responsiveness.
Vera is described with four design requirements:
1. very high single-thread performance / IPC (instructions per clock)
2. high bandwidth per core
3. very high total bandwidth (including a “fabric” connecting cores)
4. energy efficiency (so CPU capacity scales without stealing token-generator power)
Specific implementation details mentioned:
- Olympus core architecture for modern data center workloads (Python/tool/sandbox execution)
- LPDDR5X with multi-error correction while maintaining bandwidth
- scalable coherence fabric (monolithic mesh unifying cores)
- “LPDDR5X vs x86” comparison claims:
  - ~40% lower peak memory latency vs x86 (per narration)
- agent sandbox performance improvements (stated as 1.8× vs x86 in the narration)
Real-world workload speedups cited:
- SQL running 3× faster
- real-time stream processing ~6× faster (example: NYSE telemetry)

8) DSX AI Factories: blueprint + simulation + operations stack

The video includes a long product narrative around NVIDIA’s DSX:

DSX Sim (Omniverse blueprint / digital twin):
- partners design/validate a Vera Rubin AI factory layout
- simulate power/cooling, network design, integration testing, and changes in a digital environment
DSX OS:
- provisions, operates, monitors, remediates infrastructure
- supports multi-tenant, resilient, AI-ready capacity
Power/cooling/throughput features highlighted:
- AI factories may overprovision power by up to 40% today; DSX MaxLPS aims to reduce waste.
- DSX MaxLPS lets operators safely deploy more GPUs under the same power budget
- hot liquid cooling around 45°C (claimed water/energy efficiency)
- dynamic power allocation to shift power rack-to-rack (“recover stranded watts”)
- in-rack power smoothing to reduce spikes
- DSX Flex:
  - cooperate with the grid using real-time grid signals
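
The idea of recovering stranded watts by shifting power rack-to-rack can be illustrated with a toy allocator. This is a sketch under simple assumptions (fixed total budget, known per-rack demand, proportional scaling when demand exceeds the budget); it is not how DSX allocates power, and all names and numbers are hypothetical.

```python
from typing import Dict


def stranded_watts(caps: Dict[str, float], demand: Dict[str, float]) -> float:
    """Power reserved for racks that are not using their full cap."""
    return sum(max(0.0, caps[r] - demand[r]) for r in caps)


def throttled_watts(caps: Dict[str, float], demand: Dict[str, float]) -> float:
    """Power that racks want but cannot draw because of their cap."""
    return sum(max(0.0, demand[r] - caps[r]) for r in caps)


def reallocate(budget: float, demand: Dict[str, float]) -> Dict[str, float]:
    """Set caps from demand; scale all racks down evenly if over budget."""
    total = sum(demand.values())
    if total <= budget:
        return dict(demand)
    scale = budget / total
    return {rack: watts * scale for rack, watts in demand.items()}


# Illustrative numbers in kW: a 300 kW budget split evenly across three racks.
static_caps = {"rack_a": 100.0, "rack_b": 100.0, "rack_c": 100.0}
demand = {"rack_a": 60.0, "rack_b": 120.0, "rack_c": 110.0}

print("static stranded:", stranded_watts(static_caps, demand))
print("static throttled:", throttled_watts(static_caps, demand))

dynamic_caps = reallocate(300.0, demand)
print("dynamic throttled:", throttled_watts(dynamic_caps, demand))
```

In this example the static split leaves headroom unused on one rack while two others are capped below their demand; setting caps from demand removes the throttling without raising the total budget.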
Economic framing:
- “Compute is revenue” and “lowest-cost tokens”
- DSX aims at faster time to first token / inference / training and higher tokens per watt
Hardware/software emphasis:
- DSX is portrayed as an end-to-end co-designed stack for maximum profitability and reliability.

9) NVIDIA Agent Toolkit for Enterprise AI (Open models + harness + runtime)

Jensen lists “4 things” for enterprise agents:
1. Models
2. Harness (orchestration)
3. Tools/skills (e.g., CUDA-X libraries with agent-consumable skills)
4. Runtime (secure “operating system” for agents)
Key runtime/security product:
- OpenShell:
  - sandboxing agents under enterprise security policies
  - privacy and identity protection emphasized
  - positioned as open source and being adopted (Red Hat, Canonical, Microsoft mentioned)
Named components/harnesses:
- Claude Code and Codex as agent tools
- runtime and harness examples: OpenShell as the runtime; OpenClaw and Hermes mentioned as additional agent harnesses

10) Chip design example: Cadence + NVIDIA agents + Nemotron

A “design verification agent” is described as speeding up the RTL verification loop:
- Codex orchestrates
- Nemotron models run in a secure environment provided by NVIDIA OpenShell
- uses subagents for RTL generation, testbench creation, regression, debug
- uses Cadence tools (e.g., Xcelium, JasperGold) in loops
Performance claim:
- verification cycles over 40× faster
- “weeks to hours” (explicitly repeated)
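
A quick check shows how a 40× speedup lines up with the "weeks to hours" phrasing. The two-week baseline below is an assumption chosen for illustration; the summary does not state the original cycle length.

```python
def accelerated_hours(baseline_hours: float, speedup: float) -> float:
    """Wall-clock time after applying a uniform speedup factor."""
    return baseline_hours / speedup


# Assumed baseline: a two-week verification cycle, counted in wall-clock hours.
two_weeks_hours = 14 * 24
print(accelerated_hours(two_weeks_hours, 40))
```

Under that assumption, a two-week cycle (336 hours) shrinks to under nine hours, so the two claims are at least consistent with each other.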

11) Nemotron 3 Ultra (open model) for agent workflows

Announcement of Nemotron 3 Ultra:
- 5× faster and ~30% cheaper
- hybrid architecture: SSM (State Space Models) + Mixture of Experts
- claims: open model system includes training data and scripts (“open models at their best”)
Mentions ongoing work toward Nemotron 4.

“Reviews/guides/tutorials” style content

The keynote is not a tutorial or review (there are no product scorecards), but it does offer an “implementation blueprint” view:
- DSX is effectively an operational guide: simulate → validate in digital twin → provision/operate with DSX OS.
- The Agent Toolkit is a conceptual guide: models + harness + tools/skills + runtime (OpenShell).

Additional platform directions mentioned (AI beyond the data center)

RTX Spark: reinvention of the PC for agents

Combines RTX, unified memory, a Grace CPU, and NVLink
Includes partner ecosystem (e.g., Microsoft/MediaTek)
Agent-friendly updates mentioned:
- Adobe Photoshop/Premiere
- MCP server support

Physical AI

Cosmos 3 (physical-world omni-model): world model/simulator/action-conditioned future prediction
Alpamayo 2 Super (reasoning autonomous vehicle model): tied to DRIVE Hyperion + Halos OS
Isaac GR00T humanoid robotics platform:
- humanoid robot reference for research
- simulation/teleoperation/data generation stack

Main speakers / sources

Jensen Huang (NVIDIA founder & CEO) — primary speaker
Janine — stage host/introducer (spoken line: “Welcome to the stage…”)
Additional source elements: video narration segments (DSX, Vera Rubin, Vera CPU details, Agent Toolkit, Cadence verification, Cosmos/Alpamayo/Isaac GR00T announcements)
