NVIDIA has announced its next-generation AI chip, Rubin CPX, built to handle complex, long-context tasks like long-form video generation and million-token software coding.
TLDR:
- Rubin CPX is a new AI-focused GPU designed to process high-volume, long-context tasks like generative video and code creation.
- It is built on NVIDIA’s next-gen Rubin architecture, the successor to the current Blackwell tech.
- Rubin CPX is part of the Vera Rubin NVL144 CPX platform, offering up to 8 exaflops of AI compute and 100TB of fast memory.
- Companies can earn up to $5 billion in token revenue for every $100 million invested in Rubin CPX systems, according to NVIDIA.
What Happened?
NVIDIA has announced its latest AI hardware innovation, the Rubin CPX GPU, expected to launch by the end of 2026. The chip is designed for large-scale inference and generation tasks, especially data-heavy workloads like generative video and advanced code creation. It’s built to process context windows of up to 1 million tokens, a major leap beyond what current GPUs can handle.
This marks NVIDIA’s next big step in advancing AI hardware performance and monetization potential, especially as demand grows for more intelligent and scalable AI systems.

Rubin CPX is Built for Long-Context AI
With the rise of complex AI applications, traditional GPUs are struggling to meet the demands of million-token inference. An hour of generative video, for example, can require up to 1 million tokens of context, far beyond what standard GPUs were built to handle. That’s where Rubin CPX comes in.
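A quick back-of-the-envelope sketch shows how video balloons into million-token territory. The frame rate and per-frame token count below are purely hypothetical assumptions chosen for illustration, not NVIDIA figures:

```python
# Rough estimate of tokens needed to represent an hour of video.
# FRAMES_PER_SECOND and TOKENS_PER_FRAME are hypothetical values,
# not NVIDIA's numbers; they just show how the total scales.
FRAMES_PER_SECOND = 24
SECONDS_PER_HOUR = 3600
TOKENS_PER_FRAME = 12  # assumed tokens a model spends per frame

frames = FRAMES_PER_SECOND * SECONDS_PER_HOUR  # 86,400 frames per hour
tokens = frames * TOKENS_PER_FRAME             # just over 1 million tokens

print(f"{frames:,} frames -> {tokens:,} tokens")
```

Even with modest per-frame assumptions, an hour of footage lands right around the million-token mark that Rubin CPX targets.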
The new chip will power high-volume workloads by integrating the full processing pipeline, including video decoding, encoding, and long-context inference, into a single GPU. This makes Rubin CPX a complete solution for AI developers building next-gen tools for video editing, cinematic content creation, and intelligent code assistance.
Highlights of the Rubin CPX Platform:
- Up to 30 petaflops of compute with NVFP4 precision
- 128GB GDDR7 memory for fast, cost-efficient processing
- 3x faster attention performance compared to NVIDIA GB300 NVL72
- Part of the Vera Rubin NVL144 CPX system, which delivers 8 exaflops of AI compute and 100TB of memory
- Integrated with 1.7 petabytes per second memory bandwidth in a single rack
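The listed figures roughly cross-check against each other. Assuming the "NVL144" name means 144 Rubin CPX GPUs per rack (an inference from the platform name, not a stated spec), the CPX units alone would account for a bit over half of the 8-exaflop total, with the rack's standard Rubin GPUs supplying the rest:

```python
# Sanity check: per-GPU spec vs. rack-level spec.
# Assumes 144 Rubin CPX GPUs per Vera Rubin NVL144 CPX rack,
# inferred from the platform name rather than a published count.
CPX_GPUS_PER_RACK = 144
NVFP4_PFLOPS_PER_CPX = 30   # per-GPU figure from the spec list
RACK_EXAFLOPS_TOTAL = 8     # rack-level figure from the spec list

cpx_exaflops = CPX_GPUS_PER_RACK * NVFP4_PFLOPS_PER_CPX / 1000
share = cpx_exaflops / RACK_EXAFLOPS_TOTAL

print(f"CPX GPUs alone: {cpx_exaflops:.2f} EF ({share:.0%} of the rack total)")
```

That leaves the remaining compute to come from the rack's non-CPX Rubin GPUs, which is consistent with the platform pairing both chip types.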
Enterprise and Developer Benefits
NVIDIA estimates that every $100 million invested in Rubin CPX systems could yield $5 billion in token revenue, showcasing just how lucrative long-context processing can be. The platform also works with existing deployments: a dedicated compute tray lets customers reuse their current Vera Rubin NVL144 systems.
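NVIDIA's revenue claim boils down to a single multiple. Taking the company's projected figures at face value, a one-line sanity check:

```python
# NVIDIA's stated economics: up to $5B in token revenue
# per $100M of Rubin CPX infrastructure deployed.
investment = 100_000_000          # $100 million capex
projected_revenue = 5_000_000_000  # $5 billion token revenue (NVIDIA's projection)

multiple = projected_revenue / investment
print(f"Implied revenue multiple: {multiple:.0f}x")
```

A 50x revenue multiple is a projection, not a guarantee, but it explains why NVIDIA is framing long-context inference as a monetization story and not just a performance one.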
Top AI companies are already exploring Rubin CPX:
- Cursor plans to use it for fast, collaborative code generation inside developer tools.
- Runway aims to push creative workflows with agent-driven video tools that scale with Rubin CPX performance.
- Magic will apply it to AI agents that understand entire codebases and historical interactions, promising near-autonomous software engineering.
Software Support and Ecosystem
Rubin CPX will support NVIDIA’s full software stack, including:
- NVIDIA Dynamo for scaled AI inference
- Nemotron models for multimodal reasoning
- NVIDIA AI Enterprise with NIM microservices and frameworks
- Integration with CUDA-X libraries and a vast developer ecosystem of over 6 million members
This extensive support ensures Rubin CPX is not just a hardware upgrade, but a platform-level shift that businesses and developers can rely on for scalable, production-grade AI deployment.
What TechKV Thinks?
Honestly, Rubin CPX sounds like a monster chip. I’m genuinely excited about what it could unlock. Processing a million-token context was something that felt out of reach not long ago, but NVIDIA is making it feel standard. The way they’re integrating video processing, memory, and inference in one unit? That’s game-changing.
What really stands out is the potential for developers and creators. Whether you’re building cinematic AI content or massive software projects, Rubin CPX gives you the speed and scale to go wild. And the fact that companies could see $5 billion in returns on a $100 million investment? That’s no small claim. NVIDIA is not just setting a benchmark here. They’re redefining what AI infrastructure should look like.