Controllers Archives - Rambus
At Rambus, we create cutting-edge semiconductor and IP products, providing industry-leading chips and silicon IP to make data faster and safer.

DEEPX, Rambus, and Samsung Foundry Collaborate to Enable Efficient Edge Inferencing Applications
Tue, 10 Feb 2026 | https://www.rambus.com/blogs/deepx-rambus-and-samsung-foundry-collaborate-to-enable-efficient-edge-inferencing-applications/

As artificial intelligence (AI) continues to proliferate across industries – from smart cities and autonomous vehicles to industrial automation, robotics, edge servers, and consumer electronics – edge inferencing has become a cornerstone of next-generation computing. Delivering real-time, low-power AI processing at the edge requires close coordination across AI compute architectures, memory subsystems, and silicon platforms. To meet these demands, DEEPX is collaborating with Rambus and Samsung Foundry to deliver a highly optimized solution that combines efficient AI compute, high-bandwidth memory interfaces, and advanced logic process technology.

A Proven Foundation Scaling Forward

As the foundation of this collaboration, DEEPX worked with Rambus and Samsung Foundry on the DX-M1 AI processor, fabricated using Samsung Foundry’s 5nm technology and integrating silicon-proven LPDDR5 controller IP from Rambus. DX-M1 has been deployed across a range of edge applications, including robotics, edge servers, AI-enabled IT services, smart cameras, and factory automation. Looking to the next generation of edge AI, DEEPX is developing the DX-M2 processor for ultra-low-power generative AI inference on edge devices using Samsung Foundry’s 2nm process technology. Samsung Foundry’s GAA-based 2nm platform is designed to deliver further improvements in power efficiency and performance scaling as edge AI workloads grow in complexity.

Through the Samsung Advanced Foundry Ecosystem (SAFE™) IP Alliance, Rambus works closely with Samsung Foundry to optimize its memory controller IP for advanced Samsung process technologies, enabling DEEPX to integrate proven IP more efficiently, lower design risk, and accelerate time to production for next-generation designs.

A Unified Solution for Edge AI

The collaboration between DEEPX, Rambus, and Samsung Foundry brings together three core pillars of edge inferencing:

  • AI Inference Technology: DEEPX contributes its ultra-efficient AI inference processors, designed to deliver high performance with minimal power consumption—ideal for endpoint devices such as AI PCs, AIoT (AI of Things) devices, automotive, edge servers, robotics, and industrial sensors.
  • High-Performance Memory: Rambus enhances memory performance with its LPDDR5/5X memory controller IP, which supports data rates up to 9.6 Gbps and features advanced bank management, command queuing, and look-ahead logic to maximize throughput and minimize latency.
  • Advanced Process Technology: Samsung Foundry provides the silicon platform and ecosystem enablement that support DEEPX’s edge AI development, helping reduce integration complexity and improve design predictability through advanced logic processes and the SAFE™ Alliance. Samsung Foundry’s 2nm GAA process technology represents a key next step for DEEPX’s DX-M2 processor, supporting further gains in power efficiency and performance scaling.
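To make the 9.6 Gbps figure above concrete, peak LPDDR bandwidth is the per-pin data rate times the interface width, divided by 8 bits per byte. The 16-bit channel width and the 4-channel (x64) configuration below are JEDEC-typical illustration values, not confirmed DEEPX design parameters:

```python
def lpddr_bandwidth_gbytes(data_rate_gbps: float, pins: int) -> float:
    # Peak bandwidth in GB/s = per-pin rate (Gb/s) x pin count / 8 bits-per-byte
    return data_rate_gbps * pins / 8

per_channel = lpddr_bandwidth_gbytes(9.6, 16)   # one 16-bit LPDDR5X channel
four_channel = lpddr_bandwidth_gbytes(9.6, 64)  # e.g. a 4-channel, x64 interface
print(per_channel, four_channel)  # 19.2 76.8
```

At 9.6 Gbps, each 16-bit channel delivers 19.2 GB/s, so a hypothetical 4-channel interface reaches 76.8 GB/s.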

Together, these technologies empower edge devices to run complex AI workloads locally with low power consumption and high performance efficiency, setting the stage for the next generation of edge inferencing.

Optimized Memory for AI Inference

The Rambus LPDDR5/5X memory controller IP is purpose-built for applications requiring high memory throughput at low power. It supports features such as:

  • Queue-based user interface with reordering scheduler
  • Look-ahead activate, precharge, and auto-precharge logic
  • Support for burst lengths BL16 and BL32
  • Parity protection and in-line ECC
  • Compatibility with LPDDR5T, LPDDR5, and LPDDR5X devices
  • Interoperability with Samsung LPDDR5/5X PHY

These capabilities are essential for AI inference, where memory bandwidth and latency directly impact model responsiveness and accuracy.
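As a rough intuition for why a reordering scheduler with look-ahead logic matters, the toy model below counts DRAM activate commands for a request stream serviced in arrival order versus a row-hit-first policy. The policy and function names are illustrative assumptions for this sketch, not the actual Rambus scheduler:

```python
def count_activates(requests, reorder=False):
    """requests: list of (bank, row). Returns the number of activate commands issued."""
    open_rows, pending, activates = {}, list(requests), 0
    while pending:
        req = None
        if reorder:
            # look ahead for a "row hit": a pending request to an already-open row
            req = next((r for r in pending if open_rows.get(r[0]) == r[1]), None)
        if req is None:
            req = pending[0]  # otherwise service the oldest request
        bank, row = req
        if open_rows.get(bank) != row:
            activates += 1          # row miss: precharge + activate required
            open_rows[bank] = row
        pending.remove(req)
    return activates

reqs = [(0, 5), (1, 9), (0, 7), (0, 5), (1, 9)]
print(count_activates(reqs))                # 4 activates in arrival order
print(count_activates(reqs, reorder=True))  # 3 with row-hit-first reordering
```

Each avoided activate saves a precharge/activate cycle, which is how reordering raises effective throughput and lowers average latency.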

The Value of Samsung Foundry’s “One-Stop-Shop” Model

Samsung Foundry brings together advanced logic process technology and a tightly aligned SAFE™ IP ecosystem through a vertically integrated technology stack that simplifies complex programs. By coordinating cutting-edge logic processes, IP readiness, and manufacturing considerations earlier in the design cycle, Samsung Foundry reduces multi-vendor friction, improves integration efficiency, and accelerates time-to-market.

For edge AI applications such as DEEPX’s DX-M roadmap, Samsung Foundry’s scalable process portfolio – from FinFET to leading-edge 2nm GAA – supports aggressive power-performance targets while maintaining manufacturability. Through collaboration with the SAFE™ ecosystem, memory controller IP from partners like Rambus can be efficiently integrated, helping reduce risk and accelerate time to silicon.

This ecosystem-driven model allows customers to focus on AI architecture and application differentiation, while relying on a stable and scalable silicon platform to support current and future edge AI designs.

Empowering the AI Revolution at the Edge

This collaboration exemplifies the power of ecosystem synergy. By combining DEEPX’s AI compute innovation, Samsung Foundry’s manufacturing excellence and ecosystem enablement, and Rambus’ memory interface leadership, the trio is enabling a new generation of edge devices that are smarter, faster, and more secure.

Whether it’s enabling real-time object detection in smart cameras, predictive maintenance in industrial systems, or intelligent navigation in autonomous drones, the joint solution is poised to transform how AI is deployed at the edge.

Looking Ahead: Pushing the Boundaries with LPDDR6

Looking ahead, DEEPX and Rambus are extending their collaboration to the next frontier: LPDDR6 & LPDDR6-PIM (Processing In Memory). As AI models grow in complexity and demand even greater memory bandwidth, LPDDR6 is poised to deliver speeds exceeding 9.6 Gbps, while reducing operational power by up to 30% compared to LPDDR5X.

DEEPX, with its roadmap for next-generation AI chips like the DX-M2, is aligning its architecture to take full advantage of LPDDR6’s capabilities.

This forward-looking collaboration underscores the trio’s commitment to redefining what’s possible in edge AI—delivering smarter, faster, and more efficient solutions that scale with the future of computing.

Silicon IP for the Final Frontier
Wed, 19 Nov 2025 | https://www.rambus.com/blogs/silicon-ip-for-the-final-frontier/

Like their terrestrial counterparts, space-based systems benefit from the greater computing power achieved through semiconductor scaling. However, chips for spacecraft must be radiation hardened (RH) to operate in the rigors of space, and there is considerable time and effort required to develop and qualify rad-hardened devices on a given process node. The BAE Systems RH45® 45 nanometer (nm) node has long been the go-to solution for space-based computing, but the industry is now on the verge of a dramatic leap forward.

Source: BAE Systems

The US Department of War (DoW) selected BAE Systems to qualify a new generation of integrated circuits using 12nm technology, which will be radiation hardened and available to the space community to address future high-performance requirements.

“Our RH12 Storefront provides a turnkey solution for customers requiring radiation-hardened 12 nanometer integrated circuits,” said Joe Dziezynski, director of Space Systems at BAE Systems. “This approach uses commercial foundry technology for space missions, qualifying not only the library components but also the process for how each of those components are designed into customer integrated circuits. Customers now have a one-stop-shop for state-of-the-art microelectronics performance to complete their missions in the harsh space environment.”

For the RH12™ Storefront program, Rambus supplies BAE Systems with solutions from our industry-leading Silicon IP portfolio including DDR4 memory and PCIe controllers. The move to 12nm technology has a pronounced positive impact on the power and performance of space-based systems, and Rambus is proud to support this mission-critical endeavor. BAE Systems offers RH12 integrated circuit development and production services to the industry for use in defense, space, intelligence, research and commercial space missions.

View the BAE Systems press release for more details.

High Bandwidth Memory (HBM): Everything You Need to Know
Thu, 30 Oct 2025 | https://www.rambus.com/blogs/hbm3-everything-you-need-to-know/

[Updated on October 30, 2025] In an era where data-intensive applications, from AI and machine learning to high-performance computing (HPC) and gaming, are pushing the limits of traditional memory architectures, High Bandwidth Memory (HBM) has emerged as a high-performance, power-efficient solution. As industries demand faster, higher throughput processing, understanding HBM’s architecture, benefits, and evolving role in next-gen systems is essential.

In this blog, we’ll explore how HBM works, how it compares to previous generations, and why it’s becoming the cornerstone of next-generation computing.

What is High Bandwidth Memory (HBM) and How is it Reshaping the Future of Computing?


As computing races toward higher speeds and greater efficiency, memory bandwidth has emerged as a major bottleneck for workloads like AI, high-performance computing, and data analytics. This is where High Bandwidth Memory (HBM) comes in. HBM is a cutting-edge 2.5D/3D memory architecture designed with an exceptionally wide data path, enabling massive throughput and performance gains. Unlike traditional memory architectures that rely on horizontal layouts and narrow interfaces, HBM takes a vertical approach: stacking memory dies atop one another and connecting them with through-silicon vias (TSVs). This 3D-stacked design drastically shortens data travel paths, enabling higher bandwidth and lower power consumption in a compact footprint.

HBM operates at incredible multi-gigabit speeds. When you combine that speed with a very wide data path, the result is staggering bandwidth, often measured in hundreds of Gigabytes per second (GB/s) and even reaching into the Terabytes per second (TB/s) range.

To put this into perspective, an HBM4 device running at an 8 Gb/s data rate delivers 2.048 TB/s of bandwidth. That level of performance is what makes HBM4 a leading choice for AI training hardware.
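The arithmetic behind that figure is simple: peak bandwidth equals the per-pin data rate times the interface width, divided by 8 bits per byte. A quick sanity check (the function name is ours, for illustration):

```python
def hbm_bandwidth_gbytes(data_rate_gbps: float, width_bits: int) -> float:
    # Peak bandwidth in GB/s = per-pin rate (Gb/s) x interface width (bits) / 8
    return data_rate_gbps * width_bits / 8

print(hbm_bandwidth_gbytes(8.0, 2048))  # 2048.0 GB/s, i.e. 2.048 TB/s per HBM4 device
```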

What is a 2.5D/3D Architecture?

2.5D and 3D architectures refer to advanced integration techniques that improve performance, bandwidth, and power efficiency by bringing components closer together—literally.

HBM4 Uses a 2.5D/3D Architecture

3D Architecture
The “3D” part is easy to see. In 3D architecture, chips are stacked vertically and connected through TSVs (vertical electrical connections that pass through the silicon dies). An HBM memory device is a packaged 3D stack of DRAM, forming a compact, high-performance memory module. Think of it as a high-rise building of chips with elevators (TSVs) connecting the floors.

2.5D Architecture
In a 2.5D setup, multiple chips (a CPU or GPU and, in our case, HBM device stacks) are placed side-by-side on a silicon interposer – a thin substrate of silicon that acts as a high-speed communication bridge. The interposer contains the fine-pitch wiring that enables fast, low-latency connections between the chips.
Why do we need to use a silicon interposer? The data path between each HBM4 memory device and the processor requires 2,048 “wires” or traces. With the addition of command and address, clocks, etc., the number of traces necessary grows to about 3,000.

Thousands of traces are far more than can be supported on a standard PCB. Therefore, a silicon interposer is used as an intermediary to connect memory device(s) and processor. As with an integrated circuit, finely spaced traces can be etched in the silicon interposer enabling the desired number of wires needed for the HBM interface. The HBM device(s) and the processor are mounted atop the interposer in what is referred to as a 2.5D architecture.

HBM uses both 2.5D and 3D architectures described above, so it’s a 2.5D/3D architecture memory solution.

How is HBM4 Different from HBM3E, HBM3, HBM2, or HBM (Gen 1)?

HBM4 represents a significant leap forward from its predecessors—HBM3E, HBM3 and earlier generations—in terms of bandwidth, capacity, efficiency and architectural innovation. With each generation, we’ve seen an upward trend in data rate, 3D-stack height, and DRAM chip density. That translates to higher bandwidth and greater device capacity with each upgrade of the specification.

When HBM launched, it started with a 1 Gb/s data rate and a 1024-bit wide interface. HBM delivered 128 GB/s of bandwidth, a huge step forward at the time.
Since then, every generation has pushed the limits a little further. HBM2, HBM3, and now HBM3E have all scaled bandwidth primarily by increasing the data rate. For example, HBM3E runs at 9.6 Gb/s, enabling 1229 GB/s of bandwidth per stack. That’s impressive, but HBM4 takes things to an entirely new level. HBM4 doesn’t just tweak the speed; it doubles the interface width from 1024 bits to 2048 bits. This architectural shift means that even at a modest 8 Gb/s data rate, HBM4 can deliver 2.048 TB/s of bandwidth per stack—two-thirds more than HBM3E offers.

Chip architects aren’t stopping at one stack. In fact, they’re designing systems with higher attach rates to feed the insatiable appetite of AI accelerators and next-gen GPUs. Imagine a configuration with eight HBM4 stacks, each running at 8 Gb/s. The result? A staggering 16.384 TB/s of memory bandwidth. That’s the kind of throughput needed for massive AI models and high-performance computing workloads.

The table below shows the key differences between HBM4 and its earlier generations.

Generation | Data Rate (Gb/s) | Interface Width (b) | Bandwidth per Device (GB/s) | Stack Height | Max. DRAM Capacity (Gb) | Max. Device Capacity (GB)
HBM        | 1.0              | 1024                | 128                         | 8            | 16                      | 16
HBM2       | 2.0              | 1024                | 256                         | 8            | 16                      | 16
HBM2E      | 3.6              | 1024                | 461                         | 12           | 24                      | 36
HBM3       | 6.4              | 1024                | 819                         | 16           | 32                      | 64
HBM3E      | 9.6              | 1024                | 1229                        | 16           | 32                      | 64
HBM4       | 8.0              | 2048                | 2048                        | 16           | 32                      | 64
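The bandwidth column above follows directly from the data rate and interface width; recomputing it (rounded to whole GB/s, as in the table) is a useful cross-check:

```python
# Recompute "Bandwidth per Device (GB/s)" = data rate (Gb/s) x width (bits) / 8
generations = {
    "HBM":   (1.0, 1024),
    "HBM2":  (2.0, 1024),
    "HBM2E": (3.6, 1024),
    "HBM3":  (6.4, 1024),
    "HBM3E": (9.6, 1024),
    "HBM4":  (8.0, 2048),
}
for name, (rate, width) in generations.items():
    print(f"{name}: {round(rate * width / 8)} GB/s")
```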

What are the Additional Features of HBM4?

But that’s not all. HBM4 also introduces enhancements in power, memory access and RAS over HBM3E.

    • Double the Memory Channels: HBM4 doubles the number of independent channels per stack to 32 with 2 pseudo-channels per channel. This provides designers more flexibility in accessing the DRAM devices in the stack.
    • Improved Power Efficiency: HBM4 supports VDDQ options of 0.7V, 0.75V, 0.8V or 0.9V and VDDC of 1.0V or 1.05V. The lower voltage levels improve power efficiency.
    • Compatibility and Flexibility: The HBM4 interface standard ensures backwards compatibility with existing HBM3 controllers, allowing for seamless integration and flexibility in various applications.
    • Directed Refresh Management (DRFM): HBM4 incorporates Directed Refresh Management (DRFM) for improved Reliability, Availability, and Serviceability (RAS) including improved row-hammer mitigation.
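Dividing the interface width by the channel count shows how HBM4 doubles the channel count while keeping the same per-channel geometry. This is a back-of-the-envelope check based on the channel and pseudo-channel figures cited above, not a statement from the JEDEC specification text itself:

```python
def channel_widths(interface_bits: int, channels: int, pseudo_per_channel: int = 2):
    # Returns (bits per channel, bits per pseudo-channel)
    per_channel = interface_bits // channels
    return per_channel, per_channel // pseudo_per_channel

print(channel_widths(1024, 16))  # HBM3: (64, 32) -> 64-bit channels, 32-bit pseudo-channels
print(channel_widths(2048, 32))  # HBM4: (64, 32) -> same widths, twice as many channels
```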

Rambus HBM Memory Controller Cores for AI and High-Performance Workloads

Rambus delivers a comprehensive portfolio of HBM controller cores engineered for maximum speed and efficiency. Designed for high bandwidth and ultra-low latency, these controllers enable cutting-edge performance for AI training, machine learning, and advanced computing applications.

The lineup includes our industry-leading HBM4 memory controller, supporting data rates up to 10 Gb/s and offering exceptional flexibility for next-generation workloads. With Rambus HBM controllers, designers can achieve superior throughput, scalability, and reliability for demanding AI and HPC environments.

Summary

As computing demands continue to skyrocket, HBM stands out as a transformative technology that addresses the critical bottleneck of memory bandwidth. By leveraging advanced 2.5D and 3D architectures, HBM delivers massive throughput, exceptional power efficiency, and scalability for next-generation workloads. With HBM4 doubling interface width and introducing new features for flexibility and reliability, it is poised to become the backbone of AI, HPC, and data-intensive applications. Understanding this evolution is key to achieving the performance required for tomorrow’s most demanding systems.

Explore more resources:
HBM4 Memory: Break Through to Greater Bandwidth
Unleashing the Performance of AI Training with HBM4
Ask the Experts: HBM3E Memory Interface IP

DEEPX and Rambus Collaborate to Enable Efficient Edge Inferencing Applications
Wed, 24 Sep 2025 | https://www.rambus.com/blogs/deepx-and-rambus-collaborate-to-enable-efficient-edge-inferencing-applications/

Empowering Next-Gen AI with High-Performance Memory Solutions

DEEPX is a leading innovator in the rapidly transforming Generative AI (GenAI) landscape, specializing in edge AI solutions. The company’s mission is to enable smarter, faster, and more efficient AI technologies that can be integrated seamlessly into everyday applications. DEEPX’s product portfolio includes advanced AI processors designed to deliver exceptional performance while adhering to stringent power and area requirements for edge computing applications.

DEEPX DX-M1 AI Accelerator

DEEPX’s value proposition lies in its ability to bridge the gap between high-powered AI applications and the constraints of edge devices, offering solutions that maximize efficiency without compromising performance. While model training typically occurs in powerful data centers, where vast amounts of data are processed to create sophisticated Large Language Models (LLMs), the true value of AI often lies in deploying these models on the edge for inferencing, where they are applied in real time to generate insights and drive actions. DEEPX is dedicated to capturing this value at the edge, ensuring that AI solutions are not only robust but also highly responsive and efficient in real-world applications. DEEPX focuses on key markets such as autonomous mobility, industrial automation, smart cities, and consumer electronics, where edge AI plays a crucial role in enhancing functionality and enabling real-time decision-making.

Evaluating the right memory for edge AI applications

As DEEPX developed its next-generation AI processor, the demands for a high-performance memory interface became evident. Their system architecture required a memory solution that could deliver excellent bandwidth to support AI computational workloads while staying within strict power, performance, and area (PPA) requirements. Additionally, the memory controller needed to ensure reliable operation under diverse workloads while being cost-effective and scalable for mass production.

DEEPX evaluated several memory standards, including GDDR and HBM, but realized that LPDDR5/5x was the ideal choice for their edge AI applications. LPDDR technology provides a compelling balance of high performance and low power consumption at the right price point, making it well-suited for systems that operate in constrained environments such as edge devices. Moreover, LPDDR5/5x offers significant advancements over previous generations, including increased data rates, reduced latency, and enhanced power efficiency, which are critical for optimizing AI processor performance.

After rigorous research and evaluation, DEEPX selected Rambus LPDDR5/5x Controller IP as the memory solution for their AI processor. Several factors influenced this decision:

Exceptional PPA benefits

The Rambus LPDDR5/5x Controller IP is engineered to deliver outstanding power, performance, and area metrics. Its design is optimized to handle high-bandwidth workloads efficiently while minimizing power consumption, a feature that was critical for DEEPX’s edge AI processors. The controller’s compact footprint allowed DEEPX to adhere to stringent area constraints, enabling them to integrate more functionality into their processors without exceeding physical limitations.

Robust support and collaboration

One of the standout aspects of Rambus was the comprehensive support provided throughout the design and implementation process. DEEPX benefited from Rambus’ deep expertise in memory technologies and its proactive approach to addressing technical challenges. Rambus worked closely with DEEPX’s engineering team to ensure seamless integration of the LPDDR5/5x Controller IP into their system architecture, offering guidance and solutions tailored to the specific requirements of their AI processors.

Proven reliability and market leadership

Rambus has a long-standing reputation as a leader in silicon IP solutions, with a rich history of delivering high-quality products that meet the evolving needs of the semiconductor industry. DEEPX was impressed by the Rambus track record and industry recognition, which gave them confidence in the reliability and performance of Rambus IP. The LPDDR5/5x Controller IP exemplified Rambus’ attention to detail and commitment to excellence, ensuring DEEPX could deliver cutting-edge products to its customers.

Innovative RAS and telemetry features

The robust RAS (Reliability, Availability, and Serviceability) and telemetry features offered by Rambus ensure that the LPDDR controller can monitor and report on its operational status, detect and correct errors, and provide valuable insights into system performance. This level of reliability and monitoring is crucial for maintaining the high standards required for DEEPX’s edge AI applications, ensuring that their systems remain robust and efficient under diverse workloads.

Benefits delivered by Rambus LPDDR5/5x Controller IP

The integration of Rambus LPDDR5/5x Controller IP into DEEPX’s AI processor unlocked numerous benefits:

  • Enhanced Bandwidth: The controller supported high-speed memory operation, enabling DEEPX’s processors to handle complex AI workloads with ease.
  • Energy Efficiency: Low power consumption ensured that DEEPX processors could operate sustainably in edge environments where power constraints are critical.
  • Compact Design: The small area footprint provided flexibility for DEEPX to optimize its chip layout and include additional functionalities.
  • Unique value-add features: The RAS & Telemetry features offered by Rambus provided unique value-add to DEEPX in terms of providing an operationally efficient solution to its customers.

Powering next-gen edge intelligence: DEEPX & Rambus

The mission of DEEPX is to make GenAI universally accessible by delivering the most power-efficient and scalable accelerators for edge computing. As next-generation products like the DX-M2 are being designed to run large language and multi-modal models under strict power and thermal constraints, memory bandwidth becomes a critical enabler.

The collaboration between DEEPX and Rambus builds on a long history of trust and proven performance. From the early days of RDRAM to today’s LPDDR5/5X solutions, Rambus has consistently set the standard for memory innovation. By integrating the Rambus LPDDR5/5X controller with a third-party PHY, DEEPX’s AI hardware gains low-latency, high-throughput access to DRAM, delivering real-time, cloud-class inference at the edge.

This partnership is more than just technology—it’s a shared vision that the future of AI lies in bringing fast, efficient, and private intelligence directly to devices. Together, DEEPX and Rambus are unlocking the next leap in edge AI.

Rambus CXL IP: A Journey from Spec to Compliance
Mon, 07 Apr 2025 | https://www.rambus.com/blogs/rambus-cxl-ip-a-journey-from-spec-to-compliance/

[Updated April 7, 2025] With the ongoing efforts of the Rambus engineering team, we have now achieved compliance with CXL 2.0 for our CXL Controller IP, and it has been added to the Integrators List.

Company Name:    Rambus
Product Name:    PCIe5/CXL2 Controller IP
Device ID:       1115
Device Type:     Type 3
Feature Set:     CXL Core 2.0
Spec Revision:   CXL 2.0
PHY Speed:       16GT/s
Max Lane:        x8
Form Factor:     CEM
Function:        IP
Compliance Event (CTE) Approved: CTE 007

We’ll keep you posted on future progress as we demonstrate via the compliance process Rambus products that deliver the latest features and benefits of the CXL specification.

___________________________________________

Driven by our unwavering commitment to quality and performance, a Rambus team of engineers, validation experts, and architects has been taking part in CXL® Compliance Test Events to ensure the flawless performance and market readiness of our CXL Controller IP. We are pleased to report that our CXL 2.0 Controller IP has achieved compliance with CXL 1.1 and has been added to the Integrators List.

CXL Compliance Program

The CXL Compliance Program provides member companies with opportunities to test the functionality and interoperability of end products as defined in the CXL specification.

Structured into distinct phases—Pre-FYI (For Your Information), FYI Phase, and General Testing—the CXL Compliance workshops provided us with a comprehensive framework for assessing and validating our CXL Controller IP.  We leverage our team’s experience to implement the CXL Controller IP in FPGAs as a means to enable interoperability and protocol compliance with other CXL hardware solutions in the ecosystem.

Status of CXL Spec Compliance Phases (as of May 2024)

Four Tests to Compliance

The workshops involved validating four types of tests to claim compliance, ensuring our CXL IP met CXL standards for reliability and performance across various parameters, including interoperability, protocol adherence, and electrical compliance.

  1. Interoperability tests, which involve establishing connections with other equipment present at the event.
  2. CXL Validation Tests (CXL CV), which involve verifying the connection, booting via the BIOS, OS enumeration, and executing the CXL validation software application on a “golden” host provided by the CXL Consortium.
  3. Protocol tests on exerciser, which establish a CXL-specific test sequence to verify capabilities, registers, and device responses.
  4. Electrical tests, which allow validation of CXL compliance for speeds of 8 GT/s, 16 GT/s, or 32 GT/s, like PCIe®.

After completing these tests, the Rambus CXL IP obtained compliance at a speed of 16 GT/s.

Rambus CXL 1.1 Controller IP on the Integrators List

Benefits of Participation in CXL Compliance Test Events

Participation in CXL Compliance Test Events yielded numerous benefits, including enhanced CXL product quality, performance, and compatibility. Insights gained from these workshops enabled us to improve interoperability results with other CXL devices and hosts in the CXL ecosystem.

Achieving compliance for our CXL Controller IP underscores several key advantages of our solution for customers:

  • Cross-compatibility: Customers implementing a CXL controller in their ASIC design can leverage our solution’s seamless transition from FPGA to ASIC. The identical codebase ensures consistency and facilitates testing and validation in an FPGA environment before ASIC implementation.
  • Accelerated Validation: By utilizing our FPGA-compatible IP for prototyping, ASIC clients can expedite validation and bring-up phases.
  • Comprehensive Support: We stand by our clients throughout the development journey, offering expertise and guidance from prototyping to final ASIC implementation.

At Rambus, our dedication extends beyond delivering cutting-edge IP; we prioritize empowering our clients with the tools and support needed to succeed in the rapidly evolving landscape of high-speed interconnects.

Stay tuned for future updates on our CXL compliance journey. Thanks to FPGA implementation efforts, our CXL 2.0 Controller IP is fully compliant with CXL 1.1 and awaiting the official start of the CXL 2.0 general testing phase.

For more information, visit the Rambus CXL Controller IP page or contact us here.

Nidish Kamath Talks HBM4 and AI in Rambus Ask the Experts
Mon, 09 Sep 2024 | https://www.rambus.com/blogs/ask-the-experts-hbm4-controller-ip/

To coincide with the launch of the industry’s first HBM4 Controller IP from Rambus, we talked to Nidish Kamath, Director of Product Management for Memory Interface IP.

The discussion highlighted how AI applications are driving the increased demand for HBM-based systems; the transition to Generative AI applications has led to significant performance and efficiency demands on the underlying compute infrastructure. The HBM4 standard, currently under development by JEDEC, will introduce new features designed to support the future memory requirements of AI applications.

Rambus is supporting designers with the transition to a new generation of HBM designs with an innovative digital controller that manages some of the implementation challenges that emerge when designing at higher data rates.

Check out the full video interview below or skip to read the key takeaways.

Expert

  • Nidish Kamath, Director of Product Management, Rambus

Key Takeaways

  1. AI Drives HBM Evolution: The rapid evolution of the HBM specification is driven by the increasing demands of AI applications as they evolve from machine learning to more generalized and widely deployed AI. These applications pose critical performance and efficiency challenges for the underlying compute infrastructure.
  2. HBM4 Standard Development: The HBM4 standard, currently under development by JEDEC, is set to introduce a doubled channel count per stack compared to HBM3, with a larger physical footprint. HBM4 will support speeds of 6.4 Gigabits per second (Gbps) with ongoing discussions regarding support for higher data rates.
  3. HBM4 Implementation Challenges: HBM4 will specify 24 and 32 Gigabit capacities, with options for supporting 4-, 8-, and 16-high TSV stacks. The increased channel count introduces implementation challenges such as packaging complexities, increased power density, as well as thermal and DRAM refresh management challenges.
  4. Rambus HBM4 Controller Solution: The Rambus HBM4 Controller IP is designed to manage the complexity of data parallelism at higher speeds. For example, it has a re-ordering logic that optimizes the outgoing HBM transactions and incoming HBM read data to keep the high bandwidth data interface efficiently utilized for the given performance and power target.
  5. Rambus HBM Expertise and Partnerships: The Rambus memory controller engineering team has over a decade of specialized expertise in designing high-performance memory interface IP, including over 150 design wins for HBM and GDDR. The team works closely with memory PHY vendors to ensure any new PHY release is fully tested and supported for end customers.
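The re-ordering idea in takeaway 4 can be sketched in a few lines. The toy scheduler below is purely illustrative and is not the Rambus controller logic: it promotes requests that hit an already-open DRAM row ahead of older requests that would force a precharge and activate, which is one common way transaction re-ordering keeps a wide HBM interface efficiently utilized.

```python
from collections import deque

class ReorderQueue:
    """Toy memory-command scheduler: prefer requests that hit the
    currently open row in a bank (a 'row hit') over requests that
    would force a precharge + activate (a 'row miss')."""

    def __init__(self):
        self.pending = deque()
        self.open_rows = {}  # bank -> currently open row

    def enqueue(self, bank, row):
        self.pending.append((bank, row))

    def next_command(self):
        # Scan for a row hit first; fall back to the oldest request.
        for i, (bank, row) in enumerate(self.pending):
            if self.open_rows.get(bank) == row:
                del self.pending[i]
                return (bank, row, "hit")
        bank, row = self.pending.popleft()
        self.open_rows[bank] = row  # activating this row opens it
        return (bank, row, "miss")

q = ReorderQueue()
q.enqueue(0, 7)   # opens row 7 in bank 0
q.enqueue(1, 3)
q.enqueue(0, 7)   # same row as the first request
first = q.next_command()   # (0, 7, 'miss') -- row not yet open
second = q.next_command()  # (0, 7, 'hit') -- promoted past bank 1
```

A production controller also has to respect DRAM timing parameters, refresh, and fairness; this sketch only shows the re-ordering principle itself.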

Key Quote

Today’s AI applications pose critical performance and efficiency challenges for the underlying compute infrastructure. We are seeing widespread use of GPUs and AI accelerators that need to evolve quickly to meet the demanding performance requirements of these applications. This is one of the key reasons why we are seeing HBM4-based system development proceed at a more rapid pace compared to previous generations of the standard.

Rambus CXL IP Advances Data Center Capabilities in CXL Over Optics Demo
https://www.rambus.com/blogs/rambus-demonstrates-advancing-data-center-capabilities-with-cxl-over-optics/
Mon, 05 Aug 2024

At Rambus, we are committed to pioneering advancements that meet the evolving demands of modern data centers. Today, we are showcasing advanced technology IP for high-speed data center interconnects: CXL 2.0 over optics.

What is CXL?

Compute Express Link (CXL) is an open interconnect standard that enhances communication between processors, memory expansion, and accelerators. Built on the robust PCI Express (PCIe) framework, CXL provides memory coherency between CPU memory and attached devices. This innovation enables efficient resource sharing, reduces software complexity, and lowers system costs—making it an essential component in the future of data center architecture.
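As a quick reference, the CXL specification combines three protocols (cxl.io, cxl.cache, and cxl.mem) into three device types. The small lookup below is just an illustrative summary of the specification, not part of any Rambus API:

```python
# CXL device types per the CXL specification:
#   Type 1: accelerators that cache host memory (cxl.io + cxl.cache)
#   Type 2: accelerators with their own device memory (all three)
#   Type 3: memory expansion devices (cxl.io + cxl.mem)
CXL_DEVICE_TYPES = {
    1: {"cxl.io", "cxl.cache"},
    2: {"cxl.io", "cxl.cache", "cxl.mem"},
    3: {"cxl.io", "cxl.mem"},
}

def protocols_for(device_type: int) -> set[str]:
    """Return the set of CXL protocols a given device type uses."""
    return CXL_DEVICE_TYPES[device_type]

# A memory-expansion card (Type 3) never needs cxl.cache:
assert "cxl.cache" not in protocols_for(3)
```

The device under test in the demonstration below is configured as a Type 2 device, which uses all three protocols.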

Demonstrating the Future: CXL Over Optics

In this demonstration, Olivier Alexandre, Senior Manager of Validation Engineering at Rambus, shows Rambus CXL IP instantiated in an endpoint device connected to a Viavi Xgig 6P4 Exerciser using Samtec Firefly optic cable technology, effectively creating a remote “CXL Memory Expansion” block.

Here are more details on the demo:

Device Setup: Our Device Under Test (DUT), incorporating Rambus CXL 2.0 Controller IP, is configured in CXL 2.0 Type 2, operating at 16 GT/s on four lanes. The Viavi Xgig 6P4 emulates a Root Complex Device, and both devices are linked through a Samtec Firefly PCUO G4 cable, supporting speeds up to 16 GT/s.
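For context on the link configuration above: PCIe 4.0 runs at 16 GT/s per lane with 128b/130b encoding, so the demo's x4 link has a raw unidirectional bandwidth of roughly 7.9 GB/s. A quick back-of-the-envelope calculation (protocol overheads beyond line encoding are ignored, so real throughput is lower):

```python
def pcie_raw_bandwidth_gbytes(gt_per_s: float, lanes: int,
                              encoding: float = 128 / 130) -> float:
    """Raw one-direction PCIe bandwidth in GB/s after line encoding.

    Ignores TLP/DLLP protocol overhead; actual throughput is lower.
    """
    return gt_per_s * encoding * lanes / 8  # 8 bits per byte

# The demo's Gen4 x4 link (16 GT/s per lane, 128b/130b encoding):
bw = pcie_raw_bandwidth_gbytes(16, 4)
print(f"{bw:.2f} GB/s")  # ~7.88 GB/s per direction
```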

Performance Insights: The DUT successfully maintains a stable link at Gen 4 speed with a x4 link width. We also conducted tests at varying speeds, confirming expected performance limits at earlier generations.

Device Discovery: During device discovery, the Rambus CXL IP-enabled device was correctly identified, highlighting device capability and integration.

Compliance Success: Utilizing the Viavi exerciser, we conducted a CXL 2.0 compliance test over a 100-meter fiber optic connection. The test suite, taking approximately 20 minutes, confirmed that our DUT passed all compliance tests.

The Promise of CXL/PCIe Over Optics

This demonstration illustrates the potential of CXL/PCIe over optics as a key solution to meet the bandwidth demands of heterogeneous distributed data center architectures. Optical interconnects offer significant advantages including extended reach, reduced latency, and efficient resource sharing across multiple servers.

Learn more about Rambus CXL IP solutions here.

About Samtec and Viavi

Samtec
Known for its high-performance interconnect solutions, Samtec provides leading-edge technology such as the Firefly optic cable, enabling high-speed data transmission with impressive range and low latency.

Viavi
A leader in network testing and measurement, Viavi Solutions offers products like the Xgig 6P4 Exerciser, which is crucial for ensuring compliance and performance in complex network environments.

FuriosaAI Accelerates Innovation with Digital Controller IP from Rambus
https://www.rambus.com/blogs/furiosaai-accelerates-innovation-with-digital-controller-ip-from-rambus/
Wed, 24 Jul 2024

Overview  

FuriosaAI develops AI accelerators that power next-generation applications by running the world’s most advanced models in a power-efficient manner. Furiosa believes this is a critical lever in making AI computing more sustainable for the next generation.

While Furiosa’s first-generation AI accelerator, WARBOY, targets Computer Vision applications, their second-generation solution, RNGD (pronounced “Renegade”), targets Large Language Models (LLMs) and multimodal applications.

WARBOY AI Accelerator by Furiosa

Furiosa’s AI accelerators leverage their Tensor Contraction Processor (TCP) compute architecture and offer numerous benefits including: 

  • Energy Efficiency: Designed with power optimization in mind, Furiosa’s accelerators provide top-tier performance while minimizing energy consumption. 
  • Programmability: The accelerators are highly versatile, making them suitable for deployment across a wide range of applications, from edge servers to large-scale data centers. 
  • Ease of Integration: Furiosa also provides various tools to allow seamless integration of their products into existing systems, reducing deployment time and costs.

Roadblocks to High-Performance AI Solutions: Furiosa’s Development Challenges 

Developing high-performance AI accelerators presented Furiosa with several challenges: 

  • Time-to-Market Pressure: In the fast-paced AI industry, getting to market quickly is crucial. Delays in chip development could have a significant impact on Furiosa’s competitive edge. 
  • First-Pass Silicon Success: A flawless first-time chip design was essential to avoid costly redesigns and further delays. 
  • Balancing Performance and Power: Striking the optimal balance between raw processing power and power consumption was critical for Furiosa’s success.

Revving Up Development: How Rambus PCIe, HBM and XpressAGENT IP Fueled Furiosa’s Success 

RNGD AI Accelerator by Furiosa

Furiosa turned to Rambus for PCIe 5.0 and HBM3 Controller solutions. Since Furiosa had already successfully used the Rambus PCIe 4.0 Controller in their first product, WARBOY, leveraging Rambus’s extensive PCIe experience for Gen5 in RNGD was a natural choice.

XpressAGENT, a unique solution from Rambus for integrating PCIe subsystems with auxiliary logic, was especially helpful in reducing the effort required to develop control, monitoring, and debugging features. The Rambus HBM3 Controller was chosen because it was a good match with the third-party HBM3 PHY that supports the desired clock speeds. 

Furthermore, Furiosa’s engineers received ongoing assistance from Rambus’s team of experts, ensuring a smooth integration of Rambus IP into Furiosa’s products by addressing any technical challenges that arose.

A Collaboration Rooted in Innovation 

The collaboration between Furiosa and Rambus proved to be highly successful. By integrating the Rambus Digital Controller, Furiosa was able to overcome key development challenges and bring their products to market on time. The collaboration not only facilitated first-time functional success but also optimized accelerator performance, power efficiency, and scalability.   

“The Controller IP from Rambus, along with exceptional technical support, were instrumental in the successful development of our accelerators,” said June Paik, CEO of Furiosa. “The bring-up process for our RNGD chip has been efficient and smooth, allowing us to adhere to our ambitious timeline for commercializing the product.” 

For more detailed information on Furiosa and its products, visit www.furiosa.ai.

PCIe 7.0 Interface IP and AI in this Episode of Ask the Experts
https://www.rambus.com/blogs/ask-the-experts-pcie-7-0-interface-ip/
Tue, 09 Jul 2024

In this episode of “Ask the Experts”, we talked to Lou Ternullo, Senior Director of Product Management for Interface IP at Rambus. The discussion focused on the rapid evolution of the PCI Express specification in recent years, with AI as the primary driver behind it.

The interview highlighted the latest generation of PCI Express, PCIe 7.0. This new generation doubles data rates from 64 to 128 Gigatransfers per second (GT/s), will support optical interconnects, and includes features for data protection.
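That doubling pattern is easy to see in numbers. The sketch below uses per-lane rates and line encodings from the published PCIe specifications to estimate raw x16 bandwidth per generation; it is a rough calculation that ignores FLIT and protocol overhead:

```python
# Per-lane signaling rates for each PCI Express generation (GT/s).
# Rates double each generation except 2 -> 3, where moving from
# 8b/10b to 128b/130b encoding recovered the remaining bandwidth.
PCIE_GT_PER_S = {1: 2.5, 2: 5.0, 3: 8.0, 4: 16.0, 5: 32.0, 6: 64.0, 7: 128.0}

def x16_raw_gbytes(gen: int) -> float:
    """Approximate raw one-direction GB/s for an x16 link."""
    if gen <= 2:
        encoding = 8 / 10        # 8b/10b line code
    elif gen <= 5:
        encoding = 128 / 130     # 128b/130b line code
    else:
        encoding = 1.0           # PAM4 with FLIT-based framing
    return PCIE_GT_PER_S[gen] * encoding * 16 / 8

print(f"PCIe 7.0 x16: ~{x16_raw_gbytes(7):.0f} GB/s per direction")
```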

Rambus has announced a new family of PCIe 7.0 IP solutions that includes a PCIe 7.0 Controller, a PCIe 7.0 Switch IP, and a PCIe 7.0 Retimer IP.

Check out the full video interview below or skip to read the key takeaways.

Expert

  • Lou Ternullo, Senior Director of Product Management, Rambus

Key Takeaways

  1. AI-Driven PCIe Evolution: The rapid evolution of the PCI Express specification is primarily driven by AI applications, and more specifically generative AI. Industry trends such as the disaggregation of compute and storage are also contributing to this evolution.
  2. PCIe 7.0 Advancements: The seventh generation of the PCI Express specification, PCIe 7.0, doubles the data rate of the previous generation from 64 GT/s to 128 GT/s. It continues to support features introduced in PCIe 6.0, including FLIT mode and PAM4 signaling.
  3. Enhanced Data Protection Features: PCIe 7.0 brings enhanced data protection features to secure the transmission of data from server CPUs to endpoints. These include Integrity and Data Encryption (IDE) and the Trusted Execution Environment Device Interface Security Protocol (TDISP).
  4. Rambus PCIe 7.0 IP Announcement: Rambus has announced a family of PCIe 7.0 IP. This portfolio includes a PCIe 7.0 Controller (host or endpoint), a PCIe 7.0 Switch IP, and a PCIe 7.0 Retimer IP.
  5. Rambus End-to-End PCIe Solution: Rambus offers a differentiated, end-to-end solution for PCIe systems, backed by over 20 years of design experience. Rambus also offers XpressAGENT, an embedded debug/logic analyzer tool, for rapid link bring-up.

Key Quote

The PCI Express specification has evolved rapidly in recent years and the primary application category is AI, more specifically generative AI powered by large language models (LLMs). In addition, there are also some wider industry trends such as disaggregation that are also influencing developments in the PCIe specification.


Rambus Unveils PCIe 7.0 IP Portfolio for High-Performance Data Center and AI SoCs
https://www.rambus.com/blogs/rambus-unveils-pcie-7-0-ip-portfolio-for-high-performance-data-center-and-ai-socs/
Wed, 12 Jun 2024

The relentless innovation in Artificial Intelligence (AI) and High-Performance Computing (HPC) demands a cutting-edge hardware infrastructure capable of handling unprecedented data loads. To overcome these challenges and usher in a new era of performance, Rambus is proud to announce the launch of our PCI Express® (PCIe®) 7.0 IP portfolio, encompassing a comprehensive suite of IP solutions including:

  • PCIe 7.0 Controller designed to deliver the high bandwidth, low latency, and robust performance required for next-generation AI and HPC applications
  • PCIe 7.0 Retimer for highly-optimized, low-latency data path for signal regeneration
  • PCIe 7.0 Multi-port Switch that is physically aware to support numerous architectures
  • XpressAGENT™ to enable customers to rapidly bring up first silicon

“The burgeoning landscape of data center chip manufacturers, driven by the emergence of novel data center architectures, necessitates the availability of high-performance interface IP solutions to foster a robust and thriving ecosystem,” said Neeraj Paliwal, SVP & GM of Silicon IP at Rambus. “The Rambus PCIe 7.0 IP portfolio addresses this challenge by delivering unparalleled bandwidth, low latency, and security features. These components work together to provide a seamless, high-performance solution that meets the rigorous demands of AI and HPC applications.”

Rambus PCIe 7.0 Controller IP key features include:

  • Supports PCIe 7.0 specification including 128 GT/s data rate
  • Implementation of low-latency Forward Error Correction (FEC) for link robustness
  • Supports fixed-sized FLITs that enable high-bandwidth efficiency
  • Backward compatible with PCIe 6.0, 5.0, 4.0, etc.
  • State-of-the-art security with an IDE engine
  • Supports AMBA AXI interconnect
PCIe 7.0 Controller IP Block Diagram
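As background on the fixed-size FLIT feature listed above: in FLIT mode (introduced with PCIe 6.0 and carried forward), link data moves in fixed 256-byte flow control units. In the PCIe 6.0 layout, each FLIT carries 236 bytes of transaction-layer payload alongside data-link, CRC, and FEC bytes, which is what makes bandwidth efficiency high while keeping FEC latency low. A quick sketch of that accounting (figures are the PCIe 6.0 FLIT layout; consult the specification for PCIe 7.0 details):

```python
FLIT_BYTES = 256   # fixed FLIT size in PCIe 6.0 FLIT mode
TLP_BYTES = 236    # transaction-layer payload per FLIT
DLP_BYTES = 6      # data-link payload (credits, acks)
CRC_BYTES = 8      # CRC protecting the FLIT
FEC_BYTES = 6      # lightweight forward error correction

# The four fields account for the whole fixed-size FLIT:
assert TLP_BYTES + DLP_BYTES + CRC_BYTES + FEC_BYTES == FLIT_BYTES

efficiency = TLP_BYTES / FLIT_BYTES
print(f"FLIT payload efficiency: {efficiency:.1%}")  # 92.2%
```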

Rambus PCIe 7.0 Retimer IP key features include:

  • Supports PCIe 7.0 specification x2 to x16 lanes
  • Pre-integrated XpressAGENT debug analysis IP
  • Highly-configurable equalization algorithms with adaptive behaviors
  • Power modes and intelligent clock gating to optimize power management
PCIe 7.0 Retimer IP Block Diagram

Rambus PCIe 7.0 Switch IP key features include:

  • Highly scalable up to 32 ports, configurable as external or internal endpoints
  • Physically aware to account for port placements across large die
  • Superior performance through non-blocking architecture
  • Allows seamless migration from FPGA prototyping design to ASIC/SoC production design with the same RTL
PCIe 7.0 Switch Block Diagram

Rambus PCIe XpressAGENT key features include:

  • Non-intrusive, intelligent, in-IP debug/logic analyzer for PCIe Controller, Retimer and Switch IP enabling rapid first-silicon bring-up
  • Integrates with any PIPE compliant SerDes
  • Provides unified access to PHY, MAC and Link Layers locally or remotely via a CPU-agnostic API
  • Provides pre-emptive monitoring and diagnosis via remote access for in-field products

In addition to the PCIe IP portfolio, Rambus also offers industry-leading interface IP for HBM, CXL, GDDR, LPDDR, and MIPI. For more information, visit www.rambus.com/interface-ip.
