CPU Archives - Rambus
At Rambus, we create cutting-edge semiconductor and IP products, providing industry-leading chips and silicon IP to make data faster and safer.

Optimizing data centers with DDR4 buffer chips
https://www.rambus.com/blogs/optimizing-data-centers-with-ddr4-buffer-chips-2/
Wed, 21 Dec 2016

DDR4 memory delivers up to a 1.5x performance improvement over DDR3, running at 2.4-3.2 Gbps while reducing power on the memory interface by 25%. However, the shift to higher speeds degrades electrical signal integrity, especially when multiple modules are added to a system. Consequently, achieving higher capacities at more advanced speeds has become quite challenging.

To overcome these limitations, specialized clocks and dedicated DDR4 memory buffer chips have been integrated onto DIMMs. Put simply, buffer chips allow server designers to maintain high speeds with DDR4 while enabling the higher capacity demanded by Big Data applications. This matters because multiple loads (DRAM devices) on the bus tend to reduce the maximum speed the bus can reach.

Buffer chips help improve system signal integrity by reconditioning the signal coming from the CPU and forwarding it to DRAM, thereby enabling higher operating data rates. In addition, buffer chips facilitate optimized RAS (reliability, availability and serviceability), with the silicon verifying the correctness of commands and data.

As expected, DDR4 buffer chips offer several distinct advantages over the previous generation (DDR3), including faster speeds, higher usable bandwidth, larger device densities and more banks (16). In addition, DDR4 defines chip ID (CID) signals for addressing die stacks up to 8 high, while the smaller DDR4 row sizes (in x4 devices) lower power consumption and improve performance for multi-threaded applications.

Moreover, DDR4 operates at a lower 1.2V supply voltage, uses pseudo open drain (POD) VDDQ termination that reduces I/O current draw, and moves the wordline boost supply to an external VPP rail, eliminating the need for an internal voltage pump. In terms of specific RAS improvements, DDR4 buffer chips support register parity checks with command blocking, optional CRC protection at high data rates, boundary scan (connectivity test mode), MRS readout via the MPR and native ECC support for DDR4 SoDIMMs.

Clearly, increased memory capacity and performance are critical for data centers tasked with solving today’s complex problems involving the storage and analysis of huge data sets. This is precisely why our DDR4 Data Buffer (DB), the iDDR4DB2-GS02, enables DDR4 Load Reduced Dual Inline Memory Modules (LRDIMMs) to deliver high-bandwidth performance (when combined with our DDR4 RCD) with twice the capacity of DDR4 Registered DIMMs (RDIMMs).

Designed to meet the demanding requirements for real-time, memory-intensive applications, the DB delivers enhanced performance and margin at 2400 Mbps with built-in support for future data rates up to 2666 Mbps. This facilitates the highest speeds and robust operation when multiple LRDIMMs populate the memory channel for the highest system capacities.
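As a rough illustration of what those data rates mean at the module level, the peak bandwidth of a standard 64-bit DDR4 DIMM works out as follows. This is a back-of-the-envelope sketch using generic JEDEC bus widths and transfer rates, not figures from any specific product datasheet:

```python
# Back-of-the-envelope peak bandwidth for a DDR4 DIMM, assuming the
# standard 64-bit (8-byte) data bus. Illustrative figures only.
def dimm_bandwidth_gb_s(transfer_rate_mt_s, bus_width_bits=64):
    """Peak bandwidth = transfer rate x bus width, in GB/s."""
    return transfer_rate_mt_s * 1e6 * (bus_width_bits // 8) / 1e9

for rate in (2400, 2666, 3200):
    print(f"DDR4-{rate}: {dimm_bandwidth_gb_s(rate):.1f} GB/s peak")
```

At 2400 MT/s this gives 19.2 GB/s per DIMM, rising to 21.3 GB/s at the 2666 MT/s rate mentioned above; sustaining such rates across multiple loaded modules is exactly where the buffer chips earn their keep.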

The iDDR4DB2-GS02 dual 4-bit bidirectional data register with differential strobes is designed for 1.2 V VDD operation. The device has a dual 4-bit host bus interface connected to a memory controller and a dual 4-bit DRAM interface that is connected to two x4 DRAMs. It also has an input-only control bus interface that is connected to a DDR4 Register. This interface consists of a 4-bit control bus, two dedicated control signals, a voltage reference input and a differential clock input.

All DQ inputs are pseudo-differential with an internal voltage reference, while all DQ outputs are VDD-terminated drivers optimized to drive single- or dual-terminated traces in DDR4 LRDIMM applications. The differential DQS strobes are used to sample the DQ inputs and are regenerated in the DDR4 DB for driving out the DQ outputs on the opposite side of the device. The iDDR4DB2-GS02 also supports dedicated pins for ZQ calibration and for parity error alerts.

Interested in learning more? You can check out our server DIMM chipsets product page here.

Saving power with HBM
https://www.rambus.com/blogs/saving-power-with-hbm-2/
Mon, 28 Nov 2016

Ed Sperling of Semiconductor Engineering notes that power has always been a “global concern” in the design process because it affects every part of a chip. Nevertheless, partitioning for power rather than for functionality or performance has not historically been seriously considered, although the status quo is beginning to change.

For example, says Sperling, the increasing use of system partitioning into multiple chips connected by high-speed buses rather than putting everything on a single chip offers some interesting possibilities for managing power.


According to Kelvin Low, senior director of foundry marketing at Samsung, system architects are now looking at power management in a different way rather than simply relying on silicon technology.

“You can partition a system to achieve system-level performance scaling,” Low told SemiEngineering. “So if you use a 2.5D approach with HBM2 (second-generation High-Bandwidth Memory), the system-level performance increases. It becomes a partition problem, but the distributed processing approach is an important enabler.”

As Sperling points out, this approach has a bearing on power as well, because it takes less power to drive signals through an interposer than through increasingly narrow wires on a single die at advanced nodes. As a result, there are significant power savings in addition to performance increases.

Frank Ferro, a senior director of product management for memory and interface IP at Rambus, expressed similar sentiments.

“One of the advantages of HBM2 is that you can move it closer to the processing, and you have 2 gig (gigatransfers/second per pin) rates,” Ferro told the publication. “The power of HBM2 is lower, too, and you can re-use quite a bit of technology. But it does require a new PHY design.”

As Ferro explained in a Semiconductor Engineering article earlier this year, HBM bolsters locally available memory by placing low-latency DRAM closer to the CPU. In addition, HBM DRAM increases memory bandwidth by providing a very wide, 1024-bit interface to the SoC. At the maximum HBM2 per-pin rate of 2Gb/s, this works out to a total bandwidth of 256Gbytes/s per stack. Although the per-pin rate is similar to DDR3 at 2.1Gbps, HBM’s eight 128-bit channels provide about 15X more bandwidth.
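The bandwidth arithmetic can be checked directly using the round figures quoted in the article (a 64-bit DDR3 interface is assumed for the comparison):

```python
# Reproduce the HBM2 bandwidth arithmetic: a 1024-bit interface at
# 2 Gb/s per pin, organized as eight 128-bit channels.
interface_bits = 1024
pin_rate_gbps = 2.0
hbm2_gb_s = interface_bits * pin_rate_gbps / 8       # bits -> bytes
print(hbm2_gb_s)                                     # 256.0 GB/s per stack

# A conventional 64-bit DDR3 interface at 2.1 Gb/s per pin, for comparison
ddr3_gb_s = 64 * 2.1 / 8                             # 16.8 GB/s
print(round(hbm2_gb_s / ddr3_gb_s, 1))               # roughly 15x more bandwidth
```

The same arithmetic also explains the four-stack figure discussed below: four stacks at 256 GB/s each put roughly 1 TB/s of aggregate bandwidth next to the processor.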

Perhaps not surprisingly, mass-market deployment of HBM will present the industry with a number of challenges. This is because 2.5D-packaging technology, along with a silicon interposer, increases manufacturing complexities and cost. In addition, HBM routes thousands of signals (data + control + power/ground) via the interposer to the SoC (for each HBM memory used). Clearly, maximal yields will be critical to making HBM cost effective, especially since there are a number of expensive components being mounted to the interposer, including the SoC and multiple HBM die stacks.

Nevertheless, even with the above-mentioned challenges, having four HBM memory stacks, for example, each delivering 256Gbytes/sec in close proximity to the CPU, provides a significant increase in both memory capacity (up to 8GB per HBM stack) and bandwidth when compared with existing architectures.

Interested in learning more? The full text of “Partitioning for Power” by Ed Sperling is available on Semiconductor Engineering here.

The evolution of embedded FPGAs
https://www.rambus.com/blogs/the-evolution-of-embedded-fpgas-2/
Tue, 08 Nov 2016

Brian Bailey of Semiconductor Engineering observes that systems on chip have been manufactured with numerous processing variants, ranging from general-purpose CPUs to DSPs, GPUs and custom processors that are highly optimized for certain tasks.

“When none of these options provides the necessary performance, or consumes too much power, custom hardware takes over. But there is one type of processing element that has rarely been used in a major SoC: the FPGA,” he explained. “Solutions implemented in FPGAs are often faster than any of the instruction-set processors. In most cases they complete a computation with lower total energy consumption.”


However, as Bailey points out, the overall power consumption of embedded FPGAs (as opposed to discrete) is higher, while performance is slower than custom hardware. In addition, field programmable gate arrays typically occupy significantly more silicon area than ASICs.

“In the past, several companies have attempted to pioneer the embedded FPGA space, but none have been successful,” he continued. “To understand why eFPGAs may succeed this time around requires an understanding of both the changes happening across the industry at large and within specific markets.”

Indeed, numerous markets have traditionally relied on a waning Moore’s Law to enable increasing levels of integration as well as lower power, although product cycles at the top of the market are now predictably slowing.

“Networking and communications chips have long design cycles and are typically fabricated in advanced process nodes with $2M to $5M mask costs,” Geoffrey Tate, CEO of Flex Logix, told Semiconductor Engineering. “The problem with this is that standards such as protocols and packets are changing rapidly. It used to be that these chips would be redesigned every couple of years to keep up, which is an increasingly expensive proposition. In addition, data centers are pushing to make chips programmable so they can be upgraded in-system automatically, thereby improving the economics of data centers and enabling them to do their own customization and optimization for a competitive edge.”

Steven Woo, VP of Systems and Solutions at Rambus, expressed similar sentiments.

“Rising design and mask costs at smaller process geometries, coupled with increasing chip complexity, verification effort and embedded software development, make the economics of chip design difficult, especially for smaller markets,” he explained. “FPGA technology offers the potential to help address this by allowing multiple markets and applications to be served with a single chip.”

As Woo notes, there is a tradeoff between the flexibility afforded by FPGAs and the increased area overhead such versatility incurs.

“The key is whether or not critical metrics such as application performance, power and TCO justify the overhead of increased flexibility. The industry is still in the early days of understanding how to use FPGAs in environments like data centers, so the adoption of FPGAs in this market will depend greatly on how much applications can benefit from them. As Microsoft has demonstrated, there are already compelling reasons to adopt them for modern workloads,” he added.

Interested in learning more? The full text of “Embedded FPGAs Going Mainstream?” is available on Semiconductor Engineering here. You can also check out our article archive on FPGAs here.

NVIDIA licenses DPA countermeasures
https://www.rambus.com/blogs/nvidia-licenses-dpa-countermeasures-2/
Mon, 24 Oct 2016

NVIDIA has licensed Rambus’ Differential Power Analysis (DPA) countermeasures to protect its visual computing products against side-channel attacks.

As Dr. Martin Scott, GM of Rambus’ Security Division notes, NVIDIA products help drive performance for some of the world’s most demanding users including gamers, designers and scientists.


“The products, services and software developed by NVIDIA power a range of amazing experiences for artificial intelligence, autonomous cars, virtual reality and professional visualization,” he stated.

According to Scott, innovative companies are continuously assessing the security risks associated with increased levels of connectivity.

“By integrating DPA countermeasures into their products, NVIDIA is at the forefront of ensuring the integrity of their solutions in a variety of advanced application areas,” he added.

As we’ve previously discussed on Rambus Press, DPA is a type of side-channel attack that involves monitoring variations in the electrical power consumption or EM emissions from a target device. These measurements can then be used to derive cryptographic keys and other sensitive information from chips. This is precisely why Rambus’ Cryptography Research division has developed a comprehensive portfolio of application-specific hardware core and software library solutions that can be used to build DPA resistant products.
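To make the statistical idea concrete, here is a toy difference-of-means sketch in Python. The "device" is entirely hypothetical: a 4-bit S-box lookup whose low output bit leaks into a noisy power measurement. Real DPA attacks target full ciphers using thousands of physically measured traces, but the core of the analysis is the same:

```python
# Toy difference-of-means DPA sketch against a hypothetical leaky device.
import random
random.seed(0)

SBOX = [0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5,
        0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76]   # AES S-box, row 0
SECRET_KEY = 0b1011

def power_trace(plaintext):
    leaked_bit = SBOX[plaintext ^ SECRET_KEY] & 1          # data-dependent draw
    return leaked_bit + random.gauss(0, 0.5)               # plus measurement noise

# "Measure" 5,000 traces for random known plaintexts
traces = [(p, power_trace(p)) for p in random.choices(range(16), k=5000)]

def difference_of_means(key_guess):
    # Partition traces by the bit predicted under this guess; the
    # correct key yields the largest gap between the two group means.
    hi = [w for p, w in traces if SBOX[p ^ key_guess] & 1]
    lo = [w for p, w in traces if not SBOX[p ^ key_guess] & 1]
    return sum(hi) / len(hi) - sum(lo) / len(lo)

recovered = max(range(16), key=difference_of_means)
print(f"recovered key nibble: {recovered:04b}")            # 1011
```

Note that the attack never inspects the device's internals: the key nibble falls out purely from correlating guesses against power measurements, which is why countermeasures focus on decorrelating power draw from processed data.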

In addition to DPA countermeasures, Rambus has designed a DPA Workstation (DPAWS) platform for its customers and partners. Essentially, DPAWS analyzes hardware and software cryptographic implementations for vulnerabilities to power and electromagnetic side-channel attacks. Specifically, DPAWS enables users to quickly assess any vulnerability that an FPGA, ASIC, CPU or microcontroller may have to side-channel analysis.

DPAWS also includes an integrated suite of hardware and data visualization software to aid in the identification and understanding of vulnerabilities in cryptographic chips.

This includes:

  • A project library manager that delivers an integrated view of multiple data sets, scripts and analyses.
  • A powerful trace display with an intuitive interface for easy analysis.
  • Integrated scripting modules for MATLAB and Python.

Interested in learning more about DPA countermeasures? You can check out our product page here and our article archive on the subject here.

Microsoft catapults FPGAs to new heights
https://www.rambus.com/blogs/microsoft-catapults-fpgas-to-new-heights-2/
Thu, 13 Oct 2016

Karl Freund of Moor Insights & Strategy recently penned an article for Forbes about Microsoft’s extensive deployment of FPGAs in the data center and beyond.

As Freund notes, Microsoft currently uses field programmable gate arrays to accelerate its Bing search engine (Project Catapult) along with its Azure cloud, which has at least one FPGA in each server, delivering over one “exa-op” (a billion billion operations per second) of total throughput across data centers in 15 countries.


“The real key to Microsoft’s heart is not just performance or power consumption. Microsoft points to the flexibility that FPGAs afford due to their inherent programmability,” writes Freund. “The ‘P’ in FPGA means programmable, and therein may lie their most important value to Microsoft and in the data center in general. Once programmed, the FPGA hardware itself can be changed (reprogrammed) in the field (hence the “F”) to enable it to evolve with changes in the company’s business, science and underlying logic.”

According to Freund, Microsoft’s plans for FPGAs extend far and wide.

“Beyond Deep Learning acceleration, Microsoft is using FPGAs to accelerate networking and the complex software required to implement software-defined networks,” he adds.

Commenting on the above, Steven Woo, VP of Systems and Solutions at Rambus, notes that a number of major industry players have turned to FPGAs to accelerate a wide range of data-intensive tasks which historically have been distributed across multiple racks of servers.

“Aggregating numerous individual servers into a pool of processing units is a ‘one size fits all’ approach typically characterized by a relatively fixed amount of compute, memory, storage and I/O resources in each server,” he explains. “However, in practice, this paradigm suffers from an acute under-utilization of resources. This is because specific tasks may require a different amount of each resource during execution.”

Moreover, says Woo, legacy server architectures can also contribute to low CPU utilization rates, high latencies to access data, reduced power efficiency and increased TCO. In contrast, versatile FPGAs allow companies to evolve a more modular, flexible and effective approach for data centers, acceleration, HPC and beyond. For example, Baidu engineers are using field programmable gate arrays to accelerate SQL queries, while DeePhi is looking towards reconfigurable devices such as FPGAs for deep learning.

“Together with other silicon, such as GPU and CPUs, FPGAs will play an increasingly important role in helping to evolve computing platforms by enabling flexible acceleration and near data processing,” Woo concludes. “At Rambus, we look forward to collaborating with our industry partners and customers on cutting-edge memory technologies and solutions for future servers and data centers.”

Indeed, it should be noted that Rambus recently signed a license agreement with Xilinx that covers Rambus’ patented memory controller, SerDes and security technologies. In addition, the two companies agreed to evaluate potential collaboration on the use of Rambus’ CryptoManager platform, with Rambus also exploring the use of Xilinx FPGAs in its Smart Data Acceleration (SDA) research program.


As we’ve previously discussed on Rambus Press, the SDA research program focuses on architectures designed to offload computing closer to very large data sets at multiple points in the memory and storage hierarchy. Potential use case scenarios include big data analytics, real-time risk analytics, ad serving, neural imaging, transcoding and genome mapping. Comprising software, firmware, FPGAs and significant amounts of DRAM, the SDA platform operates as an effective test bed for exploring new methods of optimizing and accelerating analytics in extremely large data sets. As such, the SDA’s versatile combination of hardware, software, firmware, drivers and bit files can be precisely tweaked to facilitate architectural exploration of specific applications.

Put simply, the SDA – powered by an FPGA paired with 24 DIMMS – offers high memory densities linked to a flexible computing resource. Currently, the SDA platform functionality is targeted at accelerating and offloading tasks such as those found in Big Data analytics and in-memory database applications. The Smart Data Acceleration platform can also be made available over a network where it would serve as a shared resource and offload agent in a more disaggregated scenario.

Interested in learning more? You can check out our SDA research program article archive here.

Maximizing Von Neumann architecture
https://www.rambus.com/blogs/maximizing-von-neumann-architecture-2/
Wed, 05 Oct 2016

In 1945, mathematician and physicist John von Neumann described a design architecture for an electronic digital computer in the First Draft of a Report on the EDVAC. Also known as the Princeton architecture, the design included a processing unit with an arithmetic logic unit and processor registers; a control unit containing an instruction register and program counter; memory to store both data and instructions; external mass storage; and an input/output mechanism.


Although modern systems have indeed benefited from decades of Moore’s Law and Dennard Scaling, the basic computer architecture has remained fundamentally unchanged since the days of Von Neumann. While a plethora of alternative architectures have been proposed over the years, none has managed to gain the sustained traction of the von Neumann architecture. But as a 2016 Bernstein research report observes, the sustained industry reliance on this architecture has led to the development of multiple bottlenecks.

“The first major limitation of the Von Neumann architecture is the ‘Von Neumann Bottleneck’; the speed of the architecture is limited to the speed at which the CPU can retrieve instructions and data from memory,” Bernstein analysts Pierre Ferragu, Stacy Rasgon, Mark Li, Mark Newman and Matthew Morrison explained. “The throughput of a computer system is limited due to the relative ability of processors compared to top rates of data transfer. Therefore, the processor is idle for a certain amount of time while the memory is accessed.”


According to the analysts, the Von Neumann bottleneck has only worsened over time, as the disparity between processor speed and memory access throughput speed widens.

“Whilst a number of solutions have been proposed and implemented in modern day computers (including cache memory and branch predictor algorithms), these solutions have not been able to solve the root of the problem – the actual underlying design architecture,” the analysts stated. “Secondly, the step-by-step serial nature of a Von Neumann processor means that analyzing a very large, complex data set requires a large amount of processing power, which is both time-consuming and very expensive. Therefore, when it comes to certain applications, traditional processor architecture simply cannot be utilized in a fast and cost-efficient manner.”
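The bottleneck the analysts describe can be put in numbers. The sketch below uses round, hypothetical figures for compute throughput and memory bandwidth (not any specific processor) to show how little of the time a processor can stay busy on a memory-bound streaming workload:

```python
# Illustrative machine-balance arithmetic behind the Von Neumann
# bottleneck. Figures are round hypothetical numbers, not a real part.
peak_flops = 500e9           # 500 GFLOP/s of raw compute
mem_bw_bytes = 50e9          # 50 GB/s of memory bandwidth

# A streaming operation like y[i] = a*x[i] + y[i] moves 24 bytes
# (read x, read y, write y) for every 2 floating-point operations.
bytes_per_flop_needed = 24 / 2
bytes_per_flop_available = mem_bw_bytes / peak_flops   # 0.1

utilization = bytes_per_flop_available / bytes_per_flop_needed
print(f"CPU busy {utilization:.1%} of the time; idle the rest")
```

Under these assumptions the processor sits idle more than 99% of the time waiting on memory, which is why caches, branch prediction and, increasingly, near-data processing exist.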

Steven Woo, VP of Systems and Solutions at Rambus, expressed similar sentiments during a recent interview with Rambus Press.

“Bottlenecks have arisen in traditional architectures that are driving the industry to re-think how systems should be designed moving forward,” he explained.


“Several techniques for bringing architectures back into balance are being pursued by the industry, including Near Data Processing to minimize data movement and energy consumption, and hardware acceleration to improve performance and power efficiency.”

More specifically, says Woo, acceleration can now be implemented across a wide range of silicon, including field-programmable gate arrays (FPGAs). As Microsoft’s Project Catapult illustrates, FPGAs are helping to play a critical role in evolving future computing platforms. To be sure, FPGAs are already used in Microsoft’s Bing search engine and will soon power new search engines based on deep neural networks.

“Project Catapult signals a change in how global systems will operate in the future. From Amazon in the US to Baidu in China, all the Internet giants are supplementing their standard server chips—central processing units, or CPUs—with alternative silicon that can keep pace with the rapid changes in AI,” Wired’s Cade Metz recently reported.


“FPGAs also drive Azure, the company’s cloud computing service. And in the coming years, almost every new Microsoft server will include an FPGA. [In addition], Office 365 is moving toward using FPGAs for encryption and compression as well as machine learning—for all of its 23.1 million users. Eventually, these chips will power all Microsoft services.”

According to Woo, the industry has realized that it can no longer rely on Moore’s Law alone to optimize CPU performance and power efficiency.

“As the Bernstein analysts note, traditional architectures are not the best choice for some applications because they don’t address key bottlenecks that exist in these workloads,” he added. “Traditional processors coupled with FPGAs, and technologies to minimize data movement, offer new approaches to improving performance and power efficiency in modern systems. We believe FPGAs will continue to play an important role in helping to evolve computing platforms by enabling flexible acceleration and near data processing.”

Managing memory more efficiently with Milk
https://www.rambus.com/blogs/managing-memory-more-efficiently-with-milk-2/
Thu, 22 Sep 2016

Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have introduced a programming language extension known as Milk that allows app developers to manage memory more efficiently in programs that make scattered accesses to large data sets.


To be sure, fetching data from memory banks is currently a major performance bottleneck, with cores grabbing entire blocks of data at a time based on the principle of locality. This approach results in slow program execution for many modern workloads, especially those with frequent random, indirect memory accesses such as graph analytics, key-value stores and machine learning.

According to MIT News, programs written using these new language extensions were four times as fast as those coded without them, although the researchers believe additional work on Milk will yield even more significant gains.

As Saman Amarasinghe, an MIT professor of Electrical Engineering and Computer Science, explains, big data sets pose problems for existing memory management techniques because they are sparse. Put simply, the scale of the solution does not necessarily increase proportionally with the scale of the problem.

“In social settings, we used to look at smaller problems,” Amarasinghe told the publication. “If you look at the people in this [CSAIL] building, we’re all connected. But if you look at the planet scale, I don’t scale my number of friends. The planet has billions of people, but I still have only hundreds of friends. Suddenly you have a very sparse problem.”

Vladimir Kiriansky, a PhD student in electrical engineering and computer science and first author on the paper introducing Milk, expressed similar sentiments.

“It’s as if, every time you want a spoonful of cereal, you open the fridge, open the milk carton, pour a spoonful of milk, close the carton and put it back in the fridge,” he said.

With Milk, chip cores refrain from grabbing entire blocks of data at a time. Instead, Milk adds a data item’s address to a list of locally stored addresses. Ultimately, the cores ‘pool’ their respective lists, allowing addresses in close proximity to be grouped and redistributed. Each core then requests only the data items it needs, which can be retrieved efficiently.
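That pooling step can be sketched in a few lines of Python. This is only an illustration of the batching concept, not the actual Milk compiler extension; the block size and helper names are invented for the example:

```python
# Sketch of the batching idea behind Milk: instead of chasing scattered
# indirect accesses one at a time, collect the target addresses, group
# them by memory block, and service each block's accesses together.
from collections import defaultdict

BLOCK = 8  # elements per "cache line" in this toy model

def gather_batched(data, indices):
    by_block = defaultdict(list)
    for pos, i in enumerate(indices):        # phase 1: pool the addresses
        by_block[i // BLOCK].append((pos, i))

    out = [None] * len(indices)
    for block in sorted(by_block):           # phase 2: visit each block once
        for pos, i in by_block[block]:
            out[pos] = data[i]               # all hits land in the same block
    return out

data = list(range(100, 164))
print(gather_batched(data, [3, 60, 1, 59, 2, 58]))
```

The result is identical to a naive gather, but accesses that would have ping-ponged between distant blocks are served consecutively from the same block, which is the locality win Milk automates at the language level.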

Commenting on the introduction of Milk, Steven Woo, VP of Systems and Solutions at Rambus, told us that modern applications such as in-memory databases, analytics and machine learning are increasingly being bottlenecked by accesses to the memory system.

“Limitations in delivered bandwidth, latency and capacity are causing CPUs to be heavily underutilized, driving the need for change in how applications interact with hardware,” he explained. “Making more efficient use of memory resources and minimizing data movement throughout the memory hierarchy, are critical issues that must be addressed in order to improve the performance and power efficiency of emerging workloads. Approaches like Milk demonstrate the potential benefits to end users of solving these key issues.”

 

Protecting avionic systems from side-channel attack
https://www.rambus.com/blogs/protecting-avionic-systems-from-side-channel-attack-2/
Wed, 17 Aug 2016

Asaf Ashkenazi, a senior director of product marketing at Rambus’ security division, recently sat down with Neil Tyler of NewElectronics to discuss the potential threat side-channel attacks pose to avionic systems.

As Tyler points out, encryption is typically used to protect aerospace platforms. Although it is difficult to break a cryptographic algorithm itself, devices can readily reveal information during routine operation through factors such as power consumption, heat dissipation, computation time and electromagnetic leakage.


“This type of [data] is referred to as side-channel information. The attacker can use this to determine the keys and break the cryptosystem. It’s breaking the system by going through the back door, [with attacks such as differential power analysis, or DPA],” he told the publication. “[In a broader sense], the threat of DPA attacks is on the rise and [aerospace] companies will need security solutions to safeguard high-value data. [This is why] Boeing recently signed a license agreement with Rambus for the inclusion of advanced DPA countermeasures in its products.”

According to Ashkenazi, electronic circuits are inherently leaky, producing a variety of emissions as byproducts that make it possible for an attacker to deduce how the circuit works and what data it is processing.

“All of these types of [side-channel] attacks can be recorded and reveal a surprising amount of information, especially if these attacks are combined,” he explained. “[Nor] do hackers need expensive equipment to do this. Pay a visit to the Dark Net and you can download the necessary software to carry out these attacks.”

As noted above, a wide range of DPA countermeasures are available to protect against various types of side-channel attacks, including special shielding, power line conditioning and filtering, as well as blinding, which randomly adds a delay to any cryptographic computation.

“We have developed a technology that ensures signals emitted from any cryptographic operation are unreadable; any information generated will not make sense,” he added. “Essentially, we are hiding the data and, while the standard algorithm stays the same, the way in which it is implemented is changed.”

As we’ve previously discussed on Rambus Press, concerns about DPA attacks originated in the smart card market, although such attacks have since spread into other segments, including aerospace and defense. Fortunately, government and military systems can be protected from cyber adversaries with a hardware-centric security approach, which helps prevent the threat of reverse engineering and exploitation.

To evaluate vulnerability and resistance to side-channel attacks, Rambus has also developed a DPA Workstation (DPAWS) platform for its customers and partners. Essentially, DPAWS analyzes hardware and software cryptographic implementations for vulnerabilities to power and electromagnetic side-channel attacks. Specifically, DPAWS enables users to quickly assess any vulnerability that an FPGA, ASIC, CPU or microcontroller may have to side-channel analysis.

In addition, DPAWS includes an integrated suite of hardware and data visualization software to aid in the identification and understanding of vulnerabilities in cryptographic chips.

Interested in learning more? The full text of “Side Channel Attacks” by Neil Tyler is available on NewElectronics here (PDF). You can also check out our DPA Countermeasures product page here and our DPA Workstation product page here.

 

Computer Business Review highlights side-channel threat
https://www.rambus.com/blogs/computer-business-review-highlights-side-channel-threat-2/
Tue, 09 Aug 2016

Alexander Sword of Computer Business Review says cyber security is often thought of as a software issue that can be solved with a software solution. However, this paradigm ignores hardware-based attacks, a type of cyber threat that security providers are now taking quite seriously.

“There are still plenty of unsecured chips out there, vulnerable to several major types of hardware attack,” he explained. “These include side-channel attacks, which are techniques that allow attackers to monitor the analogue characteristics and interface connections and any electromagnetic radiation.”

One software bug away from total compromise

According to Sword, differential power analysis (DPA) is a type of side-channel attack that measures the electrical power consumption or electromagnetic emissions from the device.

“From these measurements, attackers can derive cryptographic keys and private data,” he continued. “These keys allow attackers to easily gain unauthorized access to a device, decrypt or forge messages, steal identities, clone devices, create unauthorized signatures and perform additional unauthorized transactions.”

As Sword notes, Boeing recently licensed Rambus DPA Countermeasures to protect its aerospace and defense systems from security threats.

“Rambus is also working with smartphone manufacturers, [as the company’s] CryptoManager platform establishes a hardware-based root-of-trust, embedding a security core in the SoC itself,” he added. “Vendors can therefore securely provision unique keys for each chip during the silicon manufacturing and testing process.”

As we’ve previously discussed on Rambus Press, DPA countermeasures will allow Boeing to protect against security attacks that are used to reverse engineer or exploit critical technologies built into aircraft and other defense-related products. To be sure, the threat of DPA attacks is on the rise, and defense companies require an extremely high level of hardware-based security to safeguard their customers’ high-value data.

Perhaps not surprisingly, concerns about DPA attacks originated in the smart card market, although such attacks have since spread into other segments, including aerospace and defense. Fortunately, government and military systems can be protected from cyber adversaries with a hardware-centric security approach, which helps prevent the threat of reverse engineering and exploitation.

To evaluate vulnerability and resistance to side-channel attacks, Rambus has also developed a DPA Workstation (DPAWS) platform for its customers and partners. Essentially, DPAWS analyzes hardware and software cryptographic implementations for vulnerabilities to power and electromagnetic side-channel attacks. Specifically, DPAWS enables users to quickly assess any vulnerability that an FPGA, ASIC, CPU or microcontroller may have to side-channel analysis.

In addition, DPAWS includes an integrated suite of hardware and data visualization software to aid in the identification and understanding of vulnerabilities in cryptographic chips.

Interested in learning more? You can check out our DPA Countermeasures product page here and our DPA Workstation product page here.

]]>
https://www.rambus.com/blogs/computer-business-review-highlights-side-channel-threat-2/feed/ 0
EE Times takes a closer look at Rambus’ 14nm R+ DDR4 PHY https://www.rambus.com/blogs/ee-times-takes-a-closer-look-at-rambus-14nm-r-ddr4-phy-2/ https://www.rambus.com/blogs/ee-times-takes-a-closer-look-at-rambus-14nm-r-ddr4-phy-2/#respond Mon, 01 Aug 2016 16:36:51 +0000 https://www.rambusblog.com/?p=1813 Gary Hilson of the EE Times has covered Rambus’ recent announcement about the development of its R+ DDR4 PHY on GLOBALFOUNDRIES 14nm LPP process. As the journalist notes, the silicon is the first production-ready 3200 Mbps DDR4 PHY available on GLOBALFOUNDRIES Inc.’s FX-14 ASIC platform using its power-performance optimized 14nm LPP process.


“The Rambus R+ DDR4 PHY intellectual property uses Rambus’ proprietary R+ architecture, based on the DDR industry standard,” Hilson explained. “The PHY is part of the Rambus’ suite of memory and SerDes interface offerings for networking and data center applications. Meeting the performance and capacity demands of those segments [is] a heavy focus for the company.”

Frank Ferro, a senior director of product marketing at Rambus, told the publication that the DFI 4.0-compatible R+ DDR4 PHY will enable customers to differentiate their offerings with improved performance while still maintaining full compatibility with industry standard DDR4 and DDR3/3L/3U interfaces.

“This gets them ahead of the curve in terms of memory performance,” Ferro said.

Indeed, the R+ DDR4 PHY delivers data rates from 800 to 3200 Mbps in multiple memory sub-system options, including die down, DIMM and 3DS. It also supports 16 to 72-bit interfaces, along with single and multi-rank configurations. The overall goal, says Ferro, is to provide system designers with flexibility for both high performance and low power, which is where the GLOBALFOUNDRIES 14nm process comes in. Nevertheless, as Ferro emphasizes, while DDR4 provides a significant performance boost over DDR3, engineers are still finding it challenging to improve the interface between memory and the CPU.
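The headline numbers above translate directly into peak bandwidth: per-pin data rate times data-bus width, divided by eight bits per byte. A quick worked sketch (the function name is ours, and the 64-bit figure assumes the 8 extra bits of a 72-bit interface carry ECC rather than data, as is conventional):

```python
def peak_bandwidth_gbs(data_rate_mbps: int, data_bits: int) -> float:
    """Peak transfer rate in GB/s for a DDR interface:
    (per-pin rate in Mbps) x (data-bus width) / 8 bits per byte / 1000."""
    return data_rate_mbps * data_bits / 8 / 1000

# 64 data bits (a 72-bit DIMM interface minus ECC) at the PHY's top speed:
print(peak_bandwidth_gbs(3200, 64))   # 25.6 GB/s, the familiar DDR4-3200 figure
# ...and at the bottom of the supported range:
print(peak_bandwidth_gbs(800, 64))    # 6.4 GB/s
```

The 4x spread between those two figures is the flexibility Ferro describes: the same PHY can be provisioned for low-power operation or pushed to the interface's performance ceiling.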

“The CPUs can run faster, and they [have] multiple channels of local DRAM they are accessing, but the CPUs are only as good as the access to the memory,” he explained. “The interface is the key bottleneck in the system.”

Ferro told the EE Times that Rambus is using internal tools to analyze the physical connections between the CPU and the DIMMs. “That’s where the limits come in. I think there’s still a ways to go,” he added.

Another challenge, he notes, is balancing the trade-offs between density and bandwidth by looking at the physical loading onto the bus. Rambus, Ferro confirmed, is currently exploring technology to minimize the loading effect of DIMMs.

From a broader perspective, says Ferro, while the high performance computing (HPC) segment might be what comes to mind first, Rambus is looking to meet the needs of the Facebooks and Instagrams of the world as their data center requirements trickle down to the chip companies. To be sure, Rambus recently announced its intention to acquire Inphi’s memory interconnect business as well as Semtech’s Snowbush serial interface IP.

]]>
https://www.rambus.com/blogs/ee-times-takes-a-closer-look-at-rambus-14nm-r-ddr4-phy-2/feed/ 0