Intel Reveals Details Regarding Its “Knights Landing” Xeon Phi Coprocessor

Intel announced the ‘Knights Landing’ Xeon Phi coprocessor late last year, releasing very few details about the lineup at the time. As time passes, details are bound to surface, and Intel is said to start shipping the series next year. This is likely why Intel has now decided to reveal some more details regarding the ‘Knights Landing’ Xeon Phi coprocessor.

The announcement from last year points to Knights Landing making the jump from Intel’s enhanced Pentium-class P54C x86 cores to the more modern Silvermont x86 cores, significantly increasing single-threaded performance. Furthermore, the cores are said to incorporate AVX units supporting AVX-512F operations, which provide the bulk of Knights Landing’s compute power.

Intel is said to offer 72 cores in Knights Landing CPUs, with double-precision (FP64) performance expected to reach 3 TFLOPS, the CPUs being built on a 14 nm process. While this is somewhat old news, Intel revealed some more insights at ISC 2014.
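The 3 TFLOPS figure lines up with a simple back-of-the-envelope calculation. Note that the per-core unit count and clock speed below are our assumptions for illustration, not figures stated by Intel:

```python
cores = 72
units_per_core = 2     # assumed: two 512-bit vector units per core
vector_width_dp = 8    # 512-bit vector holds eight 64-bit doubles
fma_flops = 2          # a fused multiply-add counts as 2 FLOPs
clock_ghz = 1.3        # assumed clock speed

gflops = cores * units_per_core * vector_width_dp * fma_flops * clock_ghz
print(f"{gflops / 1000:.2f} TFLOPS")  # -> 3.00 TFLOPS
```

Under these assumptions the peak works out to roughly 3 TFLOPS of double-precision throughput, matching the headline figure.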

During the conference, Intel stated that it needed to replace the 512-bit GDDR5 memory interface present in the current Knights Corner series. This is why Intel and Micron have apparently struck a deal to work on a more advanced memory variant of the Hybrid Memory Cube (HMC) with increased bandwidth.

Also, Intel and Micron are said to be working on a Multi-Channel DRAM (MCDRAM) specially designed for Intel’s processors, with a custom interface best suited for Knights Landing. This is said to help scale memory support up to 16 GB of RAM while offering up to 500 GB/s of memory bandwidth, a 50% increase compared to Knights Corner’s GDDR5.
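As a quick sanity check on that 50% figure, the implied Knights Corner baseline can be computed directly; it lands within the 320–352 GB/s range generally quoted for Knights Corner’s GDDR5:

```python
knl_bandwidth_gbs = 500    # GB/s, claimed for Knights Landing's MCDRAM
claimed_increase = 0.50    # claimed 50% uplift over Knights Corner

# Solve knc * (1 + 0.5) = 500 for the implied Knights Corner bandwidth
implied_knc_bandwidth = knl_bandwidth_gbs / (1 + claimed_increase)
print(f"Implied Knights Corner bandwidth: {implied_knc_bandwidth:.0f} GB/s")
# -> ~333 GB/s, consistent with Knights Corner's quoted GDDR5 bandwidth
```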

The second change made to Knights Landing is said to involve replacing the True Scale Fabric with the Omni Scale Fabric in order to offer better performance than the current fabric solution. Though Intel is currently keeping this information under wraps, traditional Xeon processors are said to benefit from this fabric change in the future as well.

Lastly, compared to Intel’s Knights Corner series, Knights Landing will be available in both PCIe and socketed form factors, mainly thanks to the MCDRAM technology. This is said to allow the CPU to be installed alongside Xeon processors on specific motherboards. The company has also emphasised that the socketed Knights Landing version will be able to communicate directly with other CPUs via QuickPath Interconnect (QPI), rather than the current PCIe interface.

In addition, having Knights Landing socketed would also allow it to benefit from the Xeon’s NUMA capabilities, sharing memory and memory spaces with the Xeon CPUs. Also, Knights Landing is said to be binary compatible with Haswell CPUs, the company envisioning programs being written once and run across both types of processors.

Intel is expected to start shipping the Knights Landing Xeon Phi coprocessor around Q2 2015, with the company already lining up its first Knights Landing supercomputer deal with the National Energy Research Scientific Computing Center, involving around 9,300 Knights Landing nodes.

Thank you Anandtech for providing us with this information
Image courtesy of Anandtech

New Hybrid Memory Cube Designed To Boost DRAM Bandwidth By 15x

Micron, Samsung and Hynix, three of the largest NAND flash manufacturers, with support from around 100 tech companies, have announced the final specifications of a three-dimensional DRAM called the ‘Hybrid Memory Cube’. According to the claim made by the Hybrid Memory Cube Consortium, it will increase performance for networking and for high-performance computing requirements.

This technology stacks multiple DRAM memory chips over the DRAM controller. This is made possible by vertical interconnect access (via) technology, a method that passes electrical connections vertically through the DRAM chips. It also reduces the load on the DRAM: since the chips are stacked on top of each other over the controller, the distance between them is significantly shorter, with no need for circuit board traces.

There are two physical interfaces between the system’s processor and the memory cube: short reach and ultra-short reach. Short reach is similar to the connection a traditional memory stick maintains with the CPU, at a distance of no more than 10 inches. This method can provide network applications with a throughput of 15–28 Gbps per pin.

The ultra-short reach method is aimed primarily at low energy consumption and much closer-proximity designs for high-performance networking, test and measurement requirements, with a distance of about 3 inches between the memory block and the CPU. This will provide a throughput of 15 Gbps.

The initial Memory Cube will come in 2 GB and 4 GB variants and provide bi-directional bandwidth of up to 160 GB/s, more than the DDR3 and DDR4 specifications allow.
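The 160 GB/s figure is consistent with a cube exposing four full-width host links. The link count, lane count, and lane rate below are our assumptions for illustration, not values given in the article:

```python
links = 4            # assumed: four host links per cube
lanes_per_link = 16  # assumed: full-width 16-lane links
gbps_per_lane = 10   # assumed lane rate in Gb/s, per direction
directions = 2       # bandwidth is quoted bi-directionally

total_gbps = links * lanes_per_link * gbps_per_lane * directions
total_gbytes = total_gbps / 8   # convert Gb/s to GB/s
print(f"Aggregate bandwidth: {total_gbytes:.0f} GB/s")  # -> 160 GB/s

# For scale: one DDR3-1600 channel peaks at 12.8 GB/s
ddr3_channel_gbs = 12.8
print(f"vs one DDR3-1600 channel: {total_gbytes / ddr3_channel_gbs:.1f}x")
# -> 12.5x, roughly the order of the '15x' headline figure
```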

Mike Black, Micron’s chief technology strategist, said: “We took the logic portion of the DRAM functionality out of it and dropped that into the logic chip that sits at the base of that 3D stack. That logic process allows us to take advantage of higher performance transistors … to not only interact up through the DRAM on top of it, but in a high-performance, efficient manner across a channel to a host processor. So that logic layer serves both as the host interface connection as well as the memory controller for the DRAM sitting on top of it.”

Source: Computer World