Take a Look at the New Nvidia Pascal Architecture

With the reveal of the Tesla P100, Nvidia has taken the wraps off its new Pascal architecture. Pascal was originally set to debut last year, but delays with 16nm kept it from becoming a reality, leading to Maxwell on 28nm instead. Now that Pascal is finally here, we are getting an architecture that combines the gaming abilities of Maxwell with much-improved compute performance. The new Unified Memory and Compute Pre-Emption are the main highlights.

First off, Pascal changes the SM (Streaming Multiprocessor) configuration yet again. Kepler featured 192 CUDA cores per SM, Maxwell had 128 and Pascal will now have 64. Reducing the number of CUDA cores per SM allows finer-grained control over compute tasks and should ensure higher efficiency. Interestingly, 64 is also the number of cores GCN has in each CU, AMD’s equivalent of the SM. The TMU to CUDA core ratio remains the same as Maxwell, with 4 TMUs per SM instead of 8, in line with the drop in cores per SM.
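The TMU claim is easy to check from the per-SM figures quoted above. A minimal Python sketch follows; the dictionary and function names are purely illustrative, and Kepler is left out because its TMU count is not given here.

```python
from fractions import Fraction

# Per-SM figures as quoted in the text: (CUDA cores, TMUs).
SM_CONFIG = {
    "Maxwell": (128, 8),
    "Pascal": (64, 4),
}

def tmu_per_core(arch: str) -> Fraction:
    """TMUs per CUDA core for one SM of the given architecture."""
    cores, tmus = SM_CONFIG[arch]
    return Fraction(tmus, cores)

for arch in SM_CONFIG:
    # Both architectures work out to one TMU per 16 CUDA cores.
    print(arch, tmu_per_core(arch))
```

Halving both the cores and the TMUs per SM leaves the ratio untouched, which is all "remains the same as Maxwell" means.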

For compute, the gains mostly come from increasing the number of FP64 or Double Precision (DP) CUDA cores. DP is important for scientific and compute workloads, though games rarely make use of it. Kepler started cutting out some FP64 units and Maxwell went even further, with virtually no FP64 even in the Teslas. This was one reason why Maxwell cards were so efficient, and Nvidia only managed to hold onto its leadership in compute thanks to CUDA and its Single Precision (SP) performance.

With Pascal, the ratio of SP to DP units goes to 2:1, a far larger share of DP hardware than the 32:1 of Maxwell and 3:1 of Kepler. GP100 in particular has about 50% of its die space dedicated to FP32, about 25% to DP and the last 25% split between LD/ST units and SFUs. This suggests that Pascal won’t be changing much in terms of gaming performance. The only gains will come from a slight increase in efficiency due to the smaller SMs and from the die shrink to 16nmFF+. GeForce variants of Pascal may have their FP64 units trimmed to cram in more FP32 resources but, again, most of the gains will be due to increased density.

Lastly, Pascal brings forward unified memory to allow threads to better share information. This comes along with larger L2 caches and register files that are more than double the size. P100, the first Pascal chip, also uses HBM2, with 16GB of VRAM over a 4096-bit bus for a peak bandwidth of 720 GB/s. For CUDA compute tasks, a new Unified Memory model allows Pascal GPUs to utilize the entire system memory pool with global coherency. This is one way to counter AMD’s advances with HSA and GCN as well as Intel’s Xeon Phi.
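The bandwidth figure can be related to the bus width with some quick arithmetic. The sketch below assumes 1 GB/s means 10^9 bytes per second and ignores any protocol overhead; the function names are illustrative.

```python
def per_pin_rate_gbps(bandwidth_gb_s: float, bus_width_bits: int) -> float:
    """Data rate each bus pin must sustain, in gigabits per second."""
    return bandwidth_gb_s * 8 / bus_width_bits

def peak_bandwidth_gb_s(rate_gbps: float, bus_width_bits: int) -> float:
    """Inverse calculation: peak bandwidth in GB/s for a given per-pin rate."""
    return rate_gbps * bus_width_bits / 8

# Figures quoted for the P100: 720 GB/s over a 4096-bit bus.
rate = per_pin_rate_gbps(720, 4096)
print(rate)
print(peak_bandwidth_gb_s(rate, 4096))
```

On these numbers each pin only needs to run at about 1.4 Gbps, which shows how HBM2 gets its bandwidth from a very wide bus rather than from high per-pin speeds.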

Overall, Pascal looks to be an evolutionary update for Nvidia. Perhaps Nvidia has reached the point that Intel has, making only incremental progress. In other ways, though, the reduction in SM size has great potential and provides a more flexible framework for building GPUs. Now all we are waiting for is for the chips to finally drop.

Nvidia Details Tesla P100 Pascal GPU Specifications

After revealing its next flagship Tesla earlier, Nvidia has let loose with a few more details and specifications. Based on the new Pascal architecture, the P100 will be utilizing TSMC’s latest 16nmFF+ process. As we know from the keynote, the chip will feature 15.3 billion transistors and the latest HBM2 memory. The P100 also features what Nvidia is calling the “5 miracles”.

First off, the P100 will run at an impressive 1328 MHz base clock and 1480 MHz boost. This is high for a professional Tesla card, though well in line with GeForce clocks. The card won’t be using the full GP100 die with 60 SMs and 3840 CUDA cores; rather, it will use a cut-down version with 56 SMs and 3584 cores. This mirrors Kepler’s launch, where the cut-down Titan came before the Titan Black. In addition to the usual FP32 CUDA cores, there are also 1792 FP64 CUDA cores for Double Precision work. This gives a 2SP/1DP ratio, a larger share of DP than anything from Kepler or Maxwell. The P100 also has 224 TMUs and massive amounts of cache and register files.
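These unit counts all follow from the per-SM layout, and a rough peak throughput falls out of the clocks. The Python sketch below uses the 64 FP32 cores and 4 TMUs per SM quoted for Pascal and the 32 FP64 cores per SM implied by the 2:1 ratio. One labeled assumption: each core is counted as one fused multiply-add, or 2 FLOPs, per clock, which is the usual convention for peak figures. All names are illustrative.

```python
# Per-SM resources for Pascal as described in the text.
FP32_PER_SM = 64
FP64_PER_SM = 32  # implied by the 2:1 SP-to-DP ratio
TMUS_PER_SM = 4

def sm_totals(sm_count: int) -> dict:
    """Total unit counts for a chip with the given number of enabled SMs."""
    return {
        "fp32": sm_count * FP32_PER_SM,
        "fp64": sm_count * FP64_PER_SM,
        "tmus": sm_count * TMUS_PER_SM,
    }

def peak_tflops(cores: int, clock_mhz: float, flops_per_clock: int = 2) -> float:
    """Theoretical peak in TFLOPS.

    Assumption: every core retires one fused multiply-add (2 FLOPs) per clock.
    """
    return cores * flops_per_clock * clock_mhz / 1e6

p100 = sm_totals(56)        # cut-down part
full_gp100 = sm_totals(60)  # full die

print(p100)
print(full_gp100["fp32"])
print(round(peak_tflops(p100["fp32"], 1480), 2))  # FP32 at boost clock
print(round(peak_tflops(p100["fp64"], 1480), 2))  # FP64 at boost clock
```

With 56 SMs this reproduces the 3584 FP32 cores, 1792 FP64 cores and 224 TMUs, and puts the boost-clock peak at roughly 10.6 TFLOPS single precision and 5.3 TFLOPS double precision under the FMA assumption.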

Next, we have the massive 610 mm² die on 16nmFF+. About 50% of that is FP32 CUDA cores, 25% is FP64 and the rest goes to other parts. This means that, despite the massive die size, the P100 and GP100 derivatives won’t be great gamers, as games generally only use FP32 CUDA cores. There may be a GP100 variant, though, that swaps out the FP64 cores for FP32 ones. Even saddled with compute, GP100 will still beat the Titan X by a good margin. TDP is a relatively tame 300W, as expected from the use of 16nm and 16GB of HBM2.
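To put the die figures in perspective, the approximate area split and the average transistor density can be worked out from the numbers quoted in this article: a 610 mm² die, 15.3 billion transistors, and a roughly 50/25/25 split. The Python sketch below is back-of-the-envelope only, since the percentages are approximate; the names are illustrative.

```python
DIE_AREA_MM2 = 610.0
TRANSISTORS = 15.3e9

# Approximate share of die area per block, as stated in the text.
AREA_SHARE = {"FP32": 0.50, "FP64": 0.25, "other": 0.25}

def area_breakdown(die_area_mm2: float, shares: dict) -> dict:
    """Approximate area in mm² taken up by each block."""
    return {block: die_area_mm2 * share for block, share in shares.items()}

def density_millions_per_mm2(transistors: float, die_area_mm2: float) -> float:
    """Average transistor density in millions of transistors per mm²."""
    return transistors / die_area_mm2 / 1e6

print(area_breakdown(DIE_AREA_MM2, AREA_SHARE))
print(round(density_millions_per_mm2(TRANSISTORS, DIE_AREA_MM2), 1))
```

On these figures, about 150 mm² of the die is FP64 hardware that games would leave idle, which is why an FP32-heavy variant looks attractive for gaming.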

Finally, most marketing statements are hyperbole and the “5 miracles” are no exception. They are the Pascal Architecture, 16nm FinFET, CoWoS with HBM2, NVLink, and New AI Algorithms. Honestly, none of these are really that amazing on their own and all have been expected. Combining all of them in one go on such a massive chip is pretty amazing, though. While the P100 will be shipping soon, don’t expect many till Q1 2017.

NVIDIA Unveil Tesla P100 GPU

GTC 2016: As part of NVIDIA’s GPU Technology Conference, Jen-Hsun Huang unveiled the latest product in the Tesla family, the P100. Branded as the most advanced hyperscale datacentre GPU, it features 150 billion transistors and is based on the latest Pascal architecture.

Built on a 16nm FinFET process and featuring HBM2, this product is the latest in a whole host of new technologies from NVIDIA and should be the start of what we’re going to see across other NVIDIA products.

With AI and deep learning now at the forefront of NVIDIA’s thinking, the Tesla P100 GPU has been created to make AI and deep learning, among other tasks, as fast as physically possible.

Super-Charged P100D Tesla Plans Discovered in Firmware Update

Tesla enthusiast and self-proclaimed white hat hacker Jason Hughes has been doing some digging, taking apart the latest Tesla firmware update that was sent to his own Tesla P85D, and it looks like he’s uncovered a secret!

The Model S currently comes in three variants, but it seems Elon Musk has a fourth model in the pipeline, and it’s going to be powerful! Jason claims to have spotted files pointing towards a 100kWh car, which is quite a bump from the current 70 and 90kWh models available. With the available models being named the 70D, 90D and the P90D, it would be safe to assume that a new 100kWh model would be called the P100D.

The information was discovered in Firmware Update 7.1 and, while he could have released the information immediately, he decided to tease Elon a little bit first with a tweet as well as some forum posts. He followed up the next day with “You guys are great. Fun to get home and find people have cracked some SHA256 I posted. Nice work.”

I don’t have to tell you this whole thing needs a pinch or a fistful of salt, but it’s certainly believable that some information about a future car would be held within the firmware. Tesla would likely use their cars’ firmware to test future models, and we’ve seen similar leaks in the past for graphics cards and other bits of tech.

“There are quite a few things that are in the firmware that I’m not prepared to share publicly. Just like the P100D has been in there for months with my lips mostly sealed,” said Hughes.

However, Tesla rolled back an update, perhaps to remove the files or to tighten up the encryption keys on the files that were unlocked, and potentially on other parts of the system that used the same keys. Who knows? Either way, Elon seems to be playing it cool right now, so we’ll have to wait and see.

 

Antec P100 Mid-Tower Chassis Review

Introduction


New Antec chassis have been a little bit of a rarity in the last couple of years, so I’m very happy today to see their P100 in my office and ready for testing. Antec was once my number one choice for chassis products when building my own systems, and their Three Hundred is still one of the best low-budget chassis products on the market. The P100 is aimed at the other end of the market, designed with high performance systems in mind and as such it’s a little more expensive at around £70 from most major retailers.

There are lots of great chassis on the market within this price range, so the P100 is going up against some tough competition from all major brands such as Cooler Master, Corsair and NZXT, who all have premium products of similar specifications and price. Of course each brand usually offers something unique, be that the design or a few added bonuses that set them apart. One of the main selling points for Antec has long been their build quality, often favouring thick cut steel panels that are likely to last many years.

The P100’s understated design is packed full of features, including sound dampening materials, plentiful dust filters, water cooling support, room for multiple expansion cards (e.g. triple-SLI graphics cards), cable management, USB 3.0 and loads of tool-free storage bays, so it certainly sounds like a prime pick for a high-end gaming rig.

The packaging is nicely designed, with a large image of the chassis on each side of the box, as well as a “quiet” label in the top left, indicating that this chassis has been lined with sound dampening materials.

Around the back is a more detailed run down of the features of the chassis, but of course we’ll be getting in for a closer look in a moment.

In the box you’ll find a bunch of cable ties to help with cable management, a collection of high quality screws to install all major components, a quick setup guide and two 3-pin to Molex fan adaptor cables.

Computex: Antec Unveil P100 ATX Chassis

Antec are out in force at Computex and were keen to show off their latest chassis design, power supplies and cooling solutions. Antec may have wowed the crowds with the Nineteen Hundred chassis but the new P100 isn’t without its own charms either.

“The P100 is an ATX case that epitomizes cool, quiet, and sophistication.” – Antec

It features Antec’s award-winning Performance One series design and Quiet Computing technologies that minimize system noise. In addition to 7 expansion slots, the P100 also supports 3 tool-less 5.25” drives and 4 tool-less 3.5”/2.5” drives.

With a great price to feature ratio the P100 is the only economical case that delivers silence. A balance of price and feature is achieved by integrating Antec design with superior build quality. Engineered with Quiet Computing Technology, the P100 is truly in a class all its own.

Image(s) courtesy of Antec