NVIDIA Strikes Back, The Kepler Architecture And The New King Of Über Video Cards…

After many weeks of NVIDIA preparing the world for its brand new graphics processor, and with the wheels of the marketing machine running at full speed, NVIDIA’s new GPU, the GK104, has been introduced to the world. With the GK104 as the basis of the new GTX 680, the war of the Über video card has swung back in NVIDIA’s favour.

NVIDIA’s new design emphasizes power efficiency as well as performance, whereas the previous generation chips went for performance at any cost. Strangely though, the previous generation was beaten by AMD, while this new, more power efficient design ends up being up to 3 times faster – per watt of power – than the previous generation chips and soundly beats AMD’s latest, the HD 7970.

In one fell swoop NVIDIA has stolen the crown back, reaching over and taking it right off the king’s brow. While the GTX 680 video card slaps AMD back into second place, NVIDIA isn’t done yet; they may have an ace up their sleeve. The GTX 680 is cheaper, faster and lower power than AMD’s best, the 7970. The King is dead. Long live the new King.

The Kepler Story

Kepler silicon.

While many of us have heard the name Kepler bandied about recently, the first publicly available product putting all of this design work to use is the hot off the presses GTX 680 built using the brand new GK104 GPU and released to the public this week.

Kepler represents a change in direction for NVIDIA. Taking a new approach to designing their graphics chips takes kahunas; it’s risky business. The risk seems to have paid off for NVIDIA this time, producing a chip that is faster and more efficient in both power use and chip size.

Internally NVIDIA’s design team also saw many changes in the development process, with all design teams working much more closely together and having equal input into design decisions.

The design efficiency gains have reduced the power requirements of the card compared with previous generations. A 550W power supply is recommended if you intend to install the GTX 680 in your PC. The 680 has a peak power requirement of 195 Watts, with an idle usage of 19 Watts. AMD’s new generation HD 7970 uses a lot more, 342 Watts, when running at full speed.

The card itself is about an inch shorter than its direct competitor the 7970, and the previous generation GTX 580, hopefully allowing it to fit into more cases.

The Chip Technicalities
Getting the numbers out of the way, internally the GK104 has four GPCs – graphics processing clusters – with a total of eight SMXs, 1536 CUDA cores, eight geometry units, four raster units, 128 texture units, and 32 ROP units. All of this adds up to an immense amount of number crunching power contained in the 3.54 billion transistors that are required to produce this pixel pusher.
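For anyone who likes to see how those totals break down, here is a quick back of the envelope sketch in Python. It only divides the unit counts quoted above; the variable names are made up for illustration, and the even split of units across SMXs and GPCs is an assumption for the sake of the sums.

```python
# Unit counts for the GK104 as quoted in the article.
gpcs = 4
smxs = 8
cuda_cores = 1536
texture_units = 128

# Assumption: the units are spread evenly across the chip.
smx_per_gpc = smxs // gpcs
cores_per_smx = cuda_cores // smxs
texture_units_per_smx = texture_units // smxs

print(f"SMXs per GPC: {smx_per_gpc}")
print(f"CUDA cores per SMX: {cores_per_smx}")
print(f"Texture units per SMX: {texture_units_per_smx}")
```

On those numbers each SMX carries 192 CUDA cores and 16 texture units, with two SMXs in each GPC.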

The base clock is 1006MHz, but it can dynamically boost to 1058MHz and beyond for more performance. The card itself makes use of the new PCI Express 3.0 video card interface, doubling the transfer speeds between video card and system memory to 16GB/sec.
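Where does the 16GB/sec figure come from? A rough sketch of the sums follows, assuming a full x16 slot and the published per lane signalling rates and line encodings for PCI Express 2.0 and 3.0 (5 GT/s with 8b/10b, and 8 GT/s with 128b/130b). Those figures come from the PCI Express specifications rather than from NVIDIA, and the function name is just for illustration.

```python
def pcie_gb_per_s(rate_gt_s, payload_bits, encoded_bits, lanes=16):
    """Approximate one-way PCI Express bandwidth in GB/s."""
    # Usable gigabits per second per lane after line encoding overhead.
    per_lane_gbit = rate_gt_s * payload_bits / encoded_bits
    # Sum across all lanes and convert bits to bytes.
    return per_lane_gbit * lanes / 8

gen2 = pcie_gb_per_s(5.0, 8, 10)      # PCI Express 2.0, 8b/10b encoding
gen3 = pcie_gb_per_s(8.0, 128, 130)   # PCI Express 3.0, 128b/130b encoding

print(f"PCIe 2.0 x16: {gen2:.2f} GB/s")
print(f"PCIe 3.0 x16: {gen3:.2f} GB/s")
print(f"Speed-up: {gen3 / gen2:.2f}x")
```

That works out to 8GB/sec for the older interface and a shade under 16GB/sec for the new one, in each direction.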

The 2GB of GDDR5 runs at 1500MHz – 6.0GHz effective – and operates on a 256 bit memory bus. While the memory controller may seem to have been cut back compared with previous generations, and with AMD’s current 384 bit bus for that matter, the new memory controller is far more efficient and manages to keep everything moving along just fine.
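As a rough check on those memory figures, peak theoretical bandwidth is simply the effective data rate multiplied by the width of the bus in bytes. The sketch below assumes the 6.0GHz effective rate and 256 bit bus quoted above; the function name is only illustrative, and real world throughput will be lower than this ceiling.

```python
def peak_bandwidth_gb_per_s(effective_rate_ghz, bus_width_bits):
    """Peak theoretical memory bandwidth in GB/s."""
    bytes_per_transfer = bus_width_bits / 8
    return effective_rate_ghz * bytes_per_transfer

gtx680_bandwidth = peak_bandwidth_gb_per_s(6.0, 256)
print(f"GTX 680 peak memory bandwidth: {gtx680_bandwidth:.0f} GB/s")
```

For the GTX 680 that comes to 192GB/sec.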

GTX 680 rear panel

Logic Strategy
One of the more interesting design decisions made by the development team early on was to move much of the control logic of their graphics chip into software.

All modern chips have large amounts of logic dedicated to scheduling the workload. This ensures the GPU’s resources are utilized efficiently. NVIDIA took the radical approach of shifting much of this scheduling work to the graphics driver, in effect getting your computer’s CPU to organise the GPU’s workload.

This is a radical but logical approach. Most modern CPUs have plenty of spare power, and it is the kind of work they do best, far more efficiently than a GPU. Not only did this reduce the chip’s complexity, it also allows updates to the video card driver to improve the scheduling in the future, something that’s not possible if the scheduling is built into silicon that can’t be changed.

GPU Boost
Much like the turbo modes on modern CPUs, GPU Boost technology is able to automatically overclock the GPU, taking heat and power usage into account. In turbo mode up to an extra 5% performance can be gained. GPU Boost does make it far more difficult for hardware enthusiasts to manually overclock the card themselves, but it pays dividends for everyone else.
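To put a number on that boost, the sketch below works out the uplift from the 1006MHz base clock to the 1058MHz boost clock quoted earlier. It assumes performance scales directly with clock speed, which is a simplification, and the names are purely illustrative.

```python
BASE_CLOCK_MHZ = 1006
BOOST_CLOCK_MHZ = 1058

def clock_uplift_percent(base_mhz, boost_mhz):
    """Percentage increase going from the base clock to the boost clock."""
    return (boost_mhz - base_mhz) / base_mhz * 100

uplift = clock_uplift_percent(BASE_CLOCK_MHZ, BOOST_CLOCK_MHZ)
print(f"Boost clock uplift: {uplift:.1f}%")
```

The result is a little over 5%, which lines up with the gain quoted for turbo mode.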

TXAA, next generation Anti-Aliasing
TXAA and Anti-Aliasing in general are techniques for smoothing out the jagged edges in games. TXAA is an evolution of these processes, allowing for 8 times oversampling with the same grunt that was required to do 4 times oversampling in the past.

NVENC – NVIDIA’s Encoding Engine
NVIDIA’s new dedicated encoding engine brings video processing power to the GTX 680. Able to encode 1080p video at 4 to 8 times real-time, NVENC is a V8 video encoding engine. Not only will it improve times to convert – transcode – your favourite content, it will also improve performance in video editing and video conferencing, with the added benefit of taking the workload off the CPU and keeping your PC running smooth as silk during these demanding operations. So far the only software to support this function is CyberLink’s MediaEspresso Beta.

Adaptive Vsync
Adaptive Vsync provides a better balance between maintaining frame rates and avoiding screen tearing. At its core the issue revolves around the fact that your monitor is a little OCD about time, while the video card does its job on an as-long-as-it-takes basis. Your display wants 60 frames a second sent to it at nice even intervals, while a video card can take varying amounts of time to generate each frame. In games, if you watch your FPS counter it will vary depending on the scene being played. Like the odd couple, the display and video card have to learn to get along.

Ordinary VSync is an attempt to compromise and keep things running smoothly, ensuring that the next frame is output only when your screen has finished putting up the current frame. The down side is that if the video card takes longer than one refresh interval to build a frame, the finished frame has to wait for the following refresh, so the previous frame stays on screen for two intervals. That causes stuttering, dropping you straight to 30fps instead of 60fps. Without VSync though, if your video card is only half finished building the frame it will be output half built, a phenomenon called screen tearing. Not pretty. Adaptive VSync attempts to maintain the best of both worlds while avoiding both issues.

Adaptive VSync keeps the sync turned on, but if a frame is held up VSync turns off momentarily and the GPU ensures the full frame is output asap. Hopefully this maintains the frame rate, avoids stuttering and keeps the screen tearing manageable. Win win.
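The difference between the two modes is easier to see with some numbers. The toy model below is a minimal sketch, not NVIDIA’s actual driver logic: it assumes a 60Hz display, a constant render time per frame, and simple double buffering, and the function names are invented for illustration.

```python
import math

REFRESH_HZ = 60
REFRESH_MS = 1000 / REFRESH_HZ  # roughly 16.7ms per refresh

def vsync_fps(render_ms):
    """Frame rate with ordinary VSync: frames wait for the next refresh."""
    # A frame that misses a refresh is held until the following one.
    intervals = max(1, math.ceil(render_ms / REFRESH_MS))
    return REFRESH_HZ / intervals

def adaptive_vsync_fps(render_ms):
    """Frame rate with the adaptive behaviour described above."""
    if render_ms <= REFRESH_MS:
        # GPU keeps up with the display: sync stays on, capped at 60fps.
        return float(REFRESH_HZ)
    # GPU falls behind: sync is dropped and frames go out when ready.
    return 1000 / render_ms

for render_ms in (10, 20, 25):
    print(f"{render_ms}ms per frame: "
          f"VSync {vsync_fps(render_ms):.0f}fps, "
          f"adaptive {adaptive_vsync_fps(render_ms):.0f}fps")
```

In this model a frame that takes 20ms to render runs at 30fps with ordinary VSync but 50fps with the adaptive approach, at the cost of possible tearing while the sync is off.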


NVIDIA GEFORCE ‘Kepler’ GTX 680 – Official Introduction

Chip package.

Performance
Kepler is quick, very quick. Reviewers have unanimously agreed the GTX 680 is the fastest single chip video card on the market. Frame rates in all of the latest DirectX 11 games are excellent, with up to a 25% performance lead over AMD’s HD 7970. While NVIDIA is claiming a threefold performance increase over the previous generation GTX 580, this is on a performance per watt basis. The actual performance increase, ignoring the power efficiency improvements, is closer to twice the performance. Still no small improvement by any account.

NVIDIA’s Secret Ace Up Its Sleeve
Many aspects of the GK104 point to this chip being the mid-range chip of the family. The size and model number have both raised questions. The fact that Kepler is already being used by many mid-range laptops, as opposed to high end gaming laptops, also suggests NVIDIA has more in store for AMD.

NVIDIA’s previous flagship video card, the GTX 580, uses the GF110 GPU, which occupies 520mm². The new chip, the GK104, occupies only 294mm² of silicon, a little over half the size of the previous high end GPU. Also, the model number GK104 is obviously a step behind the GF110.

Until NVIDIA releases the entire range this is speculation, but analysis does indeed suggest the GK104 will end up being a mid ranger. There is also the replacement for the GTX 590 to come; the Dual GPU video cards that sit at the very top of the Über video card heap are due a refresh on both sides. AMD’s new Dual GPU Über card, the 7990, is said to be due for release shortly, in March 2012. Expect NVIDIA to have an answer for that assault ready in the wings.

While it is often customary to release the fastest chip first, NVIDIA’s decision to start in the mid-range could also have been influenced by the difficulties its chip manufacturer TSMC is having with the new manufacturing process, the 28nm node. With a smaller chip design they are able to produce more working chips per wafer. AMD and NVIDIA are both suffering through low yields using TSMC’s new process. NVIDIA chose to get more chips by making them smaller, while AMD did very little besides pass on the cost.
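To get a feel for how much die size matters, the sketch below uses a common textbook approximation for the number of die candidates that fit on a round wafer. It assumes a 300mm wafer and ignores defects, scribe lines and edge exclusion, so treat it as a rough estimate rather than anything TSMC or NVIDIA has published; the function name is hypothetical. The two die areas are the 294mm² and 520mm² figures quoted above.

```python
import math

WAFER_DIAMETER_MM = 300  # assumption: standard 300mm wafer

def gross_dies_per_wafer(die_area_mm2, wafer_diameter_mm=WAFER_DIAMETER_MM):
    """Approximate die candidates per wafer, before any yield losses."""
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    # Correction for partial dies lost around the curved wafer edge.
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)

gk104_dies = gross_dies_per_wafer(294)
gf110_sized_dies = gross_dies_per_wafer(520)

print(f"294mm² die: about {gk104_dies} candidates per wafer")
print(f"520mm² die: about {gf110_sized_dies} candidates per wafer")
```

Under those assumptions the smaller die gives a little under twice as many candidates per wafer, and because a smaller die is also less likely to be hit by a defect, the advantage in working chips would be larger still.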

GTX 680 next to HD7970

Products already using Kepler
The GTX 680 is the first video card to use the new Kepler architecture, but it is by no means the only product using Kepler. The mobile variant of Kepler is already appearing in many mid-range laptops and Ultrabooks. This is a fast video card for the desktop, so it should be a screamer in laptops. This bodes very well for the power efficiency gains made by NVIDIA.

Conclusion
The fast paced evolution of technology can also lead to an even faster change in dominance. AMD has held the Über video card crown for a couple of generations; though the lead may at times have been slender, they were still the holders of the crown. In the blink of an eye that has now changed. NVIDIA looks to have the top spot back, while still having an ace up their sleeve.

The RRP for the GTX 680 is $499 USD. Don’t expect discounts anytime soon; the card may actually go up in price initially as demand outstrips availability. At $50 less than the AMD HD 7970, the GTX 680 already represents reasonable value for a performance card of this type.

With an even faster chip waiting in the wings, along with the fact that the GTX 680 already beats AMD’s top of the line card by between 5% and 25%, AMD should be more than a little concerned. The battle to be the King of Über video cards is certainly heating up. Can AMD produce an answer to the GTX 680, or will they be forced to cut the price premium they have enjoyed? Stay tuned, things are only just starting to warm up.

References: The Tech Report
References: Hot Hardware
References: TechSpot