Monday, 19 June 2017

The End of Quad-Core Dominance

Quad-core CPUs on desktops have been the dominant PC configuration for a long time. Long enough that my old early-2011 system is finally reaching the point where the motherboard is probably dying and the CPU cannot be overclocked any higher to fix poorly optimised shipping games. In fact, the crashes and beeps from the motherboard are quite insistent that that overclock is now beyond the system.


However, console-followers will have noted that octo-cores are now the hot thing. This isn't hyperthreading (hardware schedulers that can shuffle two threads onto the execution units inside a single core without evicting either) but eight genuine Jaguar cores running at around 1.6GHz in both the main consoles. The caveat is that a Jaguar core has about half as many execution units (count the Int and FP ALUs that can be scheduled, compared with Ryzen) in which to do the actual maths the code requires, and it is clocked at about half the frequency of modern desktop processors. Even the decode and dispatch front-end can only chew through half as much to feed the core when compared to the Ryzen's design - everything is relatively balanced. Effectively, there are eight cores, but at maximum throughput they can only do about as much work as two cores on a high-end desktop CPU. This requires game engines to be optimised to work well with low single-threaded performance (apparently unless you're porting Forza Horizon 3 to Win10/UWP!) when tuned for each console (where there is far less overhead from the OS/other tasks running).
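The back-of-envelope arithmetic behind that "eight cores, two cores' worth of work" comparison can be sketched as below. The 0.5 factors are just the rough "about half" figures from the paragraph above, not measured values, and the function name is purely illustrative.

```python
def desktop_core_equivalents(cores, relative_width, relative_clock):
    """Rough peak throughput expressed in 'high-end desktop cores'.

    relative_width: execution resources per core vs. the desktop core (1.0 = same)
    relative_clock: clock speed vs. the desktop core (1.0 = same)
    """
    return cores * relative_width * relative_clock


# Assumed ratios: "about half" the execution units and "about half" the clock.
jaguar = desktop_core_equivalents(cores=8, relative_width=0.5, relative_clock=0.5)
print(f"8 Jaguar cores ~ {jaguar:.1f} desktop cores at peak throughput")
```

This only models peak throughput; real games rarely keep every execution unit busy, so it is a ceiling rather than a prediction.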

My old Sandy Bridge's cores actually sit somewhere in the middle of Jaguar and Ryzen in terms of execution units. That's one of the reasons why a new CPU may not clock any higher (especially at the limits of overclocking) than my processor but can still do significantly more work. Each core is bigger and can do more each cycle. But, eventually, a core is simply not something you can just keep making wider without leaving resources underutilised. This is one reason why hyperthreading becomes a really good move, because juggling two threads on each core increases the chances of being able to dispatch work to each execution unit. The big rumour (basically all but confirmed) is that by this time next year even Intel will have moved to six cores in their upper-end mainstream processors. If you're buying new hardware today (which is where I am) then you must consider this push to increasingly threaded work, the benefit of thread scheduling for wide cores, and the expected future where four cores is something you find on laptops and lower end desktops.

The i7-7700K may offer the fastest single core, but it appears that Intel's new High-End DeskTop platform (with beta motherboard firmware) is offering many cores without holding back single-threaded performance. With enough money, you can now buy six, eight, or ten cores (up to 20 threads with hyperthreading) with that supreme Intel single-threaded performance. Competition will only increase when AMD's Threadripper (four partially disabled Ryzen dies on a single socket) appears in August. What do these HEDT platforms offer that the current Ryzen (octo-core with those cores we already described as wide) doesn't? Twice as much RAM bandwidth from extra memory controllers and more dedicated PCI-Express 3.0 lanes (rather than lanes bottlenecked off the motherboard controller) to connect graphics cards and other high-speed devices. That becomes more of a concern for a future-looking platform as M.2 SSDs already push to use 4-lanes of bandwidth each. The short load times on PC continue to look like they'll go down, even without new SSD memory types.
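The "twice as much RAM bandwidth" figure follows from the channel count: the mainstream platforms run two DDR4 channels while the HEDT platforms run four. A minimal sketch, assuming DDR4-2666 on both (the speed grade is my assumption for illustration; supported speeds vary by CPU and motherboard) and the standard 64-bit channel width:

```python
BYTES_PER_TRANSFER = 8  # one DDR4 channel is 64 bits (8 bytes) wide


def peak_bandwidth_gb_s(channels, mega_transfers_per_s):
    """Theoretical peak memory bandwidth in GB/s (decimal gigabytes)."""
    return channels * mega_transfers_per_s * BYTES_PER_TRANSFER / 1000


ASSUMED_SPEED = 2666  # MT/s, illustrative only

dual = peak_bandwidth_gb_s(2, ASSUMED_SPEED)  # mainstream: two channels
quad = peak_bandwidth_gb_s(4, ASSUMED_SPEED)  # HEDT: four channels
print(f"dual channel: {dual:.1f} GB/s, quad channel: {quad:.1f} GB/s")
print(f"ratio: {quad / dual:.1f}x")
```

These are theoretical peaks; whether a workload sees any of the difference depends on how much of its time is spent waiting on memory.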


| CPU | Launch | Cores/Threads | CBr15 ST | CBr15 MT | CPU+mobo |
|---|---|---|---|---|---|
| Threadripper 1950X | August 2017 | 16/32 | 170 | 3000 | $1,200 |
| Threadripper 1920X | August 2017 | 12/24 | 160 | 2400 | $1,000 |
| i9 7900X | June 2017 | 10/20 | 195 | 2200 | $1,200 |
| Threadripper 1910X??? | Late 2017? | 10/20? | 165? | 1950? | $850? |
| i7 6950X | 2016 | 10/20 | 165 | 1850 | $2,000 |
| i7 7820X | June 2017 | 8/16 | 195 | 1800 | $800 |
| Ryzen7 1800X | 2017 | 8/16 | 160 | 1650 | $575 |
| Ryzen7 1700X | 2017 | 8/16 | 155 | 1550 | $500 |
| i7 6900K | 2016 | 8/16 | 155 | 1500 | $1,200 |
| i7 8700K? | Sept 2017? | 6/12 | 195? | 1400? | $500? |
| i7 7800X | June 2017 | 6/12 | 185 | 1350 | $600 |
| Ryzen5 1600X | 2017 | 6/12 | 160 | 1150 | $450 |
| i7 7700K | 2016 | 4/8 | 190 | 950 | $475 |
| Ryzen5 1500X | 2017 | 4/8 | 155 | 800 | $350 |
| i5 7600K | 2016 | 4/4 | 170 | 650 | $375 |

If we assume that RAM will cost what it costs (4x8GB sticks is not significantly different to the price of 2x16GB sticks, everything uses DDR4), the platform differences will come down to CPU costs and motherboard costs. The HEDT platforms are both going to lack value motherboard offerings, which inflates the platform cost beyond simply buying a premium CPU. But those motherboards will also provide more connectivity, making use of the extra PCI-Express lanes. The full picture will only emerge in August when Threadripper launches but we can already look at some initial data. I've done a few guesstimates for where we've yet to see initial results and AMD's HEDT is definitely the far more speculative section as we don't even have pricing, let alone beta performance numbers.

Edit: shortly after writing this the main reviews (taken after the weekend BIOS updates) landed so those speculated scores for Intel HEDT have been replaced with solid data - the estimates were basically on the money except the 7820X is actually slightly stronger in single-threaded tests than expected.

Edit 2: By late July, it had become clear that Intel was likely going to react with a new six-core desktop i7 earlier than 2018, and that the Threadripper models on offer at launch were not the full range speculated upon earlier. Rather than being two Ryzen dies on a chip, they are the parts that failed EPYC server testing, so they have half the cores disabled, and there may be no cheap low-end variant (1910X). The table has been updated again (with finalised 1920X/1950X data confirmed in August to be as expected, and no 1910X on the horizon).

All Threadripper models will have 60 PCI-Express 3.0 lanes, giving effectively unlimited bandwidth for anything that will fit on a motherboard. The top of Intel's offerings are also not going to worry anyone who isn't buying several GPUs (44 lanes on the 7900X, 40 on the 6950X & 6900K). Where Intel start to differentiate their offerings is the 7820X & 7800X, which only have 28 lanes, not even enough to fully feed two 16x GPUs, although currently GPUs rarely actually use the full bandwidth offered. The Ryzen and quad-core Intel mainstream CPUs all have 16 lanes for the GPU connection but then mainly rely on their chipset to provide anything else. Ryzen does have four extra lanes that can be dedicated to an M.2 SSD as well as the chipset connection, while the Intels generally shuffle far more lanes off the chipset than X370 motherboards - but you can't use them all at the same time as they'll just bottleneck. The issue is when motherboards mask lanes, for example where you have several 16x slots but using them will start to cut bandwidth or disable other connections like M.2 ports. It's not an immediate concern as everything should be able to drive a high end GPU and SSD for now, but expandability may be more limited than the selection of ports (several 16x slots, multiple M.2 ports) on the motherboard implies - the second M.2 port may well be a 2x PCI-Express 2.0 connection, so a quarter of the bandwidth (2.0 is half the speed of 3.0) of a full M.2 port.
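To put rough numbers on the lane arithmetic above, here is a small sketch. The per-lane rates come from the published PCI-Express signalling rates and encoding overheads (5 GT/s with 8b/10b for 2.0, 8 GT/s with 128b/130b for 3.0); the helper names are just illustrative, and real devices will see somewhat less than these link-level figures.

```python
# Approximate usable bandwidth per lane in MB/s after encoding overhead.
MB_PER_LANE = {
    "2.0": 5000 * 8 / 10 / 8,      # 5 GT/s, 8b/10b encoding -> 500 MB/s
    "3.0": 8000 * 128 / 130 / 8,   # 8 GT/s, 128b/130b encoding -> ~985 MB/s
}


def link_bandwidth_mb_s(generation, lanes):
    """Link-level bandwidth for a given PCIe generation and lane count."""
    return MB_PER_LANE[generation] * lanes


full_m2 = link_bandwidth_mb_s("3.0", 4)      # a "full" M.2 port
cut_down_m2 = link_bandwidth_mb_s("2.0", 2)  # the cut-down second port
share = cut_down_m2 / full_m2
print(f"x4 PCIe 3.0: {full_m2:.0f} MB/s")
print(f"x2 PCIe 2.0: {cut_down_m2:.0f} MB/s ({share:.0%} of the full port)")

# Lane budget: two GPUs at full width want 2 x 16 = 32 CPU lanes.
CPU_LANES = {"7900X": 44, "7820X / 7800X": 28}
for cpu, lanes in CPU_LANES.items():
    print(f"{cpu}: two full x16 slots possible? {lanes >= 2 * 16}")
```

The "quarter of the bandwidth" figure falls straight out: half the lanes at roughly half the per-lane rate.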

We can certainly see where a future hexa-core mainstream i7 may offer extremely good value next year, with both single-threaded performance and enough cores to compete with the brand new 7800X, even if the RAM bandwidth will be reduced - potentially starving cores with workloads that are mainly about fetching data. It is clear that for threaded tasks the Ryzen 1700X already offers even more performance at a similar price thanks to eight cores, and Threadripper should offer a lot more. However, if we look at single-threaded performance then the void becomes apparent and that is what leads to some issues. CineBench 15 isn't the perfect test but it's illustrative of the gap, one that Threadripper is unlikely to dent. The 7820X retains most of the value of its 10-core cousin that costs $400 more and offers performance in every use case for an expensive but attainable price (no worse than a premium laptop). Of course, all of this changes if Threadripper has some secret sauce to provide single-threaded results beyond that of Ryzen. In less than two months we should have all of the data. The 7820X offers twice the performance (in tests that can spread the work) for less than twice the price of Intel's mainstream i7 option and without sacrificing any single-threaded performance or overclocking ability. For those who don't require the maximum single-threaded performance (especially overclocked), the current Ryzens already offer a significantly more attractive package at a similar price to the quad-core Intels.
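The value comparisons above can be checked directly against the table figures. This is only a quick sketch using a handful of rows copied from the table earlier in this post (CineBench 15 scores and CPU plus motherboard prices); the "points per dollar" metric is my own crude yardstick, not a standard benchmark.

```python
# name: (CineBench 15 single-thread, multi-thread, CPU + motherboard price in USD)
PARTS = {
    "i9 7900X": (195, 2200, 1200),
    "i7 7820X": (195, 1800, 800),
    "Ryzen7 1700X": (155, 1550, 500),
    "i7 7700K": (190, 950, 475),
}
ST, MT, PRICE = 0, 1, 2


def ratio(a, b, field):
    """How many times larger part a's figure is than part b's."""
    return PARTS[a][field] / PARTS[b][field]


perf = ratio("i7 7820X", "i7 7700K", MT)
cost = ratio("i7 7820X", "i7 7700K", PRICE)
print(f"7820X vs 7700K: {perf:.2f}x the multi-threaded score for {cost:.2f}x the price")

# Rank by multi-threaded points per dollar of platform cost.
ranked = sorted(PARTS, key=lambda n: PARTS[n][MT] / PARTS[n][PRICE], reverse=True)
for name in ranked:
    st, mt, price = PARTS[name]
    print(f"{name:13s} {mt / price:.2f} MT points per dollar, ST score {st}")
```

On these numbers "twice the performance" is really a little under 1.9x, for about 1.7x the price, and the 1700X leads on multi-threaded points per dollar while giving up single-threaded score.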

Last year's $1200 Intel HEDT offering is certainly looking like a very bad choice while the $2000 premium combination looks to be made completely redundant with Threadripper. Hopefully by speculating about where the mainstream goes next year, we can avoid bad choices if we need to buy a new system this year.