Advanced Micro Devices has released a presentation for investors that puts the launch of the new Steamroller micro-architecture in 2013. While the company did not reveal which chips will feature Steamroller high-performance x86 cores this year, it implied that the new generation of server-class Opteron chips will be based on Steamroller, the micro-architecture on which AMD is pinning high hopes.
A Curious Case of Steamroller Plans
The Sunnyvale, California-based company published a slide called “AMD Opteron Technology: Delivering multiple generations of greater functionality and improved performance” in its Q1 2013 investor presentation, which clearly puts the release of Steamroller in 2013. In the previous version of AMD's enterprise roadmap, the company clearly stated that its Opteron “Abu Dhabi”, powered by Piledriver high-performance x86 cores, would be its focus for 2013 and 2014. The next generation of Opteron processors was planned for introduction in the second half of 2014.
It appears that AMD has either dramatically changed its Opteron plans and will launch the new chip this year, or simply wants to show that it is on track with the micro-architecture, which will power different chips. A launch this year would suggest either compatibility with current sockets or a new platform for server microprocessors with support for PCI Express 3.0 and improved functionality.
Earlier this year, AMD updated its roadmap for client-class personal computers. It reiterated plans to launch its code-named Kaveri, a new-generation accelerated processing unit that combines Steamroller x86 general-purpose cores with a Radeon graphics engine based on the GCN architecture.
AMD Steamroller: Expectations
AMD revealed a lot of details about Steamroller at the Hot Chips conference in August 2012. Just as with the Bulldozer architecture, Steamroller x86 cores, which will power AMD's future high-performance Opteron and FX chips, will be located inside dual-core modules. Processors based on it should therefore be similar in design to Orochi and Viperfish, with some minor exceptions (a new memory controller, different internal buses, additional tweaks, etc.) that will not be truly important for x86 performance. The main improvements will be independent instruction decoders for each core within a module, better schedulers, larger and smarter caches, more register resources and some other enhancements.
One of the reasons why dual-core Bulldozer modules [the same may be said about Piledriver] are not completely efficient is that they have only one instruction decoder for two ALUs and one FPU. With Steamroller, AMD not only incorporated two decoders per module, but also increased the instruction cache size (to lower i-cache misses by 30%), enhanced instruction pre-fetch (the number of mis-predicted branches is down by 20% compared to Bulldozer) and improved max-width dispatches per thread by 25%. AMD believes that Steamroller will provide a 30% improvement in ops per cycle.
AMD also advanced single-core execution by implementing 5%-10% more efficient scheduling, incorporating higher-capacity register files and performing some other tweaks. It should be noted that while the integer pipes of Steamroller will not be too different from existing ones, the floating point pipe will be somewhat redesigned. In general, AMD promises that both integer and floating point per-core performance of Steamroller will be higher than they are today with the Bulldozer micro-architecture.
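To make the quoted percentages concrete, the short Python sketch below applies them to a made-up baseline. The baseline counts of 1,000 and the names `BASELINE`, `CLAIMED_CHANGE_PERCENT` and `apply_change` are assumptions chosen purely for illustration; only the percentages come from AMD's claims relative to Bulldozer, and each one is treated independently rather than combined into a performance estimate.

```python
# Hypothetical baseline: 1,000 units of each metric on Bulldozer.
# These counts are illustrative assumptions, not measured data.
BASELINE = {
    "icache_misses": 1000,
    "mispredicted_branches": 1000,
    "max_width_dispatches_per_thread": 1000,
    "ops_per_cycle": 1000,
}

# Percentage changes AMD quoted for Steamroller versus Bulldozer.
CLAIMED_CHANGE_PERCENT = {
    "icache_misses": -30,
    "mispredicted_branches": -20,
    "max_width_dispatches_per_thread": 25,
    "ops_per_cycle": 30,
}


def apply_change(baseline, percent_change):
    """Scale a baseline value by a signed percentage change."""
    return baseline * (100 + percent_change) / 100


# Apply each claimed change to its own baseline, independently of the others.
projected = {
    name: apply_change(BASELINE[name], CLAIMED_CHANGE_PERCENT[name])
    for name in BASELINE
}

for name, value in projected.items():
    print(f"{name}: {BASELINE[name]} -> {value:.0f}")
```

Under these assumed baselines, a 30% reduction turns 1,000 instruction cache misses into 700, while a 30% gain in ops per cycle turns 1,000 into 1,300. The figures only restate AMD's percentages and say nothing about how the individual improvements interact in a real workload.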
One of the interesting features of AMD Steamroller will be its ability to disable unused parts of the L2 cache. Since not all applications are cache-bound, this may reduce power consumption and/or allow AMD to boost the clock speeds of its microprocessors dynamically.
It is noteworthy that AMD decided to talk about its Steamroller micro-architecture, which will be used in microprocessors made using 28nm process technology, approximately a year or more ahead of their roll-out.