AMD's Barcelona Launch
09-13-2007
After much anticipation and a difficult delivery, Barcelona, AMD’s next-generation microarchitecture, has arrived – and not a moment too soon. For a while, Intel has been the performance leader across most market segments, with the exception of HPC and the larger MP servers. Particularly for servers, AMD was in the difficult situation of offering dual-core processors against Intel’s quad-core parts. Since most server workloads are multi-threaded, this translated to a substantial performance advantage for Intel, which enabled Intel to start a punishing price war and regain substantial market share.
Barcelona hosts a wide variety of improvements, which have been discussed extensively in prior articles. The most notable change is that Barcelona is a fully integrated quad-core processor, which is aggressively optimized for AMD’s 65nm process. Other improvements include:
- 2MB shared L3 cache (32-way associative)
- Improved memory controller performance
- 32B instruction fetch
- Improved indirect branch prediction
- Side-band stack optimizer
- 128-bit FPUs
- 128-bit SSE loads
- Load re-ordering
- Improved prefetchers
- 1GB page table entries, and improved virtualized page tables
- Separate voltage planes and clock distribution for cores, and northbridge/L3 cache
Like its predecessor, Barcelona’s microarchitecture is extremely well-suited to HPC workloads, but weaker for general purpose applications that rely more on integer performance and less on bandwidth. The initial versions of Barcelona will clock at up to 2GHz, with higher frequencies expected in the future as AMD tweaks its 65nm process and the design itself.
Performance
AMD’s initial benchmark results for Barcelona present a puzzling picture; some of the most important server benchmarks have been omitted entirely. There are no TPC-C, TPC-H, TPC-E, SAP, SPECjbb2005, SPECint2006 or SPECfp2006 results at all. Considering that Barcelona is a server processor, the lack of any serious commercial server benchmarks is surprising at best. However, AMD’s results for several prominent HPC benchmarks and applications are quite encouraging, as a 2GHz Barcelona handily outperforms 2.33GHz Xeons on SPEComp2001, Fluent and LS-DYNA, by margins of up to 60%.
AMD also seemed to have problems getting technical websites to conduct thorough reviews of Barcelona. While most websites are seeded with review systems weeks prior to a launch, it appears that AMD only worked with two or three sites, and provided them with about 3 days of testing time. The best review so far is from Tech Report, which concludes that Barcelona is a big step forward, but not sufficient to outperform Intel across a broad range of applications.
The bottom line is that today, Barcelona offers the best performance for HPC workloads that tend to stress system bandwidth, while falling short on more general server workloads such as web serving or transaction processing. Barcelona systems will have lower power consumption than rival Intel products, since they use registered DDR2 memory rather than FB-DIMMs.
ACP
AMD has correctly argued since 2003 that their TDP ratings are inherently more conservative than Intel’s, and hence a 130W Opteron consumes less power than a 130W Xeon during actual operation. This is because AMD’s TDP is defined as the power used when the processor is operating at the highest voltage and drawing the most current – which is equivalent to Intel’s maximum power rating. Both of these numbers measure the instantaneous power draw of a device, which cannot be sustained for any period of time. Fortunately, AMD made the effort to try and come up with more useful ratings for their processors.
AMD’s new Average CPU Power (ACP) rating is supposed to complement the TDP rating and give a more accurate picture of the processor’s thermal envelope. While ACP was not formally defined, it appears to be a geometric mean of the average power of several benchmarks from SPEC, TPC and other sources. Note that this is the power draw of the processor, as measured at the socket – rather than power ‘at the wall’, which is a system metric. Hopefully in the future, we will get a chance to explore ACP in greater detail and to what extent it will be useful to OEMs and end-users.
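To make the apparent definition concrete, here is a minimal sketch of a geometric mean taken over per-benchmark average power readings. It only illustrates the arithmetic; AMD has not published the actual ACP procedure, and the workload names and wattages below are hypothetical placeholders, not measured data.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical average socket power (watts) per benchmark run.
# These numbers are made up for illustration and are NOT AMD data.
avg_power_watts = {
    "workload_a": 64.0,
    "workload_b": 81.0,
    "workload_c": 100.0,
}

acp_like = geometric_mean(list(avg_power_watts.values()))
arithmetic = sum(avg_power_watts.values()) / len(avg_power_watts)

print(f"geometric mean:  {acp_like:.1f} W")
print(f"arithmetic mean: {arithmetic:.1f} W")
```

A geometric mean is never larger than the arithmetic mean of the same readings, so a single power-hungry benchmark pulls a rating of this kind up less than it would a simple average.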
Market Implications
While AMD does not have a decisive performance lead, they do have design wins at every major OEM and a substantially better reputation than in 2003. Barcelona is drop-in compatible with AMD’s current systems (there are roughly 50 out there), so it leverages prior successes and will cost OEMs relatively little to introduce new systems.
While Barcelona’s performance may lag in some areas, AMD has adjusted their pricing to reflect this reality. Realistically, OEMs will not pitch Barcelona systems where performance is the key metric, unless the end users are concerned about HPC workloads. However, for the vast majority of systems, performance is just one of many features, and AMD’s competitive pricing ensures that OEMs will be quite happy to work with their products.
At the Barcelona launch, Hector Ruiz announced that AMD would ship a 2.5GHz part before the end of the year. Such a product will certainly be in the 130W bin, but nonetheless, this raises an interesting question. As Barcelona ships at higher frequencies, thus providing more performance, the price will consistently move up. At what point will AMD’s Barcelona be competitive enough to merit equal pricing to Intel’s offerings? While there are no hard and fast rules, it seems likely that a 2.4-2.6GHz Barcelona will have performance equal to or better than the newer 2.93GHz Xeon MPs for four socket servers. For DP platforms, where AMD’s superior system architecture for the Opteron is less advantageous, they will probably need a 2.6-2.8GHz Barcelona to compete against Intel’s 3GHz Xeon DP products. Of course, AMD will also need to have a relatively close TDP as well. There is relatively little guidance here from AMD, but hopefully, by the early part of next year, AMD will have offerings that are more competitive with Intel.
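The parity estimates above imply a rough per-GHz advantage for Barcelona systems, which a few lines of arithmetic can make explicit. This is only a sketch: it assumes performance scales linearly with clock frequency, which real systems do not strictly do, and the function and dictionary names are illustrative.

```python
def implied_per_clock_ratio(intel_ghz, amd_ghz):
    """Per-GHz throughput ratio (AMD/Intel) implied by equal performance.

    Assumes performance scales linearly with frequency (a simplification).
    """
    if intel_ghz <= 0 or amd_ghz <= 0:
        raise ValueError("frequencies must be positive")
    return intel_ghz / amd_ghz


# Frequencies taken from the estimates in the text:
# (Intel clock in GHz, (lowest, highest) Barcelona clock in GHz for parity)
parity_estimates = {
    "four socket, vs 2.93GHz Xeon MP": (2.93, (2.4, 2.6)),
    "DP, vs 3GHz Xeon DP": (3.0, (2.6, 2.8)),
}

for platform, (intel_ghz, (amd_low, amd_high)) in parity_estimates.items():
    # A lower Barcelona clock at parity implies a larger per-GHz advantage.
    high = implied_per_clock_ratio(intel_ghz, amd_low)
    low = implied_per_clock_ratio(intel_ghz, amd_high)
    print(f"{platform}: {low:.2f}x to {high:.2f}x Xeon throughput per GHz")
```

Under this assumption, the four socket estimate implies a larger per-GHz edge than the DP estimate, consistent with the point that AMD’s system architecture matters more as socket count rises.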

