Here are some benchmark results of K10 beating Intel's quad-core:
Fudzilla - K10 at 2.5 GHz beats QX6800 in multimedia
This summer, AMD will announce its first major architecture change since the introduction of the K8 architecture in 2003. This new architecture, dubbed K10, will first make an appearance in the server space, with the introduction of the Barcelona-family processors.
K10 features a native quad-core design that incorporates shared-L3 cache, HyperTransport-3 support and backwards functionality with AM2 motherboards. However, the original K10 desktop and server processors will debut on the 65nm architecture -- a process AMD only started mastering in December 2006 with the launch of the Brisbane desktop CPU family.
In the second half of 2008, AMD will begin to migrate its K10 architecture to the 45nm node. AMD explicitly mentions that its 45nm process technology utilizes silicon-on-insulator (SOI). Intel's 45nm process node, slated for introduction later this year, uses conventional CMOS process technology.
The halo AMD 45nm chip, Deneb FX, shares the same functionality as its 65nm counterpart, Agena. Both families incorporate native quad-core designs and shared-L3 cache support. Deneb FX goes one step further, adding support for DDR3 on the integrated memory controller.
However, the bulk of AMD's 45nm quad-core offerings will come with the Deneb (non-FX) family. AMD suggests Deneb will be the first processor on the new AM3 socket. Previous AMD documentation indicated that AM2 and AM3 would be forward/backward compatible -- yet AMD engineers claim the AM3 alluded to in 2006 is not the same AM3 referenced in the 2008 launch schedule.
"At the time AM3 was the likely candidate to become AM2+," claimed one field application engineer familiar with AMD's socket migration. "[AMD] wanted to keep the socket name associated with DDR2 memory and backwards compatibility, but AM3 emphasizes DDR3 support."
After Deneb, and closer to 2009, AMD's guidance states that 45nm Propus and Regor will replace the 65nm Kuma and Rana mid-range products. Propus is very similar to Deneb: 45nm, shared L3 cache, AM3 package. However, Propus will only feature two cores. Regor is identical to Propus, but will not include shared-L3 cache support.
AMD's low-end single core Athlon 64 and Sempron appear consolidated with the introduction of the Sargas family. Sargas is an optical shrink of the 65nm Spica core, with the addition of DDR3 support and AM3 packaging. AMD's ultra-low end Sparta-family, slated for introduction this year to replace the Manila-family Semprons, has no successor.
AMD product managers are keeping details of their 45nm technology close. However, this past January AMD and IBM jointly announced plans for high-k, metal gate transistors on future 45nm and 32nm processors.
This past February, AMD senior vice president of technology development Douglas Grose claimed the company is still deciding whether it will use high-k metal gate technology in later 45nm revisions or wait until 32nm.
Marty Seyer, AMD senior vice president, recently disclosed AMD's 45nm server offering slated for release in 2008. Seyer stated that Shanghai, the 45nm successor to Barcelona, would feature additional cache and other performance enhancements.
AMD plans to release its K10-derived Stars-family desktop processors later this year. The new Stars-family processors take advantage of AMD’s Socket AM2+, an updated Socket AM2 platform that adds support for the faster HyperTransport 3.0 bus. AMD’s latest roadmap divulges information on its upcoming HyperTransport 3.0 compatible chipset family, arriving in Q3’2007.
The new AMD discrete graphics chipset family includes four new chipsets ranging from the entry-level RX740 to the flagship RD790. At the top of the discrete graphics lineup is the RD790, which replaces the current AMD 580X. The RD790 serves double duty in AMD’s chipset lineup, powering AMD’s Quad FX Socket 1207+ and Socket AM2 platforms.
AMD’s flagship packs plenty of PCI Express flexibility with up to four physical PCIe x16 slots. The four slots can electrically operate with four 8-lane slots, one 16-lane and three 8-lane, or two 16-lane slots. There are six additional PCIe lanes for additional expansion. The RD790 is fully PCIe 2.0 compatible. AMD plans to target RD790 towards the $150 plus market.
Taking the place of the AMD 480X is the upcoming RD780. The new RD780 supports two physical PCIe x16 slots in dual eight-lane configurations. The two PCIe x16 slots are fully PCIe 2.0 compatible. Slotted below the RD780 is the RX780, which does away with CrossFire multi-GPU support. The RX780 supports a single PCIe 2.0 x16 slot. Both chipsets support AMD’s HyperTransport 3.0 bus. RD780 will target the $70-100 price points while the RX780 takes on the $50-70 price points.
AMD also intends to offer more value conscious consumers the RX740. This chipset supports AMD’s Socket AM2+, but only HyperTransport 1.0. The RX740 does not support PCIe 2.0 either. RX740 will take on the same $50-70 price points as the RX780.
AMD RD780, RX780 and RX740 can also share the same motherboard design, simplifying the design process. The four new chipsets pair up with AMD’s existing SB600 south bridge, as the SB700 won’t be ready until Q4’2007.
The new chipsets will also feature a Windows-based tweaking utility – AMD System Utility. The AMD System Utility allows users to tweak memory settings, automatically overclock the processor, test system stability and benchmark the processor and memory.
A while ago we told you about the intended launch frequencies, basically 1.9-2.5GHz, but that was before B0 parts came back.
We understand that people in Austin were 'dancing in the aisles'. When asked if that was because of B0, we were told that people are very happy, very very happy, but he had 'never heard of such a thing'.
Another source claimed the memory controller, long a bottleneck in K8 scaling, came in way better than expected.
So what do you end up with? A massive gain in frequency. How massive? Almost 500MHz. Instead of the much touted launch parts, look for five SKUs at launch, AM2 quads at 2.6GHz, 2.7GHz and 2.9GHz, a dual at 2.7GHz and a quad on socket F at 2.8GHz.
The AMD 10h processor family - also known as K10 - has 20 key features. We will put them all on paper for you. Some of them are known, but some of them are not. Let's list them.
First of all, the K10 has an integrated DDR2 memory controller with a memory prefetcher, and the K10 core has 64 KB of L1 instruction cache + 64 KB of L1 data cache. It also has on-chip L2 and L3 cache, the size of which varies depending on the core. The Barcelona / Agena quad-core K10, for example, has 4x512 KB L2 and 2 MB of L3 cache. K10 supports 32-byte instruction fetch, instruction predecode and branch prediction during cache line fills, decoupled decode / execution code, 3-way AMD64 instruction decoding, sideband stack optimiser, dynamic scheduling and speculative execution.
The new core also features 3-way integer execution and address generation, 3-way 128-bit wide floating point execution, Enhanced 3DNow! Marchitecture, MMX, SSE, SSE2, SSE3 & SSE4A Single instruction multiple data (SIMD) instruction extensions.
Further, the CPU can cope with advanced bit manipulation instructions, super forwarding, prefetch into L1 data cache, deep out of order integer & floating point execution, 8 additional XMM registers (SSE, SSE2, SSE3 and SSE4A) & 8 additional GPRs in 64-bit mode.
Last but not least is Enhanced HyperTransport marchitecture. If this is too much for you don’t worry, it is too much for most of us, but we like that the K10 supports SSE4A so it might have a fighting chance in encoding.
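For anyone who wants to check a few of these extensions on real hardware, here is a rough Python sketch. It assumes a Linux box, where /proc/cpuinfo lists CPU features on a "flags" line; the flag names used (sse4a, abm, 3dnowext) are the ones the Linux kernel reports, and the helper names cpu_flags and report are made up for illustration.

```python
from pathlib import Path

# Linux /proc/cpuinfo flag names for some of the extensions listed above.
# The labels are only for display.
K10_FLAGS = {
    "sse4a": "SSE4A",
    "abm": "advanced bit manipulation (LZCNT/POPCNT)",
    "3dnowext": "Enhanced 3DNow!",
}

def cpu_flags(cpuinfo_text: str) -> set:
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found (e.g. non-x86 or non-Linux)

def report(cpuinfo_text: str) -> dict:
    """Map each display label to True/False depending on flag presence."""
    flags = cpu_flags(cpuinfo_text)
    return {label: flag in flags for flag, label in K10_FLAGS.items()}

if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        for label, present in report(path.read_text()).items():
            print(f"{label}: {'yes' if present else 'no'}")
    else:
        print("/proc/cpuinfo not found; this sketch only works on Linux")
```

This only reads what the kernel advertises; it says nothing about how fast the instructions run.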
The pinout of Socket-AM2 and Socket-AM2+ is identical, and likewise Socket-1207 and 1207+, and thus the same Agena or Barcelona will work in both sockets, which is how AMD is able to guarantee full backwards compatibility with current AM2 and 1207 motherboards. If you do buy a new motherboard that uses either Socket-AM2+ or 1207+, you will get some additional functionality.
...
...
While your current motherboards will work with AMD's forthcoming CPUs, you'll get better performance out of upcoming Socket-AM2+ and Socket-1207+ platforms. AMD does plan on supporting both AM2 and 1207 into 2009, so you can expect a continued upgrade path for your AMD platforms well after Agena/Barcelona.
Barcelona revision B0 ... in the multimedia test in Sandra it is about 50 % faster than a Core 2 Quad QX6700 ... In the integer part of the test it scores 423,382 instructions per second compared to 289,382 of the Core 2 Quad QX6700 (46%)... In the FPU part of the test the Barcelona B0 scores 305,680 it/s while QX6700 scores 156,012 it/s. Barcelona is twice as fast in this particular test and it still works at a 260 MHz slower clock speed.
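The percentages in that excerpt can be checked from the raw scores. A minimal Python sketch, using only the figures quoted above; the function name percent_faster is just an illustrative choice:

```python
def percent_faster(score: float, baseline: float) -> float:
    """Return how much higher `score` is than `baseline`, in percent."""
    return (score / baseline - 1.0) * 100.0

# Sandra multimedia scores as quoted in the excerpt (it/s).
barcelona_int, qx6700_int = 423_382, 289_382
barcelona_fpu, qx6700_fpu = 305_680, 156_012

int_gain = percent_faster(barcelona_int, qx6700_int)
fpu_gain = percent_faster(barcelona_fpu, qx6700_fpu)

print(f"integer: {int_gain:.0f}% faster")
print(f"FPU:     {fpu_gain:.0f}% faster")
```

The integer result matches the quoted 46%, and the FPU result works out to roughly 96%, so "twice as fast" is a slight round-up.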
AMD's planned introduction of Quad-Core Opteron (Barcelona) CPUs is unlikely to stay on schedule, and a concrete launch date is still unknown, according to sources at Taiwan server makers.
The sources noted that AMD has informed them that the introduction of Barcelona will be delayed until August or September, instead of the originally planned June. However, the sources also noted that this schedule is still subject to change. The Barcelona CPU samples they have currently are not the final versions and bugs are still being discovered, they added.
Some Taiwan-based server vendors commented that their confidence in AMD will be affected by the delay. Roadmaps for new product launches were mapped out by the vendors in 2006, and the delay of Barcelona will interrupt these schedules. Comments gathered from vendors show that the majority expect the delay to have a critical impact.
Besides the risk that the postponement will erode confidence among server makers, AMD also has to face the potential threat from rival Intel, whose 45nm Harpertown and Wolfdale-DP CPUs are both still on schedule to launch as planned, according to sources.
For reference, the Intel Xeon 5355 is clocked at 2.66 GHz and is priced around $1,600. The Intel Xeon 5160 has a core frequency of 3.0 GHz and runs approximately $850. AMD's 3.0 GHz Opteron 2222 SE runs just under $1,000 at retail.
AMD guidance puts the SPECint_rate performance of two quad-core 2.6 GHz Barcelona approximately 23% higher than the quad-core 2.66 GHz Xeon 5355; a score of approximately 104 versus 84.8. SPECfp_rate performance puts the Barcelona performance almost 58% higher than that of the Intel Xeon 5355; 92 versus 58.8.
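As a sanity check on those numbers, here is a small Python sketch that recomputes the gains from the quoted scores and also works backwards from the quoted percentages. The Barcelona scores are given only approximately in the guidance, so small mismatches are to be expected; the helper names gain_pct and implied_score are hypothetical.

```python
# Figures as quoted in the excerpt; the Barcelona scores are approximate.
results = {
    "SPECint_rate": {"barcelona": 104.0, "xeon_5355": 84.8, "quoted_gain_pct": 23.0},
    "SPECfp_rate": {"barcelona": 92.0, "xeon_5355": 58.8, "quoted_gain_pct": 58.0},
}

def gain_pct(new: float, base: float) -> float:
    """Percentage by which `new` exceeds `base`."""
    return (new / base - 1.0) * 100.0

def implied_score(base: float, gain: float) -> float:
    """Score that would be `gain` percent above `base`."""
    return base * (1.0 + gain / 100.0)

for name, r in results.items():
    computed = gain_pct(r["barcelona"], r["xeon_5355"])
    implied = implied_score(r["xeon_5355"], r["quoted_gain_pct"])
    print(f"{name}: computed gain {computed:.1f}%, "
          f"quoted gain implies a score of about {implied:.1f}")
```

From the rounded scores the gains come out near 22.6% and 56.5%. A 58% SPECfp_rate gain would need a Barcelona score closer to 93 than 92, which is consistent with the scores being rounded.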
Barcelona HPC improvements include a wider instruction set, L3 cache, new SIMD support and better branch prediction. Kanter also claims, "Improvements that Barcelona is making are not necessarily as targeted for single threaded performance." Specifically, Kanter discounts SSE improvements as a major performance head turner, but for some applications it certainly is a huge single threaded help. For example, it's not going to make web browsers or word processors faster; but it would certainly help single threaded performance for games and numerical stuff.
AnandTech founder Anand Lal Shimpi disagrees with Kanter's dismissal of new SSE instructions on Barcelona. "Many of the major changes to Barcelona were driven by one significant change: what AMD is calling SSE128," he states. Shimpi tells DailyTech, "The culmination of the SSE128 improvements is very similar to some of the changes made in the Yonah to Merom transition."
However, what the SPECint_rate and SPECfp_rate benchmarks don't show is the ability to handle process-to-process throughput rates. Kanter highlights this to DailyTech, stating, "For stuff like web serving, application serving, I think Barcelona will kinda be a mixed bag, won't be a real home run." He clarifies this by emphasizing many of the K10 changes have possible drawbacks, including the split power-plane.
"The split power-plane, while saving power, has tendencies to make moving data between them a little awkward." Kanter continues, "It's a subtle thing, but in the end it will all depend on latency."
On the other hand, changes to the architecture are actually specifically geared at improving socket-to-socket performance. Four socket systems will now utilize one 16-bit HyperTransport link to each socket on the system -- eight-socket systems will utilize one 8-bit HyperTransport link to each socket. But, as Kanter stated earlier, this is largely an HPC change and will not affect desktop and dual-socket performance.