Barcelona
-
- Tenth Dan Procrastinator
- Posts: 4891
- Joined: Fri Jul 18, 2003 3:09 am
- Location: San Jose, CA
Do you know what sort of cache coherency, if any, they're using? It's either something related to the data cache or they messed up their simulation of the chip-to-chip interconnect somehow.
The reason for this is getting the server-oriented Barcelona to work reliably in 2P or above configurations is a leading factor in the reduced core speeds expected at launch.
-
- Grand Pooh-Bah
- Posts: 6722
- Joined: Tue Sep 19, 2006 8:45 pm
- Location: Portland, OR
Caveat: I know diddly about AMD uarch. However, I still know more than you.
It very much sounds like an HT speedpath, because the chipset link and the remote sockets links are not the same. But jeez, it's a different clock. Lowering your core clock simply won't improve your HT frequency unless it's a heat problem. That doesn't make any sense. Perhaps they have some logic operating on a core clock that handles transactions...
The last level cache (L3, I suppose) on Barcelona is neither inclusive nor exclusive. It is a large victim cache holding data evicted from the core caches. I would assume it uses MESI, but if I've been shown the info I have forgotten it.
Disclaimer: The postings on this site are my own and don't necessarily represent Intel's positions, strategies, or opinions.
-
- Tenth Dan Procrastinator
- Posts: 4891
- Joined: Fri Jul 18, 2003 3:09 am
- Location: San Jose, CA
Maybe it is an HT speed path and they're doing something funky like counting cycles for an invalidate signal to reach the other caches and declaring exclusive locally after that much time so they don't have to wait for an ACK. Slowing down the core gives HT more time. Anyways, at the top level, it still sounds like a coherency problem. The root cause will be pretty much impossible for us to diagnose here.
Dwindlehop wrote: Caveat: I know diddly about AMD uarch. However, I still know more than you.
It very much sounds like an HT speedpath, because the chipset link and the remote sockets links are not the same. But jeez, it's a different clock. Lowering your core clock simply won't improve your HT frequency unless it's a heat problem. That doesn't make any sense. Perhaps they have some logic operating on a core clock that handles transactions...
The last level cache (L3, I suppose) on Barcelona is neither inclusive nor exclusive. It is a large victim cache holding data evicted from the core caches. I would assume it uses MESI, but if I've been shown the info I have forgotten it.
Aren't victim caches exclusive typically? As you said, you know better than me, but "holding data evicted from the core caches" sounds exclusive. Also, I would think that a large victim cache would be pretty damn slow since it would typically have high associativity, especially as the ratio of L3 to L1/2 increases... What are the sizes of the L1/2 and L3 caches?
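To put the cycle-counting guess from earlier in this post into numbers, here is a minimal sketch. It assumes a hypothetical design where the core waits a fixed number of core cycles after sending an invalidate and then treats the line as exclusive. The function name, the cycle count, and the link latency are all invented for illustration; none of them are AMD figures.

```python
def wait_covers_invalidate(core_ghz, wait_cycles, link_latency_ns):
    """Return True if a fixed wait, counted in core cycles, lasts at least
    as long as the invalidate needs to reach the remote caches.

    All three inputs are hypothetical; nothing here is a real AMD number.
    """
    wait_ns = wait_cycles / core_ghz  # cycles / (cycles per ns) = ns
    return wait_ns >= link_latency_ns


# Made-up example: 120-cycle wait, 50 ns for the invalidate to land.
for ghz in (2.0, 2.3, 2.6):
    ok = wait_covers_invalidate(ghz, wait_cycles=120, link_latency_ns=50.0)
    print(f"{ghz:.1f} GHz: wait = {120 / ghz:.1f} ns, safe = {ok}")
```

Under those assumptions the same cycle count buys less wall-clock time as the core clock goes up, which is the only way a lower core clock could help a link that runs on its own clock.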
-
- Grand Pooh-Bah
- Posts: 6722
- Joined: Tue Sep 19, 2006 8:45 pm
- Location: Portland, OR
http://theinquirer.net/default.aspx?article=40348
http://www.theinquirer.net/default.aspx?article=40347
Edit: I was on bad crack. Expect Phenoms for Christmas 2007
Last edited by Jonathan on Wed Jul 11, 2007 1:18 am, edited 1 time in total.
-
- Grand Pooh-Bah
- Posts: 6722
- Joined: Tue Sep 19, 2006 8:45 pm
- Location: Portland, OR
http://online.wsj.com/article/SB1183100 ... YWORDS=AMD
Shipping parts in August, systems available in September. Launch frequency is 2.0 GHz.
-
- Grand Pooh-Bah
- Posts: 6722
- Joined: Tue Sep 19, 2006 8:45 pm
- Location: Portland, OR
It's 2MB L3 (not 2MB per core). This is why the number was weird.
Dwindlehop wrote: 64k il1/dl1
512k ul2
2mb per core l3, but even as I type it it seems high. maybe it's 1mb per core, but I'm pretty sure it's 2.
http://www.anandtech.com/cpuchipsets/sh ... i=2939&p=9
-
- Grand Pooh-Bah
- Posts: 6722
- Joined: Tue Sep 19, 2006 8:45 pm
- Location: Portland, OR
I am 100% right.
Dwindlehop wrote: A victim cache needn't dump the line to a core on a hit, but an exclusive cache must. I could be wrong, but I recall their l3 being neither inclusive nor exclusive.
So new fetches fill into the L2 (not L3). They will eventually be evicted into the L3. If the line is hit from L3, sometimes it is invalidated and sometimes it isn't. Therefore, the L3 is neither exclusive nor inclusive.
"The new L3 cache, acts as a victim for the L2 cache. So when the small L2 cache fills up, evicted data is sent to the larger L3 cache where it is kept until space is needed. The algorithms that govern the L3 cache's operation are designed to accommodate data that is likely to be needed by multiple cores. If the CPU fetches a bit of code, a copy is left in the L3 cache since the code is likely to be shared among the four cores. Pure data load requests however go through a separate process. The cache controller looks at history and if the data has been shared before, a copy will be left in the L3 cache; otherwise it will be invalidated."
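The fill and hit policy described above can be sketched as a toy model. This is only an illustration under loud assumptions: the class and method names are made up, capacity and replacement are ignored, and the "history" is modeled as a simple set of addresses that some core other than the evicting one has requested. The real controller's history mechanism is not described in the thread.

```python
class VictimL3:
    """Toy model of an L3 that is neither inclusive nor exclusive."""

    def __init__(self):
        self.lines = {}              # address -> core that evicted it from L2
        self.shared_history = set()  # assumed: addresses requested across cores

    def fill_from_l2_eviction(self, addr, core):
        """New fetches bypass the L3; lines arrive only as L2 victims."""
        self.lines[addr] = core

    def lookup(self, addr, core, is_code):
        """Return (hit, kept): did the L3 hit, and did it keep its copy?"""
        if addr not in self.lines:
            return (False, False)
        # Code fetches always leave a copy; data loads only if shared before.
        keep = is_code or addr in self.shared_history
        if self.lines[addr] != core:
            self.shared_history.add(addr)  # remember the cross-core request
        if not keep:
            del self.lines[addr]           # this hit behaves like an exclusive cache
        return (True, keep)


l3 = VictimL3()
l3.fill_from_l2_eviction(0x100, core=0)
print(l3.lookup(0x100, core=0, is_code=True))   # code fetch: copy stays in L3
print(l3.lookup(0x100, core=1, is_code=False))  # data load, no history yet: invalidated
l3.fill_from_l2_eviction(0x100, core=1)
print(l3.lookup(0x100, core=0, is_code=False))  # data load, shared before: copy stays
```

Because a hit sometimes removes the line and sometimes leaves it, the model is exclusive on some hits and not on others, which is the point being made in the post.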
-
- Grand Pooh-Bah
- Posts: 6722
- Joined: Tue Sep 19, 2006 8:45 pm
- Location: Portland, OR
Seems like the L3 can't have the line in E if a core does an RFO, so I doubt they bother with core valid bits. It's sorta like a third caching agent. If the line is in this L3, you don't need to snoop the cores. If it's not, you do. You couldn't really get any benefit from core valid bits since the MESI state sorts you out.
Now, in an inclusive cache, core valid bits would help because the cache could filter unnecessary snoops to the cores. The inclusivity would guarantee correctness.
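A rough sketch of the two snoop cases contrasted above. Everything here is a simplified assumption for illustration: the function names are made up, MESI states are left out, and the core valid bits are modeled as a plain bitmask with one bit per core.

```python
NUM_CORES = 4  # Barcelona is quad-core; the rest of this model is assumed


def snoops_victim_l3(l3_hit, requester):
    """Non-inclusive victim L3 as described in the post: a hit needs no
    core snoops, a miss means every other core has to be snooped."""
    if l3_hit:
        return []
    return [c for c in range(NUM_CORES) if c != requester]


def snoops_inclusive(core_valid_mask, requester):
    """Inclusive cache with core valid bits. core_valid_mask is None on a
    miss, otherwise a bitmask of cores that may hold the line."""
    if core_valid_mask is None:
        return []  # inclusivity guarantees no core has the line
    return [c for c in range(NUM_CORES)
            if c != requester and core_valid_mask & (1 << c)]


print(snoops_victim_l3(l3_hit=False, requester=0))  # all other cores
print(snoops_inclusive(0b0101, requester=0))        # only cores flagged in the mask
print(snoops_inclusive(None, requester=0))          # miss: nothing to snoop
```

The valid bits only pay off in the inclusive case, because that is the only case where a miss or a clear bit is a guarantee instead of a hint.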
-
- Grand Pooh-Bah
- Posts: 6722
- Joined: Tue Sep 19, 2006 8:45 pm
- Location: Portland, OR
http://www.dailytech.com/AMD+to+Can+Sin ... le7960.htm
New price cuts and EOL announcements. If you have an existing AMD platform you want to upgrade, now is probably the time to pounce.
-
- Grand Pooh-Bah
- Posts: 6722
- Joined: Tue Sep 19, 2006 8:45 pm
- Location: Portland, OR
Barcelona Opterons are out of NDA this week. AMD is already sampling 2.5 GHz Opterons, so I guess they worked out the kinks, just not in time to launch.
http://www.anandtech.com/cpuchipsets/sh ... i=3092&p=5
This is a review pitting the Barcelona versus the old Opteron on the same motherboard on desktop benchmarks. Barcelona gets about 15% perf per clock over the old Opteron on most apps. The games are some of the better outliers. Expectation is that this trend will hold true for Phenom launch.
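As back-of-the-envelope arithmetic on that 15% figure, here is a small sketch. It assumes performance scales linearly with clock and that the per-clock gain is a flat 15%, which the review only claims for "most apps"; the function name is made up.

```python
def equivalent_old_clock(new_ghz, per_clock_gain=0.15):
    """Clock an old Opteron would need to match a Barcelona at new_ghz,
    assuming linear scaling with frequency (a simplification)."""
    return new_ghz * (1.0 + per_clock_gain)


# 2.0 GHz is the launch frequency and 2.5 GHz the sampling part mentioned in this thread.
for ghz in (2.0, 2.5):
    print(f"Barcelona at {ghz:.1f} GHz ~ old Opteron at {equivalent_old_clock(ghz):.3f} GHz")
```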
In other news, Penryn Xeons launch in November. I don't think I've seen the date for desktop Penryns.