  • Locked thread
Captain Foo
May 11, 2004

we vibin'
we slidin'
we breathin'
we dyin'

Agile Vector posted:

ur mum's cooter lol


BONGHITZ
Jan 1, 1970

omg

Ator
Oct 1, 2005

amd should make a javascript coprocessor

Bored Online
May 25, 2009

We don't need Rome telling us what to do.

Endless Mike posted:

he saw what intel could have done if only weyland-yutani hadn't gone for the lowest bidder

intel does what amdon't

Notorious b.s.d.
Jan 25, 2003

by Reene

seeing as naples is four loving chips in one package, it would be pretty shameful if they didn't beat an e5 on speeds and feeds

(not so sure about watts and dollars!)

Notorious b.s.d.
Jan 25, 2003

by Reene
ryzen looks surprisingly ok

for naples all this poo poo hangs on their "infinity fabric," because it's four chips in one god drat can. if the interconnect sucks, naples sucks. if the interconnect rules, then... naples might be kinda sorta ok, maybe.

amd has been pretty tight-lipped about the interconnect so, uh, it don't look real good

Bloody
Mar 3, 2013

Ator posted:

amd should make a javascript coprocessor

every once in a while when I'm very drunk I think about this idea

Asymmetric POSTer
Aug 17, 2005

Bloody posted:

every once in a while when I'm very drunk I think about this idea

you're very drunk more than every once in a while bloody <:shobon:>

Cybernetic Vermin
Apr 18, 2005

it's fun to hate on amd and all, but in general this new stuff looks like it'll be competitive enough with intel that they can play around on price. intel has huge margins, so amd doesn't need to beat them on any straight performance metric to make a bit of money

Perplx
Jun 26, 2004


Best viewed on Orgasma Plasma
Lipstick Apathy

Notorious b.s.d. posted:

ryzen looks surprisingly ok

for naples all this poo poo hangs on their "infinity fabric," because it's four chips in one god drat can. if the interconnect sucks, naples sucks. if the interconnect rules, then... naples might be kinda sorta ok, maybe.

amd has been pretty tight-lipped about the interconnect so, uh, it don't look real good

the core to core latency isn't that great




https://www.pcper.com/reviews/Processors/AMD-Ryzen-and-Windows-10-Scheduler-No-Silver-Bullet

Fabricated
Apr 9, 2007

Living the Dream
Ryzen is merely not quite as good as Intel stuff, which is a huge improvement over an utterly comical disaster like bulldozer

Notorious b.s.d.
Jan 25, 2003

by Reene

this is latency between cores on a single chip. it is not relevant to the question. what's important for naples is the performance of the interconnect between chips.

naples is literally 4x ryzen-type chips glued together. i don't mean that conceptually, or metaphorically. i mean four discrete, separated pieces of silicon glued into a single package with little wires between them.

if the interconnect between chips is good, then naples will be good. if the interconnect is bad, then naples will be a turd.

"infinity fabric" is the new interconnect. like hypertransport, except, AMD. if anyone sees something about "infinity fabric" pls post it here

Breakfast All Day
Oct 21, 2004

Notorious b.s.d. posted:

if anyone sees something about "infinity fabric" pls post it here

Bloody
Mar 3, 2013

mishaq posted:

you're very drunk more than every once in a while bloody <:shobon:>

yeah :v:

MrBadidea
Apr 1, 2009

Notorious b.s.d. posted:

"infinity fabric" is the new interconnect. like hypertransport, except, AMD. if anyone sees something about "infinity fabric" pls post it here

it's used in the current 8C chips - they're functionally 2 * 4C CCXs connected over that interconnect. that's what the graphs were showing, moving a thread between cores in the same CCX vs between the two CCXs thus hitting the fabric (although the graphs were trying to figure out which logical cores were the hyperthread ~threadripper~ cores, it just happens to show the actual physical CCX setup too)

what i'm interested in is how the 2 socket systems are gonna handle big gpgpu clusters; each socket has 128 pcie lanes, but in 2 socket setups, 64 are used to provide the fabric interconnect so there's still 128 lanes leaving the sockets for the rest of the system, half from each socket
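
incidentally, which cores share an L3 (i.e. sit in the same CCX) is visible straight from sysfs on Linux - the kernel exports a shared_cpu_list per cache. a rough sketch below; the sysfs cache paths are standard, the "same L3 = same CCX" reading is the assumption. on an 1800X you'd expect two groups of four cores plus their SMT siblings.

code:

#include <stdio.h>
#include <string.h>

static int read_line(const char *path, char *buf, size_t len) {
    FILE *f = fopen(path, "r");
    if (!f) return -1;
    if (!fgets(buf, (int)len, f)) { fclose(f); return -1; }
    buf[strcspn(buf, "\n")] = '\0';
    fclose(f);
    return 0;
}

int main(void) {
    char path[256], level[16], shared[256];
    for (int cpu = 0; ; cpu++) {
        int have_cpu = 0;
        for (int idx = 0; idx < 8; idx++) {
            snprintf(path, sizeof(path),
                     "/sys/devices/system/cpu/cpu%d/cache/index%d/level", cpu, idx);
            if (read_line(path, level, sizeof(level)) != 0) continue;
            have_cpu = 1;
            if (strcmp(level, "3") != 0) continue;    /* only care about the L3 */
            snprintf(path, sizeof(path),
                     "/sys/devices/system/cpu/cpu%d/cache/index%d/shared_cpu_list", cpu, idx);
            if (read_line(path, shared, sizeof(shared)) == 0)
                printf("cpu%-3d shares its L3 with cpus %s\n", cpu, shared);
            break;
        }
        if (!have_cpu) break;    /* ran past the last cpu */
    }
    return 0;
}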

Notorious b.s.d.
Jan 25, 2003

by Reene

MrBadidea posted:

what i'm interested in is how the 2 socket systems are gonna handle big gpgpu clusters; each socket has 128 pcie lanes, but in 2 socket setups, 64 are used to provide the fabric interconnect so there's still 128 lanes leaving the sockets for the rest of the system, half from each socket

this could get kinda ugly

if it's 16 pci-e lanes per chip, does that mean it's literally one gpu per chip, and moving data in/out of gpu memory requires going across the interconnect for every memory access?

Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE

Notorious b.s.d. posted:

this is latency between cores on a single chip. it is not relevant to the question. what's important for naples is the performance of the interconnect between chips.

naples is literally 4x ryzen-type chips glued together. i don't mean that conceptually, or metaphorically. i mean four discrete, separated pieces of silicon glued into a single package with little wires between them.

if the interconnect between chips is good, then naples will be good. if the interconnect is bad, then naples will be a turd.

"infinity fabric" is the new interconnect. like hypertransport, except, AMD. if anyone sees something about "infinity fabric" pls post it here

the standard R7 chip (1700, 1700X, 1800X) is just a pair of dies glued together though, so Nipples will actually be 8 pieces of silicon glued together

it's the same interconnect in both cases, the performance of the interconnect between a pair of dies still tells us a little about how it might perform between 8 dies - although the 5960X is not really the relevant comparison there since you'd want to look at the big quad-die E5s/E7s where the interconnect is being stressed a little harder

JawnV6
Jul 4, 2004

So hot ...
whats the topology

suck my woke dick
Oct 10, 2012

:siren:I CANNOT EJACULATE WITHOUT SEEING NATIVE AMERICANS BRUTALISED!:siren:

Put this cum-loving slave on ignore immediately!

Paul MaudDib posted:

the standard R7 chip (1700, 1700X, 1800X) is just a pair of dies glued together though, so Nipples will actually be 8 pieces of silicon glued together

it's the same interconnect in both cases, the performance of the interconnect between a pair of dies still tells us a little about how it might perform between 8 dies - although the 5960X is not really the relevant comparison there since you'd want to look at the big quad-die E5s/E7s where the interconnect is being stressed a little harder

make more dies.

Tokamak
Dec 22, 2004

akadajet posted:

why is walter crying?

because a decade ago he would have been a busty lady in skimpy clothing, or a sick rear end metallic robot-demon thing.

Silver Alicorn
Mar 30, 2008

𝓪 𝓻𝓮𝓭 𝓹𝓪𝓷𝓭𝓪 𝓲𝓼 𝓪 𝓬𝓾𝓻𝓲𝓸𝓾𝓼 𝓼𝓸𝓻𝓽 𝓸𝓯 𝓬𝓻𝓮𝓪𝓽𝓾𝓻𝓮

blowfish posted:

make more dies.

the bigger the better. bring back card-edge cpus

Mr. Apollo
Nov 8, 2000

akadajet posted:

why is walter crying?

he realized that he's powered by an amd chip

Notorious b.s.d.
Jan 25, 2003

by Reene

Paul MaudDib posted:

the standard R7 chip (1700, 1700X, 1800X) is just a pair of dies glued together though, so Nipples will actually be 8 pieces of silicon glued together

the ryzen is still a single die, even if it is two "core complexes" connected by fabric on that single die

i'm not sure we can infer very much from the fabric performance in the degenerate case. we don't know how many links exist on the ryzen CCXs vs a naples CCX, or what the specific topology will be with eight CCXs on four dies

Paul MaudDib posted:

it's the same interconnect in both cases, the performance of the interconnect between a pair of dies still tells us a little about how it might perform between 8 dies - although the 5960X is not really the relevant comparison there since you'd want to look at the big quad-die E5s/E7s where the interconnect is being stressed a little harder

as far as i know all intel x86 chips are 1 die in 1 package. even the monster 22 core E5s are on a single gigantic chip.

the main difference between e5 and e7 is how many qpi links you've got, which determines the possible topologies to link sockets together
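
the socket topology does leak out to software, at least as the ACPI SLIT distance table. a small libnuma sketch (assumes Linux with libnuma installed, link with -lnuma; nothing here is intel- or amd-specific) that dumps the node-to-node distance matrix - on a 2-socket box the off-diagonal entries are the penalty for crossing qpi/the fabric:

code:

#include <numa.h>
#include <stdio.h>

int main(void) {
    if (numa_available() < 0) {
        fprintf(stderr, "no NUMA support here\n");
        return 1;
    }
    int max = numa_max_node();
    for (int i = 0; i <= max; i++) {
        printf("node%-2d:", i);
        for (int j = 0; j <= max; j++)
            printf(" %3d", numa_distance(i, j));   /* 10 = local, higher = more hops */
        printf("\n");
    }
    return 0;
}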

Farmer Crack-Ass
Jan 2, 2001

this is me posting irl

Notorious b.s.d. posted:

the main difference between e5 and e7 is how many qpi links you've got,

hifi
Jul 25, 2012

i think anything with crystal well is 2 dies on the pcb

Tokamak
Dec 22, 2004

akadajet posted:

why is walter crying?

core temperature is reaching dangerous levels

atomicthumbs
Dec 26, 2010


We're in the business of extending man's senses.

BangersInMyKnickers
Nov 3, 2004

I have a thing for courageous dongles


High end xeons have the same problem where the 12+ core packages are basically two 6 or 8 core packages glued together with a high-speed crossbar. Software needs to be NUMA aware and hardware needs to present each set of cores as its own node so it knows not to jump the crossbar if possible, or to put latency-insensitive things across it. It's probably ok for desktop workloads at the moment though a 4 and 4 core design is pathetic by today's standards and is going to cause a lot of headaches for programmers to optimize for.

Notorious b.s.d. posted:

as far as i know all intel x86 chips are 1 die in 1 package. even the monster 22 core E5s are on a single gigantic chip.

the main difference between e5 and e7 is how many qpi links you've got, which determines the possible topologies to link sockets together

Nope, they call it cluster-on-die. Intel broke out the glue gun too.
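
the "software needs to be NUMA aware" part mostly comes down to keeping a thread's memory on its own node so it never has to cross the crossbar/fabric for its own working set. a hedged libnuma sketch (assumes Linux, link with -lnuma; the 64 MiB size is arbitrary):

code:

#define _GNU_SOURCE
#include <numa.h>
#include <sched.h>
#include <stdio.h>

int main(void) {
    if (numa_available() < 0) {
        fprintf(stderr, "no NUMA here\n");
        return 1;
    }
    int cpu  = sched_getcpu();              /* core this thread is running on */
    int node = numa_node_of_cpu(cpu);       /* NUMA node that core belongs to */

    size_t len = 64UL * 1024 * 1024;
    void *buf = numa_alloc_onnode(len, node);   /* pages backed by that node */
    if (!buf) {
        fprintf(stderr, "allocation failed\n");
        return 1;
    }
    printf("cpu %d -> node %d, %zu MiB of node-local memory\n", cpu, node, len >> 20);

    /* ... latency-sensitive work on buf goes here ... */

    numa_free(buf, len);
    return 0;
}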

suck my woke dick
Oct 10, 2012

:siren:I CANNOT EJACULATE WITHOUT SEEING NATIVE AMERICANS BRUTALISED!:siren:

Put this cum-loving slave on ignore immediately!

is that the german delidder guy?

i mean yeah if i had infinite money i would buy a shoebox of ryzens and then go full threadripper on them because i can :cadfan:

A Pinball Wizard
Mar 23, 2005

I know every trick, no freak's gonna beat my hands

College Slice
idgi why did he break the socket

suck my woke dick
Oct 10, 2012

:siren:I CANNOT EJACULATE WITHOUT SEEING NATIVE AMERICANS BRUTALISED!:siren:

Put this cum-loving slave on ignore immediately!

A Pinball Wizard posted:

idgi why did he break the socket

*mounts processor in vice*
*applies hammer and chisel*

Notorious b.s.d.
Jan 25, 2003

by Reene

BangersInMyKnickers posted:

High end xeons have the same problem where the 12+ core packages are basically two 6 or 8 core packages glued together with a high speed crossbar.

yes the really big xeon chips (haswell "HCC") have a funny logical layout, but i'm pretty sure they're still physically a single huge die.

here is an hcc die from xeon e5 2600 v3. it's very easy to see, and count, the L2 caches for the 18 cores. (if this is actually two chips glued together, i sure don't see the seam.)



BangersInMyKnickers posted:

Nope, they call it cluster-on-die. Intel broke out the glue gun too.

"cluster-on-die" is a bios flag that changes l3 cache handling

you can either split the big chips into two numa zones, with lower latency and crappier L3 cache performance, or you can leave them in a single pool. higher latency, better caching.

Notorious b.s.d. fucked around with this message at 15:19 on Mar 17, 2017

BangersInMyKnickers
Nov 3, 2004

I have a thing for courageous dongles


It's a bit more complicated than that. The caches might be unified but there are four memory controllers and each core is only able to directly address two at a time. The crossbar provides the interconnect from the two halves of the processor with the two sets of memory controllers. Hitting the crossbar incurs a latency and potential bandwidth bottleneck so CoD defines the numa domains so the memory manager can attempt to avoid that when possible for everything except extremely large VMs or large/parallel workloads. I don't believe it splits L3.
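
if you want to check whether the memory manager actually kept a page near the cores using it (i.e. whether an access stays on the local memory controllers or hops the crossbar), move_pages() with a NULL node list just reports the node each page currently lives on. a sketch, assuming Linux with the libnuma headers (link with -lnuma):

code:

#include <numaif.h>    /* move_pages() */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    long pagesz = sysconf(_SC_PAGESIZE);
    void *buf;
    if (posix_memalign(&buf, (size_t)pagesz, (size_t)pagesz) != 0)
        return 1;
    memset(buf, 1, (size_t)pagesz);    /* fault the page in so it has a home node */

    void *pages[1]  = { buf };
    int   status[1] = { -1 };
    /* nodes == NULL: don't migrate anything, just report the node per page */
    if (move_pages(0, 1, pages, NULL, status, 0) != 0) {
        perror("move_pages");
        return 1;
    }
    printf("page at %p lives on node %d\n", buf, status[0]);
    free(buf);
    return 0;
}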

Fabricated
Apr 9, 2007

Living the Dream
AMD...lol!

http://www.pcmag.com/news/352538/ryzen-7-chips-are-locking-up-pcs-amd-knows-why

quote:

AMD threw Intel a curve ball in February when the chip company announced its Ryzen CPUs would launch in early March. They are fast and significantly cheaper than Intel's equivalent Core processors. It even led to some price cuts by Intel.

But with Ryzen chips now making their way into desktop PCs, AMD experienced its first major problem. All variants of the Ryzen 7 desktop processors are locking up PCs. The issue is related to FMA3 code, which are a set of streaming SIMD Extensions (SSE) that can greatly enhance the performance of floating point operations carried out by the chips. FMA3 isn't new. AMD added support for the instruction set back in 2012.
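
for context on the quote, FMA3 is the fused multiply-add extension (d = a*b + c in one instruction). this is just a trivial illustration of what "FMA3 code" looks like, not the actual sequence that hangs the chips. build with gcc -O2 -mfma:

code:

#include <immintrin.h>
#include <stdio.h>

int main(void) {
    __m256 a = _mm256_set1_ps(1.5f);
    __m256 b = _mm256_set1_ps(2.0f);
    __m256 c = _mm256_set1_ps(0.5f);

    /* one vfmadd instruction: d = a*b + c across 8 floats */
    __m256 d = _mm256_fmadd_ps(a, b, c);

    float out[8];
    _mm256_storeu_ps(out, d);
    printf("%f\n", out[0]);    /* 1.5*2.0 + 0.5 = 3.5 */
    return 0;
}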

skimothy milkerson
Nov 19, 2006


MAH BITS!

infernal machines
Oct 11, 2012

we monitor many frequencies. we listen always. came a voice, out of the babel of tongues, speaking to us. it played us a mighty dub.
amd: we'll fix it in POST

Asymmetric POSTer
Aug 17, 2005

infernal machines posted:

amd: we'll fix it in POST

Fabricated
Apr 9, 2007

Living the Dream

infernal machines posted:

amd: we'll fix it in POST

atomicthumbs
Dec 26, 2010


We're in the business of extending man's senses.

BangersInMyKnickers posted:

It's a bit more complicated than that. The caches might be unified but there are four memory controllers and each core is only able to directly address two at a time. The crossbar provides the interconnect from the two halves of the processor with the two sets of memory controllers. Hitting the crossbar incurs a latency and potential bandwidth bottleneck so CoD defines the numa domains so the memory manager can attempt to avoid that when possible for everything except extremely large VMs or large/parallel workloads. I don't believe it splits L3.

https://www.youtube.com/watch?v=KmtzQCSh6xk


Mr.Radar
Nov 5, 2005

You guys aren't going to believe this, but that guy is our games teacher.

infernal machines posted:

amd: we'll fix it in POST

  • Locked thread