Rastor
Jun 2, 2001

Rest In Peace, Andy Grove.

HMS Boromir
Jul 16, 2011

by Lowtax
Someone posted this in the AMD thread first, which is kind of funny, but no more tick-tock. Intel is feeling the pressure of having only a few die shrinks left before physics stops them, and they're officially announcing their intent to use each manufacturing process for much longer.

Potato Salad
Oct 23, 2014

nobody cares


Now it is official? What was Broadwell, then?

HMS Boromir
Jul 16, 2011

by Lowtax
It was already official that 14nm and 10nm would be getting 3 architectures each, so I guess it's less that tick-tock is over and more that tick-tock-refresh is also over and they're going to stick to one process for even longer than that.

The fact that they're including 14nm in the statement makes it sound like Cannonlake is getting further delayed, too.

Twerk from Home
Jan 17, 2009

This avatar brought to you by the 'save our dead gay forums' foundation.

Potato Salad posted:

Now it is official? What was Broadwell, then?

Broadwell was a "tick", a 14nm die shrink of 22nm Haswell. It's Kaby Lake that's the new "Optimization" step.

KingEup
Nov 18, 2004
I am a REAL ADDICT
(to threadshitting)


Please ask me for my Google-inspired wisdom on shit I know nothing about. Actually, you don't even have to ask.

EdEddnEddy posted:

The thing about that Razer Core, though, is the price ($499!?!) and the fact that you are limited to PCI-E 3.0 @ 4X. Since each PCI-E standard doubles the last, that means it is only as fast as PCI-E 1.0 X16, which was good years ago, but throwing anything better than a 970 in there seems like it would bottleneck the hell out of it.

I don't see how bottlenecking is a problem. We have adaptive sync these days, and as long as your minimum FPS is not dropping below, say, 40 fps, you aren't going to notice.

Looking forward to the new NUC. Here's hoping it will run DOTA2 on high settings at 1080p.

Edit: some Razer Core benchmarks here:

KingEup fucked around with this message at 01:30 on Mar 26, 2016
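
As a back-of-the-envelope check on EdEddnEddy's quoted PCIe math, here is a minimal sketch using the published effective per-lane rates (roughly 250/500/985 MB/s for 1.0/2.0/3.0; the 1.0/2.0 figures reflect 8b/10b encoding overhead, while 3.0 uses 128b/130b). These are approximations, not a benchmark:

```c
#include <stdio.h>

int main(void) {
    /* Effective per-lane bandwidth in MB/s for PCIe 1.0, 2.0, 3.0. */
    const double lane[] = { 250.0, 500.0, 984.6 };

    /* Per-lane rate roughly doubles each generation, so halving the
     * lane count each generation keeps total bandwidth about constant. */
    printf("PCIe 1.0 x16: %.1f GB/s\n", lane[0] * 16 / 1000.0); /* ~4.0 */
    printf("PCIe 2.0 x8:  %.1f GB/s\n", lane[1] *  8 / 1000.0); /* ~4.0 */
    printf("PCIe 3.0 x4:  %.1f GB/s\n", lane[2] *  4 / 1000.0); /* ~3.9 */
    return 0;
}
```

So the quoted claim checks out: a 3.0 x4 link really does offer roughly PCIe 1.0 x16 bandwidth.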

Josh Lyman
May 24, 2009


I need Kaby Lake to get here already so that I can get a new laptop.

Gucci Loafers
May 20, 2006

Ask yourself, do you really want to talk to a pair of really nice gaudy shoes?


Josh Lyman posted:

I need Kaby Lake to get here already so that I can get a new laptop.

I feel like I won't be upgrading Sandy Bridge until Cannonlake.

Josh Lyman
May 24, 2009


Tab8715 posted:

I feel like I won't be upgrading Sandy Bridge until Cannonlake.
Yeah as far as my desktop goes, I don't see any reason to upgrade my 3570K until it dies.

DrDork
Dec 29, 2003
commanding officer of the Army of Dorkness

Tab8715 posted:

I feel like I won't be upgrading Sandy Bridge until Cannonlake.
I only upgraded to a 5820K because my P67 motherboard literally fried a trace, and there was no way I was paying $150+ for a replacement when for $100 more I could get the glory of a true 6-core CPU.

That said, it's not like it's any faster in normal use as far as I can tell.

BIG HEADLINE
Jun 13, 2006

"Stand back, Ottawan ruffian, or face my lumens!"

Josh Lyman posted:

Yeah as far as my desktop goes, I don't see any reason to upgrade my 3570K until it dies.

I consider the 2500K one of the most worthwhile purchases I've ever made. I do miss having PCIe M.2 capability, but Skylake-E might be where I finally 'trade up.'

il serpente cosmico
May 15, 2003

Best five bucks I've ever spent.

BIG HEADLINE posted:

I consider the 2500K one of the most worthwhile purchases I've ever made. I do miss having PCIe M.2 capability, but Skylake-E might be where I finally 'trade up.'

Yeah, my 2500K is still going strong at 4.5GHz after nearly five years. It's a little nuts how quickly CPU advances slowed down after Conroe.

mobby_6kl
Aug 9, 2009

by Fluffdaddy
I'm really kicking myself for not getting a 2500K back in the day. It would've been a significant improvement over my C2Q, but at the time even this CPU was more than enough for anything, so I thought "I'll upgrade the next gen, after another significant IPC improvement". And, well, here I am, still running a 9-year-old CPU :stare:

Generic Monk
Oct 31, 2011

Josh Lyman posted:

I need Kaby Lake to get here already so that I can get a new laptop.

why would you be waiting for something that was literally designed as a stopgap so they technically have something to release in the year after skylake. it's not exactly going to be a revelation

mobby_6kl posted:

I'm really kicking myself for not getting a 2500K back in the day. It would've been a significant improvement over my C2Q, but at the time even this CPU was more than enough for anything, so I thought "I'll upgrade the next gen, after another significant IPC improvement". And, well, here I am, still running a 9-year-old CPU :stare:

tbf you're still leaving a pretty large amount of performance on the table by sticking with the c2q, plus a load of the newer features added to the chipset like USB3

Generic Monk fucked around with this message at 12:19 on Mar 26, 2016

Instant Sunrise
Apr 12, 2007


The manger babies don't have feelings. You said it yourself.
Yeah I've got an i7 2600k @ 4.2 GHz and I'm really glad I got in on that when I did. Especially because the Micro Center by my work had them really cheap.

No Gravitas
Jun 12, 2013

by FactsAreUseless
People are still buying Sandy Bridge Xeons because they are so cheap. I just got 16 cores' worth (http://www.natex.us/product-p/intel-e5-2670.htm) for $130. Total, not each. The power supply to feed them cost more than the cores. Second-hand CPUs, sure, but in addition to my other machine I suddenly have a cluster at home, and my research will run at least 3x as fast as before. :science:

Mr Shiny Pants
Nov 12, 2012

No Gravitas posted:

People are still buying Sandy Bridge Xeons because they are so cheap. I just got 16 cores' worth (http://www.natex.us/product-p/intel-e5-2670.htm) for $130. Total, not each. The power supply to feed them cost more than the cores. Second-hand CPUs, sure, but in addition to my other machine I suddenly have a cluster at home, and my research will run at least 3x as fast as before. :science:

The only thing holding me back from buying these is the power usage. We have a couple of redundant 12-core machines, but running them 24/7 has me wondering about the power bill.

Much cheaper to run a newer E3-1245V3.

Mr Shiny Pants fucked around with this message at 22:03 on Mar 26, 2016

Gucci Loafers
May 20, 2006

Ask yourself, do you really want to talk to a pair of really nice gaudy shoes?


Generic Monk posted:

why would you be waiting for something that was literally designed as a stopgap so they technically have something to release in the year after skylake. it's not exactly going to be a revelation.

With USB 3.1 integrated into Kaby Lake's chipset, I think we'll see the end of proprietary docking stations.

EdEddnEddy
Apr 5, 2012



KingEup posted:

I don't see how bottlenecking is a problem. We have adaptive sync these days, and as long as your minimum FPS is not dropping below, say, 40 fps, you aren't going to notice.

Looking forward to the new NUC. Here's hoping it will run DOTA2 on high settings at 1080p.

Edit: some Razer Core benchmarks here:

While the Core offers performance way above what you would be seeing with an "M" graphics chip currently, the laptop as a whole isn't going to feed the card everything it can potentially take, whether from CPU limits or, eventually, the 4X link limit.

Overall it's great for gaming, but it's not enough for VR.

Guru3D's R9 Nano Review



And I agree that CPU performance jumps really haven't happened since the run from the C2Q days to the first few i7 generations. Hell, my 3930K is able to hold its own against a 5930K with only a modest OC, and with a larger overclock it can still compete against OC'd ones.

The biggest gains have been on the chipset side with native USB 3.0/3.1, DDR4, M.2, etc.

Gucci Loafers
May 20, 2006

Ask yourself, do you really want to talk to a pair of really nice gaudy shoes?


Mobile performance has grown enormously along with integrated graphics.

Methylethylaldehyde
Oct 23, 2004

BAKA BAKA

Tab8715 posted:

Mobile performance has grown enormously along with integrated graphics.

Pretty much. Laptops went from maybe 4 hours in ultra-dim mode reading static text in 2005, when the best game you could play was DOOM, to 'I can play Battlefield on an Iris laptop' and still get 6-10 hours of reading time in 2015. If they moved all the video poo poo to L1/L2/L3 cache, they could probably do some pretty cool things with the desktop chips, but would 95% of the people buying them give a poo poo? Nope, because a C2Q and a brand new Skylake both open office apps and play Internet Hearts exactly the same.

Combat Pretzel
Jun 23, 2004

No, seriously... what kurds?!
What's this about?

http://www.myce.com/news/skylake-cpus-have-inverse-hyper-threading-to-boost-single-thread-performance-77011/

Heise magazine in Germany claims to have noticed considerable single-threaded performance boosts with Skylake in the SPECint benchmark, which they speculate could only come from the processor pooling resources from multiple cores onto a single thread; they're calling it inverse hyperthreading. Is there something like that in Skylake, or are they just being idiots with their methodology?

suck my woke dick
Oct 10, 2012

:siren:I CANNOT EJACULATE WITHOUT SEEING NATIVE AMERICANS BRUTALISED!:siren:

Put this cum-loving slave on ignore immediately!

No Gravitas posted:

People are still buying Sandy Bridge Xeons because they are so cheap. I just got 16 cores' worth (http://www.natex.us/product-p/intel-e5-2670.htm) for $130. Total, not each. The power supply to feed them cost more than the cores. Second-hand CPUs, sure, but in addition to my other machine I suddenly have a cluster at home, and my research will run at least 3x as fast as before. :science:

Holy poo poo, that thing has the same multithreaded performance as a 4790K. Since I could typically run any CPU-intensive tasks (video converting, video analysis, other science stuff) in parallel on multiple PCs, I might actually go build some cheap-rear end number-crunching PCs from these in the next few years.

Subjunctive
Sep 12, 2006

✨sparkle and shine✨

Combat Pretzel posted:

pooling resources from multiple cores onto a single thread

Generally applied, that sounds like quite the breakthrough.

necrobobsledder
Mar 21, 2005
Lay down your soul to the gods rock 'n roll
Nap Ghost
I haven't seen any references to single-thread optimizations that resemble that kind of performance characteristic. From a methodology perspective, though, if you flat out disable cores in your EFI/BIOS down to permitting only a single thread on a single physical core, disable hyperthreading on both the Haswell and Skylake CPUs, and measure the CPU clock speed auto-scaling, you could see whether that speed-up holds or is instead correlated with the clock speed. If that can't happen, it might be worth trying to schedule CPU-intensive tasks onto all the other physical cores (pinning them) and praying that they aren't cross-scheduled out under the hood by the CPU anyway, the way registers get remapped in hardware via Tomasulo and its variants.

I can imagine games would benefit substantially from this kind of hardware optimization, at least.
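
For the pinning half of that methodology, a minimal Linux-only sketch using sched_setaffinity(2); the choice of CPU 0 is arbitrary, and `taskset -c 0 ./bench` does the same thing from the shell:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void) {
    /* Restrict this process to logical CPU 0 so the OS can't migrate
     * the benchmark to another core mid-run. */
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);
    if (sched_setaffinity(0, sizeof set, &set) != 0) {
        perror("sched_setaffinity");
        return 1;
    }
    /* ...run and time the single-threaded workload here... */
    puts("pinned to CPU 0");
    return 0;
}
```

Note this only pins the thread from the OS side; it can't control anything the CPU itself does under the hood, which is the caveat above.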

suck my woke dick
Oct 10, 2012

:siren:I CANNOT EJACULATE WITHOUT SEEING NATIVE AMERICANS BRUTALISED!:siren:

Put this cum-loving slave on ignore immediately!

Subjunctive posted:

Generally applied, that sounds like quite the breakthrough.

It would mean Intel is effectively building 12GHz processors. Well, minus whatever inefficiencies apply to pooling cores.

Not sure if that is actually a real thing yet.

Gorau
Apr 28, 2008

blowfish posted:

It would mean Intel is effectively building 12GHz processors. Well, minus whatever inefficiencies apply to pooling cores.

Not sure if that is actually a real thing yet.

I can pray that it is, though, right?

HMS Boromir
Jul 16, 2011

by Lowtax
What, do people regularly just find new features on Intel chips? Is there at any given time just a list of Known Features, with the rest left to be found by intrepid treasure hunters? Surely if this were anything, Intel would've advertised it.

Eletriarnation
Apr 6, 2005

People don't appreciate the substance of things...
objects in space.


Oven Wrangler

Gorau posted:

I can pray that it is, though, right?

If Intel had a way to make n threads work faster by adding more than n cores, for any given value of n, it would be the biggest news in CPUs in at least 10 years. Not being able to increase single-threaded performance further was why Intel abandoned the P4 NetBurst architecture and started increasing core count.

The article implies that Skylake may simply be pooling some sort of shareable resource, like cache, more effectively for single-threaded applications, and on a really cache-sensitive test I can see that making a big difference, but I am skeptical that they've found the One Weird Trick of making one thread run like lightning on four cores.

Furthermore, this article is from August, so anything incredible should have popped up in a lot more sources by now.

Eletriarnation fucked around with this message at 17:36 on Mar 27, 2016
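
For what it's worth, the kind of cache-sensitive behavior Eletriarnation describes shows up clearly in a pointer chase, where every load depends on the previous one: time per load jumps each time the working set falls out of a cache level. A rough sketch (sizes are illustrative, and nothing here is specific to Skylake):

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define STEPS (1L << 25)  /* dependent loads per measurement */

int main(void) {
    for (size_t n = 1 << 12; n <= (size_t)1 << 24; n <<= 2) {
        size_t *next = malloc(n * sizeof *next);
        if (!next) return 1;
        for (size_t i = 0; i < n; i++) next[i] = i;
        /* Sattolo's shuffle: produces a single n-element cycle, so the
         * chase can't get stuck in a small, cache-resident loop. */
        for (size_t i = n - 1; i > 0; i--) {
            size_t j = (size_t)rand() % i;
            size_t t = next[i]; next[i] = next[j]; next[j] = t;
        }
        struct timespec t0, t1;
        size_t p = 0;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (long s = 0; s < STEPS; s++) p = next[p];  /* serial chain */
        clock_gettime(CLOCK_MONOTONIC, &t1);
        double ns = (t1.tv_sec - t0.tv_sec) * 1e9
                  + (t1.tv_nsec - t0.tv_nsec);
        printf("%8zu KiB: %5.2f ns/load (sink=%zu)\n",
               n * sizeof *next / 1024, ns / STEPS, p);
        free(next);
    }
    return 0;
}
```

If Skylake really were quietly lending a parked core's cache to a busy one, a test like this is where you'd expect to see it.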

Combat Pretzel
Jun 23, 2004

No, seriously... what kurds?!
Figured as much. I haven't cared about Heise in a long time, but it seems like they went down the shitter at some point, too.

The cache sharing thing at least sounds somewhat plausible. If one core has been parked, let the others use the parked core's L2 in full.

Eletriarnation posted:

Furthermore, this article is from August, so anything incredible should have popped up in a lot more sources by now.
Ah, my mistake. Didn't notice; I only saw it referenced elsewhere today.

No Gravitas
Jun 12, 2013

by FactsAreUseless
Agner Fog would have noticed it.

BobHoward
Feb 13, 2012

The only thing white people deserve is a bullet to their empty skull

Combat Pretzel posted:

Heise magazine in Germany claims to have noticed considerable single-threaded performance boosts with Skylake in the SPECint benchmark, which they speculate could only come from the processor pooling resources from multiple cores onto a single thread; they're calling it inverse hyperthreading. Is there something like that in Skylake, or are they just being idiots with their methodology?

The latter. "Inverse hyperthreading" is a bullshit fantasy first pushed by AMD fanboys as the tech which would supposedly save AMD from irrelevance.

Do not expect it to ever be an actual thing; it's close to impossible to make it work without actually harming performance. The resources you'd be borrowing from other cores are execution units. In order to usefully borrow an execution unit from Core B for Core A (making A a wider superscalar processor and B a narrower one), B's execution unit would need to be physically teleported into Core A so it could sit adjacent to A's register file, L0/L1 caches, register bypass networks, and so forth. Since it's far away, you'd need extra cycles to ship data there and back, which is very likely to kill any benefit.

But wait there's more downside! You'd also likely end up paying power and/or cycle time (=clock freq) penalties due to all the extra logic required to multiplex execution units between cores.

Another factor to think about is that integer execution units are small potatoes these days. If you wanted more of them available to a single core, you'd just add them locally. They're a tiny fraction of the die area used by all the amazingly complicated circuitry which tracks in-flight instruction state.

The final nail in the coffin is that even if any of this could actually work, it would only benefit a very narrow set of programs, in a marginal way. For a long time Intel effectively had a 4-wide processor (6 ports to dispatch instructions to, but the decoder could dispatch 4 uops per clock IIRC). Despite being 4-wide, the average number of instructions executed per clock (IPC) almost never cracked 2.0 for any real-world code, and more commonly hovered somewhere in the range 0.8 to 1.5. There are too many serializing dependencies in typical integer code to use lots of parallel execution units in an effective way.

For nontypical integer code where there are lots of non-dependent operations, well, usually that kind of code is a great candidate for using SSE or AVX, and adding more threads if you want to scale out beyond 1 core.
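
The serializing-dependency point is easy to demonstrate. In the sketch below (floating-point rather than integer, because an FP multiply-add gives a clean multi-cycle latency chain; compile with plain -O2 and no -ffast-math so the compiler can't reassociate), the same number of operations runs severalfold faster when split across four independent accumulators, because the out-of-order core can overlap the independent chains:

```c
#include <stdio.h>
#include <time.h>

#define N 200000000L

static double secs(struct timespec a, struct timespec b) {
    return (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9;
}

int main(void) {
    struct timespec t0, t1;

    /* One serial dependency chain: every iteration waits on the last. */
    double acc = 1.0;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < N; i++)
        acc = acc * 1.0000001 + 1e-9;
    clock_gettime(CLOCK_MONOTONIC, &t1);
    printf("1 chain:  %.2f s (acc=%g)\n", secs(t0, t1), acc);

    /* Same operation count split into four independent chains. */
    double a0 = 1.0, a1 = 1.0, a2 = 1.0, a3 = 1.0;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < N; i += 4) {
        a0 = a0 * 1.0000001 + 1e-9;
        a1 = a1 * 1.0000001 + 1e-9;
        a2 = a2 * 1.0000001 + 1e-9;
        a3 = a3 * 1.0000001 + 1e-9;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    printf("4 chains: %.2f s (sum=%g)\n", secs(t0, t1), a0 + a1 + a2 + a3);
    return 0;
}
```

The hardware doesn't get any wider between the two loops; the second simply gives the existing execution units independent work to overlap, which is exactly what typical integer code fails to do.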

The only idea conceptually similar to "inverse hyperthreading" which I've ever seen that was in any way practical to implement was one of the variants on scouting. In the traditional version of scouting, when a processor core is stalled waiting on memory fetch, it checkpoints its register file and begins fake-executing future code past the stall. In this mode it does not try to be accurate, just to run over future memory reads and get them started ahead of time. The variant of this idea similar to inverse-HT is using an entire processor core as a scout for another core.

However, "practical to implement" doesn't actually mean "win". Scouting hasn't seen much success, probably because it isn't very energy efficient. The one attempt (that I know of) to design a commercial, competitive CPU using some form of scouting, Sun's Rock, got cancelled before it went to market.

PerrineClostermann
Dec 15, 2012

by FactsAreUseless
My German friend says Heise retracted their report within 24 hours, too. Also linked me this.

http://forums.anandtech.com/showpost.php?s=7ad0b9ad3aa7d72041d020d9708e7f40&p=37641760&postcount=21

The NPC
Nov 21, 2010


Pretty in-depth article about the Xeon Phi. Most of the info is from the Open Compute Summit a few weeks ago.

http://www.nextplatform.com/2015/03/25/more-knights-landing-xeon-phi-secrets-unveiled/

Anime Schoolgirl
Nov 28, 2002

*a year ago

champagne posting
Apr 5, 2006

YOU ARE A BRAIN
IN A BUNKER

A while back a generous comp sci goon effortposted about Xeon Phi and how it was hot garbage. Does said goon have a comment?

MaxxBot
Oct 6, 2003

you could have clapped

you should have clapped!!

Boiled Water posted:

A while back a generous comp sci goon effortposted about Xeon Phi and how it was hot garbage. Does said goon have a comment?

I don't know what their complaint was, but the old Knight's Corner Phi and the new Knight's Landing Phi are a lot different. The old Phi's cores didn't have the greatest performance and weren't binary-compatible with mainstream x86, and as a result the Phi was only available as an add-in card and could not be used as a main CPU. The new Phi has much faster cores that run standard x86-64 code, and it is available as a main CPU, which all together should make it a lot faster and easier to program than the old Phi was.

Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE

MaxxBot posted:

I don't know what their complaint was, but the old Knight's Corner Phi and the new Knight's Landing Phi are a lot different. The old Phi's cores didn't have the greatest performance and weren't binary-compatible with mainstream x86, and as a result the Phi was only available as an add-in card and could not be used as a main CPU. The new Phi has much faster cores that run standard x86-64 code, and it is available as a main CPU, which all together should make it a lot faster and easier to program than the old Phi was.

Basically this - the problem was that Knight's Corner had lovely single-threaded performance, had a limited amount of fast memory accessible, and you still needed to custom-write software to use an external coprocessor card. They've supposedly tripled the single-threaded performance; the fact that it works as a general-purpose processor means it can have a bunch of memory attached; and now it basically just looks like a super-parallel CPU to software, so it should be possible to run vanilla x86 code on it.

JawnV6
Jul 4, 2004

So hot ...

BobHoward posted:

The latter. "Inverse hyperthreading" is a bullshit fantasy first pushed by AMD fanboys as the tech which would supposedly save AMD from irrelevance.
Yeah, it's in that same uncanny valley as core hopping. Hard to do, causes perf regressions, limited upside even if it's done perfectly.

BobHoward posted:

The final nail in the coffin is that even if any of this could actually work it would only benefit a very narrow set of programs, in a marginal way. For a long time Intel effectively had a 4-wide processor (6 ports to dispatch instructions to, but the decoder could dispatch 4 uops per clock IIRC). Despite being 4-wide the average number of instructions executed per clock (IPC) almost never cracked 2.0 for any real world code, and more commonly hovered somewhere in the range 0.8 to 1.5. There's too many serializing dependencies in typical integer code to use lots of parallel execution units in an effective way.
Back on Penryn the decoder was 4-1-1-1, meaning 4 or 5 macroinstructions could emit up to 7 uops. It's very dependent on instruction mix: you need a semi-complex one in the first slot, two simple ones in the middle, then a fused test/jmp pair in the last. Then you've got the SB/trace cache further confusing things. I can't find anything on the structure of more recent architectures, mostly because graphics decoders are polluting the namespace. There was talk of 4-2-2 and other blasphemies.

BobHoward posted:

However, "practical to implement" doesn't actually mean "win". Scouting hasn't seen much success, probably because it isn't very energy efficient. The one attempt (that I know of) to design a commercial, competitive CPU using some form of scouting, Sun's Rock, got cancelled before it went to market.
It's weird that Rock's intense speculation is unusable for power efficiency reasons while npm shuffles hundreds of gigabytes daily. Like, what system would we build if that fat pipe to a thick server were available to the architecture? A few bytes' worth of branch hints delivered alongside a code snippet could do wonders. Feedback piping up to a JIT engine in the cloud, able to track thousands of runtime instances.

Boiled Water posted:

A while back a generous comp sci goon effortposted about Xeon Phi and how it was hot garbage. Does said goon have a comment?
You're thinking of No Gravitas:

No Gravitas posted:

Now to start having fun with it...

No Gravitas
Jun 12, 2013

by FactsAreUseless

Boiled Water posted:

A while back a generous comp sci goon effortposted about Xeon Phi and how it was hot garbage. Does said goon have a comment?

The Xeon Phi I have is still hot garbage. You know what AMD is doing with lots of weak cores? Think ten times the cores, each core one twentieth of the performance. Really cool gadget to own, and at the price I got it for, worth it for the cool factor. Otherwise, worthless. A nice blue bookstop for me right now.

For this to work semi-decently your workload must love both very wide SIMD and cores, lots of cores. You need a special motherboard for it to work, which is super expensive too. Oh, and to use those vector units you need the Intel compiler. Oh, and did I forget to mention it won't run regular software either? It needs at least a recompile; no copying your SIMD binaries over.

The next version fixes some of those issues, but... I'm not sold on it. If given a new model for free, maybe I'd try it. It is supposed to run regular software without needing a recompile.

Given the cash, just buy a few of the Natex deal I linked above; that will at least run everything you throw at it.
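
To make the "very wide SIMD and cores, lots of cores" point concrete, this is roughly the loop shape a Phi-style part wants: independent iterations that can be spread across every core and vectorized within each one. A toy saxpy, assuming a compiler with OpenMP 4.0 support for `parallel for simd` (e.g. `gcc -O2 -fopenmp`):

```c
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

int main(void) {
    const size_t n = (size_t)1 << 24;  /* 16M floats, 64 MB per array */
    float *x = malloc(n * sizeof *x);
    float *y = malloc(n * sizeof *y);
    if (!x || !y) return 1;
    for (size_t i = 0; i < n; i++) { x[i] = 1.0f; y[i] = 2.0f; }

    /* Independent iterations: split across all cores, and let each
     * core's chunk map onto the wide SIMD units. */
    #pragma omp parallel for simd
    for (size_t i = 0; i < n; i++)
        y[i] = 2.0f * x[i] + y[i];

    printf("%d threads, y[0] = %.1f\n", omp_get_max_threads(), y[0]);
    free(x);
    free(y);
    return 0;
}
```

Anything that doesn't decompose like this (branchy, serial, pointer-chasing code) is exactly what many weak cores are worst at.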
