in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

https://twitter.com/jioriku/status/1344871845762314241?s=21

lol


Lambert
Apr 15, 2018

by Fluffdaddy
Fallen Rib
RIP to my old HP Pocket PC with an Intel Xscale ARM chip. Guess selling that division wasn't that smart in hindsight.

priznat
Jul 7, 2009

Let's get drunk and kiss each other all night.
Are foldable screens going to be the new 3D screen/curved TV fad that consumers never really have any interest in despite the industry trying super hard to invent a new product class? Probably not really OT for the Intel thread, but that was my 2nd reaction to that tweet after “lol” as well.

DrDork
Dec 29, 2003
commanding officer of the Army of Dorkness

priznat posted:

Are foldable screens going to be the new 3D screen/curved TV fad that consumers never really have any interest in despite the industry trying super hard to invent a new product class? Probably not really OT for the Intel thread, but that was my 2nd reaction to that tweet after “lol” as well.

Not anytime soon, but maybe in 5 years when they don't add $1000 to the price of a phone and also kinda suck.

If Intel wants to do ARM in mobile platforms, they're gonna have to work pretty hard to catch up to Qualcomm. I guess they could try to buy Exynos from Samsung--that'd sufficiently meet Intel's metric for dumping money into a bonfire, I'd think.

movax
Aug 30, 2008


I'm too lazy to look up who's actually on Intel's board, but clearly they don't know any better than "gimme some of those ARMs the kids are talking about". Sad if the rumor is true.

gradenko_2000
Oct 5, 2010

HELL SERPENT
Lipstick Apathy
have to admit that FIRESTORM is a hell of a lot more compelling name than i5-1135QWERTY

BIG HEADLINE
Jun 13, 2006

"Stand back, Ottawan ruffian, or face my lumens!"
"Focus groups tell us people like all the numbers and letters." :downs:

I will admit that AMD does a really bad job of letting people know how many cores their mobile chips have at a glance from the model number. I mean, seriously - just name the 8/16 parts the Ryzen 9 4908, or Ryzen 7 4708. Put a "V+" on the parts with virtualized cores.

Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE
Apple and Ampere have really kicked off an arms race in high-performance ARM lol. Intel is apparently chasing it, NVIDIA is trying to loving buy ARM to chase it. Everyone wants to do ARM now.

I'm mindboggled that ARM is the surprise winner of 2020, like who knew that 2020 was the year of poo poo getting real with ARM. I know everyone's had their own in-house chips for a while, customized to the widget frobulator server's exact load characteristics, for AWS and other generic cloud usage, etc., but ARM on the laptop and desktop and server competing with x86 on per-thread performance is completely nuts. Maybe the start of an era not entirely dominated by x86 lol.

(which is why I think NVIDIA won't torch the ARM ecosystem - they would be insane to throw away this momentum and try to do it all themselves.)

Is there some ruling that limited the enforcement of Intel's x86 license stuff? I seem to remember NVIDIA getting read the riot act when they wanted to emulate x86 with Denver, so what has changed? Now everybody seems to be going for the "rosetta" idea and I haven't heard of Intel suing everyone into the ground.

M1 is real good for what it is though, and Apple can totally do some sweet desktop-class processors if they want (if they have the fab capacity).

I suppose that's why you see Intel going to big.LITTLE with Alder Lake? ARM is usually big.LITTLE, so we're about to see a whole lot of work on making that work well (at an OS level).

Packages with all performance cores (which I'm 100% sure AMD will have, as they're a byproduct of Epyc, but who knows with Intel) will probably be more desirable for enthusiasts of course, but maybe Intel is making a play for the mass market. "Here's your office box, it now pulls 3W idle" type stuff. But apart from office boxes... Alder Lake with 8+8 seems like a meh comparison vs 12 full-fat Zen 4 cores (assuming TSMC hell ever ends). And does Intel even have a play for HEDT coming any time soon? Intel seems to be curving away from the enthusiast market, unfortunately.

AMD could probably feasibly implement these as separate "CCXs" with their own separate cache/etc. Mobile dies would probably have to be monolithic for efficiency, but it probably would be no sweat for them to add efficiency cores architecturally. As long as there's OS level support for handling it well of course.

Apparently Apple designed their core so that it has the same memory/cache coherency semantics as x86 (the stronger x86 ordering) rather than the usual looser ARM rules that require you to spam memory barriers everywhere when emulating x86, and that's supposedly a big plus for Rosetta performance. I wonder if we'll see other "laptop/desktop class" ARM cores following that lead.
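To picture the difference (a toy sketch, not how Rosetta actually works - the op names and the one-barrier-per-access rule here are invented for illustration): emulating x86's strong ordering on a weakly ordered core means conservatively fencing memory accesses, while a core that can run in a TSO-like mode needs none of that.

```python
# Toy sketch: why weak memory ordering makes x86 emulation expensive.
# The "opcodes" and the barrier rule are made up for illustration.

def translate(ops, target_is_tso):
    """Translate x86-ish guest memory ops for the target core."""
    out = []
    for op in ops:
        out.append(op)
        # On a weakly ordered core, the translator must conservatively
        # emit a barrier after memory ops to preserve x86's ordering;
        # a TSO-mode core gives that ordering for free.
        if not target_is_tso and op.startswith(("load", "store")):
            out.append("barrier")
    return out

guest = ["load x", "add", "store y", "store z"]

weak = translate(guest, target_is_tso=False)
tso = translate(guest, target_is_tso=True)

print(weak.count("barrier"))  # 3 barriers on a weakly ordered core
print(tso.count("barrier"))   # 0 barriers when the core speaks TSO
```

The real cost model is more subtle (barriers aren't needed after literally every access), but the asymmetry is the point: the weak-target translation bloats, the TSO-target one doesn't.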

Paul MaudDib fucked around with this message at 07:12 on Jan 2, 2021

WhyteRyce
Dec 30, 2001

I hope these regular board meetings are bringing in actual technical people for input and not just keeping it amongst the board and Bob

Bob himself should stay well away from these conversations, because all he knows how to do is refine known-working existing processes and business plans

SCheeseman
Apr 23, 2003

priznat posted:

Are foldable screens going to be the new 3D screen/curved TV fad that consumers never really have any interest in despite the industry trying super hard to invent a new product class? Probably not really OT for the Intel thread, but that was my 2nd reaction to that tweet after “lol” as well.

The concept is a good idea: small phone for talking and casual/simple media viewing (scrolling through news/twitter/facebook, controlling audio playback and short form video) that folds out into a tablet for large format media (books, comics, photography, film and TV).

All that's available now is either a phablet that folds out into a small tablet with a crease down the middle, or a phone that folds in half. What's needed is foldable glass, but that seems like future tech to me.

BlankSystemDaemon
Mar 13, 2009




Paul MaudDib posted:

Apple and Ampere have really kicked off an arms race in high-performance ARM lol. Intel is apparently chasing it, NVIDIA is trying to loving buy ARM to chase it. Everyone wants to do ARM now.
Minor detail, but the Ampere core is Neoverse, which is just straight-up ARM IP - so even if NVIDIA doesn't buy ARM, they can still license it, just like anyone else can.

Lambert
Apr 15, 2018

by Fluffdaddy
Fallen Rib

Paul MaudDib posted:

Is there some ruling that limited the enforcement of Intel's x86 license stuff? I seem to remember NVIDIA getting read the riot act when they wanted to emulate x86 with Denver, so what has changed? Now everybody seems to be going for the "rosetta" idea and I haven't heard of Intel suing everyone into the ground.

The patents for full x86-64 (including SSE2) ran out this year, which makes something like Apple's Rosetta possible without worrying about patents. This is also why they don't support AVX, which is still patented.

Maybe that's part of the reason Apple waited this long to introduce ARM Macs, considering this move has been rumored for many years.

repiv
Aug 13, 2009

Lambert posted:

The patents for full x86-64 (including SSE2) ran out this year, which makes something like Apple's Rosetta possible without worrying about patents. This is also why they don't support AVX, which is still patented.

Maybe that's part of the reason Apple waited this long to introduce ARM Macs, considering this move has been rumored for many years.

Rosetta supports up to SSE4, which, being 6 years newer, is presumably still patented

I think not supporting AVX has more to do with M1 only having 128 bit SIMD in hardware - they could emulate it at half rate but there wouldn't be much point
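The half-rate idea is just cracking each wide op into two narrow ones. A rough sketch, with plain Python lists standing in for SIMD registers (the lane counts match a 256-bit float op on 128-bit hardware; everything else is illustrative):

```python
# Sketch: running a 256-bit (8-lane float) add on hardware that only has
# 128-bit (4-lane) SIMD by splitting it into two half-width operations.
# Plain Python lists stand in for SIMD registers.

def add4(a, b):
    # one native 128-bit op: 4 lanes at once
    return [x + y for x, y in zip(a, b)]

def add8_on_128bit_hw(a, b):
    # one 256-bit instruction becomes two 128-bit uops -> half throughput
    return add4(a[:4], b[:4]) + add4(a[4:], b[4:])

a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
print(add8_on_128bit_hw(a, b))
# [11.0, 22.0, 33.0, 44.0, 55.0, 66.0, 77.0, 88.0]
```

Same result, twice the instructions - which is roughly what "emulate it at half rate" would mean, and why there wouldn't be much point.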

gradenko_2000
Oct 5, 2010

HELL SERPENT
Lipstick Apathy
is there a use-case for AVX-512 in gaming, even on a theoretical level?

or does game programming not work like that

Twerk from Home
Jan 17, 2009

This avatar brought to you by the 'save our dead gay forums' foundation.

gradenko_2000 posted:

is there a use-case for AVX-512 in gaming, even on a theoretical level?

or does game programming not work like that

Yes, definitely. Potential uses for AVX-512 are all over the place: anywhere you're performing the same operation on a large amount of data at once. That's super common in image processing and cryptography, and there's certainly room for usage in gaming.

One of the biggest problems AVX-512 has is that in most applications where it's useful, a GPU tends to perform even better. There's higher latency when doing it on a GPU because you have to copy data into VRAM, send commands to the graphics driver for execution, and then copy the results back. Apple's (and the gaming consoles') unified memory means they can do GPU computation at lower latency than typical desktop computers, so they might be able to get away with never having AVX-512 or similar without giving up much performance.

I'm not an expert at this, but I'm of the opinion that, broadly, AVX-512 is a mistake. In the place where it should be most relevant - servers without GPUs - it's being effectively replaced by GPUs anyway, or by dedicated-purpose hardware like Google's Tensor Processing Units. AMD and Nvidia both have solutions now for direct memory access to VRAM, so you can get data in and out of VRAM without copying through main memory: https://developer.nvidia.com/blog/gpudirect-storage/. Intel has also been really precious about which CPUs get both AVX-512 units. For example, the $1k/CPU Xeon Silver 4216 ships with half the AVX-512 performance of its bigger, much more expensive siblings.
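The "same operation over a big block of data" pattern looks like this - a pure-Python sketch, where the only real numbers are the lane counts (sixteen 32-bit elements fit in one 512-bit register, four in a 128-bit one):

```python
# SIMD's sweet spot: one operation applied uniformly to a block of data,
# e.g. scaling pixel brightness. Pure Python stands in for vector code.

def scale_pixels(pixels, gain):
    # the per-element work a vector unit would do many lanes at a time
    return [min(255, int(p * gain)) for p in pixels]

def vector_op_count(n_elems, lanes):
    # instructions a SIMD unit with `lanes` lanes needs (ceiling division)
    return -(-n_elems // lanes)

pixels = list(range(64))
print(scale_pixels(pixels, 2)[:4])  # [0, 2, 4, 6]
print(vector_op_count(64, 16))      # 512-bit regs, 32-bit lanes: 4 ops
print(vector_op_count(64, 4))       # 128-bit regs: 16 ops
```

The same loop on a GPU would win on raw throughput, but only after paying the copy-to-VRAM round trip described above - that latency trade is the whole argument.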

ColTim
Oct 29, 2011
While AVX-512 doesn't exist for client-level computing (besides the ultralight notebooks), there are uses for SIMD in gaming. The tricky part is that it sorta needs to be built in from the beginning, as it boils down to how data is arranged/stored in memory.

repiv
Aug 13, 2009

Twerk from Home posted:

I'm not an expert at this, but I'm of the opinion that, broadly, AVX-512 is a mistake. In the place where it should be most relevant - servers without GPUs - it's being effectively replaced by GPUs anyway, or by dedicated-purpose hardware like Google's Tensor Processing Units.

AVX-512 was originally developed to be a GPU ISA for Larrabee; you have to wonder if Intel would have gone down that road if Larrabee never existed

Maybe Intel would still be on iterations of 256-bit AVX but with MoRe CoReS instead, like AMD is currently doing

Lambert
Apr 15, 2018

by Fluffdaddy
Fallen Rib

repiv posted:

Rosetta supports up to SSE4, which, being 6 years newer, is presumably still patented

I think not supporting AVX has more to do with M1 only having 128 bit SIMD in hardware - they could emulate it at half rate but there wouldn't be much point

No clue whether patents still apply to SSE4, but where did you see information about SSE4 support? If they have it and it is still covered under patents, they'd have to have a license. But I can't find anything that states the exact level of SSE support in Rosetta.

repiv
Aug 13, 2009

Lambert posted:

No clue whether patents still apply to SSE4, but where did you see information about SSE4 support? If they have it and it is still covered under patents, they'd have to have a license. But I can't find anything that states the exact level of SSE support in Rosetta.

https://www.anandtech.com/show/16252/mac-mini-apple-m1-tested/6

quote:

As long as a given application has a x86-64 code-path with at most SSE4.2 instructions, Rosetta2 and the new macOS Big Sur will take care of everything in the background

Lambert
Apr 15, 2018

by Fluffdaddy
Fallen Rib
Seems to be the only source of that, but thanks! Apparently, Windows 10 on ARM does have support up to SSE4.2 as well. I wonder whether there's some list of SSE patents around.

repiv
Aug 13, 2009

If someone has an M1 Mac to hand it would be interesting to see which features MacCPUID thinks it has

https://software.intel.com/content/www/us/en/develop/download/download-maccpuid.html

BlankSystemDaemon
Mar 13, 2009




repiv posted:

Larrabee never existed
Larrabee is a weird thing, because it sort of did exist, and completely legitimately exists in the form of Knights Landing.
Also, it ran FreeBSD.

Tom's back at Intel again, too - makes me wonder if he's got anything to do with Xe or is working on something else.

SwissArmyDruid
Feb 14, 2014

by sebmojo

Lambert posted:

The patents for full x86-64 (including SSE2) ran out this year,

https://i.imgur.com/rAFP13z.gifv

trilobite terror
Oct 20, 2007
BUT MY LIVELIHOOD DEPENDS ON THE FORUMS!

repiv posted:

If someone has an M1 Mac to hand it would be interesting to see which features MacCPUID thinks it has

https://software.intel.com/content/www/us/en/develop/download/download-maccpuid.html

Ask in the Mac hardware thread

serebralassazin
Feb 20, 2004
I wish I had something clever to say.
I have 16gb M1 mini.

repiv posted:

If someone has an M1 Mac to hand it would be interesting to see which features MacCPUID thinks it has

https://software.intel.com/content/www/us/en/develop/download/download-maccpuid.html

Virtual Apple @ 2.50ghz
Architecture - Westmere - 1st Generation Intel Core
Family - 6 (06h)
Model - 44 (2ch)
Stepping - (00h)
TSC frequency - ,999,999 (hz)
GPU Model(s) - Unknown

AES
CLFSH
CMOV
CX16
CX8
DE
DTES64
EM64T
FPU
FXSR
LAHFSAHF
MMX
MONITOR
PAT
PCLMULQDQ
POPCNT
PSE36
RDTSCP
SEP
SS
SSE
SSE2
SSE3
SSE4_1
SSE4_2
SSSE3
SYSCALLRET
TSC
XD

Cygni
Nov 12, 2005

raring to post

just like the Kaby Lake X parts for X299, Rocket Lake is gonna lead to some strange motherboard and manual instructions

priznat
Jul 7, 2009

Let's get drunk and kiss each other all night.

Cygni posted:

just like the Kaby Lake X parts for X299, Rocket Lake is gonna lead to some strange motherboard and manual instructions



I'm sure the motherboard manuals will make it extremely clear what is up (lol)

Lambert
Apr 15, 2018

by Fluffdaddy
Fallen Rib

Muchas gracias


Seriously. Feels like the Athlon 64 wasn't that long ago.

movax
Aug 30, 2008

priznat posted:

I'm sure the motherboard manuals will make it extremely clear what is up (lol)

I’ve always liked Supermicro because they put the block diagram at the front so you can see exactly where all the lanes and other things go and can piece together the available configuration yourself if you know what the chipset does.

Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE

gradenko_2000 posted:

is there a use-case for AVX-512 in gaming, even on a theoretical level?

or does game programming not work like that

Yes, it’s not just “AVX2 but wider”; there are a whole bunch of new instruction types.

For example, bit-masked instructions let you apply an operation to only certain lanes of the vector. Take any operation you might already be using AVX for in a game: what if you want it to apply to only 4 of those lanes for one step, and then keep processing with the vector still loaded? Plus things like population-count instructions that support that, and finally support for scatter operations, etc.
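The masking idea in miniature (pure Python; a list of bools stands in for an AVX-512 k mask register, and the behavior sketched is merge masking, where unselected lanes keep their old value):

```python
# Sketch of an AVX-512-style masked operation: the add fires only in
# lanes where the mask bit is set; other lanes pass through unchanged
# ("merge masking"). A list of bools stands in for a k mask register.

def masked_add(dst, src, mask):
    return [d + s if m else d for d, s, m in zip(dst, src, mask)]

vec = [10, 20, 30, 40, 50, 60, 70, 80]
inc = [1] * 8
mask = [True, False, True, False, True, False, True, False]

# only lanes 0, 2, 4, 6 are updated; the vector stays live for more steps
print(masked_add(vec, inc, mask))  # [11, 20, 31, 40, 51, 60, 71, 80]
```

Pre-AVX-512 you'd fake this with a compare, a blend, and a scratch register; folding the mask into the instruction itself is the improvement being described.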

Neural inferencing instructions are another one that could potentially see use in something like game AI, where you don’t really have enough workload to merit sending it off to a GPU but you would want a quick low latency output, whether that's running a decision net for each unit in a game or running a deepmind-like AI or something. That's not how it's done now of course, but everyone has been talking about using neural networks for AIs for a while, so it's certainly a theoretical application.

It’s a huge improvement at an instruction set level and actually would be worth implementing even if you only ran it at half-rate (i.e. 512-bit instructions would take 2 cycles like AMD did with AVX2 on Zen1/Zen+).

Paul MaudDib fucked around with this message at 01:30 on Jan 3, 2021

repiv
Aug 13, 2009

Paul MaudDib posted:

It’s a huge improvement at an instruction set level and actually would be worth implementing even if you only ran it at half-rate (i.e. 512-bit instructions would take 2 cycles like AMD did with AVX2 on Zen1/Zen+).

If AVX-512 gets used in games I suspect they'll use the 256-bit instruction variants anyway to avoid dealing with Intel's aggressive downclocking on 512-bit ops

Intel did thankfully have the foresight to include versions of every AVX-512 instruction that operate on 128-bit or 256-bit registers instead of the full width

shrike82
Jun 11, 2005

zero use for AVX-512 in gaming given GPUs even for inferencing

repiv
Aug 13, 2009

SIMD is plenty useful for gaming - anything related to gameplay logic needs to be processed immediately, not punted onto an asynchronous queue that will get back to you 1 or more frames in the future

shrike82
Jun 11, 2005

i'm referring specifically to AVX-512. also, does it still throttle performance on mixed workloads?

Arivia
Mar 17, 2011

Paul MaudDib posted:

Apple and Ampere have really kicked off an arms race in high-performance ARM lol. Intel is apparently chasing it, NVIDIA is trying to loving buy ARM to chase it. Everyone wants to do ARM now.

I'm mindboggled that ARM is the surprise winner of 2020, like who knew that 2020 was the year of poo poo getting real with ARM. I know everyone's had their own in-house chips for a while, customized to the widget frobulator server's exact load characteristics, for AWS and other generic cloud usage, etc., but ARM on the laptop and desktop and server competing with x86 on per-thread performance is completely nuts. Maybe the start of an era not entirely dominated by x86 lol.

2021 is the year of linux on the arm desktop

repiv
Aug 13, 2009

shrike82 posted:

i'm referring specifically to AVX-512. also, does it still throttle performance on mixed workloads?

The full fat 512-bit operations are still dubious for mixed workloads, but you can use 256-bit operations to sidestep that

As discussed above AVX-512 is more than AVX-but-wider, there's enough new stuff in there that it's worth having even if you stick to working on 256 bits at a time

(obviously game devs aren't going to use AVX-512 widely until the consoles get it, so this is mostly hypothetical. At least the new consoles finally have full-rate AVX2, so that can finally be used in anger now.)

repiv fucked around with this message at 03:46 on Jan 3, 2021

Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE

shrike82 posted:

i'm referring specifically to AVX-512. also, does it still throttle performance on mixed workloads?

not on Ice Lake at least.

Ice Lake loses 100 MHz max boost when running 512-bit operations on a single core, in other circumstances there is no change in boost. So it can't quite hit the very highest single-threaded boost - it backs off by 100 MHz there - but on all-core loads there's no offset at all.

https://travisdowns.github.io/blog/2020/08/19/icl-avx512-freq.html

(obviously a big chunk of that is 10nm vs 14nm, but Rocket Lake supposedly only has 2x256b fusable AVX units - it doesn't have the extra 512b unit that Skylake-X/Skylake-SP did - so even in the absence of further improvements it might be somewhat better than Skylake-X. And Skylake-X/Xeon-W itself downclocks less than Skylake-SP, which is where the really heavy downclocking happens and where Cloudflare or whoever it was did their testing.)

Paul MaudDib fucked around with this message at 04:06 on Jan 3, 2021

repiv
Aug 13, 2009

huh i didn't know ice lake improved matters that much

looks promising if they can scale that up to bigger chips

shrike82
Jun 11, 2005

Paul MaudDib posted:

not on Ice Lake at least.

Ice Lake loses 100 MHz max boost when running 512-bit operations on a single core, in other circumstances there is no change in boost.

https://travisdowns.github.io/blog/2020/08/19/icl-avx512-freq.html


quote:

Licence-based downclocking is only one source of downclocking. It is also possible to hit power, thermal or current limits. Some configurations may only be able to run wide SIMD instructions on all cores for a short period of time before exceeding running power limits.

:shrug:


movax
Aug 30, 2008


Uhhhhh, "license-based downclocking", is that what I think it is?

Also whatever Zen2 supports is what AAA / "big" games are going to assume for the next...5 years, I guess?
