PC LOAD LETTER
May 23, 2005
WTF?!

Potato Salad posted:

30%... is almost as far back as Sandy Bridge?
Probably a tad slower at some things if the hit is really that huge. Most things don't seem to be getting affected that badly though, and the software to address it seems to be getting worked on super fast, so it's probably more of a blow to Intel's prestige than anything else.

If it was an across-the-board 30% hit to performance and the fixes didn't come out for a long time, then I could see it hurting their server sales significantly.

AFAIK the biggest issue AMD has in the server market is getting more Epycs out the door, since demand is high, so it's not like they can really capitalize on that potential weakness anyway. Given the way Epycs are made they should be able to get production ramped relatively quickly, so maybe in another quarter or two their supply issues will clear up and we'll really begin to see them get some market share in the server space. Taking "just" 10% of the market from Intel would be big for them; getting back to 20% market share like they had back in the Opteron heyday would be huge.

NewFatMike
Jun 11, 2015

The performance hit is likely to be a bell curve with the majority just getting a 5% hit and a handful of workloads getting the 30%. This will be a fun game of productivity roulette after Patch Tuesday for everyone, though.

repiv
Aug 13, 2009

It sounds like I/O heavy workloads will take the worst of it, since they're transitioning between usermode and kernelmode a lot.

GRINDCORE MEGGIDO
Feb 28, 1985


It's going to be extremely interesting to see how the Ryzen refresh looks vs. borked Intel.

NewFatMike
Jun 11, 2015

The DX9 compatibility problem still blows my mind for APU purposes. Those older titles are prime laptop material. Eugh. The Vulkan wrapper is probably going to be too heavy duty for Vega Mobile.

Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE

NewFatMike posted:

The DX9 compatibility problem still blows my mind for APU purposes. Those older titles are prime laptop material. Eugh. The Vulkan wrapper is probably going to be too heavy duty for Vega Mobile.

They backed off it today.

NewFatMike
Jun 11, 2015

Paul MaudDib posted:

They backed off it today.

:toot:

The whole situation is dumb, though, and does not make AMD look any less like a circus. Hopefully the open sourced Vulkan stack will help them...uh...co-opt some better submissions from the community.

NewFatMike fucked around with this message at 05:40 on Jan 3, 2018

Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE

NewFatMike posted:

:toot:

The whole situation is dumb, though, and does not make AMD look any less like a circus. Hopefully the open sourced Vulkan stack will help them...uh...co-opt some better submissions from the community.

It's loving hilarious that "what if we ported the open-source drivers to Windows and let the community do our jobs for us" is actually a serious option in this situation.

Paul MaudDib fucked around with this message at 05:53 on Jan 3, 2018

Volguus
Mar 3, 2009

Paul MaudDib posted:

It's loving hilarious that "what if we ported the open-source drivers to Windows and let the community do our jobs for us" is actually a serious option in this situation.

The miracle would be if there were people with the required expertise who were actually willing to take that on.

NewFatMike
Jun 11, 2015

Before some signing fuckery a while back, community drivers were pretty commonplace. Valve actually contribute a lot to AMD drivers on the Linux side. Gotta pay those investors back first, then hire some driver talent I guess.

Combat Pretzel
Jun 23, 2004

No, seriously... what kurds?!
If the average hit turns out to be more than 5%, it'll be more than a little ironic that Intel's ongoing performance advantage came from cutting corners in regards to security.

Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE

Combat Pretzel posted:

If the average hit turns out to be more than 5%, it'll be more than a little ironic that Intel's ongoing performance advantage came from cutting corners in regards to security.

It's pretty likely that Intel will have a stepping out that mitigates the general case pretty quickly. The impact is on silicon that exists in the wild, not Intel's uarch in general, and even with (some of) the extant silicon it may eventually be possible to mitigate it in microcode. It may be as simple as disabling speculative execution for certain instructions, or checking the privilege of the calling thread. Super big opening for AMD to sell their hardware here, though.

It's a shitload of silicon in general though, that's bad enough.

Arguably this is code that should have existed anyway, though, given the existence of timing attacks and the known problem of rowhammer, and that may be why the AMD patch of "nuh uh, doesn't apply to us" hasn't been upstreamed so far. It may still be... but this is a good defensive code change regardless. gently caress it, guess we don't share page tables between kernel and userland anymore.

Paul MaudDib fucked around with this message at 06:32 on Jan 3, 2018

fishmech
Jul 16, 2006

by VideoGames
Salad Prong

Combat Pretzel posted:

If the average hit turns out to be more than 5%, it'll be more than a little ironic that Intel's ongoing performance advantage came from cutting corners in regards to security.

Er, but they don't seem to have? It appears to affect all x86-64 supporting Intel CPUs, which would mean it goes all the way back to the first 64 bit Pentium 4s and for the whole line from the first Core 2 Duo chips.

SwissArmyDruid
Feb 14, 2014

by sebmojo

NewFatMike posted:

Before some signing fuckery a while back, community drivers were pretty commonplace. Valve actually contribute a lot to AMD drivers on the Linux side. Gotta pay those investors back first, then hire some driver talent I guess.

No, more investment in shroud R&D.

Risky Bisquick
Jan 18, 2008

PLEASE LET ME WRITE YOUR VICTIM IMPACT STATEMENT SO I CAN FURTHER DEMONSTRATE THE CALAMITY THAT IS OUR JUSTICE SYSTEM.



Buglord

Paul MaudDib posted:

It's pretty likely that Intel will have a stepping out that mitigates the general case pretty quickly. The impact is on silicon that exists in the wild, not Intel's uarch in general, and even with (some of) the extant silicon it may eventually be possible to mitigate it in microcode. It may be as simple as disabling speculative execution for certain instructions, or checking the privilege of the calling thread. Super big opening for AMD to sell their hardware here, though.

Arguably this is code that should have existed anyway, though, given the existence of timing attacks and the known problem of rowhammer, and that may be why the AMD patch of "nuh uh, doesn't apply to us" hasn't been upstreamed so far. It may still be... but this is a good defensive code change regardless. gently caress it, guess we don't share page tables between privileged and unprivileged code anymore.

Your wishy washy opinion on the severity of this vs the ryzen segfault is just amazing. This issue is every xeon sku since basically forever.

Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE

Risky Bisquick posted:

Your wishy washy opinion on the severity of this vs the ryzen segfault is just amazing. This issue is every xeon sku since basically forever.

Is there a software patch with ~5% average performance impact on Ryzen that I don't know about?

Errata happen, available mitigations matter. And we don't even have a documented scope of the exploit here - not even a kernel-team consensus about whether it affects AMD or not. It's serious, but the exact trigger conditions are still embargo'd.

As I said in the Intel thread, though, there are definitely cases like databases that Intel had previously smoked Epyc in that are going to poo poo the bed (or at least be competitive) on existing silicon with this patch. There is going to be a pretty serious incentive for players on existing silicon to turtle up their databases and SANs as separate dedicated appliances that you know will only be executing trusted code from you and no others (and can thus run nopti and ignore the performance impact). It's a huge blow to the whole hyperconverged concept - and/or AMD gains a big advantage in this area.

This is on existing code, too, though, and vendors may be able to drop that performance impact a little bit by optimizing for syscalls even harder now that they have become slower.

Paul MaudDib fucked around with this message at 06:57 on Jan 3, 2018

Lord Windy
Mar 26, 2010

NewFatMike posted:

Before some signing fuckery a while back, community drivers were pretty commonplace. Valve actually contribute a lot to AMD drivers on the Linux side. Gotta pay those investors back first, then hire some driver talent I guess.

AMD GPU drivers wouldn't be very easy to contribute to would they? I recall most of the documentation is hidden behind walls like Nvidia's and Intel's (bar a couple of open sourced documents that would theoretically make it possible) so you'd mostly be stuck trying to grok the open source driver.

I wish it wasn't stuck behind walls; it would be great to see what people could come up with in driver space. I'm sure they'd mostly be ultra-optimized Ethereum miners, but oh well!

Combat Pretzel
Jun 23, 2004

No, seriously... what kurds?!

fishmech posted:

Er, but they don't seem to have? It appears to affect all x86-64 supporting Intel CPUs, which would mean it goes all the way back to the first 64 bit Pentium 4s and for the whole line from the first Core 2 Duo chips.
The point was, mitigation of this longstanding snafu may more or less remove the performance advantage Intel currently has over Zen in terms of IPC. It remains to be seen if a hardware or microcode fix brings it back. There's going to be a loss either way if the fix means less speculative execution, regardless of how it gets done.

feedmegin
Jul 30, 2008

Lord Windy posted:

AMD GPU drivers wouldn't be very easy to contribute to would they? I recall most of the documentation is hidden behind walls like Nvidia's and Intel's (bar a couple of open sourced documents that would theoretically make it possible) so you'd mostly be stuck trying to grok the open source driver.

Intel has open source documentation for all their GPU stuff, they're great about that. It's just their GPUs aren't that good compared to AMD/NVidia.

ufarn
May 30, 2009
Would Intel change the numbering on their processors to something like 8401 if they change the stepping to address this?

Risky Bisquick
Jan 18, 2008

PLEASE LET ME WRITE YOUR VICTIM IMPACT STATEMENT SO I CAN FURTHER DEMONSTRATE THE CALAMITY THAT IS OUR JUSTICE SYSTEM.



Buglord

Paul MaudDib posted:

Is there a software patch with ~5% average performance impact on Ryzen that I don't know about?

Errata happen, available mitigations matter. And we don't even have a documented scope of the exploit here - not even a kernel-team consensus about whether it affects AMD or not. It's serious, but the exact trigger conditions are still embargo'd.

As I said in the Intel thread, though, there are definitely cases like databases that Intel had previously smoked Epyc in that are going to poo poo the bed (or at least be competitive) on existing silicon with this patch. There is going to be a pretty serious incentive for players on existing silicon to turtle up their databases and SANs as separate dedicated appliances that you know will only be executing trusted code from you and no others (and can thus run nopti and ignore the performance impact). It's a huge blow to the whole hyperconverged concept - and/or AMD gains a big advantage in this area.

This is on existing code, too, though, and vendors may be able to drop that performance impact a little bit by optimizing for syscalls even harder now that they have become slower.



The segfaulting probably affected tens of thousands of users in absolute numbers. That was CPU errata and/or poor QA on the AMD side, perhaps on purpose to sell the dies to enthusiast consumers. Intel's problem is going to be felt by hundreds of millions in the very best case, and if the penalty is more severe than 5% they are going to get roasted. There is not going to be a CPU recall, and this could do serious brand damage if the heavily affected workloads can’t be mitigated down to the low teens in terms of performance penalty after they have more time to refactor code following Friday.

If this bug is what it takes to get AMD back to the perf crown with zen+ I’m going to laugh

Munkeymon
Aug 14, 2003

Motherfucker's got an
armor-piercing crowbar! Rigoddamndicu𝜆ous.



Risky Bisquick posted:

If this bug is what it takes to get AMD back to the perf crown with zen+ I’m going to laugh

Can't wait for gamers to try to patch it out and either get owned by exploits or gently caress up their systems constantly.

NewFatMike posted:

hire some driver talent I guess.

2018: the year of the actually good Radeon driver

ufarn
May 30, 2009

Munkeymon posted:

Can't wait for gamers to try to patch it out and either get owned by exploits or gently caress up their systems constantly.
The NeoGAF/ResetERA types are already vowing never to update their Windows 10 again. :doh:

repiv
Aug 13, 2009

I don't envy the people who write GPU drivers :v:

https://twitter.com/FioraAeterna/status/948464769039765504

https://twitter.com/FioraAeterna/status/948473516029968384

Stanley Pain
Jun 16, 2001

by Fluffdaddy

Paul MaudDib posted:

Is there a software patch with ~5% average performance impact on Ryzen that I don't know about?

Errata happen, available mitigations matter. And we don't even have a documented scope of the exploit here - not even a kernel-team consensus about whether it affects AMD or not. It's serious, but the exact trigger conditions are still embargo'd.

As I said in the Intel thread, though, there are definitely cases like databases that Intel had previously smoked Epyc in that are going to poo poo the bed (or at least be competitive) on existing silicon with this patch. There is going to be a pretty serious incentive for players on existing silicon to turtle up their databases and SANs as separate dedicated appliances that you know will only be executing trusted code from you and no others (and can thus run nopti and ignore the performance impact). It's a huge blow to the whole hyperconverged concept - and/or AMD gains a big advantage in this area.

This is on existing code, too, though, and vendors may be able to drop that performance impact a little bit by optimizing for syscalls even harder now that they have become slower.



I think he was calling you out more on your choice of words than anything else. You were quite severe with your language about the AMD segfault issue and seem to be sweeping things under the rug, so to say, about Intel's very big, huge, massive problem affecting basically all their CPUs. In general you do come across as being a bit more Intel fanboyish than I think you realize :shobon:

Combat Pretzel
Jun 23, 2004

No, seriously... what kurds?!
Bleh, apparently if the CPU supports PCID, the performance loss can be reduced to next to nothing. Haswell and newer do support it.
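
For anyone who wants to check their own box: a minimal sketch, assuming x86 Linux, where /proc/cpuinfo has a `flags` line and PCID shows up there as `pcid` (with `invpcid` alongside on newer chips). The function names are made up for illustration, and the script only reports what the kernel lists; it says nothing about how much of the PTI hit you actually get back.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text.

    Only the first "flags" line is used; on a normal x86 system every
    core reports the same set.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def pcid_support(cpuinfo_text):
    """Report whether the pcid and invpcid flags are present."""
    flags = cpu_flags(cpuinfo_text)
    return {"pcid": "pcid" in flags, "invpcid": "invpcid" in flags}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(pcid_support(f.read()))
    except OSError:
        # Not Linux, or /proc is not mounted.
        print("could not read /proc/cpuinfo")
```

Pass it any cpuinfo dump as a string if you are checking a machine you are not logged into.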

Rastor
Jun 2, 2001

Combat Pretzel posted:

Bleh, apparently if the CPU supports PCID, the performance loss can be reduced to next to nothing. Haswell and newer do support it.

These database benchmarks were posted in the Intel thread (WHY HAVEN'T WE MERGED THE THREADS YET)

https://www.postgresql.org/message-id/20180102222354.qikjmf7dvnjgbkxe@alap3.anarazel.de

code:
readonly pgbench (tpch-like), 16 clients, i7-6820HQ CPU (skylake):

pti=off:
tps = 236629.778328

pti=on:
tps = 220791.228297 (~0.93x)

pti=on, nopcid:
tps = 198959.801459 (~0.84x)


To get closer to the worst case, I've also measured:

pgbench SELECT 1, 16 clients, i7-6820HQ CPU (skylake):

pti=off:
tps = 420490.162391

pti=on:
tps = 350746.065039 (~0.83x)

pti=on, nopcid:
tps = 324269.903152 (~0.77x)
So with PCID, a 7% hit on the tpch-like run (17% on SELECT 1)

Without PCID, a 16% hit (23% on SELECT 1)

Now that's a database benchmark, which is one of the worst affected workloads, but it doesn't sound like PCID reduced the hit to next to nothing
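
If you want to redo the arithmetic yourself, here is a small sketch that takes the tps figures from the pgbench output above and turns them into relative throughput and a percentage hit. The numbers are copied from that post; the dictionary layout and function names are just for illustration.

```python
# tps figures copied from the pgbench output quoted above
# (i7-6820HQ, 16 clients).
RESULTS = {
    "tpch-like": {
        "pti=off": 236629.778328,
        "pti=on": 220791.228297,
        "pti=on, nopcid": 198959.801459,
    },
    "SELECT 1": {
        "pti=off": 420490.162391,
        "pti=on": 350746.065039,
        "pti=on, nopcid": 324269.903152,
    },
}


def relative_throughput(results, baseline="pti=off"):
    """Divide every run's tps by the baseline run of the same workload."""
    out = {}
    for workload, runs in results.items():
        base = runs[baseline]
        out[workload] = {
            cfg: tps / base for cfg, tps in runs.items() if cfg != baseline
        }
    return out


def percent_hit(ratio):
    """Convert a relative throughput (e.g. 0.93) into a percentage loss."""
    return (1.0 - ratio) * 100.0


if __name__ == "__main__":
    for workload, runs in relative_throughput(RESULTS).items():
        for cfg, ratio in runs.items():
            print(f"{workload:10s} {cfg:15s} {ratio:.2f}x  ({percent_hit(ratio):.1f}% hit)")
```

The ratios round to the same ~0.93x / ~0.84x / ~0.83x / ~0.77x annotations that came with the benchmark.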

Generic Monk
Oct 31, 2011

Stanley Pain posted:

I think he was calling you out more on your choice of words than anything else. You were quite severe with your language about the AMD segfault issue and seem to be sweeping things under the rug, so to say, about Intel's very big, huge, massive problem affecting basically all their CPUs. In general you do come across as being a bit more Intel fanboyish than I think you realize :shobon:

i mean the only potential downside to treating this issue as the serious thing it is, is if you own any intel stock. which patently seems to not have been affected by the issue anyway. forcing them to own their mistakes, and at the very least getting them to abide by the nominal rules of the free market hellhole in which we currently reside, should be a moral imperative

speaking as an apple user for a minute it is quite amusing to see apple's new $5000+ computer presumably get merked by this thing as well

Combat Pretzel
Jun 23, 2004

No, seriously... what kurds?!

Rastor posted:

Now that's a database benchmark, which is one of the worst affected workloads, but it doesn't sound like PCID reduced the hit to next to nothing
Hmmm, crap. I guess it'd still be interesting to see a varied workload. poo poo like x264 encoding doesn't seem to be affected at all, gaming ostensibly also not, which surprises me, because communication with the graphics driver crosses the kernel boundary (I guess not often enough with command batching).

repiv
Aug 13, 2009

Combat Pretzel posted:

(I guess not often enough with command batching).

Yeah, graphics drivers mostly run in usermode these days (at least on Windows) then batch the work into fat command lists before throwing them over the wall into the kernel.

There's probably not that many kernel calls per-frame in the grand scheme of things.
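
A rough sketch of why batching matters here, assuming nothing about any real driver: it reads the same temp file once in 64-byte pieces and once in a single big read, and counts how many `os.read` calls (each one a trip into the kernel) it took. The function names and sizes are made up for illustration, and Python's own overhead swamps the actual syscall cost, so treat the timings as a direction rather than a measurement.

```python
import os
import tempfile
import time


def read_in_chunks(path, chunk_size):
    """Read a file with raw os.read calls.

    Returns (bytes_read, read_calls, seconds). Each os.read is one
    user-to-kernel transition, which is the part PTI makes more expensive.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        total = 0
        calls = 0
        start = time.perf_counter()
        while True:
            data = os.read(fd, chunk_size)
            calls += 1
            if not data:  # empty read means end of file
                break
            total += len(data)
        return total, calls, time.perf_counter() - start
    finally:
        os.close(fd)


def demo(size=256 * 1024):
    """Compare many small reads against one large read of the same data."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"\0" * size)
        path = f.name
    try:
        small = read_in_chunks(path, 64)
        large = read_in_chunks(path, size)
    finally:
        os.unlink(path)
    return small, large


if __name__ == "__main__":
    small, large = demo()
    print("64-byte reads: %d bytes, %d calls, %.4fs" % small)
    print("one big read:  %d bytes, %d calls, %.4fs" % large)
```

Same bytes either way; the batched version just crosses the kernel boundary a few times instead of thousands.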

Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE

Stanley Pain posted:

I think he was calling you out more on your choice of words than anything else. You were quite severe with your language about the AMD segfault issue and seem to be sweeping things under the rug, so to say, about Intel's very big, huge, massive problem affecting basically all their CPUs. In general you do come across as being a bit more Intel fanboyish than I think you realize :shobon:

The mitigations available for an erratum and their impacts matter. :shrug:

The numbers being thrown around are pathological test-cases and don't seem to be borne out in general instances. IO-heavy workloads are going to take a moderate dent here, but even then real-world workloads do not consist solely of running 'du' and loopback SELECT 1; operations 24/7 so this is still overstating the problem in general.

This will have less than a 1% impact on most applications, and the real-world worst case will probably be a 5-10% impact on databases and compilation workloads. That's embarrassing and AMD will be nipping even closer at their heels in a few tasks, but it's far from the apocalyptic "50% performance loss!111!" numbers that are getting breathlessly thrown around.




https://www.computerbase.de/2018-01/intel-cpu-pti-sicherheitsluecke/




https://www.hardwareluxx.de/index.php/news/hardware/prozessoren/45319-intel-kaempft-mit-schwerer-sicherheitsluecke-im-prozessor-design.html

So yeah, to be frank this one's a nothingburger as far as end users are concerned. That's not spin, that's a realistic assessment of the situation.

Paul MaudDib fucked around with this message at 17:56 on Jan 3, 2018

Combat Pretzel
Jun 23, 2004

No, seriously... what kurds?!
And there goes my shallow justification for upgrading.

--edit: What about this Denuvo DRM poo poo (or whatever it is called) which has games call god knows how many times a second into their kernel driver?
--edit2: Nevermind, I should learn to read.

Combat Pretzel fucked around with this message at 17:42 on Jan 3, 2018

Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE

Combat Pretzel posted:

And there goes my shallow justification for upgrading.

--edit: What about this Denuvo DRM poo poo (or whatever it is called) which has games call god knows how many times a second into their kernel driver?

That's AC:O, second timg down, 3% performance impact at 1080p lowest preset.

repiv
Aug 13, 2009

Combat Pretzel posted:

--edit: What about this Denuvo DRM poo poo (or whatever it is called) which has games call god knows how many times a second into their kernel driver?

Denuvo doesn't use a driver AFAIK, or even an external process. I'm pretty sure it's self contained in the protected executable.

e: It's probably worth mentioning that organizations doing seriously high speed I/O are already bypassing the kernel for performance and shouldn't be affected by this workaround at all.

repiv fucked around with this message at 18:11 on Jan 3, 2018

PerrineClostermann
Dec 15, 2012

by FactsAreUseless
Europe friend is migrating to Epyc :shrug:

Rastor
Jun 2, 2001

Paul MaudDib posted:

The mitigations available for an erratum and their impacts matter. :shrug:

The numbers being thrown around are pathological test-cases and don't seem to be borne out in general instances. IO-heavy workloads are going to take a moderate dent here, but even then real-world workloads do not consist solely of running 'du' and loopback SELECT 1; operations 24/7 so this is still overstating the problem in general.

This will have less than a 1% impact on most applications, and the real-world worst case will probably be a 5-10% impact on databases and compilation workloads. That's embarrassing and AMD will be nipping even closer at their heels in a few tasks, but it's far from the apocalyptic "50% performance loss!111!" numbers that are getting breathlessly thrown around.

That may go some ways to explain why the Linux kernel developers set the pti=on to be the default for all x86 processors in their patch rather than having an exception for AMD.

So it may become 'make your database faster on AMD processors with ONE WEIRD TRICK'

Bloody Antlers
Mar 27, 2010

by Jeffrey of YOSPOS

Paul MaudDib posted:



So yeah, to be frank this one's a nothingburger as far as end users are concerned. That's not spin, that's a realistic assessment of the situation.

Paul, this nothingburger is embargoed for a pretty good reason; if a script to exploit this security flaw were to make its way around the internet before a security fix is in place, every single loving person with sensitive data in a data center would be affected. And since this exploit has been known for months prior to the rushed patches, we have no idea how much damage has already been done by those that found it first. THAT is a realistic assessment of the situation.

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

GPU driver bugs don't cause privilege escalation vulnerabilities across VMs, yet

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Rastor posted:

These database benchmarks were posted in the Intel thread (WHY HAVEN'T WE MERGED THE THREADS YET)

https://www.postgresql.org/message-id/20180102222354.qikjmf7dvnjgbkxe@alap3.anarazel.de

code:
readonly pgbench (tpch-like), 16 clients, i7-6820HQ CPU (skylake):

pti=off:
tps = 236629.778328

pti=on:
tps = 220791.228297 (~0.93x)

pti=on, nopcid:
tps = 198959.801459 (~0.84x)


To get closer to the worst case, I've also measured:

pgbench SELECT 1, 16 clients, i7-6820HQ CPU (skylake):

pti=off:
tps = 420490.162391

pti=on:
tps = 350746.065039 (~0.83x)

pti=on, nopcid:
tps = 324269.903152 (~0.77x)
So with PCID, a 7% hit on the tpch-like run (17% on SELECT 1)

Without PCID, a 16% hit (23% on SELECT 1)

Now that's a database benchmark, which is one of the worst affected workloads, but it doesn't sound like PCID reduced the hit to next to nothing

That's nuts. That's a colossal hit to throughput for anyone running a database on a VM, of which there are many.

Any syscall-heavy workload is gonna get hosed. The silver lining, however, is that kernel offload or XDP-style solutions will become required.

Cygni
Nov 12, 2005

raring to post

so the current Linux fix has the performance hit regardless of CPU manufacturer, do we know if that's the case with the NT kernel fix yet?
