Yeep
Nov 8, 2004
Problem description: Windows is crashing with blue screen errors including, but not limited to, the following:
CRITICAL_PROCESS_DIED
SYSTEM_THREAD_EXCEPTION_NOT_HANDLED acpi.sys
KERNEL_SECURITY_CHECK_FAILURE
MEMORY_MANAGEMENT
KERNEL_EXCEPTION_NOT_HANDLED
KERNEL_MODE_HEAP_CORRUPTION
Also some mentioning filesystem, network and Bluetooth drivers that I didn't write down.

Crashes happen during boot, after login, when idle, during the Windows recovery process and once or twice when re-installing from a USB stick.

Attempted fixes: I've reinstalled Windows on a brand new drive (upgrading from an old SATA SSD to an NVMe M.2). That seemed to fix things for a couple of days, but I had to take everything apart short of the CPU to get at the M.2 slot, so the RAM and most of the power cables got reseated too. Now I'm back where I was before, with bluescreen crashes all the time.

I'm able to run the Intel CPU stress test, and all the OCCT tests for an hour without triggering a crash. memtest86 ran for 10 hours without errors with both sticks of RAM, then got stuck at 4GB on the first test with only one stick, then passed with the other, then passed with the first one again, then passed with each stick in both slots, then with both sticks in again. Running it again a couple of days later I got ~500 errors in test 3 with both sticks but that wasn't repeatable.

I've blasted everything with compressed air (with the power off).

It feels to me like my RAM is failing, but I can't get memtest to reliably show me that. Would reseating the memory temporarily hide bad RAM sticks? I've got some DDR4 3600 I can swap in for testing, but it's much smaller capacity and I'm slightly worried something in my PC is actively killing components. Plus, going on the last couple of days of troubleshooting, it might be a while before things go wrong again even if the RAM is good.

Recent changes: My Radeon VII died slowly about 6 weeks ago so I replaced it with a 1660 super I scavenged from another PC. I had some crashes then that were solved by removing all the old display drivers and things had been stable for about a month. I had to physically move the PC to another room for a couple of hours and when I brought it back the current bluescreens started. The 1660 has now gone back to the other PC and I'm running on the Intel iGPU.

Operating system: Windows 10 Pro 64 bit

System specs:
i7 8700k
ASRock Z370 Gaming-ITX/ac
2x16GB G.Skill DDR4 3200
Sapphire Radeon VII (confirmed deceased in another machine)
Gigabyte 1660 super Mini ITX OC (currently back in its donor machine)
Silverstone SST-SX600-G 600W PSU
Corsair Force MP510 960GB m.2 SSD
Samsung 850 Evo 500GB (currently unplugged)

Location: UK

I have Googled and read the FAQ: Yes

down1nit
Jan 10, 2004

outlive your enemies
CPU is my first guess. Disable features on it in your bios. Speedstep, c states, boosting related....

I've seen cpus turn stable when you don't gently caress with the voltages or frequencies

C States has been the best performer. Stops the CPU from dropping voltages real fast like.

down1nit
Jan 10, 2004

outlive your enemies
You're not boosting up and dropping down on a stress test. You're max everything for forever.

Also that cpu has the memory controller in it.

down1nit
Jan 10, 2004

outlive your enemies
Holy poo poo asrock, you can disable individual components within the processor, that's terrifying and awesome. Good job lads.

Yeep
Nov 8, 2004

down1nit posted:

CPU is my first guess. Disable features on it in your bios. Speedstep, c states, boosting related....

I've seen cpus turn stable when you don't gently caress with the voltages or frequencies

C States has been the best performer. Stops the CPU from dropping voltages real fast like.

Thanks! Disabling C states seems to have made things more stable. I was previously able to force a bluescreen within 30 seconds of Windows startup but I'm now 30 minutes in without a crash. Is this a long term solution or should I be shopping for a new CPU?

down1nit
Jan 10, 2004

outlive your enemies
I'd RMA it, but you can probably find a stable setting with all the poo poo asrock packed into that board. I now realize the RMA period has probably passed

Edit: it's just a power saving thing really. So keeping it disabled is fine as long as your temperatures are good

down1nit fucked around with this message at 16:53 on Sep 1, 2021

Yeep
Nov 8, 2004
I'm pretty sure the warranty expired last week.

I had to take the heatsink off to find the serial number on the CPU, and after re-mounting it (plus cleaning off and reapplying thermal paste) everything is now much worse. I can't even boot the Windows recovery USB and sometimes I don't get as far as the BIOS POST screen.

I've put in a support request with Intel anyway in case they're ok with the fact it actually failed last week (in warranty) even if it's taken me until now to diagnose, but I think I'm probably hosed.

edit: :mad:

Yeep fucked around with this message at 10:52 on Sep 2, 2021

Yeep
Nov 8, 2004

Yeep posted:

I've put in a support request with Intel anyway in case they're ok with the fact it actually failed last week (in warranty) even if it's taken me until now to diagnose, but I think I'm probably hosed.

No real update on this other than to say gently caress you Intel for making me buy a 15x macro lens for my phone before you'll consider an RMA.
https://www.intel.com/content/www/us/en/support/articles/000021613/processors/intel-core-processors.html

down1nit
Jan 10, 2004

outlive your enemies
Oh god. Yeah they ask for that. It's just dots on a green background. How the gently caress is anyone going to read that.

It's obviously a deterrent for returns right? Like, it has to be. They are fully capable of printing that on the heat spreader as well as the PCB.

down1nit
Jan 10, 2004

outlive your enemies
Ohh wait it's cheap and easy to de-lid a cpu. Harder to move the silicon to another PCB...? Still, better letters please, Intel?
