neogeo0823
Jul 4, 2007

NO THAT'S NOT ME!!

Problem description: The issue started happening infrequently a couple of months ago, but has become incredibly common over the past month, to the point where I can't play any game without it happening. At first, it was only while playing Deep Rock Galactic, but two weeks ago it began happening with Minecraft, and now it's happening with Death's Door as well. What happens is, I'll load up a game, and at some point in the session, typically within 30 minutes, the game will crash and Discord will crash, but nothing else I have open does. My computer then displays a single solid-colored screen for as long as 30 seconds before it swaps to the desktop. It's always a different color, but it's always some kind of dark, neutral tone: either an earth tone or just straight black. With DRG, the crash handler says that a D3D device was not connected. With other games, there is no crash handler, or the handler closes too quickly for me to see what happened. I checked the Windows Event Viewer, and it tells me "Display driver nvlddmkm stopped responding and has successfully recovered."

Attempted fixes: At first, I thought it was my GPU. I downloaded MSI Afterburner, and I noticed that the GPU temps were high, around 90C, right when the crashes happened. So on the advice of Google, I underclocked the GPU to 75%, turned down the graphics settings on my games, and fully wiped and re-downloaded the latest graphics drivers in safe mode. After a number of crashes over a period of days, I noticed the CPU temperature was also getting high (91C), so I ordered new thermal paste, which I applied about an hour ago. After rebooting the computer and checking that the temperatures were stable, I loaded up Death's Door, and the game crashed almost immediately with the exact same symptoms. This time, the GPU hadn't gotten a chance to get hot. Nor had the CPU, for that matter. GPU temperature was rock solid at 28C, and CPU temp varied between 39C and 68C.

Recent changes: No changes before the problem occurred. The only changes I have made to anything were post-issue.

--

Operating system: Windows 10 64bit

System specs: Lifted straight from my PC Parts Picker list from like 2021:

CPU: AMD Ryzen 5 3600 3.6 GHz 6-Core Processor
Motherboard: MSI B450M BAZOOKA MAX WIFI Micro ATX AM4 Motherboard
Memory: Crucial Ballistix 16 GB (2 x 8 GB) DDR4-3600 CL16 Memory
Storage: Team MP33 512 GB M.2-2280 NVME Solid State Drive
Video Card: EVGA GeForce GTX 980 Ti 6 GB HYBRID Video Card
Case: Cougar MX330-G ATX Mid Tower Case
Power Supply: EVGA G3 650 W 80+ Gold Certified Fully Modular ATX Power Supply
Case Fan: CRYORIG QF120 Silent 44 CFM 120 mm Fan

Location: USA

I have Googled and read the FAQ: Yes

At this point, I'm completely stumped.


neogeo0823
Jul 4, 2007

NO THAT'S NOT ME!!

On a goon's suggestion, I just ran the Windows 10 memory diagnostic, and that went fine and came up with no errors. Someone also said to check my power supply, but if that was bad, wouldn't it be affecting everything, all the time? I consistently get the same stuff reported as causing the crash.

Zogo
Jul 29, 2003

Make sure W10 is fully updated and then run DDU https://www.wagnardsoft.com/forums/viewtopic.php?t=4518 and then install the latest GPU drivers again.

neogeo0823 posted:

On a goon's suggestion, I just ran the Windows 10 memory diagnostic, and that went fine and came up with no errors. Someone also said to check my power supply, but if that was bad, wouldn't it be affecting everything, all the time? I consistently get the same stuff reported as causing the crash.

A failing PSU can cause strange issues like this but it's more likely to be a GPU issue.

neogeo0823
Jul 4, 2007

NO THAT'S NOT ME!!

I tried that before posting the thread, and it didn't help the issue. Just to make sure I did it correctly though, what I had done was download DDU, disable my wifi on my desktop, restart in safe mode, run DDU and blow away any and all graphics drivers on the system, restart normally, and install the latest graphics driver that was recommended to me by Nvidia Geforce Experience before re-enabling the wifi. Is that the correct way to do this? Should I do it again?

Zogo
Jul 29, 2003

You could also try some of the steps here if you haven't yet already:
https://www.partitionwizard.com/clone-disk/display-driver-stopped-responding-and-has-recovered.html

neogeo0823 posted:

Is that the correct way to do this? Should I do it again?

Yeah, if you're on the latest GPU drivers right now then you don't have to run DDU again.

neogeo0823
Jul 4, 2007

NO THAT'S NOT ME!!

Zogo posted:

You could also try some of the steps here if you haven't yet already:
https://www.partitionwizard.com/clone-disk/display-driver-stopped-responding-and-has-recovered.html

Yeah, if you're on the latest GPU drivers right now then you don't have to run DDU again.

According to NGE, I'm currently running Geforce Game Ready Driver version 536.67, released on 7/18/2023. I'll go through that list and give all of it a try.

EDIT: Looking at that list, here are my thoughts on those possible fixes:

1.) I can definitely close some things, like open folders and Firefox, but I can't really close out Discord when playing DRG, as we use Discord's voice chat for in-game talking. Does Discord have some sort of minimized mode, or something, that would allow me to talk but otherwise free up any other resources that aren't going towards that?

2.) I'll try adjusting visual effects and report back if it helps.

3.) I have already reinstalled my graphics drivers.

4.) I've followed the instructions to increase the GPU processing time, and I'll report back if that helps.

5.) I'm already on the latest version of the drivers.

6.) I thought the issue was heat related, and unless whatever the computer uses to determine the GPU temperature is faulty, then that's not entirely the issue, as I've detailed previously.

7.) I don't overclock anything, so it's not that.

8.) I'm not going to reinstall Windows unless it's like an absolute last-ditch effort. Like, this is assuming I get an entirely new graphics card, and I'm still having this problem.

9.) This is the second-to-last resort.
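
For reference on item 4: "increasing the GPU processing time" most likely refers to raising the Windows TDR timeout, the `TdrDelay` DWORD under the `GraphicsDrivers` registry key, which defaults to 2 seconds. That reading is an assumption, since the linked article's exact steps aren't quoted in this thread. A minimal Python sketch that only builds the text of a `.reg` file for a chosen delay follows; the function name `build_tdr_reg` and its 2 to 60 second sanity bounds are made up for illustration.

```python
# Sketch: build the text of a .reg file that sets the TDR timeout.
# Assumption: "GPU processing time" in item 4 means the TdrDelay value.

TDR_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"


def build_tdr_reg(delay_seconds: int) -> str:
    """Return .reg file text that sets TdrDelay (a DWORD, in seconds)."""
    # The 2..60 bounds are an arbitrary sanity check chosen for this sketch.
    if not 2 <= delay_seconds <= 60:
        raise ValueError("delay_seconds should be between 2 and 60")
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{TDR_KEY}]\n"
        f'"TdrDelay"=dword:{delay_seconds:08x}\n'
    )


if __name__ == "__main__":
    # Print the file contents for an 8 second timeout.
    print(build_tdr_reg(8))
```

The sketch never touches the registry itself; the output would have to be saved as a `.reg` file and imported by hand, and as far as I know the change only takes effect after a reboot. A longer timeout just gives the driver more time to respond, so it would not be expected to help if the card itself is failing.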

I'm about to go to bed, so I'll give the things I changed a try tomorrow and see if they help at all. Hopefully something does help.

neogeo0823 fucked around with this message at 03:35 on Jul 30, 2023

Zogo
Jul 29, 2003

If you close Discord temporarily and play one of those games and it still crashes, that'll rule out Discord as the cause.

neogeo0823
Jul 4, 2007

NO THAT'S NOT ME!!

Ok, so I just started the computer, started Steam and Afterburner, and then jumped straight into Death's Door. Surprisingly, I was able to play for maybe 35-45 minutes with just those three programs running. It still crashed, though. Looking at the temperatures, the GPU was rock solid at 35C, and the CPU maxed out at 77C, but mostly averaged around 60C. Steam never crashed, Afterburner never crashed.

In the event viewer, I see the following events:

This is the first event in the sequence. The next two from that highlighted one up are identical to the first, but the line "Resetting TDR Occurred on GPUID:2600" is changed to say "Reset", and "Restarting", respectively.

Up next is the warning:


And then the game crashing:


Finally, the topmost warning is whatever this is. I have no idea if it's even related to anything.


If it helps, I can also post screenshots of the details tabs of the error info. Hopefully this is helpful.
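
Rather than screenshots, one way to see how often the driver is resetting is to export the log from Event Viewer as CSV and count the rows that mention `nvlddmkm`. The Python sketch below does that under some assumptions: the column names (`Date and Time`, `Source`, `Message`) and the sample rows are invented for illustration and may not match a real export, and `count_driver_resets` is a hypothetical helper name.

```python
import csv
import io
from collections import Counter


def count_driver_resets(csv_text, needle="nvlddmkm"):
    """Count rows that mention the driver name, grouped by calendar date."""
    per_day = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Search every field, since the driver name can show up in the
        # source column or in the message text.
        haystack = " ".join(str(value) for value in row.values() if value)
        if needle.lower() in haystack.lower():
            # Assumed timestamp layout: "<date> <time> <AM/PM>".
            day = row["Date and Time"].split(" ")[0]
            per_day[day] += 1
    return dict(per_day)


# Made-up sample rows; only the two quoted messages come from this thread.
SAMPLE = """Level,Date and Time,Source,Message
Warning,7/30/2023 1:05:12 PM,Display,Display driver nvlddmkm stopped responding and has successfully recovered.
Information,7/30/2023 1:05:10 PM,nvlddmkm,Resetting TDR Occurred on GPUID:2600
Error,7/30/2023 1:05:14 PM,Application Error,Faulting application name: game.exe
Warning,7/29/2023 9:40:01 PM,Display,Display driver nvlddmkm stopped responding and has successfully recovered.
"""

if __name__ == "__main__":
    print(count_driver_resets(SAMPLE))
```

A count that keeps climbing across days while temperatures stay low would be consistent with the resets not being heat-triggered, though it can't say on its own whether the card, the driver, or the power supply is at fault.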

Zogo
Jul 29, 2003

I haven't kept up on GPUs in many years but I know some of them can have a BIOS update that might help with this kind of thing. Maybe someone else can confirm this for your case.

At this point I'd try another GPU but you might not have any spares around.

Sunblood
Mar 12, 2006

I'm a freakin' blur here!
I saw those exact same event log entries when my EVGA 2070 was dying. This is very likely to be a GPU failure, possibly caused by prolonged exposure to high temps you were seeing before. Even if it's not hitting those temps anymore, the damage is done. Definitely try to source another GPU to test with before you spend hours trying to fix it in software.

When I submitted an RMA (still under warranty; ymmv) EVGA offered advance shipping so I got the replacement card before sending the broken one back. This let me do some comparison testing with both old and new cards, which was nice. I had to pay a deposit for the full price of the card but it was refunded when they received the one I sent back.

neogeo0823
Jul 4, 2007

NO THAT'S NOT ME!!

Sunblood posted:

I saw those exact same event log entries when my EVGA 2070 was dying. This is very likely to be a GPU failure, possibly caused by prolonged exposure to high temps you were seeing before. Even if it's not hitting those temps anymore, the damage is done. Definitely try to source another GPU to test with before you spend hours trying to fix it in software.

When I submitted an RMA (still under warranty; ymmv) EVGA offered advance shipping so I got the replacement card before sending the broken one back. This let me do some comparison testing with both old and new cards, which was nice. I had to pay a deposit for the full price of the card but it was refunded when they received the one I sent back.

Yeah, it seems the card is dying. Sadness, but inevitable. For full disclosure, I bought this GPU second hand from a fellow goon, who sold it to me cheap at the height of the pandemic, after my last rig died and I couldn't find a card that cost less than the cost of the rest of the rig put together. If only this could've waited till like, I dunno, next year or something? I'm about to have to start repaying my student loans, my rent just went up $130/month, my car literally today nearly died because it had a surprise oil leak I didn't know about, and my lovely broken brain has decided that it's had enough of dealing with life while unmedicated. Oh, and my workplace decided that this month there will be no overtime in any shape or form.

I'd say it's been a hell of a month, but this is like the crescendo of the past year, by now.


down1nit
Jan 10, 2004

outlive your enemies
See if that goon has the receipts. You may still be able to get a free replacement (if it's in warranty).
