Vanrushal
Apr 2, 2005

I thought my Spitter was a Jockey!
I built this pc in early March of this year. Hadn't had any issues until what looks like in May (going by Event Viewer) my screen would freeze, kb/m became unresponsive, and the sound continued to play while playing games. Happened at least once in Overwatch where I specifically remember Zarya's beam sound playing on loop with no distortion, and at least twice during Final Fantasy 14 where the bgm would become distorted/stuttery. All cases required that I used the power button to shut the machine off and restart. The freezes also didn't seem to occur during high-intensity situations in-game. I can hear my fans when the temp starts getting up into the 70s or beyond, and that wasn't happening during the freezes. Since then, I stopped playing OW (both because of this issue and for unrelated reasons), continued playing FF14, and was also able to play Elden Ring with no issues. I considered making a thread here then, but Google searches returned (what I thought to be) generic answers without specific solutions and since I wasn't having the issue anymore I chose to ignore it. I admit that I am a colossal dipshit.

Fast forward to now, Remnant 2 came out, I bought it to play with friends, and this past Monday I get the same screen freeze issue. New detail is that I was in Discord with a friend at the time which hadn't happened previously, so I could hear him talking but I couldn't respond since I use push-to-talk and kb wasn't working. This happened after about an hour of playing I think. I restarted, tried again, and the issue happened again within 5 minutes so at that point I stopped playing. Later that day, I tried playing FF14 and got the freeze almost immediately after logging in; tried twice, same thing both times. I later tried playing Heroes of the Storm and that was fine for a few hours, until at one point the screen went black for a few seconds. When it came back on, I was still in the game but apparently I had to reconnect, and there didn't seem to be any graphical issues other than the skill icons were hosed up when on cooldown. Before going to bed I browsed the net for a bit and once over the period of about an hour the screen went black again. I tried watching Always Sunny on Plex before going to bed and got about 20min into the episode when the screen froze and the sound continued, which to this point had not happened outside of playing a game. I let it go for a bit and after maybe 10~15 seconds the colors on the screen changed, like the image was corrupted, and about another 5 seconds after that the sound started distorting, then the sound cut and the screen went black. Turned off the pc and figured I'd deal with it tomorrow.

Tuesday I woke up, turned on the pc, and my monitor told me there's no signal. I removed the graphics card and plugged the monitor into the mb Display Port which worked, but was also at a (relatively) gross low resolution. At least it was functional. I didn't try playing any games, was still feeling upset and not wanting to dig into the problem so I spent the rest of the day doing other stuff and just used the pc to check my email, let my friends know what was going on and watch some youtubes. The thing I thought was weird was that I usually put the pc to sleep when I'm done for the day, but now I was only given the option of Restart or Shut Down which I chalked up to some kind of restriction from using the mb port. I opted to shut down and went to bed. This morning when I turned the pc on, it booted up as normal which I could tell because my mouse lit up and the speakers played the Windows login jingle, but my monitor again told me there was no signal. So I grabbed my old-rear end pc which this new one replaced, connected that to my monitor with HDMI, and hey, that works. Checked my email/daily browsing stuff, things are fine. I decided to swap my old gpu (a GeForce GTX 750 Ti) into the new pc and see what happens, and initially it had the same low-res as using the mb display port, but then I guess it updated itself a couple minutes later so now I'm at the resolution I was previously at with the new pc gpu, and am currently posting with this arrangement.

I have not recently made any changes on this pc. Around the first time the issues cropped up, I tried downloading and installing a chipset driver for the mb, I believe, but the files had some kind of conflict with the Windows Updater. I don't really remember exactly how that turned out, just that it wasn't a simple "install this and it'll do its thing."

I initially thought the problem was with my gpu, but a friend made an off-hand comment about "widespread mobo issues" so then I thought it could be the mb isn't communicating with the gpu right or something. But now having swapped the gpu out for my older one and that working so far, I'm leaning back towards the gpu being the problem. Also confused about why the mb's display port would work fine one day and not the next after a proper shut down/boot. I did do a cursory Google about the screen freeze issue, but everything that's happened in the last few days has led me to thinking I'm better off asking for direct help because I'm not sure where to go from here. I realize that that's a bunch of poo poo to read through and I apologize to and appreciate anyone who takes the time to do so.

One of the suggestions I'd seen from google results was to check Event Viewer. Filtering for Critical and Error just gives me results that say "the system was rebooted/lost power without cleanly shutting down first," which, yes, I did that because it was the only recourse at the time. However, around the time of the black screen during Always Sunny, I got a series of nvlddmkm errors, but only then. They have not appeared around the other freezes.

System Specs

Windows 11 Home 22H2

AMD Ryzen 7 7700X 4.5 GHz 8-Core Processor
Thermalright Peerless Assassin 120 SE 66.17 CFM CPU Cooler
Gigabyte B650 AORUS ELITE AX ATX AM5 Motherboard
Corsair Vengeance 32 GB (2 x 16 GB) DDR5-5600 CL36 Memory
Western Digital Black SN770 2 TB M.2-2280 PCIe 4.0 X4 NVME Solid State Drive (x2)
MSI VENTUS 3X GeForce RTX 4070 Ti 12 GB Video Card
Lian Li LANCOOL 216 ATX Mid Tower Case
ASUS TUF VG27AQ Monitor (originally listed in error as Dell S2721DGF 27.0" 2560 x 1440 165 Hz Monitor)

I am in the USA.

Like I said I don't know what to do next. I'm thinking about reslotting the 4070 back into the new pc and seeing if anything comes of that. I don't know why it would but I also don't know why the display port on my mb would give no signal after giving a signal the previous day. I also considered trying to slot the 4070 into my old pc and testing that but there's just not enough room for it. Also the older mb was budget back in 2015 so I imagine it probably wouldn't play nice with a newer gpu. Any help at all would be greatly appreciated.

Vanrushal fucked around with this message at 03:39 on Aug 4, 2023


Zogo
Jul 29, 2003

Some things to try:

-Make sure W11 is fully updated.

-Make sure you're on the latest motherboard BIOS.

-Run DDU https://www.wagnardsoft.com/forums/viewtopic.php?t=4518 and then install the latest GPU drivers again.


If none of that helps I'd try running https://www.hdsentinel.com/hard_disk_sentinel_trial.php to check your drive health.

If the drive is okay then run https://www.memtest.org/ overnight at some point to check RAM health.

Vanrushal
Apr 2, 2005

I thought my Spitter was a Jockey!
Windows is up to date including optional updates.

Taking your suggestions in the order they were made and I've already hit a snag on updating the BIOS. I DLed the latest version from here. I followed the instructions in this video, but when I got to the point where it asked "Are you sure to update BIOS?" (at 2:07), when I clicked Yes it gave me the message "BIOS ID check error." I noticed that the files I DLed included a pdf so I checked that out and it had an alternate set of instructions as seen here. Followed these (my flash drive was FS3 if it matters, and I had to attempt it twice due to not noticing the first time that I had to include the colon when identifying the drive), and after I typed in "Flash.nsh", the program I guess accepted it and there was a very quick 0-to-100% thing but nothing else happened. No reboot or anything. When I went back into the BIOS settings, the version still read the default F3, as opposed to the F8A included in the DL. This is my first time attempting to update a BIOS ever, and I could use some pointers on what to try next.

As for DDU, should I get the installer or the portable version, or does it really matter?

Thanks for replying!

Just realized my monitor is actually an Asus VG27AQ. Idk if it matters, but I want to be thorough. Editing op to reflect this.

Zogo
Jul 29, 2003

Vanrushal posted:

I DLed the latest version from here.

That doesn't look like the right motherboard as it says AORUS MASTER rather than AORUS ELITE.

I think you want this one. But the question is which rev. you have: rev. 1.0/1.1 or rev. 1.2

https://www.gigabyte.com/Motherboard/B650-AORUS-ELITE-AX-rev-10-11/support#support-dl-bios
https://www.gigabyte.com/Motherboard/B650-AORUS-ELITE-AX-rev-12/support#support-dl-bios

You can find out from the motherboard box or the manual/documents might specify that information.

edit: You probably have 1.0 or 1.1

Vanrushal posted:

As for DDU, should I get the installer or the portable version, or does it really matter?

It doesn't matter that much but the installer would normally be used if you're putting it on your own computer. Portable would go on a flash drive if you were taking it to multiple computers and didn't want to leave extra stuff on the machines.

Zogo fucked around with this message at 04:47 on Aug 4, 2023

Vanrushal
Apr 2, 2005

I thought my Spitter was a Jockey!

Zogo posted:

That doesn't look like the right motherboard as it says AORUS MASTER rather than AORUS ELITE.

I think you want this one. But the question is which rev. you have: rev. 1.0/1.1 or rev. 1.2

https://www.gigabyte.com/Motherboard/B650-AORUS-ELITE-AX-rev-10-11/support#support-dl-bios
https://www.gigabyte.com/Motherboard/B650-AORUS-ELITE-AX-rev-12/support#support-dl-bios

You can find out from the motherboard box or the manual/documents might specify that information.

edit: You probably have 1.0 or 1.1

It doesn't matter that much but the installer would normally be used if you're putting it on your own computer. Portable would go on a flash drive if you were taking it to multiple computers and didn't want to leave extra stuff on the machines.

So what happened was the lady in that video said to look for Master BIOS which I assumed was just like, what it was called generically and not a specific line of mb, and I didn't notice the difference when I got to the DL page. Sorry about that. The good news is that I do indeed have a 1.0 and I was able to update it! Thanks!

I just ran DDU in safe mode, so now I just gotta reupdate the GPU drivers. I'm gonna try to re-slot the 4070 tomorrow, but a couple of questions until then.

1 - I'm good to reinstall Nvidia Experience or w/e it's called, the suite that manages driver updates for me rather than needing to go manually grab the latest one myself?

2 - If the 4070 still won't display, would getting drivers for the 750 Ti be an issue? Do they both work off the same drivers? There hasn't been a problem with using the 750 yet as far as I can tell, but I am clearly not good at this. :sweatdrop:

Zogo
Jul 29, 2003

1. Yeah, that should be okay.

2. They'd use different drivers but that Nvidia program might handle it. If you run into issues then just run DDU again.

Vanrushal
Apr 2, 2005

I thought my Spitter was a Jockey!
I re-slotted the 4070 and still didn't get a signal from it, although I only thought to try DP and not HDMI. When I initially put the 750 back in, there was no signal from either DP or HDMI, but I powered down and restarted again and HDMI is fortunately sending a signal. Looks to me like the issue was the gpu, but still no idea what exactly the issue was. Replacing it so soon sucks but it is what it is, I'm more interested in what if anything I can do to prevent this from happening in the future. That said I'm not rushing to replace it right away, gonna try some games that the 750 can handle and see if the issue persists.

Ran the free version of Hard Disk Sentinel, and it says both my drives have 100% health. Ran Short self-tests on both and they came back with no errors. Will run Memtest tonight.

Zogo
Jul 29, 2003

You could also try different DP and HDMI cables if you have spares.

Vanrushal posted:

Looks to me like the issue was the gpu, but still no idea what exactly the issue was. Replacing it so soon sucks but it is what it is, I'm more interested in what if anything I can do to prevent this from happening in the future.

GPU failure usually happens due to a heat issue or a failing PSU causing damage to hardware.

I noticed in the OP that you didn't mention what PSU you're using. If it's an older one then that could potentially be an issue.

Vanrushal
Apr 2, 2005

I thought my Spitter was a Jockey!
Dunno why I left the psu out of my list, sorry about that. I got a MSI MPG A850G PCIE5 850 W 80+ new when I built it.

Vanrushal fucked around with this message at 20:19 on Aug 5, 2023

Vanrushal
Apr 2, 2005

I thought my Spitter was a Jockey!
Memtest results came back with 0 errors over 9 passes, although I let it go for a few more tests after waking up, still with 0 errors. Safe to assume at this point that it was indeed the gpu? I'm still curious as to why the display signal from the mb was fine one day and hosed the next after a shut down. I think I'm going to try and find a local pc repair place and see if they'll take a look at it and the gpu, but at least I now have some tools and a better idea of how to gauge my pc's health. Idk if there's anything else you want to suggest, but thanks a bunch for your help Zogo!

Also I was able to play FF14 (at 30 fps and the lowest settings. I forgot what things were like before the 4070 :() and HotS (also at lower settings but not as gross) for several hours each last night with no issues, so that's something. As far as replacing the gpu, I bought it directly from MSI, but neither the emails associated with the purchase nor my account info mention anything about a warranty that I can find. Would it be worth the time to try and contact customer support and see what they have to say about this?

Zogo
Jul 29, 2003

Vanrushal posted:

Safe to assume at this point that it was indeed the gpu?

It's the best bet at the moment.

Vanrushal posted:

I'm still curious as to why the display signal from the mb was fine one day and hosed the next after a shut down. I think I'm going to try and find a local pc repair place and see if they'll take a look at it and the gpu, but at least I now have some tools and a better idea of how to gauge my pc's health.

Unreliable cable is a possibility.

Or sometimes having a video card connected to the motherboard will prevent onboard video from working (not sure if you always had the GPUs out at the time).

Vanrushal posted:

Would it be worth the time to try and contact customer support and see what they have to say about this?

Yeah, since you've only had it for a few months.

Vanrushal
Apr 2, 2005

I thought my Spitter was a Jockey!
I ended up RMAing the presumed faulty 4070 and received a replacement a couple days into September, got it installed and updated drivers, and things were going fine with regular uninterrupted multiple-hour gaming sessions. Yesterday I played FF14 for about an hour, had alt+tabbed to Firefox while the game was running, and after several minutes I noticed that I could move my mouse but nothing responded to clicking, and shortly afterwards the pointer stopped moving and the keyboard stopped responding. Looked like I was having a similar if not the same issue again. I decided not to try any games for the remainder of the night, but twice while watching youtube the screen froze while the sound continued uninterrupted, then recovered several seconds later. The first time this happened, I booted to safe mode, ran DDU on my nvidia drivers, rebooted back into normal mode and reinstalled the latest driver. The second time it happened I used DDU in safe mode again but this time I installed the previous gpu driver (there had been an update about a week ago which I had installed then), and I had no further issues for the rest of the evening.

Today things went fine, no issues with youtube, but a few minutes into FF14 and I get the frozen screen, no kb/m inputs. The only difference between today and what was happening in April and August was that the music continued normally, it wasn't slowed down and garbled as it had been previously. I took a look at event viewer yesterday and today, and as before the only Critical logs were telling me that the pc lost power without a proper shut down. In Error there are multiple nvlddmkm logs that read

quote:

The description for Event ID 0 from source nvlddmkm cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

\Device\Video9
Restarting TDR occurred on GPUID:100

The message resource is present but the message was not found in the message table

with the "Restarting" text interchanged with Reset, Resetting, UCodeReset, or "Error occurred on GPUID:100" depending on which event I look at. There was also a different category of nvlddmkm logs that read

quote:

The description for Event ID 14 from source nvlddmkm cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

\Device\Video9
00000000 00000000 00000000 00000000 00620030 00300000 10000000 00000000

The message resource is present but the message was not found in the message table

The main difference being (as far as I can tell) Event ID 0 and Event ID 14. I googled this which led me to some nvidia forums posts (some from 8 years ago) and the consensus seemed to be to roll back the driver which is what I ended up doing. All of those nvlddmkm events were logged from yesterday.
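One rough way to check which logged events actually line up with a freeze, rather than eyeballing Event Viewer, is sketched below in Python (standard library only). It counts entries by source and Event ID in the few minutes before a freeze. The `LogEntry` type, the `events_near` helper, the five-minute window, and every timestamp in the sample are made up for illustration; only the source names and Event IDs come from this post, and real entries would have to be copied in from an Event Viewer export.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class LogEntry:
    """One Event Viewer row (hypothetical shape, not an official format)."""
    when: datetime
    source: str
    event_id: int


def events_near(entries, freeze_time, window=timedelta(minutes=5)):
    """Count (source, event_id) pairs logged in the window before a freeze."""
    start = freeze_time - window
    hits = [e for e in entries if start <= e.when <= freeze_time]
    return Counter((e.source, e.event_id) for e in hits)


# Illustrative sample only: the timestamps are invented placeholders.
sample = [
    LogEntry(datetime(2023, 1, 1, 20, 58), "nvlddmkm", 0),
    LogEntry(datetime(2023, 1, 1, 20, 59), "nvlddmkm", 14),
    LogEntry(datetime(2023, 1, 1, 20, 59), "nvlddmkm", 0),
    LogEntry(datetime(2023, 1, 1, 12, 0), "NVIDIA OpenGL Driver", 2),
]

freeze = datetime(2023, 1, 1, 21, 0)  # hypothetical freeze time
near = events_near(sample, freeze)
for (source, event_id), count in near.most_common():
    print(f"{source} (Event ID {event_id}): {count} within 5 min of freeze")
```

Events that show up all day regardless of freezes would appear in the full log but drop out of `near`, which is the same reasoning as ruling out a log that "occurred multiple times previously and without any freezes."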

Today's Errors are logged from DeviceSetupManager and they read

quote:

Metadata staging failed, result=0x80004005 for container '{ABD69646-4FC8-52D9-A589-269589DA63F0}'

What I gather from googling is that this has something to do with usb devices which might be related to why the kb/m are unresponsive when my pc freezes but shouldn't be what's actually causing the freeze? Plus there's a ton of them in the Error log and there haven't been any freezes related to those occurrences. They are the only Errors that were logged at the time when the freeze today happened however.
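For what it's worth, the `result=0x80004005` in that message is the generic Windows HRESULT `E_FAIL` ("unspecified error"), so the number by itself doesn't point at a cause. A minimal Python sketch of how an HRESULT splits into its standard bit fields is below; the `decode_hresult` helper name is just something picked for this example.

```python
def decode_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT into its standard bit fields."""
    value &= 0xFFFFFFFF  # treat the input as an unsigned 32-bit number
    return {
        "failure": bool(value >> 31),       # bit 31: severity (1 = failure)
        "facility": (value >> 16) & 0x7FF,  # bits 16-26: facility code
        "code": value & 0xFFFF,             # bits 0-15: error code
    }


result = decode_hresult(0x80004005)
print(result["failure"], result["facility"], hex(result["code"]))
```

Here the failure bit is set, the facility is 0 (the generic "null" facility), and the code is 0x4005, which together make up `E_FAIL`. That fits the observation that these entries show up constantly without a freeze attached.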

The last thing that caught my eye was in Warning from NVIDIA Open GL Driver, and Event ID 2 which states

quote:

Ran out of memory

This log too has occurred multiple times previously and without any freezes so it doesn't seem like the culprit but seems somewhat worrying.

A friend suggested that I run a memory test which I passed on last night as I'd run it recently at Zogo's suggestion and it came back good, but I'm considering doing it again tonight just to make sure. Windows is updated as are my other drivers with the exception of the gpu which I rolled back last night. I didn't end up taking the pc to a repair place since the replacement gpu was working, but I think I'm going to need someone else to actually look at the thing at this point. If there's anything else I can do with the info provided, I'm open to and appreciative of suggestions. Thanks!

Zogo
Jul 29, 2003

You could try using only one stick of RAM at a time and seeing if the error goes away.

Those errors could point to many different issues so you might want to take it somewhere if you're tired of the troubleshooting.

down1nit
Jan 10, 2004

outlive your enemies
Yeah, it's really pointing to video card again, but it could be, somehow, either system or VRAM still. Could be memory controller in the CPU too. BIOS update plus adding microvolts to ram voltage maybe? Turn off XMP/Overclocking on RAM run at stock DDR speeds? Fun fact most if not all RAM is natively overclocked, but you can choose to run it at BASE DDR SPEEDS (https://www.crucial.com/support/memory-speeds-compatability 2400 would be the base speed for DDR4 for instance) to absolutely ensure that your ram is stable at that speed *at least.*

I'd love to work on this if you're near SF bay area. This does sound like a mid to pro level job.
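To put rough numbers on what underclocking RAM costs, here is a small Python sketch using the kit from the OP (DDR5-5600 CL36). The 4800 MT/s figure is an assumption used as a stand-in for a DDR5 stock speed (the 2400 example above is for DDR4), the function names are made up for this example, and the bandwidth figure is a theoretical peak per 64-bit DIMM, not something measured on this system.

```python
def first_word_latency_ns(cas_latency: int, transfer_rate_mts: int) -> float:
    """CAS latency in nanoseconds.

    DDR makes two transfers per clock, so one clock lasts
    2000 / (MT/s) nanoseconds.
    """
    return cas_latency * 2000 / transfer_rate_mts


def peak_bandwidth_gbs(transfer_rate_mts: int, bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s for a 64-bit (8-byte) data bus."""
    return transfer_rate_mts * bus_bytes / 1000


rated = 5600  # from the OP's spec list
stock = 4800  # assumed DDR5 stock speed, for illustration only

print(f"CL36 at {rated} MT/s: {first_word_latency_ns(36, rated):.2f} ns")
print(f"Peak at {rated} MT/s: {peak_bandwidth_gbs(rated):.1f} GB/s")
print(f"Peak at {stock} MT/s: {peak_bandwidth_gbs(stock):.1f} GB/s")
print(f"Bandwidth given up: {(1 - stock / rated) * 100:.0f}%")
```

Under those assumptions the drop is a modest slice of peak bandwidth, which is why running at the slower speed is a cheap way to rule RAM stability in or out while troubleshooting.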

Vanrushal
Apr 2, 2005

I thought my Spitter was a Jockey!
Just got my pc back today from a repair place. Initially they said they weren't successful replicating the freezing issue, but since Monday they said they were getting it pretty consistently and ruled it as the gpu being bad. Again. At this point I guess my options are to try and RMA it again, or try to get a refund. I'm leaning towards the latter. The guy who worked on my machine suggested I try a 3080 as a comparable, cheaper, and tried and tested alternative to the 4070, so I think I'm going to look into one of those. On the bright side they reported no faults with the rest of my setup (outside my cpu cooler staying heated for 30~60s when it should be cooling down immediately and having them replace it), which was a relief to hear as building the thing initially was very frustrating and I thought it'd turn out that I'd hosed something up. For whatever my uninformed word is worth, the 4070 ti is not great and I would not recommend.

down1nit posted:

Yeah, it's really pointing to video card again, but it could be, somehow, either system or VRAM still. Could be memory controller in the CPU too. BIOS update plus adding microvolts to ram voltage maybe? Turn off XMP/Overclocking on RAM run at stock DDR speeds? Fun fact most if not all RAM is natively overclocked, but you can choose to run it at BASE DDR SPEEDS (https://www.crucial.com/support/memory-speeds-compatability 2400 would be the base speed for DDR4 for instance) to absolutely ensure that your ram is stable at that speed *at least.*

I'd love to work on this if you're near SF bay area. This does sound like a mid to pro level job.

Unfortunately I am both in Nevada and I'd already taken it into the shop when you'd posted this, but thank you for the suggestions and the offer. I have no interest myself in messing with BIOS stuff, I just want the thing to work without having to tweak it, but I'll keep that overclock info in mind in case I ever need to deal with that. :)


down1nit
Jan 10, 2004

outlive your enemies
Sounds like you went to a good shop. That's a good point about the 3080, and it's impressive they called it out. Love that they caught a slow ramp up on your cpu temps; that can and will help cpu temps under boost and could have potentially fixed the problem. Video cards die like this all the time though.

Sorry you had to experience the dumb side of pc gaming. Manufacturers' warranties are usually good enough; luck for most people works out after one RMA. There is always going to be that one rare, super unlucky person though.

Please note it's not over but underclocking I was talking about. You may have glazed over that bit? I dunno.
Nope out of the bios stuff if you like, but it could have been the reason it doesn't "just work" lol. It's not but it could have been.

Underclocking makes things more stable and should hardly be discounted as a troubleshooting method

down1nit fucked around with this message at 09:00 on Nov 3, 2023
