gradenko_2000
Oct 5, 2010

HELL SERPENT
Lipstick Apathy
5000-series APU reviews are coming out.

GN did the 5600G and my takeaway from the video is that:

* the pure CPU part is decidedly slower than the 5600X and the i5-11400F (keeping in mind that GN tests the i5-11400F within strict Intel guidance). It's not even that much faster than the Ryzen 5 3600.

* the integrated GPU is quite strong (relatively speaking), with the APU able to exceed GT 1030 performance in some cases

* the MSRP is 260 USD, which feels like it prices the chip into a niche:

- if you already have a GPU to work with, then a straight CPU is the better buy, whether a 3600 or an Intel i5, and cheaper, in the 160 to 200 USD range
- if you do not yet have a GPU to work with, then you can get this and still play some games... but only if you can't get even a bottom-of-the-barrel discrete GPU for the difference

Cygni
Nov 12, 2005

raring to post

I was kinda disappointed in the GN review honestly. Basically no comparisons to past APUs, or discussion of the encoding/HTPC/SFF side of life. And honestly not that many comparisons in general on the iGPU either. I get that most of the attention right now is going to be on "can this replace a dGPU in the short term" cause of the shortages, but I would have liked to see just... more, all around.

I am passively in the market for something to replace an aging Ivy Bridge 3770S in my HTPC/home server, mostly because the platform is so old. Performance is great, but no bootable M.2 support means I gotta burn a SATA port on a boot drive, and I'm trying to maximize redundant bulk storage, so I'm greedy for more ports. I think I'll probably keep waiting. Bulk storage is so expensive right now anyway, cause of Chia dogshit, that it's probably not the time.

NewFatMike
Jun 11, 2015

I guess Serve the Home are the only guys who really consistently touch on encoding and such, huh?

avoid doorways
Jun 6, 2010

'twas brillig
Gun Saliva

Lord Stimperor posted:

It's an SSD. The SMART status is okay. I've run a short smartctl test and that passed as well. I'll see if I can run the long one while my Windows USB stick finishes being prepared.


e: smartctl says that test will complete in 85 minutes so I'm probably not completing it tonight lol

Is your SSD firmware up to date?

I have a Samsung Evo 840 that worked fine on Windows 8.1 but was causing bluescreens every ~10 minutes on Windows 10 until I updated the firmware.

Seamonster
Apr 30, 2007

IMMER SIEGREICH

gradenko_2000 posted:

5000-series APU reviews are coming out.


DDR5 and Navi iGPUs can't get here fast enough!

SwissArmyDruid
Feb 14, 2014

by sebmojo

Seamonster posted:

DDR5 and Navi iGPUs can't get here fast enough!

What I been saying.

Spacedad
Sep 11, 2001

We go play orbital catch around the curvature of the earth, son.
Those new 5000-series APUs seem damn good for what they are, based on early tests.

https://www.youtube.com/watch?v=KycNI1FxIPc

I kind of would love to see if I could build a viable super-compact art-and-light gaming pc around a 5600g or 5700g, without a gpu.

Lord Stimperor
Jun 13, 2018

I'm a lovable meme.

Phew, I'm close to throwing in the towel on this one.


Lord Stimperor posted:

Computer from hell


CaptainSarcastic posted:

Is it a SATA drive? I'd consider swapping cables and/or ports on it.

Assuming your Windows install media is okay, it would be interesting to see if a Linux install on the same drive acted as badly. If Ubuntu is running happily off a USB drive then it would be worthwhile to see if also ran happily if installed on the same target disk you've been using for Windows.

Licarn posted:

Is your SSD firmware up to date?

I have a Samsung Evo 840 that worked fine on Windows 8.1 but was causing bluescreens every ~10 minutes on Windows 10 until I updated the firmware.

On the SSD and SATA: Did not manage to upgrade the firmware (yet). Windows didn't keep working long enough to download the firmware and flash it, and the stick I made in Ubuntu wouldn't boot properly.

However, I was able to a) swap the SATA ports, b) wipe the entire disk and create new partition tables, etc., and c) try out another SSD. None of these helped, unfortunately. So for now I'm putting the SSD(s) at the back of the suspect list.



Negative_Kittens posted:

I had a similar issue with my system, a Ryzen 5600X, where my RAM's XMP profile caused memory errors, and I had to set the SOC voltage to 1.1 V and the DIMM voltage to 1.38 V to get the errors to disappear. To be fair, I didn't check the QVL before buying the RAM, because it was on sale and I wasn't gonna pass it up (2x16GB Corsair Vengeance LPX for $150).

(...) a setting in the BIOS called "Power Supply Idle Control", and the default/auto setting sets this option to "Low Current Idle", but the other option, "Typical Current Idle"

Paul MaudDib posted:

try disabling power c-states or even seeing if there's a way to disable whatever the AMD version of speedstep is so that it just runs max clocks all the time.


Okay, that's several suggestions.

1. I was able to find the Power Supply Idle Control setting, but adjusting it to Typical Current Idle didn't help
2. I was able to disable c-states, and that, too, didn't help
3. "SOC voltage to 1.1 and DIMM voltage to 1.38" -- I couldn't figure out exactly which settings these are, got too tired, but I'll give this one a try.


I thought that these measures would increase stability. But that may just be wishful thinking, seeing that it started crashing as usual again after a few minutes. In fact I think the crashing is getting more intense over time; there are instances now where Windows stays up for less than a minute before it bluescreens.


Mentally I'm getting ready to just replace components until the problem is fixed. But I don't know whether I should start with the CPU, motherboard, or RAM, and I was actually expecting to get at least another 1 or 2 good years out of this system. Gives me a bit of a headache.


redeyes posted:

Try upping voltage a hair.

Can you elaborate on which voltage, and how much "a hair" would be?

(installs Ubuntu on the wiped disk out of frustration)

Lord Stimperor fucked around with this message at 19:25 on Jun 18, 2021

hobbesmaster
Jan 28, 2008

Spacedad posted:

Those new 5000-series APUs seem damn good for what they are, based on early tests.

https://www.youtube.com/watch?v=KycNI1FxIPc

I kind of would love to see if I could build a viable super-compact art-and-light gaming pc around a 5600g or 5700g, without a gpu.

It's roughly an R5 3600 + a GT 1030, so if that's fine for your art-and-light gaming...

Xaris
Jul 25, 2006

Lucky there's a family guy
Lucky there's a man who positively can do
All the things that make us
Laugh and cry

Lord Stimperor posted:

Phew, I'm close to throwing in the towel on this one.

I thought that these measures would increase stability. But that may just be wishful thinking, seeing that it started crashing as usual again after a few minutes. In fact I think the crashing is getting more intense over time; there are instances now where Windows stays up for less than a minute before it bluescreens.


Mentally I'm getting ready to just replace components until the problem is fixed. But I don't know whether I should start with the CPU, motherboard, or RAM, and I was actually expecting to get at least another 1 or 2 good years out of this system. Gives me a bit of a headache.

Woof, that's a hard one. It won't boot an Ubuntu stick, huh?

if you're okay with going a little "fuck Jeff Bezos", you could try replacing parts one or two at a time from Amazon and returning them if they don't fix it. I don't know what I'd start with either though.

As far as voltages, yeah you can try it:
- VSOC (millivolts) = 1100 (1.1 V). This is often under something like "AMD OC Menu", but it depends on the manufacturer. It controls voltage to the IMC (Integrated Memory Controller).
- VDRAM (A/B) = 1.4 V. This is often on a main page; for me it's just under "Tweaker" on a Gigabyte mobo. Controls voltage to the RAM.
- Loadline Calibration (LLC): set this to "High / Level 3". There's sometimes LLC CPU and LLC SOC; for me this is under Tweaker -> Advanced CPU Settings or something like that. This controls voltage droop.

Another thing to try is increasing the voltage offset to the CPU using Curve Optimizer. This is under AMD OC Menu -> Precision Boost/Curve Optimizer tab. Basically you can set a voltage offset either all-core or per-core. Could try a +5 all-core offset just as an initial test.

Harik
Sep 9, 2001

From the hard streets of Moscow
First dog to touch the stars


Plaster Town Cop
ouch, that's a mess. Best of luck.

I'm annoyed by a system that crashes once every other month. Often enough to be annoying, not often enough to be reliably tracked down. If anything you're in a better position in that you'll know immediately when you find it.


Unrelated: what's the reason chiplet GPUs are so difficult to pull off? Pairing separate chips (SLI, Crossfire, 2x on one board, etc.) has extreme bandwidth limitations, so stitching work together takes effort. What's the hard problem with doing it on the same package? They're already massively-duplicated independent compute cores running in parallel, and memory bandwidth already needs to scale with that count to keep them fed. What's the hard unsolved problem with GPUs that means they can't do the Ryzen core complex/Infinity Fabric thing?

karoshi
Nov 4, 2008

"Can somebody mspaint eyes on the steaming packages? TIA" yeah well fuck you too buddy, this is the best you're gonna get. Is this even "work-safe"? Let's find out!

Harik posted:

Unrelated: what's the reason chiplet GPUs are so difficult to pull off? Pairing separate chips (SLI, Crossfire, 2x on one board, etc.) has extreme bandwidth limitations, so stitching work together takes effort. What's the hard problem with doing it on the same package? They're already massively-duplicated independent compute cores running in parallel, and memory bandwidth already needs to scale with that count to keep them fed. What's the hard unsolved problem with GPUs that means they can't do the Ryzen core complex/Infinity Fabric thing?

I'm not an EE, so I'm just speculating:

I think it's the orders-of-magnitude higher bandwidth. Nominally, DRAM bandwidth is about an OOM higher, like 50GB/s -> 500GB/s, but the use patterns are different. CPUs are happy playing inside their little caches, while GPUs will just saturate any BW you give them until compute is the bottleneck. Reuse is less prominent; it's reading coherent buffer regions (textures/g-buffers) in and spitting out coherent memory regions (output buffers). So your IF needs to be an OOM bigger, as your chiplets might want to access all the other caches in other chiplets at 500GB/s, and there goes your power budget, and your problems grow at O(N^2) with the number of chiplets.
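
To put toy numbers on that O(N^2) claim (illustrative arithmetic only, not real chip figures): if each of N chiplets wants its full local bandwidth B to every other chiplet, the all-to-all link count and aggregate traffic are

    \text{links}(N) = \binom{N}{2} = \frac{N(N-1)}{2}, \qquad \text{traffic} \approx \frac{N(N-1)}{2}\,B

so N = 4 chiplets at B = 500 GB/s is already 6 x 500 GB/s = 3 TB/s crossing the substrate, versus a single 500 GB/s link for a two-chiplet CPU-style layout.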

For a recent GPU launch AMD commented on their problems routing the central 2048-bit-wide interconnect for all the CUs, and how it was a hotspot. I think if you try to do the same N times over a substrate you're gonna melt it, maybe?

HBM is 1024-bit wide, at lower clocks, and some GPUs run a few of them. Maybe it's the cache coherency traffic for 500GB/s that's a bit too much. Anyway, the next AMD compute-only thingy is dual-chiplet, so there was at least room for little a bus on the substrate.

Harik
Sep 9, 2001

From the hard streets of Moscow
First dog to touch the stars


Plaster Town Cop

karoshi posted:

I'm not an EE, so I'm just speculating:

I think it's the orders-of-magnitude higher bandwidth. Nominally, DRAM bandwidth is about an OOM higher, like 50GB/s -> 500GB/s, but the use patterns are different. CPUs are happy playing inside their little caches, while GPUs will just saturate any BW you give them until compute is the bottleneck. Reuse is less prominent; it's reading coherent buffer regions (textures/g-buffers) in and spitting out coherent memory regions (output buffers). So your IF needs to be an OOM bigger, as your chiplets might want to access all the other caches in other chiplets at 500GB/s, and there goes your power budget, and your problems grow at O(N^2) with the number of chiplets.

For a recent GPU launch AMD commented on their problems routing the central 2048-bit-wide interconnect for all the CUs, and how it was a hotspot. I think if you try to do the same N times over a substrate you're gonna melt it, maybe?

HBM is 1024-bit wide, at lower clocks, and some GPUs run a few of them. Maybe it's the cache coherency traffic for 500GB/s that's a bit too much. Anyway, the next AMD compute-only thingy is dual-chiplet, so there was at least room for little a bus on the substrate.

Thanks.

So GPU compute units can trivially fit any kernel you throw at them in a modern-sized icache, but in terms of actually doing anything useful they're voraciously consuming bandwidth for textures/framebuffer. Sharing that bandwidth across any sort of interconnect that's not on the same monolithic die is much harder than the lower requirements for CPU.

Does that sound about right?

karoshi
Nov 4, 2008

"Can somebody mspaint eyes on the steaming packages? TIA" yeah well fuck you too buddy, this is the best you're gonna get. Is this even "work-safe"? Let's find out!

Harik posted:

Thanks.

So GPU compute units can trivially fit any kernel you throw at them in a modern-sized icache, but in terms of actually doing anything useful they're voraciously consuming bandwidth for textures/framebuffer. Sharing that bandwidth across any sort of interconnect that's not on the same monolithic die is much harder than the lower requirements for CPU.

Does that sound about right?

Yep, that was my point. But I'm a software guy. There was a discussion on this on Beyond3D like 5 years ago, but I don't remember the points made back then. There were some very fine posters on the subject of GPUs on that site.

Anyway, imagine pushing 10x the interconnect of a 3-chiplet Ryzen onto a substrate. Ryzen has high standby power (~10 watts, IIRC) and some people say it's the IF; was that ever proven?

Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE

Harik posted:

Thanks.

So GPU compute units can trivially fit any kernel you throw at them in a modern-sized icache, but in terms of actually doing anything useful they're voraciously consuming bandwidth for textures/framebuffer. Sharing that bandwidth across any sort of interconnect that's not on the same monolithic die is much harder than the lower requirements for CPU.

Does that sound about right?

no - I will disagree with this.

It's true that GPUs are basically a bandwidth-oriented model designed around the engineering compromises that entails. Basically you do everything possible to re-use as much bandwidth as possible - you try to take advantage of locality of memory access (reads to the same area / contiguous areas can be broadcast to multiple units in a warp / maybe in a SM as a whole) and so on, and the architecture as a whole is designed as essentially a "super-SMT" (or more precisely a kind of barrel processor) where there are lots of warps running at a time so that any one warp can be "slept" and "swapped out" while its memory accesses complete, which gives you a way to "hide" the latency that is inherent to GDDR memory.

however, they absolutely will not "trivially fit any kernel you throw at them". Register pressure and L1 cache are very tight in a GPU architecture. Registers can be dynamically shared across different warps (so you can run more warps if you're not using a lot of registers) and will be spilled to L1 and eventually to VRAM if needed - however register pressure and cache pressure are basically always a limiting factor. And there are many things competing for L1 cache space - constants cache, texture cache, etc. "pinning" certain things into cache is a very effective acceleration technique for stuff that will be frequently accessed, because even with latency hiding, there is still latency. So generally you always want to be ruthless about minimizing register usage and L1 cache usage in general because there is a real performance cost to that. If you reduce register/cache usage, you can have more warps resident on the processor at a time (without being swapped out) which will provide overall higher performance.
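
To make the register/occupancy trade-off concrete, here's a minimal CUDA sketch (my own illustration, not anything from NVIDIA's docs): __launch_bounds__ asks the compiler to cap per-thread register usage so that at least a given number of blocks can stay resident per SM, which is exactly the "more warps resident to hide latency" effect described above.

    #include <cstdio>
    #include <cuda_runtime.h>

    // Hypothetical kernel. __launch_bounds__(256, 4) promises at most 256
    // threads per block and asks for at least 4 resident blocks per SM,
    // so the compiler will cap register use (spilling if it must) --
    // trading per-thread speed for more warps to hide memory latency.
    __global__ void __launch_bounds__(256, 4)
    saxpy(int n, float a, const float *x, float *y)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            y[i] = a * x[i] + y[i];  // one element per thread
    }

    int main()
    {
        const int n = 1 << 20;
        float *x, *y;
        cudaMallocManaged(&x, n * sizeof(float));
        cudaMallocManaged(&y, n * sizeof(float));
        for (int i = 0; i < n; i++) { x[i] = 1.0f; y[i] = 2.0f; }

        saxpy<<<(n + 255) / 256, 256>>>(n, 3.0f, x, y);
        cudaDeviceSynchronize();

        printf("y[0] = %.1f\n", y[0]);  // expect 5.0
        cudaFree(x);
        cudaFree(y);
        return 0;
    }

Building with nvcc -Xptxas -v prints the per-thread register count, so you can watch it drop as you tighten the bounds.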

I'm also not sure there's much of an "instruction cache" in the way you think of it on a CPU. Is there a cache where the SM holds the instructions for the kernel that it's currently running? Almost certainly. However, there is very little in the way of a "front end" in the CPU sense on the GPU itself. The program as a whole is immediately converted from an intermediate "bytecode" representation (it's called "PTX" on CUDA) to the actual native instructions for the actual processor itself, by the runtime, then dumped over to the GPU. There is very little pre-processing done on the SM itself, and I don't think there's anything like branch prediction or anything like that. This is very much a model where the runtime emits code that is custom crafted to the processor, the processor naively does whatever the instruction stream says, with some fixed and relatively predictable behavior. If the instruction stream is suboptimal, oh well. And so the programmer is expected to emit code that runs well on the runtime/hardware.

this is old but still broadly reflects the overall design and optimization of GPU compute kernels: http://developer.download.nvidia.com/GTC/PDF/1083_Wang.pdf

general terminology: a SM (streaming multiprocessor) is the thing that actually processes a stream of instructions. It executes a bank of threads in lockstep, called a "warp" (so you can think of it as an analogy to AVX: SM is a core, a warp is a vector, and a thread is a lane in a vector). A group of SMs are aligned into a larger physical grouping that shares execution resources (cache, etc) called an SMX (I guess think of this like a CCX on a Ryzen processor - this is where cache lives and registers are allocated/etc, but the cores are independent within the CCX). A group of warps is arranged into a multidimensional (2d, 3d, etc - but could be higher) "space" of work called a "grid", usually you have a much larger grid than you actually have warps and you iterate the "grid" across the workspace. A "grid" is basically a problem configuration for a GPU compute program, the program is called a "kernel" and "kernel" is often used to refer to a specific invocation/instance of a gpu compute program.
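
In code, that "grid bigger than the machine" idea is the standard CUDA grid-stride loop; a minimal kernel sketch (assuming the same managed-memory host setup as the sketch above):

    __global__ void scale(int n, float a, float *x)
    {
        // The grid describes the whole problem space; each thread starts
        // at its global index and strides by the total thread count, so
        // the kernel is correct no matter how small the launched grid is.
        for (int i = blockIdx.x * blockDim.x + threadIdx.x;
             i < n;
             i += blockDim.x * gridDim.x)
            x[i] *= a;
    }

    // launched with far fewer threads than elements, e.g.:
    //   scale<<<128, 256>>>(n, 2.0f, x);  // 32768 threads sweep n items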

(if any of that didn't make sense feel free to ask)

As far as MCMs - the question has always been architecture, what is being shared between chiplets. In general the GPU programming model is specifically designed to discourage sharing intermediate results between different parts of the kernel. It is possible within units of a warp (since they all execute in lockstep on a single processor - these amount to basically moving data around between the "lanes" of the "AVX unit"), however it is not possible to communicate data between multiple kernels or between SMXs without Undefined Behavior (to be clear it's possible and it's done, but it's Undefined and not necessarily guaranteed to ever terminate, you are Breaking The Rules and relying on Implementation-Specific Behavior). So in general programs are usually structured to push intermediate results back to global memory (which is legal - as long as something else on another SMX doesn't require it to be in memory before the kernel terminates), and to process data with multiple kernel invocations (wait for it to finish, then launch the next "pass"). So in general access between chiplets should often be fairly minimal, other than as required for global memory accesses. Basically if the memory location only lives on another chiplet, then you have to communicate data over there, potentially a lot of data. But if you structure the program so that almost all the data you want lives on your local chip (including by duplication of input data if needed) then that case shouldn't occur all that often. But in practice it may (especially in graphics).
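
The legal intra-warp lane-to-lane sharing mentioned above is exposed in CUDA as warp shuffle intrinsics; a small self-contained sketch of a warp-level sum (again my own illustration):

    #include <cstdio>
    #include <cuda_runtime.h>

    // Each of the 32 lanes contributes one value; __shfl_down_sync moves
    // data register-to-register between lanes, with no shared or global
    // memory involved. After the loop, lane 0 holds the whole warp's sum.
    __global__ void warp_sum(const float *in, float *out)
    {
        float v = in[threadIdx.x];
        for (int offset = 16; offset > 0; offset >>= 1)
            v += __shfl_down_sync(0xffffffffu, v, offset);
        if (threadIdx.x == 0)
            *out = v;
    }

    int main()
    {
        float *in, *out;
        cudaMallocManaged(&in, 32 * sizeof(float));
        cudaMallocManaged(&out, sizeof(float));
        for (int i = 0; i < 32; i++) in[i] = (float)i;  // 0+1+...+31 = 496
        warp_sum<<<1, 32>>>(in, out);                   // exactly one warp
        cudaDeviceSynchronize();
        printf("warp total = %.0f\n", *out);            // expect 496
        cudaFree(in);
        cudaFree(out);
        return 0;
    }

Anything wider than a warp has to go through shared memory (within a block) or out to global memory between kernel launches, which is exactly the multi-pass structure described above.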

The easiest theoretical architecture is likely to be if multiple GPU chiplets are all clients of one giant memory controller or a giant memory crossbar that lets any chiplet access any data at super high speed, so from the chip perspective it's all flat memory and you just have "near SMXs" and "far SMXs" (neither of which you should be trying to communicate with - that's undefined). However, that obviously means you have to design a controller which can switch at terabytes per second, which is hard/expensive. The other way is you have memory attached to each GPU chiplet, and when you want something that is not on your chiplet you have to ask for it, over a relatively high-speed interconnect. But if you cleverly design your program so that you have data locality - most of the data you want lives on your memory controller and you're only going off-chip for a small amount of data - you are potentially only moving a much smaller amount of data between chips. But it's also less general and harder to write code for. And in particular, graphics code appears to be particularly hard to write for that kind of architecture.
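
As a rough software analogy for the second design (my own sketch using today's multi-GPU CUDA API, not any announced MCM product): duplicate read-only input on every device so most reads stay local, and enable peer access only for the occasional remote read that has to cross the interconnect.

    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void touch(const float *src, float *dst)
    {
        *dst = *src * 2.0f;  // a single read, local or remote
    }

    int main()
    {
        int n = 0;
        cudaGetDeviceCount(&n);
        if (n < 2) { printf("need two GPUs for this demo\n"); return 0; }

        float host = 21.0f, *buf[2];
        for (int d = 0; d < 2; d++) {  // duplicated "input data" per device
            cudaSetDevice(d);
            cudaMalloc(&buf[d], sizeof(float));
            cudaMemcpy(buf[d], &host, sizeof(float), cudaMemcpyHostToDevice);
        }

        cudaSetDevice(0);
        cudaDeviceEnablePeerAccess(1, 0);  // open the "interconnect"

        float *out;
        cudaMalloc(&out, sizeof(float));
        touch<<<1, 1>>>(buf[0], out);  // local read: fast path
        touch<<<1, 1>>>(buf[1], out);  // remote read: crosses the link
        cudaDeviceSynchronize();

        float result = 0.0f;
        cudaMemcpy(&result, out, sizeof(float), cudaMemcpyDeviceToHost);
        printf("result = %.0f\n", result);  // expect 42
        return 0;
    }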

So far all the extant multi-GPU architectures appear to use the latter (memory attached to each chiplet, chiplets interconnected). NVIDIA has the DGX-1/DGX-2 servers, which effectively are a crude multi-GPU implementation: they have a pair of NVSwitch chips, each with about as many transistors as a hexacore CPU, that can switch at 900 GB/s, and if a GPU wants something from another GPU it gets it over that, so to a program it can all appear to be Just One Big GPU. AMD appears to be going for this with their first MCM implementation as well; it's two chiplets and they will be attached by an unknown Big Fast Interconnect. Probably Infinity Fabric, but a really wide/fast link? We don't really know specifics yet.

edit: and yes as mentioned moving all this data around takes a lot of power!

Paul MaudDib fucked around with this message at 01:55 on Jun 19, 2021

Harik
Sep 9, 2001

From the hard streets of Moscow
First dog to touch the stars


Plaster Town Cop

Paul MaudDib posted:

however, they absolutely will not "trivially fit any kernel you throw at them". Register pressure and L1 cache are very tight in a GPU architecture. Registers can be dynamically shared across different warps (so you can run more warps if you're not using a lot of registers) and will be spilled to L1 and eventually to VRAM if needed - however register pressure and cache pressure are basically always a limiting factor. And there are many things competing for L1 cache space - constants cache, texture cache, etc. "pinning" certain things into cache is a very effective acceleration technique for stuff that will be frequently accessed, because even with latency hiding, there is still latency. So generally you always want to be ruthless about minimizing register usage and L1 cache usage in general because there is a real performance cost to that. If you reduce register/cache usage, you can have more warps resident on the processor at a time (without being swapped out) which will provide overall higher performance.

I meant in a memory-fetch sense. The "code" for a kernel isn't waiting on memory bandwidth for the next op, given the relative scale of data size to code size. Whatever instructions the warp controller needs sit in local SRAM until it's time to change kernels, when one completes and the next kicks off.

I was aware it was super-wide AVX with a shared controller, but I didn't know how register-constrained they were; that's pretty interesting. I've done some shader work, but not enough to get deep into the weeds of optimization.

As for the rest thanks for the details, there's a bunch there that I was only vaguely aware of.

Lord Stimperor
Jun 13, 2018

I'm a lovable meme.

Xaris posted:

Woof, that's a hard one. It won't boot an Ubuntu stick, huh?

if you're okay with going a little "fuck Jeff Bezos", you could try replacing parts one or two at a time from Amazon and returning them if they don't fix it. I don't know what I'd start with either though.

As far as voltages, yeah you can try it:
- VSOC (millivolts) = 1100 (1.1 V). This is often under something like "AMD OC Menu", but it depends on the manufacturer. It controls voltage to the IMC (Integrated Memory Controller).
- VDRAM (A/B) = 1.4 V. This is often on a main page; for me it's just under "Tweaker" on a Gigabyte mobo. Controls voltage to the RAM.
- Loadline Calibration (LLC): set this to "High / Level 3". There's sometimes LLC CPU and LLC SOC; for me this is under Tweaker -> Advanced CPU Settings or something like that. This controls voltage droop.

Another thing to try is increasing the voltage offset to the CPU using Curve Optimizer. This is under AMD OC Menu -> Precision Boost/Curve Optimizer tab. Basically you can set a voltage offset either all-core or per-core. Could try a +5 all-core offset just as an initial test.

I am on Ubuntu now. Otherwise I would have gone completely mad :). It runs mostly, but not entirely, stable from a stick, yet crashes a couple of times per half-day when installed on a drive. I'm inclined not to blame the SSDs here, given that the crashes occur on both of them and it's just improbable that they're both fucked simultaneously. I'm trying to set up a semi-permanent Ubuntu on a stick now, so I can at least keep using the computer for non-gaming stuff.



I'm having trouble finding some of the power settings you recommend. Could it be that they're hidden or not changeable for non-overclockable CPUs (e.g. Ryzen 3600 instead of 3600X)?


By the way how do you all get this tremendous amount of hardware debugging experience? Are you all hardcore overclockers who live and breathe this stuff, does that knowledge overlap with an electrical engineering degree, or what is it? The advice I'm getting in this thread is far deeper and more consistent than what I would have received in random hardware forums (where it usually amounts to 'your 5000 watt power supply is probably underpowered'). That makes me feel really pampered and cared for.

Kibner
Oct 21, 2008

Acguy Supremacy
Ryzen has been relatively finicky from the start and people here have had to do what they can to make it stable, I figure.

Indiana_Krom
Jun 18, 2007
Net Slacker

Lord Stimperor posted:

By the way how do you all get this tremendous amount of hardware debugging experience? Are you all hardcore overclockers who live and breathe this stuff, does that knowledge overlap with an electrical engineering degree, or what is it? The advice I'm getting in this thread is far deeper and more consistent than what I would have received in random hardware forums (where it usually amounts to 'your 5000 watt power supply is probably underpowered'). That makes me feel really pampered and cared for.

Probably can't speak for everyone, but nerds gotta nerd out. If the rest of them are anything like me, I was building PCs from junk parts in the basement when most kids would be playing with the easy Lego kits, so lots of us literally grew up doing this stuff. Just ask how many of us remember the days of jumpers and DIP switches, and other oddball things like using a conductive pencil to draw a line between two points on a CPU package to unlock it for overclocking. Hardware troubleshooting is a skill some people just develop from experience, like smashing stuff together from a pile of old junk PC parts as kids and seeing if we could get it to work (and play Doom).

Dr. Video Games 0031
Jul 17, 2004

I've become "good" at computers by simply being very bad at building them and thus having to constantly troubleshoot all of my dumbass mistakes. Nobody knows how to gently caress up a PC build quite like me. I'm rather proud of this.

Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE

Dr. Video Games 0031 posted:

I've become "good" at computers by simply being very bad at building them and thus having to constantly troubleshoot all of my dumbass mistakes. Nobody knows how to gently caress up a PC build quite like me. I'm rather proud of this.

FuturePastNow
May 19, 2014


I didn't have to do anything finicky to make my Ryzen system stable. I followed directions off some site to set up PBO and a custom Windows power plan but that wasn't a stability thing or hard

Zedsdeadbaby
Jun 14, 2008

You have been called out, in the ways of old.

Dr. Video Games 0031 posted:

I've become "good" at computers by simply being very bad at building them and thus having to constantly troubleshoot all of my dumbass mistakes. Nobody knows how to gently caress up a PC build quite like me. I'm rather proud of this.

'if you are not failing, you are failing'

The best way to learn is by doing; nobody ever learned anything by doing it right the first time.

bobfather
Sep 20, 2001

I will analyze your nervous system for beer money

FuturePastNow posted:

I didn't have to do anything finicky to make my Ryzen system stable. I followed directions off some site to set up PBO and a custom Windows power plan but that wasn't a stability thing or hard

Same. My 1600 was fine with moderately overclocked RAM and this 5600X is fine, even with a massive undervolt using PBO and ridiculously overclocked RAM and infinity fabric.

Khorne
May 1, 2002

FuturePastNow posted:

I didn't have to do anything finicky to make my Ryzen system stable. I followed directions off some site to set up PBO and a custom Windows power plan but that wasn't a stability thing or hard

With Zen 2 I had no stability issues. My Zen 3 CPU randomly reboots with anything below 1.125 VSOC (1.0 is default), using identical components to the Zen 2 system. The closer I get to 1.125, the less frequent the reboots. At stock 1.0 I'm lucky to be stable for 30-60 minutes. At 1.1 it might last a week. At 1.125 I've had no random reboots at all.

It's set & forget once you figure it out, but I'd bet most people do an RMA instead.

Khorne fucked around with this message at 17:51 on Jun 19, 2021

redeyes
Sep 14, 2002

by Fluffdaddy
Great info! Thank you.

Canned Sunshine
Nov 20, 2005

CAUTION: POST QUALITY UNDER CONSTRUCTION



Zedsdeadbaby posted:

'if you are not failing, you are failing'

The best way to learn is by doing; nobody ever learned anything by doing it right the first time.

This is true for pretty much everything, too. One of my greatest frustrations was that some of the young engineers coming out of college, whom I would be assigned to mentor, were so afraid of failure and how it would make them look. I try telling them that failure is one of the most beneficial lessons they'll have in their careers and the greatest means to evolve in their field.

CaptainSarcastic
Jul 6, 2013



Lord Stimperor posted:

By the way how do you all get this tremendous amount of hardware debugging experience? Are you all hardcore overclockers who live and breathe this stuff, does that knowledge overlap with an electrical engineering degree, or what is it? The advice I'm getting in this thread is far deeper and more consistent than what I would have received in random hardware forums (where it usually amounts to 'your 5000 watt power supply is probably underpowered'). That makes me feel really pampered and cared for.

I used to do tech support and computer service, and have built my own machines for years, so it's a combination of those things for me. That, and reading forums like this to absorb stuff I might run into in the future.

gradenko_2000
Oct 5, 2010

HELL SERPENT
Lipstick Apathy
FWIW I never had any stability issues with my 2400G or my 3100, even after tinkering a bit with overclocking the memory and the core clocks on both (thanks Xaris!)

Number19
May 14, 2003

HOCKEY OWNS
FUCK YEAH


SourKraut posted:

This is true for pretty much everything, too. One of my greatest frustrations was that some of the young engineers coming out of college, whom I would be assigned to mentor, were so afraid of failure and how it would make them look. I try telling them that failure is one of the most beneficial lessons they'll have in their careers and the greatest means to evolve in their field.

I would rather work with people who take chances and make mistakes and then learn how to fix them than with people who stay in the box all the time and never really develop any problem-solving skills

It’s just like I tell my kids: you gotta take risks sometimes, even if you could get hurt. That’s how you learn your limits and get better at doing things

CaptainSarcastic
Jul 6, 2013



Number19 posted:

I would rather work with people who take chances and make mistakes and then learn how to fix them than with people who stay in the box all the time and never really develop any problem-solving skills

It’s just like I tell my kids: you gotta take risks sometimes, even if you could get hurt. That’s how you learn your limits and get better at doing things

The harshest lesson I learned in troubleshooting was to not skip steps. There's nothing like spending hours trying to figure out a software problem that was actually a hardware problem, or vice versa. Or doing a bunch of troubleshooting on a Mac showing virus-like network traffic, only to discover that Limewire was choking out their slow-ass connection.

SwissArmyDruid
Feb 14, 2014

by sebmojo

CaptainSarcastic posted:

The harshest lesson I learned in troubleshooting was to not skip steps. There's nothing like spending hours trying to figure out a software problem that was actually a hardware problem, or vice versa. Or doing a bunch of troubleshooting on a Mac showing virus-like network traffic, only to discover that Limewire was choking out their slow-ass connection.

I can't stress this enough. If you go back a post of mine, you'll see that I finally nailed down that I had a bum stick of RAM.

If I didn't have Phone Number nagging me that I should be running memtest on my sticks one at a time to properly eliminate root causes, I would never have caught the one bum stick: the pair tested absolutely fine in dual-channel. But because I knew they were right and I hadn't done my due diligence, I went back and did it properly, and voila, fired off an RMA to G.Skill that night. Now I'm waiting on a return package.

CaptainSarcastic
Jul 6, 2013



SwissArmyDruid posted:

I can't stress this enough. If you go back a post of mine, you'll see that I finally nailed down that I had a bum stick of RAM.

If I didn't have Phone Number nagging me that I should be running memtest on my sticks one at a time to properly eliminate root causes, I would never have caught the one bum stick: the pair tested absolutely fine in dual-channel. But because I knew they were right and I hadn't done my due diligence, I went back and did it properly, and voila, fired off an RMA to G.Skill that night. Now I'm waiting on a return package.

My most recent forehead-smacker was being mad about my Internet speed being lower than what I paid for, doing all sorts of tests, removing a filter from the coax, checking my wireless router, and cursing my ISP in general. Then I finally thought to change the cable between modem and router, and lo and behold, it turned out to be an old Ethernet cable that would only handle 100 Mbps. New Ethernet cable and I'm getting better speed than I pay for. :doh:

redeyes
Sep 14, 2002

by Fluffdaddy
I fought a 3400G for almost a month until I finally decided to up the CPU and SOC voltages a hair. That worked. I blame a bottom-barrel ASRock mobo.

Indiana_Krom
Jun 18, 2007
Net Slacker

CaptainSarcastic posted:

The harshest lesson I learned in troubleshooting was to not skip steps. There's nothing like spending hours trying to figure out a software problem that was actually a hardware problem, or vice versa. Or doing a bunch of troubleshooting on a Mac showing virus-like network traffic, only to discover that Limewire was choking out their slow-ass connection.

Yes, and always start with the simplest steps first.

I recall an incident a couple of years ago where my brother spent the better part of a weekend trying to get a USB-C docking station working with his laptop, and it kept refusing to work in one way or another (you could get parts of it to work together, but never all of them at once). Finally he called me, since I'm more into hardware troubleshooting, and the first thing I said was: "Let's get the simple things out of the way first: did you try a different Type-C cable?" It was the cable.

Once a local pastor brought his computer and a printer that refused to print over to my house, and I plugged the USB cable all the way into the printer for him.

A company in town had a computer that wouldn't boot; it hadn't worked in a week and they were at their wits' end. I examined the case closely and used a ball-point pen to get the reset switch unstuck.

Always check the basic solutions that are so simple it makes the people asking for help feel stupid. You would be amazed how many times it pays off, saves hours of troubleshooting, and teaches people a valuable lesson: not every problem has a complicated cause.

Dr. Video Games 0031
Jul 17, 2004

CaptainSarcastic posted:

My most recent forehead-smacker was being mad about my Internet speed being lower than what I paid for, doing all sorts of tests, removing a filter from the coax, checking my wireless router, and cursing my ISP in general. Then I finally thought to change the cable between modem and router, and lo and behold, it turned out to be an old Ethernet cable that would only handle 100 Mbps. New Ethernet cable and I'm getting better speed than I pay for. :doh:

My latest massive blunder came when I was trying to figure out a system stability issue earlier this year. My computer was just randomly restarting--no blue screens or error logs or anything. For the first month of use it was fine, actually, and the problem slowly got worse after that. I could play a demanding game for hours and it would be fine, but then it would restart while watching a YouTube video. My computer sits in a sort of annoying spot under my desk, so I try to pull it out and open it up as little as possible. Due to the sporadic nature of the restarts and a lack of time on my part, troubleshooting progress was slow. Then a week or two after the problem started, my computer wouldn't boot at all. I had almost no spare components to test with, so I ordered some things from Amazon with the intent to return them if they weren't the issue.

First up was the PSU, since I suspected this was power related. As I was unplugging things... one of the 6-pin connectors just pops out of the graphics card without having to squeeze the clip. The cable was loose. That was it. The plastic bits meant to line up the two connectors weren't properly aligned, so the second connector didn't go in all the way when I initially plugged it in. It just took some time to gradually come loose after that, which explains why it worked fine at first. I'm not sure I ever felt so dumb before. It was a good reminder to double check all my cable connections, I guess.

At least I was right about it being power related. :)

CaptainSarcastic
Jul 6, 2013



I had a moment of panic when I installed a better CPU cooler on my machine, then saw the fan didn't spin up at boot. Then I realized I had forgotten to plug it in.

Xaris
Jul 25, 2006

Lucky there's a family guy
Lucky there's a man who positively can do
All the things that make us
Laugh and cry

Khorne posted:

With Zen 2 I had no stability issues. My Zen 3 CPU randomly reboots with anything below 1.125 VSOC (1.0 is default), using identical components to the Zen 2 system. The closer I get to 1.125, the less frequent the reboots. At stock 1.0 I'm lucky to be stable for 30-60 minutes. At 1.1 it might last a week. At 1.125 I've had no random reboots at all.

It's set & forget once you figure it out, but I'd bet most people do an RMA instead.

Yeah, and to be clear, I think it's pretty obvious that Lord Stimperor has some bum hardware; it probably should be RMA'd if it's still under warranty. But you can make somewhat-bum or "failing" hardware work with increased voltages.

Likewise, you can take advantage of a good silicon-lottery draw by undervolting, though not everyone will have the same success (or failure) doing it. The same applies to upping voltages. Again, ideally the parts should just work, but it sounds like they're getting old and may not be RMA-able anymore, and it's hard to tell where the problem actually lies. Voltage tinkering is at least an easy way to either help pin the problem down or just fix it.

Lord Stimperor
Jun 13, 2018

I'm a lovable meme.

I think I'll draw a line under this in the coming days. I'm pretty sure it's neither the memory nor the SSDs, as swapping/removing them does not change anything about the behaviour, and I've been able to partially memtest one of them for four passes (the Internet says 8 passes minimum for confidence, but that takes days, and physically swapping them and not seeing a difference points to other sources anyway).

First, I'll double-check the power connections, given the recent discussions. Then I'll fuck with the CPU voltage and/or stress test it. If that has a significant effect on stability, I'll decide it's the CPU. If that doesn't resolve it, I'll finish my memtesting in memtest86. If by then no clear culprit has emerged, I'm gonna order an MSI B450M Mortar or Pro-M2 MAX board. I can get these for 50-75 bucks, and supposedly at least the Mortar might be a tiny bit nicer than my Bazooka, so that's not a terrible deal.

Lord Stimperor
Jun 13, 2018

I'm a lovable meme.

Final support question from me for now, I think: for picking the replacement motherboard, I'm hovering between the MSI B450M Mortar and the MSI A520M Pro. They're both very affordable (75 and 50 EUR, respectively). Supposedly the Mortar has better VRMs and more versatility (which I don't need). By contrast, the A520M is a newer chipset, but may have shitty VRMs. Anyone have an opinion?

e: I've read a couple more opinions and round-ups. From what I can gather, both chipsets are equally futureproof (or un-futureproof), as AM4 is old by now anyway. VRMs are supposed to be good on the B450M Mortar, but are an unknown on the A520. The A520 also supports less USB bandwidth, which might be a problem for certain VR setups (for Oculus, USB was often a bottleneck in the past). All that makes me lean toward the Mortar after all.

Lord Stimperor fucked around with this message at 19:26 on Jun 20, 2021
