Rakeris
Jul 20, 2014

The Iron Rose posted:

Man, where the hell are y'all seeing 1080s for $450? Because at that price I'd pick one up in a heartbeat.

I bought one on eBay for $480 a few months ago.

GRINDCORE MEGGIDO
Feb 28, 1985


Could they mine faster with mining-optimised drivers?

craig588
Nov 19, 2005

by Nyc_Tattoo
Yes, but Nvidia isn't interested in helping out compute on consumer cards. They don't want to compete with themselves in the Quadro and Tesla space with sub-$200 compute cards.

repiv
Aug 13, 2009

SlayVus posted:

So if you bought a used miner card that had no I/O on it, what uses would it have besides mining?

Poor man's 3D rendering workstation? GPU renderers get a practically linear speedup from extra cards.

Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE

1gnoirents posted:

I don't quite understand the GDDR5X issue, though. I know this isn't the mining thread, but it would appear under my "testing" (1070 and 1080 mining) that the 1080 always wins. I get it may not be optimal, but it doesn't appear to be worse.

GDDR5X is quad-pumped, so each memory access returns 64 bytes of data, while Ethereum reads in 32-byte chunks, so half of what you fetch is useless. Basically it cuts your effective bandwidth to half the topline figure, and Ethereum is all about bandwidth.
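
A rough back-of-the-envelope way to see it (the topline bandwidth figures below are my own illustrative assumptions for a stock 1080 and 1070, not numbers from this thread):

    # Sketch of the GDDR5X penalty for Ethereum. Assumed topline figures:
    # ~320 GB/s for a stock GTX 1080 (GDDR5X, 64-byte bursts) and ~256 GB/s
    # for a stock GTX 1070 (GDDR5, 32-byte bursts). Ethereum only wants 32
    # bytes per access, so anything past that in a burst is wasted.
    def effective_bandwidth(topline_gbs, burst_bytes, useful_bytes=32):
        return topline_gbs * min(useful_bytes, burst_bytes) / burst_bytes

    print(effective_bandwidth(320, burst_bytes=64))  # GTX 1080 -> ~160 GB/s usable
    print(effective_bandwidth(256, burst_bytes=32))  # GTX 1070 -> ~256 GB/s usable

On those assumed figures the 1080's usable bandwidth lands below the 1070's, which is the usual argument for why GDDR5X cards underperform their spec sheets on Eth.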

Nairbo
Jan 2, 2005
Is there any reason not to buy a B-Stock version of an EVGA card? I bought a 1070 a few weeks ago and I'd like to basically do the free upgrade while 1070s are still hot from mining.

According to some dudes on Reddit, the EVGA midweek sales will ship for free to Canada and UPS doesn't charge a broker fee on delivery for them. Assuming this is true, with conversion I'm looking at $607ish CAD for either an FTW or SC2. I'm thinking/hoping both would fit into my Thermaltake V21 cube case.

Malloc Voidstar
May 7, 2007

Fuck the cowboys. Unf. Fuck em hard.

SlayVus posted:

So if you bought a used miner card that had no I/O on it, what uses would it have besides mining?
eBay

DrDork
Dec 29, 2003
commanding officer of the Army of Dorkness

Godinster posted:

Is there any reason not to buy a B-Stock version of an EVGA card? I bought a 1070 a few weeks ago and I'd like to basically do the free upgrade while 1070s are still hot from mining.

No, B-Stock EVGA cards are usually an excellent deal when you can get them.

DrDork
Dec 29, 2003
commanding officer of the Army of Dorkness

1gnoirents posted:

I dont quite understand the GDDR5X issue though I know this isnt the mining thread, it would appear under my "testing" (1070 and 1080 mining) that the 1080 always wins. I get it may not be optimal but it doesnt appear to be worse

Yeah, as others have said, it's not so much that the 1080 is objectively worse at mining; it's that the price:performance ratio and the efficiency (hash/watt) were much worse, because you got only slightly better mining performance at notably higher cost and power use.

'Course, that was when 1070s were $400 and 1080s were $500. Now everything's topsy-turvy and a 1080 might actually be the more profitable choice in some cases.

Watermelon Daiquiri
Jul 10, 2010
I TRIED TO BAIT THE TXPOL THREAD WITH THE WORLD'S WORST POSSIBLE TAKE AND ALL I GOT WAS THIS STUPID AVATAR.

Paul MaudDib posted:

GDDR5X is quad-pumped, so each memory access returns 64 bytes of data, while Ethereum reads in 32-byte chunks, so half of what you fetch is useless. Basically it cuts your effective bandwidth to half the topline figure, and Ethereum is all about bandwidth.

I wonder if you could run two instances, each using 50% of the card's processing. Since it's all about bandwidth and each instance would get enough bandwidth, shouldn't it work out?

Fauxtool
Oct 21, 2008

by Jeffrey of YOSPOS

Watermelon Daiquiri posted:

I wonder if you could run two instances, each using 50% of the card's processing. Since it's all about bandwidth and each instance would get enough bandwidth, shouldn't it work out?

As soon as nothing is left but 1080s, mining software will start to be optimized for them.

Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE

Watermelon Daiquiri posted:

I wonder if you could run two instances, each using 50% of the card's processing. Since it's all about bandwidth and each instance would get enough bandwidth, shouldn't it work out?

My understanding is that because it moves from double-pumped (GDDR5) to quad-pumped (5X), 5X works in 64-byte "chunks": you can't subdivide further into (e.g.) 2x32-byte requests; that just becomes two 64-byte chunks.

With Ethereum, which parts of the DAG you need to read is determined pseudo-randomly by the algorithm, so changing the nonce also changes the parts of memory you need to read (threads want basically random parts of memory for every nonce they check), and you get zero speedup from memory coalescing. It's specifically designed to be resistant to shared-memory acceleration like you're describing. The ways around this (e.g. reading the entire DAG every cycle and broadcasting it to all threads) are probably more wasteful than just throwing away half of your read.

If you accept the premise of Ethereum's memory-hardness, then the only way to scale up hashing is to throw absurd amounts of memory performance at it. Which is why everyone underclocks core and overclocks memory for Eth.
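
For a concrete picture of what "random parts of memory every nonce" means, here's a toy sketch of the access pattern (purely illustrative: it is not real Ethash, and the hash function, item size, and access count are simplified assumptions):

    # Toy sketch of an Ethash-style lookup chain (illustrative only, not the real
    # algorithm). Each nonce drives a chain of pseudo-random DAG reads, so adjacent
    # threads (adjacent nonces) touch unrelated addresses and nothing coalesces.
    import hashlib

    DAG_BYTES = 2_500_000_000           # ~2.5 GB DAG, per the post
    ITEM_BYTES = 32                     # Eth reads 32-byte chunks
    DAG_ITEMS = DAG_BYTES // ITEM_BYTES

    def toy_access_pattern(header: bytes, nonce: int, accesses: int = 64):
        state = hashlib.sha3_256(header + nonce.to_bytes(8, "little")).digest()
        indices = []
        for _ in range(accesses):
            index = int.from_bytes(state[:8], "little") % DAG_ITEMS
            indices.append(index)       # a real miner would fetch dag[index] here
            state = hashlib.sha3_256(state + index.to_bytes(8, "little")).digest()
        return indices

    # Adjacent nonces produce completely unrelated read sequences:
    for n in range(3):
        print(n, toy_access_pattern(b"block header", n)[:4])

Every read lands somewhere unpredictable in a ~2.5 GB array, so a warp can't merge its requests and the memory controller eats a full uncoalesced transaction for each one.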

(If you were doing a large ASIC these acceleration strategies might be more viable, so you might be able to do a "single large ASIC". GPUs cannot broadcast to "all threads", only within a group of 32 (NVIDIA) or 64 (AMD) threads, but if you could, then you could just constantly cycle through memory broadcasting it (hundreds of times per second given the current ~2.5 GB DAG), and the sleeping threads slurp from the firehose as the data they want cycles through. If you have enough "sleeping threads" then on average throughput should be the same (or worse by a constant factor of DAG size) despite hilarious amounts of latency; this is the standard latency-hiding strategy for GPGPU. However, you would probably need to go from the current "tens of thousands of threads" that GPUs support to millions or tens of millions of sleeping threads (in CUDA those threads live in the equivalent of L1, with 64K registers per SMX engine, and you can also count the 64-112 KB of shared memory space here), and increase the number of threads on later revisions of the silicon as the DAG size grows. Note that most of these threads are idle at any given time waiting for their memory requests to complete, so you would probably need comparatively little actual execution hardware (especially with hardcoded logic units), as long as the threads can receive the broadcast while they're sleeping; to repeat, this is already the standard latency-hiding approach for GPGPU. A naive implementation would need a lot of L1, but since access patterns are so predictable (the scheduler loads the execution units, so it knows what will be paged in) you might get some gains from L2 as well. Very little of this chip would be execution units; it's basically a cache machine with a data playback loop attached. Maybe it would work, I don't know the die areas of the units in question, but that's how I'd attack it with the minimum deviation from the known-good GPGPU uarch.)

Frankly, if you were going to design a system to attack Eth, there's no reason you couldn't just attach a massive number of low-end GPUs to something like Threadripper (60 of them, at 1x lanes each). One of the biggest challenges with Eth mining is keeping the GPUs from disconnecting randomly thanks to the lovely risers, and just engineering around that problem would give you a massive edge. Having custom GPUs (ASICs) hardcoded to run the Eth algorithm would be gravy and would give a huge boost in efficiency. At the end of the day, giving each ASIC its own memory controller and VRAM system is not really that big an ask in design terms if you are the NSA and have a big budget for this. The eventual design would end up looking a lot like a GPU (the Eth algorithm's memory-hardness can force that much), but you can still design a significantly more reliable and efficient GPU that facilitates massive scaling rather than the ghetto crap that miners use, just like a supercomputer is much more efficient/reliable/scalable than a rack in your basement.

tl;dr: 5X and HBM have funky access-granularity/latency characteristics which make them undesirable at Ethereum's current settings (in theory there's no reason Eth couldn't use 64-byte chunks or some other size instead), but in general, by design, the best approach is to throw tons of low-end commodity hardware at it.

Paul MaudDib fucked around with this message at 08:42 on Jul 5, 2017

Surprise Giraffe
Apr 30, 2007
1 Lunar Road
Moon crater
The Moon
Anyone know what the performance/power draw is like for water vs air cooling? I guess they aren't proportionate, at least.

Scarecow
May 20, 2008

3200mhz RAM is literally the Devil. Literally.
Lipstick Apathy
Holy poo poo AMD what are you doing
http://wccftech.com/amd-rx-vega-benchmark-leaked-specs-confirmed-1630mhz-8gb-hbm2-484gbs-of-bandwidth/

eames
May 9, 2009

Hey, those additional 30 MHz over the FE really make a difference! :haw:

Arzachel
May 12, 2012
Not like the FE ever stays at 1600 MHz for more than a couple of seconds.

DrDork
Dec 29, 2003
commanding officer of the Army of Dorkness

If going with only 8GB HBM2 instead of 16GB means they can price it closer to $400, it might actually be not entirely terrible. But that's a big if.

Legin Noslen
Sep 9, 2004
Fortified with Rhiboflavin
I'm really glad GPUs are retardedly expensive just because more idiots are trying to get 3 cents' worth of cryptocurrency a day.

Anarchist Mae
Nov 5, 2009

by Reene
Lipstick Apathy

DrDork posted:

If going with only 8GB HBM2 instead of 16GB means they can price it closer to $400, it might actually be not entirely terrible. But that's a big if.

8GB is what I expected it to have, what with them talking up the fact that they don't need as much RAM due to the HBCC. I'm not really sure this is any more of a WTF than when the Vega FE was announced.

Don Lapre
Mar 28, 2001

If you're having problems you're either holding the phone wrong or you have tiny girl hands.

Measly Twerp posted:

8GB is what I expected it to have, what with them talking up the fact that they don't need as much RAM due to the HBCC. I'm not really sure this is any more of a WTF than when the Vega FE was announced.

Didn't Fury prove that "4GB of HBM is just as good as a million GB of GDDR5" was bullshit?

Seamonster
Apr 30, 2007

IMMER SIEGREICH
More importantly, which game actually uses more than 8GB of VRAM @ 4K?

Risky Bisquick
Jan 18, 2008

PLEASE LET ME WRITE YOUR VICTIM IMPACT STATEMENT SO I CAN FURTHER DEMONSTRATE THE CALAMITY THAT IS OUR JUSTICE SYSTEM.

Buglord
Just wait for 8k texture packs :pcgaming:

Arivia
Mar 17, 2011

Risky Bisquick posted:

Just wait for 8k texture packs :pcgaming:

Yeah, Skyrim with a bunch of bells and whistles added. I even watched a JayZTwoCents video where that was his exact explanation: "you don't need 8GB at 1080p unless you're a hardcore Skyrim modder".

Seamonster
Apr 30, 2007

IMMER SIEGREICH
Desk-sized 8K monitors that can be used without scaling will simply never exist; biological limits of the human eye, something something.

HDR 4K with high adaptive refresh is way better.

DarkEnigma
Mar 31, 2001

Seamonster posted:

More importantly, which game actually uses more than 8GB of VRAM @ 4K?

Off the top of my head, Resident Evil 7 used up to 11.5GB at 4K with HDR. Quake Champions pushes 10-11GB. There may be a few others, but I can't remember.

1gnoirents
Jun 28, 2014

hello :)

Legin Noslen posted:

I'm really glad GPUs are retardedly expensive just because more idiots are trying to get 3 cents' worth of cryptocurrency a day.

Wait, what GPU do I need to buy for 3 more cents a day?

Twerk from Home
Jan 17, 2009

This avatar brought to you by the 'save our dead gay forums' foundation.

DarkEnigma posted:

Off the top of my head, Resident Evil 7 used up to 11.5GB at 4K with HDR. Quake Champions pushes 10-11GB. There may be a few others, but I can't remember.

Quake Champions? Aren't they pitching that as an esports-type FPS where people are going to want extremely high framerates and to play it on a wide variety of hardware?

xthetenth
Dec 30, 2012

Mario wasn't sure if this Jeb guy was a good influence on Yoshi.

Seamonster posted:

Desk-sized 8K monitors that can be used without scaling will simply never exist; biological limits of the human eye, something something.

HDR 4K with high adaptive refresh is way better.

Yeah, but I desperately yearn for a 40" monitor as sharp as my Surface's display, okay?

Twerk from Home posted:

Quake Champions? Aren't they pitching that as an esports-type FPS where people are going to want extremely high framerates and to play it on a wide variety of hardware?

May as well include the settings, I guess?

Twerk from Home
Jan 17, 2009

This avatar brought to you by the 'save our dead gay forums' foundation.

xthetenth posted:

May as well include the settings, I guess?

What I was getting at is that most players will turn down settings until they get 150+ FPS, so my feeling about games like this is that like CS:GO, the devs might as well make max graphical details still extremely performant.

xthetenth
Dec 30, 2012

Mario wasn't sure if this Jeb guy was a good influence on Yoshi.

Twerk from Home posted:

What I was getting at is that most players will turn down settings until they get 150+ FPS, so my feeling about games like this is that like CS:GO, the devs might as well make max graphical details still extremely performant.

My thought is that they may as well include them for press screenshots and in case they actually stay popular for a long time.

repiv
Aug 13, 2009

Vega CrossFire is still slower than a single 1080 Ti in most cases :laugh:

GamersNexus also did a clock-for-clock comparison with the Fury X, and found Vega to have a slight IPC regression at least on current drivers. Vega does gain some ground in tessellation-heavy tests though, which suggests the primitive discarder is at least somewhat functional already.

https://www.youtube.com/watch?v=0l6Uc5kBrxY

repiv fucked around with this message at 16:54 on Jul 5, 2017

Nfcknblvbl
Jul 15, 2002

xthetenth posted:

My thought is that they may as well include them for press screenshots and in case they actually stay popular for a long time.

I miss the days of "Ultra" settings being reserved for future hardware.

Cygni
Nov 12, 2005

raring to post

Seamonster posted:

More importantly, which game actually uses more than 8GB of VRAM @ 4K?

BF4 uses 8 gigs at 4K. But honestly, if you're gaming at 4K, you shouldn't even be considering RX Vega in the first place.

AVeryLargeRadish
Aug 19, 2011

I LITERALLY DON'T KNOW HOW TO NOT BE A WEIRD SEXUAL CREEP ABOUT PREPUBESCENT ANIME GIRLS, READ ALL ABOUT IT HERE!!!

repiv posted:

Vega CrossFire is still slower than a single 1080 Ti in most cases :laugh:

GamersNexus also did a clock-for-clock comparison with the Fury X, and found Vega to have a slight IPC regression at least on current drivers. Vega does gain some ground in tessellation-heavy tests though, which suggests the primitive discarder is at least somewhat functional already.

https://www.youtube.com/watch?v=0l6Uc5kBrxY

The more interesting point than the slight IPC regression is how badly Vega scales with clock rate. I ran numbers based on the FireStrike Extreme test results: with Vega FE stock running at 1440MHz, like we saw in the testing by PCPer, we see a performance gain of 25% over the Fury X but a clock speed difference of 38%. The Fury X gets 7.4 points per MHz while stock Vega FE gets only 6.75 points, a ~10% difference in IPC at stock clocks. Even if RX Vega hits its 1630MHz boost consistently, we will not be seeing linear scaling with clock speed; IPC might be down 13-15% on a clock-for-clock basis at that speed.
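
Running those numbers as a quick sanity check (the 1050MHz Fury X clock and a Fury X FireStrike Extreme graphics score of roughly 7,770, backed out from the 7.4 points/MHz figure, are my assumptions, not numbers from the post):

    # Quick check of the clock-scaling math above. Assumed inputs: Fury X at
    # 1050 MHz scoring ~7770 FireStrike Extreme graphics points (backed out
    # from 7.4 points/MHz); Vega FE at 1440 MHz scoring 25% higher, per the
    # PCPer figures cited in the post.
    furyx_clock, furyx_score = 1050, 7770
    vegafe_clock, vegafe_score = 1440, furyx_score * 1.25

    print(f"clock delta: +{vegafe_clock / furyx_clock - 1:.0%}")      # ~ +37%
    print(f"Fury X:  {furyx_score / furyx_clock:.2f} points/MHz")     # ~ 7.40
    print(f"Vega FE: {vegafe_score / vegafe_clock:.2f} points/MHz")   # ~ 6.74
    ipc_delta = (vegafe_score / vegafe_clock) / (furyx_score / furyx_clock) - 1
    print(f"per-clock delta at stock: {ipc_delta:.1%}")               # ~ -9%

The 13-15% figure at a sustained 1630MHz boost is an extrapolation of that trend rather than something these two data points prove on their own.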

repiv
Aug 13, 2009

AVeryLargeRadish posted:

The more interesting point than the slight IPC regression is how badly Vega scales with clock rate. I ran numbers based on the FireStrike Extreme test results: with Vega FE stock running at 1440MHz, like we saw in the testing by PCPer, we see a performance gain of 25% over the Fury X but a clock speed difference of 38%. The Fury X gets 7.4 points per MHz while stock Vega FE gets only 6.75 points, a ~10% difference in IPC at stock clocks. Even if RX Vega hits its 1630MHz boost consistently, we will not be seeing linear scaling with clock speed; IPC might be down 13-15% on a clock-for-clock basis at that speed.

That's more evidence for the micro-throttling theory then. If Vega is disabling shader cores to moderate its power consumption without affecting geometry throughput then you'd expect a drop in IPC as the target clock increases.

We really need to see how Vega behaves under watercooling.

Subjunctive
Sep 12, 2006

✨sparkle and shine✨

eames posted:

Shareholders... they promised a card for Q2/17, and they delivered a pointless but (kind of) working card 1-2 days before that deadline to avoid lawsuits.

What would be the basis of the suit? Every company includes the "forward-looking statements" stanza in their guidance to shareholders. Hitting the target is important for investor confidence, but I seriously doubt there's liability.

EmpyreanFlux
Mar 1, 2013

The AUDACITY! The IMPUDENCE! The unabated NERVE!

repiv posted:

That's more evidence for the micro-throttling theory then. If Vega is disabling shader cores to moderate its power consumption without affecting geometry throughput then you'd expect a drop in IPC as the target clock increases.

We really need to see how Vega behaves under watercooling.

This is going to be Fiji all over again: why bother with the high-priced uncut model when it'll throttle enough to effectively be one of the cut-down models?

Polaris is still the better uarch, lmao.

Seamonster
Apr 30, 2007

IMMER SIEGREICH
You can shop it in your mind: that history channel aliens guy with the hair - "shareholders"

AMD stock up over a buck as I type.

LorneReams
Jun 27, 2003
I'm bizarre

The Iron Rose posted:

Man, where the hell are y'all seeing 1080s for $450? Because at that price I'd pick one up in a heartbeat.

I got one for $480 at Best Buy (1080 FE). It was a sale of some sort. Was going to buy the 1070, but it was only like $60 less, so said gently caress it.

SlayVus
Jul 10, 2009
Grimey Drawer
So if we underclock Vega for reduced power consumption, we should see engagement of all cores all the time, which should in theory provide a more stable environment for gaming and other junk.
