steckles
Jan 14, 2006

shrike82 posted:

yeah going to disagree with the walls of text - LLMs have entrenched the dominance of Nvidia hardware, if anything. there are a lot of toy repos out there that cut LLMs down to fit and run on CPUs, including Apple Silicon, but it's not really useful beyond students and execs wanting something to run locally on their laptops. a) it's lossy and b) why penny-pinch on on-prem or private-cloud hardware - businesses are happy to throw money at racks of Nvidia cards to demonstrate having an AI strategy

Eh, I don't know. We're still relatively early in the development of machine learning and it's hard to say where things are going. Nvidia has the best support and the most developed software ecosystem, for sure, but ultimately most DL algorithms just need to do as many matrix multiplications as possible. A simpler architecture without all the GPU baggage, designed solely to feed billions of MADDs (multiply-adds), could end up being the most cost-effective approach as models continue to grow. Plenty of companies are experimenting with such designs.
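To put a toy number on that (hypothetical dimensions, and ignoring the attention-score matmuls, which would only strengthen the point):

```python
# Toy illustration: multiply-add (MADD) counts for one token passing
# through one hypothetical transformer block. Dimensions are made up
# for illustration, not taken from any real model.

d_model, d_ff = 4096, 16384  # hypothetical hidden and feed-forward widths

attn_madds = 4 * d_model * d_model  # Q, K, V, and output projections
ffn_madds = 2 * d_model * d_ff      # up- and down-projections
other_ops = 20 * d_model            # generous guess for norms, activations, etc.

matmul_share = (attn_madds + ffn_madds) / (attn_madds + ffn_madds + other_ops)
print(f"matmul share of the block: {matmul_share:.3%}")  # ~99.96%
```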

I wouldn't be surprised if we see a bunch more competing products, as Alphabet, Amazon, Meta, Microsoft, and others develop in-house, single-purpose hardware that's cheaper to rent if you're already in their cloud ecosystem.


Yudo
May 15, 2003

shrike82 posted:

yeah going to disagree with the walls of text - LLMs have entrenched the dominance of Nvidia hardware, if anything. there are a lot of toy repos out there that cut LLMs down to fit and run on CPUs, including Apple Silicon, but it's not really useful beyond students and execs wanting something to run locally on their laptops. a) it's lossy and b) why penny-pinch on on-prem or private-cloud hardware - businesses are happy to throw money at racks of Nvidia cards to demonstrate having an AI strategy

...when people were talking about using Macs for LLMs, I thought it was pretty clear we meant local, personal usage. That is also where the action is, since the Facebook leak and thanks to LoRA. Nvidia hardware is entrenched, but there is momentum away from CUDA toward platform-agnostic approaches, which is what we were talking about. And I very, very much doubt OpenAI, Facebook, Microsoft et al. are happy being beholden to Nvidia; Nvidia's market dominance is bad for them. I also mentioned this above.

Shipon
Nov 7, 2005
Yeah, that's why they shouldn't use GDDR7. AI bros can get hosed

Weird Pumpkin
Oct 7, 2007

I'm sure this is a dumb question, but why do GPUs have embedded memory anyway?

Couldn't they just have, like... RAM slots or whatever on them, since they're getting so big? Or is it that the memory has to be so fast for a GPU that it's either not practical for a consumer to buy, or it requires a special design to accommodate?

gradenko_2000
Oct 5, 2010

Weird Pumpkin posted:

I'm sure this is a dumb question, but why do GPUs have embedded memory anyway?

Couldn't they just have, like... RAM slots or whatever on them, since they're getting so big? Or is it that the memory has to be so fast for a GPU that it's either not practical for a consumer to buy, or it requires a special design to accommodate?

Yes. Slotted VRAM would increase the trace lengths (the distance from the memory to the core), which would be detrimental to signal integrity and make it harder to reach the speeds VRAM is aiming for, and there wouldn't be many advantages to it.
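For a rough sense of scale, here's a back-of-the-envelope sketch; the ~21 Gb/s pin rate (GDDR6X-class) and ~15 cm/ns propagation speed in FR4 are ballpark assumptions, not exact specs:

```python
# Back-of-the-envelope: what a few extra centimeters of trace cost at
# GDDR speeds. All figures below are ballpark assumptions, not specs.

data_rate_gbps = 21.0                  # per-pin data rate, GDDR6X-class
bit_time_ps = 1000.0 / data_rate_gbps  # one unit interval: ~48 ps
prop_speed_cm_per_ns = 15.0            # signal speed in FR4, roughly c/2

extra_trace_cm = 3.0  # extra routing a slot might plausibly add
extra_delay_ps = extra_trace_cm / prop_speed_cm_per_ns * 1000.0

print(f"bit time: {bit_time_ps:.0f} ps")
print(f"{extra_trace_cm:.0f} cm of extra trace adds {extra_delay_ps:.0f} ps "
      f"of flight time, ~{extra_delay_ps / bit_time_ps:.1f} bit times")
```

Flight time alone can be trained out, but the extra length and the connector itself also add attenuation and reflection points, which is where a slot really hurts.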

Weird Pumpkin
Oct 7, 2007

gradenko_2000 posted:

Yes. Slotted VRAM would increase the trace lengths (the distance from the memory to the core), which would be detrimental to signal integrity and make it harder to reach the speeds VRAM is aiming for, and there wouldn't be many advantages to it.

Ah ok, thanks! I figured it would be something like that

Kind of a shame though, because it would be nice to be able to just pack in as much as you'd like, or upgrade it later as requirements get more ridiculous

Truga
May 4, 2014
used to be able to jam ram chips into gpus back before 3d acceleration existed as a thing, so you could run a higher resolution desktop gui lol

e: pic not mine but i had one of these packed full of those chips, coupled with a voodoo1 in like 2001 lol

Truga fucked around with this message at 12:47 on Aug 10, 2023

Dr. Video Games 0031
Jul 17, 2004

The earliest video cards actually did have RAM slots, but they moved away from that very quickly. Lots of random expansion cards had socketed memory or cache, but everyone decided it was best to just solder on as much memory as needed, for performance and guaranteed-compatibility reasons. Hell, even CPU cache used to be slotted into your motherboard, but then they started putting it directly on the CPU die instead, the benefits of which should be obvious. That's a good example, actually, because the reasons for using soldered memory are similar to the reasons for moving to on-die cache.

Really, everyone's trying to move every form of memory as close to the processing units as feasible. The highest-performance GPUs put HBM on the same package as the GPU, for instance. They'd probably put it all in one big chip if such a thing were possible. GDDR sits in the middle ground: soldered onto the board, near the package.

Truga
May 4, 2014

Dr. Video Games 0031 posted:

Hell, even CPU cache used to be slotted into your motherboard
286 motherboards were fuckin wild

njsykora
Jan 23, 2012

I forget if the MMX/K6 era had cache on the CPU, but motherboards still had additional cache slots to augment it. I still think those are cool.

Nfcknblvbl
Jul 15, 2002

Probably won't be long until CPU RAM gets soldered in too.

power crystals
Jun 6, 2007

The one true way is slot CPUs with cache slots on them. I will not be taking questions at this time.

gradenko_2000
Oct 5, 2010

Nfcknblvbl posted:

Probably won't be long until CPU RAM gets soldered in too.

that's already mostly the case in laptops, as Dr. Video Games 0031 alluded to

repiv
Aug 13, 2009

and consoles, and macs

it's only a matter of time before desktops get it, perhaps with an intermediate step where motherboards still have RAM slots but you can opt for a CPU with embedded memory and leave the slots empty if you want to

Nfcknblvbl
Jul 15, 2002

repiv posted:

and consoles, and macs

it's only a matter of time before desktops get it, perhaps with an intermediate step where motherboards still have RAM slots but you can opt for a CPU with embedded memory and leave the slots empty if you want to

Yeah, this is what I meant. Upgrading RAM is becoming less of a thing. A meme will truly die.

Weird Pumpkin
Oct 7, 2007

That sounds like it's going to get expensive for the home market, since RAM will then be a market-segmentation tool in both GPUs and CPUs

:capitalism: i spose

repiv
Aug 13, 2009

they can take away our ram slots but they can never take away https://downloadmoreram.com

ConanTheLibrarian
Aug 13, 2004


repiv posted:

and consoles, and macs

it's only a matter of time before desktops get it, perhaps with an intermediate step where motherboards still have RAM slots but you can opt for a CPU with embedded memory and leave the slots empty if you want to

Stack everything. A compute die on top of a cache die on top of a PHY die on top of an HBM die. Let the 3D revolution begin!

wemgo
Feb 15, 2007

ConanTheLibrarian posted:

Stack everything. A compute die on top of a cache die on top of a PHY die on top of an HBM die. Let the 3D revolution begin!

This is the way

Nfcknblvbl
Jul 15, 2002

Good luck with cooling that stacked die!

Chips will have tiny cooling channels between each stack.

Truga
May 4, 2014

Nfcknblvbl posted:

Probably won't be long until CPU RAM gets soldered in too.

no way ram getting soldered survives in the server market, and desktops are really just an extension of that

Nfcknblvbl
Jul 15, 2002

Truga posted:

no way ram getting soldered survives in the server market, and desktops are really just an extension of that

Performance gains have to come from somewhere.

Canned Sunshine
Nov 20, 2005

And hell, it'll possibly shorten replacement cycles further, so they'll be making even more money off the customers!

Dead Goon
Dec 13, 2002

Yo Dawg, I heard you like stacked dies, so I put a stacked die in your stacked die, so you can stack dies while you die.

Twerk from Home
Jan 17, 2009

Truga posted:

no way ram getting soldered survives in the server market, and desktops are really just an extension of that

I don't see why soldered RAM couldn't work in the server market, especially with the building momentum for single-socket servers in some kind of higher-density configuration over the traditional 2P setup.

Arrath
Apr 14, 2011


Nfcknblvbl posted:

Performance gains have to come from somewhere.

Yeah, make software efficient again instead of shoving everything inside Electron and brute-forcing passable performance by throwing hardware at the problem.

UHD
Nov 11, 2006


Arrath posted:

Yeah, make software efficient again instead of shoving everything inside Electron and brute-forcing passable performance by throwing hardware at the problem.

why not do both things

wemgo
Feb 15, 2007
We’ll just use AI software optimization that creates an incoherent codebase that’s “good enough” with uncorrectable bugs at the extreme margins

Porbelm solued

Yaoi Gagarin
Feb 20, 2014

just lmao if we end up with dumbed-down CPUs to avoid side channel attacks and make it up by going back to VLIW ISAs supported by AI-enhanced optimization in the compiler

"where did that bring you? back to me" - Itanium

DoctorRobert
Jan 20, 2020

Nfcknblvbl posted:

Good luck with cooling that stacked die!

Chips will have tiny cooling channels between each stack.

Well:
https://www.froresystems.com/

Nfcknblvbl
Jul 15, 2002


Well dang, the future is now.

Zero VGS
Aug 16, 2002
Their best model removes 10 W of heat... if they jam a few of them into some MacBooks, that'll be nice, but I don't think they can scale them up to laptops with discrete GPUs or anything.

Rinkles
Oct 24, 2010

Isn't the big asterisk that the cooling module itself will be power hungry, or am I thinking of something else?

E: Which I guess wouldn't be an issue on desktops

SlowBloke
Aug 14, 2017

Rinkles posted:

Isn't the big asterisk that the cooling module itself will be power hungry, or am I thinking of something else?

E: Which I guess wouldn't be an issue on desktops

Their home page states that each module eats 1.75 W to cool 5.75 W (Mini) or 10.5 W (Pro) of heat. It's only going to cool ultraportables and NUCs.

Dr. Video Games 0031
Jul 17, 2004

1.75 watts to dissipate 10.5 watts is not terribly efficient compared to a typical laptop fan, but as you say, this is more about the form factor such a solution allows than anything. In larger form-factor systems, I don't expect it to offer many advantages. Maybe you can slap some on stuff like SSD controllers or motherboard chipsets so you can give them compact cooling solutions without annoying, whiny fans.

edit: The 10.5 watts includes the 1.75 from the AirJet itself. The Mini dissipates 4.25 W of chip heat while drawing 1 W itself, and the Pro dissipates 8.75 W while drawing 1.75 W, according to their spec sheets.

edit 2: This is mostly tangential, but I was curious, so I looked into this more. This thing's power efficiency is about 5:1, watts dissipated to watts used. Going by the max power draw of laptop fans, I'd guess the typical blower-style laptop cooling solution is more like 10:1 to 20:1. And if we look at PC fans, the NF-A12x25 draws 1.68 W at max speed, and on the right heatsink it's capable of dissipating 200 W of heat or more, about 119:1. The main draw of this device will be its form factor and low noise profile, but the low power efficiency does limit it somewhat in devices where battery life is important.
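To make those ratios concrete, here's a quick sketch using the numbers quoted above; the 200 W figure for a full-size fan-and-heatsink combo is a rough guess, not a spec:

```python
# Cooling "efficiency": watts of heat removed per watt the cooler itself
# consumes. AirJet numbers are as read from Frore's spec sheets in the
# posts above; the 200 W heatsink figure is a rough guess, not a spec.

coolers = {
    "AirJet Mini": (4.25, 1.00),
    "AirJet Pro": (8.75, 1.75),
    "NF-A12x25 on a big heatsink": (200.0, 1.68),
}

for name, (heat_removed_w, power_draw_w) in coolers.items():
    ratio = heat_removed_w / power_draw_w
    print(f"{name}: ~{ratio:.0f}:1 ({heat_removed_w} W removed per {power_draw_w} W used)")
```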

Dr. Video Games 0031 fucked around with this message at 00:55 on Aug 11, 2023

Cygni
Nov 12, 2005

LTT had a puff piece (PUN INTENDED) on it a few months ago, it does seem pretty cool (PUN INTENDED):

https://www.youtube.com/watch?v=vdD0yMS40a0

AirRaid
Dec 21, 2004

I dunno, the design seems to be full of holes. They're probably just blowing hot air.

Dr. Video Games 0031
Jul 17, 2004

Dr. Video Games 0031 posted:

Maybe you can slap some on stuff like SSD controllers or motherboard chipsets so you can give them compact cooling solutions without annoying, whiny fans.

This was just an idle thought in my head, but what do we have here? https://www.tomshardware.com/news/phison-demos-ps5026-e26-max14um-gen5-ssd-with-a-14-gbs-read-speed



It makes sense. SSD controllers are getting out of hand with their cooling requirements, and the tiny fans they're putting on the heatsinks for Gen 5 drives are the worst. This seems like prime territory for AirJet cooling.

Canned Sunshine
Nov 20, 2005

Cygni posted:

LTT had a puff piece (PUN INTENDED) on it a few months ago, it does seem pretty cool (PUN INTENDED):

https://www.youtube.com/watch?v=vdD0yMS40a0

Wow, thanks for recapping Thermodynamics 101, Linus! :downs:

ijyt
Apr 10, 2012

I like this.

https://youtu.be/_1KjsqaWIU4
