crazysim
May 23, 2004
I AM SOOOOO GAY
this talk of apis and stuff makes me wonder if someone would ever make a wddmwrapper in the spirit of ndiswrapper. reactos doesn't even support wddm at the moment so maybe that's a bit too far out.

hackbunny
Jul 22, 2007

I haven't been on SA for years but the person who gave me my previous av as a joke felt guilty for doing so and decided to get me a non-shitty av

*ears perk up*

Farmer Crack-Ass
Jan 2, 2001

this is me posting irl

echinopsis posted:

i can remember space moose from like 2004 when I joined

end of an era

noooo, I think I started the space moose av back in 2008 or 2009 or so. in 2004 I was still av-less

echinopsis
Apr 13, 2004

by Fluffdaddy
last decade wow

Notorious b.s.d.
Jan 25, 2003

by Reene

Malcolm XML posted:

yeah nvidia is not gonna give away the detailed hardware specs needed for this unless AMD really magically starts competing in GPUs

i wasn't suggesting they would publish specs or beg open source authors to work on poo poo

just that nvidia would maintain an in-tree kernel driver instead of an out-of-tree one

Notorious b.s.d.
Jan 25, 2003

by Reene
the fact that nvidia has helped out the nouveau project with doc dumps from time to time suggests to me that they are not so much hostile to open source as focused on the bottom line

the nvidia linux driver makes money for nvidia in its current state, as a bunch of closed source blobs with a lovely little open source shim

if the future path of least resistance is a real open source kernel driver, pushing the blobs into user space, i would bet on nvidia following that path

peepsalot
Apr 24, 2007

        PEEP THIS...
           BITCH!

whenever a machine intelligence takes over the world its prob gonna be running on nvidia hardware

peepsalot
Apr 24, 2007

        PEEP THIS...
           BITCH!

do i win hottest take?

josh04
Oct 19, 2008


"THE FLASH IS THE REASON
TO RACE TO THE THEATRES"

This title contains sponsored content.

look i know they play it up a lot for marketing, but "machine learning" isn't really all that close to machine intelligence.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
apparently i wont effortpost again because you guys are more interested in linux

Schadenboner
Aug 15, 2011

by Shine

Suspicious Dish posted:

apparently i wont effortpost again because you guys are more interested in linux

YOSPOS, bithc.

peepsalot
Apr 24, 2007

        PEEP THIS...
           BITCH!

josh04 posted:

look i know they play it up a lot for marketing, but "machine learning" isn't really all that close to machine intelligence.

yeah, common computers are only decades away from reaching human brain scales, prob nothing to worry about
they can only currently beat the top humans in the world at specific complex tasks, like playing go, or some RTS game.
as far as generality, they are doing ok at learning to interpret raw pixel input from games
the problem happens when some computer scientist decides to train a nn to be a computer scientist/neural network design expert. teach it concepts of logic, proofs, algorithms, and eventually how its own brain works. *then* we're hosed

Shame Boy
Mar 2, 2010

Suspicious Dish posted:

apparently i wont effortpost again because you guys are more interested in linux

i appreciate your effortposts and hope you continue, friend :shobon:

Shame Boy
Mar 2, 2010

peepsalot posted:

the problem happens when some computer scientist decides to train a nn to be a computer scientist/neural network design expert.

google's literally already doing that and they saw increases in performance of *gasp* a whole 20%!!

turns out there's a ton of saddle points and local minima that neural networks love to get stuck in and the whole "exponentially self-improving AI" thing is entirely fiction
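
a toy picture of the saddle point thing (plain gradient descent on f(x,y) = x^2 - y^2, made-up learning rate, nothing to do with google's actual setup):

code:
# gradient descent on f(x, y) = x**2 - y**2; (0, 0) is a saddle point.
# starting exactly on the x-axis the y-gradient is always zero, so the
# update slides into the saddle and just sits there forever
def grad(x, y):
    return 2 * x, -2 * y

x, y, lr = 1.0, 0.0, 0.1
for _ in range(200):
    gx, gy = grad(x, y)
    x, y = x - lr * gx, y - lr * gy

print(x, y)  # ~(0.0, 0.0): "converged" to a point that isn't even a minimum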

peepsalot
Apr 24, 2007

        PEEP THIS...
           BITCH!

ate all the Oreos posted:

google's literally already doing that and they saw increases in performance of *gasp* a whole 20%!!

turns out there's a ton of saddle points and local minima that neural networks love to get stuck in and the whole "exponentially self-improving AI" thing is entirely fiction

algorithmic improvements and discovery of better heuristics are always a possibility and we certainly haven't exhausted that search space

peepsalot
Apr 24, 2007

        PEEP THIS...
           BITCH!

basically if our brains evolved from a few billion years of natural selection and dumb luck,
how many clock cycles (at a few billion per second currently) will it take before AI can beat us at the game of life?

30 TO 50 FERAL HOG
Mar 2, 2005



computers are deterministic. there is no truly random evolution

Cocoa Crispies
Jul 20, 2001

Vehicular Manslaughter!

Pillbug

peepsalot posted:

basically if our brains evolved from a few billion years of natural selection and dumb luck,
how many clock cycles (at a few billion per second currently) will it take before AI can beat us at the game of life?

humans will probably get out-competed and driven to extinction by corporations before computers become a threat

peepsalot
Apr 24, 2007

        PEEP THIS...
           BITCH!

Cocoa Crispies posted:

humans will probably get out-competed and driven to extinction by corporations before computers become a threat

corporations run by robots? :thunk:

peepsalot
Apr 24, 2007

        PEEP THIS...
           BITCH!

a google ai literally shoving targeted content down your throat until you suffocate

eschaton
Mar 7, 2007

Don't you just hate when you wind up in a store with people who are in a socioeconomic class that is pretty obviously about two levels lower than your own?

peepsalot posted:

whenever a machine intelligence takes over the world its prob gonna be running on nvidia hardware

last I heard Nvidia hardware wasn’t all that great for running Lisp, and the only real contender for anything resembling “machine intelligence” is written in Lisp

eschaton
Mar 7, 2007

Don't you just hate when you wind up in a store with people who are in a socioeconomic class that is pretty obviously about two levels lower than your own?

Suspicious Dish posted:

apparently i wont effortpost again because you guys are more interested in linux

boo

Suspicious Dish make the effortpost

eschaton
Mar 7, 2007

Don't you just hate when you wind up in a store with people who are in a socioeconomic class that is pretty obviously about two levels lower than your own?

peepsalot posted:

as far as generality, they are doing ok at learning to interpret raw pixel input from games
the problem happens when some computer scientist decides to train a nn to be a computer scientist/neural network design expert. teach it concepts of logic, proofs, algorithms, and eventually how its own brain works. *then* we're hosed

Eurisko/Cyc hasn’t hosed us yet despite understanding how its own brain works and being able to self-modify

peepsalot
Apr 24, 2007

        PEEP THIS...
           BITCH!

eschaton posted:

Eurisko/Cyc hasn’t hosed us yet despite understanding how its own brain works and being able to self-modify

have they tried giving it an http connection and a server farm of computing power?

peepsalot
Apr 24, 2007

        PEEP THIS...
           BITCH!

NEED MORE MILK posted:

computers are deterministic

otoh this does exist https://en.wikipedia.org/wiki/Hardware_random_number_generator

e: also deterministic doesn't preclude chaotic
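
toy illustration of deterministic-but-chaotic (logistic map at r=4; parameter and starting values picked arbitrarily): every step is a plain formula, but two runs that start 1e-10 apart end up nowhere near each other

code:
# logistic map x -> 4*x*(1-x): completely deterministic, still chaotic,
# because tiny differences in the starting value grow exponentially
def run(x, steps=50):
    for _ in range(steps):
        x = 4.0 * x * (1.0 - x)
    return x

print(run(0.3))          # one trajectory
print(run(0.3 + 1e-10))  # starts a hair away, lands somewhere totally different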

Suspicious Dish sorry i poo poo up the thread with my doomsaying, your smart posts are good too thanks and I read them

peepsalot fucked around with this message at 07:02 on Oct 24, 2017

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
nobody seemed to care about quad occupancy so i won't post anymore about that... in the meantime have a really good article from 2013 about warp branch behavior and divergence

https://tangentvector.wordpress.com/2013/04/12/a-digression-on-divergence/
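
tl;dr of the article in toy form (this is just the cost model, not how any real gpu actually schedules things):

code:
# toy cost model for warp divergence: when threads in a warp take different
# branches, the warp runs *both* sides with threads masked off, so the cost
# is roughly the sum of the branch costs instead of just the taken one
def warp_cost(thread_branches, cost_a, cost_b):
    took_a = any(t == "a" for t in thread_branches)
    took_b = any(t == "b" for t in thread_branches)
    return cost_a * took_a + cost_b * took_b

uniform  = ["a"] * 32                # all 32 threads agree
diverged = ["a"] * 16 + ["b"] * 16   # warp splits down the middle
print(warp_cost(uniform, 10, 50))    # 10: only the taken side runs
print(warp_cost(diverged, 10, 50))   # 60: both sides run, serialized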

BobHoward
Feb 13, 2012

The only thing white people deserve is a bullet to their empty skull

Suspicious Dish posted:

nobody seemed to care about quad occupancy

i did i just didn't have anything to say

also i think i owe you an effortpost but haven't been able to :effort: here in a while

BobHoward
Feb 13, 2012

The only thing white people deserve is a bullet to their empty skull

NEED MORE MILK posted:

computers are deterministic. there is no truly random evolution

:wrong:

<---- is reminding you that if you have any deece intel cpu made since ~2012 (i think that's the right year) you have a true random number generator built into the cpu which shits out high quality (*) entropy at rates on the order of hundreds of megabytes per second

<---- has designed a true random number generator for a former employer (not intel) and can tell you that it is challenging to do well but quite doable

* unless you believe the nerds who don't trust RdRand because they think the :nsa: got to intel, personally i doubt that happened but have no way to prove it
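
(if you're on linux and curious whether your chip has it, the cpu flag is literally called rdrand -- quick sketch, other OSes report it differently:)

code:
# linux-only sketch: the kernel lists supported instructions in /proc/cpuinfo
with open("/proc/cpuinfo") as f:
    print("rdrand" in f.read().split())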

OzyMandrill
Aug 12, 2013

Look upon my words
and despair

ate all the Oreos posted:

i'm seriously asking, like is "ordered information" some kind of potential energy since it's not as low an energy state as random / higher entropy states could be?

physicist in the house! (tho it was many years ago and i've been touching computers ever since)

yes
in order to not be random, you need to expend energy to change the bits. 1 bit is 1 unit of 'Shannon Entropy', which has been proved to be the same as the entropy we are familiar with in thermodynamics, but it's like the quantum-scale unit. the theory goes something like this...
for a given algorithm, you expect a certain output (in x bits). the more precise you need the answer, the fewer combinations of output bits are allowed, the lower the entropy, and the more work it takes to generate (which is why approximations run faster). there will be a theoretical lower limit of energy required, which varies depending on the algorithm.

take the simple case of generating the square root of a number. a double precision root op is expensive, 32bit is less expensive, and on a cpu we can use some bit twiddling to get a close approximation with fewer instructions. the fastest method of all (depending on memory transaction costs) would be a look up table, but this requires spending memory (fixed entropy) to save runtime energy - the memory is literally a store of pre-generated entropy that can be used. it has to be filled by doing the algorithm on all inputs (spent energy). a typical optimisation would be to only store 1% of the inputs. if we use it as-is, there will be some lower bits that are 'wrong'. this can be improved with interpolation - but that obviously is just changing the balance of 90% less stored entropy in exchange for the extra operations (energy spent) to interpolate between two entries each time the algorithm is used. the trick for optimisation is balancing the costs of work done (cycles on the processor) with static entropy (memory) to get the desired result within a desired margin of error.
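
rough sketch of the lookup-table-plus-interpolation idea (table size and domain are made up, this isn't any real library's implementation):

code:
import math

# sqrt over [0, 1] via a small precomputed table + linear interpolation.
# the table is "pre-spent" work: memory traded for runtime cycles
N = 256
table = [math.sqrt(i / N) for i in range(N + 1)]

def sqrt_lut(x):
    pos = x * N
    i = min(int(pos), N - 1)
    frac = pos - i
    return table[i] * (1.0 - frac) + table[i + 1] * frac

print(sqrt_lut(0.5), math.sqrt(0.5))  # close, but the low bits are 'wrong'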

this also means that theoretically a formatted hard disc weighs slightly more than an unformatted one by a comedy 'weighs less than an atom' amount, as energy is spent on changing the state, and it turns out to be true - a 1TB drive can contain something like 5J worth of extra energy in the potential energy of the magnetic dipoles. the dipoles want to be randomly aligned, alternating n/s/n/s or lined up in rings & random swirly fractal shapes, but when formatted the area of each bit needs the dipoles aligned together. if you could attach little pulleys to the dipoles and let em go, they would swing around to the random shape and the energy could be 'extracted'. chip based memory will have similar properties - in order to store a 0 or a 1 the atoms need to be put in an unnatural state and will want to decay back, the difference in energy states is the potential energy. solid state memory has a larger energy barrier before they can flip which allows them to remember stuff when unpowered (thermal fluctuations in the atoms' energy are not enough to push it over the energy hump) which means they can preserve their entropy for longer. but by definition this means the energy cost of using them is higher - no free lunches with thermodynamics.

note that most of the energy costs are due to the massively inefficient systems we use. single atom scale devices would use less than a billionth of what we use now, but the same principles would apply. to make any pattern of bits/information will require changing states of something, and the states must be different energy levels of some kind. changing state will require energy, and this energy must be larger than 'thermal noise' else the information will decay. the energy difference between these states will be the potential energy stored per bit, and by e=mc^2, the formatted state will have a tiny bit of extra mass.

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Suspicious Dish posted:

nobody seemed to care about quad occupancy so i won't post anymore about that... in the meantime have a really good article from 2013 about warp branch behavior and divergence

https://tangentvector.wordpress.com/2013/04/12/a-digression-on-divergence/

im all about the occupancy iykwim

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Suspicious Dish posted:

let's talk about triangles. everybody loves triangles. gpus love triangles: they're guaranteed planar shapes with very few edge cases (the big one is "degenerates" -- triangles which have no area). they're also *convex*, so you can determine if a point is inside them with three simple edge tests. triangles make everyone's life easier, everyone loves triangles.

what happens if you have a lot of triangles? well, let's combine it with a previous fact: gpus run the pixel shader program for each triangle at least four times (four "threads"), in a 2x2 "quad" in screen space. this is required for things like derivatives where it wants to know how "far apart" two pixels are so it knows which miplevel to use, etc.

so, what happens if you have a lot of triangles? clarification: what happens if you have a lot of really small triangles, like on an extremely dense mesh? that means that you'll kick off four of those threads for that one triangle, even if it only barely covers one pixel in screen-space. this means that three out of the four threads in your quad are completely useless.

the number of threads in your quad that are actually doing work is known as "quad occupancy". so in our case we have 25% quad occupancy, and that's a big performance issue. if all of your triangles have poor quad occupancy, your performance is a quarter of what it could be. as meshes get denser and denser and graphics more impressive, this is turning into one of the biggest bottlenecks. it's usually solved by LODing models: if a model is too dense, use a simpler, less dense form of the model. these LODs are usually generated dynamically by Simplygon or another piece of middleware like that.

this is one way that fillrates and triangle counts as specs are misleading. big triangles are more efficient because you can often reach 90% occupancy.

this presentation is a bit outdated but pages 22-24 talk about occupancy so it might be helpful http://on-demand.gputechconf.com/gtc/2016/presentation/s6138-christoph-kubisch-pierre-boudier-gpu-driven-rendering.pdf


isnt this due to the shared nature of gpu hardware? like it's required b/c gpu compute units amortize the expensive bits over many threads to allow for massive throughput
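
back-of-envelope version of the numbers from the quoted post (toy pixel counts, not measured from any real renderer):

code:
# quad occupancy: pixel-shader threads launch in 2x2 quads, so every
# touched quad costs 4 threads even if only 1 pixel is actually covered
def quad_occupancy(covered_pixels, quads_touched):
    return covered_pixels / (quads_touched * 4)

print(quad_occupancy(1, 1))         # tiny triangle: 0.25, the 25% case above
print(quad_occupancy(10000, 2600))  # big triangle: ~0.96, the 90%+ case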

Malcolm XML
Aug 8, 2009

I always knew it would end like this.
suspicious dish post the blog

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

BobHoward posted:

:wrong:

<---- is reminding you that if you have any deece intel cpu made since ~2012 (i think that's the right year) you have a true random number generator built into the cpu which shits out high quality (*) entropy at rates on the order of hundreds of megabytes per second

<---- has designed a true random number generator for a former employer (not intel) and can tell you that it is challenging to do well but quite doable

* unless you believe the nerds who don't trust RdRand because they think the :nsa: got to intel, personally i doubt that happened but have no way to prove it

how is it that difficult u just put a few reverse biased transistors and whiten the bits using von Neumann's trick

genuinely interested
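
(the whitening half really is a few lines -- sketch below, and it quietly assumes the raw bits are independent, which turns out to be the hard part:)

code:
import random

# von neumann whitening: read raw bits in pairs, map 10 -> 1 and 01 -> 0,
# throw away 00 and 11. removes bias, but only if the bits are independent
def von_neumann(raw_bits):
    out = []
    for a, b in zip(raw_bits[0::2], raw_bits[1::2]):
        if a != b:
            out.append(a)
    return out

raw = [1 if random.random() < 0.8 else 0 for _ in range(10000)]  # biased source
white = von_neumann(raw)
print(sum(raw) / len(raw), sum(white) / len(white))  # ~0.8 in, ~0.5 out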

Shame Boy
Mar 2, 2010

Malcolm XML posted:

how is it that difficult u just put a few reverse biased transistors and whiten the bits using von Neumann's trick

genuinely interested

i assume it's challenging to do it in a way that's like, provably reliable and also fast? even with whitening, if the reverse-biased transistors get out of whack you could wind up with it drifting to the point where it's spitting out all-ones (and thus not emitting anything) or something

like mine did :v:

e: i think the "real" ones use avalanche diodes or some other mechanism that's a bit more predictable for this reason actually

vOv
Feb 8, 2014

Malcolm XML posted:

how is it that difficult u just put a few reverse biased transistors and whiten the bits using von Neumann's trick

genuinely interested

von neumann's trick doesn't work if the bits are intercorrelated

Schadenboner
Aug 15, 2011

by Shine

vOv posted:

von neumann's trick doesn't work if the bits are intercorrelated

SEE VON NEUMANN'S ONE WEIRD TRICK!
STATISTICIANS HATE HIM BUT THERE'S NOTHING THEY CAN DO!

theodop
Dec 30, 2005

rock solid, heart touching

OzyMandrill posted:

physicist in the house! (tho it was many years ago and i've been touching computers ever since)

yes
in order to not be random, you need to expend energy to change the bits. 1 bit is 1 unit of 'Shannon Entropy', which has been proved to be the same as the entropy we are familiar with in thermodynamics, but it's like the quantum-scale unit. the theory goes something like this...
for a given algorithm, you expect a certain output (in x bits). the more precise you need the answer, the fewer combinations of output bits are allowed, the lower the entropy, and the more work it takes to generate (which is why approximations run faster). there will be a theoretical lower limit of energy required, which varies depending on the algorithm.

take the simple case of generating the square root of a number. a double precision root op is expensive, 32bit is less expensive, and on a cpu we can use some bit twiddling to get a close approximation with fewer instructions. the fastest method of all (depending on memory transaction costs) would be a look up table, but this requires spending memory (fixed entropy) to save runtime energy - the memory is literally a store of pre-generated entropy that can be used. it has to be filled by doing the algorithm on all inputs (spent energy). a typical optimisation would be to only store 1% of the inputs. if we use it as-is, there will be some lower bits that are 'wrong'. this can be improved with interpolation - but that obviously is just changing the balance of 90% less stored entropy in exchange for the extra operations (energy spent) to interpolate between two entries each time the algorithm is used. the trick for optimisation is balancing the costs of work done (cycles on the processor) with static entropy (memory) to get the desired result within a desired margin of error.

this also means that theoretically a formatted hard disc weighs slightly more than an unformatted one by a comedy 'weighs less than an atom' amount, as energy is spent on changing the state, and it turns out to be true - a 1TB drive can contain something like 5J worth of extra energy in the potential energy of the magnetic dipoles. the dipoles want to be randomly aligned, alternating n/s/n/s or lined up in rings & random swirly fractal shapes, but when formatted the area of each bit needs the dipoles aligned together. if you could attach little pulleys to the dipoles and let em go, they would swing around to the random shape and the energy could be 'extracted'. chip based memory will have similar properties - in order to store a 0 or a 1 the atoms need to be put in an unnatural state and will want to decay back, the difference in energy states is the potential energy. solid state memory has a larger energy barrier before they can flip which allows them to remember stuff when unpowered (thermal fluctuations in the atoms' energy are not enough to push it over the energy hump) which means they can preserve their entropy for longer. but by definition this means the energy cost of using them is higher - no free lunches with thermodynamics.

note that most of the energy costs are due to the massively inefficient systems we use. single atom scale devices would use less than a billionth of what we use now, but the same principles would apply. to make any pattern of bits/information will require changing states of something, and the states must be different energy levels of some kind. changing state will require energy, and this energy must be larger than 'thermal noise' else the information will decay. the energy difference between these states will be the potential energy stored per bit, and by e=mc^2, the formatted state will have a tiny bit of extra mass.

dang, this is very interesting

BobHoward
Feb 13, 2012

The only thing white people deserve is a bullet to their empty skull

Malcolm XML posted:

how is it that difficult u just put a few reverse biased transistors and whiten the bits using von Neumann's trick

genuinely interested

what the Oreo eater said, and in my case i was asked to build a pure digital circuit that was reasonably process neutral. that pushes you towards some form of ring oscillator and those are tricky in many ways and also difficult to get high performance out of

intel published rather a lot about theirs and you can google up a lot about how it works. iirc it’s basically a circuit for forcing a flop into metastability and then letting it resolve to 0 or 1, with analog tuning and mixed signal feedback so it can make itself give a roughly fair coin flip output pre whitening. then for high quality whitening and rate enhancement they use true random bits to seed an aes stream cipher block. (as noted von Neumann assumes all samples are uncorrelated, which is difficult to prove, also von Neumann extractors aren’t terribly efficient unless you do a more obscure generalized version that isn’t taught in textbooks)

:ohdear: this is becoming a derail

Max Facetime
Apr 18, 2009

Suspicious Dish posted:

nobody seemed to care about quad occupancy so i won't post anymore about that... in the meantime have a really good article from 2013 about warp branch behavior and divergence

https://tangentvector.wordpress.com/2013/04/12/a-digression-on-divergence/

oh nooo, I love this stuff

like, partial derivatives within a 2x2 pixel block, what’s going to happen if some of those pixels diverge and try to calculate some other partial derivative of some other function?

I mean I know that the result is going to be undefined, but what is the hardware actually going to be doing?
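
my rough mental model of the well-defined case, for reference (toy finite differences over the 2x2 quad, not a claim about what any particular gpu does once lanes diverge):

code:
# ddx/ddy in the non-diverged case: the quad is shaded together and the
# derivative is just the difference between neighbouring lanes' values.
# if a lane got masked off by a branch, whatever stale value it holds is
# what gets differenced -- hence "undefined"
def quad_derivatives(v00, v10, v01, v11):
    ddx = v10 - v00  # right neighbour minus left
    ddy = v01 - v00  # bottom neighbour minus top
    return ddx, ddy

print(quad_derivatives(0.0, 0.25, 0.5, 0.75))  # (0.25, 0.5)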

ozymandOS
Jun 9, 2004

BobHoward posted:

:ohdear: this is becoming a derail

but a very interesting one, thank you for these posts
