GPU Megat[H]read - the cores of wrath grew heavy on the die that day

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > GPU Megat[H]read - the cores of wrath grew heavy on the die that day

Agreed: Dec 30, 2003; The price of meat has just gone up, and your old lady has just gone down

Factory Factory posted:

And who's really going to be the target market for an $800-$900 beast? The person who just wants one video card, or the person who already owns two 690s?

Speaking of...

Any word on whether the consumer-oriented Titan is going to bring prosumer-ish higher precision calculation? I'd love to retire the GTX 580 and have a 580 and a 680 to flog, I just don't see how such a profound architectural improvement and massive die plus tons of available VRAM could come in below the performance of a 580 for calculations, or a 680+580 pair for CUDA, even with the Fermi calculation advantage over early(/small) Kepler's high end.

From a practical standpoint I'm pretty sure I'm going to actually need all 16x of the PCI-e 2.0 I'm working with or else Titan is going to be rather choked for bandwidth.
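For reference, the bandwidth in question can be worked out by hand. This is a minimal sketch, assuming the standard PCIe 2.0 figures of 5 GT/s per lane with 8b/10b encoding; `pcie2_gb_s` is a made-up helper name, and the result is a theoretical per-direction ceiling, not a measured number.

```python
def pcie2_gb_s(lanes):
    """Theoretical one-way PCIe 2.0 bandwidth in GB/s for a given lane count."""
    gt_per_s = 5.0        # PCIe 2.0 signalling rate per lane (gigatransfers/s)
    encoding = 8 / 10     # 8b/10b line code: 8 payload bits per 10 bits on the wire
    return lanes * gt_per_s * encoding / 8   # divide by 8 to go from bits to bytes

for lanes in (8, 16):
    print(f"PCIe 2.0 x{lanes}: {pcie2_gb_s(lanes):.1f} GB/s per direction")
```

Dropping from x16 to x8 halves that ceiling, which is the choke point being worried about here.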

I'm not really settled on picking one up: there aren't many games that stress the overclocked 680, and a headless coprocessor is kinda handy. But I think recent games show that developers are getting the hang of doing graphics and physics on the same card at the same time after all, or there's just a glut of processing power to spare. In either case, I really don't see an unfettered Titan being somehow inferior at PhysX and other CUDA-based workloads, even if it is pulling double duty rendering.

# ? Feb 19, 2013 09:14

Factory Factory: Mar 19, 2010; This is what
Arcane Velocity was like.

I can only speculate, but let's assume that a GeForce 680 has raw compute about like half a Tesla K10 (alt Nvidia PDF), which is basically a workstation/HPC GeForce 690. I pick that instead of the Quadro K5000 because more detailed specs are readily available; the best I can find is that the GTX 680 peaks at ~3 TFLOPS (SP) @ 1006 MHz (plus boost?) and the K5000 at ~2.1 TFLOPS (SP) @ 706 MHz. Here's the K20X in more detail.

So that puts a 680-grade GK104 at 745 MHz around 2,290 GFLOPS (SP) and 1/24th that DP (~95 GFLOPS), or ~3.0 TFLOPS SP/125 GFLOPS DP for a stock-clock 680. The Tesla K20, the GK110 card with the same core count and clock speed as are rumored for the GeForce Titan, is 3.52 TFLOPS SP at 1/3 FP64, or 1.17 TFLOPS DP.
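For anyone who wants to check that arithmetic, here's a minimal Python sketch. It assumes the usual peak formula (cores x 2 FLOPs per clock for a fused multiply-add x clock speed); the core counts of 1536 for GK104 and 2496 for the K20 come from public spec sheets rather than this post, and `peak_gflops` is just an illustrative helper.

```python
def peak_gflops(cores, clock_mhz, dp_ratio):
    """Return (SP, DP) theoretical peak throughput in GFLOPS."""
    sp = cores * 2 * clock_mhz / 1000.0   # 2 FLOPs per core per clock (FMA)
    return sp, sp * dp_ratio

# (core count, clock in MHz, DP:SP ratio) - core counts are assumed, see above
CARDS = {
    "GK104 @ 745 MHz": (1536, 745, 1 / 24),
    "GTX 680 @ 1006 MHz": (1536, 1006, 1 / 24),
    "Tesla K20 @ 706 MHz": (2496, 706, 1 / 3),
}

for name, (cores, mhz, ratio) in CARDS.items():
    sp, dp = peak_gflops(cores, mhz, ratio)
    print(f"{name}: {sp:.0f} GFLOPS SP, {dp:.0f} GFLOPS DP")
```

These are theoretical ceilings only; boost clocks and real workloads will move the numbers.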

So the theoretical best-case compute performance of the chip is a moderate upgrade for SP, but an order of magnitude upgrade for DP. Kepler managed a number of compute upgrades over Fermi, like increased instruction-level parallelism, rather than Fermi's reliance on thread-level parallelism, and increased size and bandwidth for registers and L2 cache. On top of that, GK110 goes even further than GK104 with increased registers per thread and features like Hyper-Q, which is essentially a high-level hyperthreading scheme.

Each CUDA program loads a work queue to the GPU, which spawns threads. Fermi and GK104 cards can handle one queue, potentially leaving much of the hardware waiting if there are dependency or low-parallelism issues. With Hyper-Q, a GK110 can handle 32 different work queues and schedule these across SMXs that might otherwise be left idle. This is mostly something for ported MPI programs, though.
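To make the queueing point concrete, here's a deliberately crude toy model in Python. It is not how the hardware scheduler actually works: it assumes every kernel is small enough to leave most of the GPU idle, that each hardware queue runs one stream at a time, and that streams go to whichever queue frees up first. `makespan` and the stream timings are invented for illustration.

```python
import heapq

def makespan(kernel_times_per_stream, hw_queues):
    """Toy model: total time to drain all streams through `hw_queues` queues.

    Kernels within a stream run back to back; each queue handles one
    stream at a time; the next stream goes to the earliest-free queue.
    """
    free_at = [0.0] * hw_queues
    heapq.heapify(free_at)
    for kernels in kernel_times_per_stream:
        start = heapq.heappop(free_at)            # earliest-free queue
        heapq.heappush(free_at, start + sum(kernels))
    return max(free_at)

# Eight MPI-style ranks, each submitting two small kernels (arbitrary time units).
streams = [[1.0, 2.0]] * 8
print("1 queue:  ", makespan(streams, 1))    # everything serializes
print("32 queues:", makespan(streams, 32))   # the streams overlap
```

In this toy setup the single queue takes eight times as long, which is the idle-hardware problem Hyper-Q is meant to address. Real gains depend on how much of the GPU each kernel actually occupies.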

GK110 also includes a Dynamic Parallelism feature that basically lets kernels spawn new kernels, meaning work queues can generate their own successors if algorithms are rewritten to take advantage of this, rather than the current regime of having to hit the CPU each time. This requires a major software rewrite, though.

Of course, then all this stuff is going to be hampered in some way. GF110 had major artificial limits on its geometry and DP performance. From Quadro K5000 benchmarks, it looks like it at least has the geometry limits. And according to this page, the 680 really actually fights with the 580 more often than not, frequently horking up on larger workloads by running out of cache and/or registers. According to that page, and the algorithms tested, a GeForce 680 manages 1/3 its peak SP performance for steady state, and 1/10 that DP. So, ironically, it looks like SP is throttled, but DP is so bad anyway that it's not.
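The "SP is throttled but DP isn't" bit falls out of the ratios. A rough sketch, assuming a stock 680 peak of about 3,090 GFLOPS SP (1536 cores x 2 x 1006 MHz, core count not from this post) and reading "1/10 that DP" as one tenth of the sustained SP figure:

```python
PEAK_SP_GFLOPS = 3090.0                  # assumed stock GTX 680 peak
PEAK_DP_GFLOPS = PEAK_SP_GFLOPS / 24     # GK104 runs FP64 at 1/24 rate

sustained_sp = PEAK_SP_GFLOPS / 3        # "1/3 its peak SP"
sustained_dp = sustained_sp / 10         # "1/10 that DP"

sp_efficiency = sustained_sp / PEAK_SP_GFLOPS   # fraction of SP peak reached
dp_efficiency = sustained_dp / PEAK_DP_GFLOPS   # fraction of DP peak reached

print(f"SP: {sustained_sp:.0f} GFLOPS sustained, {sp_efficiency:.0%} of peak")
print(f"DP: {sustained_dp:.0f} GFLOPS sustained, {dp_efficiency:.0%} of peak")
```

Under those assumptions DP runs at about 80% of its (tiny) theoretical peak while SP sits near a third of its peak, so the DP units are the ones closer to being fully used.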

So what can we expect for a GeForce-ified K20? Well, we'll probably see some DP throttling, at least the 1/2 extra of the Quadro 5000 and maybe up to the 1/6 of the mid-range Fermi Quadros - which still leaves it a solid bit faster than your existing GeForces. DP isn't terribly important in a consumer card. Single Precision is so much higher than the GeForce 580 that it may be adjusted, as well, but I can't see it being cranked much lower than a GeForce 690 can handle, i.e. ~2.2 TFLOPS SP. We may see Hyper-Q or Dynamic Parallelism disabled, but unless that can be driver-locked, I would tend to doubt it. The Titan might have use as a cheap dev board for those HPC features, but it would likely never replace a Tesla because of the probable lack of ECC memory and cache.

So: I'd call it an upgrade. A compelling upgrade? Not sure. But I'd bet that the performance per watt would be a loving TON higher than a 580 + 680. And you could become an industry partner/shill for Nvidia and push Hyper-Q et al. on CUDA DSP programmers.

# ? Feb 19, 2013 11:12

mayodreams: Jul 4, 2003; Hello darkness,
my old friend

The NDA lifted on Titan.

http://www.anandtech.com/show/6760/nvidias-geforce-gtx-titan-part-1

# ? Feb 19, 2013 15:16

TyrantWD: Nov 6, 2010; Ignore my doomerism, I don't think better things are possible

mayodreams posted:

The NDA lifted on Titan.

http://www.anandtech.com/show/6760/nvidias-geforce-gtx-titan-part-1

I'm disappointed with the price. I was aiming for a new build once Haswell comes out, and at 999 it doesn't alter the current price/performance ratio of the market. I knew the Titan would be out of my budget anyway, but I was hoping it might result in more bang for buck at the 600-700 range.

# ? Feb 19, 2013 16:01

Factory Factory: Mar 19, 2010; This is what
Arcane Velocity was like.

Blah blah blah FP64 is 1/3 of single precision. Holy poo poo. Well, I was on the right track: it's a high-end personal compute/entry-level enterprise/HPC compute card.

Glad they have a page on Compute features... lessee... Hyper-Q is disabled, as is a DMA pass-through feature to third-party PCIe peripherals. ECC also isn't present, no surprise there.

Huh. FP64 is toggleable between 1/24th and 1/3 to let you set aside TDP for SP, if that's your thing (and for consumer workloads, it is). If it's set to 1/3, TDP limits kick in and boost clocking is disabled. That's different. 4.5 TFLOPS peak SP, but less than a third of that - 1.3 TFLOPS - on double precision. So it's 4.5 TFLOPS SP/187.5 GFLOPS DP or 3.9 TFLOPS SP/1.3 TFLOPS DP. Further analysis on Thursday.
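The two modes as a quick calculation, using only the SP figures and FP64 ratios quoted above (`dp_gflops` is just an illustrative helper):

```python
def dp_gflops(sp_gflops, fp64_ratio):
    """Double-precision peak implied by an SP peak and an FP64:FP32 ratio."""
    return sp_gflops * fp64_ratio

# mode name -> (peak SP in GFLOPS, FP64 ratio), both taken from the post above
MODES = {
    "FP64 at 1/24, boost on": (4500.0, 1 / 24),
    "FP64 at 1/3, boost off": (3900.0, 1 / 3),
}

for mode, (sp, ratio) in MODES.items():
    print(f"{mode}: {sp:.0f} GFLOPS SP, {dp_gflops(sp, ratio):.1f} GFLOPS DP")

# What flipping the toggle costs and buys.
sp_loss = 1 - 3900.0 / 4500.0
dp_gain = dp_gflops(3900.0, 1 / 3) / dp_gflops(4500.0, 1 / 24)
print(f"SP drops by {sp_loss:.0%}, DP goes up {dp_gain:.1f}x")
```

So the toggle trades roughly 13% of SP throughput for about a 7x jump in DP, at least on paper.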

# ? Feb 19, 2013 16:24

Agreed: Dec 30, 2003; The price of meat has just gone up, and your old lady has just gone down

Factory Factory posted:

Blah blah blah FP64 is 1/3 of single precision. Holy poo poo.

Well that's absolutely fantastic! Adios, weird dual card setup.

# ? Feb 19, 2013 20:04

Athropos: May 4, 2004; "Skeletons are Number One! Flesh just slows you down."

So in a nutshell, could a Titan be more powerful than my SLI 680s?

# ? Feb 20, 2013 03:58

Professor Science: Mar 8, 2006; diplodocus + mortarboard = party

Athropos posted:

So in a nutshell, could a Titan be more powerful than my SLI 680s?

if you're doing DP, yes. if you're doing gaming, no. some of it is a major improvement for CUDA (e.g., CUDA streams are actually usable and predictable on GK110 instead of some weird invocation that you do to try to get any extra performance that happens to show up on every previous chip), some of it is stuff that might be usable (dynamic work creation).

the real point of this card, as far as I can tell, is not gaming; it's something that academics can buy in order to start developing on GK110 without having to spend $4k per board, while not being something they could buy in any significant volume to use for deployments. it makes a lot of sense to do so, too.

# ? Feb 20, 2013 04:08

mayodreams: Jul 4, 2003; Hello darkness,
my old friend

Performance of the Titan is out now.

http://www.anandtech.com/show/6774/nvidias-geforce-gtx-titan-part-2-titans-performance-unveiled

# ? Feb 21, 2013 16:08

Factory Factory: Mar 19, 2010; This is what
Arcane Velocity was like.

mayodreams posted:

Performance of the Titan is out now.

http://www.anandtech.com/show/6774/nvidias-geforce-gtx-titan-part-2-titans-performance-unveiled

Between the Titan and the surprising showing from the 7970GE, the 680 is looking a bit crappy. Which is a ludicrous thought, but there you go.

# ? Feb 21, 2013 17:03

mikul: Feb 8, 2004

I'm currently using onboard graphics from the i7 3770, but have an old 8800 GTS lying around. As I'm doing a bit of video editing with CUDA supported programmes, should I plug the 8800 in, or is it so old that it won't make a difference?

# ? Feb 21, 2013 19:04

Alereon: Feb 6, 2004; Dehumanize yourself and face to Trumpshed; College Slice

If the application supports CUDA for that card (it may only be on GTX 400+ for example, newer cards support newer versions of the CUDA API) then yes, put it in and install the latest drivers from the nVidia site.

# ? Feb 21, 2013 19:17

Factory Factory: Mar 19, 2010; This is what
Arcane Velocity was like.

What programs? For Adobe Premiere CS5.x Mercury Engine, an 8800 GTS 1GB is literally the minimum supported card, and even then you have to hack support in; it's not gonna be a powerhouse at all. More modern architectures are just TONS better at CUDA/OpenCL stuff.

# ? Feb 21, 2013 19:19

mikul: Feb 8, 2004

Factory Factory posted:

What programs? For Adobe Premiere CS5.x Mercury Engine, an 8800 GTS 1GB is literally the minimum supported card, and even then you have to hack support in; it's not gonna be a powerhouse at all. More modern architectures are just TONS better at CUDA/OpenCL stuff.

This is what I was looking at

http://www.sonycreativesoftware.com/moviestudiope

What would be a good, relatively cheap card to get then? Note I've got no interest in gaming.

# ? Feb 21, 2013 19:24

Factory Factory: Mar 19, 2010; This is what
Arcane Velocity was like.

That supports OpenCL on Intel HD 4000, which means you can have some GPU acceleration just using the onboard graphics, no card required. The 8800 GTS is not supported; the system requirements state rather plainly that it needs a GeForce 200 series or newer (400, 500, 600 series preferred).

I would just start with that for now, personally. Otherwise... Hell's bells, not an easy question to answer. Maybe get one of the upcoming 384-shader GCN-based AMD Radeon 7000/8000 series cards, if you aren't satisfied with the HD 4000.

# ? Feb 21, 2013 19:38

Dogen: May 5, 2002; Bury my body down by the highwayside, so that my old evil spirit can get a Greyhound bus and ride

I'm feeling pretty good about my tricked out 580 right now. I mean, $1000? Is the $400-500 next gen part far off?

# ? Feb 21, 2013 20:16

mikul: Feb 8, 2004

It looks like getting a DisplayPort to dual-link DVI adapter is going to cost quite a bit, so I think I'll get a new graphics card. What would you recommend for around £70/$100?

# ? Feb 21, 2013 20:18

Factory Factory: Mar 19, 2010; This is what
Arcane Velocity was like.

Dogen posted:

I'm feeling pretty good about my tricked out 580 right now. I mean, $1000? Is the $400-500 next gen part far off?

Another year or so, from both Red and Green.

mikul posted:

It looks like getting a DisplayPort to dual-link DVI adapter is going to cost quite a bit, so I think I'll get a new graphics card. What would you recommend for around £70/$100?

Radeon 7750, I guess. Cheapest AMD GCN card, and Nvidia's price/performance at that price level is terrible.

# ? Feb 21, 2013 20:38

mikul: Feb 8, 2004

Factory Factory posted:

Radeon 7750, I guess. Cheapest AMD GCN card, and Nvidia's price/performance at that price level is terrible.

Those cards seem to come with either 1GB of GDDR5 or 2GB of DDR3. Which would be best for editing 1080p video while working on a 2560x1600 screen?

# ? Feb 21, 2013 20:57

Factory Factory: Mar 19, 2010; This is what
Arcane Velocity was like.

1GB GDDR5, no contest. The extra gig of RAM won't make a difference, but GDDR5's more than doubled memory bandwidth will help the OpenCL-accelerated performance.
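A back-of-the-envelope version of that bandwidth gap, as a sketch. The 128-bit bus and the per-pin data rates (4.5 Gbps for GDDR5, 1.6 Gbps for DDR3) are assumed typical figures for 7750 boards, not numbers from this thread, so check the spec sheet of the specific card.

```python
def bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s: bus width in bytes x per-pin data rate."""
    return bus_width_bits / 8 * data_rate_gbps

gddr5 = bandwidth_gb_s(128, 4.5)   # assumed GDDR5 7750: 128-bit @ 4.5 Gbps
ddr3 = bandwidth_gb_s(128, 1.6)    # assumed DDR3 7750: 128-bit @ 1.6 Gbps

print(f"GDDR5: {gddr5:.1f} GB/s")
print(f"DDR3:  {ddr3:.1f} GB/s")
print(f"Ratio: {gddr5 / ddr3:.1f}x")
```

With those assumed figures the GDDR5 card has nearly three times the bandwidth, which matters far more for GPU-accelerated effects than the extra gigabyte of slower memory.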

# ? Feb 21, 2013 21:01

mikul: Feb 8, 2004

This seems like a good deal and I like the idea of a silent card. But it does look enormous. Would there be any issue fitting it in the case?

http://www.amazon.co.uk/gp/product/B008NGDG9K/ref=gno_cart_title_1?ie=UTF8&psc=1&smid=A3P5ROKL5A1OLE

# ? Feb 21, 2013 21:13

Factory Factory: Mar 19, 2010; This is what
Arcane Velocity was like.

What case? It's almost certain that that card will fit in whatever you've got, as it's not that much bigger than a stock 7750, just a bit bigger where those heatpipes stick up and extending back a bit farther.

# ? Feb 21, 2013 21:16

mikul: Feb 8, 2004

Factory Factory posted:

What case? It's almost certain that that card will fit in whatever you've got, as it's not that much bigger than a stock 7750, just a bit bigger where those heatpipes stick up and extending back a bit farther.

This one.

http://www.amazon.co.uk/gp/product/B004ZH18G4/ref=oh_details_o04_s01_i00?ie=UTF8&psc=1

And this is the mobo, for what it's worth
http://www.amazon.co.uk/gp/product/B007KZQERQ/ref=oh_details_o04_s02_i00?ie=UTF8&psc=1

# ? Feb 21, 2013 21:20

Factory Factory: Mar 19, 2010; This is what
Arcane Velocity was like.

That looks borderline. It looks like there's some room for those heatpipes, but not a lot.

Just get one with a fan.

# ? Feb 21, 2013 21:22

mikul: Feb 8, 2004

So I went with this in the end.

http://www.amazon.co.uk/gp/product/B008CJVA8C/ref=oh_details_o00_s00_i00?ie=UTF8&psc=1

I must admit I was a little confused reading the OP about whether or not Intel's QuickSync would be better for my needs. I'm going to be importing AVCHD files, playing around with them and then saving a finished project.

# ? Feb 21, 2013 21:58

Alereon: Feb 6, 2004; Dehumanize yourself and face to Trumpshed; College Slice

In general QuickSync is the fastest but not quite as good quality as most software encoders, x264 with OpenCL enabled is the best quality and second in performance, and everything else falls in various places on the range. I have no idea if you can use external codecs with that Sony software though.

# ? Feb 21, 2013 22:12

mikul: Feb 8, 2004

Alereon posted:

In general QuickSync is the fastest but not quite as good quality as most software encoders, x264 with OpenCL enabled is the best quality and second in performance, and everything else falls in various places on the range. I have no idea if you can use external codecs with that Sony software though.

Would I be given the option of utilising QuickSync or OpenCL, or does having a graphics card disable QuickSync?

# ? Feb 21, 2013 22:16

Zotix: Aug 14, 2011

So with all of this Titan talk, has there been any word on the 700 series? I have a GTX 580 and I'm not sure whether to purchase a GTX 670 or wait until the 700 series.

# ? Feb 24, 2013 02:08

Factory Factory: Mar 19, 2010; This is what
Arcane Velocity was like.

Maxwell (the Kepler successor uarch) won't be out until 2014. If there are any "700" series cards this year, they will likely just be Fermi and Kepler rebadges. In fact, there are already some of those in the laptop field, just like there are "Radeon HD 8000M" parts that use first-gen GCN1.

No fundamentally new GPUs are due out for the rest of this year, really. The most shocking changes are low-end GCN Radeons and maybe the Kepler Refresh version of the current Kepler cards.

# ? Feb 24, 2013 02:24

forbidden dialectics: Jul 26, 2005

Factory Factory posted:

Maxwell (the Kepler successor uarch) won't be out until 2014. If there are any "700" series cards this year, they will likely just be Fermi and Kepler rebadges. In fact, there are already some of those in the laptop field, just like there are "Radeon HD 8000M" parts that use first-gen GCN1.

No fundamentally new GPUs are due out for the rest of this year, really. The most shocking changes are low-end GCN Radeons and maybe the Kepler Refresh version of the current Kepler cards.

Is Maxwell the one that is supposed to have a separate on-board ARM CPU for doing stuff like vertex calculations?

# ? Feb 24, 2013 05:52

Don Lapre: Mar 28, 2001; If you're having problems you're either holding the phone wrong or you have tiny girl hands.

Amazon has EVGA 670 FTW 2gb cards for $315 open box.

Shows 11 left right now.

http://www.amazon.com/gp/offer-listing/B0083Y6MV6/ref=sr_1_1_olp?ie=UTF8&qid=1361753609&sr=8-1&keywords=evga+ftw&condition=used

# ? Feb 25, 2013 01:55

Factory Factory: Mar 19, 2010; This is what
Arcane Velocity was like.

Nostrum posted:

Is Maxwell the one that is supposed to have a separate on-board ARM CPU for doing stuff like vertex calculations?

I think so? I don't know what it's for, and I'm only reading the same rumors on that as anyone else is, but yeah.

# ? Feb 25, 2013 06:14

SocketSeven: Dec 5, 2012

So AMD has been advertising this new thing they call TressFX.

http://blogs.amd.com/play/tressfx/

Anyone have thoughts on it yet? Or is everyone still waiting for AMD's official announcement about it?

# ? Feb 25, 2013 15:30

KillHour: Oct 28, 2007

Don Lapre posted:

Amazon has EVGA 670 FTW 2gb cards for $315 open box.

Shows 11 left right now.

http://www.amazon.com/gp/offer-listing/B0083Y6MV6/ref=sr_1_1_olp?ie=UTF8&qid=1361753609&sr=8-1&keywords=evga+ftw&condition=used

Jumped on this like a fat kid on the last Twinkie.

# ? Feb 25, 2013 15:34

Don Lapre: Mar 28, 2001; If you're having problems you're either holding the phone wrong or you have tiny girl hands.

KillHour posted:

Jumped on this like a fat kid on the last Twinkie.

Since they have so many, I wonder if they had a bunch of busted boxes. Mine will be here tomorrow, can't wait.

# ? Feb 25, 2013 16:46

Malloc Voidstar: May 7, 2007; Fuck the cowboys. Unf. Fuck em hard.

SocketSeven posted:

So AMD has been advertising this new thing they call TressFX.

http://blogs.amd.com/play/tressfx/

Anyone have thoughts on it yet? Or is everyone still waiting for AMD's official announcement about it?

It seems to pretty blatantly be a proper hair rendering system (so of course some people have called it potentially AMD's version of SpeedTree).
It'd be nice to have hair that actually acts like hair, not a weird hairlike hat.

# ? Feb 25, 2013 17:05

KillHour: Oct 28, 2007

I'm going to bench before and after. Anyone want to guess how much of an FPS bump I'll get?

Current specs:

Intel i5 3570K (OC: 4.5GHz)
16GB DDR3 1600
BFG GTX 260 OC2 MAXCORE 55 (x2 in SLI)
Sandisk Extreme 120GB SATA III SSD

# ? Feb 25, 2013 17:11

SocketSeven: Dec 5, 2012

It'd be nice to have hair that looks right.

It will be very cool if TressFX manages it, but I have my doubts. I guess I'll have to wait for more info on it and some tech demos to be able to tell.

I wonder if AMD/ATI thinks this will be a real game changer for their market share, or if it's just shiny new tech. They've been standing behind NV/Intel for years now, is this an attempt to step ahead of NV/Intel? Or just something to compete with PhysX?

# ? Feb 25, 2013 17:13

slidebite: Nov 6, 2005; Good egg

e: Can't read thread title :shobon:

slidebite fucked around with this message at 17:54 on Feb 25, 2013

# ? Feb 25, 2013 17:39

Factory Factory: Mar 19, 2010; This is what
Arcane Velocity was like.

slidebite posted:

I'd appreciate some opinions here. I have been out of the GPU "news" for near 2 years so I really don't know much about the status quo and immediate future other than trying to catch up the last few pages of this thread.

I've got an i5-3500K, 16GB Ram and a MSI GTX570. I play @ 1920x1080.

The system plays well, but I am toying with the idea of getting a 670 to replace the 570 for a few reasons.

1 - I probably won't do any other "major" upgrades until 2014 so this should give me a bit more life.

2- I can still get a few bucks for the 570 to subsidize the purchase.

3- The 570 is pretty drat noisy when doing almost anything 3D.

Stupid idea?

1) Read the thread title.
2) Go to system building/parts picking thread
3) Wait for the next generation of cards. A 670 is an upgrade, but probably not a cost-effective one.

# ? Feb 25, 2013 17:46
