speng31b
May 8, 2010

Paniolo posted:

DirectX 9 came out in 2002 (dear god this makes me feel old). Anyone with a video card that can't support DX9-level features like instancing is not going to be playing games on that machine anyway. At a certain point you have to ask yourself, "am I sacrificing efficiency on modern hardware in order to support hardware that nobody is going to be using anyway?"

I was referring more to OpenGL. The ARB instancing extensions are 2008, unless I'm missing something.

e: from what I can tell only stuff from the GeForce 8800+ era definitely supports the OpenGL extensions.

speng31b fucked around with this message at 19:02 on Apr 12, 2011


Spite
Jul 27, 2001

Small chance of that...

octoroon posted:

I was referring more to OpenGL. The ARB instancing extensions are 2008, unless I'm missing something.

e: from what I can tell only stuff from the GeForce 8800+ era definitely supports the OpenGL extensions.

That's true, but consider that the g8x generation is like 5 years old at this point. Just about anything that's actually worth developing for supports instancing.

Ironically, nvidia does not support instancing in hardware on those parts - it's implemented in the driver. Still is faster than making multiple draws yourself.

As always, you have to play around with it and see what gives the best perf for your app.

speng31b
May 8, 2010

Spite posted:

That's true, but consider that the g8x generation is like 5 years old at this point. Just about anything that's actually worth developing for supports instancing.

Ironically, nvidia does not support instancing in hardware on those parts - it's implemented in the driver. Still is faster than making multiple draws yourself.

As always, you have to play around with it and see what gives the best perf for your app.

Yeah, I guess I'm feeling my age. 8800 still feels "newish" to me even though it's practically antique.

Still, this pretty much means that if you're going for lower-end support for a casual game, you're limited to double-buffering VBOs since instancing isn't an option.
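The double-buffering fallback is just alternating between two buffer objects each frame so the CPU never writes into the buffer the GPU may still be drawing from. A minimal sketch of the bookkeeping (the handle values are placeholders; in GL they would come from glGenBuffers):

```cpp
// Round-robin between two buffer handles: write into one while the GPU
// may still be reading the other. Purely the CPU-side bookkeeping; the
// actual glBufferData/glBufferSubData calls are omitted.
struct DoubleBufferedVBO {
    unsigned handles[2];  // e.g. two names from glGenBuffers
    int frame;            // monotonically increasing frame counter
    unsigned current() const { return handles[frame & 1]; }
    void nextFrame() { ++frame; }
};
```

Each frame you upload into `current()`, draw from it, then call `nextFrame()` before the next upload.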

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!
I still don't get the point of instancing. High-poly models don't benefit from it unless there are multiples of the same model at the same LOD, low-poly models are cheap enough to just dump to a dynamic buffer.

speng31b
May 8, 2010

Is there any actual, tangible benefit to using proper OpenGL 3+ context creation as opposed to just getting an old context and using something like GLEW for everything, anyhow? It seems like you end up with the same functionality either way.

Cowcatcher
Dec 23, 2005

OUR PEOPLE WERE BORN OF THE SKY
Does anyone know a good 2D graphics library for C? Preferably something OpenGL-based?

speng31b
May 8, 2010

Cowcatcher posted:

Does anyone know a good 2D graphics library for C? Preferably something OpenGL-based?

SFML is a strong C++ library with built-in wrappers around OpenGL for 2D stuff. It's nice because it's still under constant development, so there's a lot of good support and quick developer responses to be had.

The old stand-by is, of course, SDL, the quintessential cross-platform OpenGL 2D windowing and rendering library (this one is straight C, but of course can be used with C++). The good news about SDL is that it's been developed for so long and used so widely that it's incredibly robust, and you probably won't run into any bugs with the stable release.

Both of those libraries can be used either as wrappers around OpenGL for 2d rendering, or just to create a cross-platform context and do your own manual OpenGL work.

speng31b fucked around with this message at 17:47 on Apr 13, 2011

Cowcatcher
Dec 23, 2005


octoroon posted:

SFML is a strong C++ library with built-in wrappers around OpenGL for 2D stuff. It's nice because it's still under constant development, so there's a lot of good support and quick developer responses to be had.

The old stand-by is, of course, SDL, the quintessential cross-platform OpenGL 2D windowing and rendering library (this one is straight C, but of course can be used with C++). The good news about SDL is that it's been developed for so long and used so widely that it's incredibly robust, and you probably won't run into any bugs with the stable release.

Both of those libraries can be used either as wrappers around OpenGL for 2d rendering, or just to create a cross-platform context and do your own manual OpenGL work.

SFML looks like exactly what I need, thanks!

Spite
Jul 27, 2001


OneEightHundred posted:

I still don't get the point of instancing. High-poly models don't benefit from it unless there are multiples of the same model at the same LOD, low-poly models are cheap enough to just dump to a dynamic buffer.

Well, stuff like single-quad grass is cheap enough to do dynamically, but instancing can be really useful for stuff like rocks and trees. Or for stuff like RTS games where you have a crapload of the same type of unit running round onscreen.

brian
Sep 11, 2001
I obtained this title through beard tax.

Thanks for the help with the software rendering, everyone. Got the Z buffer in after spending two days with it not working correctly, because I was working out the Z wrong thanks to a stupid mistake: using edgeZDiff1 instead of edgeZDiff2. Anyway, surprisingly it seems only slightly slower. I'm still unsure where the bottleneck is; Flash is kind of bizarre and I haven't done any proper profiling.

speng31b
May 8, 2010

Still curious if there's any tangible benefit to getting a newer 3.0+ OpenGL context. Is there some sort of optimization or practical improvement? Because right now it seems like just creating an old context and getting any entry points I need for newer functionality with GLEW is exactly the same as getting a 3.0+ compatibility context. Is there any reason at ALL to get a newer context if I don't care about asking for a core profile so that deprecated functionality literally throws an error?

speng31b fucked around with this message at 18:27 on Apr 16, 2011

Spite
Jul 27, 2001


octoroon posted:

Still curious if there's any tangible benefit to getting a newer 3.0+ OpenGL context. Is there some sort of optimization or practical improvement? Because right now it seems like just creating an old context and getting any entry points I need for newer functionality with GLEW is exactly the same as getting a 3.0+ compatibility context. Is there any reason at ALL to get a newer context if I don't care about asking for a core profile so that deprecated functionality literally throws an error?

Yes. 3.2+ removes a ton of outdated and stupid poo poo. This means the runtime can skip the checks and not have to worry about the fixed function built-in stuff. Also, mandating the use of VAO means you can optimize vertex submission in a couple ways that would be harder with the standard Bind/VertexPointer, etc method.

Now, I'm not sure if the windows drivers have all these optimizations, but it's worth 10% or so on OSX.

PDP-1
Oct 12, 2004

It's a beautiful day in the neighborhood.
Are there any known good algorithms for taking a list of randomly oriented triangles and joining them together into a mesh with minimal vertex and index lists?

I have a marching cubes algorithm that spits out a bunch of individual triangles. I'd like to take that output and form a mesh with vertex normals, texture coordinates, etc. I can imagine ways to do this but it seems like it's a common enough problem that someone else has probably solved it better than I could.
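The usual answer to PDP-1's problem is vertex welding: hash each position of the triangle soup and reuse the index of any vertex already seen. A minimal sketch (this welds bit-identical positions only; a production version would quantize to an epsilon grid first, since a generator that doesn't share edge interpolants can emit slightly different floats for the same point):

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

struct Vec3 { float x, y, z; };

// Weld a triangle soup (3 vertices per triangle, in order) into a shared
// vertex list plus an index list referencing it.
void weldMesh(const std::vector<Vec3>& soup,
              std::vector<Vec3>& outVerts,
              std::vector<uint32_t>& outIndices) {
    std::unordered_map<std::string, uint32_t> seen;
    outVerts.clear();
    outIndices.clear();
    for (const Vec3& v : soup) {
        // Key on the raw bytes of the position (exact match).
        std::string key(reinterpret_cast<const char*>(&v), sizeof(Vec3));
        auto it = seen.find(key);
        if (it == seen.end()) {
            it = seen.emplace(key, static_cast<uint32_t>(outVerts.size())).first;
            outVerts.push_back(v);
        }
        outIndices.push_back(it->second);
    }
}
```

Once you have the index list, smooth vertex normals fall out naturally: accumulate each face normal into its three referenced vertices, then normalize the sums.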

speng31b
May 8, 2010

Spite posted:

Yes. 3.2+ removes a ton of outdated and stupid poo poo. This means the runtime can skip the checks and not have to worry about the fixed function built-in stuff. Also, mandating the use of VAO means you can optimize vertex submission in a couple ways that would be harder with the standard Bind/VertexPointer, etc method.

Now, I'm not sure if the windows drivers have all these optimizations, but it's worth 10% or so on OSX.

So I would assume asking for a compatibility context negates these benefits.

Spite
Jul 27, 2001


octoroon posted:

So I would assume asking for a compatibility context negates these benefits.

It would depend on how the driver is implemented, but I would assume so. I haven't profiled it though.

UraniumAnchor
May 21, 2006

Not a walrus.
Maybe I'm not reading the documentation correctly but does glBeginConditionalRender actually stall if you tell it GL_QUERY_WAIT and the query in question hasn't completed yet, or does it file it away as 'do this as soon as you can' and immediately return?

OneEightHundred
Feb 28, 2008

WAIT/NO_WAIT is one of those hazy things that depends on the application and driver. Generally speaking, the entire point of ConditionalRender is to not stall your application, but it may cause the driver thread or hardware to stall.

Basically, if the card has unutilized resources that could be used to render some geometry, but the query hasn't returned yet, WAIT will cause it to stall, and NO_WAIT will allow it to start rendering anyway if the driver thinks that would be faster than waiting.

Whether there's a performance gain will depend on your application, but I would say you're probably better off using NO_WAIT unless skipping the query would have a visible effect on what you're rendering (i.e. lens flares). The REGION versions are designed for SLI, though I don't really know why you'd use REGION_WAIT.

Harokey
Jun 12, 2003

Memory is RAM! Oh dear!
Is there a 2d graphics question thread?

Anyway, my question is 2d graphics related.

I'm writing a library to draw arbitrary shapes. The interface supports a "width" which puts a border of that width around the polygon. I've been implementing this by first drawing the shape of normal size of the color of the outline, and then drawing a second, shrunk shape the color of the shape's fill.

This has worked fine for normal shapes, but I'm having a bit of trouble doing it for my arbitrary "polygon" shape. Is there an algorithm to do this? Or am I maybe going about this the wrong way?

Deep Dish Fuckfest
Sep 6, 2006

Advanced
Computer Touching


Toilet Rascal
To draw the second shape you can probably go vertex by vertex, moving each one a set distance along the bisector of the angle formed by the edges at that vertex. That would handle things that aren't convex too, as long as you keep track of whether the angle is >180 degrees or not.

Unless I'm misunderstanding the question and you're asking for something else.

Edit: Actually now that I think about it, the distance to move along the bisector would vary from vertex to vertex, and should be width/sin(theta/2) if theta is the full angle between the edges at that vertex.

Deep Dish Fuckfest fucked around with this message at 22:11 on Apr 29, 2011

OneEightHundred
Feb 28, 2008



halfangle = normalize(normalize(A-B) + normalize(C-B))

D = B + halfangle * (thickness / sqrt(1 - dotproduct(halfangle, normalize(C-B))^2))

This will work fine as long as the polygon is convex. If it's not convex, you are probably better off expanding the lines in both directions and computing a cap plane by pushing off of the halfangle by the line thickness. i.e.
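A quick self-contained check of the bisector inset described above (the Vec2 type and helper names here are illustrative): moving the vertex by thickness / sin(theta/2) along the bisector keeps the perpendicular distance to both adjacent edges exactly equal to the thickness.

```cpp
#include <algorithm>
#include <cmath>

struct Vec2 { double x, y; };

static Vec2 sub(Vec2 a, Vec2 b) { return {a.x - b.x, a.y - b.y}; }
static Vec2 add(Vec2 a, Vec2 b) { return {a.x + b.x, a.y + b.y}; }
static Vec2 mul(Vec2 a, double s) { return {a.x * s, a.y * s}; }
static double dot(Vec2 a, Vec2 b) { return a.x * b.x + a.y * b.y; }
static Vec2 normalize(Vec2 v) {
    double len = std::sqrt(dot(v, v));
    return {v.x / len, v.y / len};
}

// Inset vertex B of the chain A-B-C by `thickness` along the angle
// bisector. dot(bisector, edge) gives cos(theta/2), so the distance to
// move is thickness / sin(theta/2) = thickness / sqrt(1 - cos^2).
Vec2 insetVertex(Vec2 A, Vec2 B, Vec2 C, double thickness) {
    Vec2 bisector = normalize(add(normalize(sub(A, B)), normalize(sub(C, B))));
    double cosHalf = dot(bisector, normalize(sub(C, B)));
    double sinHalf = std::sqrt(std::max(0.0, 1.0 - cosHalf * cosHalf));
    return add(B, mul(bisector, thickness / sinHalf));
}
```

For a right angle at B with thickness 0.1, the vertex lands at distance 0.1 from both edges, as expected. Reflex vertices (theta > 180 degrees) still need the sign handling Deep Dish Fuckfest mentioned.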

OneEightHundred fucked around with this message at 23:45 on Apr 29, 2011

roomforthetuna
Mar 22, 2005

I don't need to know anything about virii! My CUSTOM PROGRAM keeps me protected! It's not like they'll try to come in through the Internet or something!
How do you set a skeleton position for a shader that skins the model, in a way that isn't disastrously sluggish, in Direct3D?

I'm noticing that if I have three differently-posed skeletons in a scene my framerate drops dramatically - what I'm doing right now is, before each render,
code:
Shader->SetMatrixArray(HandleToSkelMatrixArrayInShader,LocalMatrixArrayPointer,NumberOfBonesInSkeleton);
I'm fairly sure that's the problem, since if I omit that call for the later models the framerate collapse doesn't happen. (And all the humanoid models get synchronised.)

Should I be using a different shader object for each figure so they're not waiting to overwrite the same piece of graphics memory? Should I just be sending skeleton data in in a completely different way?

UraniumAnchor
May 21, 2006


roomforthetuna posted:

How do you set a skeleton position for a shader that skins the model, in a way that isn't disastrously sluggish, in Direct3D?

Would Constant Buffers help? You didn't specify which version so if you're stuck with 9 then you can't use those.

Paniolo
Oct 9, 2007

Heads will roll.

roomforthetuna posted:

How do you set a skeleton position for a shader that skins the model, in a way that isn't disastrously sluggish, in Direct3D?

I'm noticing that if I have three differently-posed skeletons in a scene my framerate drops dramatically - what I'm doing right now is, before each render,
code:
Shader->SetMatrixArray(HandleToSkelMatrixArrayInShader,LocalMatrixArrayPointer,NumberOfBonesInSkeleton);
I'm fairly sure that's the problem, since if I omit that call for the later models the framerate collapse doesn't happen. (And all the humanoid models get synchronised.)

Should I be using a different shader object for each figure so they're not waiting to overwrite the same piece of graphics memory? Should I just be sending skeleton data in in a completely different way?

You're only rendering three meshes? That shouldn't be impacting your frame rate at all.

Or are you rendering lots and lots of meshes, with three different poses, and you're copying the entire bone matrix array before each draw call? That could impact your frame rate, but it's easily fixed by sorting your meshes in order to minimize state changes.

roomforthetuna
Mar 22, 2005


Paniolo posted:

You're only rendering three meshes? That shouldn't be impacting your frame rate at all.
Yeah, only three meshes (or at least something on that order of magnitude) and it's taking me from my capped 60fps to 40fps. Which wouldn't be a problem except there's going to be more than a few guys running around.

UraniumAnchor posted:

Would Constant Buffers help? You didn't specify which version so if you're stuck with 9 then you can't use those.
Oh, yeah, sorry, it's Direct3D 9 so I don't even know what Constant Buffers are (in a DirectX context anyway). It does sound like a thing that would help!

Basically, the problem resembles when I was doing vertexbuffer->Lock() , write , Unlock() , render, with non-discardable buffers - everything would get very very slow with only like 20-30 very small buffer rewrites. So I'm guessing my bone matrices declared in the shader are acting the same way, as a bit of graphics memory that has to be locked and overwritten, and thus delays until the previous render using that memory is completed.

But in this case I have no idea how to have multiple bits of memory for that, unless having multiple instances of the same shader is the way you do this, but that doesn't seem right.

Paniolo
Oct 9, 2007

Well, I would try to narrow it down by manually passing the matrices using SetVertexShaderConstantF. If that fixes it, then you know it's something in the effect system.

roomforthetuna
Mar 22, 2005

That's approximately what I've been doing while hoping someone else had a better answer - result, I was apparently completely mistaken about what the problem was, sorry about that.

I'm still getting the big frame rate drop, but now all I'm doing that causes the framerate to drop from 60fps to 40fps is...
code:
shader->SetTechnique(a_handle);
pD3DDevice->SetVertexDeclaration(the_appropriate_declaration);
pD3DDevice->SetStreamSource(vertexbuffer,offset,stride);
pD3DDevice->SetIndices(indexbuffer);
UINT passes;
shader->Begin(&passes,0);
for (UINT pass=0; pass<passes; pass++) {
  shader->BeginPass(pass);
  pD3DDevice->DrawIndexedPrimitive(D3DPT_TRIANGLELIST,0,0,numvertices,0,length);
  shader->EndPass(pass);
}
shader->End();
Three times over.

Making this even weirder to me is that there are already 5 or more of that happening each frame without it going below 60fps, so it seems strange for just 3 more to almost halve (maybe worse since it's capped at 60) the framerate. I imagine I'm doing something horribly wrong here and OneEightHundred is going to slap me for it. I'm sorry, game logic is my area, 3D API is not my comfort zone.

Edit: It appears that there probably isn't anything too wrong with that, as adjusting the shader to only calculate on 2 or 3 bones instead of 4 brings the framerate back up to 50 from 40. So it appears the problem is either with my shader or my meshes. Since the shader is approximately Microsoft's example shader for skinning, it seems unlikely that that's the problem - but is skinning a few indexed 330-vertex 448-triangle meshes really expected to be that time-consuming? Should I be pre-calculating my skinned-mesh frames and just using a lookup table if I'm expecting to have say 50 of these guys on screen at a time?

roomforthetuna fucked around with this message at 20:03 on Apr 30, 2011

Paniolo
Oct 9, 2007

Does anyone know whether or not PIX works with stream out?

When I try to drill into the draw call that generates the stream out geometry, I just get a "CreateGeometryShaderWithStreamOut failed" error, and if I drill into the DrawAuto draw call, the mesh section says "Not a draw call."

edit: It does seem to be a lack of PIX support because when I replace DrawAuto() with Draw() it works just fine.

Paniolo fucked around with this message at 23:52 on Apr 30, 2011

Spite
Jul 27, 2001


roomforthetuna posted:

Yeah, only three meshes (or at least something on that order of magnitude) and it's taking me from my capped 60fps to 40fps. Which wouldn't be a problem except there's going to be more than a few guys running around.

Oh, yeah, sorry, it's Direct3D 9 so I don't even know what Constant Buffers are (in a DirectX context anyway). It does sound like a thing that would help!

Basically, the problem resembles when I was doing vertexbuffer->Lock() , write , Unlock() , render, with non-discardable buffers - everything would get very very slow with only like 20-30 very small buffer rewrites. So I'm guessing my bone matrices declared in the shader are acting the same way, as a bit of graphics memory that has to be locked and overwritten, and thus delays until the previous render using that memory is completed.

But in this case I have no idea how to have multiple bits of memory for that, unless having multiple instances of the same shader is the way you do this, but that doesn't seem right.

That's not really how it works in DX9. Constants don't work the same way as buffers.

Without looking at the whole code it'd be hard for me to say for sure what's going on.
What do your passes do? If you are uploading a bunch of matrices, and doing 3 passes that's going to upload the data 3 times. Maybe that's why it's so slow? Dunno.

One thing to keep in mind is that it's better to do

Bind Shader 0
Bind Vtx buffer 0
Draw
Bind Vtx buffer 1
Draw
Bind Shader 1
Bind Vtx buffer 0
Draw
Bind Vtx buffer 1
Draw

than to do
Bind Vtx Buffer 0
Bind shader 0
Draw
Bind Shader 1
Draw
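That ordering advice amounts to sorting draw calls by a composite key with the most expensive state (shader) as the primary field, so each piece of state changes as rarely as possible. A minimal sketch (the struct fields are illustrative):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct DrawCall {
    uint32_t shaderId;  // most expensive to change: primary sort key
    uint32_t bufferId;  // cheaper to change: secondary sort key
    // ... mesh ranges, constants, etc. would live here too
};

// Sort so all draws sharing a shader are adjacent, and within each
// shader, all draws sharing a vertex buffer are adjacent.
void sortForStateChanges(std::vector<DrawCall>& draws) {
    std::sort(draws.begin(), draws.end(),
              [](const DrawCall& a, const DrawCall& b) {
                  uint64_t ka = (uint64_t(a.shaderId) << 32) | a.bufferId;
                  uint64_t kb = (uint64_t(b.shaderId) << 32) | b.bufferId;
                  return ka < kb;
              });
}
```

After sorting, you only rebind a shader or buffer when the key field actually changes between consecutive draws.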

Zerf
Dec 17, 2004

I miss you, sandman

Harokey posted:

This has worked fine for normal shapes, but I'm having a bit of trouble doing it for my arbitrary "polygon" shape. Is there an algorithm to do this? Or am I maybe going about this the wrong way?

I don't know how far you want to push this, but if you want an algorithm that handles odd cases you should look up http://en.wikipedia.org/wiki/Straight_skeleton

The algorithm I used last time I implemented straight skeletons was quite difficult to get robust though, so I'd advise against using them unless it's really necessary.

roomforthetuna
Mar 22, 2005


Spite posted:

One thing to keep in mind is that it's better to do ...
Thanks for that, which might help somewhat. I'm actually only using one shader object, but selecting from amongst multiple 'techniques' - is that the same thing as binding a new shader each time?

I was uploading a new batch of matrices each time, but when I stopped doing that it didn't actually help much (maybe from 40fps to 41fps) so that wasn't the problem. The little code extract I posted is almost literally what's being added - it's not all neatly in a row like that, and the variables have meaningful names, but I walked through the code and that's all the DirectX-facing functions that are called, in that order, to bring it down from 60fps to 50fps (or 40fps if the shader is always skinning on 4 bones rather than 2/3).

'shader' is the same object in each of the 3 repetitions, the technique being set may or may not be the same one. I just googled for "SetTechnique" to see if maybe there'd be a hint about usage, and found someone saying they'd have a separate shader class for each of their objects, but that didn't seem like someone who really knew what they were doing. Is this a horrible idea?

Is there some way to render everything with technique X within one shader->Begin/shader->End block, given that they'll have different transform matrices? Would it be something like
code:
shader->BeginPass(0);
pD3DDevice->SetVertexDeclaration(the_appropriate_declaration);
pD3DDevice->SetStreamSource(vertexbuffer,offset,stride);
pD3DDevice->SetIndices(indexbuffer);
shader->SetMatrix(transformhandle,a_matrix);
shader->CommitChanges();
pD3DDevice->DrawIndexedPrimitive(D3DPT_TRIANGLELIST,0,0,numvertices,0,length);

pD3DDevice->SetVertexDeclaration(the_appropriate_declaration);
pD3DDevice->SetStreamSource(different_vertexbuffer,offset,stride);
pD3DDevice->SetIndices(different->indexbuffer);
shader->SetMatrix(transformhandle,another_matrix);
shader->CommitChanges();
pD3DDevice->DrawIndexedPrimitive(D3DPT_TRIANGLELIST,0,0,numvertices,0,length);

shader->EndPass(0);
This is not what I'm doing now - this question is really "is this what CommitChanges is for?"

Harokey
Jun 12, 2003


Zerf posted:

I don't know how far you want to push this, but if you want an algorithm that handles odd cases you should look up http://en.wikipedia.org/wiki/Straight_skeleton

The algorithm I used last time I implemented straight skeletons was quite difficult to get robust though, so I'd advise against using them unless it's really necessary.

I was hoping for something like this, but in the end I ended up cheating and just drew a line of that thickness along the polygon as well.

Spite
Jul 27, 2001


roomforthetuna posted:

Thanks for that, which might help somewhat. I'm actually only using one shader object, but selecting from amongst multiple 'techniques' - is that the same thing as binding a new shader each time?

I was uploading a new batch of matrices each time, but when I stopped doing that it didn't actually help much (maybe from 40fps to 41fps) so that wasn't the problem. The little code extract I posted is almost literally what's being added - it's not all neatly in a row like that, and the variables have meaningful names, but I walked through the code and that's all the DirectX-facing functions that are called, in that order, to bring it down from 60fps to 50fps (or 40fps if the shader is always skinning on 4 bones rather than 2/3).

'shader' is the same object in each of the 3 repetitions, the technique being set may or may not be the same one. I just googled for "SetTechnique" to see if maybe there'd be a hint about usage, and found someone saying they'd have a separate shader class for each of their objects, but that didn't seem like someone who really knew what they were doing. Is this a horrible idea?

Is there some way to render everything with technique X within one shader->Begin/shader->End block, given that they'll have different transform matrices? Would it be something like
code:
shader->BeginPass(0);
pD3DDevice->SetVertexDeclaration(the_appropriate_declaration);
pD3DDevice->SetStreamSource(vertexbuffer,offset,stride);
pD3DDevice->SetIndices(indexbuffer);
shader->SetMatrix(transformhandle,a_matrix);
shader->CommitChanges();
pD3DDevice->DrawIndexedPrimitive(D3DPT_TRIANGLELIST,0,0,numvertices,0,length);

pD3DDevice->SetVertexDeclaration(the_appropriate_declaration);
pD3DDevice->SetStreamSource(different_vertexbuffer,offset,stride);
pD3DDevice->SetIndices(different->indexbuffer);
shader->SetMatrix(transformhandle,another_matrix);
shader->CommitChanges();
pD3DDevice->DrawIndexedPrimitive(D3DPT_TRIANGLELIST,0,0,numvertices,0,length);

shader->EndPass(0);
This is not what I'm doing now - this question is really "is this what CommitChanges is for?"

I'm not super familiar with D3DX, to be honest, and I'm not sure what it does under the hood. But CommitChanges has to do with state, not data - I would guess you don't need it if you're just changing constants. Also make sure you aren't having the runtime save state for you; you should manage it yourself.

As for setting the shader, I think it depends on whether each technique is actually a different shader or not. Otherwise it may just be state. Not sure.

You should run your app through PIX and see what the actual command stream looks like.

roomforthetuna
Mar 22, 2005

I guess I should just run some experiments doing things different ways thousands of times each to see which is faster, to figure out how I'm supposed to be doing it. All the documentation and tutorials show you how to do one thing, but they don't explain the correct procedure for turning that into efficiently doing ten similar things!

Deep Dish Fuckfest
Sep 6, 2006


roomforthetuna posted:

I guess I should just run some experiments doing things different ways thousands of times each to see which is faster, to figure out how I'm supposed to be doing it. All the documentation and tutorials show you how to do one thing, but they don't explain the correct procedure for turning that into efficiently doing ten similar things!

Welcome to graphics programming! I hope you like loving around until you find what your drivers and graphics hardware like.

Though to be fair, these are pretty complex systems with a whole lot of factors interacting, so often there's not much choice but to test and profile. But I do feel like a lot of the documentation could use some more hints and rules of thumb when it comes to performance concerns.

Paniolo
Oct 9, 2007

By the way are you sure you're running a release build with the release DirectX runtime? All the profiling in the world is worthless if you're in debug mode. Frankly, I don't know how you're possibly getting such a low frame rate with such a straightforward rendering operation.

Spite
Jul 27, 2001


YeOldeButchere posted:

Welcome to graphics programming! I hope you like loving around until you find what your drivers and graphics hardware like.

Though to be fair, these are pretty complex systems with a whole lot of factors interacting, so often there's not much choice but to test and profile. But I do feel like a lot of the documentation could use some more hints and rules of thumb when it comes to performance concerns.

Working on drivers has given me a whole new perspective on it. To the point where I now think that anyone writing high perf apps should do so. It sucks at times, because there are a lot of games that aren't written well and the drivers have to end up optimizing their (bad) usage of the API. Better documentation, examples and some transparency would really help that. Along with better education.

Though every vendor is loath to describe the hints/rules/etc because it would out all the ridiculously dirty hacks and unimplemented crap that's in every runtime/driver/etc.

roomforthetuna
Mar 22, 2005


Paniolo posted:

By the way are you sure you're running a release build with the release DirectX runtime? All the profiling in the world is worthless if you're in debug mode. Frankly, I don't know how you're possibly getting such a low frame rate with such a straightforward rendering operation.
No, it's definitely the debug runtime, but obviously I'm doing something wrong, the debug version shouldn't be slowing it down that badly. It's not like the debug stuff will be adding a full fiftieth of a second per render call for stuff that's even already on the graphics card. I'm pretty sure it has to be me doing something wrong that means it's blocking, waiting for a previous graphics card thing to be done before it can submit the next one (like when I was using a non-discardable vertex buffer and doing lock-write-unlock-render-lock-write-unlock-render, that had a similarly terrible effect because it couldn't lock it again until the previous render call was done with. Plus locking meant a copy from the graphics card every time.)

I can dump out 50 lines of crap out to the debugger per frame without slowing it down even a significant fraction of that badly, so I'm pretty sure the debug runtime isn't the problem - I mean sure, it'll contribute to the slowness, but I should be able to do what I'm doing, with the debug runtime, and have plenty of FPS to spare. I'm only drawing a total of about 10000 triangles per frame, and half of those aren't even textured!

(And just to be clear, when I say debug runtime, I don't mean it's running in reference mode!)

Paniolo
Oct 9, 2007


roomforthetuna posted:

No, it's definitely the debug runtime, but obviously I'm doing something wrong, the debug version shouldn't be slowing it down that badly. It's not like the debug stuff will be adding a full fiftieth of a second per render call for stuff that's even already on the graphics card.

How do you know what the debug runtime is doing? Never, never, never try to profile performance issues in debug mode. The data you're observing is completely worthless. Not saying that switching to a release build will solve your problem, but you really cannot even begin to troubleshoot in debug mode.

roomforthetuna
Mar 22, 2005


Paniolo posted:

How do you know what the debug runtime is doing? Never, never, never try to profile performance issues in debug mode. The data you're observing is completely worthless. Not saying that switching to a release build will solve your problem, but you really cannot even begin to troubleshoot in debug mode.
But how will I even be able to tell I'm doing something wrong that's causing a problem under a release runtime, since it'll put me above the 60fps cap!

I'm not trying to diagnose a little performance bottleneck, obviously I wouldn't do that under a debug runtime, but it's fairly clear that I must be doing something seriously wrong. I can't see how using a release build would help me find out what.


Paniolo
Oct 9, 2007


roomforthetuna posted:

I must be doing something seriously wrong.

Yes, you are assuming that performance data gathered from a debug build means anything at all. That's seriously wrong.
