OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!

Paniolo posted:

bus speeds are not.
PCI-E, like AGP before it, has been doubling its throughput every iteration, so bus speeds are definitely going up. Part of the problem is that nobody even talks about polygon count any more, so it's hard to tell how that's been trending.


Hubis
May 18, 2003

Boy, I wish we had one of those doomsday machines...

OneEightHundred posted:

So, one thing I haven't been keeping up on: How important is throughput now, if at all? That is, 2003-ish, static buffers were becoming hot poo poo to reduce data throughput to the card compared to the previously-favored approach of essentially bulk-copying data from system memory to the GPU every frame.

However, it's always had drawbacks in ease of use and draw call reduction: Instancing can be used to handle duplicated models, but it can't handle models at different LOD. Accessing the data for CPU-side tasks (i.e. decals) requires storing it twice. Some things are practical to render from static data in some scenarios, but not others (i.e. skeletal animation can be easily done with vertex shaders, but large numbers of morph targets run out of attribute streams on older hardware). Some things could in theory be done statically, but would require stupid amounts of storage (i.e. undergrowth).

Is the performance hit of using only dynamic buffers and just bulk-copying data even noticeable any more, or is the bottleneck pretty much all draw calls and shaders now?

Bandwidth matters a little, but with constantly updating dynamic buffers, API overhead (in DirectX) can matter more if you're not strongly GPU-bound. One good trick is to issue all your buffer updates (and other state, for that matter) in a single large block -- this lets the driver's internal tracking/ref-count data stay in caches, and you'd be surprised how much it can improve performance if you are API-bound. Also, make sure you are mapping your buffers as "DISCARD/write-only" so the API knows it can stage the write and doesn't have to wait for any calls in the pipe accessing that buffer resource.

In general though your instinct is correct. Static buffers are definitely still preferred, but you're not going to notice the downstream PCI-E bandwidth as much.

speng31b
May 8, 2010

So I'm trying to render the geometry in my VBOs with as few draw calls as possible. Most of the geometry shares texture coordinates but uses different textures. So I was thinking I could just sew all the textures together into one big texture and alter the texture coordinates accordingly so that I wouldn't have to bind new textures very often.

The problem I'm seeing is that generating mipmaps for one big texture causes artifacts when parts of the texture are used for different pieces of geometry. Is there any way to get around that?

Would it be better to just suck it up and bind a lot of textures?

speng31b fucked around with this message at 04:52 on Apr 2, 2011

Deep Dish Fuckfest
Sep 6, 2006

Advanced
Computer Touching


Toilet Rascal
Quick DirectX 11 question: what am I supposed to use to render basic text? DX10 had the awfully convenient ID3DX10Font interface with associated functions that made things really easy, but that seems to be gone in DX11.

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!
Rolling your own text rendering with FreeType is pretty easy.

UraniumAnchor
May 21, 2006

Not a walrus.
If you want to be ultra lazy you can try SDL_ttf and use the resulting surface as a texture, but I have no idea what kind of speed you'd get out of that.

Unormal
Nov 16, 2004

Mod sass? This evening?! But the cakes aren't ready! THE CAKES!
Fun Shoe

octoroon posted:

So I'm trying to render the geometry in my VBOs with as few draw calls as possible. Most of the geometry shares texture coordinates but uses different textures. So I was thinking I could just sew all the textures together into one big texture and alter the texture coordinates accordingly so that I wouldn't have to bind new textures very often.

The problem I'm seeing is that generating mipmaps for one big texture causes artifacts when parts of the texture are used for different pieces of geometry. Is there any way to get around that?

Would it be better to just suck it up and bind a lot of textures?

The simplest way is to put "padding" around each of the individual textures. In some cases I'll extend the outermost pixels of each texture a little so the mipmap gathers them. It's very inelegant but it works. I'm pretty sure you can also manually generate the mip map levels, but I haven't actually tried it yet.
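A rough sketch of that edge-extension idea (plain Python; the function name and the tile-as-2D-list representation are mine, just for illustration):

```python
def extend_edges(tile, pad):
    """Pad a tile by replicating its outermost pixels 'pad' times on
    every side, so mipmap averaging near the tile boundary picks up
    plausible colors instead of a neighboring sub-texture's."""
    # Extend each row horizontally first...
    rows = [[row[0]] * pad + row + [row[-1]] * pad for row in tile]
    # ...then replicate the top and bottom rows vertically.
    return ([rows[0][:] for _ in range(pad)]
            + rows
            + [rows[-1][:] for _ in range(pad)])
```

Applied to a 2x2 tile with pad=1, this yields a 4x4 tile whose new border just repeats the nearest original pixel.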

speng31b
May 8, 2010

Unormal posted:

The simplest way is to put "padding" around each of the individual textures. In some cases I'll extend the outermost pixels of each texture a little so the mipmap gathers them. It's very inelegant but it works. I'm pretty sure you can also manually generate the mip map levels, but I haven't actually tried it yet.

Yeah, I just figured this out actually. What I did is put some fully transparent (0,0,0,0) pixels around the edges of each texture. This seems to be working so far.

Deep Dish Fuckfest
Sep 6, 2006

Advanced
Computer Touching


Toilet Rascal

OneEightHundred posted:

Rolling your own text rendering with FreeType is pretty easy.

UraniumAnchor posted:

If you want to be ultra lazy you can try SDL_ttf and use the resulting surface as a texture, but I have no idea what kind of speed you'd get out of that.

I'll look into those, but it seems like I was pretty tired last night since I missed the section about "DirectWrite" on MSDN. Seeing as it requires Windows 7 or an updated version of Vista, I'm guessing it's not some deprecated API no one uses anymore.

roomforthetuna
Mar 22, 2005

I don't need to know anything about virii! My CUSTOM PROGRAM keeps me protected! It's not like they'll try to come in through the Internet or something!
Another thing that might help with the mipmap problem of a texture atlas would be to have sub-textures (or at least boundaries) that are reasonably high powers of 2 in size - if your boundaries are on the 16 pixel line then at least the 1/2, 1/4, 1/8 and 1/16 scaled versions won't have any bleed from the next subtexture over. (And at the 1/32 level I doubt it really matters that much.)

I have not tried this so I could be wrong.
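The arithmetic behind this, as a sketch (Python; the function name is mine): each halving of the atlas keeps sub-texture boundaries on whole texel lines only as long as the boundary offset stays divisible by two, so a 16-pixel alignment survives exactly four halvings.

```python
def clean_mip_levels(alignment):
    """How many successive half-resolution mip levels keep sub-texture
    boundaries on exact texel lines, assuming every boundary in the
    atlas sits at a multiple of 'alignment' base-level pixels."""
    levels = 0
    while alignment % 2 == 0:
        alignment //= 2  # boundary still lands on a whole texel
        levels += 1
    return levels
```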

speng31b
May 8, 2010

roomforthetuna posted:

Another thing that might help with the mipmap problem of a texture atlas would be to have sub-textures (or at least boundaries) that are reasonably high powers of 2 in size - if your boundaries are on the 16 pixel line then at least the 1/2, 1/4, 1/8 and 1/16 scaled versions won't have any bleed from the next subtexture over. (And at the 1/32 level I doubt it really matters that much.)

I have not tried this so I could be wrong.

So far just having a few pixels of transparency around the edges seems to be doing the trick. Someone on the OpenGL forums suggested it, and I'm not going to think too hard about it unless it breaks again. Having large power-of-two boundaries would increase the texture sizes pretty substantially, unless I'm misunderstanding.

speng31b fucked around with this message at 20:37 on Apr 2, 2011

roomforthetuna
Mar 22, 2005

I don't need to know anything about virii! My CUSTOM PROGRAM keeps me protected! It's not like they'll try to come in through the Internet or something!

octoroon posted:

So far just having a few pixels of transparency around the edges seems to be doing the trick. Someone on the OpenGL forums suggested it, and I'm not going to think too hard about it unless it breaks again. Having large power-of-two boundaries would increase the texture sizes pretty substantially, unless I'm misunderstanding.
Well, by large I just mean like at least 8 - 2 is a power of 2 and wouldn't do you much good. And I don't mean scale the textures, I just mean pad them out to an 8- or 16-pixel threshold with transparent pixels, much like you are doing only with a deliberate threshold that caters to the construction of mipmaps rather than an arbitrary one.

If I'm right about how it'd work then your few pixels of padding will be wasted when the boundary between textures is already on a suitable threshold, and wouldn't entirely fix the 1/4 or 1/8th scale mipmap when it's in the worst possible off-threshold position. Again, not that you probably care that much about it once it's down to the deeper mipmap levels. If you've already got it working to your satisfaction you might as well stick with what you did!

speng31b
May 8, 2010

roomforthetuna posted:

Well, by large I just mean like at least 8 - 2 is a power of 2 and wouldn't do you much good. And I don't mean scale the textures, I just mean pad them out to an 8- or 16-pixel threshold with transparent pixels, much like you are doing only with a deliberate threshold that caters to the construction of mipmaps rather than an arbitrary one.

If I'm right about how it'd work then your few pixels of padding will be wasted when the boundary between textures is already on a suitable threshold, and wouldn't entirely fix the 1/4 or 1/8th scale mipmap when it's in the worst possible off-threshold position. Again, not that you probably care that much about it once it's down to the deeper mipmap levels. If you've already got it working to your satisfaction you might as well stick with what you did!

The thing that's bothering me is power-of-two consistency. When I make a texture atlas, should the entire atlas be a power of two? Or just the subtextures? Or both?

edit:

Just to be clear, you're talking about something like this, right? (where the black represents transparency padding)

speng31b fucked around with this message at 22:45 on Apr 2, 2011

Paniolo
Oct 9, 2007

Heads will roll.

YeOldeButchere posted:

I'll look into those, but it seems like I was pretty tired last night since I missed the section about "DirectWrite" on MSDN. Seeing as it requires Windows 7 or an updated version of Vista, I'm guessing it's not some deprecated API no one uses anymore.

Funny thing about DirectWrite is that it doesn't actually work with DirectX 11. That's not to say it's not possible to use it in a project that uses DX11, but the workaround involves creating a DX10 device and then a texture resource that's shared between your DX10 and DX11 devices.

Between that and the fact that DX11's effect framework isn't a first class part of the API anymore (you need to build it from source yourself) DX11 still feels like it's in beta.

Unormal
Nov 16, 2004

Mod sass? This evening?! But the cakes aren't ready! THE CAKES!
Fun Shoe

roomforthetuna posted:

Another thing that might help with the mipmap problem of a texture atlas would be to have sub-textures (or at least boundaries) that are reasonably high powers of 2 in size - if your boundaries are on the 16 pixel line then at least the 1/2, 1/4, 1/8 and 1/16 scaled versions won't have any bleed from the next subtexture over. (And at the 1/32 level I doubt it really matters that much.)

I have not tried this so I could be wrong.

Even with 128x128 or 256x256 square textures, I get mipmap artifacts if the textures are directly adjacent in an atlas with the automatic opengl mipmap generation on an nvidia card.

pseudorandom name
May 6, 2007

Paniolo posted:

Funny thing about DirectWrite is that it doesn't actually work with DirectX 11. That's not to say it's not possible to use it in a project that uses DX11, but the workaround involves creating a DX10 device and then a texture resource that's shared between your DX10 and DX11 devices.

Between that and the fact that DX11's effect framework isn't a first class part of the API anymore (you need to build it from source yourself) DX11 still feels like it's in beta.

I don't know a drat thing about any of this, but can't you QueryInterface on a ID3D11Texture2D to get a IDXGISurface, give that DXGI surface to ID2D1Factory::CreateDxgiSurfaceRenderTarget to get a ID2D1RenderTarget, and then use ID2D1RenderTarget::DrawTextLayout or ID2D1RenderTarget::DrawText?

pseudorandom name fucked around with this message at 23:20 on Apr 2, 2011

speng31b
May 8, 2010

Unormal posted:

Even with 128x128 or 256x256 square textures, I get mipmap artifacts if the textures are directly adjacent in an atlas with the automatic opengl mipmap generation on an nvidia card.

With GL_LINEAR I get mipmap artifacts even with the textures separated by 8 transparent pixels of padding. However, that seems to have something to do with the mipmap generator interpreting the transparent pixels as black rather than transparent... maybe I need to take your advice and extend the border pixels instead.

Paniolo
Oct 9, 2007

Heads will roll.

pseudorandom name posted:

I don't know a drat thing about any of this, but can't you QueryInterface on a ID3D11Texture2D to get a IDXGISurface, give that DXGI surface to ID2D1Factory::CreateDxgiSurfaceRenderTarget to get a ID2D1RenderTarget, and then use ID2D1RenderTarget::DrawTextLayout or ID2D1RenderTarget::DrawText?

CreateDxgiSurfaceRenderTarget will fail if you try this.

pseudorandom name
May 6, 2007

Paniolo posted:

CreateDxgiSurfaceRenderTarget will fail if you try this.

Even if you pass D3D11_CREATE_DEVICE_BGRA_SUPPORT to D3D11CreateDevice?

Paniolo
Oct 9, 2007

Heads will roll.

pseudorandom name posted:

Even if you pass D3D11_CREATE_DEVICE_BGRA_SUPPORT to D3D11CreateDevice?

Yeah that isn't why the two don't work together. The incompatibility is well documented.

Deep Dish Fuckfest
Sep 6, 2006

Advanced
Computer Touching


Toilet Rascal
Wait, so Microsoft removed the ID3DX10Font interface from D3D11, presumably because they expected people to use DirectWrite instead, then they added

Microsoft posted:

Developers who use Direct3D graphics and need simple, high-performance 2-D and text rendering for menus, user-interface (UI) elements, and Heads-up Displays (HUDs).

to the "Who is Direct2D aimed at" page, but they forgot to actually make DirectWrite+D2D compatible with D3D11?

Goddamnit.

I kind of understand the "DX11 feels like it's in beta" comment now. I'm not particularly heartbroken about the effect framework now being provided as optional source code since I prefer dealing with shaders and constant buffers and all that manually, but this is kind of bad.

pseudorandom name
May 6, 2007

YeOldeButchere posted:

Wait, so Microsoft removed the ID3DX10Font interface from D3D11, presumably because they expected people to use DirectWrite instead, then they added


to the "Who is Direct2D aimed at" page, but they forgot to actually make DirectWrite+D2D compatible with D3D11?

And then they didn't document this anywhere.

Makes me wonder what kind of clusterfuck the implementation is if something so trivially obvious doesn't actually work.

Relaxodon
Oct 2, 2010
I am going mad here over a problem I have been brooding over for days. You guys seem to be pretty experienced, so maybe you have an idea what I am getting wrong here.

I am currently trying to replace the vertex shader with a CUDA program so I can do some camera-space calculation with it. So I have written a CUDA function that lets me multiply an array of vectors by a matrix and store the result in a VBO. The idea is that I feed this function my vertex buffer and the MVP matrix and get a pre-transformed VBO as a result. In the vertex shader I simply do this:
code:
gl_Position = in_position;
Alas, that doesn't work. I get garbled-up geometry that rotates around the origin. Now, before you chew me out that my CUDA function is probably garbage, here is what I have verified so far:

1.) The projection and modelview matrices are fine. I fed them into a normal shader and I get the expected result (a rotating cube). Here is the shader:
code:
gl_Position = projection_matrix * modelview_matrix * in_position;
2.) The CUDA function performs a correct matrix * vector multiplication. I checked this with pen and paper, as well as by comparing the results with some applet I found.

3.) The VBO is correctly written to. I output its contents after the CUDA function and it is as expected.

Another weird thing is that when I multiply my vertex buffer by only the modelview matrix and leave the projection matrix for the shader, it works!
code:
VBO = Vertices * Modelview Matrix   <--- CUDA
gl_Position = projection_matrix * in_position;    <---- Vertex Shader
As soon as I do the multiplication by the projection matrix outside the shader, things go wrong.
Since my buffers and shader variables are all generic, the pipeline cannot know what these matrices represent.
Does the pipeline do anything other than normal matrix multiplication? Does it do anything to the VBO or the matrices that I do not do outside of it? I am at a loss here.

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!

Unormal posted:

Even with 128x128 or 256x256 square textures, I get mipmap artifacts if the textures are directly adjacent in an atlas with the automatic opengl mipmap generation on an nvidia card.
A huge thing to keep in mind is where the pixel centers are. (0,0) is the point between all 4 corner pixels, and ((1/width)/2, (1/height)/2) is dead center on the corner pixel -- but that varies by LOD! The coordinate that lands dead on that pixel at the first mipmap LOD is different, because the width and height are both halved.

You need twice as much padding per mipmap level to prevent spillover artifacts, but one thing you can do to reduce the problem is set a minimum LOD. That means mipmaps will never reduce below a certain amount, and that's usually okay: Aliasing from texture minification is a lot less noticeable on things with high geometric detail.
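The texel-center arithmetic, sketched in Python (names are mine): the center of texel i sits half a texel in from its edge, and since each mip level halves the dimension, that half-texel offset doubles in UV space at every level.

```python
def texel_center(i, base_size, lod=0):
    """UV coordinate of the center of texel 'i' at mip level 'lod' for a
    texture whose base dimension is 'base_size' texels. UV 0.0 is the
    outer edge of texel 0, not its center."""
    size = base_size >> lod  # each mip level halves the dimension
    return (i + 0.5) / size
```

So the UV that's dead-center on the corner texel at LOD 0 is no longer centered one level down.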

quote:

maybe I need to take your advice and extend the border pixels instead.
Padding should always be something like the contents of the closest valid pixels, or an average of several of them, or something like that. It should never be an invalid value; that defeats the purpose of padding. :)

For semi-transparent stuff, always use premultiplied alpha! I can't stress this enough, premultiplied alpha is EXTREMELY useful because it lets you do additive effects (i.e. flares: zero alpha, additive RGB), blended alpha effects, and monochrome multiplicative effects (i.e. dark smoke) with the same blend type, allowing you to single-pass a lot more poo poo, throw more stuff into texture atlases, and it's not as prone to minification artifacts.

If you're not using premultiplied alpha, then you need to write a custom filter so that the mipmapped RGB channel actually considers translucency properly. That is, the average color of 4 pixels should not be (p1+p2+p3+p4)/4, but rather, (p1*a1+p2*a2+p3*a3+p4*a4)/(a1+a2+a3+a4)
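That filter is easy to sketch in plain Python (the function name is mine; texels are (r, g, b, a) tuples, and averaging the alpha channel normally is my assumption -- the formula above only pins down the color channels):

```python
def alpha_weighted_average(texels):
    """Downsample a group of non-premultiplied (r, g, b, a) texels the
    way a mipmap filter should: color is weighted by alpha, so fully
    transparent texels can't drag the result toward their (meaningless)
    RGB values."""
    total_a = sum(t[3] for t in texels)
    if total_a == 0:
        return (0.0, 0.0, 0.0, 0.0)  # nothing visible; color is moot
    r = sum(t[0] * t[3] for t in texels) / total_a
    g = sum(t[1] * t[3] for t in texels) / total_a
    b = sum(t[2] * t[3] for t in texels) / total_a
    return (r, g, b, total_a / len(texels))
```

With two opaque white texels and two fully transparent ones, a naive average would produce 50% gray; this keeps the color white and only reduces alpha.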

OneEightHundred fucked around with this message at 19:05 on Apr 3, 2011

Deep Dish Fuckfest
Sep 6, 2006

Advanced
Computer Touching


Toilet Rascal

pseudorandom name posted:

And then they didn't document this anywhere.

Makes me wonder what kind of clusterfuck the implementation is if something so trivially obvious doesn't actually work.

Well, to be fair, they did document it: http://msdn.microsoft.com/en-us/library/ee913554(VS.85).aspx#api_interoperability_overview.

It's just that you'd expect a path between D3D11 and D2D that doesn't go through D3D10.1.

EDIT: Nevermind, found the problem.

Deep Dish Fuckfest fucked around with this message at 18:46 on Apr 4, 2011

Madox
Oct 25, 2004
Recedite, plebes!

YeOldeButchere posted:

Quick DirectX 11 question: what am I supposed to use to render basic text? DX10 had the awfully convenient ID3DX10Font interface with associated functions that made things really easy, but that seems to be gone in DX11.


There is a tool out there called ttf2bmp that I used for drawing text. I'm still using DX9 targeting Xbox, and it was pretty easy to expand my sprite drawing code to use the font textures.

http://www.softpedia.com/get/Others/Font-Utils/Bitmap-Font-Maker-Plus.shtml

speng31b
May 8, 2010

OneEightHundred posted:

Padding should always be something like the contents of the closest valid pixels, or an average of several of them, or something like that. It should never be an invalid value, that defeats the purpose of padding. :)

For semi-transparent stuff, always use premultiplied alpha! I can't stress this enough, premultiplied alpha is EXTREMELY useful because it lets you do additive effects (i.e. flares: zero alpha, additive RGB), blended alpha effects, and monochrome multiplicative effects (i.e. dark smoke) with the same blend type, allowing you to single-pass a lot more poo poo, throw more stuff into texture atlases, and it's not as prone to minification artifacts.

If you're not using premultiplied alpha, then you need to write a custom filter so that the mipmapped RGB channel actually considers translucency properly. That is, the average color of 4 pixels should not be (p1+p2+p3+p4)/4, but rather, (p1*a1+p2*a2+p3*a3+p4*a4)/(a1+a2+a3+a4)

Thanks! This really helped, I've got my atlases working rather nicely with mipmapping now.

Unormal
Nov 16, 2004

Mod sass? This evening?! But the cakes aren't ready! THE CAKES!
Fun Shoe

octoroon posted:

Thanks! This really helped, I've got my atlases working rather nicely with mipmapping now.

What'd you end up doing?

speng31b
May 8, 2010

Unormal posted:

What'd you end up doing?

Wrote some code to load an image, add padding based on averaged border-pixel values, and then sew a bunch of these images together into an atlas rounded up to the nearest power of two. No mipmap artifacts at any distance that I can see -- if they are there, they aren't visible.
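The "nearest power of two" rounding in a pipeline like that is a one-liner's worth of arithmetic (Python sketch; the name is mine):

```python
def next_pow2(n):
    """Smallest power of two >= n: the padded atlas dimension for a
    packed width or height of n pixels."""
    p = 1
    while p < n:
        p <<= 1  # double until we cover n
    return p
```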

BanditCat
Apr 27, 2005
This is an OpenGL ES 2.0 shader for some cool recursive fractalish thing (in WebGL). It correctly raytraces and gives the right normals, but the depth value isn't sorting properly, or is getting corrupted somehow.

I've spent too long on this, so see if it jumps out at you. It seems like it should be a simple variable-getting-incorrectly-overwritten type of bug. This is in Firefox on an AMD Radeon 5870, if you think it matters.


Figured it out: my sphere positions were being calculated with world-space instead of camera-space vectors.

BanditCat fucked around with this message at 23:38 on Apr 7, 2011

roomforthetuna
Mar 22, 2005

I don't need to know anything about virii! My CUSTOM PROGRAM keeps me protected! It's not like they'll try to come in through the Internet or something!
I have a bunch of meshes that use several different formats, and I'm changing from my outdated FVF ways to DX9 shaders. I've gathered I can easily handle the different vertex formats with the D3DVERTEXELEMENT array declarations, and I've done that, but I'm not clear on how to handle "there are a mess of different formats" at the shader level. The only way I've found so far is to make a shader function with a bunch of "uniform" flags passed in, in addition to the vertex data, and section off pieces of the code based on the flags so that, e.g., the texture coordinates from the vertex data don't get used if the 'texturecount' value is 0, then doing, e.g.
code:
technique RenderScene1Light1Texture0Diffuse
{
    pass P0
    {          
        VertexShader = compile vs_2_0 RenderSceneVS( 1, true, false, false );
        PixelShader  = compile ps_2_0 RenderScenePS( true );
    }
}
In my .fx file. But obviously this way gets ridiculous if I want to accommodate any format with 0-4 bones, 0-4 lights, 0-4 textures, 0-2 colors - that's a lot of combinations. Is there some reasonable way to produce this technique-compiling programmatically rather than including it manually in the effect file? And ideally not by appending it to a string either - I'd like to do it "on-demand", so when I first encounter a mesh with a previously unused profile I'd compile the shader with appropriate parameters at that time, preferably resulting in a handle the same as I get from "GetTechniqueByName" now.

Alternatively can I make a shader which, for example, uses default bone weights if the vertices don't contain bone weights?

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!

roomforthetuna posted:

Alternatively can I make a shader which, for example, uses default bone weights if the vertices don't contain bone weights?
If you want a constant value for an input stream, the way you do it is dump that constant somewhere in a vertex buffer, then SetStreamSource on it with the stride set to 0.

roomforthetuna
Mar 22, 2005

I don't need to know anything about virii! My CUSTOM PROGRAM keeps me protected! It's not like they'll try to come in through the Internet or something!

OneEightHundred posted:

If you want a constant value for an input stream, the way you do it is dump that constant somewhere in a vertex buffer, then SetStreamSource on it with the stride set to 0.
Ah, that's a neat solution if that comes up in future.

I've now got my thing working using a "do bones at all" uniform flag (so when there are no bones it doesn't go into weighting code at all, thus not minding the absence of BLENDINDICES or BLENDWEIGHT data).

Fun discovery - a loop like
code:
		for (int i=0; i<g_nBoneCount; i++) {
			blendPos += mul(vPos,g_mBone[BlendIndices[i]]) * BlendWeights[i];
		}
Is incredibly, incredibly slow -- it quartered my framerate even just rendering one very simple mesh. Just treating it like there are always 4 bones even when there aren't 4 weights is much faster, though I'm not entirely sure why it works -- do blend weights not present in the vertex get set to zero? Is this reliable across systems?

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!
Avoid variable-iteration loops at all costs: the compiler can't unroll them, so you get a branch every iteration, and branches are VERY expensive.

UraniumAnchor
May 21, 2006

Not a walrus.
Isn't that only true if the branches diverge or am I getting it mixed up with CUDA? (Or is it not true there either?)

UraniumAnchor fucked around with this message at 18:21 on Apr 7, 2011

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!

UraniumAnchor posted:

Isn't that only true if the branches diverge or am I getting it mixed up with CUDA? (Or is it not true there either?)
There's a cost associated with evaluating the branch at all; it's about 3x as expensive as a texture lookup, if I remember right. Contingent sets are cheap, but it's not possible to optimize a loop into contingent sets if the iteration count is unknown.

roomforthetuna
Mar 22, 2005

I don't need to know anything about virii! My CUSTOM PROGRAM keeps me protected! It's not like they'll try to come in through the Internet or something!
Yeah, I figured that was why, which is why I unlooped it before I posted. :)

What about much smaller branches? I just tried something like "if (color.r==1.0f) color.g=1.0f;" and that didn't seem to cause a performance issue.

That brings me to a couple more questions - if I want to compare a color to an exact value, how would I go about doing that?

I have a float4 "diffuse" from the vertex - if I want to check that for being, say, exactly RGB 127,127,127 (and ignore the alpha, or not ignore it, either way) how is that done A. now that it's become all float-y, and B. preferably without doing something hideous?

UraniumAnchor
May 21, 2006

Not a walrus.
So if you have a moderately complex shader that could do one of two things depending on a boolean flag, would it be better to just have two versions of the shader and switch between the two, rather than having a boolean uniform? I'm still not clear on how much of a stall you might get in the pipeline by switching shader stuff around. I read somewhere that the shader actually gets recompiled on some hardware when you modify a uniform.

haveblue
Aug 15, 2005



Toilet Rascal

UraniumAnchor posted:

So if you have a moderately complex shader that could do one of two things depending on a boolean flag, would it be better to just have two versions of the shader and switch between the two, rather than having a boolean uniform? I'm still not clear on how much of a stall you might get in the pipeline by switching shader stuff around. I read somewhere that the shader actually gets recompiled on some hardware when you modify a uniform.

It's usually cheaper to do both calculations and throw one of them away rather than make real mutually exclusive code paths.

result = (formula1)*which + (formula2)*(1.0-which) where which is set to 0 or 1.
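The same arithmetic as a runnable sketch (Python; names are mine):

```python
def branchless_select(formula1, formula2, which):
    """Shader-style select: evaluate both results and blend with a
    0-or-1 'which' flag instead of taking a real branch."""
    return formula1 * which + formula2 * (1.0 - which)
```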


UraniumAnchor
May 21, 2006

Not a walrus.

haveblue posted:

It's usually cheaper to do both calculations and throw one of them away rather than make real mutually exclusive code paths.

result = (formula1)*which + (formula2)*(1.0-which) where which is set to 0 or 1.

The specific shader I had in mind was a good deal more complex than that, unfortunately, so that wouldn't really work. It performed an entirely different calculation if the flag was set, both of which couldn't really be squashed into a single line, at least not without being nearly unreadable. Sounds like in that case I'd be better off splitting it into two different shaders.

UraniumAnchor fucked around with this message at 18:53 on Apr 7, 2011
