|
Paniolo posted:Yes, you are assuming that performance data gathered from a debug build means anything at all. That's seriously wrong.
|
# ? May 3, 2011 05:29 |
|
How about you simply compile it for release and profile it there?
|
# ? May 3, 2011 05:30 |
|
roomforthetuna posted:Can someone else please confirm or deny this, because to me it seems a ridiculous premise that a debug build will have a performance problem where a release build will have no performance problem (not ridiculous that it will perform slower, that's a given, but that it will specifically perform badly, below a reasonable expectation for a given operation.) No. e: Also, if it's faster than 60 FPS but you want to see relative performance, why not take the cap off? wellwhoopdedooo fucked around with this message at 14:38 on May 3, 2011 |
# ? May 3, 2011 14:35 |
|
Trying to enable multisampling in OpenGL and I'm missing something. I am using glew elsewhere, but I know that all this stuff has to be initialised before glew, so there's been no glewInit() yet. This is the (I think) minimal failing code in my window setup routine. code:
Edit: Answer - it's NOT crashing there, the debugger just can't cope with functions defined that way and skips a bit, crashing in a perfectly sensible place later. Carry on. HauntedRobot fucked around with this message at 16:11 on May 3, 2011 |
# ? May 3, 2011 15:46 |
|
roomforthetuna posted:Can someone else please confirm or deny this, because to me it seems a ridiculous premise that a debug build will have a performance problem where a release build will have no performance problem (not ridiculous that it will perform slower, that's a given, but that it will specifically perform badly, below a reasonable expectation for a given operation.) If the compiler optimization setting is different between debug and release then performance could indeed be radically different in certain areas. Also, running in debug mode may be enabling extra logging, doing more bounds/sanity checks, skipping optimizing data transformations, and so on in the system libraries, especially inside 3D graphics drivers. "Badly" is subjective but you could gain 20-30% performance just by switching from debug to release, depending on what you are doing.
|
# ? May 3, 2011 16:12 |
|
It's not entirely graphics related, but if you want an idea of just how much difference debug/release builds can make, here's an example. Some time ago I was doing some physics stuff and I looked around for a linear algebra library because gently caress implementing that myself. I stumbled on a library (Eigen) that makes heavy use of templates and promises both solid performance and ease of use. I shoved it into my application, made a really simple test with a deformable cube, and compiled with a debug profile. That gave me about 10 FPS. Obviously that wasn't acceptable, so I messed around for a couple hours until I was ready to give up and decided to try a release build because, hey, I might get all the way up to 12 FPS. I ended up with ~2000 FPS. That's an extreme case, but it can happen.
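For anyone wondering what actually differs between the two profiles, it mostly comes down to compiler flags. A sketch of typical GCC/Clang invocations (the file name is made up; exact flags vary by toolchain):

```shell
# Debug: no optimization, assertions enabled. Eigen's expression templates
# never get inlined, so every vector op goes through layers of function calls.
g++ -O0 -g -o physics_debug main.cpp

# Release: optimization on; -DNDEBUG disables assert() and Eigen's runtime
# checks, letting the compiler collapse the templates into straight code.
g++ -O2 -DNDEBUG -o physics_release main.cpp
```

MSVC's Debug/Release configurations flip the equivalent switches (/Od vs /O2, plus checked iterators) behind the scenes.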
|
# ? May 3, 2011 17:55 |
|
YeOldeButchere posted:That gave me about 10 FPS. Obviously that wasn't acceptable, so I messed around for a couple hours until I was ready to give up and decided to try a release build because, hey, I might get all the way up to 12 FPS. I ended up with ~2000 FPS. If it used STL that's not too surprising - checked iterators can add a pretty enormous performance penalty. The main point here is that a debug build isn't 30% slower because every function is exactly 30% slower - it's slower because it introduces bottlenecks that don't occur in release builds. It's not uncommon to profile a debug build and see most of the time spent in a function which would be inlined out in a release build. With DirectX the debug runtime does a ton of validation that the release runtime doesn't do. That can create bottlenecks in functions which would otherwise be a simple memcpy. Since you don't have access to the DirectX source code, if you profile an operation as taking a long time in debug mode, you really have no way of knowing if it's a genuine performance issue, or slowdown caused by the debug layer. Hence, the performance data you're collecting in debug mode is worthless. Just to be as clear as possible, the point isn't "switch to release mode and your problem will go away", it's "you cannot effectively troubleshoot your problem, or even be sure you have one, in a debug build."
|
# ? May 3, 2011 18:32 |
|
YeOldeButchere posted:It's not entirely graphics related, Anyway, now I feel really stupid because I found what I was doing wrong - I may not have been using a reference renderer, but I was using software vertex processing! etc. On the bright side I suppose, accidentally using software vertex processing did get me to make my vertex shader a little more efficient! Sigh. Always remember to undo a little debugging test change! My graphics library had been initializing with software vertex processing for probably the last 2 months or so.
|
# ? May 3, 2011 18:38 |
|
Paniolo posted:Just to be as clear as possible, the point isn't "switch to release mode and your problem will go away", it's "you cannot effectively troubleshoot your problem, or even be sure you have one, in a debug build." Yeah, I know that, I was just giving an extreme example of how debug builds can do weird things performance-wise in general. And no, it wasn't due to the STL. It's because the library is built with layers upon layers of templates which allow pretty complex linear algebra expressions that get broken up into inlined vectorized code by the compiler if optimizations are on. I was honestly impressed by what C++ templates could do.
|
# ? May 3, 2011 19:04 |
|
roomforthetuna posted:Can someone else please confirm or deny this, because to me it seems a ridiculous premise that a debug build will have a performance problem where a release build will have no performance problem (not ridiculous that it will perform slower, that's a given, but that it will specifically perform badly, below a reasonable expectation for a given operation.) I can confirm from my own experience that when you are using DirectX in debug mode (set in the DirectX control panel) it does a lot more sanity checks and bad state detection than in release mode. Release mode assumes you are doing everything right and fires it all through with almost no checks. Also, it's been a few years since I did C++ (all XNA now) but I don't think I called pD3DDevice->SetVertexDeclaration(), pD3DDevice->SetStreamSource(), or pD3DDevice->SetIndices() from inside a BeginPass/EndPass. I'll dig up some old code to see what I did. Edit: I am a moron, I made my calls the exact same way you have it posted. Madox fucked around with this message at 18:24 on May 6, 2011 |
# ? May 6, 2011 18:20 |
|
I don't really know where to post this but I think it fits here. I'm having trouble wrapping my head around a problem and I think at this point I've spent too long staring at it to see a simple solution. I have a 3d polygon displayed with an orthographic projection. For an arbitrary rotation, I want to be able to "snap" the vertices to the nearest point on a 2d grid overlaid on the orthographic projection. Right now what I do is the following: I use gluProject on three axis aligned unit vectors, then subtract a projected zero vector from each to get x_part, y_part, and z_part- each 3d axis's effect on 2d translation in the projection. Then I take the minimum nonzero component of each to be x_width, y_width, and z_width, and use (2D_GRID_SIZE / foo_width) as the width of a grid on foo's respective axis. Any vertex snapped to this 3d grid will align with the 2d grid on the scene's projection, and I can do this with some simple rounding magic. First of all, this solution is not very general and makes a million assumptions, the ugliest of which is that everything breaks if the 2d grid has a different size on each axis. Second of all, everything breaks if the origin is not on a gridpoint. I arrived here by trial and error and I can't really justify anything that I've done so far. Any ideas? Can anyone give me an idea where to start in building a general solution to this problem? I feel like there has to be an obvious solution that I just completely missed. edit: think I got it... Post following shortly a slime fucked around with this message at 14:53 on May 10, 2011 |
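The general fix being circled here is to do the snapping in screen space instead of inventing a 3D grid: project the vertex, round its screen x/y to the 2D grid, and unproject it back. A minimal sketch under the assumption of a pure-rotation orthographic view with uniform scale (all names are made up; in real code gluProject/gluUnProject play the role of mul/transposed):

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;

// Row-major 3x3 matrix times vector.
static Vec3 mul(const Mat3& m, const Vec3& v) {
    Vec3 r{};
    for (int i = 0; i < 3; ++i)
        r[i] = m[i][0] * v[0] + m[i][1] * v[1] + m[i][2] * v[2];
    return r;
}

// For a pure rotation matrix, the transpose is the inverse.
static Mat3 transposed(const Mat3& m) {
    Mat3 t{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            t[i][j] = m[j][i];
    return t;
}

// Snap a vertex so its orthographic projection lands on the 2D screen grid:
// rotate into view space, round the screen axes (x, y) to the grid, leave
// depth (z) alone, rotate back out. 'scale' is screen units per world unit,
// 'grid' is the 2D grid spacing in screen units.
Vec3 snapToScreenGrid(const Vec3& v, const Mat3& viewRot,
                      double scale, double grid) {
    Vec3 s = mul(viewRot, v);
    s[0] = std::round(s[0] * scale / grid) * grid / scale;
    s[1] = std::round(s[1] * scale / grid) * grid / scale;
    return mul(transposed(viewRot), s);
}
```

Because the rounding happens in projected coordinates, it works for any rotation and doesn't care where the origin sits relative to the grid.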
# ? May 10, 2011 11:40 |
|
Edit: Once again, stupid problem that had nothing to do with the code I posted.
HauntedRobot fucked around with this message at 14:37 on May 16, 2011 |
# ? May 16, 2011 12:18 |
|
I've got an odd problem where the bottom 1/10th of the screen or so seems to lag behind the rest of the scene when the camera rotates. It doesn't show up in screen shots, but looks like this simulated pic in the live program if the camera was rotating counter-clockwise: Any ideas? This is DX9 via XNA if that matters. Everything looks fine the second the camera is stopped.
|
# ? Jun 6, 2011 00:50 |
|
That is "tearing" and it happens because your screen updates are out of sync with the monitor displaying the new image. http://msdn.microsoft.com/en-us/library/bb174576(VS.85).aspx
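The usual fix is to present in sync with the vertical retrace. In raw D3D9 that's the presentation interval; a sketch showing only the relevant D3DPRESENT_PARAMETERS fields (the rest of device setup is omitted):

```cpp
#include <d3d9.h>

D3DPRESENT_PARAMETERS pp = {};
pp.Windowed = TRUE;
pp.SwapEffect = D3DSWAPEFFECT_DISCARD;
pp.PresentationInterval = D3DPRESENT_INTERVAL_ONE;  // wait for vblank
// D3DPRESENT_INTERVAL_IMMEDIATE presents as fast as possible -> tearing
```

Since this is XNA, the equivalent switch is GraphicsDeviceManager.SynchronizeWithVerticalRetrace = true.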
|
# ? Jun 6, 2011 02:42 |
|
Can OpenGL triangle strips only ever be 1 triangle "tall"? For example this image I would describe as 1x4 triangles. What's the simplest way to instead draw 4x4 triangles? Should I use multiple index buffers on one vertex buffer?
Mata fucked around with this message at 08:16 on Jun 20, 2011 |
# ? Jun 20, 2011 08:10 |
|
Mata posted:Can triangle strips only ever be 1 triangle "tall"? For example this image I would describe as 1x4 triangles. What's the simplest way to instead draw 4x4 triangles? Should I use multiple index buffers on one vertex buffer? You can use degenerate triangles: by repeating vertices, you create invisible triangles that you can use to include a discontinuity in a single triangle strip (like a jump to a second layer) If it's supported, you can also use an index of -1 (0xffff or 0xffffffff) to do the same. This is called strip-cut index in D3D and the Primitive Restart extension in OGL.
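A sketch of the degenerate-triangle approach: one index buffer covering a whole cols x rows grid of quads as a single strip, with two repeated indices between rows forming zero-area (invisible) triangles. Vertices are assumed laid out row-major in a (cols+1) x (rows+1) vertex buffer; the function name is made up.

```cpp
#include <vector>

std::vector<unsigned> gridStripIndices(int cols, int rows) {
    const int w = cols + 1;                 // vertices per row
    std::vector<unsigned> idx;
    for (int r = 0; r < rows; ++r) {
        if (r > 0) {
            idx.push_back(idx.back());      // repeat the last index...
            idx.push_back(r * w);           // ...and the next row's first:
        }                                   // degenerate bridge, draws nothing
        for (int c = 0; c <= cols; ++c) {
            idx.push_back(r * w + c);       // top vertex of the column
            idx.push_back((r + 1) * w + c); // bottom vertex of the column
        }
    }
    return idx;
}
```

Draw the whole thing with one glDrawElements(GL_TRIANGLE_STRIP, ...) call; with primitive restart you'd emit the restart index instead of the two repeated ones.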
|
# ? Jun 20, 2011 08:21 |
|
zzz posted:You can use degenerate triangles: by repeating vertices, you create invisible triangles that you can use to include a discontinuity in a single triangle strip (like a jump to a second layer) Looks like primitive restart is exactly what I was looking for thanks!
|
# ? Jun 20, 2011 08:38 |
|
I don't know if I asked before in this thread but I'll go ahead. How would I go about making multiple lights in my phong lighting shader? What I've done is have a uniform array of fixed size (say 100) to which I pass lighting information like position/direction/size/type, and loop about 100 times through the array. From this thread I've heard that a variable loop is pretty bad so I kept it fixed at 100 and put in an if statement to see if the current element is enabled. If there are more than 100 lights, I just render the scene again with the lights it didn't go through and blend it with the previous rendered scene. I'm not sure if this is the right way, and if you guys want I'll put up the source code. It has some if statements in it anyways and I'm not sure how well shaders handle branching. I'd also like to know how to handle attenuation of spot lights as everywhere I looked, they seem to use fixed values. I assumed I'd just linearly make less light depending on the distance but went with the fixed attenuation values.
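For reference, the pattern being described looks roughly like this in GLSL (a sketch; all uniform names are made up). The loop bound is a compile-time constant so old hardware can unroll it, and a uniform light count replaces the per-element "enabled" flag:

```glsl
// Sketch (GLSL): fixed upper bound keeps the loop unrollable; the break
// on the uniform count is cheaper than testing an enable flag per light.
#define MAX_LIGHTS 100

uniform int  u_numLights;                  // how many entries are valid
uniform vec4 u_lightPosType[MAX_LIGHTS];   // xyz = position, w = type
uniform vec3 u_lightColor[MAX_LIGHTS];

vec3 shadeAllLights(vec3 P, vec3 N) {
    vec3 result = vec3(0.0);
    for (int i = 0; i < MAX_LIGHTS; ++i) {
        if (i >= u_numLights) break;
        vec3 L = normalize(u_lightPosType[i].xyz - P);
        result += u_lightColor[i] * max(dot(N, L), 0.0);
    }
    return result;
}
```

On really old SM2-class hardware even this loop can be a problem, in which case the multi-pass blending fallback described above is the way to go.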
|
# ? Jul 15, 2011 20:23 |
|
ShinAli posted:I don't know if I asked before in this thread but I'll go ahead. If you can give up blended transparency look into using 'deferred shading', it's complicated but oh so good for lots of lights. (and there's ways to get transparency back if you really want it)
|
# ? Jul 15, 2011 20:50 |
|
Unormal posted:If you can give up blended transparency look into using 'deferred shading', it's complicated but oh so good for lots of lights. (and there's ways to get transparency back if you really want it) That's exactly what I've been using, and I seem to be able to go to about a 1000 lights before it slows down below 30 fps. I just don't know if I'm doing it right.
|
# ? Jul 15, 2011 21:20 |
|
ShinAli posted:I'd also like to know how to handle attenuation of spot lights as everywhere I looked, they seem to use fixed values.
angle = dot(normalize(point - lightOrigin), normalize(lightDirection))
attenuation = saturate((angle - cosMaxAngle) / (cosMinAngle - cosMaxAngle))
Min angle = Minimum angle where the light source stops being visible at full intensity (might still be visible through a diffuser)
Max angle = End of the fade-out angle, i.e. the angle where neither the light source nor diffusers are visible. OneEightHundred fucked around with this message at 21:36 on Jul 15, 2011 |
# ? Jul 15, 2011 21:31 |
|
ShinAli posted:I'm not sure how well shaders handle branching. Very, very poorly. Avoid if at all possible. Depending on what you are doing it may be faster to evaluate both branches and multiply the one you don't want to use by zero before combining it with the final result.
|
# ? Jul 15, 2011 21:34 |
|
ShinAli posted:That's exactly what I've been using, and I seem to be able to go to about a 1000 lights before it slows down below 30 fps. I just don't know if I'm doing it right. Generally if you're using a deferred shader, you shouldn't be branching in a single shader, you should be rendering a single quad (or sphere or whatever) per light volume.
|
# ? Jul 15, 2011 21:51 |
|
OneEightHundred posted:angle = dot(normalize(point - lightOrigin), normalize(lightDirection)) Argh, I actually meant point lights but this is still helpful as I need to implement spot lights anyways. For point lights, I'd assume you'd use two "sizes" where one is full intensity and the other is where the fall off would end. Would I just measure up the fall off size in some proportion of the intensity size? I'm trying to think on how to use angles as a part of this but it would seem that I need to know the range of the light anyways before I can take the angle into consideration. Unormal posted:Generally if you're using a deferred shader, you shouldn't be branching in a single shader, you should be rendering a single quad (or sphere or whatever) per light volume. I mostly wanted to batch as many lights as possible in a single pass to avoid doing a draw call for every light. Does it not matter as much if I do one light per pass? haveblue posted:Very, very poorly. Avoid if at all possible. Actually I don't know why I didn't think of that, as I use a 1.0 for on and 0.0 for off. ShinAli fucked around with this message at 22:37 on Jul 15, 2011 |
# ? Jul 15, 2011 22:33 |
|
ShinAli posted:For point lights, I'd assume you'd use two "sizes" where one is full intensity and the other is where the fall off would end. You can clamp the distance at a minimum value to avoid "hot spots" from the hyperbolic growth, but that distance is usually best kept low. There is no distance where a light will not actually affect a visible surface, but if you want to cull away negligibly-affected surfaces, a common criterion is sqrt(intensity*256), which is the distance where the light would fail to change the value of a pixel by itself. Intensity in this case would be the highest of the three RGB intensity values. OneEightHundred fucked around with this message at 18:02 on Jul 17, 2011 |
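The clamp and the culling radius from that post, written out (a sketch; function names are made up):

```cpp
#include <algorithm>
#include <cmath>

// Inverse-square falloff with the distance clamped at a minimum value,
// which caps the intensity near the light and avoids "hot spots".
double pointLightAttenuation(double intensity, double dist, double minDist) {
    double d = std::max(dist, minDist);
    return intensity / (d * d);
}

// Culling radius: beyond sqrt(intensity * 256) the light contributes
// less than 1/256 -- under one step of an 8-bit color channel.
double cullRadius(double intensity) {
    return std::sqrt(intensity * 256.0);
}
```

At exactly the cull radius the attenuated contribution works out to 1/256, which is why that criterion marks the light as invisible on its own.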
# ? Jul 17, 2011 18:00 |
|
I ran into something I don't understand today while working on a shader - I was sampling a mipmapped texture and the shader ran fine. Then I changed the texture and forgot to generate the mipmap and the framerate absolutely tanked. When I generated the mipmap on the new texture things ran great again. The obvious conclusion is to be sure to use mipmaps, but I don't understand why that makes such a difference in the framerate. Sampling a texture is sampling a texture, and if anything I'd have guessed that translating the UV coords to the mipmap would be more work for the GPU. Why is sampling a mipmapped texture so much faster than sampling a non-mipmapped texture?
|
# ? Jul 22, 2011 04:15 |
|
Two possibilities: the driver is doing something dumb like generating the mip levels on-demand for each fragment evaluated, or the un-mipped texture is thrashing the texture cache when it gets minified, since adjacent screen pixels end up fetching texels that are far apart in memory.
|
# ? Jul 22, 2011 04:59 |
|
PDP-1 posted:I ran into something I don't understand today while working on a shader - I was sampling a mipmapped texture and the shader ran fine. Then I changed the texture and forgot to generate the mipmap and the framerate absolutely tanked. When I generated the mipmap on the new texture things ran great again. Better data locality. If you map the same region of the texture to a surface, the mipmap one requires fewer memory fetches.
|
# ? Jul 22, 2011 07:09 |
|
I doubt that the driver is generating anything on-demand since the un-mipped texture turns into visual noise at long draw distances. This was my clue to look at the mipmap status to begin with, but also suggests that the full size texture is being used directly if lower detail levels aren't available. The cache/data locality issue seems like it could be the cause. I have a shitton of data set in vertex buffers so it is likely that a lot of cache swapping is going on in general and loading the full 512x512 texture + bumpmap would take some time. Thanks for the help.
|
# ? Jul 22, 2011 14:20 |
|
haveblue posted:The driver is doing something dumb like generating the mip levels on-demand for each fragment evaluated. If this is with D3D, remember that the sampler state is not part of the texture state like it is with OpenGL, so it's possible that the sampler state doesn't match up with how the texture data is stored.
|
# ? Jul 22, 2011 17:27 |
|
I'm thinking about rendering a translucent spheroid as a "shield", and it strikes me that rendering it as simply a colored polygonal approximation of a spheroid with an alpha value will result in a totally flat appearance on screen. In reality, you'd see both the front and the back of a transparent spheroid, and it would look 'denser' at the edges because you're looking through a thicker piece of the surface. Would this sort of effect be done with a shader that increases the alpha the more perpendicular the normal is to the camera? Alternatively, what is a nicer way of rendering a visible forcefield around an object?
|
# ? Aug 15, 2011 18:35 |
|
That would be the easiest way to fake it, yes. Comparing the normal to the camera vector is just a dot product.
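A minimal fragment-shader sketch of that rim trick (variable names are made up; the power term controls how tight the bright edge is):

```glsl
// Sketch (GLSL): alpha grows as the surface normal turns away from the
// viewer, so the spheroid's silhouette reads as "denser" than its center.
uniform vec3 u_shieldColor;
varying vec3 v_normal;     // interpolated surface normal (view space)
varying vec3 v_viewDir;    // unit vector from fragment toward camera

void main() {
    float facing = abs(dot(normalize(v_normal), normalize(v_viewDir)));
    float rim = pow(1.0 - facing, 2.0);   // 0 head-on, 1 at the silhouette
    gl_FragColor = vec4(u_shieldColor, rim);
}
```

To get the "both front and back" look, draw the shield twice with blending on: back faces first, then front faces.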
|
# ? Aug 15, 2011 19:47 |
|
Just curious, having seen this on the feature list of OpenGL 4.2:quote:modifying an arbitrary subset of a compressed texture, without having to re-download the whole texture to the GPU for significant performance improvements; Couldn't you already do that with glCompressedTexSubImage2D?
|
# ? Aug 19, 2011 16:07 |
|
Vector math question - if I have an "up" and "forward" vector that I manipulate (in an ongoing cumulative manner) using pitch and roll rotations, how can I correct for the creep of float inaccuracy? I can renormalize them, obviously, but I imagine they'd still slowly creep away from being at right angles - how would I best bring them back in order? Thinking something like, in pseudocode: code:
roomforthetuna fucked around with this message at 04:24 on Aug 21, 2011 |
# ? Aug 21, 2011 04:15 |
|
Yeah, orthonormalizing axes is a common need when you're independently tweaking them over time. Have a look around - there are a few algorithms which minimize inaccuracy. Alternatively: quaternions.
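The simplest of those is a Gram-Schmidt pass, which is roughly what the pseudocode above is getting at (a sketch with plain arrays; it treats forward as the trusted axis and corrects up against it):

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;

static double dot(const Vec3& a, const Vec3& b) {
    return a[0]*b[0] + a[1]*b[1] + a[2]*b[2];
}

static Vec3 normalized(const Vec3& v) {
    double len = std::sqrt(dot(v, v));
    return {v[0]/len, v[1]/len, v[2]/len};
}

// Re-orthonormalize: keep 'forward' as the trusted axis, strip the
// component of 'up' that leaks into it, then renormalize both.
void orthonormalize(Vec3& forward, Vec3& up) {
    forward = normalized(forward);
    double k = dot(up, forward);
    up = normalized({up[0] - k*forward[0],
                     up[1] - k*forward[1],
                     up[2] - k*forward[2]});
}
```

Run it every frame (or every few frames) after applying the pitch/roll rotations; float creep stays bounded because each pass restores exact-ish right angles.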
|
# ? Aug 21, 2011 10:00 |
|
ynohtna posted:Alternatively: quaternions.
|
# ? Aug 21, 2011 16:44 |
|
I'm getting some pretty crusty lines with OpenGL ES 2.0. I am multisampling / anti-aliasing, but lines with glLineWidth(1.0) are pretty chunky: (image zoomed a bit) Is there a common solution to this that I'm missing?
|
# ? Aug 21, 2011 20:35 |
|
I am new to GL 4.1 and haven't made use of shaders before. I have a GLSL shader which has an in vec3 variable called v_position in the vertex shader and an out vec4 called outputColour in the fragment shader. I'm wondering why it still works if I comment out the following code. code:
|
# ? Aug 24, 2011 06:35 |
|
Psychic debugging: your shader uses the location keyword.
|
# ? Aug 24, 2011 08:02 |
|
It doesn't, which adds to my confusion.
|
# ? Aug 24, 2011 13:15 |