|
UraniumAnchor posted:So if you have a moderately complex shader that could do one of two things depending on a boolean flag, would it be better to just have two versions of the shader and switch between the two, rather than having a boolean uniform? I'm still not clear on how much of a stall you might get in the pipeline by switching shader stuff around.
|
# ? Apr 7, 2011 19:09 |
|
roomforthetuna posted: Yeah, I figured that was why, which is why I unlooped it before I posted.

Why exactly do you want to check for equality? In general, you shouldn't be checking for equality with floating points (in normal code or shaders). There are ways to do so (such as checking that the number lies within a certain bound near the other), but you can often alter the algorithm to work without the check.
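For what it's worth, the tolerance check looks something like this in C++ (a sketch; the function name and epsilon value are arbitrary choices you'd tune for your data's range):

```cpp
#include <cmath>

// Compare two floats for "equality" within an absolute tolerance.
// A fixed epsilon is fine for values of known, bounded magnitude
// (like normalized color components); for values of arbitrary
// magnitude you'd scale the tolerance by the operands instead.
bool nearlyEqual(float a, float b, float epsilon = 1e-5f)
{
    return std::fabs(a - b) <= epsilon;
}
```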
|
# ? Apr 7, 2011 19:24 |
|
HappyHippo posted: Why exactly do you want to check for equality? In general, you shouldn't be checking for equality with floating points (in normal code or shaders). There are ways to do so (such as checking that the number lies within a certain bound near the other), but you can often alter the algorithm to work without the check.

The reason why is so that I can make very lazily configurable outfits - I can simply make a mesh with diffuse values at the vertices, and if the diffuse value (before lighting) is equal to COLOR_I_WOULD_NEVER_USE_1 or COLOR_I_WOULD_NEVER_USE_2, the vertex shader can replace it with shader_global_dye_color_1 or shader_global_dye_color_2.

Edit: also another question, will the compiler optimize "if (a==b) a=c;" so it doesn't branch, or do I have to manually write it less readably into a calculation of some sort?

roomforthetuna fucked around with this message at 20:30 on Apr 7, 2011
# ? Apr 7, 2011 20:22 |
|
There's not really a need for per-vertex values if you're just using paint maps.
|
# ? Apr 7, 2011 20:26 |
|
OneEightHundred posted: There's not really a need for per-vertex values if you're just using paint maps.

(I don't know what paint maps means and Google doesn't help with that.)
|
# ? Apr 7, 2011 20:37 |
|
Paint maps are just textures where each channel is the intensity of a paint color to be added to the base texture before any further processing. You can then set the paint colors as uniforms and you get up to 4 paint colors per texture. Like if you had the stripes on an athletic uniform, they'd be black on the base texture, one of the channels on the paint map would be white at the stripe locations, and you'd set the uniform representing the color for that channel as whatever the stripe colors should be.

albedo = baseTexture + paintColorR*paint.r + paintColorG*paint.g + paintColorB*paint.b + paintColorA*paint.a

OneEightHundred fucked around with this message at 21:21 on Apr 7, 2011
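That formula written out as plain scalar C++, for clarity (names here are made up for illustration; in a real shader the paint colors would be uniforms and `paint` would be a texture sample):

```cpp
struct Color3 { float r, g, b; };

// Blend up to four paint colors into a base texel. Each channel of
// the paint map (pr, pg, pb, pa) is the intensity mask for the
// corresponding paint color, exactly as in:
//   albedo = base + paintColorR*paint.r + ... + paintColorA*paint.a
Color3 applyPaint(Color3 base,
                  float pr, float pg, float pb, float pa,
                  Color3 cr, Color3 cg, Color3 cb, Color3 ca)
{
    Color3 out = base;
    out.r += cr.r * pr + cg.r * pg + cb.r * pb + ca.r * pa;
    out.g += cr.g * pr + cg.g * pg + cb.g * pb + ca.g * pa;
    out.b += cr.b * pr + cg.b * pg + cb.b * pb + ca.b * pa;
    return out;
}
```

So a black stripe on the base texture with a white red-channel mask picks up whatever color you put in the red-channel uniform.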
# ? Apr 7, 2011 21:17 |
|
Ah, I see, and you'd do that in the pixel shader. You've told me about that before, I think, and it makes complete sense now that I've played with shaders a bit. But surely that's significantly more GPU-intensive than having a color set at the vertex? I'm probably not even going to be using textures because my 'bodies' are small enough that a texture would almost completely be a waste, and also because I'm not good with the modelling programs and wrapping textures onto things. (And also because the untextured aesthetic suits the game setting anyway.)

Edit: I think I've got a reliable vertex-color-painting working now - I just send a D3DXCOLOR set with the same value as the "to be painted" diffuse value, to a shader global, so then the shader has a float exactly the same as it will get from the vertex with the "paint-me" RGB value, so an equality comparison works fine. It's all good so long as it's not compiling as a branch, which I'm fairly sure it's not.

roomforthetuna fucked around with this message at 22:33 on Apr 7, 2011
# ? Apr 7, 2011 22:28 |
|
Quick question for anyone who's ever written a software renderer or just understands simple depth sorting in general: in my software renderer at the moment I have all the meshes push triangles onto a single stack that's then sorted by the centre Z value of the triangle (mean average of the vertices), which works great for rendering single objects. The issue arises when there's one object on top of another where some of the triangles in the bottom object are technically closer, like so: (at the base of the turret thing)

Anyway my question is, is it as simple as just processing sets of triangles per mesh as opposed to all at once (I'd try this without asking but it requires me rethinking how to group stuff up with regards to the scene graph's matrix stack) or is this a more complicated issue? Is this what Z-fighting is all about?

brian fucked around with this message at 04:09 on Apr 8, 2011
# ? Apr 8, 2011 04:06 |
|
So you're just using the painter's algorithm (each triangle is drawn in whole on top of whatever's "behind" it)? If you want "correct"/per-pixel depth sorting, then you'll have to implement a depth/z buffer, just like graphics cards do.
|
# ? Apr 8, 2011 08:28 |
|
Yeah, it's the big weakness of the painter's algorithm, it just doesn't work if you've got triangles that have different relative Z-orders at different screen points and happen to intersect. Not much you can do except go with a depth buffer like Screeb said, unless you're willing to do something like detect problematic cases in real-time and split the triangles so that each triangle is always completely behind or on top of another. I have no idea how that would perform and how complex it would be, but my guess is "not very well" and "unpleasantly so", respectively. Just go with a depth buffer.
|
# ? Apr 8, 2011 17:12 |
|
brian posted: Is this what Z-Fighting is all about?

Z-fighting is a depth buffer artifact caused by rounding error, where the difference between when two surfaces cross rounding boundaries can add non-linear error to the difference between their depth values. If that error is the difference between the surface being occluded or not, and the distance of the surfaces doesn't separate faster than they can clear the rounding oscillations, then you get striping artifacts. i.e. if you have two lines, where 1 = only line 1, 2 = only line 2, * = overlap, where depth is the vertical axis: code:
code:
OneEightHundred fucked around with this message at 18:22 on Apr 8, 2011 |
# ? Apr 8, 2011 17:52 |
|
Roger dodger thanks chaps, onto Z bufferin'!
|
# ? Apr 8, 2011 18:23 |
|
brian posted: Roger dodger thanks chaps, onto Z bufferin'!

It's been a long time since I wrote a software renderer, but one of the big costs back in the day was clearing the z-buffer every frame. A dumb trick that used to really speed it up: use only a "chunk", say half of your z-buffer's depth resolution, each frame, adding a 0.5 increment to each write every other frame and only clearing every other frame; or use a quarter of the resolution, add 0.25 each frame, and clear every 4 frames, for example. You lost some depth resolution, but it would take z-buffering from "not fast enough for real-time" to "totally fast enough", so it was worth it back in the mode 13h 320x200 days. Though I'm not sure if clearing the z-buffer is a significant hit these days.
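A sketch of the half-resolution variant of that trick, assuming a less-than depth test and a buffer cleared to the far value (the bias direction and names are my assumptions; the idea is that each frame's depths are squeezed into half the range, arranged so the new frame's values always beat the previous frame's leftovers):

```cpp
#include <algorithm>
#include <vector>

// Z-buffer that is only cleared every other frame. Frame 0 (right
// after a clear) maps depths into [0.5, 1.0); frame 1 maps them into
// [0.0, 0.5). Every frame-1 value is therefore smaller than any
// frame-0 leftover, so the stale contents act as if cleared.
struct LazyZBuffer {
    std::vector<float> depth;
    int frame = 0;

    explicit LazyZBuffer(size_t n) : depth(n, 1.0f) {}

    // Map a depth in [0, 1) into this frame's half of the range.
    float biased(float z) const {
        return 0.5f * z + (frame == 0 ? 0.5f : 0.0f);
    }

    bool testAndWrite(size_t i, float z) {
        float b = biased(z);
        if (b < depth[i]) { depth[i] = b; return true; }
        return false;
    }

    void endFrame() {
        frame ^= 1;
        if (frame == 0)  // actually clear only every other frame
            std::fill(depth.begin(), depth.end(), 1.0f);
    }
};
```

The cost is one bit of depth precision; the win is halving the number of full-buffer clears.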
|
# ? Apr 8, 2011 22:10 |
|
Clearing is always a cost, though it's certainly faster now. Of course, you have to consider that your framebuffer is likely to be much larger as well. There are a ton of things you can do. You can try to make a "fastclear" path that only clears pixels that have been touched. You can do Hi-Z. Some combination of the two will allow you to categorize blocks of the zbuffer that need to be cleared. Say you mark 4x4 blocks as dirty, then you know you need to clear them, etc. That also means you need to store your buffers in block-linear order, which is better than straight linear anyway.
|
# ? Apr 9, 2011 00:41 |
|
Is there any good resource for all this? I implemented it in what I can only assume is a terrible manner, as while it fixed some of the issues it created similar issues at the back of the scene and slowed the whole process down to 15fps from the already low 35 it runs at normally. At the moment I'm just taking the pre-projection camera-space z coords and interpolating them for each pixel as I rasterize between the two edges in question. I'm sure I can improve the rasterization using the Bresenham method someone suggested, but it seems like I'm shooting in the dark when it comes to optimal methods. Couple that with this just being a quick fun throwaway project: if it's going to get insanely complicated for little result I'd rather stick with painter's and just make the game around it. The game itself is going to be a space shooter where you're a stationary turret shooting spaceships, so there's unlikely to be a huge issue if I just model the turret base correctly, as each ship is unlikely to get close to another and the ones I've done already have had no problems on this front.
|
# ? Apr 9, 2011 01:07 |
|
There are lots of things to think about, but I don't have a good reference off the top of my head, unfortunately. Bresenham is sort of the gold standard way to do it. There are variants, of course, but it's actually quite simple to implement. Other stuff to think about: Are you doing the z-cull in eye space or clip space? What are you doing to map to pixels? How are you interpolating (ie, you should do adds instead of mul/divide per pixel)? Are you working in floating point or with integers?
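On the interpolation point, "adds instead of mul/divide per pixel" means forward differencing: one divide per span, then a single add per pixel. A sketch (names assumed):

```cpp
#include <vector>

// Interpolate a depth value across a horizontal span of pixels with
// one divide up front and one add per pixel, instead of recomputing
// the lerp (with a multiply or divide) at every pixel.
std::vector<float> interpolateSpan(float z0, float z1, int pixels)
{
    std::vector<float> out;
    out.reserve(pixels);
    float z  = z0;
    float dz = (pixels > 1) ? (z1 - z0) / float(pixels - 1) : 0.0f;
    for (int i = 0; i < pixels; ++i) {
        out.push_back(z);
        z += dz;            // forward differencing: add, no divide
    }
    return out;
}
```

Note that for perspective-correct results you'd interpolate 1/z (or clip-space z/w) linearly in screen space rather than eye-space z itself, which connects to the eye-space vs clip-space question above.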
|
# ? Apr 9, 2011 07:12 |
|
Ughhhhhhh, I'm having some OpenGL rasterization issues that are driving me crazy. I'm rendering bitmap fonts from a texture atlas; here the individual letters are moving around:

Good frame:

Bad frame:

The effects are much more noticeable out of the box - right now I'm doing a glTranslatef(0.05f, 0.05f, 0.f) before I render anything, and the texture atlas is set up as follows: code:
code:
Any ideas?
|
# ? Apr 10, 2011 04:51 |
|
You're screwing up the region it's texture mapping from by mixing adds/subtracts with the rasterization offset, and you shouldn't be using a rasterization offset anyway because you're not using D3D9.

Quick texture mapping/rasterization 101: (0,0) is the CORNER of the corner texel on the texture map, not the center of it. Ditto for pixels on the screen (unless you're using D3D9, where it's the center because Microsoft hates you). The simple rule for drawing stuff to the pixel is to include the entire pixel on the texture and on the screen. That means your texture coordinates should go from one corner of the pixel region you want to pull to the other corner, and the area you're drawing to on the screen should also include the full pixels. If you're going to use not-quite-aligned coordinates on the screen and nearest-neighbor filtering, then go ahead, but you should still use corner-to-corner coordinates on the texture.
|
# ? Apr 10, 2011 06:48 |
|
Ahhh sorry, my post was super incoherent because I wrote it at six in the morning. Thanks for the response. I was trying to say in my post above that without the rasterization offset and translate, the problems are even worse. They are less frequent, but much more noticeable. Observe:
|
# ? Apr 10, 2011 12:27 |
|
No I mean don't do the rasterization offset thing on the texture coordinates. If you're trying to draw entire pixels from a texture map, then the texture coordinates need to be from the corner of one pixel on the texture to the opposite corner of another pixel. So they should be stuff like (position.x + size_x) / width and that's it.
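In code, corner-to-corner for a glyph occupying texels [x, x+w) by [y, y+h) of the atlas is just this (a sketch, names assumed): no half-texel offsets, no rasterization fudge factors.

```cpp
struct UVRect { float u0, v0, u1, v1; };

// Texture coordinates covering whole texels, corner to corner:
// from the corner of the first texel to the far corner of the last.
UVRect glyphUVs(int x, int y, int w, int h, int atlasW, int atlasH)
{
    UVRect r;
    r.u0 = float(x)     / float(atlasW);
    r.v0 = float(y)     / float(atlasH);
    r.u1 = float(x + w) / float(atlasW);
    r.v1 = float(y + h) / float(atlasH);
    return r;
}
```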
|
# ? Apr 10, 2011 16:26 |
|
OneEightHundred posted: No I mean don't do the rasterization offset thing on the texture coordinates. If you're trying to draw entire pixels from a texture map, then the texture coordinates need to be from the corner of one pixel on the texture to the opposite corner of another pixel.

It might help to make the font texture bigger. You'd still get artifacts but they'd look less bad. Then with a bigger texture you could also use linear sampling, which would get you a slightly blurrier result but without the honking great chunks of illegible. (Or, only use your font in nice straight lines at a 1:1 scale, in which case point filtering will give you the best results and you won't get artifacts.)

roomforthetuna fucked around with this message at 16:43 on Apr 10, 2011
# ? Apr 10, 2011 16:41 |
|
Sorry, I'm really not explaining myself well.

OneEightHundred posted: No I mean don't do the rasterization offset thing on the texture coordinates. If you're trying to draw entire pixels from a texture map, then the texture coordinates need to be from the corner of one pixel on the texture to the opposite corner of another pixel.

Right, unless I'm misunderstanding you now, that's exactly what I'm doing in my most recent post above.

roomforthetuna posted: I think the problem is you can't do distorted fonts like in the screenshots with "point" sampling (which I assume is what's being used since there's no half-lit pixels) without getting artifacts - in a moving distortion like that there'll always be some point where one pixel's v is 1.001 and the next pixel's v is 1.999, and you only wanted the line one pixel thick but now it's two. (Or alternatively, one pixel's v is 0.999 and the next is 2.001 so you missed that texture pixel entirely.)

My fault for not explaining: everything is being rendered as an axis-aligned quad at its original size. The wavy text is just because I'm shifting individual letters vertically. Also, the shots above were rendered at 320x240 and magnified 2x just for the sake of posting. I'm not sure I can really solve the problem the way I wanted to; I think due to floating point inaccuracies you'll always have rasterization problems unless you do some kind of rounding before sending vertex coordinates to OpenGL.

I have another question. How should one typically use VBOs for dynamic objects that are frequently moving on and off screen? Right now I'm just using glDrawElements with a client-side index array. Every frame, I iterate over all onscreen objects and collect their indices into an array and pass it to glDrawElements. When an object is offscreen, it remains in the VBO, I just don't pass its indices to glDrawElements.

There are so many different ways to do the same thing, I have no intuition about how to compare different approaches and have trouble choosing one. For instance, I could use a server-side index buffer, zero out elements as soon as they go off screen, do some primitive memory management to reuse available parts of the index buffer, keep track of the maximum index used, and pass all of that to glDrawRangeElements. Is that better than building a client-side index buffer of visible objects on the heap, passing it to glDrawElements, and then discarding it every single frame? I realize it depends on the application and I can't find an exact answer without trying it out and doing some profiling, but I need some kind of guiding principle to follow on my initial implementation, otherwise I might as well do anything, right?
|
# ? Apr 10, 2011 21:58 |
|
First, I'm confused. If the object isn't physically changing shape, etc, you don't need to dynamically update its vertex/index data. Unless I'm missing your point... For text, most people pass a sysmem pointer to index data.

Usually you want to double-buffer VBOs.
Frame 1: Mod VBO A, draw with A
Frame 2: Mod VBO B, draw with B

Think of it this way: all GL commands get put into a queue that will be pushed to the GPU (a "command buffer"). However, you can modify stuff in VRAM via DMA as well, so you cannot* modify something that is currently being used to draw. If you have 2 VBOs with the same data, you can get around this (and vertex data tends to be small, so it doesn't really hurt your memory usage). This is true for textures, vertex data, cbuffers, etc (I've seen several DX10 apps that try to use one cbuffer and end up waiting after every draw).

*D3D's LOCK_DISCARD and GL's FlushMappedBufferRange will allow you to modify stuff in use - it's up to you to make sure you don't smash your own memory.
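Boiled down to its bookkeeping, the ping-pong is just an alternating buffer index; a sketch with the GL calls left as comments, since the pattern itself is the point (names assumed):

```cpp
// Alternate between two VBOs so the CPU never writes the buffer the
// GPU may still be reading from. updateAndDraw() returns which
// buffer was used this frame.
struct DoubleBufferedVBO {
    unsigned vbo[2] = {0, 0};   // would come from glGenBuffers
    int frame = 0;

    int updateAndDraw() {
        int cur = frame & 1;
        // glBindBuffer(GL_ARRAY_BUFFER, vbo[cur]);
        // glBufferSubData(...);   // safe: GPU reads the *other* buffer
        // glDrawElements(...);
        ++frame;
        return cur;
    }
};
```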
|
# ? Apr 10, 2011 23:02 |
|
Spite posted: Usually you want to double-buffer VBOs.

Another option for getting discard behavior on OpenGL buffers is to use glBufferData with a NULL data pointer.
|
# ? Apr 10, 2011 23:24 |
|
I've implemented the marching cubes algorithm and I'm getting some strange behavior that I'm hoping someone here can help me explain. My code produces results that look correct so long as I make the magnitude of the function I'm extracting an isosurface from big enough; if it's too small I get a surface full of really nasty holes.

Here's an example: this image is the 5000 level set of 1000*(sqrt(r2) + cos((x + y)/2.0) + cos((y + z)/2.0)):

But here's the 5 level set of sqrt(r2) + cos((x + y)/2.0) + cos((y + z)/2.0) over the same domain:

In case it's not obvious from the second picture, there are holes EVERYWHERE:

This happens regardless of the level of detail or the range of the coordinate axes. As far as I can tell the only factor that affects it is the magnitude of the function I'm grabbing a level set from. Does anyone know what's going on here?
|
# ? Apr 11, 2011 07:06 |
|
Given that each sample is evaluated 8 times (as each of 8 corners of the cube), is it possible that a vertex passes the threshold on some evaluations and not others? Maybe the threshold value is being computed for each comparison and the optimizer is generating code with slightly varying numerical behavior. The surface can be closed only if every sample is found to be greater than threshold all 8 times or less all 8 times.
|
# ? Apr 11, 2011 07:54 |
|
Spite posted: First, I'm confused. If the object isn't physically changing shape, etc, you don't need to dynamically update its vertex/index data. Unless I'm missing your point...

Maybe I'm using VBOs in kind of a strange way. All I'm rendering are a bunch of axis-aligned quads that (at least right now) are all textured from the same atlas. Whenever an object in the scene moves, I need to update its vertex coordinates in the VBO. Whenever a sprite animates, I need to update its tex coords. This way I can draw all of the objects in the scene with a single OpenGL call. Is this a bad approach?

OneEightHundred posted: Why would you do this instead of just using discard?

Doesn't this mean I'll have to completely repopulate the VBO? If I'm doing that every frame there's no real benefit over something like vertex arrays, right?
|
# ? Apr 11, 2011 15:22 |
|
Fecotourist posted: Given that each sample is evaluated 8 times (as each of 8 corners of the cube), is it possible that a vertex passes the threshold on some evaluations and not others? Maybe the threshold value is being computed for each comparison and the optimizer is generating code with slightly varying numerical behavior.

I thought about this but I don't think so. The entire volume is pre-computed and the threshold doesn't change while the algorithm is running, so it's quite literally comparing the same numbers each time it revisits a corner.
|
# ? Apr 11, 2011 15:25 |
|
Nippashish posted: I thought about this but I don't think so. The entire volume is pre-computed and the threshold doesn't change while the algorithm is running, so it's quite literally comparing the same numbers each time it revisits a corner.

For what it's worth, in my marching cubes implementation I'll occasionally get degenerate triangles (which look like holes in the mesh) at high frequency regions. The coarser the marching cubes grid, the worse the problem gets. I don't know if that's the same issue you're having since you are working with a perfectly smooth mesh, but you could try increasing your sampling density and see if that improves the problem. If it does then I think it's just an inherent issue with marching cubes, and if you aren't able to handle the increased polycount at acceptable framerates you may need to use a different algorithm or pass the mesh through some kind of massaging pass to fix holes and/or simplify the number of faces.
|
# ? Apr 11, 2011 15:43 |
|
not a dinosaur posted:Doesn't this mean I'll have to completely repopulate the VBO? If I'm doing that every frame there's no real benefit over something like vertex arrays right?
|
# ? Apr 11, 2011 20:09 |
|
OneEightHundred posted: Why would you do this instead of just using discard?

Because the driver has to allocate you new VRAM when you discard or orphan the buffer. Any decent driver keeps a list of these so it doesn't have to do an actual allocation, but I feel it's a better design to handle this yourself.

not a dinosaur posted: VBO stuff

Fewer calls is better, yeah. You've got a couple options. For something like that, instancing works really well: you'd only have to upload a transform vector per quad instead of all 4 vertices. On modern hardware, sending points down to the geometry shader and generating quads would also work, but that's almost certainly an unnecessary optimization. I'm not a fan of the geometry shader in general, so I try to stay away from it. Or you can double-buffer the VBO and update pieces as needed, which is essentially what you are doing now.
|
# ? Apr 11, 2011 20:42 |
|
Instancing looks cool, I'll have to play around with that. I don't think it really helps me too much though because most of my quads are different sizes. Eventually I figured I would write a geometry shader like you said. What's wrong with doing that? Also, how would double buffering VBOs help me? I'd have to repopulate the back buffer every frame, right?
|
# ? Apr 11, 2011 21:08 |
|
not a dinosaur posted: Instancing looks cool, I'll have to play around with that. I don't think it really helps me too much though because most of my quads are different sizes.

So you send the dimensions of the quad as part of the vertex.
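i.e. each quad's per-instance data carries a position and a size, and the four corners are derived from them. A CPU-side sketch of the expansion a vertex shader would do from a unit-corner attribute (names assumed):

```cpp
struct Vec2 { float x, y; };

// Expand a per-instance (position, size) pair into the four corners
// of an axis-aligned quad, the way a shader would scale and offset a
// unit corner attribute in [0,1]^2.
void expandQuad(Vec2 pos, Vec2 size, Vec2 out[4])
{
    const Vec2 corners[4] = { {0,0}, {1,0}, {1,1}, {0,1} };
    for (int i = 0; i < 4; ++i) {
        out[i].x = pos.x + corners[i].x * size.x;
        out[i].y = pos.y + corners[i].y * size.y;
    }
}
```

Differently sized quads are then no obstacle: the size is just another per-instance attribute.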
|
# ? Apr 11, 2011 21:22 |
|
Nippashish posted: I've implemented the marching cubes algorithm and I'm getting some strange behavior that I'm hoping someone here can help me explain. My code produces results that look correct so long as I make the magnitude of the isosurface I'm building big enough, if it's too small I get a surface full of really nasty holes.

I would be tempted to say there's an issue with your interpolation between values along edges to find the zero-threshold, but the fact that your first example works means this is unlikely (unless your values are so frickin huge that you're getting NaN results). The same goes for triangulation - the holes look like some of the marching cubes cases are screwing up, but the first example has no issues. The frequency shouldn't matter as far as holes go, as marching cubes inherently generates a manifold. The result might look ugly, but there shouldn't ever be holes if you're consistent with your tube case (where you connect opposite corners).

My guess is you might have screwed up the memory allocation or something. Are you linearly interpolating to find the zero-thresholds on the edges? If you're evaluating the actual function using some sort of minimization, it's possible that you could be getting results outside of the cube (and that would be slow as hell to compute anyways, not really worth it since marching cubes has enough aliasing problems).
|
# ? Apr 11, 2011 21:27 |
|
Nippashish posted: I've implemented the marching cubes algorithm and I'm getting some strange behavior that I'm hoping someone here can help me explain. My code produces results that look correct so long as I make the magnitude of the isosurface I'm building big enough, if it's too small I get a surface full of really nasty holes.

I'd take a closer look at your interpolating functions. In your first image there seems to be a kind of stepping or terracing feature which really shouldn't be there, since you have way more samples than you need to get a smooth isosurface. In your second image the terracing is worse, and regions with a high rate of change are starting to show holes. In the closeup image it looks like there are triangles being generated where the holes appear, but the vertices are in the wrong locations. These things would all make sense if the interpolating function was slightly wrong. Maybe try simplifying your function to a sphere and then adjust the radius of the sphere to see if holes appear at high curvature/small radius.

e: paulbourke.net gives some sample interpolation code if you want to compare it against your own. If you were calculating the slope (mu) wrong, say as mu = 1/(valp2 - valp1), you would get a situation where 'large' functions [ 1 << (valp2-valp1) ] produced a connected but step-ish surface like what you are seeing, since mu would go to zero and each vertex would essentially snap to the nearest grid point. Then when you get to 'small' functions [ 1 ~= (valp2 - valp1) ] the snapping effect goes away and you get a mish-mash of incorrect vertex locations. code:
PDP-1 fucked around with this message at 22:57 on Apr 11, 2011 |
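For comparison, the standard edge interpolation with the correct slope mu = (isolevel - valp1)/(valp2 - valp1), in Bourke's naming (a sketch; the degenerate-edge epsilon is an arbitrary choice):

```cpp
#include <cmath>

struct XYZ { float x, y, z; };

// Locate the isosurface crossing on the edge p1-p2 by linear
// interpolation. Note mu is (isolevel - valp1) / (valp2 - valp1),
// NOT 1 / (valp2 - valp1).
XYZ VertexInterp(float isolevel, XYZ p1, XYZ p2, float valp1, float valp2)
{
    if (std::fabs(valp2 - valp1) < 1e-9f)
        return p1;                          // degenerate edge: values equal
    float mu = (isolevel - valp1) / (valp2 - valp1);
    XYZ p;
    p.x = p1.x + mu * (p2.x - p1.x);
    p.y = p1.y + mu * (p2.y - p1.y);
    p.z = p1.z + mu * (p2.z - p1.z);
    return p;
}
```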
# ? Apr 11, 2011 22:08 |
|
If you are doing 3d functions you can use the bisection method instead of interpolating. It's pretty fast and very accurate after a few iterations.
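A sketch of that bisection, with the cube edge parameterized as a 1D function (the function signature, iteration count, and names are arbitrary; it assumes the endpoint values straddle the isolevel, which is guaranteed for any marching cubes edge that crosses the surface):

```cpp
#include <functional>

// Find the isosurface crossing on [a, b] by bisection, assuming
// f(a) - iso and f(b) - iso have opposite signs. A handful of
// iterations gives sub-cell accuracy; 16 halvings shrink the
// interval by a factor of 65536.
float bisect(const std::function<float(float)>& f, float iso,
             float a, float b, int iters = 16)
{
    float fa = f(a) - iso;
    for (int i = 0; i < iters; ++i) {
        float m  = 0.5f * (a + b);
        float fm = f(m) - iso;
        if ((fa < 0) == (fm < 0)) { a = m; fa = fm; }  // root in [m, b]
        else                      { b = m; }           // root in [a, m]
    }
    return 0.5f * (a + b);
}
```

Unlike linear interpolation, this nails the crossing even when the field is strongly non-linear along the edge, at the cost of re-evaluating the function each iteration (which is why it only works when you have an analytic function rather than sampled data).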
|
# ? Apr 11, 2011 22:10 |
|
PDP-1 posted: I'd take a closer look at your interpolating functions. In your first image there seems to be a kind of stepping or terracing feature which really shouldn't be there since you have way more samples than you need to get a smooth isosurface. In your second image the terracing is worse, and regions with a high rate of change are starting to show holes. In the closeup image it looks like there are triangles being generated where the holes appear, but the vertices are in the wrong locations. These things would all make sense if the interpolating function was slightly wrong.

HappyHippo posted: If you are doing 3d functions you can use the bisection method instead of interpolating. It's pretty fast and very accurate after a few iterations.

Good call on this. I replaced my linear interpolation with bisection and the holes disappear.

PDP-1 posted: e: paulbourke.net gives some sample interpolation code if you want to compare it against your own.

That's actually the page I used as a reference when I wrote the code. My interpolation method is essentially identical to his. I just tried removing the early exit checks and changing it to always use mu=0.5 and I no longer get holes. I'd tried setting mu=0.5 last night to no effect, but it turns out that is not enough on its own. This solution is probably good enough for what I have in mind. Bisection won't work beyond this test case, since what I really want to do is render surfaces from data, but I'm not too worried about getting them exact. I just want something that looks alright.

Nippashish fucked around with this message at 04:22 on Apr 12, 2011
# ? Apr 12, 2011 04:20 |
|
Spite posted: Fewer calls is better, yeah. You've got a couple options. For something like that, instancing works really well. You'd only have to upload a transform vector per quad instead of all 4 vertices.

But wouldn't instancing also restrict you to pretty new hardware? I mean, the approaches I've seen all either used deprecated fixed-function "hacks" (i.e., pseudoinstancing, a method that apparently gets bad performance on ATI) or some relatively modern extensions that won't be supported on older stuff.
|
# ? Apr 12, 2011 18:37 |
|
octoroon posted: But wouldn't instancing also restrict you to pretty new hardware? I mean, the approaches I've seen all either used deprecated fixed-function "hacks" (i.e., pseudoinstancing, a method that apparently gets bad performance on ATI) or some relatively modern extensions that won't be supported on older stuff.

DirectX 9 came out in 2002 (dear god this makes me feel old). Anyone with a video card that can't support DX9-level features like instancing is not going to be playing games on that machine anyway. At a certain point you have to ask yourself, "am I sacrificing efficiency on modern hardware in order to support hardware that nobody is going to be using anyway?"
|
# ? Apr 12, 2011 18:48 |
|
Paniolo posted: DirectX 9 came out in 2002 (dear god this makes me feel old). Anyone with a video card that can't support DX9-level features like instancing is not going to be playing games on that machine anyway. At a certain point you have to ask yourself, "am I sacrificing efficiency on modern hardware in order to support hardware that nobody is going to be using anyway?"

iPhone/iPad or Android? Not sure if they support instancing. Then again, they don't support geometry shaders either.
|
# ? Apr 12, 2011 18:56 |