|
Can you even model that using blend modes? The equation contains abs(base-blend), and there's no standard way of doing that AFAIK (but I think Nvidia has a ton of blend extensions). Reference formula: https://github.com/jamieowen/glsl-blend/blob/master/difference.glsl Edit: Here's GL's extension: https://www.khronos.org/registry/OpenGL/extensions/KHR/KHR_blend_equation_advanced.txt Maybe something like that exists for Metal? Zerf fucked around with this message at 18:58 on Mar 20, 2019 |
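(Editor's sketch, not from the linked repo: the per-channel formula from difference.glsl modeled in Python — the function name is mine — to show why fixed-function blending can't express it.)

```python
def blend_difference(base, blend):
    """Photoshop 'difference' blend: per-channel |base - blend|, values in [0, 1]."""
    return tuple(abs(b, ) if False else abs(b - s) for b, s in zip(base, blend))

# Fixed-function blending can only compute src*F_src + dst*F_dst for some
# blend factors F; no choice of factors yields abs(dst - src), which is why
# an advanced-blend extension or a dest-reading shader is needed.
print(blend_difference((0.8, 0.2, 0.5), (0.3, 0.6, 0.5)))
```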
# ? Mar 20, 2019 18:55 |
|
Yeah, you're both right, there isn't a blending-only solution for the PS difference behavior. I should be able to roll my own fragment shader that takes care of it -- just now looking into how I can reference the destination color within the shader.
|
# ? Mar 20, 2019 19:18 |
|
If it’s for iOS, you can totally just read the existing frame buffer color at the destination pixel. IIRC you just make your fragment shader take a specially marked variable, and the compiler generates code to read the framebuffer. Look at Apple’s multipass rendering sample from a couple years ago (the one with the columns and glowing orbs). This is possible on iOS devices but not Macs because PowerVR (and Apple’s GPUs) don’t do blending in hardware, so shaders being able to read the tile buffer’s existing contents is necessary. The shader compiler appends code to do the blending op to the end of your shader, which was part of the reason blending mode changes were expensive in OpenGL ES.
|
# ? Mar 21, 2019 13:40 |
|
Anybody ever implemented Dual Contouring? I'm working on implementing it to meshify signed distance functions, and I've got it mostly working except for the quadratic error function stuff. All of the implementations I can find online (including the reference implementation from the paper) do stupid poo poo like type out every single individual multiply in every matrix operation, rather than using a library. I think I've worked out what that nonsense is doing, and translated it to use Eigen, but I don't really know anything about least-squares minimization or how this stuff is supposed to work: code:
If I just use the midpoint (i.e. "return mp") it at least looks symmetrical, but still kind of "boxy":
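(Editor's sketch, all names mine: what the QEF machinery minimizes, independent of Eigen. Each hermite sample is a point p_i with normal n_i, and the vertex x minimizes sum((n_i . (x - p_i))^2). This version solves it via the normal equations A^T A x = A^T b instead of the paper's SVD, so it skips the singular-value clamping a robust implementation needs for nearly-coplanar normals.)

```python
def qef_solve(points, normals):
    """Least-squares vertex for dual contouring: minimize sum((n_i . (x - p_i))^2).

    Row i of A is n_i and b_i = n_i . p_i; we solve (A^T A) x = A^T b directly.
    Real implementations use an SVD with small singular values clamped to zero,
    which stays stable when the hermite normals are nearly parallel/coplanar.
    """
    # Accumulate A^T A (3x3, symmetric) and A^T b (3-vector).
    ata = [[0.0] * 3 for _ in range(3)]
    atb = [0.0] * 3
    for p, n in zip(points, normals):
        d = sum(ni * pi for ni, pi in zip(n, p))  # b_i = n_i . p_i
        for r in range(3):
            atb[r] += n[r] * d
            for c in range(3):
                ata[r][c] += n[r] * n[c]
    # Gauss-Jordan elimination with partial pivoting on the 3x3 system.
    m = [row[:] + [rhs] for row, rhs in zip(ata, atb)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(3):
            if r != col and m[col][col] != 0.0:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]

# Three planes through (1, 2, 3) with axis-aligned normals intersect exactly there.
print(qef_solve([(1, 0, 0), (0, 2, 0), (0, 0, 3)],
                [(1, 0, 0), (0, 1, 0), (0, 0, 1)]))
```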
|
# ? Apr 8, 2019 04:08 |
|
Rhusitaurion posted:Anybody ever implemented Dual Contouring? I'm working on implementing it to meshify signed distance functions, and I've got it mostly working except for the quadratic error function stuff. I don't know anything about dual contouring, but I think this line is wrong: code:
|
# ? Apr 8, 2019 22:01 |
|
Nippashish posted:I don't know anything about dual contouring, but I think this line is wrong: Yup, looks like you're right: I misinterpreted what Eigen's SVD would give me - I thought V would come back as V-transpose (for some reason), and I'd have to transpose it back. Thanks!
|
# ? Apr 9, 2019 02:51 |
|
Vulkan question: Is there really no better way to be able to change textures mid-frame than by having a separate descriptor set for each and every image and then binding them as needed? Sure, there's the insane bindless "Just make a huge array of textures, index into it with a push constant" method, but that seems to bring a bunch of problems with it. What if you don't know ahead of time how many images you'll be loading? What if you pick an array size that's too big for what the hardware supports? Both of those seem to suggest building and compiling the shaders at run-time, which sucks. And might mean rebuilding and recompiling the shader(s) if you wind up needing to load more images later. The push descriptors extension seems custom-made for fixing this and bringing a bit of sanity and programmer harm reduction to Vulkan, so of course AMD hates it and doesn't support it. There's the descriptor update templates extension, but it looks like you can't queue the updates up in a command buffer unless you've also got the push descriptors extension, in which case you're still screwed if you want to support AMD. Otherwise you're stuck updating between frames, which completely misses the point. Doc Block fucked around with this message at 09:17 on May 25, 2019 |
# ? May 25, 2019 08:59 |
|
Doc Block posted:Vulkan question: We use the bindless approach and use an array of texture arrays, so texture count is not really an issue. We bind each texture array once upon creation, but the rest of the time a texture manager keeps track of which indices in each array point to which texture, and streams them in/out when needed. As such, there's no need to compile shaders at runtime, because we're well within limits. We don't use push constants for the texture indices either; rather, they are placed in a storage buffer. Each model/mesh then gets fed an entity index in some way (via per-instance data/push constant/glBaseInstanceIndex etc.) and this index is used to access different model/mesh settings.
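(Editor's sketch of the bookkeeping described above, modeled in Python; the class and method names are mine, not from any real engine.)

```python
class TextureManager:
    """Index tracking for the 'array of texture arrays' approach.

    Each texture array has a fixed slot count baked into the shader, so no
    runtime shader compiles are needed: when every array is full we just
    create and bind one more texture array, not a new pipeline.
    """
    def __init__(self, slots_per_array):
        self.slots_per_array = slots_per_array
        self.free = []          # recycled (array, slot) pairs
        self.arrays = 0         # how many texture arrays have been created/bound
        self.location = {}      # texture name -> (array, slot), goes in a storage buffer

    def load(self, name):
        if name in self.location:
            return self.location[name]
        if not self.free:
            # Out of slots: create a new texture array and bind it once.
            new_array = self.arrays
            self.arrays += 1
            self.free = [(new_array, s) for s in reversed(range(self.slots_per_array))]
        self.location[name] = self.free.pop()
        return self.location[name]

    def unload(self, name):
        # Streaming a texture out just recycles its (array, slot) pair.
        self.free.append(self.location.pop(name))

mgr = TextureManager(slots_per_array=2)
print(mgr.load("grass"), mgr.load("rock"), mgr.load("dirt"))
```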
|
# ? May 25, 2019 09:41 |
|
Doesn't that still leave you having to create a million descriptors, though? I read somewhere that some implementations have a really low number of max descriptors. And still seems like a hassle if you need to load or unload images on the fly, since wouldn't you have to rebuild the buffer of descriptor sets?
|
# ? May 25, 2019 21:18 |
|
Doc Block posted:Doesn't that still leave you having to create a million descriptors, though? I read somewhere that some implementations have a really low number of max descriptors. And still seems like a hassle if you need to load or unload images on the fly, since wouldn't you have to rebuild the buffer of descriptor sets? We create a total of 2 descriptor sets, since we double buffer most things. Each descriptor set contains approx 100 entries. The only extra work after initial setup is if we run out of space in a texture array and need to create and bind another one. Again, no biggie to update 2 descriptor sets...
|
# ? May 26, 2019 07:51 |
|
Descriptor allocation should be incredibly cheap. That's why the pool is there. You should have a cache for other backend objects, and you already need to (possibly) ring buffer your other dynamic buffers, so I would just create more descriptors.
|
# ? May 27, 2019 17:06 |
|
Yeah I think I just became confused because I read somewhere that the minimum number of descriptor sets that the standard requires implementations to support is only 4 or something, and I think even that is just how many can be actually bound at once. Also I was thinking people were saying you could have an array of textures without having to have a descriptor for each one. Anyway, got bindless textures to work and it was a lot less painful than I thought it’d be. Having to specify the size of the array in the shader is kind of a bummer though (my hardware doesn’t support the extension that lets you just do layout(whatever) uniform texture2D lotsaTextures[];, apparently).
|
# ? May 27, 2019 21:34 |
|
Doc Block posted:Yeah I think I just became confused because I read somewhere that the minimum number of descriptor sets that the standard requires implementations to support is only 4 or something, and I think even that is just how many can be actually bound at once. It's been a while since I've played around with Vulkan, but wasn't there a way to pass the equivalent of preprocessor definitions at link/compile time (or whatever they call it) when you add a SPIR-V object file as a shader to a pipeline? Some kind of compile-time constant or something?
|
# ? May 27, 2019 21:40 |
|
specialization constants
|
# ? May 27, 2019 21:49 |
|
Oh cool, thanks. Also, since I'm anticipating needing only a few dozen to a few hundred textures at once, it probably wouldn't be too wasteful if I just did: layout(etc) uniform texture2D[reasonable amount + a little extra, say, 256];, right? I could make any unused entries all point to the first texture as a debug thing, too.
|
# ? May 28, 2019 03:50 |
|
Only way to be sure is by testing, especially on diverse hardware. Note that the maximum varies considerably.
|
# ? May 28, 2019 22:12 |
|
I hope this is the right thread for this: while I was reading about color transformations in GPU Gems I googled some dead links and found this evidence of some heated arguments about color coding and gamma correction from the late '90s. The internet always made you stupid.
|
# ? May 29, 2019 00:02 |
|
Ralith posted:Only way to be sure is by testing, especially on diverse hardware. Note that the maximum varies considerably. 🤔 It seems that, as far as desktop GPUs are concerned, only Intel has a limit below several tens of thousands. I'd totally drop them if I wasn't doing this on my daily driver, a 2018 Mac Mini (LOL) with MoltenVK. Apparently, on Macs all hardware is limited to 128 images per stage by the driver because ~Feature Parity~ and because Apple shot themselves in the foot with Metal's "GPU Family" method of determining limits/features. Still, I could make a ring buffer or whatever of texture descriptor sets, each the size of the hardware limit (or the number of textures needed, if they all fit in one), and then keep track of which texture is in which set somehow, and just bind the appropriate descriptor set when drawing. Semi-bindless. Doc Block fucked around with this message at 05:04 on May 29, 2019 |
# ? May 29, 2019 04:58 |
|
I want to draw some 3D shapes on top of everything else on the screen. The problem is that they get depth/stencil tested against the rest of the scene, so they clip into other 3D shapes on the screen. Currently, I solve this by doing a second render pass to draw on top of the existing texture. But that seems to add a bunch of overhead. Is there a way to trick the depth check into drawing a shape over everything else, or am I stuck doing two passes?
|
# ? Jun 20, 2019 17:48 |
|
Draw the 3D shapes you want to force in front first and write a stencil bit for their mask, then test against that mask in the subsequent draws for the background?
|
# ? Jun 20, 2019 18:03 |
|
Set the depth test to always?
|
# ? Jun 20, 2019 21:58 |
|
Suspicious Dish posted:Set the depth test to always? Yeah unless I'm missing something why not just turn off the depth test for the parts in question?
|
# ? Jun 20, 2019 22:12 |
|
If the foreground 3D part isn't convex then you still want it to be depth tested against itself. Just not against anything else. Drawing it first and masking seems like the easiest way to do that to me, at least. If for some reason you absolutely must draw the foreground object last then I guess what you really need is to clear the existing depth buffer around the foreground object or bind a separate depth buffer before drawing. There might be a way to program the depth test to not do a depth check if the stencil is some value or other on some platforms. I don't believe that's a check you can do in base GL at least.
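(Editor's sketch: the draw-first-and-mask idea above, modeled per pixel in Python — names are mine. The foreground is depth tested against itself and writes a stencil bit; later draws are discarded wherever that bit is set.)

```python
def draw(depth_buf, stencil_buf, frags, write_stencil=False, skip_marked=False):
    """Minimal per-pixel depth/stencil model (illustration only).

    frags: list of (pixel, depth, color).  Depth func is LESS; the foreground
    pass writes a stencil bit, and masked passes discard where that bit is set.
    """
    color_buf = {}
    for px, z, color in frags:
        if skip_marked and stencil_buf.get(px):
            continue                      # stencil test fails: under the mask
        if z < depth_buf.get(px, 1.0):    # ordinary depth test
            depth_buf[px] = z
            color_buf[px] = color
            if write_stencil:
                stencil_buf[px] = 1
    return color_buf

depth, stencil = {}, {}
# Foreground drawn first: non-convex, so it self-occludes correctly via depth,
# and every covered pixel gets the stencil mask.
fg = draw(depth, stencil, [(0, 0.9, "fg_back"), (0, 0.4, "fg_front")], write_stencil=True)
# Background drawn after: nearer in depth at pixel 0, but rejected by the mask.
bg = draw(depth, stencil, [(0, 0.1, "bg"), (1, 0.1, "bg")], skip_marked=True)
print(fg, bg)
```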
|
# ? Jun 20, 2019 22:27 |
|
I was assuming the object was relatively simple, but sure. If you draw the object last, why not just clear the depth buffer completely before drawing it?
|
# ? Jun 20, 2019 22:36 |
|
IIRC he’s using Metal on iOS, so it requires an extra render pass object (not too big of a deal), but so long as you set up your store/load states for color attachments for the first/second pass correctly so they never get written out to main memory until everything is finished (also, look into memoryless render targets if you haven’t already) and set up the second pass to clear the depth buffer as the load action, it should be OK and not incur any appreciable overhead.
|
# ? Jun 20, 2019 23:17 |
|
Thanks all, and yeah, using Metal. I'm so close to nailing down my engine and this is all part of the optimization process.
|
# ? Jun 21, 2019 15:58 |
|
Similarly, I'm trying to implement a deferred renderer using light volumes and am currently scratching my head a bit at how to reconstruct world position in the actual shading pass. I'm using the stencil buffer to determine which pixels are within the light volume (a sphere for point lights) as described in http://ogldev.atspace.co.uk/www/tutorial37/tutorial37.html, https://www.3dgep.com/forward-plus/#Deferred_Shading and others. I'm using BGFX (a cross-platform rendering lib) so I had to adjust the approach a bit, where the stencil bits get "unmarked" when they fail the depth criteria. I have to do that because BGFX has no `glClear` equivalent, so it makes it easier to reset the bits. (Why not? No idea tbh) Basically the procedure is the following (for each light): 1) Draw the light volume, with no culling, using stencil tests for the front and back faces to mark the fragments we aren't interested in shading (by incrementing). There are no writes to the RGB or depth buffers at this point. 2) Use the stencil to draw the light volume again, this time with front culling. Shade the fragments that pass the stencil test (i.e. weren't unmarked in the first draw) using the g-buffers attached as textures and additive blending. 2a) As part of the stencil test, replace the stencil value with zero on a passing depth test (which is ALWAYS for this draw). This gets around the need to call `glClear` like I've seen most posts suggest. I encounter two problems, one less annoying than the other. The first problem is that when I perform the first draw, I noticed that if a fragment's stencil value should be updated by both the front AND back stencil tests, I only see a value of 1 (corresponding to only one increment). That's okay for my purposes (since once either test passes, the pixel is unmarked), but I just wanted to confirm that I shouldn't expect both tests to mutate the stencil buffer. 
The second problem is that when I perform the second pass and go to shade the fragment, I'm not able to read from the depth buffer since it's part of the currently bound FBO and is used for the stencil test. So I don't have a way of reconstructing world/view position inside of my lighting shader. I'm not storing positions in my g-buffers, so this is annoying. For the moment I'm storing the depth buffer twice: inside of a D24S8 buffer as well as an R32F buffer that I am able to bind as a texture for the second pass. But this seems kind of wrong? And most of the blog posts I've read seem to totally gloss over this point, so I feel like I'm missing something obvious or am totally doing this all wrong. Any pointers or clarification would be greatly appreciated!
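(Editor's sketch of the position-from-depth reconstruction being asked about: the closed-form inversion of a standard GL-style perspective projection, in Python. Parameter names are mine; it assumes a right-handed view space looking down -z and a [0, 1] depth buffer.)

```python
import math

def view_pos_from_depth(ndc_x, ndc_y, depth, fov_y, aspect, near, far):
    """Reconstruct view-space position from a depth-buffer sample.

    With a standard GL perspective projection the depth mapping is
    ndc_z = (f+n)/(f-n) + 2fn/((f-n)*z_view), which inverts in closed form,
    so only depth needs to live in the g-buffer.
    """
    ndc_z = depth * 2.0 - 1.0
    z_view = 2.0 * far * near / ((far - near) * ndc_z - (far + near))
    # Undo the perspective divide and the projection's x/y scale.
    tan_half = math.tan(fov_y / 2.0)
    x_view = -z_view * ndc_x * tan_half * aspect
    y_view = -z_view * ndc_y * tan_half
    return (x_view, y_view, z_view)

# A sample at the near-plane center should come back as (0, 0, -near).
print(view_pos_from_depth(0.0, 0.0, 0.0, math.radians(60), 16 / 9, 0.1, 100.0))
```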
|
# ? Jun 22, 2019 05:13 |
|
Brownie posted:The first problem is that when I perform the first draw, I noticed that if a fragment's stencil value should be updated by both the front AND back stencil tests, I only see a value of 1 (corresponding to only one increment). That's okay for my purposes (since once either test passes, the pixel is unmarked), but I just wanted to confirm that I shouldn't expect both tests to mutate the stencil buffer. Are you sure you've disabled backface culling, too?
|
# ? Jun 22, 2019 15:18 |
|
Absurd Alhazred posted:Are you sure you've disabled backface culling, too? Yeah, using RenderDoc I can see that the Cull Mode is set to NONE. If it was BACK or FRONT I'd also see artifacts in the final rendered image, but I don't, and manual inspection of the stencil shows that all the pixels I expect to be marked are marked, just some of them are only marked once instead of twice like I expect. (As an aside: thank the lord for tools like RenderDoc)
|
# ? Jun 22, 2019 16:21 |
|
Brownie posted:Yeah, using RenderDoc I can see that the Cull Mode is set to NONE. If it was BACK or FRONT I'd also see artifacts in the final rendered image, but I don't, and manual inspection of the stencil shows that all the pixels I expect to be marked are marked, just some of them are only marked once instead of twice like I expect. I've not had a lot of experience with stencil tests, but there's also something where you have to ask it to perform the test on the backface, right? Is that set?
|
# ? Jun 22, 2019 16:23 |
|
Absurd Alhazred posted:I've not had a lot of experience with stencil tests, but there's also something where you have to ask it to perform the test on the backface, right? Is that set? Yeah that's correct. You can pass it two different stencil functions, one for the front face and one for the back face: Here's the depth test that's available in RenderDoc for one of the light volumes. It shows the test passing/failing as green/red, but notably it's only for the front face. So the column that's in front of the volume gets marked correctly: Here's the resulting stencil buffer though, which does show that the backface tests are also working (evidenced by parts of the balcony that are beyond the volume being marked). But I'm surprised that the parts of the column that overlap with the openings in the balcony beyond are not marked twice (which would show up as white, since the image is scaled to be between [0,2] instead of [0, 255]). Like I said, this isn't actually causing me issues with the actual shading, since the stencil test output is good enough -- being marked twice isn't any more useful than being marked just once, in my case. Brownie fucked around with this message at 16:52 on Jun 22, 2019 |
# ? Jun 22, 2019 16:49 |
|
The portion of the light volume's back faces that are behind the column fail the depth test, just like the front faces behind the column did. But for back faces, the fail op is keep now, so that's why the stencil buffer isn't incremented twice there. Indeed, the stencil buffer will never be incremented twice as configured because that would require a front face of the light volume to fail the depth test at the same place a back face passes: i.e., to have a front face be behind a back face, which for a convex volume is impossible (barring precision issues of course). Lime fucked around with this message at 04:44 on Jun 23, 2019 |
# ? Jun 23, 2019 04:41 |
|
Lime posted:The portion of the light volume's back faces that are behind the column fail the depth test, just like the front faces behind the column did. But for back faces, the fail op is keep now, so that's why the stencil buffer isn't incremented twice there. that makes perfect sense actually. Thanks for clearing that up for me. Now I just need to understand how people are reconstructing position in the second draw if they're also binding the depth-stencil buffer as part of the framebuffer. I will just keep duplicating the data for now, since I am not currently bandwidth limited in my lovely little Sponza scene.
|
# ? Jun 23, 2019 05:05 |
|
Is the second pass actually using the depth buffer? It sounds like the shading of fragments is determined entirely by the stencil buffer / g-buffers, and that updating the stencil buffer doesn't need it either because the depth func is ALWAYS. Can't you just disable depth testing and use a framebuffer object that has the same stencil buffer attachment but no depth buffer attached? Then you can freely use the depth texture as a texture. I've never used BGFX but it does seem to have a D0S8 texture format, suggesting a framebuffer can have separate attachments for depth and stencil buffers.
Lime fucked around with this message at 06:00 on Jun 23, 2019 |
# ? Jun 23, 2019 05:58 |
|
Most engines copy the depth target, or include it in the gbuffer. Some APIs will let you alias the depth target so you can read it, which will work since depth write is off, but not all do.
Suspicious Dish fucked around with this message at 12:47 on Jun 23, 2019 |
# ? Jun 23, 2019 12:45 |
|
Lime posted:Is the second pass actually using the depth buffer? It sounds like the shading of fragments is determined entirely by the stencil buffer / g-buffers, and that updating the stencil buffer doesn't need it either because the depth func is ALWAYS. Can't you just disable depth testing and use a framebuffer object that has the same stencil buffer attachment but no depth buffer attached? Then you can freely use the depth texture as a texture. I've never used BGFX but it does seem to have a D0S8 texture format, suggesting a framebuffer can have separate attachments for depth and stencil buffers. Rendering the final light volume requires depth testing.
|
# ? Jun 23, 2019 12:45 |
|
Lime posted:Is the second pass actually using the depth buffer? It sounds like the shading of fragments is determined entirely by the stencil buffer / g-buffers, and that updating the stencil buffer doesn't need it either because the depth func is ALWAYS. Can't you just disable depth testing and use a framebuffer object that has the same stencil buffer attachment but no depth buffer attached? Then you can freely use the depth texture as a texture. I've never used BGFX but it does seem to have a D0S8 texture format, suggesting a framebuffer can have separate attachments for depth and stencil buffers. The second draw is not using the depth buffer and is already set to ALWAYS. It just uses the stencil test, and also performs writes to the stencil buffer in order to "clear" the marked pixels on failure so that the next light volume doesn't have garbage in the stencil buffer. So it still needs write access to the packed buffer. Unfortunately I don't think BGFX has an API for attaching the packed depth-stencil buffer as only a stencil buffer to the framebuffer. And it's still not clear that this would allow me to read from that packed buffer in my shader? Everything I've read says that this is basically undefined behaviour in OpenGL / DX11. Doing it anyway (setting the depth-stencil as a texture while also having it bound to the active FBO) just yields a black screen and a poo poo ton of warnings that you aren't allowed to do that, which is exactly what you'd expect/want. Additionally, in OpenGL at least, implementations are not required to support attaching separate buffers for the depth and stencil, so it's basically not supported. quote:Rendering the final light volume requires depth testing. Well no, not quite, since I've got everything I need in that stencil buffer that I posted above. So I'm actually not performing depth testing on the second draw that performs shading.
|
# ? Jun 23, 2019 13:34 |
|
Brownie posted:Additionally, in OpenGL at least, implementations are not required to support attaching separate buffers for the depth and stencil, so it's basically not supported. Ah, okay. This is what I meant rather than aliasing one buffer, and I fell right into the trap of thinking what the API says is logically possible is also what's physically possible.
|
# ? Jun 23, 2019 14:21 |
|
Lime posted:Ah, okay. This is what I meant rather than aliasing one buffer, and I fell right into the trap of thinking what the API says is logically possible is also what's physically possible. Yup. It's surprising considering how common this light volume technique seems, you'd think there'd be real demand to be able to just separate the two buffers entirely.
|
# ? Jun 23, 2019 14:45 |
|
Felt like playing with matrix transforms (linear and non-linear), here's a work in progress: https://www.shadertoy.com/view/WtBXRD I made it work with both 2×2 and 4×4 matrix transforms - allowed me to finally, truly understand why you need a 4×4 matrix for translation. You can comment/uncomment defines at the top for various features. Disclaimer: I had no idea how to draw a grid in a pixel shader, so I went with 1) scale/offset the space to the range I want (so (0,0) is in the middle of the screen) 2) if the value of (x - round(x)) is smaller than some epsilon, we're on a grid line (same for y). To draw the transformed grid lines I proceed by inversion; my reasoning is that for pixel x, after transformation T, it is now at some location x+dx. I can't write to another location as the pixel shader only runs on pixel x (compute shadertoy when!?) so instead, I consider that pixels are in the transformed space, and check if T_inverse(x+dx) makes pixel x land on a grid line. It made sense when I did it, now I'm a bit confused, but it seems to work. Would there be any other way to do it? It's weird: running on my 2013 Retina MacBook, I get 60 FPS for the 2×2 case, but it drops down to 25-30 FPS for the 4×4 case. I guess inverting a 4×4 matrix at every pixel is too much. At work on a GTX 1060 I'm getting 60 FPS in all cases. Is it a Windows/OSX thing, or is my MacBook GPU just not that powerful? I'd be curious to hear perf reports from people here. As long as the transform is linear (i.e. not the "fancy matrix" case) the matrix is the same for every pixel - is it possible to do the work just once with Shadertoy? I guess I'd have to do it in some sort of prepass, or just compute the values and hardcode them.
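(Editor's sketch: the inversion trick described above, modeled in Python for the 2×2 linear case — names are mine; the shader additionally handles the 4×4/projective case.)

```python
def on_transformed_grid(px, py, matrix, spacing=1.0, eps=0.05):
    """Grid test by inversion: map the screen pixel back through the inverse
    transform and check proximity to an untransformed grid line.

    matrix is a 2x2 linear transform [[a, b], [c, d]]; for the linear case the
    inverse is the same for every pixel, so it could be hoisted to a prepass.
    """
    (a, b), (c, d) = matrix
    det = a * d - b * c
    # Inverse of the 2x2 transform applied to the pixel position.
    ux = ( d * px - b * py) / det
    uy = (-c * px + a * py) / det
    near_line = lambda v: abs(v / spacing - round(v / spacing)) < eps
    return near_line(ux) or near_line(uy)

# Under a 90-degree rotation, the vertical grid line x=1 maps onto the line y=1,
# so the screen point (0.5, 1.0) lands on a transformed grid line.
rot90 = [[0.0, -1.0], [1.0, 0.0]]
print(on_transformed_grid(0.5, 1.0, rot90), on_transformed_grid(0.5, 0.5, rot90))
```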
|
# ? Aug 7, 2019 17:51 |