Zerf
Dec 17, 2004

I miss you, sandman
Can you even model that using blend modes? The equation contains abs(base - blend), and there's no standard way of doing that AFAIK (but I think Nvidia has a ton of blend extensions). Reference formula: https://github.com/jamieowen/glsl-blend/blob/master/difference.glsl
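For reference, that formula boils down to a per-channel absolute difference. A minimal CPU-side sketch of the equation (plain C++; the `RGB` type and function name are just for illustration, not from any API):

```cpp
#include <array>
#include <cmath>

// Per-channel "difference" blend, mirroring the glsl-blend reference:
// result = abs(base - blend). Names here are illustrative.
using RGB = std::array<float, 3>;

RGB blendDifference(const RGB& base, const RGB& blend) {
    RGB out{};
    for (int i = 0; i < 3; ++i)
        out[i] = std::fabs(base[i] - blend[i]);
    return out;
}
```

The catch, as discussed, is that fixed-function blending can't express the abs(), which is why it needs an extension or a shader that can read the destination color.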

Edit:
Here's GL's extension: https://www.khronos.org/registry/OpenGL/extensions/KHR/KHR_blend_equation_advanced.txt

Maybe something like that exists for Metal?

Zerf fucked around with this message at 18:58 on Mar 20, 2019


lord funk
Feb 16, 2004

Yeah, you're both right, there isn't a blending-only solution for the PS difference behavior. I should be able to roll my own fragment shader that takes care of it -- just now looking into how I can reference the destination color within the shader.

Doc Block
Apr 15, 2003
Fun Shoe
If it’s for iOS, you can totally just read the existing frame buffer color at the destination pixel. IIRC you just make your fragment shader take a specially marked variable, and the compiler generates code to read the framebuffer. Look at Apple’s multipass rendering sample from a couple years ago (the one with the columns and glowing orbs).

This is possible on iOS devices but not Macs because PowerVR (and Apple’s GPUs) don’t do blending in hardware, so shaders being able to read the tile buffer’s existing contents is necessary. The shader compiler appends code to do the blending op to the end of your shader, which was part of the reason blending mode changes were expensive in OpenGL ES.

Rhusitaurion
Sep 16, 2003

One never knows, do one?
Anybody ever implemented Dual Contouring? I'm working on implementing it to meshify signed distance functions, and I've got it mostly working except for the quadratic error function stuff.

All of the implementations I can find online (including the reference implementation from the paper) do stupid poo poo like type out every single individual multiply in every matrix operation, rather than using a library. I think I've worked out what that nonsense is doing, and translated it to use Eigen, but I don't really know anything about least-squares minimization or how this stuff is supposed to work:

code:
// takes vectors of points and their corresponding normals, and returns the point that minimizes the sum of their dot products (?)
auto qef = [](const std::vector<const Vector3f*> &ps, const std::vector<const Vector3f*> &ns) -> Vector3f {
    Vector3f mp = Vector3f::Zero();   // midpoint
    Matrix3f Ahat = Matrix3f::Zero(); // symmetric matrix of normals
    Vector3f bhat = Vector3f::Zero(); // vector of normals * dot product of normals and points
         
    for(size_t i = 0; i < ps.size(); ++i) {
        const auto &p = *ps[i];
        const auto &n = *ns[i];
        Ahat += n * n.transpose();
        bhat += n * n.dot(p);
        mp += p;
    }

    mp /= (float)ps.size();
    bhat -= Ahat * mp;
        
    Eigen::JacobiSVD<Matrix3f> svd(Ahat, Eigen::ComputeFullU | Eigen::ComputeFullV);
    auto U = svd.matrixU();
    auto S = svd.singularValues();
    auto V = svd.matrixV();
    S(0) = 1.0 / S(0);
    for(int si = 1; si < 3; ++si) {
        S(si) = (std::abs(S(si)) < 0.1) ? 0.0 : 1.0 / S(si);
    }
    Vector3f x = V.transpose() * Eigen::DiagonalMatrix<float,3>(S) * U.transpose() * bhat;
    return mp + x;
};
However this is clearly wrong since a sphere looks lumpy as poo poo:


If I just use the midpoint (i.e. "return mp") it at least looks symmetrical, but still kind of "boxy":

Nippashish
Nov 2, 2005

Let me see you dance!

Rhusitaurion posted:

Anybody ever implemented Dual Contouring? I'm working on implementing it to meshify signed distance functions, and I've got it mostly working except for the quadratic error function stuff.

All of the implementations I can find online (including the reference implementation from the paper) do stupid poo poo like type out every single individual multiply in every matrix operation, rather than using a library. I think I've worked out what that nonsense is doing, and translated it to use Eigen, but I don't really know anything about least-squares minimization or how this stuff is supposed to work:
...

I don't know anything about dual contouring, but I think this line is wrong:
code:
Vector3f x = V.transpose() * Eigen::DiagonalMatrix<float,3>(S) * U.transpose() * bhat;
It looks to me like you're trying to compute a stable version of inv(Ahat)*bhat. The SVD gives you three matrices such that Ahat = U * S * transpose(V) with U and V as orthogonal matrices. But that means inv(Ahat) = V * inv(S) * transpose(U), and you have an extra transpose on your V.
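That identity is easy to sanity-check numerically without Eigen. A hand-rolled 2x2 sketch (all names illustrative): build A = U * S * transpose(V) from two rotations, then verify that V * inv(S) * transpose(U) recovers b, while the version with the extra transpose on V does not:

```cpp
#include <array>
#include <cmath>

// 2x2 check of pseudo-inverse orientation: if A = U*S*V^T with U, V
// orthogonal (here: rotations), then inv(A) = V*inv(S)*U^T -- no transpose on V.
using Mat2 = std::array<std::array<double, 2>, 2>;
using Vec2 = std::array<double, 2>;

Mat2 rot(double a) { return {{{std::cos(a), -std::sin(a)}, {std::sin(a), std::cos(a)}}}; }

Mat2 mul(const Mat2& A, const Mat2& B) {
    Mat2 C{};
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            for (int k = 0; k < 2; ++k) C[i][j] += A[i][k] * B[k][j];
    return C;
}

Mat2 tr(const Mat2& A) { return {{{A[0][0], A[1][0]}, {A[0][1], A[1][1]}}}; }

Vec2 mulv(const Mat2& A, const Vec2& v) {
    return {A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1]};
}
```

With U = rot(0.3), V = rot(1.1), S = diag(2, 0.5), applying V * inv(S) * transpose(U) to b and multiplying by A gives back b; the transposed-V version doesn't.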

Rhusitaurion
Sep 16, 2003

One never knows, do one?

Nippashish posted:

I don't know anything about dual contouring, but I think this line is wrong:
code:
Vector3f x = V.transpose() * Eigen::DiagonalMatrix<float,3>(S) * U.transpose() * bhat;
It looks to me like you're trying to compute a stable version of inv(Ahat)*bhat. The SVD gives you three matrices such that Ahat = U * S * transpose(V) with U and V as orthogonal matrices. But that means inv(Ahat) = V * inv(S) * transpose(U), and you have an extra transpose on your V.

Yup, looks like you're right:


I misinterpreted what Eigen's SVD would give me - I thought V would come back as V-transpose (for some reason), and I'd have to transpose it back. Thanks!

Doc Block
Apr 15, 2003
Fun Shoe
Vulkan question:

Is there really no better way to be able to change textures mid-frame than by having a separate descriptor set for each and every image and then binding them as needed?

Sure, there's the insane bind-less "Just make a huge array of textures, index into it with a push constant" method, but that seems to bring a bunch of problems with it. What if you don't know ahead of time how many images you'll be loading? What if you pick an array size that's too big for what the hardware supports? Both of those seem to suggest building and compiling the shaders at run-time, which sucks. And might mean rebuilding and recompiling the shader(s) if you wind up needing to load more images later.

The push descriptors extension seems custom-made for fixing this and bringing a bit of sanity and programmer harm reduction to Vulkan, so of course AMD hates it and doesn't support it.

There's the descriptor update templates extension, but it looks like you can't queue the updates up in a command buffer unless you've also got the push descriptors extension, in which case you're still screwed if you want to support AMD. Otherwise you're stuck updating between frames, which completely misses the point.

Doc Block fucked around with this message at 09:17 on May 25, 2019

Zerf
Dec 17, 2004

I miss you, sandman

Doc Block posted:

Vulkan question:

Is there really no better way to be able to change textures mid-frame than by having a separate descriptor set for each and every image and then binding them as needed?

Sure, there's the insane bind-less "Just make a huge array of textures, index into it with a push constant" method, but that seems to bring a bunch of problems with it. What if you don't know ahead of time how many images you'll be loading? What if you pick an array size that's too big for what the hardware supports? Both of those seem to suggest building and compiling the shaders at run-time, which sucks. And might mean rebuilding and recompiling the shader(s) if you wind up needing to load more images later.

The push descriptors extension seems custom-made for fixing this and bringing a bit of sanity and programmer harm reduction to Vulkan, so of course AMD hates it and doesn't support it.

There's the descriptor update templates extension, but it looks like you can't queue the updates up in a command buffer unless you've also got the push descriptors extension, in which case you're still screwed if you want to support AMD. Otherwise you're stuck updating between frames, which completely misses the point.

We use the bindless approach with an array of texture arrays, so texture count is not really an issue. We bind each texture array once upon creation, but the rest of the time a texture manager keeps track of which indices in each array point to which texture, and streams them in/out when needed. As such, there's no need to compile shaders at runtime, because we're well within limits.

We don't use push constants for the texture indices either, rather they are placed in a storage buffer. Each model/mesh then gets fed an entity index in some way (via per instance data/push constant/glBaseInstanceIndex etc.) and this index is used to access different model/mesh settings.
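The bookkeeping side of that scheme can be sketched like this (hypothetical names; the actual streaming and descriptor updates are obviously more involved). A manager hands out stable (array, slot) indices that shaders use to index the bound texture arrays, and recycles slots when textures are streamed out:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical index bookkeeping for an "array of texture arrays" scheme.
// Each texture gets a stable (arrayIndex, slotIndex) pair; freeing a slot
// (texture streamed out) makes it available for reuse.
struct TextureHandle { uint32_t arrayIndex, slotIndex; };

class TextureSlotManager {
public:
    explicit TextureSlotManager(uint32_t slotsPerArray) : slotsPerArray_(slotsPerArray) {}

    TextureHandle allocate() {
        if (!freeList_.empty()) {          // reuse a slot freed by streaming out
            TextureHandle h = freeList_.back();
            freeList_.pop_back();
            return h;
        }
        if (nextSlot_ == slotsPerArray_) { // current array full: this is the
            ++nextArray_;                  // point where a new texture array
            nextSlot_ = 0;                 // would be created and bound once
        }
        return {nextArray_, nextSlot_++};
    }

    void release(TextureHandle h) { freeList_.push_back(h); }

private:
    uint32_t slotsPerArray_;
    uint32_t nextArray_ = 0, nextSlot_ = 0;
    std::vector<TextureHandle> freeList_;
};
```

The point is that the descriptor sets never change after setup; only this CPU-side table (and the storage buffer of indices) does.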

Doc Block
Apr 15, 2003
Fun Shoe
Doesn't that still leave you having to create a million descriptors, though? I read somewhere that some implementations have a really low number of max descriptors. And still seems like a hassle if you need to load or unload images on the fly, since wouldn't you have to rebuild the buffer of descriptor sets?

Zerf
Dec 17, 2004

I miss you, sandman

Doc Block posted:

Doesn't that still leave you having to create a million descriptors, though? I read somewhere that some implementations have a really low number of max descriptors. And still seems like a hassle if you need to load or unload images on the fly, since wouldn't you have to rebuild the buffer of descriptor sets?

We create a total of 2 descriptor sets, since we double buffer most things. Each descriptor set contains approx 100 entries.

The only extra work after initial setup is if we run out of space in a texture array and need to create and bind another one. Again, no biggie to update 2 descriptor sets...

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
Descriptor allocation should be incredibly cheap. That's why the pool is there. You should have a cache for other backend objects, and you already need to (possibly) ring buffer your other dynamic buffers, so I would just create more descriptors.

Doc Block
Apr 15, 2003
Fun Shoe
Yeah I think I just became confused because I read somewhere that the minimum number of descriptor sets that the standard requires implementations to support is only 4 or something, and I think even that is just how many can be actually bound at once.

Also I was thinking people were saying you could have an array of textures without having to have a descriptor for each one.

Anyway, got bindless textures to work and it was a lot less painful than I thought it’d be. Having to specify the size of the array in the shader is kind of a bummer though (my hardware doesn’t support the extension that lets you just do layout(whatever) uniform texture2D lotsaTextures[];, apparently).

Absurd Alhazred
Mar 27, 2010

by Athanatos

Doc Block posted:

Yeah I think I just became confused because I read somewhere that the minimum number of descriptor sets that the standard requires implementations to support is only 4 or something, and I think even that is just how many can be actually bound at once.

Also I was thinking people were saying you could have an array of textures without having to have a descriptor for each one.

Anyway, got bindless textures to work and it was a lot less painful than I thought it’d be. Having to specify the size of the array in the shader is kind of a bummer though (my hardware doesn’t support the extension that lets you just do layout(whatever) uniform texture2D lotsaTextures[];, apparently).

It's been a while since I've played around with Vulkan, but wasn't there a way to pass the equivalent of preprocessor definitions at link/compile/whatever they call it when you add a SPIRV object file as a shader to a pipeline? Some kind of compile-time constant or something?

pseudorandom name
May 6, 2007

specialization constants

Doc Block
Apr 15, 2003
Fun Shoe
Oh cool, thanks.

Also, since I'm anticipating needing only a few dozen to a few hundred textures at once, it probably wouldn't be too wasteful if I just did: layout(etc) uniform texture2D[reasonable amount + a little extra, say, 256];, right? I could make any unused entries all point to the first texture as a debug thing, too.

Ralith
Jan 12, 2011

I see a ship in the harbor
I can and shall obey
But if it wasn't for your misfortune
I'd be a heavenly person today
Only way to be sure is by testing, especially on diverse hardware. Note that the maximum varies considerably.

Absurd Alhazred
Mar 27, 2010

by Athanatos
I hope this is the right thread for this: while I was reading about color transformations in GPU Gems I googled some dead links and found this evidence of some heated arguments about color coding and gamma correction from the late '90s. The internet always made you stupid.

Doc Block
Apr 15, 2003
Fun Shoe

Ralith posted:

Only way to be sure is by testing, especially on diverse hardware. Note that the maximum varies considerably.

🤔 Seems, as far as desktop GPUs are concerned, only Intel has a limit below several tens of thousands. I'd totally drop them if I wasn't doing this on my daily driver, a 2018 Mac Mini (LOL) with MoltenVK. Apparently, on Macs all hardware is limited to 128 images per stage by the driver because ~Feature Parity~ and because Apple shot themselves in the foot with Metal's "GPU Family" method of determining limits/features.

Still, I could make a ring buffer or whatever of texture descriptor sets, each the size of the hardware limit (or number of textures needed, if they all fit in one) and then keep track of which texture is in which set somehow, and just bind the appropriate descriptor set when drawing. Semi-bindless.

Doc Block fucked around with this message at 05:04 on May 29, 2019

lord funk
Feb 16, 2004

I want to draw some 3D shapes on top of everything else on the screen. The problem is, they still get depth tested, so they clip into other 3D shapes on the screen.

Currently, I solve this by doing a second render pass to draw on top of the existing texture. But that seems to add a bunch of overhead.

Is there a way to trick the depth check into drawing a shape over everything else? or am I stuck doing two passes?

Xerophyte
Mar 17, 2008

This space intentionally left blank
Draw the 3D shapes you want to force in front first and write a stencil bit for their mask, then test against that mask in the subsequent draws for the background?

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
Set the depth test to always?

BattleMaster
Aug 14, 2000

Suspicious Dish posted:

Set the depth test to always?

Yeah unless I'm missing something why not just turn off the depth test for the parts in question?

Xerophyte
Mar 17, 2008

This space intentionally left blank
If the foreground 3D part isn't convex then you still want it to be depth tested against itself. Just not against anything else. Drawing it first and masking seems like the easiest way to do that to me, at least.

If for some reason you absolutely must draw the foreground object last then I guess what you really need is to clear the existing depth buffer around the foreground object or bind a separate depth buffer before drawing. There might be a way to program the depth test to not do a depth check if the stencil is some value or other on some platforms. I don't believe that's a check you can do in base GL at least.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
I was assuming the object was relatively simple, but sure. If you draw the object last why not just clear the depth buffer completely before drawing the object.

Doc Block
Apr 15, 2003
Fun Shoe
IIRC he’s using Metal on iOS, so it requires an extra render pass object (not too big of a deal). But as long as you set up the load/store actions for the color attachments in the first and second passes correctly, so they never get written out to main memory until everything is finished (also, look into memoryless render targets if you haven’t already), and set the second pass to clear the depth buffer as its load action, it should be OK and not incur any appreciable overhead.

lord funk
Feb 16, 2004

Thanks all, and yeah, using Metal. I'm so close to nailing down my engine and this is all part of the optimization process.

Brownie
Jul 21, 2007
The Croatian Sensation
Similarly, I'm trying to implement a deferred renderer using light volumes and am currently scratching my head a bit at how to reconstruct world position in the actual shading pass.

I'm using the stencil buffer to determine which pixels are within the light volume (a sphere for point lights), as described in http://ogldev.atspace.co.uk/www/tutorial37/tutorial37.html, https://www.3dgep.com/forward-plus/#Deferred_Shading and others. I'm using BGFX (a cross-platform rendering lib), so I had to adjust the approach a bit: the stencil bits get "unmarked" when they fail the depth criteria. I do that because BGFX has no `glClear` equivalent, so this makes it easier to reset the bits. (Why not? No idea tbh)

Basically the procedure is the following (for each light):
1) Draw the light volume, with no culling, using stencil tests on the front and back faces to mark the fragments we aren't interested in shading (by incrementing). There are no writes to the RGB or depth buffers at this point.
2) Using the stencil, draw the light volume again, this time with front-face culling. Shade the fragments that pass the stencil test (i.e. weren't unmarked in the first draw), using the g-buffers attached as textures and additive blending.
2a) As part of the stencil test, replace the stencil value with zero on a passing depth test (which is ALWAYS for this draw). This gets around the need to call `glClear` like I've seen most posts suggest.

I encounter two problems, one less annoying than the other.

The first problem is that when I perform the first draw, I noticed that if a fragment's stencil value should be updated by both the front AND back stencil test, I only see a value of 1 (corresponding to only one increment). That's okay for my purposes (since once either test passes, the pixel is unmarked), but I just wanted to confirm that I shouldn't expect both tests to mutate the stencil buffer.

The second problem is that when I perform the second pass and go to shade the fragment, I'm not able to read from the depth buffer, since it's part of the currently bound FBO and is used for the stencil test. So I don't have a way of reconstructing world/view position inside my lighting shader. I'm not storing positions in my g-buffers, so this is annoying. For the moment I'm storing the depth twice: in a D24S8 buffer as well as an R32F buffer that I'm able to bind as a texture for the second pass. But this seems kind of wrong? And most of the blog posts I've read seem to totally gloss over this point, so I feel like I'm missing something obvious or am totally doing this all wrong.
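For the reconstruction itself, the usual trick is to take the fragment's NDC coordinates plus the sampled depth, push them back through the inverse projection, and recover view space. A CPU-side sketch with a standard GL-style perspective projection (names and conventions here are illustrative; a real shader would use the engine's own matrices):

```cpp
#include <cmath>

// Reconstruct view-space position from (ndc.xy, sampled depth) by undoing
// a GL-style perspective projection (camera looks down -z, NDC in [-1, 1]).
struct Vec3 { double x, y, z; };

struct Persp {
    double f, aspect, c22, c23; // f = cot(fovy/2); c22/c23 from the z row

    Persp(double fovy, double aspect_, double zn, double zf)
        : f(1.0 / std::tan(fovy * 0.5)), aspect(aspect_),
          c22((zf + zn) / (zn - zf)), c23(2.0 * zf * zn / (zn - zf)) {}

    // view space -> NDC; clip-space w is -viewZ for this projection
    Vec3 project(Vec3 v) const {
        double w = -v.z;
        return { (f / aspect) * v.x / w, f * v.y / w, (c22 * v.z + c23) / w };
    }

    // NDC -> view space: solve the projection's z row for view-space z,
    // then unscale x and y by the recovered w
    Vec3 reconstruct(Vec3 ndc) const {
        double vz = -c23 / (ndc.z + c22);
        double w = -vz;
        return { ndc.x * w * aspect / f, ndc.y * w / f, vz };
    }
};
```

Round-tripping a point through project() then reconstruct() gives it back exactly, which is all the lighting shader needs from the depth texture plus the fragment's screen position.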

Any pointers or clarification would be greatly appreciated!

Absurd Alhazred
Mar 27, 2010

by Athanatos

Brownie posted:

The first problem is that when I perform the first draw, I noticed that if a fragment's stencil value should be updated by both the front AND back stencil test, I only see a value of 1 (corresponding to only one increment). That's okay for my purposes (since once either test passes, the pixel is unmarked), but I just wanted to confirm that I shouldn't expect both tests to mutate the stencil buffer.

Are you sure you've disabled backface culling, too?

Brownie
Jul 21, 2007
The Croatian Sensation

Absurd Alhazred posted:

Are you sure you've disabled backface culling, too?

Yeah, using RenderDoc I can see that the Cull Mode is set to NONE. If it was BACK or FRONT I'd also see artifacts in the final rendered image, but I don't, and manual inspection of the stencil shows that all the pixels I expect to be marked are marked; just some of them are only marked once instead of twice like I expect.

(As an aside: thank the lord for tools like RenderDoc)

Absurd Alhazred
Mar 27, 2010

by Athanatos

Brownie posted:

Yeah, using RenderDoc I can see that the Cull Mode is set to NONE. If it was BACK or FRONT I'd also see artifacts in the final rendered image, but I don't, and manual inspection of the stencil shows that all the pixels I expect to be marked are marked; just some of them are only marked once instead of twice like I expect.

(As an aside: thank the lord for tools like RenderDoc)

I've not had a lot of experience with stencil tests, but isn't there also something where you have to ask it to perform the test on the backface? Is that set?

Brownie
Jul 21, 2007
The Croatian Sensation

Absurd Alhazred posted:

I've not had a lot of experience with stencil tests, there's also something where you have to ask it to perform the test on the backface, right? Is that set?

Yeah, that's correct. You can pass it two different stencil functions, one each for the front and back faces:



Here's the depth test that's available in RenderDoc for one of the light volumes. It shows the test passing/failing as green/red, but notably it's only for the front face. So the column that's in front of the volume gets marked correctly:



Here's the resulting stencil buffer though, which does show that the backface tests are also working (evidenced by parts of the balcony that are beyond the volume being marked)



But I'm surprised that the parts of the column that overlap with the openings in the balcony beyond are not marked twice (which would show up as white, since the image is scaled to [0, 2] instead of [0, 255]).

Like I said, this isn't actually causing me issues with the actual shading, since the stencil test output is good enough; being marked twice isn't any more useful than being marked just once, in my case.

Brownie fucked around with this message at 16:52 on Jun 22, 2019

Lime
Jul 20, 2004

The portion of the light volume's back faces that are behind the column fail the depth test, just like the front faces behind the column did. But for back faces, the fail op is keep now, so that's why the stencil buffer isn't incremented twice there.

Indeed, the stencil buffer will never be incremented twice as configured because that would require a front face of the light volume to fail the depth test at the same place a back face passes: i.e., to have a front face be behind a back face, which for a convex volume is impossible (barring precision issues of course).
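That argument can be checked mechanically. Assuming the ops described above (increment when a front face fails the depth test, increment when a back face passes, keep otherwise; smaller z is closer and "pass" means faceZ <= sceneZ), a brute-force sweep in plain C++:

```cpp
#include <algorithm>
#include <random>

// Count stencil increments for one pixel of a light volume, under the
// assumed ops: front-face depth FAIL increments, back-face depth PASS
// increments, everything else keeps. Depth convention: pass = faceZ <= sceneZ.
int stencilIncrements(double frontZ, double backZ, double sceneZ) {
    int count = 0;
    if (frontZ > sceneZ) ++count;  // front face fails -> increment
    if (backZ <= sceneZ) ++count;  // back face passes -> increment
    return count;
}
```

For a convex volume frontZ <= backZ at every pixel, so "front fails AND back passes" would require frontZ > sceneZ >= backZ, i.e. frontZ > backZ: a contradiction, which is exactly Lime's point. Flip the two depths (a non-convex case) and the double increment shows up immediately.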

Lime fucked around with this message at 04:44 on Jun 23, 2019

Brownie
Jul 21, 2007
The Croatian Sensation

Lime posted:

The portion of the light volume's back faces that are behind the column fail the depth test, just like the front faces behind the column did. But for back faces, the fail op is keep now, so that's why the stencil buffer isn't incremented twice there.

Indeed, the stencil buffer will never be incremented twice as configured because that would require a front face of the light volume to fail the depth test at the same place a back face passes: i.e., to have a front face be behind a back face, which for a convex volume is impossible (barring precision issues of course).


:ughh: that makes perfect sense actually. Thanks for clearing that up for me.

Now I just need to understand how people are reconstructing position in the second draw if they're also binding the depth-stencil buffer as part of the framebuffer. I will just keep duplicating the data for now, since I am not currently bandwidth limited in my lovely little Sponza scene.

Lime
Jul 20, 2004

Is the second pass actually using the depth buffer? It sounds like the shading of fragments is determined entirely by the stencil buffer / g-buffers, and that updating the stencil buffer doesn't need it either because the depth func is ALWAYS. Can't you just disable depth testing and use a framebuffer object that has the same stencil buffer attachment but no depth buffer attached? Then you can freely use the depth texture as a texture. I've never used BGFX but it does seem to have a D0S8 texture format, suggesting a framebuffer can have separate attachments for depth and stencil buffers.

Lime fucked around with this message at 06:00 on Jun 23, 2019

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
Most engines copy the depth target, or include it in the gbuffer. Some APIs will let you alias the depth target so you can read it, which will work since depth write is off, but not all do.

Suspicious Dish fucked around with this message at 12:47 on Jun 23, 2019

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

Lime posted:

Is the second pass actually using the depth buffer? It sounds like the shading of fragments is determined entirely by the stencil buffer / g-buffers, and that updating the stencil buffer doesn't need it either because the depth func is ALWAYS. Can't you just disable depth testing and use a framebuffer object that has the same stencil buffer attachment but no depth buffer attached? Then you can freely use the depth texture as a texture. I've never used BGFX but it does seem to have a D0S8 texture format, suggesting a framebuffer can have separate attachments for depth and stencil buffers.

Rendering the final light volume requires depth testing.

Brownie
Jul 21, 2007
The Croatian Sensation

Lime posted:

Is the second pass actually using the depth buffer? It sounds like the shading of fragments is determined entirely by the stencil buffer / g-buffers, and that updating the stencil buffer doesn't need it either because the depth func is ALWAYS. Can't you just disable depth testing and use a framebuffer object that has the same stencil buffer attachment but no depth buffer attached? Then you can freely use the depth texture as a texture. I've never used BGFX but it does seem to have a D0S8 texture format, suggesting a framebuffer can have separate attachments for depth and stencil buffers.

The second draw is not using the depth buffer and its depth test is already set to ALWAYS. It just uses the stencil test, and also writes to the stencil buffer in order to "clear" the marked pixels on failure, so that the next light volume doesn't have garbage in the stencil buffer. So it still needs write access to the packed buffer.

Unfortunately I don't think BGFX has an API for attaching the packed depth-stencil buffer as only a stencil buffer to the framebuffer. And it's still not clear that this would allow me to read from that packed buffer in my shader? Everything I've read says this is basically undefined behaviour in OpenGL / DX11. Doing it anyway (setting the depth-stencil as a texture while also having it bound to the active FBO) just yields a black screen and a poo poo ton of warnings that you aren't allowed to do that, which is exactly what you'd expect/want.

Additionally, in OpenGL at least, implementations are not required to support attaching separate buffers for the depth and stencil, so it's basically not supported.

quote:

Rendering the final light volume requires depth testing.

Well no, not quite since I've got everything I need in that stencil buffer that I posted above. So I'm actually not performing depth testing on the second draw that performs shading.

Lime
Jul 20, 2004

Brownie posted:

Additionally, in OpenGL at least, implementations are not required to support attaching separate buffers for the depth and stencil, so it's basically not supported.

Ah, okay. This is what I meant rather than aliasing one buffer, and I fell right into the trap of thinking what the API says is logically possible is also what's physically possible.

Brownie
Jul 21, 2007
The Croatian Sensation

Lime posted:

Ah, okay. This is what I meant rather than aliasing one buffer, and I fell right into the trap of thinking what the API says is logically possible is also what's physically possible.

Yup. It's surprising, considering how common this light volume technique seems; you'd think there'd be real demand for being able to just separate the two buffers entirely.


Colonel J
Jan 3, 2008
Felt like playing with matrix transforms (linear and non-linear); here's a work in progress: https://www.shadertoy.com/view/WtBXRD

I made it work with both 2D and 4D matrix transforms, which finally allowed me to truly understand why you need a 4D matrix for translation.
You can comment / uncomment defines at the top for various features.

Disclaimer : I had no idea how to draw a grid in a pixel shader, so I went with
1) scale/offset the space to the range I want (so (0,0) is in the middle of the screen)
2) if the value of (x - round(x)) is smaller than some epsilon we're on a grid line. (same for y)

To draw the transformed grid lines I proceed by inversion; my reasoning is that for a pixel x, after transformation T, it is now at some location x+dx. I can't write to another location, as the pixel shader only runs on pixel x (compute shadertoy when!?), so instead I consider that pixels are in the transformed space, and check whether T_inverse(x+dx) lands pixel x on a grid line. It made sense when I did it; now I'm a bit confused, but it seems to work. Would there be any other way to do it?
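That inverse-mapping reasoning is the standard way to do it in a pixel shader: you can't scatter writes, so for each pixel you ask "which point of the untransformed grid maps here?" and test that point. A CPU-side sketch of the grid test (plain C++, hypothetical names, using a 2x scale as the example transform):

```cpp
#include <cmath>

// A pixel at position p shows a line of the transformed grid iff
// T_inverse(p) lies on a line of the untransformed unit grid, i.e. is
// within eps of an integer coordinate in x or y.
struct Vec2 { double x, y; };

bool onUnitGrid(Vec2 p, double eps) {
    return std::fabs(p.x - std::round(p.x)) < eps ||
           std::fabs(p.y - std::round(p.y)) < eps;
}

// Example transform T = uniform 2x scale; its inverse halves coordinates.
Vec2 invScale2(Vec2 p) { return {p.x * 0.5, p.y * 0.5}; }

bool gridLineAfterTransform(Vec2 pixel, double eps) {
    return onUnitGrid(invScale2(pixel), eps);
}
```

With the 2x scale, grid lines land on even pixel coordinates, which is exactly what the inverse-mapping test reports. Any invertible T slots in the same way; the non-linear "fancy matrix" case just makes T_inverse more expensive per pixel, which lines up with the 4D-inversion slowdown mentioned below.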

It's weird: running on my 2013 Retina MacBook, I get 60 FPS for the 2D case, but it drops down to 25-30 FPS for the 4D case. I guess inverting a 4D matrix at every pixel is too much. At work on a GTX 1060 I'm getting 60 FPS in all cases. Is it a Windows/OSX thing, or is my MacBook GPU just not that powerful? I'd be curious to hear perf reports from people here.
As long as the transform is linear (i.e. not the "fancy matrix" case) the matrix is the same for every pixel - is it possible to do the work just once with Shadertoy? I guess I'd have to do it in some sort of prepass, or just compute the values and hardcode them.
