Polio Vax Scene
Apr 5, 2009



Another idea, assuming all you have for your heatmap is some points, is to keep those in a float2 array, and in the pixel shader use something that goes through each point and adds red based on the distance.


Hubis
May 18, 2003

Boy, I wish we had one of those doomsday machines...

Tres Burritos posted:

What's the simplest way to make a heatmap using shaders? (Assuming that's even a good idea?)

I'm fairly certain you could do it by creating a flat plane and then putting point lights over it where your different datapoints are. This also allows you to change the intensity / color of the light based on whatever factors you choose, correct?

And if that's all you're doing, would it be smart to use deferred rendering?

It depends on how accurate you need it to be, and what you want to use it for.

The easiest way would be the "splatting" method -- render your sample points as quads over your destination map with additive blending on. The size of the 'splat' should correspond to the importance/intensity of the sample (so that the 'edge' of the splat is where the weight function hits zero). Inside the pixel shader, you compute the distance from the fragment being shaded to the center of the splat, weight it with something like a windowed gaussian, and output a value scaled by that weight. All the splats will additively blend together and give you a localized density in the output texture, which you can then read from and feed into a color lookup if you want to get false-color rendering. This is also known as a "scatter" approach (you're "scattering" a single input to multiple outputs).
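The scatter idea can be mocked up on the CPU to see the accumulation behaviour. This is a sketch only -- the names, radius, and sigma are made up, and a real implementation would be one quad per point with additive blending:

```python
import math

def splat_points(points, width, height, radius=4.0, sigma=1.5):
    """Scatter: accumulate a windowed-gaussian 'splat' per sample point,
    the CPU analogue of rendering quads with additive blending."""
    heat = [[0.0] * width for _ in range(height)]
    for (px, py) in points:
        # Only touch pixels inside the splat quad (the weight window).
        x0, x1 = max(0, int(px - radius)), min(width - 1, int(px + radius))
        y0, y1 = max(0, int(py - radius)), min(height - 1, int(py + radius))
        for y in range(y0, y1 + 1):
            for x in range(x0, x1 + 1):
                d2 = (x - px) ** 2 + (y - py) ** 2
                if d2 <= radius * radius:  # window: weight hits zero at the edge
                    heat[y][x] += math.exp(-d2 / (2.0 * sigma * sigma))
    return heat
```

Two splats at the same spot simply sum, which is exactly what the blend unit does for overlapping quads.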

The above way is going to be the most efficient for most cases, though you might run into performance problems if you end up with most of your points in a very small area of the render target. In that case you'll run into a blending bottleneck, since the pixels will be contending for the same blending interlocks (although this isn't as much of a problem as it sounds for small amounts of overdraw). The other approach is something closer to what is done with "tiled" deferred lighting: divide the output into regular tiles (8x8 is probably a good number) and go through all the points so that each tile has a list of the points which may be affecting it (points may appear in multiple lists). Then, using a compute shader pass (ideally -- though this is doable with a fullscreen PS pass as well, albeit less efficiently) figure out what tile each pixel belongs to and compute the sum of the distance-weighted value for every sample that hits that tile. Then output that, without blending. This can be more efficient if there are a lot of "hot spots" with blend contention, but it's still not necessarily a win because now you've got a much smaller number of threads doing a lot of work in serial, instead of the entire GPU distributing the workload (and then relying on the blend interlock to coalesce it). This is what's referred to as a "gather" approach (you're "gathering" multiple inputs to a single output).
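The tiled gather variant, again as a CPU sketch (tile size, radius, and names are made up; on the GPU the per-pixel loop would be a compute-shader pass reading per-tile lists):

```python
import math

def gather_tiled(points, width, height, tile=8, radius=4.0, sigma=1.5):
    """Gather: build a per-tile list of the points that can affect that tile,
    then each pixel sums only its own tile's list (a point may appear in
    several lists)."""
    tiles_x = (width + tile - 1) // tile
    tiles_y = (height + tile - 1) // tile
    tile_lists = [[] for _ in range(tiles_x * tiles_y)]
    for (px, py) in points:
        # A point lands in every tile its radius overlaps.
        tx0 = max(0, int((px - radius) // tile))
        tx1 = min(tiles_x - 1, int((px + radius) // tile))
        ty0 = max(0, int((py - radius) // tile))
        ty1 = min(tiles_y - 1, int((py + radius) // tile))
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                tile_lists[ty * tiles_x + tx].append((px, py))
    heat = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for (px, py) in tile_lists[(y // tile) * tiles_x + (x // tile)]:
                d2 = (x - px) ** 2 + (y - py) ** 2
                if d2 <= radius * radius:
                    total += math.exp(-d2 / (2.0 * sigma * sigma))
            heat[y][x] = total
    return heat
```

No blending needed: each output pixel is written exactly once with its final sum.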

e:
And if the number of points is relatively small and performance is not critical, jam them into a constant buffer and just do a full screen PS pass that iterates over all the samples at every pixel and accumulates a weight. This is roughly equivalent to the second ("gather") approach except without the tiling step, which means you will probably have a lot of wasted work processing samples that don't affect the pixel at all. Still, if the number of samples is small (or their radius of effect is large) it could be roughly as efficient.

Hubis fucked around with this message at 15:57 on Oct 30, 2015

Sex Bumbo
Aug 14, 2004
https://www.shadertoy.com/view/lt2SWG

drag the mouse around to make it do something

Tres Burritos
Sep 3, 2009

Joda posted:

Assuming your data points are two dimensional, and the heat map you want to make is too, then draw a quad filling the entire screen and project it with an orthogonal projection. You have a couple of options on how to do the heat map for each of your data points. One is to simply, as you said, upload your data points array as a uniform, then loop through it in a for-loop in the shader adding the weighted* contribution of all data points to your fragment. This has the advantage of you being able to account for overexposure (colour values being over 1.) Another option is to enable additive blending (gl.enable(gl.BLEND);gl.blendFunc(gl.ONE,gl.ONE);gl.blendEquation(gl.FUNC_ADD)) and drawing your data points one at a time without clearing the buffer. This has the advantage that you can have an arbitrary amount of data points, and the shader won't have to know exactly how many there are (As a general rule, the GPU has to know exactly how much uniform data you have to upload, and how many times you plan to loop over your data at compile time.)

*Weighting is where the gaussian distribution (or whatever model you choose) comes in. You essentially assign a value between 0 and 1 to be your weight for the data point based on distance to the fragment. Another option is to simply do it linearly so you have max(0,(MAX_DISTANCE - fragDistance)/MAX_DISTANCE), where MAX_DISTANCE is the point at which you want contribution for the data point to cut off completely. Both will give you a circle around each data point, but the gaussian will be more smooth. If your map ends up too bright, try just multiplying the weight by a low factor (like in the range 0.5 - 0.9). To weight the contribution just multiply it by the weight value.

As for the gaussian distribution, it's just a normal distribution. That is to say you call it with gauss(x,y,sigma), where sigma is your standard deviation (in 1D about 68% of the distribution lies within one sigma of the mean; in 2D it's closer to 39% of the volume,) and x and y are the coordinates for the vector from your data point to the fragment you're calculating for. It returns a value proportional to the probability density at (x,y) for normally distributed data (the version below skips the 1/(2*pi*sigma^2) normalization, so it peaks at 1 at the data point). The 2D gaussian function looks like this:

code:
float gaussian(vec2 coords, float sigma) {
    // Unnormalized 2D gaussian: 1.0 at the data point, falling off with distance.
    // Note the float literal 2.0 -- mixing int and float here is an error in GLSL ES.
    return exp(-(coords.x * coords.x + coords.y * coords.y) /
                (2.0 * sigma * sigma));
}
You can multiply it by a factor if you want greater values.
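The two weighting options above, written out as plain functions (a CPU-side sketch; the MAX_DISTANCE value is arbitrary):

```python
import math

MAX_DISTANCE = 10.0  # assumed cutoff radius for the linear falloff

def gauss_weight(dx, dy, sigma):
    # Unnormalized 2D gaussian: 1.0 at the data point, smooth falloff.
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))

def linear_weight(frag_distance):
    # max(0, (MAX_DISTANCE - d) / MAX_DISTANCE): 1 at the point, 0 at the cutoff.
    return max(0.0, (MAX_DISTANCE - frag_distance) / MAX_DISTANCE)
```

Both peak at 1 at the data point; the gaussian just falls off smoothly instead of hitting a hard edge at the cutoff.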

Let me know if I'm still being too technical.

Ahhh yep, I follow.


Sex Bumbo posted:

https://www.shadertoy.com/view/lt2SWG

drag the mouse around to make it do something

Good poo poo. So where you put

code:
float totalWeight = 0.0;
for (float i = 0.0; i < 69.0; ++i) {
    totalWeight += d(uv, vec2(
        sin(iGlobalTime * 0.3 + float(i)) + sin(i * i),
        cos(iGlobalTime * 0.4 + float(i * 2.0))
    ));
}
I'd just be comparing that fragment against all the uniform points or whatever that got passed in. The problem with this seems to be that it doesn't run so hot on a 4k display, I'm guessing looping through all those fragments (for like 1000 points on a GTX 980) is a little expensive. Or maybe shaderToy just doesn't like fullscreen.

I think I'm going to try that first and see how it goes.


Hubis posted:

It depends on how accurate you need it to be, and what you want to use it for.

The easiest way would be the "splatting" method -- render your sample points as quads over your destination map with additive blending on. The size of the 'splat' should correspond to the importance/intensity of the sample (so that the 'edge' of the splat is where the weight function hits zero). Inside the pixel shader, you compute the distance from the fragment being shaded to the center of the splat, weight it with something like a windowed gaussian, and output a value scaled by that weight. All the splats will additively blend together and give you a localized density in the output texture, which you can then read from and feed into a color lookup if you want to get false-color rendering. This is also known as a "scatter" approach (you're "scattering" a single input to multiple outputs).

I'm not quite getting this one, so for each datapoint I'd create a quad (based on the intensity / whatever) of the point, then I'd run the gaussian function for just that plane and then render that to a texture?
And then when planes are overlapping the GPU would just know what's going on and do some behind the scenes blending? Would you have to make sure that the planes are in distinct layers (y = 0, y = 1, y = 2) so that you didn't get weird collision artifacts (I'm fairly certain I've seen that before)?

Sex Bumbo
Aug 14, 2004

Tres Burritos posted:

I'd just be comparing that fragment against all the uniform points or whatever that got passed in. The problem with this seems to be that it doesn't run so hot on a 4k display, I'm guessing looping through all those fragments (for like 1000 points on a GTX 980) is a little expensive. Or maybe shaderToy just doesn't like fullscreen.

You can't splat sprites using shader toy but the idea would be the same -- either partition your points in such a way that it makes the shader faster, or render the points as sprites and do a post-process to determine the color value. Notice it's doing an add operation for each point, equivalent to additive blending.

Or as a trivial example, say you have a NxN grid and put each point into a bucket. Then each pixel only examines nearby buckets for points. You need a clever way to encode the buckets and sizes but it shouldn't be too hard.
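That grid-bucket idea, sketched on the CPU (names are mine; in a shader you'd pack the buckets and their sizes into a texture or buffer rather than a dict):

```python
def build_buckets(points, n, extent):
    """Hash each 2D point into one cell of an n-by-n grid covering [0, extent)^2."""
    cell = extent / n
    buckets = {}
    for p in points:
        key = (int(p[0] // cell), int(p[1] // cell))
        buckets.setdefault(key, []).append(p)
    return buckets, cell

def nearby_points(buckets, cell, x, y):
    """Candidates for pixel (x, y): its own bucket plus the 8 neighbours."""
    bx, by = int(x // cell), int(y // cell)
    found = []
    for j in (-1, 0, 1):
        for i in (-1, 0, 1):
            found.extend(buckets.get((bx + i, by + j), []))
    return found
```

Each pixel then weights only the handful of candidates returned instead of every point in the scene.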

Joda
Apr 24, 2010

When I'm off, I just like to really let go and have fun, y'know?

Fun Shoe

Tres Burritos posted:

I'm not quite getting this one, so for each datapoint I'd create a quad (based on the intensity / whatever) of the point, then I'd run the gaussian function for just that plane and then render that to a texture?
And then when planes are overlapping the GPU would just know what's going on and do some behind the scenes blending? Would you have to make sure that the planes are in distinct layers (y = 0, y = 1, y = 2) so that you didn't get weird collision artifacts (I'm fairly certain I've seen that before)?

With additive blending enabled and glBlendFunc set to GL_ONE,GL_ONE, the GPU takes whatever is already in the framebuffer/texture for a given fragment, and adds the value you just calculated for the fragment.

E: The idea behind using multiple planes is that you only draw the part of the texture the point actually influences.
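Per fragment, that GL_ONE,GL_ONE / GL_FUNC_ADD state boils down to one line; a CPU sketch (function name is mine):

```python
def blend_one_one(dst, src):
    """GL_ONE, GL_ONE with GL_FUNC_ADD: out = 1*src + 1*dst, per channel."""
    return tuple(d + s for d, s in zip(dst, src))
```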

Joda fucked around with this message at 04:50 on Oct 31, 2015

Hubis
May 18, 2003

Boy, I wish we had one of those doomsday machines...

Tres Burritos posted:

Ahhh yep, I follow.


Good poo poo. So where you put

code:
float totalWeight = 0.0;
for (float i = 0.0; i < 69.0; ++i) {
    totalWeight += d(uv, vec2(
        sin(iGlobalTime * 0.3 + float(i)) + sin(i * i),
        cos(iGlobalTime * 0.4 + float(i * 2.0))
    ));
}
I'd just be comparing that fragment against all the uniform points or whatever that got passed in. The problem with this seems to be that it doesn't run so hot on a 4k display, I'm guessing looping through all those fragments (for like 1000 points on a GTX 980) is a little expensive. Or maybe shaderToy just doesn't like fullscreen.

I think I'm going to try that first and see how it goes.


I'm not quite getting this one, so for each datapoint I'd create a quad (based on the intensity / whatever) of the point, then I'd run the gaussian function for just that plane and then render that to a texture?
And then when planes are overlapping the GPU would just know what's going on and do some behind the scenes blending? Would you have to make sure that the planes are in distinct layers (y = 0, y = 1, y = 2) so that you didn't get weird collision artifacts (I'm fairly certain I've seen that before)?

They can all be in the same plane -- if you configure for additive blending like Joda described the blend/framebuffer unit will resolve overlapping regions correctly.

You'd accumulate your splats to a single-valued texture (R16_FLOAT or whatever) and then when you rendered that texture, you'd remap the float value to a color via a lookup.
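The final remap from accumulated scalar to colour can be a simple 1D lookup; a CPU sketch (the ramp values here are made up):

```python
def false_color(value, ramp):
    """Map an accumulated scalar (e.g. from an R16_FLOAT target) to a colour
    by indexing a 1D lookup ramp, clamping at both ends."""
    idx = min(len(ramp) - 1, max(0, int(value * (len(ramp) - 1))))
    return ramp[idx]

# Hypothetical black -> blue -> red -> yellow -> white heat ramp.
RAMP = [(0, 0, 0), (0, 0, 255), (255, 0, 0), (255, 255, 0), (255, 255, 255)]
```

On the GPU this would be a texture fetch into a 1D ramp texture instead of an array index.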

Tres Burritos
Sep 3, 2009

Thanks goons!

I got Sex Bumbo's solution working

Demo

And then some blended splatting(?)

Demo

The splatting isn't as fancy right now, and it's missing the final step, which will be taking the red values and converting them to the same shading as the first heatmap, but it allows me to have around 2.5x more datapoints.

See, you just needed to use tiny words :haw:

Mugticket
Sep 13, 2011

I'm having trouble sending integers through shaders in opengl. I have a switch in the geometry shader that uses integers to see what shape it should make. The problem is that only the value zero goes through it like it should. One, two, three etc seem to come out as really large numbers. I tried setting the value of the integer inside the vertex shader which worked as intended. And if I send the values as floats and replace the switch with if's and if else's, it also works.

One thing I could find on the internet was that, since the sizes of integers can differ across hardware, the values could get messed up, but the problem persisted even when I stored the integers as GLint.

Edit: gently caress this. Now it works and I have no idea why.

Edit2: Found out why it works now, but I have no idea why you would do this.

code:
glVertexAttribPointer(shpAttrib, 1, GL_INT, GL_FALSE, vali, (void*)(4*sizeof(GLfloat)));
This is the original that only works when the integer that I send is 0, otherwise it becomes some crazy big number, when sent to vertex shader.

code:
glVertexAttribPointer(shpAttrib, 1, GL_FLOAT, GL_FALSE, vali, (void*)(4*sizeof(GLfloat)));
I changed the type to GL_FLOAT. Now it reads GLint:s properly?

Mugticket fucked around with this message at 21:42 on Nov 3, 2015

Ralith
Jan 12, 2011

I see a ship in the harbor
I can and shall obey
But if it wasn't for your misfortune
I'd be a heavenly person today
I think glVertexAttribPointer converts values to floats. Try glVertexAttribIPointer.

Mugticket
Sep 13, 2011

Ralith posted:

I think glVertexAttribPointer converts values to floats. Try glVertexAttribIPointer.

Thanks a lot!

Doc Block
Apr 15, 2003
Fun Shoe
Am I allowed to ask Metal questions in here?

Hubis
May 18, 2003

Boy, I wish we had one of those doomsday machines...

Doc Block posted:

Am I allowed to ask Metal questions in here?

This would probably be the right place, though I'd be curious to see how much expertise is out there

Doc Block
Apr 15, 2003
Fun Shoe
Well, here's the thing. I want to add a bloom effect to my AppleTV game (whose engine I wrote using Metal), and I've got it working, but it's slooooow.

The basic steps I'm doing right now:
1) Render the scene into a framebuffer (RGBA16F) that's the same resolution as the screen
2) Create a specular buffer by rendering the output of Step 1 into a texture that's 1/4th the screen resolution, using a shader that drops any values below a brightness threshold.
3) Blur the specular buffer
4) Render a fullscreen quad to the screen, with a shader that takes the output of Step 1 and Step 3 and combines them.

This pretty much kills the device's fillrate and is not kind to the GPU's tile cache. But there doesn't really seem to be a better way.

I can use multiple render targets to get rid of Step 2, but then the specular buffer has to be full resolution and makes Step 3 a lot slower. And there's still the need to store values outside of the tile cache and then read them back in, which is a performance no-no on PowerVR.

Basically, does anyone have a better way to do this in Metal? Hopefully one that's able to do most/all of the work inside the tile cache so the GPU doesn't have to touch main memory until the tile is finished?
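Not an answer to the Metal question, but the four steps are easy to mock up on the CPU to see what each pass contributes -- a toy 1D sketch (names and the 3-tap blur are mine, not from the actual engine):

```python
def bloom_1d(scene, threshold=1.0, strength=1.0):
    """The four steps above on a 1D row of pixels for brevity:
    threshold -> half-res downsample -> 3-tap blur -> upsample + combine."""
    bright = [max(0.0, v - threshold) for v in scene]            # step 2: clip dark values
    half = [(bright[i] + bright[i + 1]) / 2.0                    # step 2: downsample
            for i in range(0, len(bright) - 1, 2)]
    blurred = [(half[max(0, i - 1)] + half[i] + half[min(len(half) - 1, i + 1)]) / 3.0
               for i in range(len(half))]                        # step 3: blur
    # Step 4: upsample (nearest) and add onto the original scene.
    return [scene[i] + strength * blurred[min(i // 2, len(blurred) - 1)]
            for i in range(len(scene))]
```

Nothing below the threshold contributes, and a bright pixel's glow bleeds onto its neighbours via the blur.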

Doc Block
Apr 15, 2003
Fun Shoe
And, of course, now that I've written it all out it occurs to me that large chunks of that could possibly be combined into just one or two compute shaders. Welp. v:shobon:v

lord funk
Feb 16, 2004

Shouldn't you be looking into the performance shaders for a blur effect? I thought that's specifically what those were made for.

Doc Block
Apr 15, 2003
Fun Shoe
That's what I'm using to do the blur, but there's a performance penalty involved in switching the GPU from draw mode to compute mode and then back within the same frame. The Metal Performance Shader documentation suggests either doing all your compute at the beginning or at the end of the frame to avoid this.

Even without that penalty, I imagine it'd be a lot faster if I wrote a compute shader (or two) that took the initial color buffer, downscaled it & clipped the sub-threshold colors to black, blurred that, then combined that with the original and just wrote it to the final drawable in as few operations as possible instead of a bunch of discrete steps like I'm doing now.

Plus, Metal Performance Shaders are only available on A8 GPUs and up, so if I bring the game to iPhone later I'd have to come up with an alternate solution for A7 devices anyway (never mind having to do an OpenGL renderer).

Doc Block fucked around with this message at 20:45 on Nov 6, 2015

Sex Bumbo
Aug 14, 2004
Why are you using RGBA16F, and can you avoid that?
Can you downsample it further?
What's the format of the specular buffer?
If you want to use compute, can you eat a frame of latency? That way you would have only one draw portion of your frame and one compute portion:

1: Compute last frame's blur
2: Draw new frame to different frame buffer
3: Draw/present last frame

Sex Bumbo fucked around with this message at 03:59 on Nov 7, 2015

Doc Block
Apr 15, 2003
Fun Shoe
I'm using RGBA16F for the offscreen color buffer so that it preserves brightness above 1.0 instead of just clipping, which is useful for doing post processing effects like bloom.

I've tried downsampling the specular buffer further, but then it looks blocky and shimmers if I downsample it enough to make a noticeable dent in performance. It's actually smaller than I said earlier: it's divided by 4 in each direction, so 1/16th the pixel count, not 1/4th, whoops.

The specular buffer pixel format is just R8.

I could definitely stand to have at least the bloom effect be a frame behind. Hadn't thought of that...

Sex Bumbo
Aug 14, 2004
As an experiment you might want to try lowering the bit depth, even to something extreme like 565, just to see how it affects performance. Other than that, I'm not familiar with Metal but there aren't really that many different ways to downsample a texture and blur it that I'm aware of. Are you able to profile it somehow? You can do hacky profiling like doing a pass-through non-blur just to see if it's the blur kernel that's the bottleneck.

Doc Block
Apr 15, 2003
Fun Shoe
Xcode has some pretty nice tools for Metal. You can do a GPU frame capture and it will give you an exact breakdown of everything, including how long it all takes, what the various state objects are set to at any given point during the frame, and even what the frame looked like at any specific draw call (with funky green outlines showing you what exactly was drawn in that particular call).



And yeah, the two biggest bottlenecks are the blur kernel and the composite pass, which combines the color buffer with the (downsampled and blurred) specular buffer and then writes it to the framebuffer.

(the "3D Mesh (normal)" pipeline isn't really a bottleneck since in that frame it's drawing multiple models with large, poorly-optimized geometry sets that take up large portions of the screen and use a crappy fragment shader that I haven't really optimized yet).

edit: those 8 million compute pipelines are the implementation of a Gaussian blur filter from Apple's Metal Performance Shaders framework.

Doc Block fucked around with this message at 08:39 on Nov 7, 2015

Hubis
May 18, 2003

Boy, I wish we had one of those doomsday machines...

Sex Bumbo posted:

As an experiment you might want to try lowering the bit depth, even to something extreme like 565, just to see how it affects performance. Other than that, I'm not familiar with Metal but there aren't really that many different ways to downsample a texture and blur it that I'm aware of. Are you able to profile it somehow? You can do hacky profiling like doing a pass-through non-blur just to see if it's the blur kernel that's the bottleneck.

Does Metal support 10-11-10?

Xerophyte
Mar 17, 2008

This space intentionally left blank

Hubis posted:

Does Metal support 10-11-10?

Looking at the Metal Programming Guide and they have RG11B10Float as well as RGB9E5Float shared exponent formats, apparently, which might work for the HDR in this case. Maybe. I really don't know enough about the problem domain here.

Sex Bumbo
Aug 14, 2004
I'd look into trying a comparable non-compute blur. It might be doing a constant time blur for any blur width, which might lose out pretty hard in performance compared to a fragment shader blur until you make the blur width enormous.

Compute Gaussian blurs seem to function better as example code in my experience.

Doc Block
Apr 15, 2003
Fun Shoe
The Metal Performance Shader library has tons of permutations for each operation, optimized for various image sizes, kernel sizes, and pixel formats. The API chooses which to use at runtime.

If memory serves, in one of the WWDC sessions Apple said they had created 60+ different Gaussian blur implementations.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
so you're saying they wrote a shader compiler

Doc Block
Apr 15, 2003
Fun Shoe
They explain in the WWDC session video: https://developer.apple.com/videos/play/wwdc2015-607

Anyway, I'll try a 2-stage blur, but given what I've seen so far of dependent texture reads on PowerVR, I'm not confident it'll be fast. Especially not if it has to swap out the tile cache to main memory. I think having to do multiple tile cache stores per frame is part of my current problem.

Sex Bumbo
Aug 14, 2004
You should be able to avoid dependent texture reads in a blur shader. After all, the position of every sample is always the same every blur for each pixel.

Doc Block
Apr 15, 2003
Fun Shoe
Whatever they're called then when you specify the texcoords in the fragment shader so the GPU doesn't know to prefetch the necessary texels.

Or is that not an issue anymore? I'll have to play around and see.

Sex Bumbo
Aug 14, 2004
It should know how to prefetch the necessary texels is what I'm saying. Like, if you're reading 3 texels, encode all three positions into the vertex data so it gets interpolated and you don't need any math in the fragment shader.

Hubis
May 18, 2003

Boy, I wish we had one of those doomsday machines...

Doc Block posted:

Whatever they're called then when you specify the texcoords in the fragment shader so the GPU doesn't know to prefetch the necessary texels.

Or is that not an issue anymore? I'll have to play around and see.

Not a thing anymore (and hasn't been really since 2005 or so).

"Dependent Texture Reads" refer to one texture fetch relying on the value of a previous texture fetch to determine its lookup location. A screen-space distortion shader is a perfect example -- the value you output is fetched from the rendered framebuffer using a coordinate offset by a second "distortion map". This is potentially bad because the first texture read injects round-trip latency so the shader unit sits idle, then the texture unit sits idle while the shader unit computes the new sample position. If you have well balanced workloads that will keep the shader unit busy with other things then this isn't that big of a deal.

Reading something straight from a vertex interpolator will be no faster than if you read it and did some kind of (simple) math on it first. Either way it's executing some shader functions to put the interpolated (and possibly modified) value into a register that it then feeds to the texture unit. There's no "fast path" where you just pump the TEXCOORD semantic straight to the texture unit anymore.

Doc Block
Apr 15, 2003
Fun Shoe
I'm just way overthinking this, it seems. I'll try to get back to obsessing over this stupid bloom effect that I just had to have :rolleyes: in my game later this week or next weekend.

Thanks guys.

Doc Block fucked around with this message at 08:26 on Nov 8, 2015

Sex Bumbo
Aug 14, 2004

Hubis posted:

Not a thing anymore (and hasn't been really since 2005 or so).

"Dependent Texture Reads" refer to one texture fetch relying on the value of a previous texture fetch to determine its lookup location. A screen-space distortion shader is a perfect example -- the value you output is fetched from the rendered framebuffer using a coordinate offset by a second "distortion map". This is potentially bad because the first texture read injects round-trip latency so the shader unit sits idle, then the texture unit sits idle while the shader unit computes the new sample position. If you have well balanced workloads that will keep the shader unit busy with other things then this isn't that big of a deal.

Reading something straight from a vertex interpolator will be no faster than if you read it and did some kind of (simple) math on it first. Either way it's executing some shader functions to put the interpolated (and possibly modified) value into a register that it then feeds to the texture unit. There's no "fast path" where you just pump the TEXCOORD semantic straight to the texture unit anymore.

I thought you were wrong about this, at least regarding A7, but

quote:

The Apple A7, A8, and A9 GPUs do not penalize dependent-texture fetches.
https://developer.apple.com/library...SPlatforms.html

It is I who am the big dummy

Hubis
May 18, 2003

Boy, I wish we had one of those doomsday machines...

Sex Bumbo posted:

I thought you were wrong about this, at least regarding A7, but

https://developer.apple.com/library...SPlatforms.html

It is I who am the big dummy

It's a fair possibility -- my experience is mostly with discrete PC GPUs -- but yeah I would have been very surprised if the PowerVR architecture were hugely different in that area.

Joda
Apr 24, 2010

When I'm off, I just like to really let go and have fun, y'know?

Fun Shoe
This is kind of a tangential question, but as far as I can tell this is the place I'm most likely to find people who work/have worked with rendering academically.

What drawing program do/did you use for theses and papers to demonstrate spatial concepts? I'm currently working in Geogebra, and it works great for 2D simplifications of concepts, but there are some things where a 3D drawing is simply needed, and doing those in Geogebra is a pain.

peepsalot
Apr 24, 2007

        PEEP THIS...
           BITCH!

Is there a way I can draw to a framebuffer in OpenGL such that if a pixel has been written to once then it is locked into that color and cannot be overwritten? I can't use the stencil or depth buffer and the pixel values could have alpha < 1.
Like if i clear it to (0,0,0,0), then only pixels with value == (0,0,0,0) should be allowed to be written to.

Or to put another way, if the alpha value is non-zero then don't let it be drawn over anymore.

Doc Block
Apr 15, 2003
Fun Shoe
Can you read from the destination color buffer in OpenGL? If so, just write a shader that reads the corresponding pixel in the destination color buffer, and if its alpha > 0 then discard the fragment with GLSL's discard keyword.

What are you trying to do?

peepsalot
Apr 24, 2007

        PEEP THIS...
           BITCH!

As I understand it, shaders can't read pixel data out of the framebuffer.

roomforthetuna
Mar 22, 2005

I don't need to know anything about virii! My CUSTOM PROGRAM keeps me protected! It's not like they'll try to come in through the Internet or something!

peepsalot posted:

Is there a way I can draw to a framebuffer in OpenGL such that if a pixel has been written to once then it is locked into that color and cannot be overwritten? I can't use the stencil or depth buffer and the pixel values could have alpha < 1.
Like if i clear it to (0,0,0,0), then only pixels with value == (0,0,0,0) should be allowed to be written to.

Or to put another way, if the alpha value is non-zero then don't let it be drawn over anymore.
I'm not much of a shaderer, but as I understand it you're supposed to avoid conditionals (so Doc Block's solution might be slow, even if you could do it).

However, I think you also can't do it anyway - reading from the destination is broadly a no-no.

If you could make the prior rendering go to an intermediate texture then you could do a final combination render to the screen to get the effect you want.

You could do some weird thing for the combination without branches, something like
code:
// 256.0 saturates the factor to 1 for any nonzero dst (float literal avoids
// int/float mixing, which GLSL ES rejects).
float blendFactor = clamp((dst.r + dst.g + dst.b + dst.a) * 256.0, 0.0, 1.0);
outputColor = dst * blendFactor + src * (1.0 - blendFactor);
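That combine can be sanity-checked on the CPU; a Python sketch of the same math (function name is mine, and it assumes the 256 scale saturates any nonzero dst to a blend factor of 1):

```python
def locked_write(dst, src):
    """Branchless 'first write wins': once dst has any nonzero channel,
    the clamp saturates the blend factor to 1 and later writes are ignored."""
    blend = min(1.0, max(0.0, sum(dst) * 256.0))
    return tuple(d * blend + s * (1.0 - blend) for d, s in zip(dst, src))
```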


Doc Block
Apr 15, 2003
Fun Shoe
You can't read the destination color buffer in a shader in OpenGL? I thought you could...

You can in Metal :colbert:

Doc Block fucked around with this message at 05:35 on Nov 20, 2015
