|
Ralith posted: Semaphores introduce an execution dependency, not a memory barrier. You cannot use semaphores as a substitute for memory barriers under any circumstances. For operations that span queues you need both; for operations on a single queue, semaphores aren't useful.

I'm probably misinterpreting the spec here, but the section on semaphore signaling says that all memory accesses by the device are in the first access scope, and similarly for waiting, all memory accesses by the device are in the second scope. Granted it might not be the best way to do it, but it seems like relying on a semaphore for memory dependencies is allowed.
|
# ? May 8, 2020 20:57 |
|
|
# ? May 11, 2024 09:27 |
|
Rhusitaurion posted: I'm probably misinterpreting the spec here, but the section on semaphore signaling says that all memory accesses by the device are in the first access scope, and similarly for waiting, all memory accesses by the device are in the second scope. Granted it might not be the best way to do it, but it seems like relying on a semaphore for memory dependencies is allowed.

No, you're right, I misremembered. Using a semaphore as you were is not unsound, just unnecessary effort for extra overhead. Note that you do typically need explicit barriers when expressing inter-queue dependencies regardless, but that's for managing ownership transitions when using resources with exclusive sharing mode.
|
# ? May 9, 2020 05:43 |
|
Ralith posted: No, you're right, I misremembered. Using a semaphore as you were is not unsound, just unnecessary effort for extra overhead. Note that you do typically need explicit barriers when expressing inter-queue dependencies regardless, but that's for managing ownership transitions when using resources with exclusive sharing mode.

Got it. Thanks for the advice - I've switched over to a single command buffer with barriers, and it seems like it works. Not sure if I got the src and dst masks and whatnot correct, but the validation layers are not complaining, at least!
|
# ? May 9, 2020 17:36 |
|
Rhusitaurion posted:Got it. Thanks for the advice - I've switched over to a single command buffer with barriers, and it seems like it works. Not sure if I got the src and dst masks and whatnot correct, but the validation layers are not complaining, at least!
|
# ? May 9, 2020 18:08 |
|
I need help debugging some OpenGL code which is very old and crusty (it still has a mix of fixed-function pipeline stuff in there). Right now I'm trying to find the source of some weird graphical glitches which only show up on the Mac CI server, which is running:

OpenGL Version: 2.1 APPLE-16.7.4
GL Renderer: Apple Software Renderer

The report from the server includes framebuffer screenshots, and the glitches show up as perfectly horizontal blank lines across various 3D-rendered triangles, where the background just shows through. Each tri has a different set of these lines missing (the exact lines are not global to the screen/framebuffer).

One thing I just noticed is that the shader code is basically written with single-precision float/vec3/vec4 variables in mind, but when vertex attributes are passed to the GPU, glVertexAttrib3d is called, so it's passing in doubles. So my question at the moment is: would mixing single/double precision in that way likely cause problems? Does OpenGL just know to coerce doubles into floats, or is there some risk of writing out of bounds with these double-width values, or is the behaviour undefined in such cases, or what?

I don't use Macs so it's a bit difficult to debug the problem via the remote CI server. I updated the shader recently, which introduced these glitches, but it tested fine on other Linux systems etc. The only other thing I can think of is that the Apple Software Renderer has some bug in its fwidth function, which was one part of my changes.
|
# ? May 25, 2020 23:53 |
|
peepsalot posted: Does OpenGL just know to coerce doubles into floats, or is there some risk of writing out of bounds with these double width values, or behaviour is undefined in such cases, or what?

Double-precision vertex attributes do not exist in GL 2.1; calling glVertexAttrib3d merely specifies that the input data is doubles, and they will be converted to single precision. I don't believe 2.1 has integer vertex attributes either: glVertexAttrib3s converts the input int16s to floats (the N-suffixed variants like glVertexAttrib4Nsv are the ones that normalize to [-1,1]). Functions that let you set genuine 64-bit vertex attributes using doubles have an L suffix, like glVertexAttribL3d, and were added in GL 4.1. I would be surprised if your error was due to the lowering conversion doing something especially strange on Mac.
|
# ? May 26, 2020 00:59 |
|
I've started to experiment a little with OpenGL for some 2D rendering. I've been coding my own math functions and such because hey, it's a fun hobby. I got a little scene with some quads going, but I've run into a problem I can't seem to figure out.

I decided to use different Z depths to control which quad goes in front of the other, but they begin to shrink as they move away from the camera even though I'm using an orthographic matrix, which as far as I understand means that distance should not affect size. Yet they do shrink, and their location relative to the x and y axes also changes. It's almost as if everything is being scaled towards the origin. Everything seems to work fine until I play with the Z axis.

I do all my scaling and rotation in a 2x2 matrix, and then "promote" that matrix into a 4x4 matrix, meaning I just copy the 2x2 values into the 4x4 identity matrix. Then I multiply that matrix by a 4x4 translation matrix. All my functions seem correct, and I'm following the second edition of 3D Math Primer for Graphics and Game Development for the math: left-handed convention and row-major order. Here is some of the pertinent code: https://pastebin.com/2CuKTUwW
|
# ? Aug 4, 2020 01:23 |
|
Xeom posted: I've started to experiment a little with OpenGL for some 2D rendering. I've been coding my own math functions and such because hey, it's a fun hobby.

Your "orthographic" projection looks like perspective to me.
|
# ? Aug 4, 2020 02:02 |
|
Remember that OpenGL matrices are column-major. Your ortho matrix sets w = n2 * p.z + 1 when used with column-vector math, which means the hardware will then do a sort of perspective division.
|
# ? Aug 4, 2020 02:57 |
|
Xerophyte posted: Remember that OpenGL matrices are column-major. Your ortho matrix sets w = n2 * p.z + 1 when used with column-vector math, which means the hardware will then do a sort of perspective division.

AAaaahhh!! I remember telling myself this before implementing it and then I totally forgot. I remember even having to convince myself that row-major would actually work for everything else. Now it all looks weird to me and I'll have to convince myself again. Thanks.
|
# ? Aug 4, 2020 03:44 |
|
row-major still works; you just have to set the layout(row_major) qualifier in GLSL and hope that your GPU vendor didn't gently caress it up (they probably did)
|
# ? Aug 4, 2020 17:05 |
|
Suspicious Dish posted: row-major still works; you just have to set the layout(row_major) qualifier in GLSL and hope that your GPU vendor didn't gently caress it up (they probably did)

But the translation is column major. It's best to stick to just one type instead of mixing them.
|
# ? Aug 4, 2020 17:14 |
|
So my question is: for functions like glUniformMatrix4fv that can be put into row-major mode, what does that mean inside GLSL? Currently everything seems to be working right, but I really feel like I'm missing some key element, because all my math seems to mostly be in row-major form, yet I had to make that change yesterday. I am doing the math inside GLSL in row-major form, meaning things read left to right rather than right to left. Why is it working? I should probably just switch to column-major mode, as ugly as it is to my eyes.
|
# ? Aug 4, 2020 17:35 |
|
glUniformMatrix4fv uploads a float[16] matrix to the GPU's memory. If you pass the row-major (transpose) flag, it will transpose the data to column-major as it uploads it. GLSL has the ability to load a row-major matrix from memory with the layout(row_major) qualifier. This came into existence much later than the transpose flag on glUniformMatrix4fv, and was originally intended for uniform buffers, which you're not using. All array access inside the GLSL language is column-major, but as long as you don't try to take apart or put back together matrices in GLSL, you shouldn't run into major problems. There are other options, but I won't mention them so I don't confuse you.

Absurd Alhazred posted: But the translation is column major. It's best to stick to just one type instead of mixing them.

you don't have to mix them? I don't know what this means.
|
# ? Aug 4, 2020 17:43 |
|
Suspicious Dish posted: you don't have to mix them? I don't know what this means.

In the code that we were presented, the translation and "ortho" projection matrices were in opposite majority for the intended use. You should stick to a single majority instead of mixing them.
|
# ? Aug 4, 2020 17:46 |
|
oh, sure. I just use row-major for everything; I like the "3 vec4s" notation a lot for affine matrices and it fits in my head nicely.
|
# ? Aug 4, 2020 17:48 |
|
Now I'm even more confused, because the book I'm using claims that matrix is already in row-major form, the 4th row being of the form {dx, dy, dz, 1}. In fact the other matrix being in row-major form shouldn't have mattered either, because my uniform call is set with GL_TRUE. code:
|
# ? Aug 4, 2020 18:17 |
|
to me (I could be wrong!) row major is this, which is what GL accepts if you pass GL_TRUE to glUniformMatrix4fv:

[ a b c tx ]
[ d e f ty ]
[ g h i tz ]
[ 0 0 0 1  ]

The translation components are at indexes 3, 7 and 11.

column major is this, which is what GL accepts if you pass GL_FALSE to glUniformMatrix4fv:

[ a  d  g  0 ]
[ b  e  h  0 ]
[ c  f  i  0 ]
[ tx ty tz 1 ]

The translation components are at indexes 12, 13 and 14.

I prefer the first one because you can get a bit of extra packing efficiency and store as 3 vec4s instead of 4 vec3s.
|
# ? Aug 4, 2020 18:23 |
|
I'm going to be honest, I always have to double-check myself whenever I'm editing any new matrix code in, because I get confused, and different matrix libraries have inconsistent ways of handling the order of initializers.
|
# ? Aug 4, 2020 18:36 |
|
I suspect the book you're looking at is also using row vectors, in addition to row-major storage layout. One of the annoying parts of the entire row-vs-column kerfuffle is that a lot of guides and textbooks tend to conflate matrix memory layout and vector math convention.

It's historically common to use row-major matrices with the (awful, no good, very bad) row vector math convention -- i.e. float3 p1 = p0 * M -- and column-major matrices with the column vector math convention -- i.e. float3 p1 = M * p0. This convention split happened sometime in the 90s: early IRIS GL used row-major storage and row vectors; OpenGL swapped to column-major storage and column vectors. You get subtle errors because both changing the layout type and changing the vector convention have the same result as transposing the matrix, since (Mv)^T = v^T M^T for a vector v and matrix M. This was in fact the entire reason OpenGL swapped the storage convention in the first place: it let them switch to a column vector math convention in the documentation without changing the API and breaking existing code.

tl;dr: what's a "row major" vs "column major" transform matrix depends on how the author is applying the transforms. I'm not 100% sure that's your problem, but what I can definitely say is that C++ code:

If it's any consolation, I'm pretty sure everyone who has ever written any CG-related code has at least one "oh for gently caress's sake" moment per year related to this mess.
|
# ? Aug 4, 2020 18:51 |
|
When multiplying a vec4 (as a row vector) by a mat4, I don't see how your row matrix would lead to translation. The w term of the resulting vector would hold all the translation information. https://imgur.com/Y5ITil8 Seems to make sense to me, but clearly I'm missing something.

EDIT: written towards Suspicious Dish; reading Xero's post now.
|
# ? Aug 4, 2020 18:54 |
|
Xero's post has the more complete answer -- I intentionally didn't give the full explanation because 99% of people only need the shorthand, but if you're going through a book, it will have its own conventions! It's worth noting that math has had its own conventions for years, and those conventions have goals that don't necessarily apply to modern computer science. If your book takes a math-first perspective, then both HLSL and GLSL are backwards!
|
# ? Aug 4, 2020 19:07 |
|
Xeom posted: When multiplying a vec4 (as a row vector) by a mat4, I don't see how your row matrix would lead to translation. The w term of the resulting vector would hold all the translation information.

That excerpt says:

quote: Then we could rotate and then translate a point v to compute a new point v' by

The row vector style is not typical in GL and is not used in any of the GL documentation. Also, and this is just my personal opinion, it's a lovely pile of unintuitive garbage that should be set on fire and shot into the sun.
|
# ? Aug 4, 2020 19:11 |
|
I did understand the difference between row and column vectors, but I totally forgot that GLSL does the math assuming a column vector. Currently my shader is set up as if GLSL did row-vector multiplication. Somehow it all worked out, because everything seems fine in my test program. I'll have to figure out exactly WHY it worked at a later time. At least I can go about fixing everything now. Bugs and math can be really weird sometimes.
|
# ? Aug 4, 2020 19:26 |
|
If you do vec * mtx in GLSL, it will treat the vector as a row vector, so it's effectively the same as transpose(mtx) * vec. Your two mistakes canceled each other out.
|
# ? Aug 4, 2020 19:52 |
|
|
# ? Aug 4, 2020 20:47 |
|
I am having a bad time with alpha blending because I have no idea what I am doing. So, I have MSDF fonts on a curve which look OK. I have an outline shader because I don't really want to re-encode the fonts with an SDF channel, and I have managed to get anti-aliasing to some degree on both the inside and the outside. However, adding a drop-shadow shader highlights the failures in processing the anti-aliasing applied at the outside of the outline. Where there should be a lerp between the shadow and the outline there is a black halo, which could be taken as artistic effect, but I would like to address it. So, one of the ugly shaders for creating the shadow: code:

If I remove the anti-aliasing on the outline then I can get a correct aliased shadow, but the aliasing is pretty bad on the thin typeface at such a low resolution. I made the outline shader a bit worse than when it started: code:
MrMoo fucked around with this message at 18:47 on Aug 22, 2020 |
# ? Aug 22, 2020 18:35 |
|
If I understand your problem correctly, you don't actually want to use lerp at all, because mixing red and blue doesn't make sense for the middle values. You might be looking for the "over" operator described here: https://en.wikipedia.org/wiki/Alpha_compositing I.e. "out = outline + shadow * (1 - outline.a);" (with premultiplied colors).
|
# ? Aug 22, 2020 18:55 |
|
This saturates the shadow, although it does clean up the halo for the most part. Changing the outline shader from lerp/mix to the over function fixes the remaining halo issues, so now all that remains is the colour itself.

MrMoo fucked around with this message at 19:54 on Aug 22, 2020
# ? Aug 22, 2020 19:21 |
|
Ironically the halo works perfectly for Cincinnati,
|
# ? Aug 22, 2020 20:02 |
|
After those interesting posts I'm going to post a beginner question that is completely boring. I'm making a font texture atlas using FreeType 2. Everything seems to be working well and I can get a PNG out with the exact results I want, but something goes completely wrong when I try to load it into OpenGL. The best way I can describe it is that the texture becomes skewed and compressed. Funnily enough, I can load the PNG I saved into that same quad with the same VAO and shader and it looks completely fine. code:
I tried vec4(1,1,1,texture(blah,blah).r), but it doesn't seem to help.
|
# ? Aug 30, 2020 02:55 |
|
A skewed image usually means the row length is wrong; are you calling glPixelStorei?
|
# ? Aug 30, 2020 02:59 |
|
haveblue posted: A skewed image usually means the row length is wrong; are you calling glPixelStorei?

Thank you for the help with the stupid questions. Everything looks good, just gotta flip it now.
|
# ? Aug 30, 2020 03:05 |
|
I'm rendering text, but it seems to be taking a long time even with a texture atlas. Printing the string "The quick fox jumped over the brown fence" takes about 0.05 to 0.1 milliseconds, which seems like a long time for this sort of thing. Right now I'm using a texture atlas and glBufferSubData to update the texture coordinates for each character printed. I'm also using glUniform to provide updates to projection, view, and model matrices; only the model matrix gets updated per character. I'm guessing updating the texture coordinates is what is taking so long, but I'm not quite sure what to do. Should I just build a VAO for each character and switch between those? I originally switched to the texture atlas to avoid switching between textures, but I guess updating a VBO is worse.
|
# ? Sep 7, 2020 21:15 |
|
You don’t need to change the model matrix per character. Just output a VBO of 2D textured quads with the vertex coordinates in screen space, and then draw the whole thing at once. Depending on how you do it, you don’t even need to multiply by a modelViewProjection matrix in your shader; just pass the view width & height, and use that to convert screen space coords into NDC coords (aka -1.0 to 1.0)
|
# ? Sep 8, 2020 01:04 |
|
I'm porting something from D3D11 to OpenGL ES 2. D3D11 pretty much eliminated the standard attribute bindings, and the D3D version just produces screen coordinates from a generic attribute. I know GLES2 has generic attributes, but will the draw succeed if nothing is ever specified via glVertexPointer?
|
# ? Sep 20, 2020 18:54 |
|
I was going to ask a question about how best to do asynchronous, progressive compute work in Vulkan/modern GPU frameworks when I want to continually display the work in progress, but I think that in typing it out I managed to figure out what the best approach -- well, an approach, at least -- would be. Thank you, pseudonymous rubber duck collective.

However, it did make me come up with another, related question: what is the current state of atomics in GPU land? I was planning on accumulating path tracer samples by using atomic increments to scatter, which I expect would be helpful in a wavefront path tracer, as I'll probably be grouping work by the rays' Morton order instead of the source pixel. However, if I understand correctly, base Vulkan only offers atomic operations on single int values, and float atomics are a very recent Nvidia-only extension. Do people just do it with floatBitsToInt and atomicExchange & co? Are atomics currently a thing to avoid outside of very specific and limited cases?
|
# ? Oct 31, 2020 04:48 |
|
Contended global atomics are very slow. I've had good results from using subgroup operations to do one atomic op per subgroup, though.
|
# ? Oct 31, 2020 17:13 |
|
Ralith posted: Contended global atomics are very slow. I've had good results from using subgroup operations to do one atomic op per subgroup, though.

This led me down a rabbit hole of looking at the subgroup stuff from 1.1, which I was completely unaware existed; I'm not very current or good with GPU framework stuff, which is why I started this little hobby project. Thanks! I noticed that the one-atomic-per-subgroup approach is exactly what the subgroup tutorial recommends, too. I expect the subgroup operations will be very useful for stuff like sampling, since I can make the subgroup collectively vote on the BRDF to sample, which should be efficient.

Unfortunately I don't think I can boil path tracing sample accumulation down to scan local group + one atomic op in that way. The problem with GPU path tracing has always been that it's incoherent: paths started in nearby pixels will very quickly diverge and veer off into paths that access completely different parts of the scene. Most GPU path tracers deal with this by doing wavefront tracing: generate a lot of subpath rays, sort them by their position and direction, and dispatch work according to that order so the local work group always accesses the same region of the scene. The problem with that is that now the local work group will include paths with vastly different origin pixels instead, and writing any new samples is a big incoherent scatter write. I expect I can deal with that by sorting the samples back into image-space buckets or something like that; it'll just be a little more annoying than atomically adding them to the target accumulation storage image immediately when I have them.
|
# ? Oct 31, 2020 19:33 |
|
|
|
If the target locations are effectively random, contention might not be too big an issue, though I suppose that's scene dependent.
|
# ? Nov 2, 2020 17:48 |