Happy Thread
Jul 10, 2005

by Fluffdaddy
Plaster Town Cop

Colonel J posted:

Felt like playing with matrix transforms (linear and non-linear), here's a work in progress: https://www.shadertoy.com/view/WtBXRD

I made it work with both 2D and 4D matrix transforms, which allowed me to finally truly understand why you need a 4D matrix for translation.
You can comment / uncomment defines at the top for various features.

Disclaimer: I had no idea how to draw a grid in a pixel shader, so I went with
1) scale/offset the space to the range I want (so (0,0) is in the middle of the screen)
2) if the value of (x - round(x)) is smaller than some epsilon, we're on a grid line (same for y)

To draw the transformed grid lines I proceed by inversion; my reasoning is that pixel x, after transformation T, is now at some location x+dx. I can't write to another location as the pixel shader only runs on pixel x (compute shadertoy when!?), so instead I consider that pixels are in the transformed space, and check whether T_inverse(x+dx) lands on a grid line. It made sense when I did it, now I'm a bit confused but it seems to work. Would there be any other way to do it?
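[A minimal sketch of that inverse-mapping grid test, in plain C++ standing in for the GLSL; the 2x2 matrix entries a, b, c, d and the epsilon are hypothetical:]

code:
#include <cmath>

// Near an integer line in either coordinate => on the untransformed grid.
bool onGridLine(double x, double y, double eps = 0.02) {
    return std::abs(x - std::round(x)) < eps ||
           std::abs(y - std::round(y)) < eps;
}

// For the transformed grid, invert T = [a b; c d] and test the pre-image:
// pixel p lies on T(grid) exactly when T_inverse(p) lies on the plain grid.
bool onTransformedGrid(double px, double py,
                       double a, double b, double c, double d) {
    double det = a * d - b * c;            // assumes T is invertible
    double ix = ( d * px - b * py) / det;
    double iy = (-c * px + a * py) / det;
    return onGridLine(ix, iy);
}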

It's weird: running on my 2013 Retina MacBook, I get 60 FPS for the 2D case, but it drops down to 25-30 FPS for the 4D case. I guess inverting a 4D matrix at every pixel is too much. At work on a GTX1060 I'm getting 60 FPS in all cases. Is it a Windows/OSX thing, or is my MacBook GPU just not that powerful? I'd be curious to hear perf reports from people here.
As long as the transform is linear (i.e. not the "fancy matrix" case) the matrix is the same for every pixel - is it possible to do that work just once with Shadertoy? I guess I'd have to do it in some sort of prepass, or just compute the values and hardcode them.

I'd love to see it, but for some reason my WebGL 2 stopped working on this machine. I don't remember changing any settings. How odd.

I haven't seen it, so consider that I don't know what you've already got working, but I imagine a lot of the shaders on ShaderToy would just draw a grid using path tracing, since any given 2D pixel could map onto multiple grid line intersections if they line up behind the pixel in 3D.

Also, where has this thread been my whole life???


czg
Dec 17, 2005
hi
In an effort to learn stuff, I've been working on making a little Vulkan renderer in my spare time.

I thought I had a pretty neat little setup, with two threads separately handling the actual rendering and updating of data, kinda like this:

I'm guessing this isn't a completely uncommon setup?

This actually works perfectly fine, and I get the expected results drawn, but the validator doesn't like it.
As soon as I update a descriptorSet for an image in the update thread, it says that the commandbuffer currently used in the render thread is invalidated:
code:
(null)(ERROR / SPEC): msgNum: 9 - You are adding vkQueueSubmit() to command buffer 0x140616d65b0 that is invalid because bound DescriptorSet 0x14 was destroyed or updated.
The spec valid usage text states 'Each of fence, semaphore, and swapchain that are valid handles must have been created, allocated, or retrieved from the same VkInstance'
(https://www.khronos.org/registry/vulkan/specs/1.0-extensions/html/vkspec.html#VUID-VkAcquireNextImageInfoKHR-commonparent)
    Objects: 1
       [0] 0x140616d65b0, type: 6, name: (null)
I looked around, and it seemed like if I used the extension VK_EXT_descriptor_indexing and created my descriptorSetLayouts with the flag VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT, I should be better off, but no such luck.
Is there something else I need to flag to be able to update descriptors (in my case a texture array) while they're being used in a commandBuffer?
Or will I just have to pause my render thread before I update descriptors and resume it once the new command buffers are ready?

And as I wrote, this renders perfectly fine without a hitch or any glitchiness; it's just the validator complaining. Am I right to assume that I should always heed the validator, and that if it works despite validation errors I'm just lucking out?

I'm writing this in C# using SharpVk, in case that matters.

Absurd Alhazred
Mar 27, 2010

by Athanatos

czg posted:

In an effort to learn stuff, I've been working on making a little Vulkan renderer in my spare time.

I thought I had a pretty neat little setup, with two threads separately handling the actual rendering and updating of data, kinda like this:

I'm guessing this isn't a completely uncommon setup?

This actually works perfectly fine, and I get the expected results drawn, but the validator doesn't like it.
As soon as I update a descriptorSet for an image in the update thread, it says that the commandbuffer currently used in the render thread is invalidated:
code:
(null)(ERROR / SPEC): msgNum: 9 - You are adding vkQueueSubmit() to command buffer 0x140616d65b0 that is invalid because bound DescriptorSet 0x14 was destroyed or updated.
The spec valid usage text states 'Each of fence, semaphore, and swapchain that are valid handles must have been created, allocated, or retrieved from the same VkInstance'
(https://www.khronos.org/registry/vulkan/specs/1.0-extensions/html/vkspec.html#VUID-VkAcquireNextImageInfoKHR-commonparent)
    Objects: 1
       [0] 0x140616d65b0, type: 6, name: (null)
I looked around, and it seemed like if I used the extension VK_EXT_descriptor_indexing and created my descriptorSetLayouts with the flag VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT, I should be better off, but no such luck.
Is there something else I need to flag to be able to update descriptors (in my case a texture array) while they're being used in a commandBuffer?
Or will I just have to pause my render thread before I update descriptors and resume it once the new command buffers are ready?

And as I wrote, this renders perfectly fine without a hitch or any glitchiness; it's just the validator complaining. Am I right to assume that I should always heed the validator, and that if it works despite validation errors I'm just lucking out?

I'm writing this in C# using SharpVk, in case that matters.

It's been a while since I've played with Vulkan, but don't you have to make sure a command buffer has been fully built and submitted (and perhaps even processed?) before changing any important state relevant to it? You might need to double- or triple-buffer your descriptor sets. That being said, is that what you usually update in Vulkan? I thought you'd be updating the underlying data while the descriptor sets stayed more static.

czg
Dec 17, 2005
hi
Maybe I’ve misunderstood how it should be done then.

At startup I create a descriptor for an array of ~100 combinedImageSamplers, and initialize them all to the same 4x4 checkerboard image.
I then index into that array with push constants in the shaders.
When I then want to update a texture, I create and upload a new image and imageView, transfer data to it, and then update the desired slot in the descriptor array.

Just updating the underlying data works great for vertex data.

This is all a learning experience for me, so I don’t really know how things work yet...

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

czg posted:

Is there something else I need to flag to be able to update descriptors (in my case a texture array) while they're being used in a commandBuffer?

No. This is illegal. You can't update resources while they're pending or in use. Use a separate set of descriptors instead.

czg posted:

Or will I just have to pause my render thread before I update descriptors and resume it once the new command buffers are ready?

Don't pause it; that would prevent double or triple buffering, which is how you get all of your throughput in a modern renderer. Just make sure not to touch resources while they're in use.
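[A minimal sketch of the per-frame descriptor set idea against the Vulkan C API; czg is on SharpVk, but the shape is the same. The device, fence, and set handles are assumed to exist already:]

code:
#include <vulkan/vulkan.h>

// Keep one descriptor set per frame in flight, and only write the copy whose
// frame the GPU has finished, so a bound set is never updated mid-use.
void updateTextureSlot(VkDevice device, VkFence inFlightFence,
                       VkDescriptorSet frameSet, uint32_t slot,
                       VkImageView newView, VkSampler sampler)
{
    // Block until the GPU is done with the command buffer that bound this set.
    vkWaitForFences(device, 1, &inFlightFence, VK_TRUE, UINT64_MAX);

    VkDescriptorImageInfo image = {};
    image.sampler     = sampler;
    image.imageView   = newView;
    image.imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;

    VkWriteDescriptorSet write = {};
    write.sType           = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
    write.dstSet          = frameSet;      // this frame's copy, not the live one
    write.dstBinding      = 0;
    write.dstArrayElement = slot;          // the slot in the texture array
    write.descriptorCount = 1;
    write.descriptorType  = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
    write.pImageInfo      = &image;
    vkUpdateDescriptorSets(device, 1, &write, 0, nullptr);
}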

czg
Dec 17, 2005
hi
Thanks for the hints.
Luckily implementing double buffered descriptors turned out to be pretty simple, and now everything works perfectly and the validator is happy!

Absurd Alhazred
Mar 27, 2010

by Athanatos

czg posted:

Thanks for the hints.
Luckily implementing double buffered descriptors turned out to be pretty simple, and now everything works perfectly and the validator is happy!

:buddy:

Odddzy
Oct 10, 2007
Once shot a man in Reno.
I'm a 3d artist in a games company and would like to learn a bit more about programming HLSL or GLSL stuff to get more of the technical aspect of the job. Could I have some recommendations of books that could break me into the subject?

Absurd Alhazred
Mar 27, 2010

by Athanatos

Odddzy posted:

I'm a 3d artist in a games company and would like to learn a bit more about programming HLSL or GLSL stuff to get more of the technical aspect of the job. Could I have some recommendations of books that could break me into the subject?

I think your best bet would be to just get a book about one of the graphics APIs and learn from that, which gives you the pipeline context where shaders fit and everything. I've found The OpenGL SuperBible pretty readable, more so than the Programming Guide. I imagine DirectX has similarly usable books.

Brownie
Jul 21, 2007
The Croatian Sensation
Similarly, I'm trying to learn some D3D12, and man, the API is weird. Coming from OpenGL and Vulkan I've found the DXGI pattern really kind of bizarre. For example: there are 4 different IDXGIAdapter interfaces: IDXGIAdapter1, IDXGIAdapter2, IDXGIAdapter3, and IDXGIAdapter4. The docs have basically no explanation of why you'd use one or the other, just that some were released in newer DXGI versions. Looking at Microsoft's D3D12 examples, they use IDXGIAdapter1... why? No idea.

Does anyone have any good resources on how to reason about API quirks like this? It's obvious to me it has to do with the D3D ecosystem being around for so long, but I don't understand if there are parts of this I should be ignoring or what.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
It's a COM practice. APIs are versioned. Instead of adding new methods to existing interfaces, which would break ABI compatibility, they create new interfaces that extend the old ones. Use the version that supports what you need, and only upgrade if you need a newer interface.
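[A sketch of the usual dance, using the real DXGI entry points: start from the widely available interface, then query up only when you need the newer one.]

code:
#include <dxgi1_6.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

ComPtr<IDXGIFactory1> factory;
CreateDXGIFactory1(IID_PPV_ARGS(&factory));

ComPtr<IDXGIAdapter1> adapter1;            // baseline version, always there
factory->EnumAdapters1(0, &adapter1);

ComPtr<IDXGIAdapter4> adapter4;            // newer: adds GetDesc3 and friends
if (SUCCEEDED(adapter1.As(&adapter4))) {   // As() wraps QueryInterface
    DXGI_ADAPTER_DESC3 desc = {};
    adapter4->GetDesc3(&desc);             // richer query on the new interface
} else {
    // Old runtime: stick to IDXGIAdapter1's methods.
}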

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

Odddzy posted:

I'm a 3d artist in a games company and would like to learn a bit more about programming HLSL or GLSL stuff to get more of the technical aspect of the job. Could I have some recommendations of books that could break me into the subject?

Tread carefully. Shaders are very complicated beasts, and most engines have fancy material pipelines that you will need to work within. I would talk to your studio's graphics engineers to get a better understanding of your engine's own material pipeline. I can't speak for your engine, but it's likely using HLSL (99% of games out there use HLSL, not GLSL).

Getting a handle on material graphs such as Unreal's will get you most of the way to understanding the art side of things without having to think about engine-specific concepts like bindings, resource management, passes, etc.

Colonel J
Jan 3, 2008
If you know a bit of programming you could do the learnopengl.com tutorials; the basics are gentle for beginners, and they'll show you roughly what's happening in setting up and using a shader.

It's OpenGL though; setting up a DirectX pipeline will be different, but the concepts are roughly similar.

haveblue
Aug 15, 2005



Toilet Rascal
Yeah, how much do you know about non-shader programming already? The contents of _______.hlsl are a program, just in a language specialized for doing certain kinds of math and with different quirks and limitations from general-purpose programming.

Hubis
May 18, 2003

Boy, I wish we had one of those doomsday machines...
Shadertoy might be fun to check out

Colonel J
Jan 3, 2008
I really enjoyed this shader deconstruction by Inigo Quilez: https://www.youtube.com/watch?v=Cfe5UQ-1L9Q

Dude is really good, and he makes it quite accessible.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
Most of the techniques you see on shadertoy, like raymarching, are generally not techniques you can use in game development, so be aware. There is some use for them, but a lot of the distance-field modelling you see there is better done with traditional modelling techniques.

peepsalot
Apr 24, 2007

        PEEP THIS...
           BITCH!

Colonel J posted:

I really enjoyed this shader deconstruction by Inigo Quilez: https://www.youtube.com/watch?v=Cfe5UQ-1L9Q

Dude is really good, and he makes it quite accessible.

drat, I've seen some of this guy's demos and web pages, which are really informative and impressive. So I'm interested in checking this out eventually, but holy poo poo, 6 hours long!

Colonel J
Jan 3, 2008

peepsalot posted:

drat, I've seen some of this guy's demos and web pages, which are really informative and impressive. So I'm interested in checking this out eventually, but holy poo poo, 6 hours long!

I'm slightly ashamed to say I watched it all and every minute was good.

I especially like the "physically-inspired" approach rather than going for absolute realism. Seems like you can get more done quicker this way, for a much more artistic result.

Hubis
May 18, 2003

Boy, I wish we had one of those doomsday machines...

Suspicious Dish posted:

Most of the techniques you see on shadertoy, like raymarching, are generally not techniques you can use in game development, so be aware. There is some use for them, but a lot of the distance-field modelling you see there is better done with traditional modelling techniques.

Yeah, ShaderToy is kind of funny in that you sort of already have to be an expert to tell the difference between "this is a state-of-the-art technique/best practice" and "someone trying to be cute/clever". However, it can also serve as an excellent framework for just experimenting with simple material shading/procedural techniques without needing the surrounding rendering engine.

Happy Thread
Jul 10, 2005

by Fluffdaddy
Plaster Town Cop
I have a slow computer and I really, really wish the "slideshow" browser on ShaderToy worked, as in you click something to advance to the next shader, versus right now where it advances on a fixed timer out of your control and stops if a shader won't load. Their main "browse" page tries to load the maximum number of submissions simultaneously on every page, and that is just not gonna happen.

Happy Thread
Jul 10, 2005

by Fluffdaddy
Plaster Town Cop

Inigo Quilez helpfully posts his own supplemental utility files somewhere, and now it's clear that he's a great instructor too. I'm so glad I clicked this video and dedicated part of an afternoon! Thank you! I still need to finish it to learn more about cone tracing and its limitations.

If it were a lecture, this video would probably occur at the end of a graphics course, after the learner has already experimented with shaders and drawn something with ray tracing math. Our graphics course traditionally only asked students to make CPU-based ray tracers. Doing it in the shader actually appears to be much simpler (and obviously gives a better frame rate), so my school perhaps shouldn't have continued that practice well into 2017.

Favorite parts so far:

-Seeing how SDFs can be truly easy to work with. Our course only dealt with exact analytical solutions to spheres/triangles, not iterative/numerical ones. But SDFs give you the power to draw so many more shapes with so little additional code.

-His cool trick to quickly cobble together a rotation matrix using Pythagorean triples. It makes sense that its columns are both orthogonal and unit-length (because he divides by the hypotenuse). (Worked example after this list.)

-His smooth step function. Wow. It blends shapes and is so simple and useful. Like meta-balls but much easier to work with. (Sketched after this list.)

-Just learning what's going through his head, in terms of aesthetic decisions like lighting / scene design.

-Seeing GLSL 3.0 features in use (I've been stuck in the WebGL 1.0 stone ages). Wow, those normal vectors sure were easy to get, even on non-analytical shapes.

-Lots of little math shortcuts to get something informative up on the screen (like a checkerboard pattern).
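[The two items flagged above, worked through. With the 3-4-5 triple, dividing by the hypotenuse gives exactly orthonormal columns:]

code:
R = (1/5) * |  4  -3 |     columns: (4/5, 3/5) and (-3/5, 4/5)
            |  3   4 |     each has length sqrt(16+9)/5 = 1, and their dot product is 0
[And the shape-blending function is usually written as a smooth minimum ("smin"); this is Inigo's well-known polynomial version, transcribed to C++:]

code:
#include <algorithm>

// Smoothly take the minimum of two SDF distances; k is the blend radius.
float smin(float a, float b, float k) {
    float h = std::clamp(0.5f + 0.5f * (b - a) / k, 0.0f, 1.0f);
    return b + (a - b) * h - k * h * (1.0f - h);   // = mix(b, a, h) - k*h*(1-h)
}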

Happy Thread
Jul 10, 2005

by Fluffdaddy
Plaster Town Cop

Suspicious Dish posted:

Most of the techniques you see on shadertoy, like raymarching, are generally not techniques you can use in game development, so be aware. There is some use for them, but a lot of the distance-field modelling you see there is better done with traditional modelling techniques.

While Inigo mentions that there are limitations, he kind of downplays them, answering a question by saying that video games can use these techniques, with no mention of when they should or shouldn't. I bet good engines like Unreal might occasionally have a need for raymarching SDFs to render some objects, and have the capability. Maybe they just do it for a few important foreground objects. That got me thinking about how an engine might get away with it for lots and lots of unique SDFs.

In Inigo's case, he avoided one big limitation by keeping the scene very simple, with few objects. If it were a complex scene, he'd be out of luck. If every single pixel had to raymarch and test against 10,000,000 SDF shapes, we'd never finish drawing the image.

I suppose any solution would have to partition space into volumes, assign each SDF function to one or more volumes, and make sure that an individual ray-march only has to consider a dozen or so "nearby" SDF functions, not the whole set.

I guess the process would look like:

code:
Suppose we have some data structure made of bounding volumes.
Suppose we've already walked it to find the most visible volumes to our camera that are also "occupied" by containing one or more SDF-based shapes.

for every object O that's probably in view, and its bounding volume B:

  Render the bounding volume itself (a cube or whatever), using traditional Z buffering.  The vertex shader is done.

  Our image is a bunch of 3D boxes for us to color in now.

  In the fragment shader, ray march the SDF functions, but only those SDFs that happen to be stored in partition B -- a very small list.  (We somehow passed the correct SDF functions to the GPU through the vertices, so we can know which boxes to draw as which SDFs.)

  Discard the pixels that miss any SDF shape, letting the stuff behind draw.
And I think the problem there is that last part -- "discard" is a performance hog, breaking the depth ordering assumptions, and not actually saving us any processing. We still end up uselessly running all the shader instances for all the fragments in the back that got occluded behind things. Hundreds of extra ray-marches per pixel won't be good.

The only thing I can think of is to do some kind of progressive stenciling process, whereby you step your rays increasingly farther into your scene each pass. That assumes you can easily walk each ray forward in your data structure CPU-side to test if anything's there. Now we're doing a per-pixel operation on the CPU, which is slow, but you'd end up drawing all your bounding volumes front-to-back in passes. After each pass it's the GPU's turn to march and draw the actual SDF functions. Hopefully each pass colors in more pixels that you then don't have to run fragment instances on in the next pass, until the frame is done.

There are probably better methods that I won't find by guessing. It seems I need to read some papers and books to see what the real accepted solutions are for SDFs. Surely game companies have found tricks to mass-render millions of objects whose explicit boundaries we don't know. Let me know if anyone has suggested keywords for reading.

Happy Thread fucked around with this message at 00:57 on Oct 7, 2019

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
There are engines that can bake out distance fields from geometry to smoothly blend between bits of geo. It exists in our engine (as a prototype, currently unused by any shipping games, but planned to ship in the next one).

Distance fields are a tool, and very good at building certain kinds of scenes. There's research into "what's beyond triangles" -- PS4's Dreams, http://unbound.io/ and other sculpting tools, and things like distance fields and raymarching. I think there's promise in it, and just like effects from the past, what was first only used for certain scenes on Ultra graphics quality might eventually become standard techniques.

Inigo is well-aware of the limitations of raytracing, but for the kinds of work and art he does, I think it's fine. He did ship that Facebook sculpting tool, after all.

Hubis
May 18, 2003

Boy, I wish we had one of those doomsday machines...

Dumb Lowtax posted:

While Inigo mentions that there are limitations, he kind of downplays them, answering a question by saying that video games can use these techniques, with no mention of when they should or shouldn't. I bet good engines like Unreal might occasionally have a need for raymarching SDFs to render some objects, and have the capability. Maybe they just do it for a few important foreground objects. That got me thinking about how an engine might get away with it for lots and lots of unique SDFs.

In Inigo's case, he avoided one big limitation by keeping the scene very simple, with few objects. If it were a complex scene, he'd be out of luck. If every single pixel had to raymarch and test against 10,000,000 SDF shapes, we'd never finish drawing the image.

I suppose any solution would have to partition space into volumes, assign each SDF function to one or more volumes, and make sure that an individual ray-march only has to consider a dozen or so "nearby" SDF functions, not the whole set.

I guess the process would look like:

code:
Suppose we have some data structure made of bounding volumes.
Suppose we've already walked it to find the most visible volumes to our camera that are also "occupied" by containing one or more SDF-based shapes.

for every object O that's probably in view, and its bounding volume B:

  Render the bounding volume itself (a cube or whatever), using traditional Z buffering.  The vertex shader is done.

  Our image is a bunch of 3D boxes for us to color in now.

  In the fragment shader, ray march the SDF functions, but only those SDFs that happen to be stored in partition B -- a very small list.  (We somehow passed the correct SDF functions to the GPU through the vertices, so we can know which boxes to draw as which SDFs.)

  Discard the pixels that miss any SDF shape, letting the stuff behind draw.
And I think the problem there is that last part -- "discard" is a performance hog, breaking the depth ordering assumptions, and not actually saving us any processing. We still end up uselessly running all the shader instances for all the fragments in the back that got occluded behind things. Hundreds of extra ray-marches per pixel won't be good.

The only thing I can think of is to do some kind of progressive stenciling process, whereby you step your rays increasingly farther into your scene each pass. That assumes you can easily walk each ray forward in your data structure CPU-side to test if anything's there. Now we're doing a per-pixel operation on the CPU, which is slow, but you'd end up drawing all your bounding volumes front-to-back in passes. After each pass it's the GPU's turn to march and draw the actual SDF functions. Hopefully each pass colors in more pixels that you then don't have to run fragment instances on in the next pass, until the frame is done.

There are probably better methods that I won't find by guessing. It seems I need to read some papers and books to see what the real accepted solutions are for SDFs. Surely game companies have found tricks to mass-render millions of objects whose explicit boundaries we don't know. Let me know if anyone has suggested keywords for reading.

Ray marching has its place in games. Most of the time, more advanced volumetric effects will either use a pseudo ray marching effect (voxelization to a frustum-aligned 3D texture and then summation from far to near plane) or actual ray marching. The latter is common for things like sky/cloud rendering; Horizon Zero Dawn, Battlefield, and Red Dead Redemption 2 are all examples. It can also find a place in rendering dynamic simulated fluids, where the fluid is modeled as a set of balls and they are raymarched to produce the actual fluid surface.

SDFs are used in UE4 for a certain class of soft object shadowing. In this case the SDF is (I believe) derived from a mesh and then voxelized into a 3D texture, and referencing the texture lets you use a kind of adaptive ray march step (since sampling it tells you the minimum amount you SHOULD need to step to intersect the object). I'm not sure where else off the top of my head.

Discard is only a performance hog if you are actually writing depth in the discarded pass (and only if you have valid depth to test against to begin with).
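[That adaptive step is the classic sphere tracing loop; a minimal sketch, with a hypothetical sdf() and vector type, C++ standing in for shader code:]

code:
struct Vec3 { float x, y, z; };
static Vec3 axpy(Vec3 o, Vec3 d, float t) {          // o + d * t
    return { o.x + d.x * t, o.y + d.y * t, o.z + d.z * t };
}

// March along the ray, stepping by the SDF value itself: the field
// guarantees no surface is closer than that, so each step is safe.
float sphereTrace(Vec3 origin, Vec3 dir, float (*sdf)(Vec3)) {
    float t = 0.0f;
    for (int i = 0; i < 128; ++i) {                   // step cap bounds worst case
        float d = sdf(axpy(origin, dir, t));
        if (d < 1e-3f) return t;                      // close enough: hit
        t += d;                                       // the adaptive step
        if (t > 100.0f) break;                        // past the far plane: miss
    }
    return -1.0f;                                     // miss
}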

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
oh man I just remembered who you are. thank you for your service on NSIGHT Graphics, sorry I didn't last long enough to really clean it up

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!
I'm working on porting a very old game to Windows. The game runs at a fixed 60Hz and changing the frame rate is not an option. Is there a way to determine the monitor refresh rate in windowed mode so I can skip or duplicate frames as necessary when the refresh rate isn't 60Hz? (I might use SyncInterval 0 instead, but I'm thinking that predictable skip/duplication rates are probably more consistent, and regardless of that, I still need a way of differentiating between 60Hz and anything else.)

Absurd Alhazred
Mar 27, 2010

by Athanatos

OneEightHundred posted:

I'm working on porting a very old game to Windows. The game runs at a fixed 60Hz and changing the frame rate is not an option. Is there a way to determine the monitor refresh rate in windowed mode so I can skip or duplicate frames as necessary when the refresh rate isn't 60Hz? (I might use SyncInterval 0 instead, but I'm thinking that predictable skip/duplication rates are probably more consistent, and regardless of that, I still need a way of differentiating between 60Hz and anything else.)

Just use a timer and have the graphics loop sleep the rest of the frame, or if you want to push frames out all the time, have your physics loop do that instead.

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!

Absurd Alhazred posted:

Just use a timer and have the graphics loop sleep the rest of the frame, or if you want to push frames out all the time, have your physics loop do that instead.
That's basically what would happen in the SyncInterval 0 option. The only reason I'm even worrying about this is because of the potential pathological case at 60Hz where the present time is close to a frame boundary, in which case timing jitter could cause it to randomly overwrite the pending frame before it's been displayed, causing frame drops and duplication. I dunno if that's really much of an issue, but using SyncInterval 1 at 60Hz would avoid it since it would never overwrite a pending frame.

The problem is I'm not sure how to tell if the refresh rate is actually 60Hz when not in exclusive fullscreen mode.

Lime
Jul 20, 2004

If you do want the refresh rate of the monitor a window is on, use EnumDisplaySettings. You need the display device name; you can get that by doing MonitorFromWindow then GetMonitorInfo, or just enumerate all monitors with EnumDisplayDevices and pick the primary or something.
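[Putting that recipe together; a hedged sketch with error handling elided:]

code:
#include <windows.h>

// window -> monitor -> device name -> current mode -> refresh rate
DWORD refreshRateForWindow(HWND hwnd) {
    HMONITOR mon = MonitorFromWindow(hwnd, MONITOR_DEFAULTTONEAREST);

    MONITORINFOEX mi = {};
    mi.cbSize = sizeof(mi);
    GetMonitorInfo(mon, &mi);                  // fills szDevice, e.g. "\\.\DISPLAY1"

    DEVMODE dm = {};
    dm.dmSize = sizeof(dm);
    EnumDisplaySettings(mi.szDevice, ENUM_CURRENT_SETTINGS, &dm);
    return dm.dmDisplayFrequency;              // in Hz; 0 or 1 means "default"
}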

Absurd Alhazred
Mar 27, 2010

by Athanatos

OneEightHundred posted:

That's basically what would happen in the SyncInterval 0 option. The only reason I'm even worrying about this is because of the potential pathological case at 60Hz where the present time is close to a frame boundary, in which case timing jitter could cause it to randomly overwrite the pending frame before it's been displayed, causing frame drops and duplication. I dunno if that's really much of an issue, but using SyncInterval 1 at 60Hz would avoid it since it would never overwrite a pending frame.

The problem is I'm not sure how to tell if the refresh rate is actually 60Hz when not in exclusive fullscreen mode.

If you're porting an old game to a modern computer, I think you're way likelier to find yourself near the start of a frame with nothing to do than near the end of a frame. What I would do to avoid jitter is make sure I'm measuring towards the next frame, not 1/60 of a second from the start of the frame. Have a frame counter you advance each frame, then multiply by 1.0/60.0 to get the current target end of frame. That way you won't be compounding errors on consecutive frames.

Another option, which should keep you from pushing a bunch of frames in quick succession if some other part of the program (or another running program) causes you to skip frames: measure time from the start of the run, divide by the frame time and use floor to get the last frame which should have finished, then calculate your sleep time to get to the end of the current frame.
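[Both ideas in one small sketch; the loop and tick function are hypothetical, std::chrono provides the clock:]

code:
#include <chrono>
#include <cstdint>
#include <thread>

void runAt60Hz(bool& running) {
    using clk = std::chrono::steady_clock;
    const double dt = 1.0 / 60.0;
    const auto start = clk::now();
    uint64_t frame = 0;

    while (running) {
        // updateAndRender();   // hypothetical game tick

        ++frame;
        // Catch-up: if we stalled, jump to the frame that should be current
        // instead of bursting out all the missed frames back to back.
        double elapsed = std::chrono::duration<double>(clk::now() - start).count();
        uint64_t due = static_cast<uint64_t>(elapsed / dt);
        if (due > frame) frame = due;

        // Target is always start + frame/60, so sleep error never accumulates.
        auto target = start + std::chrono::duration_cast<clk::duration>(
                                  std::chrono::duration<double>(frame * dt));
        std::this_thread::sleep_until(target);
    }
}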

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!

Absurd Alhazred posted:

If you're porting an old game to a modern computer, I think you're way likelier to find yourself near the start of a frame with nothing to do than near the end of a frame. What I would do to avoid jitter is make sure I'm measuring towards the next frame, not 1/60 of a second from the start of the frame. Have a frame counter you advance each frame, then multiply by 1.0/60.0 to get the current target end of frame. That way you won't be compounding errors on consecutive frames.
I'm not worried about it taking too long to do a frame; I'm worried about the relative timing of the presents and the vblank intervals. If I draw and present frames at 60Hz with SyncInterval 0, that might work fine, unless the presents are happening close to the time when the buffer is sent off to the DWM/monitor, in which case any timing jitter would cause the presents to randomly happen before or after that boundary, causing overwrites of the previous frame (drops) or frames where nothing was presented (duplicates). SyncInterval 1 would block until a buffer is freed up, putting it at the start of the interval, but it's synchronized with the monitor refresh rate (and maybe other things?).

I think I'll have to just try inspecting the timestamps or something.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
Monitor refresh rate is not guaranteed to be constant or sane. In a multimonitor scenario, DWM does not composite both monitors separately relative to their refresh rates. There's also the straddle case. Present frames as you generate them and DWM will present them when it feels like it.

Rhusitaurion
Sep 16, 2003

One never knows, do one?
I have a question about geometry shaders.

I'm using them to generate 3D geometry from 4D geometry. For example:

https://imgur.com/zD9J15J

The way this works is I have a tetrahedral mesh that I send into the geometry shader as lines_adjacency (since it gives you 4 points at a time - very convenient). There (and this is the sketchy part), I have a bunch of branchy code that determines whether each tetrahedron intersects the view 3-plane, and emits somewhere between 0 and 6 vertices (the 6 case is where the whole tetrahedron is in-plane) in a triangle strip.

It's a neat trick, but it seems sketchy. I'm no GPU wizard, but my understanding is that geometry shaders are slow, and branchy shaders are slow. Additionally, they don't seem to be supported in WebGL or Metal.

Is there any reasonable alternative for generating geometry that depends on transformed vertices? I could do this on the CPU, but I'd have to end up doing essentially all the vertex transforms there, which seems lovely. I could save a lot of work with some kind of BVH, but still. Compute shaders seem promising, but I think I'd have to send the transformed vertices back to the CPU to get the 4-to-many vertices thing.

Absurd Alhazred
Mar 27, 2010

by Athanatos

Rhusitaurion posted:

I have a question about geometry shaders.

I'm using them to generate 3D geometry from 4D geometry. For example:

https://imgur.com/zD9J15J

The way this works is I have a tetrahedral mesh that I send into the geometry shader as lines_adjacency (since it gives you 4 points at a time - very convenient). There (and this is the sketchy part), I have a bunch of branchy code that determines whether each tetrahedron intersects the view 3-plane, and emits somewhere between 0 and 6 vertices (the 6 case is where the whole tetrahedron is in-plane) in a triangle strip.

It's a neat trick, but it seems sketchy. I'm no GPU wizard, but my understanding is that geometry shaders are slow, and branchy shaders are slow. Additionally, they don't seem to be supported in WebGL or Metal.

Is there any reasonable alternative for generating geometry that depends on transformed vertices? I could do this on the CPU, but I'd have to end up doing essentially all the vertex transforms there, which seems lovely. I could save a lot of work with some kind of BVH, but still. Compute shaders seem promising, but I think I'd have to send the transformed vertices back to the CPU to get the 4-to-many vertices thing.

What if you used a compute shader to conditionally send vertices over to one of two vertex buffers, and only draw one of them?

Hubis
May 18, 2003

Boy, I wish we had one of those doomsday machines...

Rhusitaurion posted:

I have a question about geometry shaders.

I'm using them to generate 3D geometry from 4D geometry. For example:

https://imgur.com/zD9J15J

The way this works is I have a tetrahedral mesh that I send into the geometry shader as lines_adjacency (since it gives you 4 points at a time - very convenient). There (and this is the sketchy part), I have a bunch of branchy code that determines whether each tetrahedron intersects the view 3-plane, and emits somewhere between 0 and 6 vertices (the 6 case is where the whole tetrahedron is in-plane) in a triangle strip.

It's a neat trick, but it seems sketchy. I'm no GPU wizard, but my understanding is that geometry shaders are slow, and branchy shaders are slow. Additionally, they don't seem to be supported in WebGL or Metal.

Is there any reasonable alternative for generating geometry that depends on transformed vertices? I could do this on the CPU, but I'd have to end up doing essentially all the vertex transforms there, which seems lovely. I could save a lot of work with some kind of BVH, but still. Compute shaders seem promising, but I think I'd have to send the transformed vertices back to the CPU to get the 4-to-many vertices thing.

So this is a perfect example of how you would use geometry shaders; however, due to how they were spec'ed (and implemented in some hardware), it is problematic, as you say. There's nothing sketchy about what you are doing, but variable expansion (emitting a varying number of primitives) is bad for performance because the API still requires order preservation in all cases, meaning geometry shader output needs to serialize even when the output is not order-dependent. (If it were a fixed output size this wouldn't be a problem, because each invocation's write offset would still be known at execution time, so they could run in parallel.)

Mesh shaders ( https://devblogs.nvidia.com/using-turing-mesh-shaders-nvidia-asteroids-demo/ ) are NVIDIA's attempt at a Grand Unified Geometry Pipeline to fix both geometry shaders and tessellation, but they still aren't broadly available. For what you need to do, I would concur with the suggestion of using a compute shader that reads a vertex array as input and produces an index buffer as output. For optimal performance you might have to be creative by creating a local index buffer in shared memory and then appending it to your output IB as a single block (to preserve vertex reuse). Basically, have a shared memory array that is the size of your maximum possible triangles per dispatch, then compute your actual triangles into there and use atomic increment on a global counter to fetch and then increment a write offset into the output array by that amount. You are effectively reimplementing the GS behavior, but completely relaxing the order dependency.
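[A CPU model of that append pattern, with std::atomic standing in for an atomicAdd on a GPU-side counter; the names are hypothetical:]

code:
#include <algorithm>
#include <atomic>
#include <cstdint>
#include <vector>

std::atomic<uint32_t> g_indexCount{0};     // global write cursor (one per draw)

// Each "workgroup" first builds its triangles into a local list, then
// reserves space in the shared output with a single atomic add. Ordering
// across groups is relaxed, which is exactly what the GS can't relax.
void flushWorkgroup(const std::vector<uint32_t>& localIndices,
                    uint32_t* outIndexBuffer) {
    uint32_t base =
        g_indexCount.fetch_add(static_cast<uint32_t>(localIndices.size()));
    std::copy(localIndices.begin(), localIndices.end(), outIndexBuffer + base);
}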

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
Also worth pointing out that it becomes a lot easier if you generate indexed triangles instead of triangle strips, since you can just jam triangles through without having to guarantee well-formed strips. You can also output degenerate tris and do a draw call without any readback on the CPU, or you can store the index count in a separate buffer and use an indirect draw.
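[For the indirect-draw route, the layout the compute pass fills in is fixed by the API; in Vulkan terms, a sketch:]

code:
#include <vulkan/vulkan.h>

// The compute pass writes this struct into a GPU buffer (indexCount via the
// atomic counter); the CPU then records the draw without ever reading it back.
VkDrawIndexedIndirectCommand cmd = {};
cmd.indexCount    = 0;   // accumulated on the GPU by the compute pass
cmd.instanceCount = 1;
cmd.firstIndex    = 0;
cmd.vertexOffset  = 0;
cmd.firstInstance = 0;

// Later, in the command buffer (indirectBuffer holds the struct above):
// vkCmdDrawIndexedIndirect(commandBuffer, indirectBuffer, 0, 1,
//                          sizeof(VkDrawIndexedIndirectCommand));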

Rhusitaurion
Sep 16, 2003

One never knows, do one?

Hubis posted:

Basically, have a shared memory array that is the size of your maximum possible triangles per dispatch, then compute your actual triangles into there and use atomic increment on a global counter to fetch and then increment a write offset into the output array by that amount. You are effectively reimplementing the GS behavior, but completely relaxing the order dependency.

That makes sense, thanks! I've not really messed with compute shaders before, so I wasn't sure what is and isn't possible. I think this will be pretty doable, since the maximum number of vertices is not that much more than the number of input vertices.

Suspicious Dish posted:

Also worth pointing out that it becomes a lot easier if you generate indexed triangles instead of triangle strips, since you can just jam triangles through without having to guarantee well-formed strips.

I think using indexed triangles would actually make some of the logic easier as well, so that's good to know.

I'm thinking something like: have a vertex buffer with 6 output vertices per tetrahedron, one for each (4 choose 2) pair of its vertices, and then also an index buffer to connect up the ones that actually land in the view 3-plane, using Hubis's suggestion.

MrMoo
Sep 14, 2000

How straightforward is it to make a fragment shader that makes, idk, a basic quad look like this (ignoring the logos and text above)? Targeting WebGL2, i.e. GLSL 3.00 ES.



It would be nicer to replace a static background with a coloured smoke type thing, like this shadertoy:



https://www.shadertoy.com/view/ldBSDd

Is it better to find a freelancer to conjure up something? idk, I'm just researching and prototyping some scoreboard screens for MLB / NFL / NBA / NHL.

MrMoo fucked around with this message at 20:49 on Jan 10, 2020


Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
yeah, you can just render a quad with that shader. you'll have to bind the input uniforms correctly (iResolution and iTime) but other than that i think it should be pretty smooth sailing. fragCoord is basically gl_FragCoord, and the output color is basically gl_FragColor (in WebGL 1; WebGL 2 uses named outputs and supports multiple of them)
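[The usual wrapper looks something like this; a sketch where the GLSL lives in a C++ string, but it's the same source you'd hand to WebGL2. The uniform names match Shadertoy's conventions; where you paste the shader body is marked:]

code:
// GLSL ES 3.00 wrapper that adapts a Shadertoy image shader: declare the
// two uniforms, give the output a name, and call mainImage from main().
const char* kFragmentShader = R"glsl(#version 300 es
precision highp float;

uniform vec3  iResolution;   // viewport size in pixels (w, h, 1)
uniform float iTime;         // seconds since start

out vec4 outColor;           // replaces gl_FragColor

// ... paste the Shadertoy code here, which defines:
// void mainImage(out vec4 fragColor, in vec2 fragCoord) { ... }

void main() {
    mainImage(outColor, gl_FragCoord.xy);
}
)glsl";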
