Hubis posted:
Actually, having threads from the same subgroup hitting the same global atomic is ironically probably going to be faster, because you can coalesce the atomic within a warp (essentially as a parallel sum) and then do a single write to the global address.

Intel, at least, doesn't do this coalescing for you; switching to explicit subgroup ops made a shader of mine an order of magnitude faster.
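To make the aggregation idea concrete, here's a minimal CPU-side sketch, not shader code. It models lanes as array elements and assumes a subgroup size of 32; `kSubgroupSize`, `AtomicResult`, `add_per_lane` and `add_per_subgroup` are all hypothetical names made up for illustration. The inner loop stands in for a subgroup reduction such as GLSL's `subgroupAdd`, after which a single lane would issue the atomic.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model parameter: real subgroup sizes vary by hardware.
constexpr std::size_t kSubgroupSize = 32;

struct AtomicResult {
    std::uint32_t total;     // final value of the shared counter
    std::size_t atomic_ops;  // number of atomic adds issued
};

// Naive version: every lane issues its own atomic add.
AtomicResult add_per_lane(const std::vector<std::uint32_t>& values) {
    std::atomic<std::uint32_t> counter{0};
    std::size_t ops = 0;
    for (std::uint32_t v : values) {
        counter.fetch_add(v, std::memory_order_relaxed);
        ++ops;
    }
    return {counter.load(), ops};
}

// Aggregated version: sum each subgroup's lanes locally (the stand-in for
// a subgroup reduction), then issue one atomic add per subgroup.
AtomicResult add_per_subgroup(const std::vector<std::uint32_t>& values) {
    std::atomic<std::uint32_t> counter{0};
    std::size_t ops = 0;
    for (std::size_t base = 0; base < values.size(); base += kSubgroupSize) {
        const std::size_t end = std::min(base + kSubgroupSize, values.size());
        std::uint32_t partial = 0;
        for (std::size_t i = base; i < end; ++i) {
            partial += values[i];
        }
        counter.fetch_add(partial, std::memory_order_relaxed);
        ++ops;
    }
    return {counter.load(), ops};
}
```

Both functions produce the same total, but the aggregated one issues one atomic per subgroup instead of one per lane. This only models the operation count; the actual speedup depends on the hardware and driver.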
|
# ¿ Nov 3, 2020 18:38 |
|
|
# ¿ May 15, 2024 16:14 |
|
peepsalot posted:
Since the attributes are not interleaved, is there any way to have them indexed separately from the vertices?

There is not, but you can always use a storage buffer instead of a vertex attribute, and index it with whatever logic you like.
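As a rough illustration of the storage buffer approach, here's a CPU-side sketch of the lookup a vertex shader could do. Every name in it (`Vec3`, `AttributeIndices`, `PulledVertex`, `pull_vertex`) is hypothetical, and the vectors stand in for storage buffers. It assumes an OBJ-style mesh where each emitted vertex carries its own position and normal indices.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical types for illustration; none of these come from a real API.
struct Vec3 { float x, y, z; };

// One entry per emitted vertex: separate indices into each attribute array.
struct AttributeIndices {
    std::uint32_t position;
    std::uint32_t normal;
};

struct PulledVertex {
    Vec3 position;
    Vec3 normal;
};

// Stand-in for the vertex shader: use the vertex index (gl_VertexIndex in
// Vulkan GLSL) to read the per-attribute indices from one buffer, then
// fetch each attribute from its own buffer.
PulledVertex pull_vertex(std::uint32_t vertex_index,
                         const std::vector<AttributeIndices>& indices,
                         const std::vector<Vec3>& positions,
                         const std::vector<Vec3>& normals) {
    const AttributeIndices& idx = indices.at(vertex_index);
    return {positions.at(idx.position), normals.at(idx.normal)};
}
```

Here three vertices with distinct positions can all point at the same normal, which a single shared index buffer can't express without duplicating data.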
|
# ¿ Nov 30, 2020 06:36 |
|
Is your code validation clean? Wouldn't be shocked if you're hitting a MoltenVK or driver bug; "old integrated Mac" is almost the worst environment you could've chosen. It's also not really an unrealistic shader at all; most games do something similar for full-screen passes.
Ralith fucked around with this message at 23:49 on Nov 23, 2022 |
# ¿ Nov 23, 2022 23:42 |
|
I think they're still not widely supported, so if you use them it can only be as an optional extra, which makes it a harder sell.
|
# ¿ Mar 19, 2024 18:29 |