Raenir Salazar
Nov 5, 2010

College Slice
I finally did it!




Holy poo poo this was surprisingly difficult.

It seems there isn't a way to pull the materials from the mesh unless there are textures. You know how in Blender you can set a face to have a plain diffuse color? I can't seem to handle that on its own and ignore textures; the material has to have a texture before I can get at its diffuse.

ninja edit: Just to double-check, the obviously terrible-looking flat faces making up the sphere are mostly noticeable because I'm doing everything in the vertex shader, right?


HiriseSoftware
Dec 3, 2004

Two tips for the wise:
1. Buy an AK-97 assault rifle.
2. If there's someone hanging around your neighborhood you don't know, shoot him.

Raenir Salazar posted:

I finally did it!




Holy poo poo this was surprisingly difficult.

It seems there isn't a way to pull the materials from the mesh unless there are textures. You know how in Blender you can set a face to have a plain diffuse color? I can't seem to handle that on its own and ignore textures; the material has to have a texture before I can get at its diffuse.

ninja edit: Just to double-check, the obviously terrible-looking flat faces making up the sphere are mostly noticeable because I'm doing everything in the vertex shader, right?

You can have a vertex array of color attributes (like texture coordinates or normals) but to allow each face to have its own color you'd have to "unshare" the vertices so that one vertex is only used by one face. That increases the data processed by the video card by up to 3x, but it's what you'd need to do.
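To make the unsharing concrete, here's a minimal CPU-side sketch (the struct and function names are made up for illustration): every triangle gets three brand-new vertices carrying that face's color, which is exactly why the vertex count can grow up to 3x.

```cpp
#include <array>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Indexed mesh in, "unshared" mesh out: every face gets three unique
// vertices, each carrying that face's color as a per-vertex attribute.
struct UnsharedMesh {
    std::vector<Vec3> positions;  // 3 entries per face
    std::vector<Vec3> colors;     // duplicated per vertex, one flat color per face
};

UnsharedMesh unshareForFaceColors(const std::vector<Vec3>& verts,
                                  const std::vector<std::array<unsigned, 3>>& faces,
                                  const std::vector<Vec3>& faceColors) {
    UnsharedMesh out;
    out.positions.reserve(faces.size() * 3);
    out.colors.reserve(faces.size() * 3);
    for (std::size_t f = 0; f < faces.size(); ++f) {
        for (unsigned idx : faces[f]) {
            out.positions.push_back(verts[idx]);   // copy, breaking the sharing
            out.colors.push_back(faceColors[f]);   // same color for all 3 corners
        }
    }
    return out;
}
```

Two triangles sharing an edge (4 unique vertices) come out as 6 vertices, matching the "up to 3x" data blowup mentioned above.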

Raenir Salazar
Nov 5, 2010

College Slice

HiriseSoftware posted:

You can have a vertex array of color attributes (like texture coordinates or normals) but to allow each face to have its own color you'd have to "unshare" the vertices so that one vertex is only used by one face. That increases the data processed by the video card by up to 3x, but it's what you'd need to do.

To populate that vertex array, though, do you know if that can be done automatically with Assimp, extracting it from the .mtl and .obj files?

HiriseSoftware
Dec 3, 2004

Two tips for the wise:
1. Buy an AK-97 assault rifle.
2. If there's someone hanging around your neighborhood you don't know, shoot him.

Raenir Salazar posted:

To populate that vertex array, though, do you know if that can be done automatically with Assimp, extracting it from the .mtl and .obj files?

I have no experience with that library, but I would hope that it does. The unsharing would have to work for texture coordinates as well - if two faces sharing the same vertex had different UVs, then you'd have to split that vertex into two unique ones.

And it says it supports loading color channels.
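Assuming you've already fetched each material's diffuse color (if I remember the Assimp API right, that's the mesh's mMaterialIndex plus aiGetMaterialColor with AI_MATKEY_COLOR_DIFFUSE), expanding it into the per-vertex attribute array is just a loop. A sketch with illustrative names, independent of any particular loader:

```cpp
#include <vector>

struct Color { float r, g, b, a; };

// Given one diffuse color per material and a face->material map, expand
// to a flat per-vertex color array matching an unshared vertex layout
// (3 vertices per triangle).
std::vector<Color> buildVertexColors(const std::vector<Color>& materialDiffuse,
                                     const std::vector<unsigned>& faceMaterial) {
    std::vector<Color> out;
    out.reserve(faceMaterial.size() * 3);
    for (unsigned matIndex : faceMaterial) {
        const Color& c = materialDiffuse[matIndex];
        out.push_back(c);  // one copy per corner of the triangle
        out.push_back(c);
        out.push_back(c);
    }
    return out;
}
```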

Raenir Salazar
Nov 5, 2010

College Slice

HiriseSoftware posted:

I have no experience with that library, but I would hope that it does. The unsharing would have to work for texture coordinates as well - if two faces sharing the same vertex had different UVs, then you'd have to split that vertex into two unique ones.

And it says it supports loading color channels.

Saying it supports a thing isn't the same as it being intuitive to actually implement said thing. :v:

Nevertheless, now that I finally got textures working I'll stick to that until I have the time to try to experiment. Voila:

Fellatio del Toro
Mar 21, 2009

I'm looking for some guidance and/or references on proper VBO usage, specifically for 2D applications.

I'm working on a 2D engine and, from what I've figured out, transparency in sprites pretty much demands back-to-front rendering. (I've looked a bit into order independent techniques but they seem like overkill for 2D) What I'm not clear on is the best way to handle frequent insertion/reordering of polygons while using something more efficient than immediate rendering. I can think of several different ways it could be done:

0) Sorting vertices within a VBO
-I think this one is impossible

1) Rebuilding one big VBO every frame and drawing the whole thing at once
-Is this any faster than just using an array?

2) Using large, unordered VBOs and calling draw functions individually
-Seems to be calling a number of gl functions comparable to immediate rendering, but I'd guess not resending vertex data is faster at least

3) Individual VBOs for every sprite
-Not sure this would improve performance much at all

4) Use one set of vertices and transform it for every sprite
-I have no idea how costly matrix transformations are

Of course the correct answer is probably that it's 2D and will render fine at like 500FPS no matter how I do it but that's neither here nor there :shobon:

Fellatio del Toro fucked around with this message at 05:49 on Apr 13, 2014

Madox
Oct 25, 2004
Recedite, plebes!
I've had to answer this kind of question myself for my UI code that draws the 2D UI over top of the scene. It has transparency and UI element layering, so it has the same issues.

0) You think right
1) This is what I do. Note you aren't re-alloc'ing the VBO and recreating it; you're just altering the values in the existing one. If I suddenly need more quads than I have room for, I double the size of the VBO and recreate it, so I'll probably have enough room for a while. (You can also extend it by 1024 or whatever works.)
2) Fewer draw calls is best
3) Fewer draw calls is best
4) You could do this, but I think it's pointless. You'd still have to pass in position and size info to transform the one VBO, which means a large array of vecs or mats, and how efficient is that? You also run into card limitations on array sizes. Maybe read the matrices from a texture, but that still needs to be rebuilt each frame. Anyhow, the amount of extra code involved makes me frown on it.
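A sketch of the grow-by-doubling staging buffer behind option 1 (CPU side only; names are illustrative). On the GL side you'd recreate the buffer with glBufferData when the capacity changes and update the contents each frame with glBufferSubData or glMapBuffer:

```cpp
#include <cstddef>
#include <vector>

// Grow-by-doubling CPU staging buffer for a quad batch, mirroring the
// "double the VBO when out of room" strategy described above.
struct QuadBatch {
    std::vector<float> staging;      // 4 verts * 2 floats per quad
    std::size_t capacityQuads = 16;
    std::size_t usedQuads = 0;
    bool needsRealloc = false;       // signal to recreate the GL buffer

    QuadBatch() { staging.resize(capacityQuads * 8); }

    void beginFrame() { usedQuads = 0; needsRealloc = false; }

    void pushQuad(const float xy[8]) {
        if (usedQuads == capacityQuads) {
            capacityQuads *= 2;                // doubling, as in the post
            staging.resize(capacityQuads * 8);
            needsRealloc = true;               // GL buffer must be recreated
        }
        float* dst = &staging[usedQuads * 8];
        for (int i = 0; i < 8; ++i) dst[i] = xy[i];
        ++usedQuads;
    }
};
```

When needsRealloc is set you'd call glBufferData with the new size once; otherwise a single glBufferSubData upload of usedQuads * 8 floats per frame suffices.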

Raenir Salazar
Nov 5, 2010

College Slice
Any good tutorials on level-of-detail algorithms, or is it just adding and removing triangles based on distance? How easy is it to access and fiddle with the VBOs in that instance? I don't think tessellation is the right tool here.

UraniumAnchor
May 21, 2006

Not a walrus.
Couple of related questions regarding meshes:

1) How much of a performance gain does it actually tend to be when a relatively complex mesh is composed of strips using primitive restart rather than discrete triangles? (still using indexed vertices, of course)
2) Is there a good algorithm that can 'bake' a triangle mesh into strips suitable for use with primitive restart? Ideally something that will be called the first time the game loads and then the result gets saved to a cache somewhere. As close to optimal as possible assuming that the complexity doesn't get worse than, say, n^2.

Hubis
May 18, 2003

Boy, I wish we had one of those doomsday machines...

UraniumAnchor posted:

Couple of related questions regarding meshes:

1) How much of a performance gain does it actually tend to be when a relatively complex mesh is composed of strips using primitive restart rather than discrete triangles? (still using indexed vertices, of course)
2) Is there a good algorithm that can 'bake' a triangle mesh into strips suitable for use with primitive restart? Ideally something that will be called the first time the game loads and then the result gets saved to a cache somewhere. As close to optimal as possible assuming that the complexity doesn't get worse than, say, n^2.

Are the meshes static or dynamic?

If they are static, then you might not see any performance gain at all unless you are vertex-shader bound. Even if you are VS-bound, as long as the vertices are indexed the gains will vary depending on how cache-efficient your vertex ordering is (since the hardware skips vertices whose results are still in the cache) and how wide/efficient your vertex data itself is.

If the meshes are dynamic, then stripping gets more important, since it means touching less memory when the buffer is updated. That has more to do with CPU workload, however, so if you're GPU-bound it may not amount to much.
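One way to put a number on "cache-efficient ordering": simulate a small FIFO post-transform cache over the index buffer and compute the average cache miss ratio (misses per triangle). A rough sketch; real hardware caches differ in size and replacement policy, so treat this as a relative metric only:

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <vector>

// Average Cache Miss Ratio of an index buffer under a FIFO
// post-transform vertex cache of the given size. 3.0 means every index
// misses; well-optimized orderings typically land well below that.
double acmr(const std::vector<unsigned>& indices, std::size_t cacheSize) {
    std::deque<unsigned> cache;
    std::size_t misses = 0;
    for (unsigned idx : indices) {
        if (std::find(cache.begin(), cache.end(), idx) == cache.end()) {
            ++misses;                 // vertex shader has to run
            cache.push_back(idx);
            if (cache.size() > cacheSize) cache.pop_front();
        }
    }
    return static_cast<double>(misses) / (indices.size() / 3);
}
```

Comparing the metric before and after a reordering pass gives a quick estimate of whether the optimization is worth shipping.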

UraniumAnchor
May 21, 2006

Not a walrus.
Static, I'm mostly wondering if it's worth it to bake some mesh data in this manner before shipping it. I'd be a little surprised, but it's intended for mobile devices which have all sorts of surprising performance bottlenecks.

Hubis
May 18, 2003

Boy, I wish we had one of those doomsday machines...

UraniumAnchor posted:

Static, I'm mostly wondering if it's worth it to bake some mesh data in this manner before shipping it. I'd be a little surprised, but it's intended for mobile devices which have all sorts of surprising performance bottlenecks.

OK, for mobile devices this might make a lot more sense, since they're often tile-based ("chunkers") and vertex load gets multiplied across tiles. The chance of being vertex bound is greater, but the general trends are the same: it will matter much more with wide per-vertex data than with small, and it depends on how much computation is actually saved by re-using results already in the cache.

UraniumAnchor
May 21, 2006

Not a walrus.
That's what I figured.

I suppose my third question is where I might find a good algorithm that does so in a reasonable span of time.

"Reasonable" here meaning that it's alright if it takes a while since I can just bake it into the package (or worst case store it in the device cache one time and only rebuild it on a reinstall).

There seems to be an abundance of packages that will claim to do it FOR you, but that's not really what I want either unless it's open source in some way.

Hubis
May 18, 2003

Boy, I wish we had one of those doomsday machines...

UraniumAnchor posted:

That's what I figured.

I suppose my third question is where I might find a good algorithm that does so in a reasonable span of time.

"Reasonable" here meaning that it's alright if it takes a while since I can just bake it into the package (or worst case store it in the device cache one time and only rebuild it on a reinstall).

There seems to be an abundance of packages that will claim to do it FOR you, but that's not really what I want either unless it's open source in some way.

take a look at Assimp -- it has a model import flag for "optimizing meshes" that might give you what you're looking for.

Xerophyte
Mar 17, 2008

This space intentionally left blank

Hubis posted:

take a look at Assimp -- it has a model import flag for "optimizing meshes" that might give you what you're looking for.

Assimp just reorders the triangle indices for basic cache optimization. It's a pretty simple greedy algorithm from what I can tell (see source) and probably not bad, but it's not going to be all that optimal for striping, and something like OpenCCL should do better at the ordering task.

I'm not up to date with striping but Vaněček's thesis was a good overview when I last heard anything. Said thesis is apparently 9 years old now and probably dated but one of his chief results was that optimizing the index buffer order for cache coherency was more of a performance gain than any of the striping techniques on the hardware of the day. He compared with Bogomjakov's algorithm for index ordering which I think generally does worse than COL (which backs OpenCCL).

On open-source triangle striping implementations: apparently OpenSceneGraph has one. I just skimmed the code and it looks like basic greedy stuff, probably not that optimal, but it could be something to start with.

[e: On reflection, OpenCCL may be less competitive on low poly game meshes and I'm overestimating it because it's nice at work. Still, point is, even greedy vertex order optimization is likely to be more helpful (and cheaper) than explicit striping.]

Xerophyte fucked around with this message at 00:07 on Apr 16, 2014

Fellatio del Toro
Mar 21, 2009

Madox posted:

I've had to answer this kind of question myself for my UI code that draw the 2D UI over top of the scene. It has transparency and UI element layering so it has the same issues.

0) You think right
1) This is what I do. Note you aren't re-alloc'ing the VBO and recreating it, you are just altering the values in the existing one. If I suddenly need more quads than I have room for, I double the size of the VBO and recreate it, so I probably will have enough room for a while. (You can also extend it by 1024 or whatever works)
2) Less draw calls is best
3) Less draw calls is best
4) You could do this but I think its pointless. You'd still have to pass in position and size info to transform the one VBO, which means a large array of vecs or mats, but how efficient is it? Also you run in to card limitations on array sizes. Maybe read the matrixs from a texture, which still needs to be rebuilt each frame. Anyhow the amount of extra code involved makes me frown on it.

Well, I've started working on this and pretty much everything I do raises more questions than it answers. I'm starting to develop a real appreciation for graphics optimization at least.

For example:

I was previously handling rotation with matrix transformations, but with VBOs I'm not sure if I should continue to do so (requiring extra draw calls) or just precompute the rotation, since I'm doing a lot of modification and reordering of vertices anyway. Or maybe do something with shaders, but I haven't even touched that stuff yet.

Raenir Salazar
Nov 5, 2010

College Slice
OH GOD! Why did I make my model with a billion faces, it makes EVERYTHING harder!

Madox
Oct 25, 2004
Recedite, plebes!

Fellatio del Toro posted:

Well, I've started working on this and pretty much everything I do raises more questions than it answers. I'm starting to develop a real appreciation for graphics optimization at least.

For example:

I was previously handling rotation with matrix transformations, but with VBOs I'm not sure if I should continue to do so (requiring extra draw calls) or just precompute the rotation, since I'm doing a lot of modification and reordering of vertices anyway. Or maybe do something with shaders, but I haven't even touched that stuff yet.

Ah, I'm lucky that none of my UI elements rotate, so I didn't have to deal with that. I think you don't want more than a single draw call, really, unless you need to change textures or something like that (or separate glowy ones from non-glowy ones, etc.). As an aside: you could build a texture atlas, or bind multiple textures and add a float to the vert to pick which texture to use...

Also, "don't make the CPU do any work that the shader can do" tends to be a good rule of thumb, so I wouldn't precompute it. You need to get the rotation parameters into the shader: the origin of the sprite (center of rotation) and the angle. Pass these three floats in as vertex attributes. It's 2D rotation and GLSL has sin/cos, so I assume cards nowadays have sin/cos built in? I used to use a Taylor series for it... Anyhow, you can compute the composite matrix in the shader that translates/rotates/untranslates.

You'd need all 4 verts to duplicate those 3 floats, which isn't a huge deal. Though I might try to pack those float triplets into a 1D texture for fun; then it's just 1 float per vert to do the texture lookup and get the 3 floats you need.

Well, it's pretty late and this is off the top of my head, so I'm sure someone can correct me.
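The translate/rotate/untranslate composite described above boils down to this per-vertex math, whether it runs on the CPU or in the vertex shader. A sketch in C++ for clarity:

```cpp
#include <cmath>

struct Point { float x, y; };

// Rotate p by `angle` radians around center c: translate so c is the
// origin, apply the 2D rotation matrix, translate back. This is the
// same composite the vertex shader version would build per vertex.
Point rotateAround(Point p, Point c, float angle) {
    float s = std::sin(angle), co = std::cos(angle);
    float dx = p.x - c.x, dy = p.y - c.y;
    return { c.x + dx * co - dy * s,
             c.y + dx * s + dy * co };
}
```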

Fellatio del Toro
Mar 21, 2009

Madox posted:

Ah, I'm lucky that none of my UI elements rotate, so I didn't have to deal with that. I think you don't want more than a single draw call, really, unless you need to change textures or something like that (or separate glowy ones from non-glowy ones, etc.). As an aside: you could build a texture atlas, or bind multiple textures and add a float to the vert to pick which texture to use...

Anyone have a good sample or reference on how to do this? Ideally I'd like to make my engine flexible enough to be able to load an indeterminate number of textures and switch between them in the shader. I'm not sure if it's worth the effort/complexity over just rebinding textures.

Can I make a uniform sampler array of indeterminate size? I've yet to find an example of a shader using a uniform array that didn't explicitly define the size at compile time.

Also I'm finding mixed information about using textures of varying size in a sampler array.

And finally, how do I go about getting the array index to the fragment shader? It seems like the only way is to set it as a float vertex attribute for all vertices, then use a varying float to get it to the fragment shader. Is there a better way?

Madox
Oct 25, 2004
Recedite, plebes!
I can add more info since it was my comment :p At one point I did this to pick between grass, rock, and sand for a terrain (although at the time it was DirectX; not sure if I can find that code).

You have to use a varying float to get the index to the pixel shader, and it will be interpolated, so you have to floor it or do some other math to recover the whole number. I don't know about an array of samplers. You could use a cube map, I guess? But if you want different-sized textures, I'm not sure. I set them all as separate uniforms and had a bunch of ifs.

Also, for efficiency: if you have 4 textures and are picking one of them using the array index, the shader will sample every texture and THEN throw 3 away, so be aware of that (unless cards are a lot better now; I haven't looked at the asm in a while). It's not picking one sample to do.

As I understand it, all branching is done this way. Both branches run always, and then one is thrown away.

Raenir Salazar
Nov 5, 2010

College Slice
I'm trying to do picking in a modern OpenGL context. I am picking out individual vertices; I know I could probably cheat and check "is ray origin == vertex*MVP?" and be done, but I feel I should do it the proper way, one that can also be applied to objects/meshes.

I think I got it to work using a ray-intersection implementation with a small bounding box for each vertex, but I have over 6000 of the bastards.

My first thought was to subdivide them; since they're all basically just coordinates, I could have vector arrays that correspond to the four quadrants of the screen: +X&+Y, +X&-Y, -X&-Y, -X&+Y. But that's still ~1500 vertices per quadrant, which still means several seconds before there's a response. I'm not sure why the tests for intersection run so slowly, but they do.

Before I go and experiment with an actual physics engine that apparently handles acceleration for me, I'd be interested in trying to implement a binary space partitioning tree, which I think follows intuitively from my idea above: a recursive tree structure to quickly cull through all my vertex objects.

Are there any good tutorials with example code, preferably in an OpenGL context, that people could recommend?
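For reference, the per-box test itself should be very cheap; the standard slab method is only a handful of multiplies and min/max ops, so if 6000 of them take seconds, something else is likely wrong (per-test allocations, a debug build, etc.). A sketch of the slab test, assuming dirInv = 1/direction is precomputed once per ray (names are illustrative):

```cpp
#include <algorithm>

struct Vec3 { float x, y, z; };

// Slab-method ray/AABB intersection: intersect the ray's t-interval
// with each axis slab; the box is hit if the interval stays non-empty
// and ends in front of the origin.
bool rayHitsAABB(Vec3 origin, Vec3 dirInv, Vec3 boxMin, Vec3 boxMax) {
    float t1 = (boxMin.x - origin.x) * dirInv.x;
    float t2 = (boxMax.x - origin.x) * dirInv.x;
    float tmin = std::min(t1, t2), tmax = std::max(t1, t2);

    t1 = (boxMin.y - origin.y) * dirInv.y;
    t2 = (boxMax.y - origin.y) * dirInv.y;
    tmin = std::max(tmin, std::min(t1, t2));
    tmax = std::min(tmax, std::max(t1, t2));

    t1 = (boxMin.z - origin.z) * dirInv.z;
    t2 = (boxMax.z - origin.z) * dirInv.z;
    tmin = std::max(tmin, std::min(t1, t2));
    tmax = std::min(tmax, std::max(t1, t2));

    return tmax >= std::max(tmin, 0.0f);  // hit in front of the origin
}
```

A BSP tree, octree, or even a flat uniform grid then mostly saves you the 6000 iterations, not the per-test cost.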

High Protein
Jul 12, 2009

Fellatio del Toro posted:

Anyone have a good sample or reference on how to do this? Ideally I'd like to make my engine flexible enough to be able to load an indeterminate number of textures and switch between them in the shader. I'm not sure if it's worth the effort/complexity over just rebinding textures.

Can I make a uniform sampler array of indeterminate size? I've yet to find an example of a shader using a uniform array that didn't explicitly define the size at compile time.

Also I'm finding mixed information about using textures of varying size in a sampler array.

And finally, how do I go about getting the array index to the fragment shader? It seems like the only way is to set it as a float vertex attribute for all vertices, then use a varying float to get it to the fragment shader. Is there a better way?

Look into Array Textures. I'm more versed in Direct3D but there you don't have to specify the size in the shader, you can however retrieve the x/y/z size of the bound array. All elements must be the same size though. Using array textures together with instancing (where the per-instance data specifies what texture to use) allows you to draw a huge amount of stuff with a single draw call.

Raenir Salazar posted:

I'm trying to do picking in a modern OpenGL context. I am picking out individual vertices; I know I could probably cheat and check "is ray origin == vertex*MVP?" and be done, but I feel I should do it the proper way, one that can also be applied to objects/meshes.

I think I got it to work using a ray-intersection implementation with a small bounding box for each vertex, but I have over 6000 of the bastards.

My first thought was to subdivide them; since they're all basically just coordinates, I could have vector arrays that correspond to the four quadrants of the screen: +X&+Y, +X&-Y, -X&-Y, -X&+Y. But that's still ~1500 vertices per quadrant, which still means several seconds before there's a response. I'm not sure why the tests for intersection run so slowly, but they do.

Before I go and experiment with an actual physics engine that apparently handles acceleration for me, I'd be interested in trying to implement a binary space partitioning tree, which I think follows intuitively from my idea above: a recursive tree structure to quickly cull through all my vertex objects.

Are there any good tutorials with example code, preferably in an OpenGL context, that people could recommend?

I can see wanting to implement this for the heck of it, but if it suits your purposes, per-pixel picking is pretty easy to implement. I used this guide: http://trac.bookofhook.com/bookofhook/trac.cgi/wiki/MousePicking

In combination with D3D11 it meant creating a cpu-accessible texture, copying the clicked pixel of the z-buffer there, and transforming it back.

Of course, if you want to know what object was clicked instead of just the world position, one solution is to render everything with a specific color and use that to determine what was clicked.

High Protein fucked around with this message at 23:48 on Apr 22, 2014
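The bookkeeping for the render-an-ID-color approach is just packing an object ID into the RGB channels for the ID pass and unpacking the pixel you read back (e.g. with glReadPixels). A sketch:

```cpp
#include <array>
#include <cstdint>

// Pack a 24-bit object ID into RGB bytes for an ID render pass.
std::array<std::uint8_t, 3> idToColor(std::uint32_t id) {
    return { static_cast<std::uint8_t>(id & 0xFF),
             static_cast<std::uint8_t>((id >> 8) & 0xFF),
             static_cast<std::uint8_t>((id >> 16) & 0xFF) };
}

// Unpack the pixel read back from the ID pass into the object ID.
std::uint32_t colorToId(std::array<std::uint8_t, 3> rgb) {
    return static_cast<std::uint32_t>(rgb[0])
         | (static_cast<std::uint32_t>(rgb[1]) << 8)
         | (static_cast<std::uint32_t>(rgb[2]) << 16);
}
```

Render the scene once with each object's flat ID color (no lighting, no MSAA, since blending or filtering would corrupt the IDs), read back the clicked pixel, and decode.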

nebby
Dec 21, 2000
resident mog
I'm working through the OpenGL superbible and one of the examples wasn't working. I figured it out, but have no idea why the change I made should make a difference. I'll post more of the code if this isn't enough for someone to explain.

In their vertex shader, they have a block uniform declared as follows for tracking raindrop state:

code:
struct droplet_t
{
  float x_offset;
  float y_offset;
  float orientation;
  float unused;
};

layout (std140) uniform Droplets {
  droplet_t droplet[256];
};
and were referencing it in their shader as:
code:
droplet[i].x_offset
Etc. The shader compiles and runs, and it looks like they are loading the uniform correctly from the C++ side, but the program wasn't working because accessing the uniform members seemed to always yield zero from within the shader. The fix was to change the uniform declaration to have an instance name:
code:
layout (std140) uniform Droplets {
  droplet_t droplet[256];
} droplets;
and do
code:
droplets.droplet[i].x_offset
Etc. When I make this small change, the data is there and things work as expected. I thought maybe there was a name collision or something with another variable in the shader, but no. Any idea why introducing an instance name would fix things? My OpenGL version is 4.2 on Windows with an ATI card.
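Separate from the scoping question, the std140 layout of that block is worth sanity-checking against the C++ side: a struct of four floats occupies 16 bytes, the array stride rounds up to a vec4 (16-byte) boundary, so the whole block should be 256 * 16 = 4096 bytes. The arithmetic (no GL calls, just the std140 rules for this particular struct):

```cpp
#include <cstddef>

// std140 for droplet_t { float x_offset, y_offset, orientation, unused; }:
// four tightly packed floats, struct size rounded up to a vec4 boundary,
// and that rounded size is the array element stride.
constexpr std::size_t kStructSize  = 4 * sizeof(float);             // 16 bytes
constexpr std::size_t kArrayStride = (kStructSize + 15) / 16 * 16;  // still 16
constexpr std::size_t kBlockSize   = 256 * kArrayStride;            // 4096

// Byte offset of droplet[i].x_offset within the buffer (first member).
constexpr std::size_t offsetOfXOffset(std::size_t i) {
    return i * kArrayStride;
}
```

If the C++ side filled the buffer with a different stride (say, a packed 16-byte struct but a vec4-per-member layout, or vice versa), every element past the first would read garbage or zero, which is the other classic cause of the symptom described above.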

Raenir Salazar
Nov 5, 2010

College Slice
I just can't deal with the unresolved external errors I get whenever I try to use a library. Right now it's Bullet Physics: I try to include it and it just refuses, giving me nothing but unresolved linker errors. How do people not go insane from these? It doesn't help that there seem to be no instructions for how to include it.

I just use CMake, then open the .sln file and build ALL_BUILD, and then it doesn't work. Why is this?

Tres Burritos
Sep 3, 2009

Raenir Salazar posted:

I just can't deal with the unresolved external errors I get whenever I try to use a library. Right now it's Bullet Physics: I try to include it and it just refuses, giving me nothing but unresolved linker errors. How do people not go insane from these? It doesn't help that there seem to be no instructions for how to include it.

I just use CMake, then open the .sln file and build ALL_BUILD, and then it doesn't work. Why is this?

Hahaha, you're going down the exact same path I did and having the exact same problems. I spent waaaay too much time getting Bullet to link in as well. I'm no Linux guru, so everything took a couple of tries (hours).

Raenir Salazar
Nov 5, 2010

College Slice

Tres Burritos posted:

Hahaha, you're going down the exact same path I did and having the exact same problems. I spent waaaay too much time getting Bullet to link in as well. I'm no Linux guru, so everything took a couple of tries (hours).

I'm using Windows and Visual Studio :smith:

I feel like the problem is probably some Visual Studio setting that obviously every coder knows to check/flip/switch/enable/disable/enter/remove, so of course why bother including it in the instructions or FAQ? Like some kind of overly extended hazing ritual between programmers.

e: I went and followed the getting started tutorial and I am cautiously optimistic.

e2: Success! No more errors! I find it kinda weird that I need to include their projects, and GitHub is likely going to choke on the 5000 new files, but it's okay!

Edit3: Aw gently caress, I wish I was kidding; GitHub now refuses to even load the changes :smith:

Raenir Salazar fucked around with this message at 20:29 on Apr 25, 2014

MarsMattel
May 25, 2001

God, I've heard about those cults Ted. People dressing up in black and saying Our Lord's going to come back and save us all.
In Visual Studio you'll need to explicitly add the lib to the linker input. Pre-VS2010 you could also use project dependencies to manage this; in 2010+ you need to add the project itself as a dependency.

It's probably not explicitly mentioned since it's toolchain-dependent.

TZer0
Jun 22, 2013
I'm not quite sure where to post this, but I need an OpenGL-program and I thought this might be the place to ask about it (even though this is more of a technical thread). If you know a more suited thread to post this question, please let me know.

I need some sort of (nice-looking) demo scene with the following characteristics:
  • Open source under some friendly license (GPL, MIT, what have you - I will publish my modifications anyway)
  • GLSL 330 or later (I can perhaps port it from 1.3)
  • It must run on Linux/easily portable to Linux.
  • Preferably non-messy code.
  • Preferably a section with transparency.

It could also be a game.

This is for research purposes.

If you have something or know about something I could use - please tell me, my searches have turned up nothing.

Raenir Salazar
Nov 5, 2010

College Slice

TZer0 posted:

I'm not quite sure where to post this, but I need an OpenGL-program and I thought this might be the place to ask about it (even though this is more of a technical thread). If you know a more suited thread to post this question, please let me know.

I need some sort of (nice-looking) demo scene with the following characteristics:
  • Open source under some friendly license (GPL, MIT, what have you - I will publish my modifications anyway)
  • GLSL 330 or later (I can perhaps port it from 1.3)
  • It must run on Linux/easily portable to Linux.
  • Preferably non-messy code.
  • Preferably a section with transparency.

It could also be a game.

This is for research purposes.

If you have something or know about something I could use - please tell me, my searches have turned up nothing.

Well, here's what I've been working on: Link to video. Originally the functionality for this was in 2.1, so I've updated it to 3.3. I'll be putting in a walking animation and adding more advanced shaders; if this is acceptable, I could try getting the code to compile and run on Ubuntu.

TZer0
Jun 22, 2013

Raenir Salazar posted:

Well, here's what I've been working on: Link to video. Originally the functionality for this was in 2.1, so I've updated it to 3.3. I'll be putting in a walking animation and adding more advanced shaders; if this is acceptable, I could try getting the code to compile and run on Ubuntu.

Sorry, I have a tech demo much akin to the one you have, I need something a bit more advanced - landscape, more movement, perhaps some effects, etc.

Thanks for sharing anyway.

Mata
Dec 23, 2003
Here's an HLSL shader question. I'm still terrible at writing shader code, so hopefully this is much easier than I'm thinking.
For a given quad, I'd like to sample 4 textures: the top-right corner should have its own texture, the top-left corner its own texture, and so on.

The reason I want to do this is so I can recombine tilesheets more efficiently. Take for example this road, drawn on top of this terrain mesh (in wireframe):

This works by dividing up the road into unique tileset pieces. This picture contains 8 (I think?) different road textures, so the terrain vertex buffer is drawn 8 times with different textures and different index buffers - i.e. the road segment that's just a straight line from NW to SE has an index buffer with 34 primitives (17 quads), whereas that 3-way intersection is only used once, so its index buffer is just 2 primitives.
This is pretty inefficient, but works for shapes of limited complexities, like this road.
What I'd like to do is divide each "tile" up into 4 smaller segments, so that a smaller tileset can be used to construct more complicated shapes.
Right now, the best I can think of is a hacky pixel shader solution where I write a texture filled with 4 texture offset coordinates per quad, then do awful lookups in it with conditional branching:
pre:
if ((input.TexCoord.x % 1) > 0.5){
   if ((input.TexCoord.y % 1) > 0.5){
      texcoords.xy = lookupValueFromTexture(topRight);
   }
   else {
      texcoords.xy = lookupValueFromTexture(top);
   }
}
else if ((input.TexCoord.y % 1) > 0.5){
   texcoords.xy = lookupValueFromTexture(right);
}
You get the idea...

Mata fucked around with this message at 12:48 on May 1, 2014

High Protein
Jul 12, 2009

Mata posted:

Here's an HLSL shader question. I'm still terrible at writing shader code, so hopefully this is much easier than I'm thinking.
For a given quad, I'd like to sample 4 textures: the top-right corner should have its own texture, the top-left corner its own texture, and so on.


I don't think I fully understand; so you want a piece of road with one rounded corner to somehow use the 'rounded corner' texture for one corner and the 'straight road' texture for others?

First of all, look into using array textures instead of tile sheets, it's easier to work with an array index than some set of starting texture coordinates for your tile.

As you're only drawing quads, it isn't much of a waste to just generate a vertex buffer for the whole road and store which texture to use (or the starting coords of the tile) as per-vertex data. If you don't want to do that, you could use instancing and just instance your quad a hundred times; then as per-instance data bind the world position of the quad and an int4 of texture indices (or an array of 4 starting coords) and use SV_VertexID to do the lookup. Or you could not even bind the vertex buffer for the quad and just use SV_VertexID to calculate the vertex coordinates.

To determine the proper texture coordinate in the pixel shader, just have your texture coordinate run from 0 to 2 and do fmod(texCoord,1) in the pixel shader :) Again, the texture coordinate could be generated using SV_VertexID.

If you're not using D3D10, SV_VertexID is of course easy enough to make part of the vertex manually.
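The coordinate math behind that fmod trick can be sketched outside of shader code. Here's a small Python model of it (the names are mine, not from any of the posts above): the quad's texcoord runs 0 to 2 in each axis, the integer part picks one of the four tiles, and fmod(x, 1) gives the coordinate within that tile.

```python
import math

def pick_tile(u, v):
    """Map a quad texcoord in [0, 2) to (tile_index, local_uv).

    The integer part selects one of four tiles; fmod(coord, 1)
    gives the coordinate within that tile, exactly what
    fmod(texCoord, 1) does in the pixel shader.
    """
    col = 1 if u >= 1.0 else 0   # right half of the quad?
    row = 1 if v >= 1.0 else 0   # top half of the quad?
    tile_index = row * 2 + col   # 0..3, e.g. an index into a texture array
    local = (math.fmod(u, 1.0), math.fmod(v, 1.0))
    return tile_index, local

# Bottom-left tile, halfway into it:
print(pick_tile(0.5, 0.5))    # -> (0, (0.5, 0.5))
# Top-right tile:
print(pick_tile(1.25, 1.75))  # -> (3, (0.25, 0.75))
```

In the shader the `tile_index` would come from the int4 of per-instance texture indices rather than being derived from the texcoord, but the fmod part is the same.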

High Protein fucked around with this message at 20:13 on May 1, 2014

Fellatio del Toro
Mar 21, 2009

Madox posted:

I can add more info since it was my comment :p At one point I did this to pick between grass, rock and sand for a terrain (although at the time it was DirectX - not sure if I can find that code).

You have to use a varying float to get the index to the pixel shader, and it will interpolate it so you have to floor it or do some other math to find the desired whole number again. I don't know about an array of samplers. You can use a cube map I guess? but if you want to use different sized textures, I'm not sure. I set them all as separate uniforms and had a bunch of ifs.

Also for efficiency - what happens in the shader is that if you have 4 textures and are picking one of them using the array index, the shader will sample every texture and THEN throw 3 away, so be aware of that (unless cards are a lot better now - I haven't looked at asm in a while). It's not picking just one sample to do.

As I understand it, all branching is done this way: both branches always run, and then one is thrown away.

Well this ended up being waaay too much work for what is probably negligible (or possibly negative) gain, but I now have rotation and texture switching done on the shader!



I ended up dynamically generating new shader programs after textures get loaded to just avoid using arrays altogether. I'm sure all that poo poo about branching makes this probably not a great idea. I've kept all of my rendering methods (immediate/vertexarrays/VBO/VBO+ShaderTextures) as a selectable option but I'm kind of scared to go actually test and find out my shaders give worse performance.

Oh well, I mostly just wanted to figure out how the gently caress shaders work so mission accomplished :eng101:

Woodsy Owl
Oct 27, 2004
I'm trying to get into Processing but I'm having a bit of trouble. In fact, it doesn't appear to be just Processing that is having trouble, but rather any code implementing JOGL. I am able to compile and execute code just fine, but the app frames just display a white background and then promptly freeze/crash, along with any open applications (like Firefox) and Explorer.exe, and then it becomes necessary to manually power cycle my PC. I've only got the integrated video from my Ivy Bridge Intel CPU. OpenGL applications like Stellarium seem to work just fine. I have updated my video drivers, too, but no dice.

Thoughts? Is Intel's OpenGL implementation just a big hunk of junk? Or is this just Fate providing the perfect excuse to snag a dedicated video card?

edit: This'd probably help a bit: Windows 7 64-bit, 32-bit JDK, 32-bit Processing, most recent JOGL build (I've tried 64-bit JDK together with 64-bit Processing for the sake of isolation, also using some old JOGL builds)

edit 2: This is the result of running the JOGL test from here: http://jogamp.org/deployment/jogamp-current/jogl-test-applets.html which is the only applet that will execute from the test page; the other tests throw a variety of exceptions...
code:
--------------------------------------------------------------------------------------
Platform: WINDOWS / Windows 7 6.1 (6.1.0), x86 (arch), GENERIC_ABI, 2 cores
MachineDescription: runtimeValidated true, littleEndian true, 32Bit true, primitive size / alignment:
  int8    1 / 1, int16   2 / 2
  int     4 / 4, long    4 / 4
  int32   4 / 4, int64   8 / 8
  float   4 / 4, double  8 / 8, ldouble 12 / 4
  pointer 4 / 4, page    4096
Platform: Java Version: 1.7.0_51 (1.7.0u51), VM: Java HotSpot(TM) Client VM, Runtime: Java(TM) SE Runtime Environment
Platform: Java Vendor: Oracle Corporation, [url]http://java.oracle.com/,[/url] JavaSE: true, Java6: true, AWT enabled: true
----------------------------------------------------------------------------------
Package: com.jogamp.common
Extension Name: com.jogamp.common
Specification Title: GlueGen Java Bindings Generator
Specification Vendor: JogAmp Community
Specification Version: 2.1
Implementation Title: GlueGen Run-Time
Implementation Vendor: JogAmp Community
Implementation Vendor ID: com.jogamp
Implementation URL: [url]http://jogamp.org/[/url]
Implementation Version: 2.1.5
Implementation Build: 2.1-b779-20140310
Implementation Branch: origin/master
Implementation Commit: 6476552f46c7bc7b151d53a9e8d2332deda10fcb
---------------------------------------------------------------------------------------
Package: javax.media.opengl
Extension Name: javax.media.opengl
Specification Title: Java Bindings for OpenGL API Specification
Specification Vendor: JogAmp Community
Specification Version: 2.1
Implementation Title: Java Bindings for OpenGL Runtime Environment
Implementation Vendor: JogAmp Community
Implementation Vendor ID: com.jogamp
Implementation URL: [url]http://jogamp.org/[/url]
Implementation Version: 2.1.5
Implementation Build: 2.1-b1240-20140311
Implementation Branch: origin/master
Implementation Commit: ba0dc6462a88ee7512a087deaaca760239915548
--------------------------------------------------------------------------------------
WindowsGraphicsDevice[type .windows, connection decon]: 
	Natives
		GL4bc 	true [4.0 (Compat profile, arb, ES2 compat, FBO, hardware)]
		GL4 	true [4.0 (Core profile, arb, ES2 compat, FBO, hardware)]
		GLES3 	false
		GL3bc 	true [4.0 (Compat profile, arb, ES2 compat, FBO, hardware)]
		GL3 	true [4.0 (Core profile, arb, ES2 compat, FBO, hardware)]
		GL2 	true [4.0 (Compat profile, arb, ES2 compat, FBO, hardware)]
		GLES2 	false
		GLES1 	false
		Count	5 / 8
	Common
		GL4ES3 	false
		GL2GL3 	true
		GL2ES2 	true
		GL2ES1 	true
	Mappings
		GL2ES2 	GLProfile[GL2ES2/GL4.hw]
		GL2ES1 	GLProfile[GL2ES1/GL4bc.hw]
		GL2 	GLProfile[GL2/GL4bc.hw]
		GL4 	GLProfile[GL4/GL4.hw]
		GL3 	GLProfile[GL3/GL4.hw]
		GL4bc 	GLProfile[GL4bc/GL4bc.hw]
		GL2GL3 	GLProfile[GL2GL3/GL4bc.hw]
		GL3bc 	GLProfile[GL3bc/GL4bc.hw]
		default GLProfile[GL4bc/GL4bc.hw]
		Count	8 / 12

Swap Interval  -1
GL Profile     GLProfile[GL4bc/GL4bc.hw]
GL Version     4.0 (Compat profile, arb, ES2 compat, FBO, hardware) - 4.0.0 - Build 10.18.10.3496 [GL 4.0.0, vendor 10.18.10 (- Build 10.18.10.3496)]
Quirks         [NoDoubleBufferedBitmap]
Impl. class    jogamp.opengl.gl4.GL4bcImpl
GL_VENDOR      Intel
GL_RENDERER    Intel(R) HD Graphics
GL_VERSION     4.0.0 - Build 10.18.10.3496
GLSL           true, has-compiler-func: true, version: 4.00 - Build 10.18.10.3496 / 4.0.0
GL FBO: basic true, full true
GL_EXTENSIONS  184
GLX_EXTENSIONS 0
-----------------------------------------------------
edit 3: I fired up Ubuntu and Processing runs flawlessly on it with the latest Intel drivers. I wonder what's broken in Windows... Whatever, it's a good excuse to get to play around with 14.04 LTS, so I'm just gonna roll with Ubuntu for a while.

Woodsy Owl fucked around with this message at 16:03 on May 3, 2014

Hubis
May 18, 2003

Boy, I wish we had one of those doomsday machines...

Madox posted:

Also for efficiency - what happens in the shader is that if you have 4 textures and are picking one of them using the array index, the shader will sample every texture and THEN throw 3 away, so be aware of that (unless cards are a lot better now - I havnt looked at asm in a while). It's not picking one sampling to do.

As I understand it, all branching is done this way. Both branches run always, and then one is thrown away.

Actually, depending on how you do it, all four textures will be sampled no matter what (they are sampled outside the branch and then the results are conditionally moved into the output). This is because you need execution of all the shader sample instructions across the local screen-space neighborhood to determine the screen-space UV derivatives that mip-mapping uses to determine its mip level.

If you're not using mip-mapping, then the samples will only be executed for branches that are actually taken in the local screen-space neighborhood of the rendered geometry. In other words, if you are drawing a triangle that only samples from one texture then only one of those branches will ever be taken (the shader is smart enough to skip sections with 0 coverage). You're right about what happens if even one pixel in the local neighborhood takes a different branch, however.

Fellatio del Toro
Mar 21, 2009

So if I'm not using mipmapping and am using the same float attribute for every vertex, it'll know that every fragment shader in that triangle is going to use the same sampler and only run that one branch?

Anyone have a good reference for this stuff?

Madox
Oct 25, 2004
Recedite, plebes!

Hubis posted:

Actually, depending on how you do it, all four textures will be sampled no matter what (they are sampled outside the branch and then the results are conditionally moved into the output). This is because you need execution of all the shader sample instructions across the local screen-space neighborhood to determine the screen-space UV derivatives that mip-mapping uses to determine its mip level.

If you're not using mip-mapping, then the samples will only be executed for branches that are actually taken in the local screen-space neighborhood of the rendered geometry. In other words, if you are drawing a triangle that only samples from one texture then only one of those branches will ever be taken (the shader is smart enough to skip sections with 0 coverage). You're right about what happens if even one pixel in the local neighborhood takes a different branch, however.

That's interesting, I assumed all four textures would be sampled no matter what, all the time. The last time I looked at the assembler output of the D3D9 shader compiler, the assembly always sampled all textures and there was no smartness about mipmapping, though maybe it's better at run time now - this was like 5 years ago?


Fellatio del Toro posted:

So if I'm not using mipmapping and am using the same float attribute for every vertex, it'll know that every fragment shader in that triangle is going to use the same sampler and only run that one branch?

Anyone have a good reference for this stuff?

That sounds right

Hubis
May 18, 2003

Boy, I wish we had one of those doomsday machines...

Madox posted:

That's interesting, I assumed all four textures will be sampled no matter what all the time. The last time I looked at the assembler output of the D3D9 shader compiler, the assembly always sampled all textures and there was no smartness about mipmapping though maybe its better at run time now - this was like 5 years ago?

What you're seeing is correct -- it's hoisting out the tex instructions because it can't guarantee derivatives if they were in branches. Since mipmapping is a state property, the compiler doesn't know if that's safe or not (hardware-level drivers may recompile the shader based on actual state, but that's beyond what HLSL does).

The key thing is that this is really all about the derivatives, not mipmapping -- you can put tex2d samples inside branches if you manually hoist calculating (dUV/ddx, dUV/ddy) out above the branch and then feed that into the branched tex2D instructions. In SM 4.0+ I believe that's the SampleGrad instruction.
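To see why those derivatives matter at all, here's a rough Python model of how the hardware turns screen-space UV derivatives (the values ddx/ddy return, and that SampleGrad takes explicitly) into a mip level. This is the simplified isotropic formula; real hardware uses anisotropy-aware variants, so treat it as a sketch:

```python
import math

def mip_level(dudx, dvdx, dudy, dvdy, tex_size):
    """Approximate the mip LOD derived from screen-space UV derivatives.

    dudx/dvdx and dudy/dvdy are the per-pixel UV steps in screen x and y;
    scaling by tex_size converts them to texels per pixel.
    """
    rho_x = math.hypot(dudx * tex_size, dvdx * tex_size)
    rho_y = math.hypot(dudy * tex_size, dvdy * tex_size)
    return max(0.0, math.log2(max(rho_x, rho_y)))

# A 256-texel texture mapped 1:1 to pixels: one texel per pixel -> mip 0.
print(mip_level(1/256, 0, 0, 1/256, 256))  # -> 0.0
# Minified 4x: four texels per pixel -> mip 2.
print(mip_level(4/256, 0, 0, 4/256, 256))  # -> 2.0
```

The derivatives come from differencing across the 2x2 pixel quad, which is exactly why every pixel in the quad has to execute the sample (or be handed explicit gradients via SampleGrad) for the result to be well-defined.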

Raenir Salazar
Nov 5, 2010

College Slice



YEEEES! YEEEEEEEEEES! I got it! I finally figured out how glBufferSubData works, woot!

Edit: So here's what I did.

First, I followed the colour picking tutorial here; I don't quite notice a performance issue, and it seems to work fine for what I've got.

I render the mesh as a solid colour of white with a uniquely coloured box located at each vertex. When I click on a box I test the pixel to see if it's white; if it isn't, I convert the colour to an integer id and that is my index for which vertex I'm messing with.

Then, with this information I calculate a translation matrix based on mouse movement, which, in a very inefficient process, I pass back to the main program, where I then rebind my VBO based on the new vertex position. It seems like I also need to rebind the index/element buffer, which seems weird - I don't fully understand what's happening, but this seems to work.
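The id-to-colour mapping described above can be sketched like this (the tutorial's exact packing may differ — this assumes one byte per channel, id = r + g*256 + b*65536, with pure white reserved for "no vertex hit"):

```python
def id_to_color(vid):
    """Pack a vertex index into an RGB triple, one byte per channel."""
    assert 0 <= vid < 256 ** 3
    return (vid & 0xFF, (vid >> 8) & 0xFF, (vid >> 16) & 0xFF)

def color_to_id(r, g, b):
    """Recover the vertex index from the picked pixel's colour."""
    return r | (g << 8) | (b << 16)

WHITE = (255, 255, 255)  # the solid-white mesh body, i.e. "no vertex hit"

def pick(pixel):
    return None if pixel == WHITE else color_to_id(*pixel)

print(pick((255, 255, 255)))  # -> None (clicked the mesh body)
print(pick(id_to_color(70)))  # -> 70  (clicked vertex 70's box)
```

One caveat of reserving white: vertex id 16777215 would collide with the background colour, so the scheme only works for fewer ids than that — not a problem at these vertex counts.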

Raenir Salazar fucked around with this message at 19:30 on May 9, 2014


MarsMattel
May 25, 2001

God, I've heard about those cults Ted. People dressing up in black and saying Our Lord's going to come back and save us all.
Not sure if this is the right thread, but it seems the best fit.

I've started experimenting with OpenCL and have moved some computation heavy code (calculating noise values for a voxel renderer) into OpenCL kernels. Everything works as expected with one thread, but when I start calling the same code from multiple threads I either get crashes in various OpenCL API calls or deadlocks or other weird behaviour.

The OpenCL standard says that all API calls are thread safe with the exception of clSetKernelArg when operating on the same cl_kernel object. My implementation creates a single device, context and queue, but for each invocation a new cl_kernel is made by the calling thread -- so by my understanding my code should be thread safe since no thread can use a cl_kernel except its own. However, this doesn't seem to be the case in practice.

In the samples there is a multi-threaded example, but that is multiple threads with multiple devices, not multiple threads with a single shared device.

Do I need to create a context per thread? That would imply a queue per thread which would give less scope for re-ordering operations etc, which seems bad (although I'm not sure how much scope there would be for that in my current implementation anyway).
