shodanjr_gr
Nov 20, 2007

Hubis posted:

I have high confidence you could get this working on an NVIDIA Ion system

That is very likely, considering that the GPU is the same as the one in the MacBook I'm currently working on.

shodanjr_gr
Nov 20, 2007
Can't help you with the selection buffer bit, but I caught this while going over your code:

code:
gluPerspective(60, 1.0, 0.0001, 1000.0);
I think such a small Z-near can result in a screwed up projection matrix that will give wacky results.

Any particular reason you are doing that?
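
For comparison, something like the call below would be more typical. With a standard perspective projection, roughly half of the depth buffer's precision lands between zNear and 2*zNear, so a near plane of 0.0001 paired with a far plane of 1000 leaves almost no precision for the rest of the scene (the values below are just an illustration):

code:
gluPerspective(60.0, 1.0, 0.1, 1000.0); // illustrative near/far values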

shodanjr_gr
Nov 20, 2007

PnP Bios posted:

I imagine most of the changes in 3.0 have to do with GLSL rather than the core API, e.g. the implementation of geometry shaders.

The only real change left to the core API is to remove the fixed pipeline functionality. That's never going to happen though, since the CAD developers would poo poo bricks.

Geometry shaders were already available as an extension before 3.0 came out.

3.0 was a horrible release for the most part though. Khronos had been promising blow-jobs and a new API all around, and they ended up with an incremental update instead of a major revision...

shodanjr_gr
Nov 20, 2007

OneEightHundred posted:

As for getting tutorials for it: Try to think of OpenGL 3.0 as OpenGL 2.2, except with "GL3/gl3.h" as your include file. What you really want to get into is using GLSL instead of the fixed-function poo poo, since they hardly changed anything else.

I'd actually like to track down a tutorial for the new "direct state access" dealie.

I can see that making my OpenGL code a bit more readable.


Also, is there a go-to tutorial for using stream-out buffers?

shodanjr_gr
Nov 20, 2007
Since we are talking about point sprites, is it possible to get to the point-sprite generated geometry inside a geometry shader?

shodanjr_gr
Nov 20, 2007

Spite posted:

If you are using geometry shaders (and really, they kind of suck since they aren't very performant), why not extrude a single vertex into a quad yourself?

It could possibly be faster if done natively by the GPU.

shodanjr_gr
Nov 20, 2007
code:
vec4 displacedir=vec4(
		gl_Normal.x,
		gl_Normal.y,
		gl_Normal.z,
		0
	) * projectionmatrix;
I'm not sure that this is correct. Try changing it to this:

code:
vec4 displacedir=vec4(
		gl_Normal.x,
		gl_Normal.y,
		gl_Normal.z,
		1.0
	) * projectionmatrix - vec4(0.0,0.0,0.0,1.0) * projectionmatrix;
Since you want to define the displacement direction, which is a vector and is therefore defined by its start and end points, both of which need to be transformed into the new coordinate system.

Why would you want to do displacement mapping in projection space?

shodanjr_gr
Nov 20, 2007

heeen posted:

I'm doing adaptive subdivision surfaces, and I'm projecting the control mesh first and doing the subdivision on the already-projected points. This way I can calculate the error to the limit surface depending on perspective, and I save a lot of matrix-vector multiplications.

Interesting. So you only subdivide the points that actually matter to the viewer; that makes sense. Do you have any relevant papers that you can link to? I like learning new stuff :D.

shodanjr_gr
Nov 20, 2007
http://kotaku.com/5335483/new-cryengine-3-demo

Anyone got any info/links for the technique demonstrated by Crytek in the linked video?

edit: http://www.crytek.com/fileadmin/user_upload/inside/presentations/2009/Light_Propagation_Volumes.pdf white paper here!

shodanjr_gr fucked around with this message at 17:40 on Aug 12, 2009

shodanjr_gr
Nov 20, 2007

BattleMaster posted:

Wait people still think that graphics and gameplay are tradeoffs? Even though the skills needed to develop either don't really overlap?

Fallout 3 would have been so much better if it didn't use 3D graphics. Bethesda should have opted for ASCII visuals instead; they improve gameplay :iamafag:.

shodanjr_gr
Nov 20, 2007

sex offendin Link posted:

I could barely follow that Crytek whitepaper, but just enough to see that my guess was completely wrong. I didn't think we had reached the point where a true volumetric effect like that was possible; I thought screen-space tricks were still a necessity.

They use a reflective shadow map (a downsampled render target from the light's POV) to capture the first bounce, but I'm trying to figure out exactly what they do after that... it looks like they generate point light sources from the RSM, which are then stored in a volume and have their radiance iteratively propagated?

shodanjr_gr
Nov 20, 2007

Contero posted:

Where do you guys look for research-y type free models to test out rendering techniques with?

In particular I'm looking for a huge, textured, relatively nice looking landscape mesh to play around with.

Can't say I've ever seen any "reference" terrain models (like the Cornell Box/Sponza Atrium/Stanford Bunny/Little Buddha/Dragon, etc.). Just do what MasterSlowPoke suggests: displacement mapping on a planar mesh that reads off a Perlin noise texture gives some very nice results. If you feel like it, you can do coloring based on the displacement value to simulate water/snowy mountains/whatever.
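
For what it's worth, here's a rough C++ sketch of that idea: build a flat grid and displace each vertex's height with a noise lookup. The fakeNoise function is just a stand-in for a real Perlin/simplex noise function (or a fetch from a noise texture), and all the names are made up:

code:
#include <cmath>
#include <vector>

struct Vertex { float x, y, z; };

// Stand-in for a real Perlin/simplex noise lookup or a noise texture fetch.
float fakeNoise(float u, float v) {
    return 0.5f * std::sin(u * 12.9898f) * std::cos(v * 78.233f);
}

// Build an n x n planar grid in the XZ plane and displace Y by the noise value.
std::vector<Vertex> makeTerrain(int n, float size, float heightScale) {
    std::vector<Vertex> verts;
    verts.reserve(n * n);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < n; ++i) {
            float u = float(i) / (n - 1);
            float v = float(j) / (n - 1);
            Vertex vert;
            vert.x = (u - 0.5f) * size;
            vert.z = (v - 0.5f) * size;
            vert.y = fakeNoise(u * 8.0f, v * 8.0f) * heightScale; // height from noise
            verts.push_back(vert);
        }
    }
    return verts;
}
The water/snow coloring slots in right where the height is computed, based on vert.y.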

shodanjr_gr
Nov 20, 2007
Can anyone recommend a "robust" camera class/mini-library with the ability to switch between arcball/flythrough manipulation modes etc. on the fly?

shodanjr_gr
Nov 20, 2007
So I am working on getting Ogre3D to work inside a CAVE environment. I've written some code for calculating off-axis projection matrices for an arbitrary viewport (defined by the bottom left, top left and bottom right corners) and an arbitrary eye position.

If I define a viewport with its center at (0,0,-1), extending 1 unit to each side (so that the top frustum edge is at (0,1,-1), the bottom one at (0,-1,-1), etc.) and I place my eye at (0,0,0) in this "frustum space", I get this sort of result:


[Screenshot: 797x750 image]


While I am expecting to get this (which is what gets produced if I just create a viewport with an aspect ratio of 1.0):


[Screenshot: 800x747 image]


I've actually tried two different ways to calculate the off-axis projection matrices (one of my own, one ripped off from Syzygy, a VR library) and I get the same result. My intuition is that the custom projection matrix ends up having a far larger FOV than the non-custom one...

Any ideas?
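
For reference, both of my attempts are roughly the standard generalized off-axis projection (the Kooima formulation). Here is a GLM-flavored sketch of the idea, with illustrative names and corner conventions rather than my actual code:

code:
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// pa = bottom-left, pb = bottom-right, pc = top-left corner of the viewport,
// pe = eye position, all expressed in the same "frustum space".
glm::mat4 offAxisProjection(const glm::vec3& pa, const glm::vec3& pb,
                            const glm::vec3& pc, const glm::vec3& pe,
                            float nearZ, float farZ)
{
    glm::vec3 vr = glm::normalize(pb - pa);             // screen right
    glm::vec3 vu = glm::normalize(pc - pa);             // screen up
    glm::vec3 vn = glm::normalize(glm::cross(vr, vu));  // screen normal, toward the eye

    glm::vec3 va = pa - pe, vb = pb - pe, vc = pc - pe;
    float d = -glm::dot(va, vn);                        // eye-to-screen distance

    float l = glm::dot(vr, va) * nearZ / d;
    float r = glm::dot(vr, vb) * nearZ / d;
    float b = glm::dot(vu, va) * nearZ / d;
    float t = glm::dot(vu, vc) * nearZ / d;

    glm::mat4 P = glm::frustum(l, r, b, t, nearZ, farZ);

    // Rotate the screen basis into the XY plane, then move the eye to the origin.
    glm::mat4 M(1.0f);
    M[0] = glm::vec4(vr, 0.0f);
    M[1] = glm::vec4(vu, 0.0f);
    M[2] = glm::vec4(vn, 0.0f);
    return P * glm::transpose(M) * glm::translate(glm::mat4(1.0f), -pe);
}
With the example viewport above and the eye at the origin, this reduces to a symmetric 90-degree frustum at aspect 1.0, so if the non-custom path defaults to a much narrower FOV, that alone might explain the difference I'm seeing.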

shodanjr_gr
Nov 20, 2007

PDP-1 posted:

I ran into something I don't understand today while working on a shader - I was sampling a mipmapped texture and the shader ran fine. Then I changed the texture and forgot to generate the mipmap and the framerate absolutely tanked. When I generated the mipmap on the new texture things ran great again.

The obvious conclusion is to be sure to use mipmaps, but I don't understand why that makes such a difference in the framerate. Sampling a texture is sampling a texture, and if anything I'd have guessed that translating the UV coords to the mipmap would be more work for the GPU.

Why is sampling a mipmapped texture so much faster than sampling a non-mipmapped texture?

Better data locality. When you map the same region of the texture onto a surface, the mipmapped version samples from a smaller mip level, so the fetched texels sit much closer together in memory and it requires far fewer cache-missing memory fetches.
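
If it helps, the fix on the GL side is just a couple of calls when the texture is created. A minimal sketch, assuming a GL 3.0+ context and an already-uploaded texture bound as textureId:

code:
glBindTexture(GL_TEXTURE_2D, textureId);
glGenerateMipmap(GL_TEXTURE_2D);   // build the full mip chain for the uploaded image
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);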

shodanjr_gr
Nov 20, 2007
I've run into an OpenGL/OpenCL multithreading/resource sharing question.

I have an app that uses multiple threads to poll a bunch of Kinects for depth/rgb frames. It then renders the frames into separate OpenGL contexts (right now there is a 1-1 correspondence between an image and an OpenGL context).

To my understanding it is possible to get OpenGL contexts to share display lists and textures (I'm using Qt for the UI and it offers such functionality, and bone-stock OpenGL does it as well). However, I haven't found it explicitly stated anywhere that more than two contexts can share resources.

Additionally, I plan to add some OpenCL functionality that basically does computation on these images and outputs results that I also want to be able to render in the aforementioned OpenGL contexts. Now, OpenCL allows you to interop with OpenGL by defining a single OpenGL context to share resources with.

The overarching question is whether I can "chain" resource sharing between contexts. As in: when my application starts, create a single "parent" OpenGL context, then have ALL other OpenGL and OpenCL contexts (which may reside in other threads) share resources with that "parent" context and, by extension, share resources with each other?

shodanjr_gr
Nov 20, 2007

Paniolo posted:

Why are you using separate OpenGL contexts for everything?

I have different widget classes that handle visualizing different types of data and Qt makes no guarantee that they will share an OpenGL context.

shodanjr_gr
Nov 20, 2007

Spite posted:

If you are using multiple contexts you MUST use a separate context per thread.

To answer the main question, yes, multiple contexts can share resources. Note that not everything is shared, like Fences (but Syncs are). Check the spec for specifics.
If you are trying to do something like
Context A, B, C are all shared
Context B modifies a texture
Context A and C get the change
That should work.
Create context A. Create contexts B and C as contexts shared with A. That should allow you to do this.


That's what I ended up doing. I create a context on application launch, and then any other contexts that are initialized share resources with that one context.
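
Stripped of the Qt layer, the scheme boils down to something like this WGL sketch (illustrative names; Qt's shareWidget/QOpenGLContext sharing does the equivalent for you):

code:
#include <windows.h>

// hdc is the window's device context; the wgl* calls come from opengl32.
void createSharedContexts(HDC hdc, HGLRC& parent, HGLRC& childB, HGLRC& childC)
{
    parent = wglCreateContext(hdc);   // the single "parent" context, made at launch
    childB = wglCreateContext(hdc);
    childC = wglCreateContext(hdc);
    wglShareLists(parent, childB);    // B shares textures/buffers with the parent
    wglShareLists(parent, childC);    // C shares with the parent, and therefore with B
}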

quote:

Keep in mind you've now created a slew of race conditions, so watch your locks!
Also, you have to make sure commands have been flushed to the GPU on the current context before expecting the results to show up elsewhere.
For example:
Context A calls glTexSubImage2D on Texture 1
Context B calls glBindTexture(1), Draw

This will have undefined results. You must do the following:

Context A calls glTexSubImage2D on Texture 1
Context A calls glFlush
Context B calls glBindTexture(1), Draw

On OS X, you must also call glBindTexture before changes are picked up; not sure about Windows. It's probably vendor-specific. And again, you need a context per thread. Do not try to access the same context from separate threads; only pain and suffering awaits.
I already have wrappers around all GPU-specific resources and the convention is that all of them must be locked for Read/Write before access. I am also figuring out a way to ensure consistency between the GPU and CPU versions of a resource (potentially have "LockForWriteCPU" and "LockForWriteGPU" functions along with a state variable and then an unlock function that updates the relevant copy of the data based on the lock state).

quote:

You probably want to re-architect your design, as it doesn't sound very good to me. Also, can't Qt pass you an opaque pointer? You can't hold a single context there and lock around its use? Or have a single background thread doing the rendering?
There are a couple of issues. First of all, I am using multiple widgets for rendering and I am not guaranteed that I will get the same context for all widgets (even if they are in the same thread, I believe), and I plan on multithreading those as well since they tank the UI thread. Additionally, I plan on having an OpenCL context in a separate thread. That's on top of a bunch of threads that produce resources for me. I wanted to minimize the amount of "global" locking that takes place in this scheme, hence this exercise...

shodanjr_gr
Nov 20, 2007

Spite posted:

This is way complicated. Have you tried a single OpenGL context that does all your GL rendering and passes the result back to your widgets? The widgets can update themselves as the rendering comes back. Remember: there's only one GPU so multiple threads will not help with the actual rendering.

You mean as in having a single class that manages the context, where rendering requests get posted to that class, which does render-to-texture for each request and then returns the texture to a basic widget for display?


quote:

Also GPU and CPU resources are separate from each other unless you are using CLIENT_STORAGE or just mapping the buffers and using that CPU side. You can track what needs to be uploaded to the GPU by just making dirty bits and setting them. Multiple threads should not be trying to update the same GPU object at once in general - that gets nasty very fast.

That's what I plan on doing...basically have each wrapper for my resources carry a CPU-side version ID and a GPU-side version ID and then a function that ensures consistency between the two versions. I am also providing a per-resource lock so technically more than one thread should not be locking the same resource for writing at the same time (either on the CPU or the GPU side).
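
As a rough illustration of what I mean (entirely hypothetical names, just the lock-plus-version-ID idea in C++):

code:
#include <cstdint>
#include <mutex>

// Per-resource wrapper: one lock, plus CPU-side and GPU-side version IDs,
// with a sync step that copies in whichever direction is stale.
class SharedResource {
public:
    void lockForWriteCPU() { mutex_.lock(); ++cpuVersion_; }  // CPU copy will be newer
    void lockForWriteGPU() { mutex_.lock(); ++gpuVersion_; }  // GPU copy will be newer
    void unlock()          { mutex_.unlock(); }

    // Call with the resource locked, before either side reads the data.
    void synchronize() {
        if (cpuVersion_ > gpuVersion_) {
            uploadToGPU();        // e.g. glBufferSubData / glTexSubImage2D
            gpuVersion_ = cpuVersion_;
        } else if (gpuVersion_ > cpuVersion_) {
            downloadToCPU();      // e.g. glGetTexImage / buffer readback
            cpuVersion_ = gpuVersion_;
        }
    }

private:
    void uploadToGPU()   { /* GL upload for this resource */ }
    void downloadToCPU() { /* GL readback for this resource */ }

    std::mutex mutex_;
    uint64_t cpuVersion_ = 0;
    uint64_t gpuVersion_ = 0;
};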

shodanjr_gr
Nov 20, 2007

ickbar posted:

I apologize, complete noob here. I don't have too much experience with OpenGL, but I'm trying to make a hack for a game using a proprietary OpenGL engine with no available SDK or source code to download.

I'm able to successfully detour OpenGL functions, and have been calling glDisable(GL_TEXTURE_2D) in my detours of glBegin(), glDrawElements and glDrawArrays to check which objects are being rendered with which commands. These are the only drawing functions that I see in the list of imported OpenGL functions in the game in OllyDbg. Interestingly, DeleteLists is referenced in Olly, but there is no corresponding call to any display-list creation functions.

It successfully disables all model textures except for the ground, foliage, trees and certain static objects, which still have textures enabled.

I don't understand how they are still being drawn even though I think I've detoured all the drawing functions. I'm not sure what I'm doing wrong or missing here, or whether Olly is not displaying the entire list of GL functions being used for some reason.

Any input and ideas from the OpenGL gurus here on what's going on would be appreciated.

Shot in the dark here but maybe they are using some extension wrangler (like GLEW) to get access to various API entry points? (e: thus mangling up the symbols/names)

shodanjr_gr
Nov 20, 2007
I've got a question about GLSL tessellation shaders.

I've got some geometry that I'm rendering either as GL_QUADS (through a vertex shader) or as GL_PATCHES (through a vertex -> tess control -> tess eval shader chain). The VBO is exactly the same (same vertex definitions and indices).

When I look at the wireframe of the GL_QUADS version of the geometry, it shows, as expected, quads.

When I look at the wireframe of the GL_PATCHES version, however, each quad is subdivided into two triangles. My tessellation control shader has layout(vertices=4) out set at the top, and my tessellation evaluation shader is set to layout(quads) in. Is there some way to work around this issue, or am I stuck looking at my quads with a line in the middle? (I'm asking because I want to make figures for a paper I'm writing, and having to explain that "I swear I'm issuing 4-vertex patches to the GPU instead of two triangles" might not go over very well...)
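
For context, the GL_PATCHES path is set up roughly like this (illustrative names, assuming a GL 4.x context):

code:
// Four control points per patch, drawn from the same index buffer as the GL_QUADS path.
glPatchParameteri(GL_PATCH_VERTICES, 4);
glDrawElements(GL_PATCHES, indexCount, GL_UNSIGNED_INT, nullptr);
As far as I can tell, the tessellation primitive generator only ever emits triangles for the quads domain, which would explain the diagonal showing up in the tessellated wireframe.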

shodanjr_gr
Nov 20, 2007
Are const arrays inside shader code supposed to be blatantly slow?

I have a tessellation control shader that needs to index the vertices of the input primitive based on the invocation ID (so gl_InvocationID == 0 means that the TCS operates on the edge between gl_in[0] and gl_in[1], etc.).

Initially, my code had an if-statement (which I would assume GPUs don't like that much when it diverges inside the same execution unit) to make this determination. I figured that I could flatten the vertex indices out into a single const int[8] and index them based on the invocation ID (so I could say indices[gl_InvocationID * 2] and indices[gl_InvocationID * 2 + 1] and get the stuff that I need).

However, doing this seems to hit me with a 66% performance drop when compared to using if-statements! Would passing the index array as a uniform yield a performance benefit?

shodanjr_gr
Nov 20, 2007

Hubis posted:

What graphics card are you seeing this with? Depending on the hardware, the problem probably isn't the const array, but the fact that you are using dynamic indexing into the array. Some GPUs don't really support base+offset indexing, instead mimicking it using registers. Unfortunately, if you index the arrays dynamically, this requires all the accesses to be expanded (either by unrolling loops, or expanding into a giant nasty set of branches). So you could actually be gaining MORE branches, instead of eliminating them.

This is on a Quadro 5000.

quote:

Why do you need to index the edges like you are doing? Your best bet would be to structure the input to your shader so that it doesn't need to branch at all, even if that means just adding extra interpolants. I'm not sure if that would work for you here, though.
This is happening inside a tessellation control shader. For each vertex of the input primitive, you get one invocation of the TCS. All invocations of the TCS for a given primitive have access to a gl_in[numVertices] array of structs that contains the per-vertex attributes of the primitive as they are passed by the vertex shader. I want each invocation of the TCS to do stuff for each EDGE of the input primitive, and thus I need to index the vertex positions that define the edge from gl_in. Since the per-edge operations are kind of expensive, I cannot have each invocation of the TCS do the operations for ALL edges, unfortunately (I am taking this approach for other, cheaper per-primitive operations).

quote:

e: There's no way to see intermediate assembly with GLSL, right? For DirectX, you could use FXC to dump the shader, which might show you if that were happening at the high level (though not if it's being introduced at the machine-code translation stage).

I believe there are utilities released by AMD/NVIDIA that will compile GLSL down to the machine-level assembly for you....

shodanjr_gr
Nov 20, 2007

Hubis posted:

Higher quality bins of chips, much better handling of geometry-heavy scenes (not just lots of triangles, but lots of small triangles), and the driver and technical support commensurate with a workstation-level GPU (not just perf, but some weird/edge-case features that don't matter much for consumers but might for a professional system).

That's very true. I work at a research university and I've been able to ping both NVIDIA and AMD engineers regarding weird behaviors/driver issues/optimizations with their professional-level cards. I assume that if you are buying GeForce/Radeon you can't really call them up and say "Why do the textures in Rage look like crap? Send me a custom driver that fixes this!".

shodanjr_gr
Nov 20, 2007

quote:

glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

I am pretty sure glTexParameter() calls are applied PER TEXTURE and not globally. That means that the currently active texture (from the last call to glBindTexture) will have its state affected by these. Also, I believe that you need to provide minification/magnification filters for an OpenGL texture to be valid.

What ends up happening is, if you comment out the background call (including the glBindTexture), the active texture at the end of render() will be your teapot_texture_id. During the next render, the glTexParameteri calls get applied to teapot_texture_id and the texture becomes valid for usage. Then you switch to the plane texture (which is not properly set up, hence your ground plane being screwed up) and then back to the teapot texture and your teapot renders fine.

Try applying these texture sampler settings to EACH texture during initialization. You generally do not want to be reapplying them at runtime, unless you need to change their state.
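
Something along these lines at initialization time (a sketch; width/height/pixels are placeholders, and teapot_texture_id is from your code):

code:
// One-time setup per texture; repeat for the plane/background texture with its own ID.
glBindTexture(GL_TEXTURE_2D, teapot_texture_id);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0,
             GL_RGBA, GL_UNSIGNED_BYTE, pixels);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);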

shodanjr_gr
Nov 20, 2007

RocketDarkness posted:

Dang, you hit the nail on the head. Tossing those lines of code in after each BindTexture call fixed it right up. I really appreciate it! And thanks to everyone else that spent any time looking, as well.

No problem :).

Just to reiterate, there isn't much point in setting the texture filtering modes every time you render (and it probably hurts performance-wise). Just do it once when you initialize the textures and change them only when you really need to. Also, use GL_LINEAR :).

shodanjr_gr
Nov 20, 2007
e: Nevermind, I'm a moron.

shodanjr_gr fucked around with this message at 06:20 on Oct 23, 2012

shodanjr_gr
Nov 20, 2007
Is there an OpenGL debugger that works in Windows 8.1?

I've tried gDebugger (both the Graphic Remedy version and the new AMD one) and they all blow up on me when breaking and trying to inspect textures.

NVIDIA's Nsight doesn't seem to like the fact that I'm rendering in OpenGL within a Qt widget (most/all of the fancy HUD stuff does not work).

shodanjr_gr
Nov 20, 2007

Boz0r posted:

I tried moving my raytracing project to my desktop Windows PC to get some more power, but when I try running it, I get the most vague error message:
code:
The application was unable to start correctly 0xc00007b.
From googling, this seems to be a general error, so I have no idea where to begin troubleshooting. Any ideas?

If you just copied the executable from your development machine to the desktop, chances are you are missing either some DLLs for one of the external dependencies (e.g. GLUT) or the Visual Studio runtime that your raytracer was compiled against. Each version of Visual Studio has a different set of DLLs that implement various bits and pieces of C/C++ functionality, and those have to be either in a directory on your PATH system variable or in the working directory of the application (generally, the same directory as your executable). If you Google "Visual Studio 20XX runtime" you will find download links directly from Microsoft.

You might wanna consider building your code locally on your desktop and running it with a debugger attached; that should give you more information about what's happening.

shodanjr_gr
Nov 20, 2007

roomforthetuna posted:

You can also (more sensibly in my opinion) configure it to do static linking so that you don't have to distribute those files with your exe, for any project that isn't going to be made of a bunch of modules. Assuming you're not using MFC or something.

(I expect you'd still need the GLUT DLL, and there's no statically linking DirectX either, but I'm pretty sure the runtime libraries still can be linked statically.)

That's actually a better approach for something small-scale. I believe GLUT can also be statically linked.


On another note, is there an OpenGL debugger that works properly in Windows 8.1? I used to use gRemedy's gDebugger in Windows 7 and it worked great for me, but since I moved to Win 8.1, it crashes whenever I try to look at textures/buffers at a breakpoint. I tried AMD's version as well, but it crashes the same way. NVIDIA's Nsight graphics debugger doesn't want to debug non-core GL contexts.

edit:

To answer my own question, AMD's GPU Perf Studio 2 seems to work fine on NVIDIA cards (including on-the-fly GLSL editing and all the nice HUD injection stuff). The UI is somewhat more janky than gDebugger's (slower, .NET-based, and it talks to a server over HTTP), but it gets the job done and the profiling tools are better. However, it doesn't do some other stuff that gDebugger does, like letting you break on certain GL calls, and it doesn't seem to be able to show you the stack traces that lead to API calls either.

edit 2: Actually, the Frame Debugger and API Trace functionality works, but the frame profiler doesn't, since it seems to need access to low-level hardware counters. Bummer...

shodanjr_gr fucked around with this message at 07:14 on Feb 26, 2014

shodanjr_gr
Nov 20, 2007
I haven't used JOGL so I'm not sure what it does internally but I'm noticing a couple of things wrong with your code.

You start a draw call (gl.glBegin()) and then, inside of that, you render the vertices for both of your primitives. I'm pretty sure that non-vertex state cannot be changed within a draw call (between glBegin() and glEnd()).

You should do your glBindTexture call outside of the glBegin()/glEnd() block and only submit vertex geometry within that block. What probably happens in your situation is that the second glBindTexture() call within your draw call ends up getting applied after the draw call ends and is used as the active texture for the draw calls in the next frame. You probably want to rewrite your Graphics2D::Draw() function like this:

code:
texture.bind(gl);
gl.glBegin(GL.GL_QUADS);
// vertex submission goes here
gl.glEnd();
and your drawImmediate should look like this (notice that the glBegin()/glEnd() calls have been removed):

code:
private void drawImmediate() {
        int size = graphicList.getSize();
        
        gl.glClearColor(0, 0, 0, 0);
        gl.glClear(GL.GL_COLOR_BUFFER_BIT);
        for (int i = 0; i < size; i++) {
            graphicList.getAt(i).draw();
        }
   }

shodanjr_gr
Nov 20, 2007

MarsMattel posted:

Not sure if this is the right thread, but it seems the best fit.

I've started experimenting with OpenCL and have moved some computation heavy code (calculating noise values for a voxel renderer) into OpenCL kernels. Everything works as expected with one thread, but when I start calling the same code from multiple threads I either get crashes in various OpenCL API calls or deadlocks or other weird behaviour.

The OpenCL standard says that all API calls are thread safe with the exception of clSetKernelArg when operating on the same cl_kernel object. My implementation creates a single device, context and queue, but for each invocation a new cl_kernel is made by the calling thread -- so by my understanding my code should be thread safe since no thread can use a cl_kernel except its own. However, this doesn't seem to be the case in practice.

In the samples there is a multi-threaded example, but that is multiple threads with multiple devices, not multiple threads with a single shared device.

Do I need to create a context per thread? That would imply a queue per thread which would give less scope for re-ordering operations etc, which seems bad (although I'm not sure how much scope there would be for that in my current implementation anyway).

Which OpenCL version is this? http://www.khronos.org/message_boards/showthread.php/6788-Multiple-host-threads-with-single-command-queue-and-device This is old, but apparently 1.0 does not support thread-safe kernel enqueueing?

shodanjr_gr
Nov 20, 2007

fritz posted:

OK, I'm now using a model/view/projection thing, and every prism has its own model matrix, so it's something like:

code:
// bind the triangles and color info, load shaders, etc.
for (all prisms) {
    MVP = Projection * View * make_model(parameters);
    // bind the MVP uniform
    glDrawArrays(GL_TRIANGLES, 0, 36);
}
(I am also now encoding the prism as 12 triangles instead of 8 quads; glDrawArrays(GL_QUADS) doesn't seem to work.)

Alternatively I could bind the model parameters to a series of uniforms and do the computation in the shader? (they're just a scale/rotation/translation of the prisms).


When I go to adding the full specification of the various objects, should I just lump them all into one big contiguous section of memory on the heap (like with a std::vector<float>), bind it to the buffer once, set the MVP for each object, and call glDrawArray with different offsets?

Also, if you are drawing hundreds/thousands of these, you might wanna look into instanced rendering, since your geometry is the same across all draw calls.
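
A sketch of what that could look like, assuming the per-prism model matrices live in a std::vector<glm::mat4> and the vertex shader declares a "modelMatrices" uniform array (names are illustrative):

code:
glUseProgram(program);
glUniformMatrix4fv(glGetUniformLocation(program, "modelMatrices[0]"),
                   (GLsizei)modelMatrices.size(), GL_FALSE,
                   &modelMatrices[0][0][0]);
glBindVertexArray(prismVAO);
glDrawArraysInstanced(GL_TRIANGLES, 0, 36, (GLsizei)modelMatrices.size());
// In the vertex shader, gl_InstanceID picks the matrix for the current prism.
For thousands of prisms you would move the matrices into a uniform buffer, a texture buffer, or an instanced vertex attribute instead, since plain uniform arrays are fairly small.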

shodanjr_gr
Nov 20, 2007

roomforthetuna posted:

It looks like I need to just ask this because outdated answers from the internet are no answers at all.

What's the least frustrating way to use OpenGL such that it will work with minimal effort cross-platform, ideally including both mobile devices and Intel GPUs that are several years old (and regular modern hardware too of course)? I'm not looking to do anything fancier than a bunch of single textured triangles with alpha blending and render targets (for a 2D game), so I don't need any kind of advanced shader functionality.

Is it going to end up being "use OpenGL ES for new hardware and write different code and different shaders to use an older version of OpenGL for old hardware", or will OpenGL ES work on older PC hardware, or...?

It may be a bit of overkill, but you could use OpenSceneGraph... especially if you are not rolling your own shaders.

shodanjr_gr
Nov 20, 2007

nye678 posted:

My guess would be that your VM's graphics driver cannot create a 3.3 context. Try commenting out the hints for the GL version and see if it won't create a window for you. I believe GLFW will create the window with whatever the highest possible context version is, so check out what it gives you after the fact.

That's most likely the case... I was messing around with OpenGL inside Windows VMs on OS X and I think none of them were able to create a context with version > 2.1 (that was a year or so ago)...

shodanjr_gr
Nov 20, 2007

fritz posted:

Update: y'all were right, GL in VirtualBox is only 1.1 (!!!). When I hauled out the Windows laptop and built it over there, it all works OK.

If you try VMware Fusion or Parallels, you will at least get a context that you can compile GLSL in.

shodanjr_gr
Nov 20, 2007

Nahrix posted:

This is what I'm aiming at: packing multiple meshes into a single draw call. My confusion lies in how the pixel shader would know which index in the texture array to call for that pixel. Right now, I just reference a single texture, and use a texcoords variable to find which pixel color to draw. If there's an array of textures in there, what's a good way of finding which texture to start with when referencing coordinates?

Edit: Or, you mentioned something called an atlas (I don't know what that is). Is that a better-fitting solution to the problem I'm describing? Are there some good resources on solving my problem with either an array or atlas?

Simple solution: pass the texture array index for each piece of geometry as a vertex attribute, and pass that through to the pixel shader.

Less simple solution: encode the index of the texture array into one of the vertex attributes you are already passing to the vertex shader (e.g. use 2 bits from the x component tex coord and 2 bits from the y tex coord to encode up to 16 indices). Depending on the size of your atlases and the precision of your vertex attributes, you might be able to get away with this without a loss of image quality.

Less less simple solution: pack all your individual textures into a single huge texture (you can go up to 16K by 16K these days) and then post-process the texture coordinates of your meshes to index directly into that. This is what Sex Bumbo meant by "texture atlas".
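
For the simple solution, the client-side part is just one extra integer attribute (a sketch; the attribute location, stride and offset are placeholders):

code:
// One uint per vertex carrying the texture array layer, at attribute location 3.
glEnableVertexAttribArray(3);
glVertexAttribIPointer(3, 1, GL_UNSIGNED_INT, vertexStride,
                       (const void*)layerOffsetBytes);
// The vertex shader passes the value through as a flat int and the pixel shader
// uses it to index the texture array.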

shodanjr_gr
Nov 20, 2007

Nahrix posted:

I see. I think a modified version of your second method ("Less simple") would ideally work for me. Now that I know what an atlas is, I'm not sure it would work, in a practical sense, for a scene that has models dynamically added / removed, as it would require a reprocessing of the atlas, and putting it back on the graphics card.

Generally speaking, atlases are created "offline": you bake all the textures that you will need into the atlas and never touch it again. There are more advanced techniques that will do this kind of swapping in real time (e: and also allow you to exceed the maximum texture size limits imposed by the GPU); if you're interested, look up "virtual texturing" or "partially resident textures".

quote:

What I mean by "modified version" is using a second TEXCOORD (or other variable in the layout; I'm not familiar with them all yet), so I don't potentially cause issues with drawing textures. Although, I'm thinking that you suggested using a few bits in an existing variable, because it's an unreasonable amount of waste to use an entire other TEXCOORD (or other variable), for a texture index.

Exactly! The whole idea of approach #2 is that you don't use an additional variable in your vertex layout and you do not generate another array of vertex attributes (which would waste memory). So, if you are using 16-bit texture coordinates, you would keep the 14 most significant bits of the actual coordinate component and use the other 2 to encode "half" of your texture index. Repeat for the other texture coordinate component to get the remaining 2 bits. Then combine and enjoy.
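
In code, the packing would look roughly like this (a hypothetical helper, assuming unsigned 16-bit texture coordinates and a 4-bit index):

code:
#include <cstdint>

struct PackedTexCoord { uint16_t u, v; };

// Keep the top 14 bits of each coordinate and stash 2 bits of the index in each.
PackedTexCoord packTexCoord(uint16_t u, uint16_t v, uint8_t textureIndex /* 0..15 */) {
    PackedTexCoord out;
    out.u = (u & 0xFFFC) | (textureIndex & 0x3);          // low 2 bits of the index
    out.v = (v & 0xFFFC) | ((textureIndex >> 2) & 0x3);   // high 2 bits of the index
    return out;
}

// The shader recovers the index by masking and shifting the same way.
uint8_t unpackTextureIndex(PackedTexCoord tc) {
    return uint8_t((tc.u & 0x3) | ((tc.v & 0x3) << 2));
}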

But as Sex Bumbo said, this entire thing will only really matter if you use A LOT of different textures (100s). If you are using 10 or 20 or whatever, just render your meshes in "batches", grouped by texture (assuming that the rest of the render state is the same).

shodanjr_gr fucked around with this message at 07:04 on Apr 3, 2015
