Game Development Megathread

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > Game Development Megathread

«‹›623 »

Raenir Salazar: Nov 5, 2010; College Slice

Comparison:

My first attempt at a two-pass implementation; except it is only the first pass. :v:

I need to integrate the Weight Union-Find library I found to work with how I implemented the algorithm and I'm not sure the best approach yet; the library I found works with ints and not sure how to use it with node-like structures but I'll see.

Since its the first pass I assume the current trippy appearance is due to not merging the equivalence labels yet; and I hope there's no weird bug in there somewhere.

I think largely whats happening is pixels that are part of a larger landmass that occur sooner by being more south and east than some later pixel that is north and west is going to result in the first pass a split coloured area.

Good news is doing this has a time of approximately 28 seconds; which is about 20 seconds faster than I was expecting it to be to scan both water and land zones.

I even managed to figure out some nifty optimizations, like being able to avoid bounds checks, or even bother with checking for labels.

Code:

code:

        // we know immediately that (0,0) is unlabelled as its our origin,
        // so we pre-label it.
        currentLabel++;
        labelMap[0, 0] = currentLabel;
        Labels.Add(new LabelInfo(currentLabel, rColour, new List<Coord>()));
        Labels[0].regionCoords.Add(new Coord(0, 0));

        List<Coord> firstRegion = Labels[0].regionCoords;

        // prefill the bottom row and leftmost column as currently it is 
        // assumed the edges are all ocean and they are all trivially connected.
        // this might not always be true but thats for future me to solve.
        for (int j = 1; j < maxWidth; ++j)
        {
            labelMap[j, 0] = currentLabel;
            SetPixel(j, 0, rColour, ref pixelsToPaint, ref firstRegion, maxWidth);
        }

        for (int i = 1; i < maxHeight; ++i)
        {
            labelMap[0, i] = currentLabel;
            SetPixel(0, i, rColour, ref pixelsToPaint, ref firstRegion, maxWidth);
        }

        Coord[] neighbours = new Coord[2];
        int neighCounter = 0;

        for (int i = 1; i < maxHeight; ++i)
        {
            for (int j = 1; j < maxWidth; ++j)
            {
                if (labelMap[j, i] == UNLABELED)
                {
                    neighCounter = 0;
                    // check neighbours

                    // check if West tile is the same colour as the current pixel
                    if (map[i * maxWidth + j - 1].Equals(map[i * maxWidth + j]))
                    {
                        neighbours[neighCounter] = new Coord(j - 1, i);
                        neighCounter++;
                    }

                    // check if South tile is the same colour as the current pixel
                    if (map[(i - 1) * maxWidth + j].Equals(map[i * maxWidth + j]))
                    {
                        neighbours[neighCounter] = new Coord(j, i - 1);
                        neighCounter++;
                    }

                    if (neighCounter > 0)
                    {
                        // get the smallest label of the two adjacents that shares the colour as the 
                        // current pixel
                        int minLabel = Mathf.Min
                            (
                                labelMap[neighbours[0].x, neighbours[0].y], 
                                labelMap[neighbours[neighCounter - 1].x, neighbours[neighCounter - 1].y]
                            );

                        labelMap[j, i] = minLabel;
                        LabelInfo _labelInfo = Labels[minLabel - 1]; // smallest label is 1 but the list is 0 indexed

                        SetPixel
                        (
                             j, i, Labels[minLabel - 1].regionColour, ref pixelsToPaint, ref _labelInfo.regionCoords, maxWidth
                        );

                        // ensure the labels list has the info; not sure how structs work when passed into a function
                        Labels[minLabel - 1] = _labelInfo; 
                    }
                    else
                    {
                        currentLabel++;
                        Color nuColour = Random.ColorHSV(0, 1, 0, 1, 1, 1, 1, 1);

                        Labels.Add(new LabelInfo(currentLabel, nuColour, new List<Coord>()));

                        Labels[currentLabel - 1].regionCoords.Add(new Coord(j, i));
                        List<Coord> nuRegion = Labels[currentLabel - 1].regionCoords;
                        labelMap[j, i] = currentLabel;
                        SetPixel(j, i, Labels[currentLabel - 1].regionColour, ref pixelsToPaint, ref nuRegion, maxWidth);
                        LabelInfo labelInfo = Labels[currentLabel - 1];
                        labelInfo.regionCoords = nuRegion;
                        Labels[currentLabel - 1] = labelInfo;
                    }
                }
            }
        }

I think the bad news is that 28 to 30 seconds total which looks to be the lower bound, is still not quite fast enough and I think this is basically as optimized as it gets for sequential use.

So I think once I finish this off I'll next look into multithreading or GPU parallelization which is tricky as this algorithm is heavily dependent on the previous state of the mapping.

# ? Oct 25, 2021 07:38

Adbot: ADBOT LOVES YOU

# ? May 25, 2024 10:19

fuf: Sep 12, 2004; haha

sorry this probably gets asked all the time but does anyone have a good Unity course they would recommend?

I already know a bit of C# so ideally one that is a course on Unity rather than an introduction to programming in general.

I started "Complete C# Unity Game Developer 2D" on udemy and the guy is very pleasant but it's just too slow (like he explains where the parentheses keys are on the keyboard level of slow).

I want to make a strategy game hehe

# ? Oct 25, 2021 12:16

Mr Shiny Pants: Nov 12, 2012

fuf posted:

sorry this probably gets asked all the time but does anyone have a good Unity course they would recommend?

I already know a bit of C# so ideally one that is a course on Unity rather than an introduction to programming in general.

I started "Complete C# Unity Game Developer 2D" on udemy and the guy is very pleasant but it's just too slow (like he explains where the parentheses keys are on the keyboard level of slow).

I want to make a strategy game hehe

Unity has some tutorials on their site which are quite handy. I like the FPS microgame one.

# ? Oct 25, 2021 12:45

roomforthetuna: Mar 22, 2005; I don't need to know anything about virii! My CUSTOM PROGRAM keeps me protected! It's not like they'll try to come in through the Internet or something!

Raenir Salazar posted:

So I think once I finish this off I'll next look into multithreading or GPU parallelization which is tricky as this algorithm is heavily dependent on the previous state of the mapping.

Since you're flood-filling multiple regions, you could make each region take a thread?

28 seconds seems nuts for a flood filling. Perhaps it would be faster if you performed it in a structure that's less complicated that whatever structure you're actually using? Like if you first convert whatever representation your 2D grid is into a single flat array of bytes that are good enough to signify whatever the criteria are for edges of the flood fill, then perform the flood fill in your bytes, then map the outcome back onto your original structure.

I don't know what's going on under the hood in your case, but flood-filling sections of even a 4000x4000 grid shouldn't be the work of tens of seconds on a modern CPU.

# ? Oct 25, 2021 13:15

fuf: Sep 12, 2004; haha

Mr Shiny Pants posted:

Unity has some tutorials on their site which are quite handy. I like the FPS microgame one.

oh cool thanks, yeah those "pathways" on their site look pretty good

# ? Oct 25, 2021 14:36

Bongo Bill: Jan 17, 2012

A quadtree might be a useful data structure for this problem.

# ? Oct 25, 2021 16:27

Raenir Salazar: Nov 5, 2010; College Slice

roomforthetuna posted:

Since you're flood-filling multiple regions, you could make each region take a thread?

28 seconds seems nuts for a flood filling. Perhaps it would be faster if you performed it in a structure that's less complicated that whatever structure you're actually using? Like if you first convert whatever representation your 2D grid is into a single flat array of bytes that are good enough to signify whatever the criteria are for edges of the flood fill, then perform the flood fill in your bytes, then map the outcome back onto your original structure.

I don't know what's going on under the hood in your case, but flood-filling sections of even a 4000x4000 grid shouldn't be the work of tens of seconds on a modern CPU.

I mean, its 33,554,432 pixels for a 8192 by 4096 texture. I feel like that's gotta be large enough that no matter how modern the CPU, working within the Unity editor play mode its probably going to struggle?

I went and commented out SetPixel(...) and it was 19 seconds.

Fake edit, I went and stripped as much as I could so its only populated a 2D int array with labels and a int[2][2] array with neighbours which gets re-written.

code:

        // we know immediately that (0,0) is unlabelled as its our origin,
        // so we pre-label it.
        currentLabel++;
        labelMap[0, 0] = currentLabel;

        // prefill the bottom row and leftmost column as currently it is 
        // assumed the edges are all ocean and they are all trivially connected.
        // this might not always be true but thats for future me to solve.
        for (int j = 1; j < maxWidth; ++j)
        {
            labelMap[j, 0] = currentLabel;
        }

        for (int i = 1; i < maxHeight; ++i)
        {
            labelMap[0, i] = currentLabel;
        }

        for (int i = 1; i < maxHeight; ++i)
        {
            for (int j = 1; j < maxWidth; ++j)
            {
                if (labelMap[j, i] == UNLABELED)
                {
                    neighCounter = 0;
                    // check neighbours
                    // if none are labelled (i.e label == 0)
                    // then assign new label
                    // else if already labelled,
                    // then assign neighbours to parent label
                    // to main label

                    // check if West tile is the same colour as the current pixel
                    if (map[i * maxWidth + j - 1].Equals(map[i * maxWidth + j]))
                    {
                        neighbours[neighCounter][0] = j - 1;
                        neighbours[neighCounter][1] = i;
                        neighCounter++;
                    }

                    // check if South tile is the same colour as the current pixel
                    if (map[(i - 1) * maxWidth + j].Equals(map[i * maxWidth + j]))
                    {
                        neighbours[neighCounter][0] = j;
                        neighbours[neighCounter][1] = i - 1;
                        neighCounter++;
                    }

                    if (neighCounter > 0)
                    {
                        // get the smallest label of the two adjacents that shares the colour as the 
                        // current pixel
                        int minLabel = Mathf.Min
                            (
                                labelMap[neighbours[0][0], neighbours[0][1]], 
                                labelMap[neighbours[neighCounter - 1][0], neighbours[neighCounter - 1][1]]
                            );

                        labelMap[j, i] = minLabel;
                    }
                    else
                    {
                        // new label
                        currentLabel++;
                        labelMap[j, i] = currentLabel;
                    }
                }
            }
        }

And its now a whole 18 seconds, 1 second faster; which leads me to believe that Dr Strange's 33,554,432 possible futures on my machine in the Unity compiler takes at minimum 18ish seconds to compute doing nothing more than the minimum on a 8k by 4k array.

With 16 threads I imagine this takes 2 seconds which is probably fine, but I think the reason why it seems slow is the Unity editor is just on a single thread? Without multithreading it isn't like its using the full power of my machine.

Bongo Bill posted:

A quadtree might be a useful data structure for this problem.

Hrm, assuming you're talking to me, a quadtree is useful if you're trying to determine if certain features are close to other features. I imagine a quadtree would be useful at some point for my purposes because I'm going to be doing lots of checks to know if certain land masses are close to other landmasses and that would speed things up along with pathfinding algorithms like A* to check connectivity but I don't think its useful here where what I want to do is have each region water or land, be processed, and have the number of pixels comprising it and their coordinates.

I think mathematically it doesn't really work, because instead of iterating over each pixel twice for two pass; each pixel then I think has to 16 or so iterations through the quadtree to see who the neighbours are? I don't think that works.

e2: Maybe the Colour32 comparisons are slow? I converted a texture2D to a Colour32 native array outside my stopwatch wrapper which I figured was faster than Texture2D.GetPixel(...); maybe because the rgba values are floats that's slower? I doubt it though.

Raenir Salazar fucked around with this message at 16:58 on Oct 25, 2021

# ? Oct 25, 2021 16:50

chglcu: May 17, 2007; I'm so bored with the USA.

What type is map there, and does the type of thing it contains implement an overload of Equals that takes the specific type instead of object? If not, that might be boxing on every call to Equals. 18 seconds does seem slow to me, and that's my only theory as to why it is at the moment after looking at that code.

e: The Unity docs don't mention implementations of Equals or the == operator for either of the color types, which I think would mean it's doing a slow thing there (a bit hazy on the specifics of the default implementation of Equals for structs, but pretty sure it would have to box). Of course, the docs could just be missing that info.

chglcu fucked around with this message at 17:20 on Oct 25, 2021

# ? Oct 25, 2021 17:03

Raenir Salazar: Nov 5, 2010; College Slice

chglcu posted:

What type is map there, and does the type of thing it contains implement an overload of Equals that takes the specific type instead of object? If not, that might be boxing on every call to Equals. 18 seconds does seem slow to me, and that's my only theory as to why it is at the moment after looking at that code.

e: The Unity docs don't mention implementations of Equals or the == operator for either of the color types, which I think would mean it's doing a slow thing there (a bit hazy on the specifics of the default implementation of Equals for structs, but pretty sure it would have to box). Of course, the docs could just be missing that info.

The name of the native array is "map" that is not the type. The .Equals is on Colour32 types.

I realized I can actually get a byte[] array using the GetRawTextureData method which might be more performant but that leaves me with a tricky problem that I'm not sure about the representation. A byte array means for a RGBA32 format pixel for say a 16 by 8 texture with no mipmaps is a 16 by 8 by 4 array of bytes according to the docs.

So do I do this:

code:

if (byteMap[i * maxWidth + j - 1] == byteMap[i * maxWidth + j])

or this:

code:

if (byteMap[i * maxWidth + j - 1].Equals(byteMap[i * maxWidth + j]))

or if I'm a little less lucky this:

code:

if (byteMap[i * maxWidth + j - 1][0] == byteMap[i * maxWidth + j][0] &&
byteMap[i * maxWidth + j - 1][1] == byteMap[i * maxWidth + j][1] &&
byteMap[i * maxWidth + j - 1][2] == byteMap[i * maxWidth + j][2] &&
byteMap[i * maxWidth + j - 1][3] == byteMap[i * maxWidth + j][3])

or do I'm unlucky gotta loop over the z values

code:

if (byteMap[width * height * k + width * i + j - 1] == byteMap[width * height * k + width * i + j])
...
if (byteMap[width * height * k + width * (i - 1) + j] == byteMap[width * height * k + width * i + j])

# ? Oct 25, 2021 17:42

chglcu: May 17, 2007; I'm so bored with the USA.

I'm curious what would happen if you add this:

code:

    static bool ColorEquals(Color32 a, Color32 b)
    {
        return
            a.r == b.r &&
            a.g == b.g &&
            a.b == b.b &&
            a.a == b.a;
    }

and replace your

code:

(map[i * maxWidth + j - 1].Equals(map[i * maxWidth + j]))

calls with

code:

ColorEquals(map[i * maxWidth + j - 1], map[i * maxWidth + j])

That would eliminate any boxing that's going on. According to VS, Color32.Equals would be calling ValueType.Equals(object), which would box the arg.

# ? Oct 25, 2021 18:02

Raenir Salazar: Nov 5, 2010; College Slice

Lol, that reduces it to 6 seconds. Wtf Unity.

Helps I actually can also just skip the Alpha channel since they should never be anything other than full.

# ? Oct 25, 2021 18:10

chglcu: May 17, 2007; I'm so bored with the USA.

Raenir Salazar posted:

Lol, that reduces it to 6 seconds. Wtf Unity.

Helps I actually can also just skip the Alpha channel since they should never be anything other than full.

The difference in the generated IL code, in case you're curious:

Calling .Equals:

code:

    // [22 9 - 22 31]
    IL_0011: ldloca.s     a
    IL_0013: ldloc.1      // b
    IL_0014: box          [UnityEngine.CoreModule]UnityEngine.Color32
    IL_0019: constrained. [UnityEngine.CoreModule]UnityEngine.Color32
    IL_001f: callvirt     instance bool [netstandard]System.Object::Equals(object)
    IL_0024: stloc.2      // b1

Calling ColorEquals:

code:

    IL_0025: ldloc.0      // a
    IL_0026: ldloc.1      // b
    IL_0027: call         bool NewBehaviourScript::ColorEquals(class [UnityEngine.CoreModule]UnityEngine.Color32, class [UnityEngine.CoreModule]UnityEngine.Color32)
    IL_002c: stloc.3      // b2

This isn't strictly a Unity thing, except for them not providing an Equals implementation for Color32. It would be the case for any C# code calling .Equals on any struct that doesn't provide its own Equals implementation. Here's a relevant article: https://theburningmonk.com/2015/07/beware-of-implicit-boxing-of-value-types/

# ? Oct 25, 2021 18:15

Raenir Salazar: Nov 5, 2010; College Slice

That's extremely interesting; there's a lot happening under the hood I often don't really get to appreciate until situations like these. :v:

I'll amend my earlier statement to just "lol".

I did some more stop watches, my program also seems to have an initial 3-4 second lag since I save some textures to the hard drive, and added together is approximately how long it takes from pressing the play button to the editor unfreezing itself.

Thanks for that, that saves me from having to do finnicky coding with bytes.

e: For the record its apparently 0.006s to generate two render textures from the GPU for the initial heat map; so presumably its like 0.003s per. Or 3ms. I assume thats good.

Raenir Salazar fucked around with this message at 18:34 on Oct 25, 2021

# ? Oct 25, 2021 18:22

HappyHippo: Nov 19, 2003; Do you have an Air Miles Card?

You can probably just check the red channel for determining what's land/water.

# ? Oct 25, 2021 23:57

Jabor: Jul 16, 2010; #1 Loser at SpaceChem

It's going to be pretty minor compared to silently allocating memory every single time you call Equals, but I also noticed you're iterating through your label array like:

code:


labelMap[1, 1]
labelMap[2, 1]
labelMap[3, 1]
labelMap[4, 1]
...
labelMap[1, 2]
labelMap[2, 2]
...

Especially with such a large 2d array, this is going to be a pretty rubbish use of CPU cache and will be noticeably slower than if you were to just swap the j and i when indexing it.

A modern CPU can execute billions of instructions per second - with merely 33 million pixels to look at, I'd expect you could get times comfortably under a second with just a single thread if you can avoid the CPU stalling because it needs to access memory instead of local caches.

# ? Oct 26, 2021 00:25

Raenir Salazar: Nov 5, 2010; College Slice

Jabor posted:

It's going to be pretty minor compared to silently allocating memory every single time you call Equals, but I also noticed you're iterating through your label array like:
code:
labelMap[1, 1]
labelMap[2, 1]
labelMap[3, 1]
labelMap[4, 1]
...
labelMap[1, 2]
labelMap[2, 2]
...
Especially with such a large 2d array, this is going to be a pretty rubbish use of CPU cache and will be noticeably slower than if you were to just swap the j and i when indexing it.

A modern CPU can execute billions of instructions per second - with merely 33 million pixels to look at, I'd expect you could get times comfortably under a second with just a single thread if you can avoid the CPU stalling because it needs to access memory instead of local caches.

Are you suggesting to make it a 1D array? I'm not super well versed on how it works to access memory instead of local caches etc.

# ? Oct 26, 2021 00:41

Jabor: Jul 16, 2010; #1 Loser at SpaceChem

Basically, accessing main memory is really really slow. Hundreds of clock cycles slow. Instead of waiting for main memory all the time, the processor will have a local cache that can be accessed much more quickly. This cache is organized as a series of 64-byte cache lines, and it will contain both lines that have been recently accessed, and sometimes lines that a (rather simple) prefetcher thinks will be used next.

When you're doing a bulk operation, where you do the exact same thing to very many bits of data, you really really want things to be arranged so that you access memory in a linear sequence - that way you access memory that's already cached as much as possible, and when you step from one cache line to the immediate subsequent line, that's probably already been prefetched.

Using a 1d array is one way to make it clear whether you're accessing the elements linearly, or if you're making big jumps between widely-spaced elements. But you can do it with a 2d array as well - the implementation of multidimensional arrays is pretty straightforward.

# ? Oct 26, 2021 01:04

chglcu: May 17, 2007; I'm so bored with the USA.

Jabor posted:

It's going to be pretty minor compared to silently allocating memory every single time you call Equals, but I also noticed you're iterating through your label array like:
code:
labelMap[1, 1]
labelMap[2, 1]
labelMap[3, 1]
labelMap[4, 1]
...
labelMap[1, 2]
labelMap[2, 2]
...
Especially with such a large 2d array, this is going to be a pretty rubbish use of CPU cache and will be noticeably slower than if you were to just swap the j and i when indexing it.

A modern CPU can execute billions of instructions per second - with merely 33 million pixels to look at, I'd expect you could get times comfortably under a second with just a single thread if you can avoid the CPU stalling because it needs to access memory instead of local caches.

Nice catch (or cache, I suppose). I completely missed that. Yeah, that should help a bit.

I�ll let Jabor explain fully since they found it, but the short version is that if you swap your i and j indices into labelMap, you�ll be accessing memory sequentially which is much more performant due to how CPUs handle reading from RAM and the layout of a 2d array in memory. CPU cache 2d array locality might be good search terms for the details.

# ? Oct 26, 2021 01:06

Raenir Salazar: Nov 5, 2010; College Slice

I *think* I did it, but didn't seem to be noticably faster. Clocking in total around 8.5s.

My currently revised code:

code:

        timer.Start();

        // we know immediately that (0,0) is unlabelled as its our origin,
        // so we pre-label it and create our first region
        currentLabel++;
        labelMap[0, 0] = currentLabel;
        List<Coord> firstRegion = new List<Coord>();
        firstRegion.Add(new Coord(0, 0));
        Labels.Add(new LabelPair(currentLabel, firstRegion));

        // prefill the bottom row and leftmost column as currently it is 
        // assumed the edges are all ocean and they are all trivially connected.
        // this might not always be true but thats for future me to solve.
        // we add these coordinates to the current label and current region
        // bottom row
        for (int x = 1; x < maxWidth; ++x)
        {
            labelMap[x, 0] = currentLabel;
        }

        // leftmost column
        for (int y = 1; y < maxHeight; ++y)
        {
            labelMap[0, y] = currentLabel;
        }

        for (int i = 1; i < maxWidth; ++i)
        {
            for (int j = 1; j < maxHeight; ++j)
            {
                neighCounter = 0;
                // check neighbours
                // can skip seeing if labelled because 
                // every pixel we visit isn't labelled.

                // instead we see if they match the colour
                // then we get the min label of South or West
                // neighbours to get current label to assign;
                // if there's no match then assign a new label

                // check if West tile is the same colour as the current pixel
                if (ColourEquals(map[j * maxWidth + i - 1], map[j * maxWidth + i]))
                {
                    neighbours[neighCounter][0] = i - 1;
                    neighbours[neighCounter][1] = j;
                    ++neighCounter;
                }

                // check if South tile is the same colour as the current pixel
                if (ColourEquals(map[(j - 1) * maxWidth + i], map[j * maxWidth + i]))
                {
                    neighbours[neighCounter][0] = i;
                    neighbours[neighCounter][1] = j - 1;
                    ++neighCounter;
                }

                if (neighCounter > 0)
                {
                    // if there's at least one match
                    // get the smallest label of the two adjacents that shares
                    // the colour as the current pixel
                    int minLabel = Mathf.Min
                        (
                            labelMap[neighbours[0][0], neighbours[0][1]], 
                            labelMap[neighbours[neighCounter - 1][0], neighbours[neighCounter - 1][1]]
                        );

                    // set the current label to the label map
                    labelMap[i, j] = minLabel;

                    // add the current coordinate to the appropriate region
                    // appropriate region is -1 because of offseting in the list
                    Labels[minLabel - 1].regionCoords.Add(new Coord(i, j));
                }
                else
                {
                    // increment current label id
                    ++currentLabel;

                    // create new label with new id
                    Labels.Add(new LabelPair(currentLabel, new List<Coord>()));

                    // add the current coordinate/pixel to the region list of coords
                    Labels[currentLabel - 1].regionCoords.Add(new Coord(i, j));

                    // set the label to the label map
                    labelMap[i, j] = currentLabel;
                }
            }
        }

        timer.Stop();

The Else block by my estimation occurs about 15,500 times. But from my tests pre and after stripping out the LabelPair code it amounts to two seconds. Maybe the Mathf.Min is not a well optimized version?

# ? Oct 26, 2021 01:33

Jabor: Jul 16, 2010; #1 Loser at SpaceChem

Now you've got the opposite problem - you're jumping all over map instead of accessing it in a linear fashion!

You need to transpose labelMap so that it gets walked in the same way as map - with y positions first and x positions second - then you'll be able to set up your loops so that both arrays are always accessing adjacent elements.

# ? Oct 26, 2021 01:53

Raenir Salazar: Nov 5, 2010; College Slice

Jabor posted:

Now you've got the opposite problem - you're jumping all over map instead of accessing it in a linear fashion!

You need to transpose labelMap so that it gets walked in the same way as map - with y positions first and x positions second - then you'll be able to set up your loops so that both arrays are always accessing adjacent elements.

The idea of transposing is kinda breaking my brain though, like is my "0,0" still supposed to be the bottom left or technically is it supposed to be the bottom right?

e: Would this be easier if I unrolled my 2D array into a 1D array so the entries matched up?

Lets suppose I transpose my original 2D array like so:

code:

int[,] labelMap = new int[maxHeight, maxWidth];

if the idea here that what was my first row before is now my last column? As though it has been rotated 90 degrees?

For something like this?

code:

        for (int x = 1; x < maxWidth; ++x)
        {
            labelMap[maxHeight - 1, x] = currentLabel;
        }

        // leftmost column
        for (int y = 1; y < maxHeight; ++y)
        {
            labelMap[y, 0] = currentLabel;
        }

If so, my eyes are starting to go a little crosseyed at this point here (i's and j's replaced with x and y's):

code:

        for (int y = 1; y < maxHeight; ++y)
        {
            for (int x = 1; x < maxWidth; ++x)
            {

...
                // A: is this correct? Or does it need to be swapped?
                if (ColourEquals(map[y * maxWidth + x - 1], map[y * maxWidth + x]))
                {
                    neighbours[neighCounter][0] = y;
                    neighbours[neighCounter][1] = x - 1;
                    ++neighCounter;
                }

                // B: To this? As our x's are now our "height" and our y's are our width?
                if (ColourEquals(map[x * maxHeight + y - 1], map[x * maxheight + y]))
                {
                    neighbours[neighCounter][0] = y;
                    neighbours[neighCounter][1] = x - 1;
                    ++neighCounter;
                }

...
                    // and then somewhere this?
                    labelMap[y, x] = minLabel;

Raenir Salazar fucked around with this message at 03:10 on Oct 26, 2021

# ? Oct 26, 2021 02:53

Jabor: Jul 16, 2010; #1 Loser at SpaceChem

0,0 is still the top left.

Right now 0,1 is the leftmost column, one pixel down from the top, while 1,0 is top row, one pixel in from the left. You want it to be the other way around, so 0,1 is top row, one pixel in.

This will make it so that map[i*width + j] and labelMap[i, j] correspond to the same position.

# ? Oct 26, 2021 03:06

roomforthetuna: Mar 22, 2005; I don't need to know anything about virii! My CUSTOM PROGRAM keeps me protected! It's not like they'll try to come in through the Internet or something!

That was fun to read from my last post, where I started to doubt myself in the face of "but it's a shitload of pixels so tens of seconds is fine" then other people chimed in and got it to be a simpler structure and now it's under ten seconds like I thought it should be. Hooray!

I think done right it should be achievable in under 2 seconds (on one thread, even less if the work could be divided sensibly), but I'm not really sure what the final purpose of the data is so maybe it has to end in a less efficient structure. For most purposes if it was me I'd make the primary representation a flat array of 4096*8192 bytes, each byte being a "label" index, do most things based on that, and only convert to other formats on demand.

You could offload converting a 32MB index array to a 128MB 32-bit texture to the GPU even, if it's a conversion you might need to do frequently.

# ? Oct 26, 2021 03:13

Raenir Salazar: Nov 5, 2010; College Slice

Jabor posted:

0,0 is still the top left.

Right now 0,1 is the leftmost column, one pixel down from the top, while 1,0 is top row, one pixel in from the left. You want it to be the other way around, so 0,1 is top row, one pixel in.

This will make it so that map[i*width + j] and labelMap[i, j] correspond to the same position.

Hrm, IIRC in Unity3D for texture coordinates the origin 0,0 is Bottom Left no? So when I import my data from my Texture2D its 0,0 assumes bottom left? I think its been a couple of times now I said bottom left and you said top left and I want to clear this up in case it matters.

e: Unity documentation says lower left corner for "texture coordinates" https://docs.unity3d.com/ScriptReference/Texture2D.GetPixel.html but I think this is still unclear as to whether that also means "pixel coordinates" vs "UV coordinates".

e2: https://answers.unity.com/questions/810616/which-corner-is-the-origin-for-getpixel.html the guy here also says bottom left; most people at the Unity forums seem to say bottom left.

roomforthetuna posted:

That was fun to read from my last post, where I started to doubt myself in the face of "but it's a shitload of pixels so tens of seconds is fine" then other people chimed in and got it to be a simpler structure and now it's under ten seconds like I thought it should be. Hooray!

I think done right it should be achievable in under 2 seconds (on one thread, even less if the work could be divided sensibly), but I'm not really sure what the final purpose of the data is so maybe it has to end in a less efficient structure. For most purposes if it was me I'd make the primary representation a flat array of 4096*8192 bytes, each byte being a "label" index, do most things based on that, and only convert to other formats on demand.

You could offload converting a 32MB index array to a 128MB 32-bit texture to the GPU even, if it's a conversion you might need to do frequently.

The purpose is to create a procedural world map from a height map; so I take my height map and then use a cut off to determine whats land and whats water; but I want to take the land tiles and carve out things like provinces/boundaries etc so I need to know information like all pixels that connect to each other to form separated land masses and similarly for water. My first goal is to use this to make a procedural world map generator for Paradox games so the water information is important because many inland water zones are un-interactable.

I greatly appreciate all the help in trying to get this to operate in a reasonable time without resorting to the GPU.

Raenir Salazar fucked around with this message at 03:27 on Oct 26, 2021

# ? Oct 26, 2021 03:17

Jabor: Jul 16, 2010; #1 Loser at SpaceChem

Raenir Salazar posted:

Hrm, IIRC in Unity3D for texture coordinates the origin 0,0 is Bottom Left no? So when I import my data from my Texture2D its 0,0 assumes bottom left? I think its been a couple of times now I said bottom left and you said top left and I want to clear this up in case it matters.

I'm not familiar with Unity3D specifically, you're probably correct here.

It doesn't really matter in this case though, the important thing is that the coordinates should be the same for both your labelMap array and your imported texture data. In the code you've presented they're swapped around (labelMap[j, i] corresponds to the map pixel map[i*width + j]), you need to un-swap them.

# ? Oct 26, 2021 03:28

Raenir Salazar: Nov 5, 2010; College Slice

Hrm, I'm still having a little trouble wrapping my head around as there were some revisions between postings of code but here's my updated code and I switched the 2D labels array to a 1D array and it shaved off another 2 seconds.

This revision swaps i and j to x and y to hopefully be more clear.

code:

        List<List<Coord>> regions = new List<List<Coord>>();
        NativeArray<Color32> map = InMap.GetRawTextureData<Color32>();

        int maxWidth = InMap.width;
        int maxHeight = InMap.height;

        int[] labelMap = new int[maxWidth * maxHeight];

        // pre-allocated coords, seems to shave 0.1s, effect unclear.
        Coord[] spareCoords = new Coord[maxWidth * maxHeight];

        int currentLabel = 0;
        List<LabelPair> Labels = new List<LabelPair>();

        // init array of neighbouring pixels which is
        // 2 pixels with two channels (for x,y).
        int[][] neighbours = new int[2][];
        neighbours[0] = new int[2];
        neighbours[1] = new int[2];

        int neighCounter;

        // we know immediately that (0,0) is unlabelled as its our origin,
        // so we pre-label it and create our first region
        currentLabel++;
        labelMap[0] = currentLabel;
        List<Coord> firstRegion = new List<Coord>();
        firstRegion.Add(new Coord(0, 0));
        Labels.Add(new LabelPair(currentLabel, firstRegion));

        // prefill the bottom row and leftmost column as currently it is 
        // assumed the edges are all ocean and they are all trivially connected.
        // this might not always be true but thats for future me to solve.
        // we add these coordinates to the current label and current region
        // bottom row
        for (int x = 1; x < maxWidth; ++x)
        {
            labelMap[x] = currentLabel;
        }

        // leftmost column
        for (int y = 1; y < maxHeight; ++y)
        {
            labelMap[y * maxWidth] = currentLabel;
        }

        List<List<Coord>> lists = new List<List<Coord>>();
        // pre-allocating, seems to save maybe 0.2s? Effect Unclear.
        for (int i = 0; i < 16000; ++i)
        {
            lists.Add(new List<Coord>());
        }

        int minLabel = 0;

        for (int y = 1; y < maxHeight; ++y)
        {
            for (int x = 1; x < maxWidth; ++x)
            {
                neighCounter = 0;
                // check neighbours
                // can skip seeing if labelled because 
                // every pixel we visit isn't labelled.

                // instead we see if they match the colour
                // then we get the min label of South or West
                // neighbours to get current label to assign;
                // if there's no match then assign a new label

                // check if West tile is the same colour as the current pixel
                if (ColourEquals(map[y * maxWidth + x - 1], map[y * maxWidth + x]))
                {
                    neighbours[neighCounter][0] = x - 1;
                    neighbours[neighCounter][1] = y;
                    ++neighCounter;
                }

                // check if South tile is the same colour as the current pixel
                if (ColourEquals(map[(y - 1) * maxWidth + x], map[y * maxWidth + x]))
                {
                    neighbours[neighCounter][0] = x;
                    neighbours[neighCounter][1] = y - 1;
                    ++neighCounter;
                }

                spareCoords[y * maxWidth + x].x = x;
                spareCoords[y * maxWidth + x].y = y;

                if (neighCounter > 0)
                {
                    // if there's at least one match
                    // get the smallest label of the two adjacents that shares
                    // the colour as the current pixel
                    minLabel = Mathf.Min
                        (
                            labelMap[neighbours[0][1] * maxWidth + neighbours[0][0]],
                            labelMap[neighbours[neighCounter - 1][1] * maxWidth + neighbours[neighCounter - 1][0]]
                        );

                    // set the current label to the label map
                    labelMap[y * maxWidth + x] = minLabel;

                    // add the current coordinate to the appropriate region
                    // appropriate region is -1 because of offseting in the list
                    Labels[minLabel - 1].regionCoords.Add(spareCoords[y * maxWidth + x]);
                }
                else
                {
                    // increment current label id
                    ++currentLabel;

                    // create new label with new id
                    Labels.Add(new LabelPair(currentLabel, lists[currentLabel]));

                    // add the current coordinate/pixel to the region list of coords
                    Labels[currentLabel - 1].regionCoords.Add(spareCoords[y * maxWidth + x]);

                    // set the label to the label map
                    labelMap[y * maxWidth + x] = currentLabel;
                }
            }
        }

Does this still have the locality issue with not using the cache and the 2s boost a result of improved spatial locality, but it could still be better through swapping x and y? Or was this another way of integrating that solution?

e to add: I'm basically at 5.8s now when it used to be 8.8s; which is pretty good.

e2: I also changed ColourEquals to only compare the red channel, which strangely didn't net very much.

Raenir Salazar fucked around with this message at 04:43 on Oct 26, 2021

# ? Oct 26, 2021 04:35

Jabor: Jul 16, 2010; #1 Loser at SpaceChem

Yup, that sounds about right. The way you have it is correct - note that immediately after processing, say, element 2037, you then proceed on to element 2038, which will be right next to element 2037 in both arrays. That gives you very good locality.

I think further optimizations are possible, but you'd need to change the structure you're outputting (but at a guess, the "bag of coordinates" you're currently outputting is going to be difficult to do anything performant with downstream anyway...).

My first thought would be that the whole "finding which label is smaller when there are two possibilities" part is totally redundant - if you're going to be using UnionFind to merge the labels together, it doesn't actually matter which one you picked for this node. So you could use an algorithm like:

code:

if (cell left is in the same region) {
  use left label
  if (cell above is in the same region) {
    merge top and left labels in UnionFind
  }
} else if (cell above is in the same region) {
  use top label
} else {
  mark with new label
}

I decided to prefer the left label over the top label to match the direction we're scanning the array in - we want to process all the pixels where regions X and Y are adjacent to each other all at once (as we're scanning along that row), instead of having that interaction repeatedly reappear as we come to one particular column.

e: I guess swap "above" with "below" for your use case, I know I'm going to keep messing that up so I won't even try to fix it.

Jabor fucked around with this message at 05:06 on Oct 26, 2021

# ? Oct 26, 2021 04:57

Raenir Salazar: Nov 5, 2010; College Slice

I'll give that a look tomorrow!

Someone in a discord I'm in suggesting Vectorizing my loops to take advantage of SIMD instructions; would that gain any benefit? I was doing some googling but nothing I could easily implement came up.

e: A clever solution I think might just be not to do the "bag collection" of pixels here at all; just record all the label groups and then move the pixel collecting to the GPU where I can just assign all the pixels to the relevant arrays and return them somehow or use ParallelFor since its a simpler problem space by separating them?

Raenir Salazar fucked around with this message at 05:24 on Oct 26, 2021

# ? Oct 26, 2021 05:08

Jabor: Jul 16, 2010; #1 Loser at SpaceChem

Generally speaking, vectorizing with 4-way SIMD instructions will improve performance somewhere between "not at all" and a 4x improvement depending on the problem.

I'm not too sure it'll help in this case - critically, the label you assign to the fourth value you look at is going to depend on what label gets assigned to the preceding three values - that dependency is going to heavily limit how much benefit you can get from vectorizing. It's more useful for totally-parallel operations where the output from one element only depends on a defined set of inputs, and not on the output from other elements.

You could perhaps look at a diagonal stripe all at once - say, process (1,4),(2,3),(3,2) and (4,1) as one unit, followed by (1,5),(2,4),(3,3) and (4,2), that way all of the outputs each one depends on will have already been written. You'd need to write some special-case code to handle the start and end of each row. It's questionable whether that would actually improve things though - if your existing solution was constrained by memory bandwidth instead of CPU cycles, it probably wouldn't.

If you really wanted more performance, you could learn about using profiling tools to identify what parts of your code are the bottlenecks where improving them would net the biggest gains.

# ? Oct 26, 2021 05:34

Raenir Salazar: Nov 5, 2010; College Slice

I went and did what you suggested plus I removed the code were I added Coords to a List and I'm down to 1.8s! :cheers:

e code:

code:

       List<List<Coord>> regions = new List<List<Coord>>();
        NativeArray<Color32> map = InMap.GetRawTextureData<Color32>();

        int maxWidth = InMap.width;
        int maxHeight = InMap.height;

        int[] labelMap = new int[maxWidth * maxHeight];

        int currentLabel = 0;
        List<int> SimpleLabels = new List<int>();

        var timer = new System.Diagnostics.Stopwatch();
        timer.Start();

        // we know immediately that (0,0) is unlabelled as its our origin,
        // so we pre-label it and create our first region
        currentLabel++;
        labelMap[0] = currentLabel;

        SimpleLabels.Add(currentLabel);

        // prefill the bottom row and leftmost column as currently it is 
        // assumed the edges are all ocean and they are all trivially connected.
        // this might not always be true but thats for future me to solve.
        // we add these coordinates to the current label and current region
        // bottom row
        for (int x = 1; x < maxWidth; ++x)
        {
            labelMap[x] = currentLabel;
        }

        // leftmost column
        for (int y = 1; y < maxHeight; ++y)
        {
            labelMap[y * maxWidth] = currentLabel;
        }

        for (int y = 1; y < maxHeight; ++y)
        {
            for (int x = 1; x < maxWidth; ++x)
            {
                // check neighbours
                // check if West tile is the same colour as the current pixel
                if (ColourEquals(map[y * maxWidth + x - 1], map[y * maxWidth + x]))
                {
                    labelMap[y * maxWidth + x] = labelMap[y * maxWidth + x - 1];
                }
                // check if South tile is the same colour as the current pixel
                else if (ColourEquals(map[(y - 1) * maxWidth + x], map[y * maxWidth + x]))
                {
                    labelMap[y * maxWidth + x] = labelMap[y * maxWidth + x - 1];
                }
                else
                {
                    // increment current label id
                    ++currentLabel;

                    // create new label with new id
                    SimpleLabels.Add(currentLabel);

                    // set the label to the label map
                    labelMap[y * maxWidth + x] = currentLabel;
                }
            }
        }

What I could do now is ParallelFor those two initial setup loops and maybe investigate a pre-allocated int array for SimpleLabels and maybe this is as good as it gets?

Raenir Salazar fucked around with this message at 05:40 on Oct 26, 2021

# ? Oct 26, 2021 05:37

Jabor: Jul 16, 2010; #1 Loser at SpaceChem

What to replace the bag collection with is going to depend a lot on what you intend to do with this output. For a lot of use cases, a single point within the region is equally as useful as the entire bag of points, because what the consuming code really wants to do is something like find all the boundary points, or work with a 2d bitmask of in vs. out points, or something like that. But it really depends on what you want to do.

# ? Oct 26, 2021 05:39

TheLastManStanding: Jan 14, 2008; Mash Buttons!

Raenir Salazar posted:

What I could do now is ParallelFor those two initial setup loops and maybe investigate a pre-allocated int array for SimpleLabels and maybe this is as good as it gets?

What are you trying to do with SimpleLabels? It looks like it's just an array of numbers that counts up to whatever currentLabel is at?

# ? Oct 26, 2021 05:49

Raenir Salazar: Nov 5, 2010; College Slice

Jabor posted:

What to replace the bag collection with is going to depend a lot on what you intend to do with this output. For a lot of use cases, a single point within the region is equally as useful as the entire bag of points, because what the consuming code really wants to do is something like find all the boundary points, or work with a 2d bitmask of in vs. out points, or something like that. But it really depends on what you want to do.

Figuring things out like the size of the regions is a little important because some of the "islands" are going to be really tiny and need to be consolidated together with other tiny islands as being the "same" province or with the nearest larger landmass province and be "merged" with them. (Merging being that they get coloured the same unique colour as the rest of the province they are a part of for province 'selection' purposes)

I dunno off hand if I want information like all the pixels that form the outer edge of a area but I might.

One important thing and partly why I went to this trouble is also so I can randomly select only land pixels or only water pixels for generating delaney triangulation points to form a voronoi map to create provinces and I want to be able to deterministically say "Okay I want the map to have 1,000 provinces" and have exactly that, and my initial idea is just selecting 1,000 random land pixels distributed by area size to proportionally distribute them out.

Assuming the average province is 20,000 pixels in area and I know an island has 2,000,000 pixels they should have on average 100 provinces and each area I can determine their share of the total number of provinces and assign nearly exactly that many.

And islands that are too small but are close together would be a special case to consolidate them together as 1 province instead of assigning 1 triangulation point to a blob of merely 30 pixels which might be too small to even click.

TheLastManStanding posted:

What are you trying to do with SimpleLabels? It looks like it's just an array of numbers that counts up to whatever currentLabel is at?

Its essentially list of detected regions; prior to the second pass anyways.

e: Now I think I see what you're getting at, as the code currently stands I don't need a list of them; but the second past of the algorithm I think might need them because it involves merging equivalent labels together.

Raenir Salazar fucked around with this message at 05:57 on Oct 26, 2021

# ? Oct 26, 2021 05:51

roomforthetuna: Mar 22, 2005; I don't need to know anything about virii! My CUSTOM PROGRAM keeps me protected! It's not like they'll try to come in through the Internet or something!

roomforthetuna posted:

I think done right it should be achievable in under 2 seconds

Raenir Salazar posted:

... and I'm down to 1.8s!

# ? Oct 26, 2021 12:43

HappyHippo: Nov 19, 2003; Do you have an Air Miles Card?

This is fun, you went from 3 minutes to under 2 seconds

# ? Oct 26, 2021 15:07

Raenir Salazar: Nov 5, 2010; College Slice

Sad news is adding in the UnionFind code and the additional if comparison to record the equivalences adds between 4 to 6s to the total runtime.

For reference, code from here: https://www.codeproject.com/Articles/336915/Connected-Component-Labeling-Algorithm

code:

    internal class Label
    {
        #region Public Properties

        public int Name { get; set; }

        public Label Root { get; set; }

        public int Rank { get; set; }

        #endregion

        #region Constructor

        public Label(int Name)
        {
            this.Name = Name;
            this.Root = this;
            this.Rank = 0;
        }

        #endregion

        #region Public Methods

        internal Label GetRoot()
        {
            if (this.Root != this)
            {
                this.Root = this.Root.GetRoot();//Compact tree
            }

            return this.Root;
        }

        internal void Join(Label root2)
        {
            if (root2.Rank < this.Rank)//is the rank of Root2 less than that of Root1 ?
            {
                root2.Root = this;//yes! then Root1 is the parent of Root2 (since it has the higher rank)
            }
            else //rank of Root2 is greater than or equal to that of Root1
            {
                this.Root = root2;//make Root2 the parent

                if (this.Rank == root2.Rank)//both ranks are equal ?
                {
                    root2.Rank++;//increment Root2, we need to reach a single root for the whole tree
                }
            }
        }

        #endregion
    }

If check:

code:

                if (ColourEquals(map[y * maxWidth + x - 1], map[y * maxWidth + x]))
                {
                    labelMap[y * maxWidth + x] = labelMap[y * maxWidth + x - 1];
                    
                    if (ColourEquals(map[(y - 1) * maxWidth + x], map[y * maxWidth + x]) &&
                        labelMap[(y - 1) * maxWidth + x] != labelMap[y * maxWidth + x - 1])
                    {                        
                        Labels[labelMap[y * maxWidth + x]].Join(Labels[labelMap[(y - 1) * maxWidth + x]]);
                    }
                }

Putting in the and to check to see if they're different makes it run 2 seconds faster; so that UnionJoin seems to be pricey.

Goes from 1.8s, to 3.2s with the additional if check to 5.2s with the Join.

Without the != it would be 8.5s or so.

My code for the second pass which seems to take 2.2s:

code:

        int WxH = maxWidth * maxHeight;
        for (int i = 0; i < WxH; ++i)
        {
            newLabelMap[i] = Labels[labelMap[i]].GetRoot().Name;
        }

If there's anything obvious I could do I would appreciate it, but otherwise I'm going to move on to putting this data into a shader to colour my texture to make sure its working even if it does sadly add about 6 seconds to the total runtime; but in any case getting it to run under 10s exceeded my wildest dreams, so lots of thanks already.

# ? Oct 28, 2021 04:30

chglcu: May 17, 2007; I'm so bored with the USA.

I�m not seeing the declarations for Labels and labelMap. What types are they? I�m wondering if those index operators might be slow, but hard to tell without knowing what exactly they are. The join itself doesn�t immediately look like it should be that big of an impact to me.

# ? Oct 28, 2021 04:44

Raenir Salazar: Nov 5, 2010; College Slice

chglcu posted:

I�m not seeing the declarations for Labels and labelMap. What types are they? I�m wondering if those index operators might be slow, but hard to tell without knowing what exactly they are. The join itself doesn�t immediately look like it should be that big of an impact to me.

No problemo.

code:

int[] labelMap = new int[maxWidth * maxHeight];

code:

Dictionary<int, Label> Labels = new Dictionary<int, Label>();

# ? Oct 28, 2021 04:47

chglcu: May 17, 2007; I'm so bored with the USA.

Raenir Salazar posted:

No problemo.

code:

int[] labelMap = new int[maxWidth * maxHeight];

code:

Dictionary<int, Label> Labels = new Dictionary<int, Label>();

I guess it could be the dictionary lookup. Can that be made an array using your keys as indices? Can�t tell for sure at the moment if the keys it�s using are a contiguous sequence.

# ? Oct 28, 2021 04:51

Adbot: ADBOT LOVES YOU

# ? May 25, 2024 10:19

Raenir Salazar: Nov 5, 2010; College Slice

chglcu posted:

I guess it could be the dictionary lookup. Can that be made an array using your keys as indices? Can�t tell for sure at the moment if the keys it�s using are a contiguous sequence.

I think they are, making it an array instead actually makes it about 2.2s faster. Largely negating the time of the second pass.

Makes it around 5s exactly now as the total runtime. So subtract 2.2s and we're down to 2.8s for the first pass.

e: There's an alternate Weighted Quick Union Find out there here but I'm confused as to how I'm supposed to use it.

It seems like you instantied the data structure with a pre-set number of "dots" and then can start doing your unions but I don't know if that's supposed to actually correspond to something, like the approximate number of labels? Do I pass in 20,000 and then use it as is?

Raenir Salazar fucked around with this message at 05:05 on Oct 28, 2021

# ? Oct 28, 2021 04:58

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > Game Development Megathread

«‹›623 »