Register a SA Forums Account here!
JOINING THE SA FORUMS WILL REMOVE THIS BIG AD, THE ANNOYING UNDERLINED ADS, AND STUPID INTERSTITIAL ADS!!!

You can: log in, read the tech support FAQ, or request your lost password. This dumb message (and those ads) will appear on every screen until you register! Get rid of this crap by registering your own SA Forums Account and joining roughly 150,000 Goons, for the one-time price of $9.95! We charge money because it costs us money per month for bills, and since we don't believe in showing ads to our users, we try to make the money back through forum registrations.
 
  • Locked thread
Quaternion Cat
Feb 19, 2011

Affeline Space
I'd enjoy hearing some more about the game design overall, but in particular the challenges you faced and the solutions you came up with for making a (relatively?) balanced layout of items (and the data structure used) for this type of game - how have things changed (if any) from eg Cave Explorer? I had a brief look at the source, it's not exactly comment heavy but it's not unnavigable either - a higher level description would be cool. Also, I guess a little under half the ROM is the graphics - how big do you think a game like this could get with a simpler visual feel? (btw the graphics in Eaty are really good)

Adbot
ADBOT LOVES YOU

Quaternion Cat
Feb 19, 2011

Affeline Space
Thanks for talking about Eaty, I'm really interested in the adventure/exploration genre and wonder how far it could be taken on chip-8.

Additionally, it was suggested to me on IRC that I expound somewhat on pretty much the 2 main components of Octopeg:
1) dealing with large decimal numbers, in the order of hundreds of thousands or millions
2) Interesting looking collisions in an 8 bit math space.

This is however going to come across as more of a badly written report on my experience writing Octopeg, so, bear with me. Honestly, skip to the start of the pictures if you want to know about the collisions, rather than the inane drivel that is me talking about my bad decimal math engine.

Octopeg has a set of functions called BigDecimal for the score and the decimal mathematic trappings thereof. Naturally, I looked at the BCD function for this - it converts a 0-255 register into 3 bytes of hundreds, tens, and units, writing them into 3 bytes of memory based from I. The issues with this were that:
a) I knew I was going to want to have a number bigger than this
b) If you don't have any eg hundreds, you will still write the hundreds bytes value, so I can't easily pack them together in memory.
c) The order in which the digits are written into memory - bcd writes in Hundreds, Tens, Units. This means tracking the I register backwards if your order increases to eg thousands etc.

So I got to thinking and came up with a (not very good) solution: Using the same method you're taught as a kid to add up digits. Yeah...
I decided to lay out, in memory, individual bytes for each decimal digit 0-9 for an arbitrary number of digits, in little endian byte order, so, the first byte is the units, then tens, then hundreds etc. I eventually had 3 of these memory addresses I could refer to, but in their infancy, they were just reading/writing directly to the score. These 3 memory area are A, B and C, running through each digit of A, add each digit of B, and put the result in C and carry any overflow above 10. And then all of the code would modify each of the acquisition functions to add one to the addresses, and repeat.

If you've read that and thought about it any, I imagine you can see that it's actually kind of crap and could be done a lot better. My initial plan was to award points for every collision in real time, however, my strategy here ended up being far too slow, and I replaced it. However, this wasn't entirely a dumb solution.

One of the key elements is that if you were adding a small number to a much larger number, you can stop early rather than process all digits. I decided that B would be the small increasing value (eg 150 points), terminated with 0xff, so in memory would appear as eg 0 5 1 -1. We can conclude two rules about this process: we know that we must continue until we hit this -1 in B, and until we have no carry value. So if we were to add this to 1239850, you would continue to carry into the 10000s column, observe that you have both run out of B to add, and your carry is now empty, and do not need to look at any further digits of the number, so you simply stop processing at the '4' in your result, 1240000. If no carrying were required, we would simply stop after we hit the -1 element of B. Not only does this save you from having to actually access the memory, it saves you from having to redraw, too. This, it turns out, is not the problem.

This result worked ok for fixed bonus values, which I could prewrite into memory, and dealing with however many points you'd earnt during a turn, but, actually calculating those points in a turn was a different kettle of fish now. I decided that the scoring system for Octopeg should award you additional points the more full your combo/fever bar is, hoping to make it so trying to hit as many as possible with a full fever bar before dunking into a pit of your choosing was worth while. Tying these two things together wasn't straight forward, thus, I went back and reconsidered my score format.

Drawing a much simpler 'you hit something' effect was top of the agenda, so, a memory value was read, incremented, and used for a single line's height - gradually filling the combo bar as you hit more pegs. The 'fill' amount of this bar was left in a register, and was the perfect tool for doing the scoring; I set a register to -100 at the start of each turn, and add to this the 'fill' amount, sans the 3 least significant bits. This meant that, for every 8 lines (lined up with the indicators on the bar), you would get an extra 8 points, so, 8, 16, 24 etc points per peg hit depending on the bar, plus an extra 20 if you had the bar filled (64 + 20) - the only requirement is that it be < 100. This value was added to my register storing -100. However, since -100 isn't really a number, really it's just 156, when you eventually add a total > than 100, it would result in an overflow and vf being set to 1. When vf is 1, you subtract 100 from the register, and I used a memory read/write to increase the number of 'hundreds or more I guess' in memory. Storing this overflow in a register would be important to speed things up.

At the end of the turn, subtracting 156, or, adding 100, to the points tracker, combined with use of the previously discarded BCD command, would give access to the units and tens for your score that turn. Rearranging and copying it into my 'points this turn' memory address for BigDecimal, repeating for the 'hundreds or more I guess', followed by a -1, would offer the required format. I also did the classic 'put a fake 0 in' so you got 80 points+ instead of 8, so, the most points you can get in a turn is 255960 I guess - at the maximum bonus of a full fever bar, this would only be like 38 pegs, so, uh, that might actually be possible to hit but it's pretty unlikely.

The short of it is, If I had to do it again (and I probably will), I would design it quite differently:
Firstly, I would remember that you can do I += vn, a command I completely forgot existed until after I'd written BigDecimal. Instead of just having self modifying code that sets I to a given value and then applying a vn offset, it reloads the memory into v0 and v1, sets a working register to one, adds it to v1, handles overflows, and then resaves those memory locations, for all of A, B and C, for each digit, when it could simply store an offset value in a register and skipped 80% of that.
Secondly, I would swap to a big endian byte order and simply start my offset at eg 7 and reduce it each digit so that BCD would be easier to integrate with it.
Finally, BigDecimal was designed to minimise redraws, something it is no longer concerned with and was likely a misguided goal anyway. As a result, there's really no reason to keep it as single digits in each address - I'd pack each byte much the same way as I tracked points in a turn, and would BCD out the units and tens of each value when having to draw the score.

Ultimately I'm not very proud of it at all, but, it did get me really comfortable making tools and dealing with self modifying code, so, there's that.

What I am proud of, however, is the collision mathematics. I invested a lot of time into getting the collisions in Octopeg as reliable and as interesting to look at as I could. Collisions are of course, much more interesting with diagrams, so, there'll be some of those this time!

The first step in having things collide is to know if they are colliding or not. Of course, we use chip 8's vf register to let us know if we drew over a pixel that was already on. However, what do we do next?



I reuse the peg sprite, a single pixel, to search each pixel the ball is composed of. You may note the testing order is a little unusual, but, that's how I ended up doing it. When vf is triggered again during this 1 pixel search, you detect which pixel of the ball collided with the peg. This is just the first step in a long and arduous process, but I hoped to use the above image to draw you in~

Really though from here I need to talk about the ball, and how it moves. The ball's position is expressed to some extent by using 'subpixels', a technique where you define the position, and sort of more importantly, the velocity, with greater precision than you can actually display. There are two obvious ways you could do this: Firstly, in lores mode, the display is only 64 x 32. These only require 6 and 5 bits to define every pixel of. This means that you could store additional information in the 2 and 3 respectively unnecessary bits of the 8 bit registers you define the x and y position in - for example, 0 through 7 could all be quantised to a height of 0, 8 through 15 to 1, and so on. The main problem with this strategy in Chip-8 is that bit shifts are one bit at a time, so you'll need a lot of operations to redraw them, which is no good for a high speed ball moving game that will move every frame. The alternative is to have dedicated subpixel registers. These are registers that you add your velocity into instead of your position directly, and you modify the positions only when you eg under or overflow your subpixels. This is the strategy I used for Octopeg.

With the addition to these subpixel registers, I would need: position-x, position-y, velocity-x, velocity-y, subpixel-x, and subpixel-y. However, I found that I also required more information. Sign-x and Sign-y - easy to access sign bits of the velocities. This was because I needed to know if I should test for an overflow or an underflow of the subpixel registers, and how to modify the position register in such an event. An important factor to note is that, vx and vy are still stored as negative numbers when travelling in the reverse direction, and additionally, due to the seperate sign bits, the ball has a full range of speeds from -256 to 255. I could have really tightened up my code for applying the velocity to the ball if only I'd thought a bit more, but, I needed cycles for the scoring and collision process anyway and this didn't seem to be a big deal and it fell by the way side. If I'd done it right originally, the updates would look a little bit like this:
code:
sx += vx 				# Add velocity to subpixel register
if vf != signx begin  			# If we don't overflow and are -ve, or do overflow and are +ve
	if signx == 0 then px += 1 	# If we're travelling in a positive direction, increase by 1
	if signx == 1 then px += -1 	# If we're travelling in a negative direction, decrease by 1
	end
So, now you know a little bit about how the ball is updated, I can carry on with the collision description. The process of designing the collisions was somewhat iterative, the deeper we go, the higher quality the collisions appear. Let's start at the simplest things we could do when we collide: we could invert our horizontal, or our vertical velocity.

We have pixel information, how about if we attached horizontal bounces to two of them, and vertical bounces to the other two? What would that look like? By the way, some of these ball movement gifs are a little, weird? With how the ball moves? I messed something up and redoing it would take 30 minutes+, there's a youtube link at the end where they all look nice.



Well, honestly that isn't too bad. But, there are some problems:
1) It feels very... boxy? Bounces are all very similar, there is no understanding of the ball as a circle and the peg as a pin.
2) Sometimes, something like this happens:


What is happening here? Is it something we want? If not, how do you detect and resolve it?

Let's consider what we have in our tool box to help us with situation 1. By assigning horizontal and vertical bounces to each corner, we are pretending that each corner is one side of a a square, and that we are colliding with other squares. It's no wonder it looks very boxy!



What could we do instead? Well, really we have 'corners' instead of 'sides' - maybe we should pretend to be a diamond? How do diamonds collide?



OK that's great, but, what does the transform that we apply look like at that point of collision?



The answer in this case, is, noting that our y axis increases in the downward direction here, a reflection on the line y = x. In this case, our x velocity becomes our y velocity, and our y velocity becomes our x velocity. We do not however flip them. However, this only applies to 2 of our diamond sides, the other two sides reflect on a different line, the line y = -x. Hopefully you can guess from the diagram that if we had collided with this corner, the result will be a vector travelling in a -ve direction on both x and y, likely the vector (-4, -1), which is, x = -y, y = -x.

So, let's swap our vx and vy, negating as needed, swapping sign registers et al. What would this look like?



Well, that's... better? It does a lot of good bouncing around in addition to going around in circles a lot. Modelling ourselves as a diamond is a little better than modelling ourselves as a square, but, we're supposed to be a ball, circles can behave in both of these ways. Can we do better?

When we test our ball vs the peg, we test which of our 4 pixels the ball is touching the peg with. This really only gives us a small amount of information. Can we get more information about where the ball is or how it is behaving with relation to the peg? The answer to this question is, of course, yes - we have subpixel information. If we were to look at just the most significant bit of our subpixel registers, this would tell us which corner of a pixel our ball is in. If we combine this information with our collision pixel, we dramatically increase the amount of information we have available:



Now instead of just 4 zones to consider where the peg has collided with us, we have 16 zones to consider. With this multitude of zones, we could consider ourselves a square in the zones along the top bottom and sides, and consider ourselves a diamond in the zones in the corners. Because we can move a whole pixel in a single frame, which is now larger than a single 'zone', we can have the peg penetrate inside our ball model, so, we have to consider the 4 'internal' zones too. I chose to simply model these as diamond type collisions also. So now, instead of being a square or a diamond, we're pretending to be uh, this weird octagon shape:



Now, technically, there's a small problem with the way we pretend to be this shape - Firstly, we don't have sub pixel information for the peg. Secondly, we only actually know the peg even exists when it's already within our 2x2 body. So, sort of what we're doing here is wrong? If you consider the opposing peg in the top left corner vs the bottom right corner, the ball can only use the extra subpixel information to move itself down and to the right, which results in an imbalanced new perspective on the peg's location. Fortunately, we can't really do any better than this approximation anyway, and this will happen fast enough that no one will notice or really stop to think about what might be happening.



Anyway, having decided to just roll with it, we need to use some brain logic. Our collision pixels are numbered from 0 to 3, and we can combine our sy and sx msbs into the 2nd and 1st bit of another value, to create a similar 0 to 3 value. Observe this figure I drew:



So for example, if we are in the 00 pixel, then we want to be a corner in state 00 or 11 subpixel, and a side in 01 or 10. Corner for 01 in 01 or 10, side in 00 or 11, etc. You may quickly observe that the logic for this is: if CollisionPixel == SubPixel or CollisionPixel == (SubPixel ^ 0b11) then Corner else Side (^ in this case being XOR). This is a test we can easily do in CHIP-8, and it is in fact the exact test that I perform. You can talso perform tests on bit 2 of CollisionPixel vs Subpixel (match: top/bottom, different: side - eg, cp = 0b01 and sub = 0b11 => side) to work out if we want to present a horizontal or vertical side when we pretend to be a square.

Now, what does THIS look like?



Getting better. But, there's still some problems. We're modelling our ball a lot better now, but, a whiles back we noticed a problem. This:


What's going on here? In this case, our ball happens to move on x and y such that it sort of... catches? the peg with its corner. Back when we recorded this footage, we were only doing a horizontal inversion, so, here's a diagram of what's going on here:



How do we describe what is going wrong here? In just this case, if we were bouncing, ideally, we would be leaving the way we were already going. If you're mathematically minded at all, you could say something like the dot product of our relative entry velocity and a line perpendicular to our line of reflection vs our relative position vector & the same having the same signs means it will be bad. If you're not mathematically minded at all, then, well, this:



So... given uh, that, we need to somehow detect and resolve this. For a horizontal bounce like this, this is straight forwards and we actually don't need to do any maths at all really: If our horizontal velocity sign matches the appropriate bit of the detected pixel as 0bYX, then, don't bounce because it will look weird? This kind of filtering works ok for our 'square' type collision, but, if this is a problem here, it's also a problem with our diamond type collisions. How do we test the complicated dot product thing I said earlier when we reflect on the line y = x?

Well, this is actually remarkably simple. The relative position business is implicit to which of our pixels we collide with. We do however need to analyse our relative velocity. Or really just our velocity because the peg can't move. Let's look at this figure:



Here I've drawn an 'X' over our ball, and drawn on several velocities it might have. It's colliding with a peg in its upper right corner, and it's modelling a corner/diamond type collision. We expect it to reflect on the line y = x. If we came in with the Green velocity, we would expect a result that looked like the Blue velocity. And I'm sure if it came in with the red velocity it would look fine too. But, if it collided while travelling with either the Blue or Black velocities, how would it bounce? Remember that it CAN collide like this, even though the peg would have sort of had to have phased through part of the ball - we don't get to model that event as our collision data will only trigger on whole pixel moves, all we have is the state we've found ourselves in. In these cases, it would bounce 'towards' the peg, which is the precise case we don't like.

If we discount the impossible zone, where if we were travelling with a velocity in that zone, we could never collide with the peg unless we moved 2 pixels in a single step, then we can combine our sign information with another piece of information to detect these situations. If the X sign is Negative, we have a problem only if we are in the LEFT/RIGHT area of the X. If the Y sign is Positive, we have a problem if we're in the UP/DOWN area of the X. If we are not in the corresponding areas, then there is no problem. Additionally, since we know we can't be in the impossible zone, we actually know what kind of bounce we could do *instead* of a diamond bounce. If we were the blue line, we could do a vertical bounce, and if we were the black line, we could do a horizontal bounce. That's great!

Now, how do we know which zone of the X we are in? Well, it's actually super simple: if our X velocity is greater than our Y velocity, then, we must be in the LR zone. If it's not, then we're in the UD zone. It's that simple. I wrote all of the pixel match ups vs LR/UD states by hand, I'm sure there's some bit comparison you could do, but given Chip-8's options, I'm not sure it would work out faster. Now that we filter these 'unwanted' collisions, what does this look like?



That's not bad... but, can we do better? Currently, we present a very large 'flat' surface to our pegs. These result in horizontal or vertical reflections. These are boring reflections, and they don't add much dynamicism to our ball as it bounces around. Could we take our diamond type reflections further? What if instead of presenting as that octagonal shape from earlier, we present ourselves as this shape:



These new edges would generate reflections on the lines y=2x and y=0.5x (and the inverses there of). Having solved our problems with reflecting into pegs by comparing the ratio of y vs x, we can imagine that we will need to do the same thing if we want these fancier reflections. Fortunately, we can do a couple of bitshifts, if we shift right, then we divide by 2, so, we can compare vx with a shifted vy, and a shifted vx with vy. If 0.5 of vx is still bigger than vy, then, we must be really moving it from left to right, hence, we're in a super LR zone, and if 0.5vy is bigger than vx, then we must be in a super UD zone.



We can use these zones to filter out our unwanted reflections for these news lines of reflection, although we do need to select which one is appropriate for our collision type, so, there has to be some canoodling of data between our subpixel state and our X zone detection/selection. I opted to determine all the zone states into a single register as packed bits (so eg 0x7 would be in the UD zone for all Xs), and in a bitmask depending on subpixel state to determine which one we care about, and then tested if the X zone was 0 or not when filtering collisions.

Anyway, so, while we can adequately filter these new collision types, how do we actually reflect on these lines. Well, I thought this was easy, 'oh yeah you just add some divided by two bits around it'll be simple'. I was quite wrong; the correct equation for reflecting on the line y = 2x is, as you can see here, x = 0.8y - 0.6x, y = 0.8x + 0.6y, or 4/5s and 3/5ths respectively. The different arrangements of -2x, 0.5x etc use these same coefficients, but, juggle the signs around, so eg 0.8 + 0.6 etc. Doing division by 5 was uh, not going to be a thing I wanted to do or even something I actually know where to get started with in 8 bit so, I decided that, maybe 0.75 and 0.5 would be ok. So that's 1 shift right for 0.5, and then another for 0.25, and then adding that in as required for each value.

The first time I tried this however I ran into a problem. First of all, I completely lost track of the correct value for the separate sign bit, second of all, 255 * 0.75 + 255 * 0.5 is going to be > 255, so, there's an overflow problem to consider too. As a result, I decided that the correct course of action was to a) shift everything down by 1 and throw a bit away so that I could detect and cap the velocities when this overflow/underflow for -vs would occur and b) shift everything down by another one and use the most significant bit as a sign bit, and do real twos compliment math. This means that at the start of this routine, there's 3 bitshifts down, and then oring in 0xE0 if the sign bits are set to fill out the 3 MSBs correctly.

The result is that, when shifting back up, you first shift out your new sign bit into vf, retain this in some way (I used a logic branch) then shift again and compare vf. If the first bit was 1 and the 2nd 0, then, we are travelling -ve and have had an underflow, thus, cap at -ve vel max, if the first bit was 0 and the 2nd was 0, then, we're +ve and we haven't had an overflow. In addition, there is the special case of 0b1100000, which will be -256 when shift back up. This isn't going to work for us unfortunately, so it has to be detected and capped to -ve vel max also. The ultimate result of all of this is actually to rob quite a lot of velocity away from the ball. As it turns out though, that's actually a good thing. Slowing the ball down in some way was an important elemnt, and in fact, it's probably one of the hardest elements to implement independently, and these bounce types gave it for free.

And that's it, that's Octopeg's peg collision & reflection system in a very large and extremely longwinded nutshell. The final result looks like this:



And here's that youtube link of all the bounce types with the right fps:

https://www.youtube.com/watch?v=qY2OpMRxsGk

edit - fixed a 0x11 to 0b11 and also :3: VVV also technically I mostly meant that it would get a bad grade or something because I wrote 'I did x' a lot.

Quaternion Cat fucked around with this message at 09:15 on Nov 6, 2015

Quaternion Cat
Feb 19, 2011

Affeline Space
I tried this on a few devices, a samsung s2, an iPad mini 4 and running from 1 to 6 instances on my laptop, all report 60.08kOps - is this technique working as expected?

Quaternion Cat
Feb 19, 2011

Affeline Space

Centripetal Horse posted:

Please tell me this is going to be Scorched Earth: CHIP-8 Edition.

So, uuhhhhh, think I ought to link you to one of the submissions for octojam II: http://octojam.com/octojam-ii/games/t8nks

Quaternion Cat
Feb 19, 2011

Affeline Space

taqueso posted:

Yeah, that was the idea.

Aw man, hope that I didn't just rain on your parade. IIRC both T8nks and Octopeg (the latter of which I made) only support at most a single pixel of movement per projectile per frame, so, they kinda each have their limitations which could be overcome with a more focussed development.

Adbot
ADBOT LOVES YOU

Quaternion Cat
Feb 19, 2011

Affeline Space

taqueso posted:

I'm solving equations approximating a projectile's trajectory (like you would use in high-school physics class) for x and y each frame and incrementing a fixed-point time value.

Riiiight, hence the multiplication... that's actually really cool! Using the suvat equations of motion could allow for very specific modelling of a projectile, such as (off of the top of my head) mixed magnification levels, overcoming 1 px per frame maximum movement speed, and rewindability/replayability.

edit: also slow motion

Quaternion Cat fucked around with this message at 02:33 on Jun 18, 2016

  • Locked thread