|
Oh, this should be fun to follow along with, and I like the idea of this being more about reading the code as opposed to trying to actually pretend to enjoy the games. I too grew up with the Ahl books in the library, and thanks to your posts earlier in the retrogaming forum, I wound up nerdsniping myself with the next game in this article. ... good luck explaining the program code as-is. I managed to work out its logic, but my conclusion for improving the program involved burning something like 80% of it completely to the ground.
|
# ¿ Jun 23, 2020 05:25 |
|
|
# ¿ May 21, 2024 06:34 |
|
I've still got some prep work to do on it, but I'm going to just note before crashing out that there is a functionality-breaking typo in this listing. Line 695 is incorrect. It should be possible to figure out what it should be even without working out the full details of the listing.
|
# ¿ Jun 23, 2020 09:58 |
|
Guest Lecture: Amazing I'm not going to do a line-by-line exegesis of the code the way FredMSloniker has, for three reasons. First, many of the basics have already been explained, at a line-by-line level. Second, this code is extremely repetitive and, as we'll see, also very inefficient of programmer time. Finally, looking at the trees gives us no insight into the forest here. What I will mostly instead be doing is describing how I analyzed the program, expecting to replicate it. There is one new BASIC construct in the code, though, so I will point that one out when we reach it. That said, the first thing I did with the program was type it in while being mindful of what it was I was typing. I would not expect full understanding, but a core skill here is to be able to maintain incomplete truths in your head as you go, trusting that you will be able to make sense of them later. The Prologue The opening is mostly boilerplate, but we're definitely not off to a good start here: code:
Note that even though we're using IF/GOTO here, this is still basically a do-while loop in a language like C or Java. Line 110 then dimensions two arrays named W and V that match the dimensions we've set. This is the first place I have to deviate in my personal copy of the code; I'm using a QBASIC-compatible variant of BASIC for this and that means I can't have a variable and an array be named V. My personal copy gets H and V renamed XM and YM, for X-Max and Y-Max. Lines 120-150 produce a symmetrical spacing around the prompt to match line 30. code:
code:
pre:Repeat with I running from 1 to H If I=X PRINT ".--"; Otherwise PRINT ". "; PRINT "." and end the line (As an aside: when some thicket of GOTOs turns out to be organized in a way that I can write it out as clean Pascal-like pseudocode, I will be calling it well-structured. Otherwise, I will be calling it ill-structured.) code:
The prologue ends by jumping into the middle of the next chunk of code. We have left the land of well-structured code behind. The Main Program Logic ... is a big, ill-structured mess, in a way that makes it very hard to pin down what's actually happening on a first pass through the code. I do deduce some things about the variables, though:
So let's put this mess behind us momentarily and look at the final part of the program, lines 1010-1073. Printing Out The Maze This is a series of nested, well-structured loops full of equally well-structured conditional printing statments: code:
I'm a little bit cranky about that, to be honest; there's a very noticable delay between the first line of the maze and the rest of it, when you run it. Better, I think, to rely on the fact that the values in W() are ever increasing. We don't need to dump the top row right away, just as we've determined what the first cell is; as we discovered in the main program loop, writes to W() are ever-increasing and so we could identify column I as the entrance simply by seeing if W(I,1) was 1! Lines 1045-1060 print out the southern walls of each cell based on, essentially, whether or not V(I,J) is even or not; it's checking against the specific values of 0 and 2, but we noticed when typing in the main program loop that V never receives a value above 3. Similarly, and moving backwards slightly, the westernmost wall has always exists and is unconditionally printed at the start of each row by line 1011, and lines 1013-1030 conditionally print out an eastern wall depending on whether the value in V(I,J) is less than 2 or not; a value of 0-1 means there is a wall, and a value of 2-3 means there is a passage. This means that we now know what the V() array is for! It's tracking the actual shape of the maze, and each entry is basically two flags. Each cell's value starts at zero. If there is a passage in this cell to the south, add 1. If there is a passage to the east, add 2. This also means there's no need for any special logic to print out the exit the way we needed it for the entrance; it is stored in V() the same way everything else was. Furthermore, alas, we have nowhere else to hide. We must now dive into the maelstrom of the maze generation logic. Deciphering the Main Program Loop The first time we enter the main loop from lines 210-1000, we skip to line 260. What did we miss? code:
pre:Repeat If R is not H, increment R. Otherwise, if S is not V, reset R to 1 and increment S. Otherwise, R and S are both reset to 1. While W(R,S) is zero This code also shows R and S properly in use. R and S were initialized to X and 1. Throughout the program, they are only ever used by checking their values against 0, H, or V (H always for R, and V always for S), or as an index into the W() and V() arrays. Their initial value is (X, 1)—the maze entrance. When written, they are only ever incremented, decremented, or force-set to 1. R and S are specifying a pair of coordinates within the maze grid, serving much the same purpose as a pointer or a reference, just, well, in a language that has neither. If you find some old enough textbooks, you will see this technique given a name: the (R,S) pair is a cursor into V() and W(). And very like a cursor, what this loop is doing is advancing the cursor through the maze as if it were scanning text on a page, looking for the next cell for which W(R,S) is not 0. If it hits the end, it starts over at the top left. This is a routine that causes the cursor to scan for the next maze cell that has already been visited. The next wodge of code to structure is rather large and indigestible: code:
pre:Loop forever: If NeedScan is true: (lines 210-250) If DoFirstBit is true: (lines 260-520) (rest of loop) It is also worth asking why line 260 isn't 260 IF R-1=0 OR W(R-1,S)<>0 THEN 530. (One might also ask why it's not testing R=1 rather than R-1=0.) This turns out to vary by BASIC dialect. In QBASIC, you could collapse those lines without incident, but in GW-BASIC, there's an ugly corner case. Tests that include AND and OR, in modern languages, are said to short-circuit; once it knows what the result must be it stops evaluating terms. This is true in QBASIC, but in GW-BASIC every term is evaluated. That means that line 260 is fine, but if you were to collapse, say, lines 290 and 300 into IF R=H OR W(R+1,S)=0 THEN 330. it would check if R=H, and then even if it was it would check to see if W(R+1,S)=0, which probably will crash your interpreter since we only DIMensioned W() to have a maximum width of H! QBASIC, like C, Javascript, or even Lisp (predating BASIC by a decade), halt evaluation when the first term comes out true. If I were doing this, and I had the spare memory—and for any maze that will actually fit on my screen, I absolutely do, even if I only have 8KB of RAM—I would collapse them anyway for clarity and just dimension in an extra row and column for W(). ON-GOTO and ON-GOSUB We also have a new BASIC statement! ON X GOTO L1,L2,L3...,Ln looks at the integer value of X and then consults the supplied table of N line numbers. If X is between 1 and N, it goes to the line number that is the Nth entry on the list. If it isn't, it proceeds to the next statement. If you replace the GOTO with a GOSUB then it calls the looked-up line number as a subroutine and with RETURN then proceeding to the next statement. I prefer ON-GOSUB myself because the out-of-range semantics are more consistent. All the uses of ON-GOTO in this code look something like what we see in lines 310 and 320: code:
The Random Jump Targets Blocks 790-815, 820-850, 860-905, and 910-1000 are all well-structured blocks, and are also the only way out of lines 210-780, except for one line that jumps directly to line 1000. Let's look at them in turn to see what alternatives are being randomly selected. code:
In summary, this block of code says "Dig West." code:
code:
The general logic of the code guarantees that if we're in this code at all, V(R,S) can only be 0 or 1; lines 870-880 could be replaced with 870 C=C+1:V(R,S)=V(R,S)+2 with no harm to the code. (What you really want is to bitwise OR it with 2, but that's out of the scope of our analysis here.) This is the only branch that skips the 260-520 block in the main loop. We don't yet know why. code:
If Q is one... code:
It's not clear yet what the significance of these actions are, but this is also the only branch where C is not incremented and also the only branch where no new cells are dug out; it's simply opening a passage within the current cell and then scanning. We'll keep that in mind. Back to the Main Code We know what the random jumps do. Let's revisit the main loop again, this time up to the few random jumps: code:
pre:If it's OK to go west If it's OK to go north If it's OK to go east Dig in a random direction chosen between (North, East, West) otherwise 330 If we aren't at the bottom then GOTO 340 334 If Z=1 then GOTO 370 338 Q=1:GOTO 350 340 If the cell to our south is visited then goto 370 350 Dig in a random direction chosen between (North, West, South) 370 Dig in a random direction chosen between (North, West)
This does teach us a few things. Remember how Z was set to 1 only when we dug south while Q was 1? Q is only set to 1 if we're on the bottom row of the maze and Z is still 0! We've finally figured out what Q and Z mean: Q is 1 if we may be trying to dig the exit from the maze, and Z is 1 if we have already done this. This also means that Q is, strictly speaking, redundant. We can tell if we're trying to dig the exit just by looking at S and V! If we have selected the option to dig south, and we're on the bottom row, we are in fact digging the exit, thank you very much; we need no Q to keep track of this. More importantly, it means we can abstract away the entire dance involving Q and Z as being part of "is it OK to go south"—it's just that instead of doing a simple bounds-check and visitation-check like the other three directions, we have an extra rule that says that if we have not yet dug the exit, the bounds-check may be skipped. Now we can translate the whole 260-1000 block: pre:If it's OK to go west If it's OK to go north If it's OK to go east Dig in a random direction chosen between (North, East, West) otherwise If it's OK to go south Dig in a random direction chosen between (North, West, South) otherwise Dig in a random direction chosen between (North, West) otherwise If it's OK to go east If it's OK to go south Dig in a random direction chosen between (East, West, South) otherwise Dig in a random direction chosen between (East, West) otherwise If it's OK to go south Dig in a random direction chosen between (West, South) otherwise Dig west otherwise If it's OK to go north If it's OK to go east If it's OK to go south Dig in a random direction chosen between (North, East, South) otherwise Dig in a random direction chosen between (North, East) otherwise If it's OK to go south Dig in a random direction chosen between (North, South) otherwise Dig north otherwise If it's OK to go east If it's OK to go south Dig in a random direction chosen between (East, South) otherwise Dig east otherwise If it's OK to go south Dig south otherwise We're stuck; scan for the next occupied space, proceed to next iteration Now that we've laid out the entire set of logic, we can also wee what it does: it is selecting a random valid direction to dig, or, if it is completely boxed in, moving the cursor to the next cell that's still part of the maze. It's just that it's spending nearly seventy lines of code to do so, by exhaustively enumerating every possible case. (The case where all directions are valid is missing, but that is an impossible case; you start at an edge, and in every other case you had to arrive at the current cell from some direction that is now illegal.) This is... not the best way to perform this operation. Even in BASIC. Even without GOSUB. The alternative I devised was eight lines long, none of which were over 80 characters in length. Nevertheless, now we know what it's doing and how it does it. On the bright side, we can also now see that there's no problem with that GOTO 530 in the loopback code; that branch was only taken when digging east, and essentially just skipped the entire "if it is OK to go west" block. It had come from the west, so it was skipping a test that it knew was guaranteed to fail. We don't really need a DoFirstBit flag after all. "Needs a scan" also may be more sensibly rephrased as "we are stuck". The Big Picture We've been talking about moving cursors around and setting and checking various cells in a grid. That's not what we're here for. We're here to generate a maze. What's the top-level view of the program? pre:Get valid maze dimensions from user Select a random column (X) to start the maze Print northern edge of maze with entrance marked Mark (X,1) as visited Mark (X,1) as current location. Repeat until the maze and its exit have been fully dug out: Select a random valid direction if possible If it was not possible: Mark ourselves as stuck If it was possible, and we're not digging the exit: Dig in that direction, and set the current location to that new value If it was possible, and we *are* digging the exit: Open the exit If this cell has an exit to the east, mark ourselves as stuck. If this cell has no exit to the east: Change current location to (1,1) Mark ourselves as stuck if the (1,1) is still not dug out If we have marked ourselves as stuck: Scan through the grid left to right and top to bottom for the next visited cell Mark ourselves as not stuck Print resulting maze After sorting out how it worked, I created an equivalent but vastly shorter version of it, and then made some additional modest improvements. This post has gone on far too long as it is, though, so I will save that for another post. ManxomeBromide fucked around with this message at 01:36 on Jun 24, 2020 |
# ¿ Jun 23, 2020 22:12 |
|
EVEN MORE AMAZING Here is my improved version of Amazing, rewritten with sensible data structures and algorithms and noticeably improved graphics. It is compatible with QBASIC and GW-BASIC, and it is less than fifty lines of code, none of which are more than eighty characters long: code:
A Challenge There's still a bug lurking in this code somewhere, and it's visible in the screenshot. Can you find it? Commodore Conversion The Commodore 8-bits do not have the same graphics characters as the PC, but they do share some of them even when they are in different places. This program is pretty trivially portable to the 40-column Commodore computers (PET 4016, C64, C128, C16, C/+4) with only a few changes: code:
The VIC-20 has only 20-odd columns so you won't get a maze wider than 6 cells, but this code will work even on an unexpanded VIC-20 (3.5KB of RAM!)... as long as you shorten the prompt in line 40: code:
A Final Thought You know how people often talk about the Grand Old Days when people had to actually be good at programming to get decent results out of their computers? Don't. Believe. The. Hype. Check out how much slack there was to squeeze out of this, and it worked fine on 1970s personal hardware.
|
# ¿ Jun 24, 2020 01:16 |
|
There's a strong connection between these old dialects of BASIC and 80s-era assembly language, to be honest. All variables are global because your chip isn't very good at stack variables, with GOTO and GOSUB serving almost exactly the same purpose as JUMP and CALL statements, and with every branch being some kind of IF ____ THEN GOTO statement. Even if it were designed for assembly language, generalizing "pick a random valid direction" would have been a vast improvement even there.
|
# ¿ Jun 25, 2020 01:38 |
|
loaf posted:My first real programming job involved a lot of Hewlett-Packard BASIC, maybe Dijkstra was right. He wasn't. (The quote in question, for those unfamiliar with it, is "It is practically impossible to teach good programming to students that have had a prior exposure to BASIC: as potential programmers they are mentally mutilated beyond hope of regeneration.")
1980s BASIC is never the right approach, these days. I'll give Dijkstra that. But if you can't write something, even in 1980s BASIC, that is recognizably a map-reduce operation, you don't really fully grasp map-reduce. At best you can command it.
|
# ¿ Jun 26, 2020 02:42 |
|
FredMSloniker posted:Okay. Remember how I said BASIC has subroutines? I was a filthy liar. The only difference between GOTO and GOSUB is that GOSUB leaves a note on the top of what's called a stack saying where to go the next time the program RETURNs. This is very fun, and the optimization that happens here can be both (1) made even more awful and (2) done mechanically. The more awful bit: if that GOSUB were to a routine that started right after this one, you would not even need the GOTO. You could simply let control fall into the second function directly. As for doing it mechanically... there are two transformations you can do to make this happen "automatically:"
FredMSloniker posted:The other thing you need to be aware of is that integers have limits, just like floating point numbers do. With floating point numbers, it's generally significant digits (i.e, you might have trouble storing exactly 1.795668586868469) but with integers, it's a hard limit on the minimum and maximum values - and in, say, QuickBASIC, that maximum limit is 32767! Fortunately QuickBASIC also provides the long integer type (the & suffix), which goes up to around two trillion (precisely, 2147483647), but, say, Commodore BASIC V2 does not. Nitpick: you're off by a factor of a thousand. That is two billion. That is also the range of a 32-bit signed integer, still a very common data range in modern programs!
|
# ¿ Jun 26, 2020 03:14 |
|
Carbon dioxide posted:Is it legal to do a complete for loop in a single line? so FOR blabla : some operation : NEXT Yes. Just, as noted in the "Animals" commentary, make sure that the NEXT always executes, and you really should make sure that there is exactly one NEXT for every for. It is theoretically possible to write code that handles this correctly, but seriously, just don't. Put the NEXT at the very end and conditionally jump to it as needed. NEXT is not C's "continue". Some operations (IF, for instance) track by line number, where failing a test skips the whole rest of the line. They're the unusual cases. Normally, things like GOSUB or NEXT or FOR or whatever track at the individual statement level, not the line level.
|
# ¿ Jun 26, 2020 07:46 |
|
Carbon dioxide posted:You know, I was thinking and the best BASIC games I've ever seen are probably GORILLA.BAS and NIBBLES.BAS that came with MS-DOS QBasic. This motivated me to dig through my old QBasic stuff to see how big they really were. It turns out they're a lot larger than I thought they'd be: NIBBLES.BAS is 720 lines of code, and GORILLAS is around 1150. Most of the lines are blank or comments, though, or very short; unlike the programs we've looked at here, there is little or no combination of multiple statements onto a single line. Still, it's pretty verbose. Back in the Compute's Gazette LP one of the games studied was "The Viper", which is a Snake clone like NIBBLES.BAS, but which clocks in at 140 lines and was written to run on a much less powerful machine. You're right that GORILLAS.BAS and NIBBLES.BAS are basically unportable off the PC. GORILLAS used full-screen bitmap graphics and NIBBLES relied heavily on PC-specific character graphics. That said, despite the fact that these two games were written in 1990, a 1981 IBM PC with 512KB of RAM and CGA would play both games. NIBBLES would run flawlessly in that environment (It's built out of character graphics in the 16-color 80x25 text mode that's been there for the entire lifespan of the PC), while GORILLAS really wanted EGA so it could run in a higher resolution with more colors... but it could drop back to a 320x200 CGA mode if necessary. At least some of GORILLAS.BAS is spent on error handling when it discovers that it's running on hardware that can't manage its preferred performance. So, let's see here: Program written: 1990 GORILLAS.BAS: runs on a top-of-the-line but not maxed-out PC from 1984 NIBBLES.BAS: runs on a top-of-the-line but not maxed-out PC from 1981 I was going to make a comment about how times sure have changed, but it's 2020, and if you were to target a top-of-the-line game console from 2014, you'd still be targeting the PS4. So I guess, at least here, times haven't changed in the intervening 30 years.
|
# ¿ Jun 27, 2020 23:43 |
|
|
# ¿ May 21, 2024 06:34 |
|
FredMSloniker posted:See how much simpler the code is when we take advantage of the natural flow? It's not a huge savings in code length - It's easier to understand, but it's treacherous. This burned me in AMAZING a bit. In the modern world we generally think of AND and OR and like it terms that ultimately come from LISP: there's a collection of values, and we want to know of all (AND) or any (OR) of them are true. We are used to a world where the computer goes through the alternatives in order and sees if they are true (or "truthy" - many languages don't explicitly have 'true/false' as separate values and instead have some way of assigning truth or falsity to arbitrary values.) Once it finds something that lets it know the answer right away, we expect the calculation to stop. BASIC doesn't do this, and that means that if you need to need to check something like IF (it is safe to even check the value) AND (the value checked has some property) THEN (do the thing) BASIC will go right ahead and do the dangerous check, and very likely end up crashing at some point with the BASIC equivalents of a NullPointerException or an IndexOutOfBoundsException. In these cases you have to do the chained IF statements. The formal term for this is that in BASIC, the AND/OR/NOT operators do not short-circuit. The reason they don't, in many but not all BASICs, is kind of interesting: it's because they aren't actually logical operators. Instead, these are operations that combine the bits in their arguments; AND returns the number that has a 1 bit in every slot that both operands had a 1; OR returns a number with a 1 bit in every slot that either operand had a 1 in; NOT flips all the bits. In a Microsoft BASIC, 10 AND 12 is 8; 10 OR 12 is 14; NOT 10 is - due to the way integer math works on the computers Microsoft BASIC uses - negative 11. So why does this work at all? Well, it's not really consistent what you'll get out if you print out a statement like PRINT 3>2. But on MS BASICs, it will print out -1. This is the integer with all 1 bits set. And that is what makes this work. Generally speaking, 0 is false, and all non-zero values are truthy. The problem with this from a logical standpoint is that if you'd like NOT to be negation, now all numbers save -1 are falsy, which means that this program prints out both lines on a C64: code:
(This is also why languages in the C tradition use && and || in their logical tests; & and | are the bitwise checks and would suffer from the problems we see here.)
|
# ¿ Jul 4, 2020 03:50 |