Register a SA Forums Account here!
JOINING THE SA FORUMS WILL REMOVE THIS BIG AD, THE ANNOYING UNDERLINED ADS, AND STUPID INTERSTITIAL ADS!!!

You can: log in, read the tech support FAQ, or request your lost password. This dumb message (and those ads) will appear on every screen until you register! Get rid of this crap by registering your own SA Forums Account and joining roughly 150,000 Goons, for the one-time price of $9.95! We charge money because it costs us money per month for bills, and since we don't believe in showing ads to our users, we try to make the money back through forum registrations.
 
  • Post
  • Reply
ManxomeBromide
Jan 29, 2009

old school
Oh, this should be fun to follow along with, and I like the idea of this being more about reading the code as opposed to trying to actually pretend to enjoy the games. I too grew up with the Ahl books in the library, and thanks to your posts earlier in the retrogaming forum, I wound up nerdsniping myself with the next game in this article.

... good luck explaining the program code as-is. I managed to work out its logic, but my conclusion for improving the program involved burning something like 80% of it completely to the ground.

Adbot
ADBOT LOVES YOU

ManxomeBromide
Jan 29, 2009

old school
I've still got some prep work to do on it, but I'm going to just note before crashing out that there is a functionality-breaking typo in this listing. Line 695 is incorrect. It should be possible to figure out what it should be even without working out the full details of the listing.

ManxomeBromide
Jan 29, 2009

old school
Guest Lecture: Amazing

I'm not going to do a line-by-line exegesis of the code the way FredMSloniker has, for three reasons. First, many of the basics have already been explained, at a line-by-line level. Second, this code is extremely repetitive and, as we'll see, also very inefficient of programmer time. Finally, looking at the trees gives us no insight into the forest here.

What I will mostly instead be doing is describing how I analyzed the program, expecting to replicate it. There is one new BASIC construct in the code, though, so I will point that one out when we reach it.

That said, the first thing I did with the program was type it in while being mindful of what it was I was typing. I would not expect full understanding, but a core skill here is to be able to maintain incomplete truths in your head as you go, trusting that you will be able to make sense of them later.

The Prologue

The opening is mostly boilerplate, but we're definitely not off to a good start here:

code:
10 PRINT TAB(28);"AMAZING PROGRAM"
20 PRINT TAB(15);"CREATIVE COMPUTING  MORRISTOWN, NEW JERSEY"
30 PRINT:PRINT:PRINT:PRINT
100 INPUT "WHAT ARE YOUR WIDTH AND LENGTH";H,V
102 IF H<>1 AND V<>1 THEN 110
104 PRINT "MEANINGLESS DIMENSIONS.  TRY AGAIN.":GOTO 100
110 DIM W(H,V),V(H,V)
120 PRINT
130 PRINT
140 PRINT
150 PRINT
The first three lines print the program header, and then lines 100-104 are a loop that prompts the user for dimensions until it gets a result it's happy with. Unfortunately, line 102 is the one doing the check here, and it's checking only that neither height nor width is exactly equal to 1. It's possible to enter 0, and in any BASIC I've ever used it's also possible to enter negative numbers, and this won't check for those. Better to instead check for greater than here.

Note that even though we're using IF/GOTO here, this is still basically a do-while loop in a language like C or Java.

Line 110 then dimensions two arrays named W and V that match the dimensions we've set. This is the first place I have to deviate in my personal copy of the code; I'm using a QBASIC-compatible variant of BASIC for this and that means I can't have a variable and an array be named V. My personal copy gets H and V renamed XM and YM, for X-Max and Y-Max.

Lines 120-150 produce a symmetrical spacing around the prompt to match line 30.

code:
160 Q=0:Z=0:X=INT(RND(1)*H+1)
Q and Z are mysteries to us right now, but at least they're both zero. X is a random X coordinate.

code:
165 FOR I=1 TO H
170 IF I=X THEN 173
171 PRINT ".--";:GOTO 180
173 PRINT ".  ";
180 NEXT I
190 PRINT "."
This is a FOR loop with an IF-GOTO and a standalone GOTO in it. If we actually trace through the possibilities, though, this is also representing a computation that can be readily stated without GOTO, should we have the expressive power to state it:
pre:
Repeat with I running from 1 to H
    If I=X
        PRINT ".--";
    Otherwise
        PRINT ".  ";
PRINT "." and end the line
This is printing out the first line of the maze, and the random X coordinate we chose back in line 160 was to select which column held the entrance to the maze.

(As an aside: when some thicket of GOTOs turns out to be organized in a way that I can write it out as clean Pascal-like pseudocode, I will be calling it well-structured. Otherwise, I will be calling it ill-structured.)

code:
195 C=1:W(X,1)=C:C=C+1
200 R=X:S=1:GOTO 260
The last bit of code in the program's prologue initializes three variables that are new to us: C, R, and S. It is not immediately clear why line 195 isn't just 195 W(X,1)=1:C=2 but it may be that the meanings of the variables are such that the author thought this was making it clearer.

The prologue ends by jumping into the middle of the next chunk of code. We have left the land of well-structured code behind.

The Main Program Logic

... is a big, ill-structured mess, in a way that makes it very hard to pin down what's actually happening on a first pass through the code. I do deduce some things about the variables, though:
  • Lines 210-1000 are some kind of master control loop.
  • C indexes that loop somehow, and seems to be counting up to H*V+1; we're essentially looping once for every cell in the grid.
  • W is only ever assigned the current value of C, and elements of W are regularly checked against the value 0. W is tracking a gradually spreading property of the grid, and a good first guess is that it's marking which cells in the grid have been "dug out" to build the maze, with 0 meaning "it hasn't been" and any other value being the time point where it was dug out. This makes line 195 make a little more sense; It's maintaining the invariant that W is only ever assigned the current value of C.
  • V is more mysterious; it's basically write-only; the only times its values are ever checked are right before overwriting them with something else. Since it's only ever assigned constants, though, we can restrict its set of possible values to 0-3.
  • X is consistently used as a value of a random choice of something. In the main loop the thing being chosen is a random location to GOTO, but it's being done in a controlled manner that we'll analyze when we get to it.
  • Q and Z are still mysteries, and they're tightly bound up with the most ill-structured parts of the code, but neither takes a value other than 0 or 1, and Z was assigned 0 in the prologue. If Z is modified in the main loop, though, the only possible value it can take is 1. Q and Z are event flags of some kind, and Z represents something irrevocable. Q, on the other hand, has its value bounce back and forth a lot.
Yuck. On the other hand, there's only one way out of the 210-1000 line block, and it's via one of four jumps to line 1010. There are fully four ways this "loop" can loop back—to lines 210, 250, 260, or 530—but there's only one point you can be at when you exit.

So let's put this mess behind us momentarily and look at the final part of the program, lines 1010-1073.

Printing Out The Maze

This is a series of nested, well-structured loops full of equally well-structured conditional printing statments:
code:
1010 FOR J=1 TO V
1011 PRINT "I";
1012 FOR I=1 TO H
1013 IF V(I,J)<2 THEN 1030
1020 PRINT "   ";
1021 GOTO 1040
1030 PRINT "  I";
1040 NEXT I
1041 PRINT
1043 FOR I=1 TO H
1045 IF V(I,J)=0 THEN 1060
1050 IF V(I,J)=2 THEN 1060
1051 PRINT ":  ";
1052 GOTO 1070
1060 PRINT ":--";
1070 NEXT I
1071 PRINT "."
1072 NEXT J
1073 END
Two loop variables, I, and J. The outer loop runs J from 1 to V, so is iterating over rows in the maze. The two inner loops loop I from 1 to H, iterating over each column twice. Each cell in the printout is two lines tall, basically, so the first loop is drawing the "middle" of each cell in the grid, and the second loop is drawing the southern wall of it. We drew the very first line of the maze up in the function prologue, presumably because that first line wasn't any cell's southern wall.

I'm a little bit cranky about that, to be honest; there's a very noticable delay between the first line of the maze and the rest of it, when you run it. Better, I think, to rely on the fact that the values in W() are ever increasing. We don't need to dump the top row right away, just as we've determined what the first cell is; as we discovered in the main program loop, writes to W() are ever-increasing and so we could identify column I as the entrance simply by seeing if W(I,1) was 1!

Lines 1045-1060 print out the southern walls of each cell based on, essentially, whether or not V(I,J) is even or not; it's checking against the specific values of 0 and 2, but we noticed when typing in the main program loop that V never receives a value above 3.

Similarly, and moving backwards slightly, the westernmost wall has always exists and is unconditionally printed at the start of each row by line 1011, and lines 1013-1030 conditionally print out an eastern wall depending on whether the value in V(I,J) is less than 2 or not; a value of 0-1 means there is a wall, and a value of 2-3 means there is a passage.

This means that we now know what the V() array is for! It's tracking the actual shape of the maze, and each entry is basically two flags. Each cell's value starts at zero. If there is a passage in this cell to the south, add 1. If there is a passage to the east, add 2.

This also means there's no need for any special logic to print out the exit the way we needed it for the entrance; it is stored in V() the same way everything else was.

Furthermore, alas, we have nowhere else to hide. We must now dive into the maelstrom of the maze generation logic.

Deciphering the Main Program Loop

The first time we enter the main loop from lines 210-1000, we skip to line 260. What did we miss?

code:
210 IF R<>H THEN 240
215 IF S<>V THEN 230
220 R=1:S=1:GOTO 250
230 R=1:S=S+1:GOTO 250
240 R=R+1
250 IF W(R,S)=0 THEN 210
This code has a GOTO on every line but 1, but it's actually well-structured!
pre:
Repeat
    If R is not H, increment R.
    Otherwise, if S is not V, reset R to 1 and increment S.
    Otherwise, R and S are both reset to 1.
While W(R,S) is zero
Not only is it well-structured, it actually defangs the loopback we saw to line 250. Line 250 is the "while" check here; essentially, when it looped back to 250 it was actually saying "If W(R,S)=0, loop back to 210; otherwise loop back to 260." We've dropped the number of ways to start an iteration from four to three, conceptually.

This code also shows R and S properly in use. R and S were initialized to X and 1. Throughout the program, they are only ever used by checking their values against 0, H, or V (H always for R, and V always for S), or as an index into the W() and V() arrays. Their initial value is (X, 1)—the maze entrance. When written, they are only ever incremented, decremented, or force-set to 1.

R and S are specifying a pair of coordinates within the maze grid, serving much the same purpose as a pointer or a reference, just, well, in a language that has neither. If you find some old enough textbooks, you will see this technique given a name: the (R,S) pair is a cursor into V() and W(). And very like a cursor, what this loop is doing is advancing the cursor through the maze as if it were scanning text on a page, looking for the next cell for which W(R,S) is not 0. If it hits the end, it starts over at the top left. This is a routine that causes the cursor to scan for the next maze cell that has already been visited.

The next wodge of code to structure is rather large and indigestible:
code:
260 IF R-1=0 THEN 530
265 IF W(R-1,S)<>0 THEN 530
270 IF S-1=0 THEN 390
280 IF W(R,S-1)<>0 THEN 390
290 IF R=H THEN 330
300 IF W(R+1,S)<>0 THEN 330
310 X=INT(RND(1)*3+1)
320 ON X GOTO 790,820,860
330 IF S<>V THEN 340
334 IF Z=1 THEN 370
338 Q=1:GOTO 350
340 IF W(R,S+1)<>0 THEN 370
350 X=INT(RND(1)*3+1)
360 ON X GOTO 790,820,910
370 X=INT(RND(1)*2+1)
380 ON X GOTO 790,820
390 IF R=H THEN 470
400 IF W(R+1,S)<>0 THEN 470
405 IF S<>V THEN 420
410 IF Z=1 THEN 450
415 Q=1:GOTO 430
420 IF W(R,S+1)<>0 THEN 450
430 X=INT(RND(1)*3)+1
440 ON X GOTO 790,860,910
450 X=INT(RND(1)*2+1)
460 ON X GOTO 790,860
470 IF S<>V THEN 490
480 IF Z=1 THEN 520
485 Q=1:GOTO 500
490 IF W(R,S+1)<>0 THEN 520
500 X=INT(RND(1)*2+1)
510 ON X GOTO 790,910
520 GOTO 790
The first two lines, at least, are a relief: they are two IF statements that branch to 530 (the line right after this block) if one of two conditions hold. This means we can make the master loop well-structured with the addition of two extra flags:
pre:
Loop forever:
    If NeedScan is true:
        (lines 210-250)
    If DoFirstBit is true:
        (lines 260-520)
    (rest of loop)
We could then replace GOTO 210 with "Set NeedScan and DoFirstBit to true, skip to next iteration", GOTO 250 with "Set DoFirstBit to true, and NeedScan to whatever W(R,S)=0 evaluates to, skip to next iteration", GOTO 260 with "Set NeedScan to False and DoFirstBit to true, next iteration" and GOTO 530 with "Set NeedScan and DoFirstBit to false, next iteration". GOTO 1010 translates to "break out of the loop." We do not actually do this, mind you, but if we wanted to translate this code into JavaScript or something as closely as possible without using GOTOs or simulations of them, this would let us do it.

It is also worth asking why line 260 isn't 260 IF R-1=0 OR W(R-1,S)<>0 THEN 530. (One might also ask why it's not testing R=1 rather than R-1=0.) This turns out to vary by BASIC dialect. In QBASIC, you could collapse those lines without incident, but in GW-BASIC, there's an ugly corner case. Tests that include AND and OR, in modern languages, are said to short-circuit; once it knows what the result must be it stops evaluating terms. This is true in QBASIC, but in GW-BASIC every term is evaluated. That means that line 260 is fine, but if you were to collapse, say, lines 290 and 300 into IF R=H OR W(R+1,S)=0 THEN 330. it would check if R=H, and then even if it was it would check to see if W(R+1,S)=0, which probably will crash your interpreter since we only DIMensioned W() to have a maximum width of H! QBASIC, like C, Javascript, or even Lisp (predating BASIC by a decade), halt evaluation when the first term comes out true.

If I were doing this, and I had the spare memory—and for any maze that will actually fit on my screen, I absolutely do, even if I only have 8KB of RAM—I would collapse them anyway for clarity and just dimension in an extra row and column for W().

ON-GOTO and ON-GOSUB

We also have a new BASIC statement! ON X GOTO L1,L2,L3...,Ln looks at the integer value of X and then consults the supplied table of N line numbers. If X is between 1 and N, it goes to the line number that is the Nth entry on the list. If it isn't, it proceeds to the next statement. If you replace the GOTO with a GOSUB then it calls the looked-up line number as a subroutine and with RETURN then proceeding to the next statement. I prefer ON-GOSUB myself because the out-of-range semantics are more consistent.

All the uses of ON-GOTO in this code look something like what we see in lines 310 and 320:

code:
310 X=INT(RND(1)*3+1)
320 ON X GOTO 790,820,860
X is never out of range, and we're essentially choosing one of three blocks of code to run here. There are four possible targets across all the code: 790, 820, 860, and 910. Those are our next targets. Anything that will improve our ability to summarize the code in the 260-780 block.

The Random Jump Targets

Blocks 790-815, 820-850, 860-905, and 910-1000 are all well-structured blocks, and are also the only way out of lines 210-780, except for one line that jumps directly to line 1000. Let's look at them in turn to see what alternatives are being randomly selected.

code:
790 W(R-1,S)=C
800 C=C+1:V(R-1,S)=2:R=R-1
810 IF C=H*V+1 THEN 1010
815 Q=0:GOTO 260
The cell to the west of the cursor is timestamped in 790. In 800 the timestamp is incremented, the cell to the west of the cursor is marked as having a passage to the east and no passage to the south, and then the cursor is decremented. (It's fine to set it so that there's a wall to the south; the cell to our west was never visited before this; that's why we got to timestamp it in 790.) Line 910 checks to see if the new timestamp is larger than the total number of cells in the maze, and if it is, it breaks out of the master loop and goes to print the maze out. If we still have work to do, though, the Q flag is reset to 0 and we jump back for the next iteration. We're skipping the scan at 210.

In summary, this block of code says "Dig West."

code:
820 W(R,S-1)=C
830 C=C+1
840 V(R,S-1)=1:S=S-1:IF C=H*V+1 THEN 1010
850 Q=0:GOTO 260
Exactly equivalent code, but now we dig north.

code:
860 W(R+1,S)=C
870 C=C+1:IF V(R,S)=0 THEN 880
875 V(R,S)=3:GOTO 890
880 V(R,S)=2
890 R=R+1
900 IF C=H*V+1 THEN 1010
905 GOTO 530
Digging east. We need extra logic here, though, because of the way we store exits. If we dig into our current cell from the south, we need to preserve that passageway as we dig east, so this code sets V(R,S) to 2 or 3 depending on whether the value is 0 or not.

The general logic of the code guarantees that if we're in this code at all, V(R,S) can only be 0 or 1; lines 870-880 could be replaced with 870 C=C+1:V(R,S)=V(R,S)+2 with no harm to the code. (What you really want is to bitwise OR it with 2, but that's out of the scope of our analysis here.)

This is the only branch that skips the 260-520 block in the main loop. We don't yet know why.

code:
910 IF Q=1 THEN 960
920 W(R,S+1)=C:C=C+1:IF V(R,S)=0 THEN 940
930 V(R,S)=3:GOTO 950
940 V(R,S)=1
950 S=S+1:IF C=H*V+1 THEN 1010
955 GOTO 260
If Q is not 1, we're digging south here, in a manner analogous to digging east. Note that we are doing the equivalent of adding 1 here, since the fact that we are here means that V(R,S) could only possibly be 0 or 2.

If Q is one...

code:
960 Z=1
970 IF V(R,S)=0 THEN 980
975 V(R,S)=3:Q=0:GOTO 1000
980 V(R,S)=1:Q=0:R=1:S=1:GOTO 250
1000 GOTO 210
960 is the only place in the main loop where Z can change value, so we know now that Z is an event flag tracking to see if we ever chose to dig south while Q=1. What it seems to do carve a southern exit out of the current cell, and then, if we had entered this cell from the north or the west, teleports the cursor to the upper left of the maze. It then jumps to 250 or 210, triggering the scan for the next maze position we've already visited. The jump to 250 suggests that if we teleported to the upper left and had already visited that cell, then this is an acceptable place to stop the scan.

It's not clear yet what the significance of these actions are, but this is also the only branch where C is not incremented and also the only branch where no new cells are dug out; it's simply opening a passage within the current cell and then scanning. We'll keep that in mind.

Back to the Main Code

We know what the random jumps do. Let's revisit the main loop again, this time up to the few random jumps:

code:
260 IF R-1=0 THEN 530
265 IF W(R-1,S)<>0 THEN 530
270 IF S-1=0 THEN 390
280 IF W(R,S-1)<>0 THEN 390
290 IF R=H THEN 330
300 IF W(R+1,S)<>0 THEN 330
310 X=INT(RND(1)*3+1)
320 ON X GOTO 790,820,860
330 IF S<>V THEN 340
334 IF Z=1 THEN 370
338 Q=1:GOTO 350
340 IF W(R,S+1)<>0 THEN 370
350 X=INT(RND(1)*3+1)
360 ON X GOTO 790,820,910
370 X=INT(RND(1)*2+1)
380 ON X GOTO 790,820
Lines 260 through 300 are only slight variations of each other; they are bounds-checks on R and S followed by checking if a neighbor of cell (R,S) has been visited. The basic check here is is it OK to go this direction? This lets us translate through to line 330 very easily, and then slot the 330-370 block into the larger structure:

pre:
If it's OK to go west
    If it's OK to go north
        If it's OK to go east
            Dig in a random direction chosen between (North, East, West)
        otherwise
            330 If we aren't at the bottom then GOTO 340
            334 If Z=1 then GOTO 370
            338 Q=1:GOTO 350
            340 If the cell to our south is visited then goto 370
            350 Dig in a random direction chosen between (North, West, South)
            370 Dig in a random direction chosen between (North, West)
Lines 330-370 are A Problem. We can't directly translate this into a set of IF/THEN/ELSE statements; we'll have to look at the conditions under which each line runs and then work out what the blocks should be from that.
  • Line 330 is where we start so it always runs.
  • Line 334 runs if the test in line 330 failed, and thus runs only if we are at the bottom of the maze.
  • Line 338 requires both previous tests to fail, which means we are both at the bottom of the maze, and Z is not 1. (Since the only value Z ever takes that isn't 1 is 0, that means Z must be 0 here.)
  • Line 340 runs if we aren't at the bottom of the maze.
  • Line 350 runs if line 338 ran (so, if we are at the bottom and Z=0), OR if line 340 ran and its test failed (so, the cell to our south exists and has not been visited).
  • Line 370 runs if the tests at 334 or 340 succeed, so, if we are at the bottom and Z=1, or if we aren't at the bottom and the cell to our south has already been visited.

This does teach us a few things. Remember how Z was set to 1 only when we dug south while Q was 1? Q is only set to 1 if we're on the bottom row of the maze and Z is still 0! We've finally figured out what Q and Z mean: Q is 1 if we may be trying to dig the exit from the maze, and Z is 1 if we have already done this.

This also means that Q is, strictly speaking, redundant. We can tell if we're trying to dig the exit just by looking at S and V! If we have selected the option to dig south, and we're on the bottom row, we are in fact digging the exit, thank you very much; we need no Q to keep track of this. More importantly, it means we can abstract away the entire dance involving Q and Z as being part of "is it OK to go south"—it's just that instead of doing a simple bounds-check and visitation-check like the other three directions, we have an extra rule that says that if we have not yet dug the exit, the bounds-check may be skipped. Now we can translate the whole 260-1000 block:

pre:
If it's OK to go west
    If it's OK to go north
        If it's OK to go east
            Dig in a random direction chosen between (North, East, West)
        otherwise
            If it's OK to go south
                Dig in a random direction chosen between (North, West, South)
            otherwise
                Dig in a random direction chosen between (North, West)
    otherwise
        If it's OK to go east
            If it's OK to go south
                Dig in a random direction chosen between (East, West, South)
            otherwise
                Dig in a random direction chosen between (East, West)
        otherwise
            If it's OK to go south
                Dig in a random direction chosen between (West, South)
            otherwise
                Dig west
otherwise
    If it's OK to go north
        If it's OK to go east
            If it's OK to go south
                Dig in a random direction chosen between (North, East, South)
            otherwise
                Dig in a random direction chosen between (North, East)
        otherwise
            If it's OK to go south
                Dig in a random direction chosen between (North, South)
            otherwise
                Dig north
    otherwise
        If it's OK to go east
            If it's OK to go south
                Dig in a random direction chosen between (East, South)
            otherwise
                Dig east
        otherwise
            If it's OK to go south
                Dig south
            otherwise
                We're stuck; scan for the next occupied space, proceed to next iteration
Well. We can do so with one terrifying exception. Line 695 appears to read 695 Q=1:GOTO 830 and there is no possible way that is correct (830 is the middle of a dig operation). By analogy with the other five places in the code that Q is set, the target should clearly be line 710, not 830.

Now that we've laid out the entire set of logic, we can also wee what it does: it is selecting a random valid direction to dig, or, if it is completely boxed in, moving the cursor to the next cell that's still part of the maze. It's just that it's spending nearly seventy lines of code to do so, by exhaustively enumerating every possible case. (The case where all directions are valid is missing, but that is an impossible case; you start at an edge, and in every other case you had to arrive at the current cell from some direction that is now illegal.)

This is... not the best way to perform this operation. Even in BASIC. Even without GOSUB. The alternative I devised was eight lines long, none of which were over 80 characters in length. Nevertheless, now we know what it's doing and how it does it.

On the bright side, we can also now see that there's no problem with that GOTO 530 in the loopback code; that branch was only taken when digging east, and essentially just skipped the entire "if it is OK to go west" block. It had come from the west, so it was skipping a test that it knew was guaranteed to fail. We don't really need a DoFirstBit flag after all. "Needs a scan" also may be more sensibly rephrased as "we are stuck".

The Big Picture

We've been talking about moving cursors around and setting and checking various cells in a grid. That's not what we're here for. We're here to generate a maze. What's the top-level view of the program?

pre:
Get valid maze dimensions from user
Select a random column (X) to start the maze
Print northern edge of maze with entrance marked
Mark (X,1) as visited
Mark (X,1) as current location.
Repeat until the maze and its exit have been fully dug out:
    Select a random valid direction if possible
    If it was not possible:
        Mark ourselves as stuck
    If it was possible, and we're not digging the exit:
        Dig in that direction, and set the current location to that new value
    If it was possible, and we *are* digging the exit:
        Open the exit
        If this cell has an exit to the east, mark ourselves as stuck.
        If this cell has no exit to the east:
            Change current location to (1,1)
            Mark ourselves as stuck if the (1,1) is still not dug out
    If we have marked ourselves as stuck:
        Scan through the grid left to right and top to bottom for the next visited cell
        Mark ourselves as not stuck
Print resulting maze
At this level, the program isn't bad. This is a reasonably efficient implementation of computing a spanning tree across a rectangular graph, with some tweaks that lend themselves to ensuring there are a small number of long twisting corridors instead of a vast number of bushy, instantaneous dead-ends. Those same tweaks also mean that it never needs GOSUB, so it's not 100% clear to me that they did that on purpose.

After sorting out how it worked, I created an equivalent but vastly shorter version of it, and then made some additional modest improvements. This post has gone on far too long as it is, though, so I will save that for another post.

ManxomeBromide fucked around with this message at 01:36 on Jun 24, 2020

ManxomeBromide
Jan 29, 2009

old school
EVEN MORE AMAZING

Here is my improved version of Amazing, rewritten with sensible data structures and algorithms and noticeably improved graphics. It is compatible with QBASIC and GW-BASIC, and it is less than fifty lines of code, none of which are more than eighty characters long:
code:
10 PRINT TAB(28);"AMAZING PROGRAM"
20 PRINT TAB(15);"CREATIVE COMPUTING  MORRISTOWN, NEW JERSEY"
30 PRINT:PRINT:PRINT:PRINT
40 INPUT "WHAT ARE YOUR WIDTH AND LENGTH";XM,YM
50 IF XM<2 OR YM<2 THEN PRINT "MEANINGLESS DIMENSIONS. TRY AGAIN.":GOTO 40
60 DIM W(XM+1,YM+1),V(XM,YM),VD(4)
70 RANDOMIZE TIMER
80 R=INT(RND(1)*XM+1):S=1:W(R,S)=1:C=2:Z=0
90 N=0:X=0
100 IF R<>1 AND W(R-1,S)=0 THEN N=N+1:VD(N)=1
110 IF S<>1 AND W(R,S-1)=0 THEN N=N+1:VD(N)=2
120 IF R<>XM AND W(R+1,S)=0 THEN N=N+1:VD(N)=3
130 IF (S=YM AND Z=0) OR (S<>YM AND W(R,S+1)=0) THEN N=N+1:VD(N)=4
140 IF N>0 THEN N=VD(INT(RND(1)*N)+1)
150 ON N+1 GOSUB 390, 400, 410, 420, 430
160 IF X=0 THEN C=C+1:GOTO 210
170 IF R<>XM THEN R=R+1:GOTO 200
180 IF S<>YM THEN R=1:S=S+1:GOTO 200
190 R=1:S=1
200 IF W(R,S)=0 THEN 170
210 IF Z=0 OR C<XM*YM+1 THEN 90
220 PRINT CHR$(218);:IF W(1,1)=1 THEN PRINT "  ";:GOTO 240
230 PRINT CHR$(196);CHR$(196);
240 FOR X=2 TO XM:PRINT CHR$(194);:IF W(X,1)=1 THEN PRINT "  ";:GOTO 260
250 PRINT CHR$(196);CHR$(196);
260 NEXT X:PRINT CHR$(191)
270 FOR Y=1 TO YM:PRINT CHR$(179);:FOR X=1 TO XM
280 IF V(X,Y)<2 THEN PRINT "  ";CHR$(179);:GOTO 300
290 PRINT "   ";
300 NEXT X:PRINT:L$=CHR$(195):M$=CHR$(197):R$=CHR$(180)
310 IF Y=YM THEN L$=CHR$(192):M$=CHR$(193):R$=CHR$(217)
320 PRINT L$;:IF V(1,Y)=0 OR V(1,Y)=2 THEN PRINT CHR$(196);CHR$(196);:GOTO 340
330 PRINT "  ";
340 FOR X=2 TO XM:PRINT M$;
350 IF V(X,Y)=0 OR V(X,Y)=2 THEN PRINT CHR$(196);CHR$(196);:GOTO 370
360 PRINT "  ";
370 NEXT X:PRINT R$:NEXT Y
380 END
390 X=1:RETURN
400 W(R-1,S)=C:V(R-1,S)=2:R=R-1:RETURN
410 W(R,S-1)=C:V(R,S-1)=1:S=S-1:RETURN
420 W(R+1,S)=C:V(R,S)=V(R,S)+2:R=R+1:RETURN
430 IF S=YM THEN 450
440 W(R,S+1)=C:V(R,S)=V(R,S)+1:S=S+1:RETURN
450 Z=1:IF V(R,S)<>0 THEN V(R,S)=3:RETURN
460 V(R,S)=1:R=1:S=1:C=C-1
470 IF W(R,S)=0 THEN X=1:C=C+1
480 RETURN
The improvements and modifications are sixfold:
  • It now correctly rejects input that specifies zero or negative dimensions.
  • The 70-ish lines of code for choosing a direction (lines 260-780) are now more like seven. I create an expandable list, fill it with between 0 and 4 elements, and choose a result from the valid elements. BASIC doesn't have expandable lists as a data type, of course, but I'm not about to let that stop me.
  • The digging logic has also been simplified, merging "getting stuck" into the actual logic of executing a chosen direction.
  • The random number generator is re-seeded on each run to make sure it won't keep producing the same maze each time.
  • I removed the blank lines from after the part where you choose the dimensions, mainly so that a 10-row maze can keep the dimensions visible on a 25-line screen.
  • The printing of the maze is slightly more sophisticated, making use of the box-drawing characters in the IBM BIOS codepage that we now know as Code Page 437. The box-drawing characters aren't quite up to the level I'd want them to be at to get a properly proportioned set of maze passages, and even if they were the drawing code would be quite a bit harder to manage, but even if I keep the logic mostly intact I can still get a perfectly acceptable, if spiky, outcome.


A Challenge

There's still a bug lurking in this code somewhere, and it's visible in the screenshot. Can you find it?

Commodore Conversion

The Commodore 8-bits do not have the same graphics characters as the PC, but they do share some of them even when they are in different places. This program is pretty trivially portable to the 40-column Commodore computers (PET 4016, C64, C128, C16, C/+4) with only a few changes:

code:
10 PRINT TAB(12);"AMAZING PROGRAM"
20 PRINT TAB(11);"CREATIVE COMPUTING":PRINT TAB(9);"MORRISTOWN, NEW JERSEY"
70 R=RND(-TI)
220 PRINT CHR$(176);:IF W(1,1)=1 THEN PRINT "  ";:GOTO 240
230 PRINT CHR$(195);CHR$(195);
240 FOR X=2 TO XM:PRINT CHR$(178);:IF W(X,1)=1 THEN PRINT "  ";:GOTO 260
250 PRINT CHR$(195);CHR$(195);
260 NEXT X:PRINT CHR$(174)
270 FOR Y=1 TO YM:PRINT CHR$(221);:FOR X=1 TO XM
280 IF V(X,Y)<2 THEN PRINT "  ";CHR$(221);:GOTO 300
290 PRINT "   ";
300 NEXT X:PRINT:L$=CHR$(171):M$=CHR$(219):R$=CHR$(179)
310 IF Y=YM THEN L$=CHR$(173):M$=CHR$(177):R$=CHR$(189)
320 PRINT L$;:IF V(1,Y)=0 OR V(1,Y)=2 THEN PRINT CHR$(195);CHR$(195);:GOTO 340
330 PRINT "  ";
340 FOR X=2 TO XM:PRINT M$;
350 IF V(X,Y)=0 OR V(X,Y)=2 THEN PRINT CHR$(195);CHR$(195);:GOTO 370
Lines 10 and 20 change to match 40-column formatting, and line 70 changes because Commodore BASICs manage reseeding the random number generator differently from PC BASICs. Lines 220-270 don't all change, but it's easier to just reproduce the entire print routine than it is to pick out which lines happened to not change.

The VIC-20 has only 20-odd columns so you won't get a maze wider than 6 cells, but this code will work even on an unexpanded VIC-20 (3.5KB of RAM!)... as long as you shorten the prompt in line 40:

code:
40 INPUT "MAZE W/H",XM,YM
Long INPUT prompts seem to confuse the VIC-20 BASIC for some reason.

A Final Thought

You know how people often talk about the Grand Old Days when people had to actually be good at programming to get decent results out of their computers?

Don't. Believe. The. Hype.

Check out how much slack there was to squeeze out of this, and it worked fine on 1970s personal hardware.

ManxomeBromide
Jan 29, 2009

old school
There's a strong connection between these old dialects of BASIC and 80s-era assembly language, to be honest. All variables are global because your chip isn't very good at stack variables, with GOTO and GOSUB serving almost exactly the same purpose as JUMP and CALL statements, and with every branch being some kind of IF ____ THEN GOTO statement.

Even if it were designed for assembly language, generalizing "pick a random valid direction" would have been a vast improvement even there.

ManxomeBromide
Jan 29, 2009

old school

loaf posted:

My first real programming job involved a lot of Hewlett-Packard BASIC, maybe Dijkstra was right.

He wasn't.

(The quote in question, for those unfamiliar with it, is "It is practically impossible to teach good programming to students that have had a prior exposure to BASIC: as potential programmers they are mentally mutilated beyond hope of regeneration.")
  • The battles over how programs should be structured were fought throughout the 1970s and early 1980s, with the ultimate conclusion being that everyone had the wrong idea. (The most famous example of this would be that the same faction that thought that variables ought to be local to the function that declared them was the same faction that thought that "break" and "continue" should not exist as control structures, and that functions should never have a "return this value immediately" operation. They didn't think this for no reason, but the real solution to the problem they were trying to address is basically "destructors"... and while that term wasn't really common back then, equivalents existed in the form of constructs like "unwind-protect".)
  • It's even less true now because by the early 1990s BASIC had become a language that mostly fit the rules for code structure that Dijkstra's faction argued for. I'm tempted to convert the Awari program to a fully modern dialect as an example; it looks very different. But let's ignore that for now...
  • Vast numbers of people who went into tech and who was born in the late 1970s or early 1980s was exposed to BASIC at a very young age via the Apple II or the BBC Micro or their competitors, and this exposure certainly did not slow them down.
  • Lessons from the future can be brought back to the past. I wasn't thinking of Lua when I wrote it, but it's not a coincidence that FredMSloniker saw my modifications to Amazing and concluded that I had replicated the logic he'd use if he were to take on the task in Lua. My own thought process was closer to "I need a dynamic list. I've got arrays. I can make an equivalent to java.util.ArrayList."
  • Pretty much all the modern programming disciplines existed in the 1970s. The kind of control we see in BASIC existed at the asm level for basically the entire history of the electronic computer. "Structured" programming's basis were the algorithmic specification languages of the 1960s. So was object-orientation, really; C++ is based on Simula 67 in part, and that "67" is for 1967. Smalltalk, which inspired all the other major approaches to OOP, was from 1972. So as Prolog, the archetypical declarative programming language. Functional programming is older than digital computers if we treat the lambda calculus as a form of it, and it reached a recognizable modern form with Scheme, released in 1975, the same year as Dijkstra's quote. Nowadays a skilled programmer is expected to be able to attack problems using several of these approaches, and the mark of an experienced one is to know which paradigm will be most effective given other aspects of the problem.

1980s BASIC is never the right approach, these days. I'll give Dijkstra that. But if you can't write something, even in 1980s BASIC, that is recognizably a map-reduce operation, you don't really fully grasp map-reduce. At best you can command it.

ManxomeBromide
Jan 29, 2009

old school

FredMSloniker posted:

Okay. Remember how I said BASIC has subroutines? I was a filthy liar. The only difference between GOTO and GOSUB is that GOSUB leaves a note on the top of what's called a stack saying where to go the next time the program RETURNs.

This is very fun, and the optimization that happens here can be both (1) made even more awful and (2) done mechanically.

The more awful bit: if that GOSUB were to a routine that started right after this one, you would not even need the GOTO. You could simply let control fall into the second function directly.

As for doing it mechanically... there are two transformations you can do to make this happen "automatically:"
  • First: it's always possible as noted to transform GOSUB nnn:RETURN into GOTO nnn for any N. It's also more efficient. If you want to make JavaScript devs squirm, point out to them that this is 1980s BASIC demonstrating generalized tail-call optimization. You won't even be completely lying, but don't try this on a Scheme or Haskell dev.
  • If there's a GOTO nnn line and nnn is the next line, you can just drop the statement entirely.
So, if the last thing you do in a function is call the next one, you can just drop the call and the return! :science:

FredMSloniker posted:

The other thing you need to be aware of is that integers have limits, just like floating point numbers do. With floating point numbers, it's generally significant digits (i.e, you might have trouble storing exactly 1.795668586868469) but with integers, it's a hard limit on the minimum and maximum values - and in, say, QuickBASIC, that maximum limit is 32767! Fortunately QuickBASIC also provides the long integer type (the & suffix), which goes up to around two trillion (precisely, 2147483647), but, say, Commodore BASIC V2 does not.

Nitpick: you're off by a factor of a thousand. That is two billion. That is also the range of a 32-bit signed integer, still a very common data range in modern programs!

ManxomeBromide
Jan 29, 2009

old school

Carbon dioxide posted:

Is it legal to do a complete for loop in a single line? so FOR blabla : some operation : NEXT

Yes. Just, as noted in the "Animals" commentary, make sure that the NEXT always executes, and you really should make sure that there is exactly one NEXT for every for. It is theoretically possible to write code that handles this correctly, but seriously, just don't. Put the NEXT at the very end and conditionally jump to it as needed. NEXT is not C's "continue".

Some operations (IF, for instance) track by line number, where failing a test skips the whole rest of the line. They're the unusual cases. Normally, things like GOSUB or NEXT or FOR or whatever track at the individual statement level, not the line level.

ManxomeBromide
Jan 29, 2009

old school

Carbon dioxide posted:

You know, I was thinking and the best BASIC games I've ever seen are probably GORILLA.BAS and NIBBLES.BAS that came with MS-DOS QBasic.

Those are way more complex than anything in this book and you'd probably wouldn't want to copy them from paper.

Anyway, I guess the stuff they do where they have a sort of full screen graphical game where the whole screen updates instead of the terminal just scrolling as it prints things is something for which you need a much more modern computer than anything this book was aiming for.

This motivated me to dig through my old QBasic stuff to see how big they really were. It turns out they're a lot larger than I thought they'd be: NIBBLES.BAS is 720 lines of code, and GORILLAS is around 1150. Most of the lines are blank or comments, though, or very short; unlike the programs we've looked at here, there is little or no combination of multiple statements onto a single line.

Still, it's pretty verbose. Back in the Compute's Gazette LP one of the games studied was "The Viper", which is a Snake clone like NIBBLES.BAS, but which clocks in at 140 lines and was written to run on a much less powerful machine.

You're right that GORILLAS.BAS and NIBBLES.BAS are basically unportable off the PC. GORILLAS used full-screen bitmap graphics and NIBBLES relied heavily on PC-specific character graphics. That said, despite the fact that these two games were written in 1990, a 1981 IBM PC with 512KB of RAM and CGA would play both games. NIBBLES would run flawlessly in that environment (It's built out of character graphics in the 16-color 80x25 text mode that's been there for the entire lifespan of the PC), while GORILLAS really wanted EGA so it could run in a higher resolution with more colors... but it could drop back to a 320x200 CGA mode if necessary. At least some of GORILLAS.BAS is spent on error handling when it discovers that it's running on hardware that can't manage its preferred performance.

So, let's see here:
Program written: 1990
GORILLAS.BAS: runs on a top-of-the-line but not maxed-out PC from 1984
NIBBLES.BAS: runs on a top-of-the-line but not maxed-out PC from 1981

I was going to make a comment about how times sure have changed, but it's 2020, and if you were to target a top-of-the-line game console from 2014, you'd still be targeting the PS4.

So I guess, at least here, times haven't changed in the intervening 30 years.

Adbot
ADBOT LOVES YOU

ManxomeBromide
Jan 29, 2009

old school

FredMSloniker posted:

See how much simpler the code is when we take advantage of the natural flow? It's not a huge savings in code length -

code:
445 IF Z=INT(Z) AND Z>=0 AND Z<=4 THEN 460
455 PRINT "INCORRECT ANSWER.  RETYPE IT. ";:GOTO 430
460 IF RND(1)<.5 THEN 1000
- but it's easier to understand, and it's a savings that can be applied throughout the program. Whenever possible, work with the flow in BASIC!

It's easier to understand, but it's treacherous. This burned me in AMAZING a bit. :)

In the modern world we generally think of AND and OR and like it terms that ultimately come from LISP: there's a collection of values, and we want to know of all (AND) or any (OR) of them are true. We are used to a world where the computer goes through the alternatives in order and sees if they are true (or "truthy" - many languages don't explicitly have 'true/false' as separate values and instead have some way of assigning truth or falsity to arbitrary values.) Once it finds something that lets it know the answer right away, we expect the calculation to stop.

BASIC doesn't do this, and that means that if you need to need to check something like
IF (it is safe to even check the value) AND (the value checked has some property) THEN (do the thing)
BASIC will go right ahead and do the dangerous check, and very likely end up crashing at some point with the BASIC equivalents of a NullPointerException or an IndexOutOfBoundsException. In these cases you have to do the chained IF statements.

The formal term for this is that in BASIC, the AND/OR/NOT operators do not short-circuit. The reason they don't, in many but not all BASICs, is kind of interesting: it's because they aren't actually logical operators.

Instead, these are operations that combine the bits in their arguments; AND returns the number that has a 1 bit in every slot that both operands had a 1; OR returns a number with a 1 bit in every slot that either operand had a 1 in; NOT flips all the bits. In a Microsoft BASIC, 10 AND 12 is 8; 10 OR 12 is 14; NOT 10 is - due to the way integer math works on the computers Microsoft BASIC uses - negative 11.

So why does this work at all? Well, it's not really consistent what you'll get out if you print out a statement like PRINT 3>2. But on MS BASICs, it will print out -1. This is the integer with all 1 bits set.

And that is what makes this work. Generally speaking, 0 is false, and all non-zero values are truthy. The problem with this from a logical standpoint is that if you'd like NOT to be negation, now all numbers save -1 are falsy, which means that this program prints out both lines on a C64:

code:
10 IF 3 THEN PRINT "YAY!"
20 IF NOT 3 THEN PRINT "WAIT, WHAT?"
But as long as the only values you are working with are 0 and -1, then there is no difference between bit-level and logic-level operations for determining truth and falsehood.

(This is also why languages in the C tradition use && and || in their logical tests; & and | are the bitwise checks and would suffer from the problems we see here.)

  • 1
  • 2
  • 3
  • 4
  • 5
  • Post
  • Reply