“[A game in alpha-testing has] on the order of 4,000 bugs. Maybe fifty percent are spelling and punctuation errors, extra spaces, missing blank lines, and so on. Maybe one percent are crashes.”
“I try to make sure that the sun is in the right position, or if you're in outer space the sun and moon are where they're supposed to be – stuff like that… The wooden beam [in ‘Infidel’] is described as being a certain length and width, and I calculated that it would have to weigh 500 pounds.”
— Max Buxton and Gary Brennan (Infocom play-testers), 1987
So the game is built: the wood is rough and splintered, but it's recognisably a game. There is still a good month's work to do, easier work if less creative, and beyond that a good deal of drudgery to fix bug after bug after bug.1 The first post-design task is to sort out the scoring system, usually awarding points out of some pleasingly round number and dividing them into rankings. Here is ‘Zork II’:
Beginner (0), Amateur Adventurer (40), Novice Adventurer (80), Junior Adventurer (160), Adventurer (240), Master (320), Wizard (360), Master Adventurer (400)
This is disappointingly bland, and a more pleasing tradition is to name ranks for the player's profession in the game – so that an orchestral musician might begin as Triangle and rise through Second Violinist to Conductor. (In ‘Sherlock’, the lowest rank – corresponding to zero achievement – is Chief Superintendent of Scotland Yard.) Among the questions to ask are: will every winner of the game necessarily score exactly 400 out of 400? (This can be difficult to arrange if even small acts are scored.) Will everyone entering the end game already have a score of 360, and so have earned the title “Wizard”? Will the rank “Amateur” correspond exactly to having got out of the prologue and into the middle game?
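Questions like these are easier to settle once the ranking table is written down as data rather than scattered through the game. What follows is a minimal sketch in Python, not any game's actual code: the thresholds and titles are copied from the ‘Zork II’ list above, while the names `RANKS` and `rank_for` are invented for illustration.

```python
from bisect import bisect_right

# Thresholds and titles copied from the 'Zork II' ranking quoted above.
RANKS = [
    (0, "Beginner"),
    (40, "Amateur Adventurer"),
    (80, "Novice Adventurer"),
    (160, "Junior Adventurer"),
    (240, "Adventurer"),
    (320, "Master"),
    (360, "Wizard"),
    (400, "Master Adventurer"),
]
THRESHOLDS = [threshold for threshold, _ in RANKS]


def rank_for(score):
    """Return the title of the highest rank whose threshold is <= score."""
    # bisect_right counts the thresholds <= score; clamping at zero gives
    # a negative score the lowest rank instead of wrapping round the list.
    index = max(bisect_right(THRESHOLDS, score) - 1, 0)
    return RANKS[index][1]


for score in (0, 39, 40, 360, 400):
    print(score, rank_for(score))
```

With the table in one place, a question such as “does a score of 360 really mean Wizard?” becomes a one-line check, and renaming the ranks for the player's profession touches only the data.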
Unless the scoring system is worked out and the game can pass its entire transcript of the “winning” solution without crashing or giving absurd replies, it is too soon to go into play-testing.
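The first half of that test can be partly mechanised. The sketch below is an assumption-laden illustration rather than a real testing tool: `step` stands in for whatever hypothetical function passes one command to the game and returns its reply, and `toy_game` is a two-command stand-in written only so that the example runs. It catches crashes and empty replies; absurd replies still need a human being to read the output.

```python
def replay(step, commands):
    """Feed each command to `step` and return a list of (command, problem).

    `step` is a hypothetical stand-in for the game: it takes a command
    string and returns the reply text.
    """
    problems = []
    for command in commands:
        try:
            reply = step(command)
        except Exception as error:  # a crash somewhere in the game code
            problems.append((command, f"crashed: {error!r}"))
            continue
        if not reply.strip():
            problems.append((command, "empty reply"))
    return problems


def toy_game(command):
    """A two-command toy game, purely for demonstration."""
    replies = {"open door": "The door swings open.", "north": "You win."}
    return replies[command]  # raises KeyError on anything unrecognised


winning_solution = ["open door", "north"]
print(replay(toy_game, winning_solution))        # the winning line
print(replay(toy_game, ["open door", "xyzzy"]))  # includes a command the toy cannot handle
```

Rerunning the same list of commands after every change to the game is a cheap way to notice that a fix somewhere else has broken the winning line.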
▲ Scoring systems vary greatly. ‘Adventure Quest’ is scored “out of about 6,000”, and exemplifies the pinball-machine-like tendency to offer points by 5s, 10s or 100s. Other games feel that one puzzle is one point, or award percentages, and still others frown on score altogether because “that's not how life works”. In ‘Moonmist’, scores are described thus: “[Well, so far you've met Lord Jack and all of the guests, washed up from your trip... but you haven't found the hidden treasure nor enough evidence nor identified the ghost!]”. In ‘Zork III’, the player's “potential” is given out of 7, corresponding to which of seven challenges have been encountered (so that a score of 7 does not mean the game is over). In ‘The Lurking Horror’, 20 major puzzles are awarded 5 points apiece for a maximum of 100: the 20th puzzle is to win the game. In some ports of ‘Advent’ 1 point is awarded for each room visited for the first time, and 1 for never having saved the game, a mean trick, plus the infamous “Last Lousy Point”, awarded without any clue for dropping a particular object in a particular place, an irrelevant act achieving nothing. (People used to have to disassemble the mainframe game to discover this.)
· · · · ·
During the writing and maintenance of ‘Christminster’, Gareth Rees kept a log of all 475 modifications prompted by play-testers and players. This log is archived with the game's source code at ftp.gmd.de and makes an interesting case study. 224 reports requested additional interactivity and responses, often to reasonable but wrong guesses made by the player. A further 86 arose from incorrect responses or inconsistencies, 32 from typographical errors and 79 from mistakes in computer programming, for instance in the game's complicated algorithms to handle telephony and the mixing of liquids.
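A little arithmetic puts these figures in proportion. The sketch below uses only the counts quoted above; the four named categories account for 421 of the 475 entries, and the text does not say how the remaining 54 divide, so they are reported separately rather than guessed at.

```python
TOTAL = 475  # modifications logged, as stated above

# The four categories for which the text gives counts.
reports = {
    "additional interactivity and responses": 224,
    "incorrect responses or inconsistencies": 86,
    "typographical errors": 32,
    "programming mistakes": 79,
}

categorised = sum(reports.values())
uncategorised = TOTAL - categorised  # entries the text does not break down

for label, count in reports.items():
    print(f"{label}: {count} ({count / TOTAL:.0%})")
print(f"not broken down here: {uncategorised} ({uncategorised / TOTAL:.0%})")
```

Nearly half of the log, on these figures, consists of requests for more responses rather than reports of anything strictly broken.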
At every stage in writing an interactive fiction it is easy to lapse into the habit of writing an uninteractive one. A designer who has written a linear story and then introduced some puzzles may imagine that the literary style and effect of the game comes from the text originally written, but that isn't altogether true: most of the player's time at the keyboard is spent trying the wrong thing, so most of the player's experience of the game lies in how it deals with wrong guesses. This means that it's essential to respond to as many wrong guesses as possible, acknowledging that the player has made an honest attempt, and so helping to form a sort of relationship:
In the aquarium is a baby sea-serpent who eyes
you suspiciously. His scaly body writhes about in the huge tank.
>take serpent
He takes you instead. *Uurrp!*
This is from ‘Zork II’, a program which is at least twice the size of ‘Advent’ in spite of implementing a much smaller design. Almost all of that disparity is due to its generous stock of responses. Similarly, ‘Zork I’ contains possibly the first examples of alternative solutions to puzzles (the cyclops can be defeated in two different ways, as can the Loud Room). If a play-tester can think of a reasonable solution to which the game does not respond, it is worth considering a redesign of the puzzle to allow both solutions. Even if not, a response should be made which acknowledges that the player has made a good guess.
· · · · ·
Bugs in interactive fiction are individually puny, yet daunting by their number, like a column of army ants. Just as the ‘Christminster’ log (see above) gives an idea of routine testing, so Graeme Cree's catalogue of bugs in released Infocom games (at www.xyzzynews.com) shows what can slip through the most rigorous testing regime. Here are some common types of bug:
· · · · ·
The days of play-testing are harrowing. Dave Lebling again, on ‘Suspect’:
>bartender, give me a drink
“Sorry, I've been hired to mix drinks and that's all.”
>dance with alicia
Which Alicia do you mean, Alicia or the overcoat?
Veronica's body is slumped behind the desk, strangled with a lariat.
>talk to veronica
Veronica's body is listening.
(“Little bugs, you know? Things no one would notice. At this point the tester's job is fairly easy. The story is like a house of cards – it looks pretty solid but the slightest touch collapses it.”)
Good play-testers are worth their weight in gold. Their first contribution is to try things in a systematically perverse way. To quote Michael Kinyon, whose effect may be felt almost everywhere in the present author's games,
A tester with a new verb is like a kid with a hammer; every problem seems like a nail.
And here is Neil deMause, on one of his play-testers:
He has an odd compulsion, when he plays IF games, to close doors behind him. It's a bizarre fastidiousness, not even remotely useful for an IF player, but I love him for it, because he has uncovered bugs in this way that I never would have found.
Games substantially grow in play-testing, and come alive. Irene Callaci's acknowledgements could speak for all designers:
I thought perhaps beta testing might reveal a couple of odd, off-the-wall commands that weren't implemented, or maybe a typo here and there, or possibly an adjective or two I had forgotten. Not! I wasn't even close to being finished, and I didn't even know it. ‘Mother Loose’ grew from 151K to 199K during the beta testing period alone. Looking back now, if I had released ‘Mother Loose’ when I thought it was ready, I would have crawled under a rock from embarrassment. Thank you, thank you, thank you to all my beta testers.
More is true even than this: the play-tester is to interactive fiction as the editor is to the novel, and should be credited and acknowledged as such. Major regions of ‘Curses’ and ‘Jigsaw’ were thrown out politely but firmly by my own play-testers as being substandard or unsuitable.3 A radical response to the play-tester's doubts is almost always better than papering over cracks.
After a first pass by one or two play-testers, and a consequent redrafting exercise, the game can go to beta testing at the hands of perhaps six or seven volunteers, who come to it fresh and treat it more as an entertainment and less as an unexploded bomb. (At one time Infocom used two phases of beta-testing, sometimes involving as many as 200 volunteers, even after pre-alpha and alpha-testing in house.) It is wise to insist on reports in writing or email, or some concrete form, and to ask for a series of reports, one at a time, rather than waiting a month for an epic list of bugs. It can be useful for play-testers to keep transcripts of their sessions with the game, and send them verbatim, because these transcripts are eloquent of how difficult or easy the puzzles are and which wrong guesses are tried. In its debugging version, ‘Jigsaw’ provided a verb called “bug” purely to help players type comments into such a transcript:
>bug Miss Shutes is known as he
Oh dear.
>bug The corn bread isn't edible
Is that so?
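Once play-testers are sending transcripts back, comments entered this way can be collected mechanically. The following is only a sketch, assuming that commands appear on lines beginning with “>” as in the excerpt above; the function name `bug_comments` is invented for illustration and is not part of ‘Jigsaw’.

```python
def bug_comments(transcript):
    """Return the text of every '>bug ...' command found in a transcript."""
    comments = []
    for line in transcript.splitlines():
        stripped = line.strip()
        # Commands are assumed to start with '>' as in the excerpt above.
        if stripped.lower().startswith(">bug "):
            comments.append(stripped[len(">bug "):].strip())
    return comments


sample = """\
>bug Miss Shutes is known as he
Oh dear.
>bug The corn bread isn't edible
Is that so?
"""
for comment in bug_comments(sample):
    print(comment)
```

The rest of the transcript is still worth reading in full, since the wrong guesses around each comment are often as informative as the comment itself.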
It is worth keeping in touch with play-testers to ensure that they are not utterly stuck because of a bug or an unreasonable puzzle, but it's important to give no hints unless they are asked for.
· · · · ·
A game is never finished, only abandoned. There is always one more bug, or one more message which could be improved, or one more wry response to drop in. Debugging is a creative process, even beyond the initial release, and games commonly have four to ten revisions in their first couple of years in play. In the end, of course, the designer walks away. Almost all the pre-1990 designers cited in the bibliography are still alive, but few are still designing, and they often speak of their games as something fun but belonging to another time in their lives: something they feel faintly self-conscious about, perhaps, something that they did years ago: when they were in high school, when their kids were young, when they did a little testing for Infocom, when computers were less visual, when it was the state of the art, when the Ph.D. was at an impasse. But if twenty-five years is an epoch in computing, it is not a long time in the history of art, and the early designers remain a presence in a genre which is younger and less settled than sometimes appears. Once in a while Scott Adams causes a frisson by throwing a remark into a newsgroup mostly read by designers to whom he seems a historical figure coeval with Scott of the Antarctic. Can such a man really have an email address?
In a recent radio broadcast (1999), Douglas Adams said that the great enjoyability of working on his games with Infocom was having fun with a new medium before it became an art form and had serious articles written about it. (Which pretty much puts this chapter in its place.) But that original fascination dies hard, and the first and happiest discovery of anyone researching into interactive fiction is that designers past are only too pleased to be rediscovered, and willing to go to great trouble – hunting through archives, attics and obsolete equipment – to see that their games can be trodden again.
An adventure game can be one of the most satisfying of works to have written: perhaps because one can always polish it a little further, perhaps because it has hidden and secret possibilities. But perhaps too because something is made as well as written: and once made can never be unmade, so that there will always be a small brick building by the end of the road, and in it there will always be keys, food, a bottle and the lamp.
1 Dave Lebling starts work in the morning: “Even a cup of yummy coffee won't improve things when you see ‘page 1 of 12’ on the first bug report form.” Many good-natured pieces about testing appear in Infocom's publicity newspapers. At best, it was enormous fun, and Liz Cyr-Jones's testing department made a lively and exhilarating summer job. But there were also tensions. The tester who wrote that 12-page form saw it not as dismaying but as the proud result of a job well done. It could be frustrating that the bugs would be fixed at random intervals, or not at all, while revised versions of the game would sometimes arrive without any clear indication of what had altered. Brian Moriarty redesigned great swathes of game at the last minute, and the fact that he had always said he would did not make this any less maddening.
2 It wasn't until the fifth proofs of §12 of this manual that any of us noticed that the “great stone slab” altar of ‘Ruins’ had never been made static.
3 Similarly, the published source code to ‘Christminster’ contains “offcuts” such as a pulley-and-rope puzzle in the clock tower.