LORD DIMWIT FLATHEAD: "It must have two hundred thousand rooms, four million takeable objects, and understand a vocabulary of every single word ever spoken in every language ever invented."
The New Zork Times (Winter 1984)
[Note: the information below has not been updated since the 20th June 1997 revision of this document.]
To give some idea of the sizes found in typical story files, here are a few statistics, mostly gathered by Paul David Doherty, whose "Infocom fact sheet" file is the definitive reference.
(i) Length
The shortest files are those dating from the time of the 'Zork' trilogy, at about 85K; middle-period Version 3 games are typically 105K, and only the latest use the full memory map. In Versions 4 and 5, only 'Trinity', 'A Mind Forever Voyaging' and 'Beyond Zork' use the full 256K. 'Border Zone' and 'Sherlock', for instance, are about 180K. (The author's short story 'Balances' is about 50K, an edition of 'Adventure' takes 80K, and 'Curses' takes 256K (it's padded out to the maximum size with background information; the actual game comprises only about 245K). Under Inform, the library occupies about 35K regardless of the size of game.)
(ii) Code size
'Zork I' uses only about 5500 opcodes, but the number rises steeply with later games: 'Hollywood Hijinx' has 10355 and 'Moonmist', for example, has 15900 (both being Version 3). Against this, 'A Mind Forever Voyaging' has only 18700, and only 'Trinity' and 'Beyond Zork' reach 32000 or so. (Inform games are more efficiently compiled and make better use of common code -- the library -- so they perform much better here: the old Version 3, release 10 of 'Curses' (128K long, and a larger game than any Infocom Version 3 game) has only 6720 opcodes.)
(iii) Objects and rooms
This varies greatly with the style of game. 'Zork I' has 110 rooms and 60 takeable objects, but several quite complex games have as few as 30 rooms (the mysteries, or 'Hitch-hikers'). The average for Version 3 games is 69 rooms, 39 takeable objects.
'A Mind Forever Voyaging' contains many rooms (178) but few objects (30). 'Trinity', a more typical style of game, contains 134 rooms and 49 objects: the Version 5 'Curses' has a few more of each. Of the Version 6 games, only 'Zork Zero' scores highly here, with 215 rooms and 106 objects. The average for Version 4/5 games is 105 rooms and 54 objects.
The total number of objects tends to be close to the limit of 255 in Version 3 games. 'Curses' contains 508.
(iv) Dictionary
Early games such as 'Zork I' know about 600 words, but again this rises steeply to about 1000 even in Version 3. Later games range from 1569 words ('Beyond Zork') up to the record of 2120 ('Trinity'). (This is achieved by heroic inclusion of unlikely synonyms: e.g. the Japanese lady with the umbrella can be called WOMAN, LADY, CRONE, MADAM, MADAME, MATRON, DAME or FACE with any of the adjectives OLD, AGED, ANCIENT, JAP, JAPANESE, ORIENTAL or YELLOW.) Version 6 games have smaller dictionaries. So has 'Curses', at 1364.
(v) Opcodes
(a) Of the 1426854 opcodes in the shipped Infocom story files in Paul David Doherty's collection, here are the top and bottom ten most popular. (Leaving out those which never occur and so score 0: nop, art_shift, piracy and the two post-Infocom opcodes, print_unicode and check_unicode.)
    Top Ten Opcodes Chart          Bottom Ten Opcodes Chart

     1.  je          195959         1.  print_form        2
     2.  print       142755         2.  erase_picture     3
     3.  jz          112016         3.  read_mouse        3
     4.  call_vs     104075         4.  encode_text       7
     5.  print_ret    80870         5.  make_menu         9
     6.  store        71128         6.  not              14
     7.  rtrue        66125         7.  scroll_window    16
     8.  jump         56534         8.  pop_stack        17
     9.  new_line     52553         9.  restore_undo     18
    10.  test_attr    46627        10.  mouse_window     22
So about 2/3rds of all opcodes are those in the top ten; about 1 in 7 opcodes is a je, and only 1 in 710000 is a print_form.
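The proportions quoted above can be re-derived from the chart. The following is only a minimal sketch: the counts and the total are copied from the text, while the names TOTAL_OPCODES, TOP_TEN, share and one_in are made up for illustration.

```python
# Counts copied from the Top Ten chart; the total is the figure quoted in (a).
TOTAL_OPCODES = 1426854

TOP_TEN = {
    "je": 195959, "print": 142755, "jz": 112016, "call_vs": 104075,
    "print_ret": 80870, "store": 71128, "rtrue": 66125, "jump": 56534,
    "new_line": 52553, "test_attr": 46627,
}


def share(count, total=TOTAL_OPCODES):
    """Fraction of all opcodes accounted for by `count` occurrences."""
    return count / total


def one_in(count, total=TOTAL_OPCODES):
    """Express a count as '1 in N' of the total, returning N."""
    return total / count


top_ten_share = share(sum(TOP_TEN.values()))
je_one_in = one_in(TOP_TEN["je"])
print_form_one_in = one_in(2)  # print_form occurs twice in the collection

print(f"top ten share:  {top_ten_share:.3f}")
print(f"je:             1 in {je_one_in:.2f}")
print(f"print_form:     1 in {print_form_one_in:.0f}")
```

On these figures the top ten account for roughly 65% of all opcodes, je works out to about 1 in 7.3, and print_form to about 1 in 713000.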
(b) An experiment (conducted with the help of Kevin Bracey) sheds some light on the opcodes most frequently interpreted in typical play. Two very different games ('Zork I', Version 5 "solid gold" edition; 'Museum of Inform', a complex Inform example game) were played for about 50000 cycles of the Z-machine (about 20 moves in 'Zork I', rather less in the 'Museum'). The following table records all opcodes with a frequency of at least 1% (i.e., 0.01):
    Zork I Solid Gold (Infocom)      Museum of Inform (Inform)

    0.116110  loadb                  0.104952  je
    0.103990  storeb                 0.101151  jz
    0.101616  jz                     0.092727  jump
    0.074979  dec_chk                0.080985  jg
    0.066375  add                    0.079039  jl
    0.066283  je                     0.070550  inc
    0.060760  store                  0.070139  store
    0.053867  loadw                  0.047058  loadw
    0.038095  storew                 0.034137  get_prop_addr
    0.036428  mul                    0.024105  jin
    0.032069  inc_chk                0.022734  rtrue
    0.030243  jump                   0.021583  storew
    0.029170  test_attr              0.020075  add
    0.020634  call_vs                0.018485  call_vs
    0.011184  get_sibling            0.016731  and
                                     0.016082  loadb
                                     0.012061  call_vn
                                     0.011879  test_attr
                                     0.011824  dec
                                     0.011687  ret
Adventure games spend most of the time parsing, and the differences between these tables reflect different parser designs (byte arrays versus word arrays and arrays stored in properties) as well as different compiler code generators (Inform does not use inc_chk or dec_chk, so it uses inc, dec, jl and jg correspondingly more). In the case of 'Zork I', about a third of all opcodes are branches: in the case of 'Museum', almost half.
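The branch proportions can be checked against the table. This is a sketch under stated assumptions: the frequencies are copied from the table (so opcodes below the 1% cut-off are missing), the set of opcodes counted as branches is an illustrative choice, and jump is counted alongside the conditional branches even though it is an unconditional jump. The names ZORK_I, MUSEUM, CONDITIONAL_BRANCHES and branch_share are invented for the example.

```python
# Frequencies copied from the table in (b); only opcodes at 1% or above appear.
ZORK_I = {
    "loadb": 0.116110, "storeb": 0.103990, "jz": 0.101616, "dec_chk": 0.074979,
    "add": 0.066375, "je": 0.066283, "store": 0.060760, "loadw": 0.053867,
    "storew": 0.038095, "mul": 0.036428, "inc_chk": 0.032069, "jump": 0.030243,
    "test_attr": 0.029170, "call_vs": 0.020634, "get_sibling": 0.011184,
}

MUSEUM = {
    "je": 0.104952, "jz": 0.101151, "jump": 0.092727, "jg": 0.080985,
    "jl": 0.079039, "inc": 0.070550, "store": 0.070139, "loadw": 0.047058,
    "get_prop_addr": 0.034137, "jin": 0.024105, "rtrue": 0.022734,
    "storew": 0.021583, "add": 0.020075, "call_vs": 0.018485, "and": 0.016731,
    "loadb": 0.016082, "call_vn": 0.012061, "test_attr": 0.011879,
    "dec": 0.011824, "ret": 0.011687,
}

# Assumption: the opcodes from the two tables that take a branch operand.
CONDITIONAL_BRANCHES = {
    "je", "jz", "jg", "jl", "jin", "dec_chk", "inc_chk",
    "test_attr", "get_sibling",
}


def branch_share(freqs, include_jump=True):
    """Sum the frequencies of branch opcodes, optionally counting jump too."""
    ops = CONDITIONAL_BRANCHES | ({"jump"} if include_jump else set())
    return sum(f for op, f in freqs.items() if op in ops)


for name, freqs in (("Zork I", ZORK_I), ("Museum", MUSEUM)):
    print(f"{name}: {branch_share(freqs):.3f} with jump, "
          f"{branch_share(freqs, include_jump=False):.3f} without")
```

With jump counted, the listed opcodes give a little over a third for 'Zork I' and just under a half for 'Museum'; leaving jump out lowers both figures, so "about a third" and "almost half" appear to count it.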