10   Compiling and linking modules
     -----------------------------

10.1   Model
       -----

The model for linking is that a game being compiled (called the "external"
program) may Link one or more pre-compiled sections of code called "modules".
Suppose the game Jekyll has a subsection called Hyde.  Then these two methods
of making Jekyll are (almost) equivalent:

   (i)  Putting Include "Hyde"; in the source code for "Jekyll", and
        compiling "Jekyll".

   (ii) Compiling "Hyde" with the -M ("module") switch set, putting
        Link "Hyde"; into the same point in the source code for "Jekyll",
        and compiling "Jekyll".

Option (ii) is of course much faster if "Hyde" does not change very often,
since its ready-compiled module can be left lying around while "Jekyll" is
being developed.  Option (ii) also slightly increases story file size, as
some code optimisations are impossible this way: e.g., linking rather than
including the library costs about 160 bytes on final story file size.

In the linking process, two incarnations of Inform need to communicate with
each other:

   A.  one in the middle of compiling a story file and wanting to link a
       module into it;

   B.  one compiling a module for future linking into some unknowable story
       file.

To be more precise, B needs to send information to A.  It does so by writing
a modified form of story file with additional tables attached, and this is
what a module is.

We discuss how Inform copes with situation A first, as this indicates the
information which B needs to send it; we then discuss how B gathers and
sends this information.

Note that A and B can never apply in the same run-through of Inform, since a
module cannot link another module into it.

Finally, note that A and B may be different ports of Inform running on
different machines at different times, since the module format is
machine-independent.

10.2   Linking a module
       ----------------

There are two tasks: (a) merging the module's data structures into those of
the external story file, and (b) mending all the constants used in these
data structures which couldn't be known at module compilation time.

(a) Merging data structures

This is formidably hard, since it means extracting tables in Z-machine
format and putting them back into Inform's own (usually higher-level) data
structures: in effect, an inverse is needed for each translation operation.
The following table briefly lists the data structures which are merged, and
how this is done:

   Structure                        Method

   Initial global variable values   Directly copy in
   Initial array entry states       Append
   Code area                        Append.  Z-code is relocatable except
                                    that function calls are to absolute
                                    addresses: these are handled as though
                                    they were references to constants
                                    unknown at module compilation time
   Static strings area              Append
   Dictionary                       Extract words from the module, one at a
                                    time, and insert into the story file's
                                    dictionary.  (The dictionary is just too
                                    complex to merge the internal
                                    structures.)
   Object tree                      Append, and also fix all the object
                                    numbers used in parent, sibling and
                                    child fields: a particular concern is
                                    with classes defined in the module,
                                    which have to be added as children of
                                    the main program's Class object
   Property values area             Append
   Individual property values       Append
   "Flags 2" header bits            Bitwise OR
   Class to object numbers table    Effectively append, moving the object
                                    numbers up appropriately.  The module's
                                    class numbers table contains offsets of
                                    the class inheritance property blocks as
                                    well, and these are appended too

(b) Mending references

The problem here is to absorb numerous references within the module into
the backpatch table for the story file.
It is not possible just to append the module's backpatch tables onto the
story file's backpatch tables: all the offsets need adjusting, and certain
kinds of backpatching need to take place immediately (to fix the four marker
values not allowable in story file backpatch tables, ACTION_MV, IDENT_MV,
VARIABLE_MV and OBJECT_MV).


10.3   Imports and exports
       -------------------

The main "extra" ingredients in a module (compared with an ordinary story
file) are: two backpatching tables (see the next section), and an
import/export table giving the names of symbols to be shared with the main
story file which will link in with it.

The language of "imports and exports" considers the module to be the home
country, and the external program to be foreign.  (An exported symbol's
value is assigned in the module and used in the external program; an
imported symbol's value is assigned in the external program and used in the
module.)

An export is a symbol defined in the module which may be referred to in the
external program.  All variables, routines and general constants defined in
the module are exported, except for (a) system-defined symbols always
present (such as "Class", "true", etc.) and (b) attributes, common
properties and labels.

An import is a symbol name used in compiling the module which isn't defined
within it.  During module compilation, just as in story file compilation, if
an unknown name is hit then an operand is constructed which is marked
SYMBOL_MV and has the symbol index as its value.  In the case of a story
file, at value backpatching time the correct value is found out and written
in.  In the case of a module, there is no value backpatching phase.
Instead, all symbols are flagged with IMPORT_SFLAG if ever used when
undefined (i.e., if any SYMBOL_MV marker is ever issued).
Any which remain undefined at the end of module compilation (in fact, at the
same point where exports are made) are "imported": that is, their names are
written down so that their existence can be checked on at Link time.  It is
an error for any imported symbol not to exist in the external program
performing the Link.

Note, therefore, that backpatch markers in the module with marker value
SYMBOL_MV may refer either to symbols which were assigned values later on
during module compilation (and are thus exported) or to symbols which were
never assigned values (and are thus imported).


10.4   Backpatch backpatching
       ----------------------

Although modules undergo branch backpatching and optimisation just as story
files do, they do not undergo value backpatching.  Instead, the two
value-backpatching tables (zcode_backpatch_table and
zmachine_backpatch_table) are appended to the module file.

The basic idea is that when the external file merges the module's data
structures into its own, it also adds the module's backpatch markers into
its own collection.  Thus, the module's code and data are finally
value-backpatched when the external program is value backpatched.

Unfortunately, these backpatch markers have addresses which are correct for
the module, but wrong for the external program: they require correction
before they can be transferred, and this is called "backpatch backpatching".
In addition to this, a few special types of backpatch markers (only ever
generated inside modules, never in story files) are dealt with immediately.

A further complication is that the module may have exported a symbol which
itself has a value needing backpatching.  For example, if the module
contains

   Constant Swine 'hog';

then, in its symbol table, the symbol Swine will have the marker DWORD_MV
attached to its value.  When this is exported to the external program, the
symbol value will need backpatch-backpatching.

The sequence of events is roughly:

   1.  Load the module into an allocated block of memory.

   2.  Merge the dictionary, finding out which dictionary accession numbers
       in the external program correspond to which in the module.

   3.  Go through the import/export table, until we know which symbol
       numbers in the module correspond to which symbol numbers in the
       external program; and which variable numbers to which; and which
       action numbers to which; and which property identifiers to which.
       (Arrays called "maps" are created to encode all these.)

   4.  Backpatch, or deal with the backpatch markers attached to, any
       exported symbols from the module.

   5.  Go through the Z-code backpatch table: deal with IDENT_MV, ACTION_MV,
       VARIABLE_MV and OBJECT_MV markers immediately, by backpatching the
       copy of the module held in memory; backpatch the other markers and
       then transfer them into the external program's Z-code backpatch
       table.

   6.  Do the same for the Z-machine backpatch table.

   7.  Now merge the memory copy of the module into the external program.

(There are actually 17 stages, but most of the later ones are mergers of
different tables which can happen in any order.)

As an example of "backpatch backpatching", the backpatch marker for the
quantity 'hog' will be DWORD_MV, with value the accession number of the word
"hog" in the module's dictionary.  This accession number needs to be changed
to the accession number of "hog" in the story file's dictionary.

The four "module only" backpatch marker values are:

   VARIABLE_MV     global variable number in module's set
   OBJECT_MV       object number in the module's tree
   IDENT_MV        identifier number in the module's set
   ACTION_MV       action number (always between 0 and 255, though always
                   in a "long" constant so that a fake action number can be
                   backpatched over it)

Note that within the module, there is no way to tell an externally defined
action from a fake action.  That is, the reference ##Duckling might be to an
external action or fake action called "Duckling".
Within the module, it is created as an action name in the usual way; at Link
time, the action map translates this module action number into a fake action
number or a real one as required.

Variable numbers in the module are not marked if they are local (or the
stack pointer), or if they are from the system set ("self", "sender" and so
on), whose numbers are guaranteed to be correct already.

In a module, the available variable numbers are 16 to N-1, where N is the
number of the lowest system-used variable (at present, 249).  Imported
variables are numbered from 16 upwards, and variables defined newly in the
module from N-1 downwards.  If these allocations collide in the middle, then
the Z-machine has run out of variables.


10.5   How modules differ from story files
       -----------------------------------

Module files are identical to story files, except that they have not
undergone value backpatching, and do not contain any part of the veneer; and
except as detailed below.

(i) Size
--------

A module is at most 64K in size.

(ii) Header entries (where different from story file header entries)
--------------------------------------------------------------------

   Byte     Contents

   0        64 + V (where V is the version number)
   1        module version number
   6,7      byte address of the module map table (note that this is the
            "initial PC value" slot in the story file header format, but no
            such value is meaningful for modules)

V is used to make sure that we aren't trying to link, e.g., a Version 5
module into a Version 8 story file: this would cause heaps of backpatch
errors, as the packed addresses would all be wrong.  Although we could in
principle automatically fix them all up, it isn't worth the trouble: the
user only needs to compile off suitable versions of the modules.

The module version number is a version number for the module format being
used: this document describes module version 1.
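
The header differences listed under (ii) can be illustrated with a minimal
Python sketch.  It is not taken from the Inform source: the function names
and the returned dictionary keys are hypothetical, and it assumes the usual
Z-machine convention that two-byte header fields are stored high byte first.

```python
import struct

MODULE_FORMAT_VERSION = 1    # the module format version this text describes
MAX_MODULE_SIZE = 64 * 1024  # "a module is at most 64K in size"


def read_module_header(data: bytes) -> dict:
    """Decode the header fields that differ from a story file header."""
    if not 8 <= len(data) <= MAX_MODULE_SIZE:
        raise ValueError("not a plausible module file size")
    if data[0] <= 64:
        raise ValueError("byte 0 is not of the form 64 + V")
    # Bytes 6,7: byte address of the module map (assumed big-endian).
    (map_address,) = struct.unpack_from(">H", data, 6)
    return {
        "z_version": data[0] - 64,      # V
        "module_version": data[1],      # module format version
        "module_map_address": map_address,
    }


def check_linkable(header: dict, story_version: int) -> None:
    """Refuse to link across format or Z-machine version mismatches."""
    if header["module_version"] != MODULE_FORMAT_VERSION:
        raise ValueError("unsupported module format version")
    if header["z_version"] != story_version:
        raise ValueError("module and story file use different versions")
```

With a header whose first byte is 69, read_module_header reports V = 5, and
check_linkable then rejects an attempt to link into a Version 8 story file.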

(iii) Class numbers table
-------------------------

In a module, the class numbers table contains additional information, and
has the form:

   ... 00 00

(In a story file, the inheritance-properties block offsets are absent.)

(iv) Identifier names table
---------------------------

This is missing altogether in a module.

(v) Dictionary
--------------

A module dictionary is in accession order, not alphabetical order.  (This is
necessary so that backpatch markers in the module, which refer to dictionary
words by accession number, can be connected with their original words at
link time.)

(vi) Module map
---------------

The module map is (currently) 15 words long and contains the following:

   Word     Contents

   0        byte address of object tree
   1        byte address of common property values table
   2        scaled address of static strings table
   3        byte address of class numbers table
   4        byte address of individual property values table
   5        number of bytes in indiv prop values table
   6        number of symbols in module's symbols table
   7        number of property identifier numbers used in module
   8        number of objects present in module
   9        byte address of import/export table
   10       its size in bytes
   11       byte address of Z-code backpatch table
   12       its size in bytes
   13       byte address of Z-machine image backpatch table
   14       its size in bytes

The map serves as an extension to the header, which has more or less run out
of available slots for new pieces of information.

(vii) Parents of class objects
------------------------------

Objects representing classes are given in the object tree as having parent
$7fff (and having next-object 0, as if each were an only child).  These are
to be understood as being children of the Class object (which is not present
in a module).
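
The module map layout in (vi) and the class-parent convention in (vii) can
be sketched in Python as follows.  The field names are invented for this
illustration and do not come from the Inform source; the sketch assumes each
"word" is two bytes stored high byte first, and it returns word 2 unscaled
because the scale factor is not stated here.

```python
import struct

# Hypothetical field names, one per word of the 15-word module map.
MODULE_MAP_FIELDS = (
    "object_tree_addr",
    "common_prop_values_addr",
    "static_strings_scaled_addr",
    "class_numbers_addr",
    "indiv_prop_values_addr",
    "indiv_prop_values_size",
    "symbol_count",
    "property_identifier_count",
    "object_count",
    "import_export_addr",
    "import_export_size",
    "zcode_backpatch_addr",
    "zcode_backpatch_size",
    "zmachine_backpatch_addr",
    "zmachine_backpatch_size",
)

CLASS_PARENT_PLACEHOLDER = 0x7FFF  # parent value given to class objects


def read_module_map(data: bytes, map_address: int) -> dict:
    """Decode the 15 words found at the address held in header bytes 6,7."""
    words = struct.unpack_from(">15H", data, map_address)
    return dict(zip(MODULE_MAP_FIELDS, words))


def is_class_object(parent: int) -> bool:
    """True if a parent field marks a child of the absent Class object."""
    return parent == CLASS_PARENT_PLACEHOLDER
```

A linker following the merging steps of 10.2 would replace every parent
field for which is_class_object holds with the number of the external
program's own Class object.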

(viii) Import/export table
--------------------------

This is a sequence of symbols (mixing up imports and exports in no
particular order), occupying records as follows:

   Record type                                (1 byte)
   Symbol number in module symbols table      (2 bytes)
   Symbol type                                (1 byte)
   Symbol backpatch marker                    (1 byte)
   Symbol value                               (2 bytes)
   Symbol name                                (null terminated)

where the possible record types are:

   IMPORT_MV       import a symbol (in which case the symbol backpatch
                   marker is omitted)
   EXPORT_MV       export a symbol defined in a non-system file
   EXPORTSF_MV     export a symbol defined in a system file
   EXPORTAC_MV     export an action name (in the form "Name__A")

Note: we need to distinguish between EXPORT_MV and EXPORTSF_MV in order to
make Replacement of routines work properly.  Suppose the user asks to
Replace the routine DrawStatusLine, normally found in the parser, and then
links the parser module and also a module of his own containing his new
definition of DrawStatusLine.  The linker knows which of these to accept
because the first was compiled from a system module, and thus exported with
EXPORTSF_MV, while the second was not, and was exported with EXPORT_MV.
Other than for this purpose, the two values are treated equivalently.

(ix) Z-code backpatch table
---------------------------

This is exactly the module's Z-code backpatch table, no different in format
from that used for backpatching story files (except that four additional
marker values are permitted).

(x) Z-machine image backpatch table
-----------------------------------

This is exactly the module's Z-machine backpatch table, no different in
format from that used for backpatching story files (except that four
additional marker values are permitted).


10.6   Restrictions on what modules may contain
       ----------------------------------------

[1]  It is impractical to allow the module and the external program to use
     each other's attribute and property names, because there is no rapid
     way of repairing the module's object tables.
     Instead, both program and module must use the same Include file to make
     the same attribute and property definitions as each other.

[2]  An object defined in a module cannot:

     (a) have an externally-defined parent object, or
     (b) inherit from an externally-defined class.

[3]  A module can only use externally-defined global variables if they have
     been explicitly "imported" using the Import directive.

[4]  A module cannot use Verb or Extend: that is, it cannot define grammar.

[5]  A module cannot use Stub or Default (for obvious reasons).

[6]  A module cannot use "unknown at compile time" constants in every
     context.  (Unknown in that, e.g., 'frog' and MAX_FROGS might be
     unknown: the former because the dictionary address of 'frog' in the
     final program can't be known yet, the latter (say) because MAX_FROGS is
     a constant defined only in the external program.)

     In the following list of contexts in which constants occur in Inform
     source code, those marked with a (*) are not permitted to be "unknown
     at compile time":

         1.  In high-level source code (including as a switch value and a
             static string in a "box" or "string" statement).
         2.  In assembly language source code.
         3.  As the initial entries in an array.
         4.  As the initial value of a variable.
         5.  In a CONSTANT definition.
         6.  As an object's (common or individual) property value.
      *  7.  As a release number.  (Defining the module release number
             only.)
      *  8.  As a version number.  (Modules are always version 5.)
      *  9.  In grammar.  (You can't define grammar in a module.)
      * 10.  As the size of an array.
      * 11.  In a DEFAULT constant definition.  (You can't use Default.)
      * 12.  In an IFTRUE or IFFALSE condition.
      * 13.  As a property default value (though this is unnecessary since
             the definitions in the outside program take precedence).
      * 14.  As the number of duplicate objects provided for a class.

     Only (10) and (14) are restrictive.  Combining a module with a short
     Include file to make such definitions will get around them.
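
The import/export record layout of 10.5 (viii), which is where the linker
learns the names that these restrictions apply to, can be sketched as a
small Python reader.  This is only an illustration under stated assumptions:
the numeric record-type codes below are placeholders (the real values are
not given in this section), two-byte fields are assumed to be stored high
byte first, and the Record type and function name are hypothetical.

```python
import struct
from typing import Iterator, NamedTuple, Optional

# Placeholder codes chosen for this sketch only, NOT the real Inform values.
IMPORT_MV, EXPORT_MV, EXPORTSF_MV, EXPORTAC_MV = 1, 2, 3, 4


class Record(NamedTuple):
    record_type: int
    symbol_number: int      # index into the module's symbols table
    symbol_type: int
    marker: Optional[int]   # backpatch marker; None for imports
    value: int
    name: str


def read_import_export_table(table: bytes) -> Iterator[Record]:
    """Yield each record of an import/export table held in memory."""
    pos = 0
    while pos < len(table):
        record_type = table[pos]
        symbol_number, symbol_type = struct.unpack_from(">HB", table, pos + 1)
        pos += 4
        marker = None
        if record_type != IMPORT_MV:
            # The marker byte is omitted from import records.
            marker = table[pos]
            pos += 1
        (value,) = struct.unpack_from(">H", table, pos)
        pos += 2
        end = table.index(0, pos)   # the name is null terminated
        name = table[pos:end].decode("ascii")
        pos = end + 1
        yield Record(record_type, symbol_number, symbol_type,
                     marker, value, name)
```

At Link time, each yielded import would be looked up by name in the external
program's symbols table, and a missing name reported as an error.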