11 Service levels
-----------------


11.1 Variables and arrays
-------------------------

Each of the sections is responsible for the initialisation of its own
variables, and for the allocation and deallocation of its own arrays.  To
this end, every section except "inform.c" (which is considered to be in
charge of the others) provides a set of four routines to manage its data
structures:

    init_*_vars()        called when the compiler is beginning a run
    *_begin_pass()       called at the beginning of the compilation pass
                         through the source code: this usually sets some
                         of the variables
    *_allocate_arrays()  called when the compiler is beginning a run
    *_free_arrays()      called when the compiler is ending its run

Note that *_free_arrays() must exactly undo what *_allocate_arrays() has
done.

The 19 routines of each kind are polled in no particular order (actually,
in alphabetical order).

Note that there are also

    lexer_begin_prepass()
    files_begin_prepass()

which need to happen even earlier than the general *_begin_pass() poll,
and also

    lexer_endpass()
    linker_endpass()

but since the other sections generally don't need to do anything at end of
pass, it wasn't worth making a formal poll of it.

Static initialisation of variables is only allowed when they are certain
never to change (for example, the string containing the accent escape
characters never changes).


11.2 Memory allocation and deallocation
---------------------------------------

All allocation and deallocation is routed through routines given in
"memory.c": specifically,

    my_malloc()  and  my_free()

Note that these are called with a string as an extra parameter (as compared
with malloc() and free()), which is used to print helpful error messages in
case of collapse.  This is where the -m ("print the memory allocation")
switch is implemented, and it's also a good place to add any OS-specific
code which may be needed.
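
As a rough illustration of the idea (not the actual code of "memory.c"), a
wrapper of this kind might look like the sketch below.  The names
sketch_malloc(), sketch_free() and memout_switch are hypothetical, as are
the parameter lists and message texts; only the general scheme (an extra
descriptive string, a trace when -m is set, a fatal error on failure) comes
from the description above.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical flag standing in for the -m switch. */
static int memout_switch = 0;

/* Allocate 'size' bytes; 'purpose' describes what the block is for and
   is used only in diagnostic output.  Never returns NULL: a failed
   allocation is treated as fatal. */
static void *sketch_malloc(size_t size, const char *purpose)
{
    void *block = malloc(size);

    if (block == NULL) {
        fprintf(stderr, "Fatal error: out of memory allocating %lu bytes for %s\n",
                (unsigned long) size, purpose);
        exit(1);
    }
    if (memout_switch)
        printf("Allocating %lu bytes for %s\n", (unsigned long) size, purpose);
    return block;
}

/* Release a block obtained from sketch_malloc(); NULL is ignored. */
static void sketch_free(void *block, const char *purpose)
{
    if (block == NULL)
        return;
    if (memout_switch)
        printf("Freeing memory for %s\n", purpose);
    free(block);
}
```

A caller would write something like sketch_malloc(4096, "symbols table").
Because every request passes through one pair of routines, a wrapper of
this shape is also where any OS-specific allocation code would go.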

This may be needed because, under ANSI rules, malloc() is entitled to refuse
to allocate blocks of memory any larger than 32K minus a few bytes.  On a
few implementations malloc() actually does this, even on very large
machines, because it's convenient to avoid memory segmentation problems.
(For instance, on a few versions of C for the Macintosh, it's possible to
have 16M free and still be refused a request for 33K of it.)  Consequently
it may be necessary to use some compiler-specific routine instead.

Inform allocates many arrays (50 or so) and in a normal compilation run, the
largest of these occupies about 26K.  However, the implementation of Link
requires the whole of a module file to be held in one contiguous block of
memory, potentially 64K long; and the same applies to the array holding the
unpaged memory area of the Z-machine during construct_storyfile().

Failure of my_malloc() causes a fatal error.

Inform also provides a structure (memory_block) for allocating extensible
regions of memory (at a slight access-speed penalty): these allocate 8K
chunks as needed.  Three of these are used as alternatives to the temporary
files when -F0 is operating, and two more are used to hold backpatch
markers.


11.3 Error messages
-------------------

All diagnostic and error messages are routed through the "errors.c" section
(except for errors in parsing ICL, as this takes place before the compiler
itself is active).

There are three kinds of diagnostic: warning, error and fatal error.  Fatal
errors cause Inform to halt with an exit(1), which is about as fatal as
anything can be to a C program.  (Fatal errors are mainly issued when the
memory or file-handling environment seems to have gone wrong.)  Errors allow
compilation to carry on, but suppress the production of any output at the
end: in some ways this is a pity, but there are not many errors which one
can be confident of safely recovering from.  Note that a sequence of
MAX_ERRORS (normally 100) errors causes a fatal error.
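
The three severities can be pictured with the following minimal sketch.  It
is not the code of "errors.c": all of the names (sketch_warning(),
sketch_error(), sketch_fatal(), sketch_output_allowed(), SKETCH_MAX_ERRORS)
are hypothetical, and the exact point at which the error limit trips is an
assumption.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stands in for MAX_ERRORS; the text gives its normal value as 100. */
#define SKETCH_MAX_ERRORS 100

static int sketch_warnings = 0;
static int sketch_errors = 0;

/* Fatal error: report and halt at once with exit(1). */
static void sketch_fatal(const char *message)
{
    fprintf(stderr, "Fatal error: %s\n", message);
    exit(1);
}

/* Warning: counted and reported, with no effect on output. */
static void sketch_warning(const char *message)
{
    sketch_warnings++;
    fprintf(stderr, "Warning: %s\n", message);
}

/* Error: compilation carries on, but too many of them become fatal.
   (Assumption: the limit trips when the count reaches the maximum.) */
static void sketch_error(const char *message)
{
    sketch_errors++;
    fprintf(stderr, "Error: %s\n", message);
    if (sketch_errors >= SKETCH_MAX_ERRORS)
        sketch_fatal("too many errors: giving up");
}

/* Output is produced at the end only if no error was ever reported. */
static int sketch_output_allowed(void)
{
    return sketch_errors == 0;
}
```

Under this scheme a single call to sketch_error() is enough to make
sketch_output_allowed() return 0, however many warnings precede it.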

Error messages which are generated during the compilation pass try to quote
the source code line apparently responsible (unless the "concise" switch is
set): but Inform may be unable to do this if the text was read and lost too
long ago, for example in the case of a "declared but not used" warning.

Note that informational messages (such as the banner line, the statistics
printout, etc.) are not routed through "errors.c", or indeed anywhere else.

Error recovery in Inform is mainly a matter of going into "panic mode"
(amusingly, this is a term of art in compiler theory): tearing through the
tokens until a semicolon is reached, then starting afresh with a new
statement or directive.  This pretty often goes wrong (especially if a
routine start/end is not noticed), and more work might be needed here.


11.4 File input/output
----------------------

The "files.c" section handles almost all of the file input/output in Inform:
the two exceptions are in cli_interpret() in "inform.c", which loads in ICL
files; and in link_module() in "linker.c", which loads in modules (a simple
job which there was no point abstracting).

The biggest shake-up to this section since Inform 5 was the realisation
that, far from being a grubby low-level activity, working out good filenames
was in fact a task which had to be done much higher up in Inform: so all of
the code which used to do this is now located in the ICL interpreter in
"inform.c".

What remains in "files.c" is:

    opening source code and reading it into buffers;
    opening and closing the temporary files;
    calculating the checksum and length of, and then outputting, the
        story file or module being compiled;
    opening, writing to and closing the text transcript file;
    opening, writing to and closing the debugging information file.

For each different source file that has ever been open, "files.c" maintains
a structure called FileId, containing the full filename and a file handle.
The files are never read from except via the routine file_load_chars(),
which fills up the lexical analyser's buffers.

The routine output_file() writes the story/module file: see its comments for
details.  It takes up where construct_storyfile() leaves off, and
contributes the last two pieces of header data to be worked out, the
checksum and the length fields.  Note that the checksum is only known after
the whole file has been written (mainly because Z-code backpatching alters
the checksum, but can't be done until output time without consuming a good
deal of extra memory); so an fseek() call is made to move the write position
back into the header and overwrite it with the correct checksum just before
closing the file.  This call is made in a strictly ANSI way but, like
everything on the fringes of the C library-to-operating-system interface,
may cause trouble on some compilers.

The other routines are largely self-explanatory.  There are three temporary
files (although they won't be used if -F0 applies), as follows:

    Temporary file    Contents
    1                 The static strings area
    2                 The Z-code area
    3                 The link data area (only opened and used in -M,
                      module mode)

For the format of the debugging information file, see the utility
"infact.c", which prints from it.  This format is likely to change in the
near future in any case, as it is already inadequate for some of the new
features in Inform 6.
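
To return to output_file(): the seek-back step it performs to fill in the
checksum can be sketched as follows.  This is an illustration only, not the
routine itself.  The function name is hypothetical, and the header layout
used (a 64-byte header, a two-byte big-endian checksum at offset 0x1C, and a
checksum formed by summing every byte after the header modulo 0x10000) is
taken from the Z-machine standard rather than from this section, so treat
those constants as assumptions.

```c
#include <stdio.h>

#define SKETCH_HEADER_SIZE     0x40  /* assumed length of the header    */
#define SKETCH_CHECKSUM_OFFSET 0x1C  /* assumed position of the checksum */

/* Write 'image' to 'filename', accumulating the checksum on the way,
   then seek back and overwrite the checksum field in the header.
   Returns 1 on success and 0 on any failure. */
static int sketch_write_with_checksum(const char *filename,
                                      const unsigned char *image,
                                      size_t length)
{
    FILE *f;
    size_t i;
    unsigned int checksum = 0;

    if (length < SKETCH_HEADER_SIZE)
        return 0;
    f = fopen(filename, "wb");
    if (f == NULL)
        return 0;

    for (i = 0; i < length; i++) {
        if (fputc(image[i], f) == EOF) {
            fclose(f);
            return 0;
        }
        /* Only bytes after the header contribute to the checksum. */
        if (i >= SKETCH_HEADER_SIZE)
            checksum = (checksum + image[i]) & 0xFFFF;
    }

    /* The whole file is written: go back and patch the header. */
    if (fseek(f, (long) SKETCH_CHECKSUM_OFFSET, SEEK_SET) != 0) {
        fclose(f);
        return 0;
    }
    if (fputc((int) ((checksum >> 8) & 0xFF), f) == EOF ||
        fputc((int) (checksum & 0xFF), f) == EOF) {
        fclose(f);
        return 0;
    }
    return fclose(f) == 0;
}
```

Summing as the bytes go out means the image never has to be scanned twice,
at the cost of the one backward seek which the text warns may misbehave on
some C libraries.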