The Assembler

SICP > Computing with Register Machines > A Register-Machine Simulator > The Assembler
Previous: The Machine Model Next: Generating Execution Procedures for Instructions

    The assembler transforms the sequence of controller expressions for a machine into a corresponding list of machine instructions, each with its execution procedure. Overall, the assembler is much like the evaluators we studied in chapter 4--there is an input language (in this case, the register-machine language) and we must perform an appropriate action for each type of expression in the language.

    The technique of producing an execution procedure for each instruction is just what we used in section [*] to speed up the evaluator by separating analysis from runtime execution. As we saw in chapter 4, much useful analysis of Scheme expressions could be performed without knowing the actual values of variables. Here, analogously, much useful analysis of register-machine-language expressions can be performed without knowing the actual contents of machine registers. For example, we can replace references to registers by pointers to the register objects, and we can replace references to labels by pointers to the place in the instruction sequence that the label designates.

    Before it can generate the instruction execution procedures, the assembler must know what all the labels refer to, so it begins by scanning the controller text to separate the labels from the instructions. As it scans the text, it constructs both a list of instructions and a table that associates each label with a pointer into that list. Then the assembler augments the instruction list by inserting the execution procedure for each instruction.

    The assemble procedure is the main entry to the assembler. It takes the controller text and the machine model as arguments and returns the instruction sequence to be stored in the model. Assemble calls extract-labels to build the initial instruction list and label table from the supplied controller text. The second argument to extract-labels is a procedure to be called to process these results: This procedure uses update-insts! to generate the instruction execution procedures and insert them into the instruction list, and returns the modified list.

    (define (assemble controller-text machine)
      (extract-labels controller-text
        (lambda (insts labels)
          (update-insts! insts labels machine)
          insts)))
    

    Extract-labels takes as arguments a list text (the sequence of controller instruction expressions) and a receive procedure. Receive will be called with two values: (1) a list insts of instruction data structures, each containing an instruction from text; and (2) a table called labels, which associates each label from text with the position in the list insts that the label designates.

    (define (extract-labels text receive)
      (if (null? text)
          (receive '() '())
          (extract-labels (cdr text)
           (lambda (insts labels)
             (let ((next-inst (car text)))
               (if (symbol? next-inst)
                   (receive insts
                            (cons (make-label-entry next-inst
                                                    insts)
                                  labels))
                   (receive (cons (make-instruction next-inst)
                                  insts)
                            labels)))))))
    
    Extract-labels works by sequentially scanning the elements of the text and accumulating the insts and the labels. If an element is a symbol (and thus a label) an appropriate entry is added to the labels table. Otherwise the element is accumulated onto the insts list.[*]

    Update-insts! modifies the instruction list, which initially contains only the text of the instructions, to include the corresponding execution procedures:

    (define (update-insts! insts labels machine)
      (let ((pc (get-register machine 'pc))
            (flag (get-register machine 'flag))
            (stack (machine 'stack))
            (ops (machine 'operations)))
        (for-each
         (lambda (inst)
           (set-instruction-execution-proc! 
            inst
            (make-execution-procedure
             (instruction-text inst) labels machine
             pc flag stack ops)))
         insts)))
    

    The machine instruction data structure simply pairs the instruction text with the corresponding execution procedure. The execution procedure is not yet available when extract-labels constructs the instruction, and is inserted later by update-insts!.

    (define (make-instruction text)
      (cons text '()))
    
    (define (instruction-text inst)
      (car inst))
    
    (define (instruction-execution-proc inst)
      (cdr inst))
    
    (define (set-instruction-execution-proc! inst proc)
      (set-cdr! inst proc))
    
    The instruction text is not used by our simulator, but it is handy to keep around for debugging (see exercise [*]).

    Elements of the label table are pairs:

    (define (make-label-entry label-name insts)
      (cons label-name insts))
    
    Entries will be looked up in the table with

    (define (lookup-label labels label-name)
      (let ((val (assoc label-name labels)))
        (if val
            (cdr val)
            (error "Undefined label - ASSEMBLE" label-name))))
    

    Exercise. The following register-machine code is ambiguous, because the label here is defined more than once:

    start
      (goto (label here))
    here
      (assign a (const 3))
      (goto (label there))
    here
      (assign a (const 4))
      (goto (label there))
    there
    
    With the simulator as written, what will the contents of register a be when control reaches there? Modify the extract-labels procedure so that the assembler will signal an error if the same label name is used to indicate two different locations.

    Previous: The Machine Model Next: Generating Execution Procedures for Instructions

      webmaster@arsdigita.org