Compiling Combinations

SICP > Computing with Register Machines > Compilation > Compiling Combinations
Previous: Compiling Expressions Next: Combining Instruction Sequences

    The essence of the compilation process is the compilation of procedure applications. The code for a combination compiled with a given target and linkage has the form

    compilation of operator, target proc, linkage next
    evaluate operands and construct argument list in argl
    compilation of procedure call with given target and linkage
    
    The registers env, proc, and argl may have to be saved and restored during evaluation of the operator and operands. Note that this is the only place in the compiler where a target other than val is specified.

    The required code is generated by compile-application. This recursively compiles the operator, to produce code that puts the procedure to be applied into proc, and compiles the operands, to produce code that evaluates the individual operands of the application. The instruction sequences for the operands are combined (by construct-arglist) with code that constructs the list of arguments in argl, and the resulting argument-list code is combined with the procedure code and the code that performs the procedure call (produced by compile-procedure-call). In appending the code sequences, the env register must be preserved around the evaluation of the operator (since evaluating the operator might modify env, which will be needed to evaluate the operands), and the proc register must be preserved around the construction of the argument list (since evaluating the operands might modify proc, which will be needed for the actual procedure application). Continue must also be preserved throughout, since it is needed for the linkage in the procedure call.

    (define (compile-application exp target linkage)
      (let ((proc-code (compile (operator exp) 'proc 'next))
            (operand-codes
             (map (lambda (operand) (compile operand 'val 'next))
                  (operands exp))))
        (preserving '(env continue)
         proc-code
         (preserving '(proc continue)
          (construct-arglist operand-codes)
          (compile-procedure-call target linkage)))))
    

    The code to construct the argument list will evaluate each operand into val and then cons that value onto the argument list being accumulated in argl. Since we cons the arguments onto argl in sequence, we must start with the last argument and end with the first, so that the arguments will appear in order from first to last in the resulting list. Rather than waste an instruction by initializing argl to the empty list to set up for this sequence of evaluations, we make the first code sequence construct the initial argl. The general form of the argument-list construction is thus as follows:

    compilation of last operand, targeted to val
    (assign argl (op list) (reg val))
    compilation of next operand, targeted to val
    (assign argl (op cons) (reg val) (reg argl))
    ...
    compilation of first operand, targeted to val
    (assign argl (op cons) (reg val) (reg argl))
    
    Argl must be preserved around each operand evaluation except the first (so that arguments accumulated so far won't be lost), and env must be preserved around each operand evaluation except the last (for use by subsequent operand evaluations).

    Compiling this argument code is a bit tricky, because of the special treatment of the first operand to be evaluated and the need to preserve argl and env in different places. The construct-arglist procedure takes as arguments the code that evaluates the individual operands. If there are no operands at all, it simply emits the instruction

    (assign argl (const ()))
    
    Otherwise, construct-arglist creates code that initializes argl with the last argument, and appends code that evaluates the rest of the arguments and adjoins them to argl in succession. In order to process the arguments from last to first, we must reverse the list of operand code sequences from the order supplied by compile-application.

    (define (construct-arglist operand-codes)
      (let ((operand-codes (reverse operand-codes)))
        (if (null? operand-codes)
            (make-instruction-sequence '() '(argl)
             '((assign argl (const ()))))
            (let ((code-to-get-last-arg
                   (append-instruction-sequences
                    (car operand-codes)
                    (make-instruction-sequence '(val) '(argl)
                     '((assign argl (op list) (reg val)))))))
              (if (null? (cdr operand-codes))
                  code-to-get-last-arg
                  (preserving '(env)
                   code-to-get-last-arg
                   (code-to-get-rest-args
                    (cdr operand-codes))))))))
    
    (define (code-to-get-rest-args operand-codes)
      (let ((code-for-next-arg
             (preserving '(argl)
              (car operand-codes)
              (make-instruction-sequence '(val argl) '(argl)
               '((assign argl
                  (op cons) (reg val) (reg argl)))))))
        (if (null? (cdr operand-codes))
            code-for-next-arg
            (preserving '(env)
             code-for-next-arg
             (code-to-get-rest-args (cdr operand-codes))))))
    

    Applying procedures

    After evaluating the elements of a combination, the compiled code must apply the procedure in proc to the arguments in argl. The code performs essentially the same dispatch as the apply procedure in the metacircular evaluator of section [*] or the apply-dispatch entry point in the explicit-control evaluator of section [*]. It checks whether the procedure to be applied is a primitive procedure or a compiled procedure. For a primitive procedure, it uses apply-primitive-procedure; we will see shortly how it handles compiled procedures. The procedure-application code has the following form:

     (test (op primitive-procedure?) (reg proc))
     (branch (label primitive-branch))
    compiled-branch
     code to apply compiled procedure with given target and appropriate linkage
    primitive-branch
     (assign target
             (op apply-primitive-procedure)
             (reg proc)
             (reg argl))
     linkage
    after-call
    
    Observe that the compiled branch must skip around the primitive branch. Therefore, if the linkage for the original procedure call was next, the compound branch must use a linkage that jumps to a label that is inserted after the primitive branch. (This is similar to the linkage used for the true branch in compile-if.)

    (define (compile-procedure-call target linkage)
      (let ((primitive-branch (make-label 'primitive-branch))
            (compiled-branch (make-label 'compiled-branch))
            (after-call (make-label 'after-call)))
        (let ((compiled-linkage
               (if (eq? linkage 'next) after-call linkage)))
          (append-instruction-sequences
           (make-instruction-sequence '(proc) '()
            `((test (op primitive-procedure?) (reg proc))
              (branch (label ,primitive-branch))))
           (parallel-instruction-sequences
            (append-instruction-sequences
             compiled-branch
             (compile-proc-appl target compiled-linkage))
            (append-instruction-sequences
             primitive-branch
             (end-with-linkage linkage
              (make-instruction-sequence '(proc argl)
                                         (list target)
               `((assign ,target
                         (op apply-primitive-procedure)
                         (reg proc)
                         (reg argl)))))))
           after-call))))
    
    The primitive and compound branches, like the true and false branches in compile-if, are appended using parallel-instruction-sequences rather than the ordinary append-instruction-sequences, because they will not be executed sequentially.

    Applying compiled procedures

    The code that handles procedure application is the most subtle part of the compiler, even though the instruction sequences it generates are very short. A compiled procedure (as constructed by compile-lambda) has an entry point, which is a label that designates where the code for the procedure starts. The code at this entry point computes a result in val and returns by executing the instruction (goto (reg continue)). Thus, we might expect the code for a compiled-procedure application (to be generated by compile-proc-appl) with a given target and linkage to look like this if the linkage is a label

     (assign continue (label proc-return))
     (assign val (op compiled-procedure-entry) (reg proc))
     (goto (reg val))
    proc-return
     (assign target (reg val)) ; included if target is not val
     (goto (label linkage)) ; linkage code
    
    or like this if the linkage is return.

     (save continue)
     (assign continue (label proc-return))
     (assign val (op compiled-procedure-entry) (reg proc))
     (goto (reg val))
    proc-return
     (assign target (reg val)) ; included if target is not val
     (restore continue)
     (goto (reg continue)) ; linkage code
    
    This code sets up continue so that the procedure will return to a label proc-return and jumps to the procedure's entry point. The code at procreturn transfers the procedure's result from val to the target register (if necessary) and then jumps to the location specified by the linkage. (The linkage is always return or a label, because compileprocedurecall replaces a next linkage for the compound-procedure branch by an after-call label.)

    In fact, if the target is not val, that is exactly the code our compiler will generate. [*] Usually, however, the target is val (the only time the compiler specifies a different register is when targeting the evaluation of an operator to proc), so the procedure result is put directly into the target register and there is no need to return to a special location that copies it. Instead, we simplify the code by setting up continue so that the procedure will ``return'' directly to the place specified by the caller's linkage:

    set up continue for linkage
    (assign val (op compiled-procedure-entry) (reg proc))
    (goto (reg val))
    
    If the linkage is a label, we set up continue so that the procedure will return to that label. (That is, the (goto (reg continue)) the procedure ends with becomes equivalent to the (goto (label linkage)) at proc-return above.)

    (assign continue (label linkage))
    (assign val (op compiled-procedure-entry) (reg proc))
    (goto (reg val))
    
    If the linkage is return, we don't need to set up continue at all: It already holds the desired location. (That is, the (goto (reg continue)) the procedure ends with goes directly to the place where the (goto (reg continue)) at proc-return would have gone.)

    (assign val (op compiled-procedure-entry) (reg proc))
    (goto (reg val))
    
    With this implementation of the return linkage, the compiler generates tail-recursive code. Calling a procedure as the final step in a procedure body does a direct transfer, without saving any information on the stack.

    Suppose instead that we had handled the case of a procedure call with a linkage of return and a target of val as shown above for a non-val target. This would destroy tail recursion. Our system would still give the same value for any expression. But each time we called a procedure, we would save continue and return after the call to undo the (useless) save. These extra saves would accumulate during a nest of procedure calls.[*]

    Compile-proc-appl generates the above procedure-application code by considering four cases, depending on whether the target for the call is val and whether the linkage is return. Observe that the instruction sequences are declared to modify all the registers, since executing the procedure body can change the registers in arbitrary ways.[*] Also note that the code sequence for the case with target val and linkage return is declared to need continue: Even though continue is not explicitly used in the two-instruction sequence, we must be sure that continue will have the correct value when we enter the compiled procedure.

    (define (compile-proc-appl target linkage)
      (cond ((and (eq? target 'val) (not (eq? linkage 'return)))
             (make-instruction-sequence '(proc) all-regs
               `((assign continue (label ,linkage))
                 (assign val (op compiled-procedure-entry)
                             (reg proc))
                 (goto (reg val)))))
            ((and (not (eq? target 'val))
                  (not (eq? linkage 'return)))
             (let ((proc-return (make-label 'proc-return)))
               (make-instruction-sequence '(proc) all-regs
                `((assign continue (label ,proc-return))
                  (assign val (op compiled-procedure-entry)
                              (reg proc))
                  (goto (reg val))
                  ,proc-return
                  (assign ,target (reg val))
                  (goto (label ,linkage))))))
            ((and (eq? target 'val) (eq? linkage 'return))
             (make-instruction-sequence '(proc continue) all-regs
              '((assign val (op compiled-procedure-entry)
                            (reg proc))
                (goto (reg val)))))
            ((and (not (eq? target 'val)) (eq? linkage 'return))
             (error "return linkage, target not val - COMPILE"
                    target))))
    

    Previous: Compiling Expressions Next: Combining Instruction Sequences

      webmaster@arsdigita.org