An Example of Compiled Code

SICP > Computing with Register Machines > Compilation > An Example of Compiled Code
Previous: Combining Instruction Sequences Next: Lexical Addressing

    Now that we have seen all the elements of the compiler, let us examine an example of compiled code to see how things fit together. We will compile the definition of a recursive factorial procedure by calling compile:

    (compile
     '(define (factorial n)
        (if (= n 1)
            1
            (* (factorial (- n 1)) n)))
     'val
     'next)
    
    We have specified that the value of the define expression should be placed in the val register. We don't care what the compiled code does after executing the define, so our choice of next as the linkage descriptor is arbitrary.

    Compile determines that the expression is a definition, so it calls compile-definition to compile code to compute the value to be assigned (targeted to val), followed by code to install the definition, followed by code to put the value of the define (which is the symbol ok) into the target register, followed finally by the linkage code. Env is preserved around the computation of the value, because it is needed in order to install the definition. Because the linkage is next, there is no linkage code in this case. The skeleton of the compiled code is thus

      save env if modified by code to compute value
      compilation of definition value, target val, linkage next
      restore env if saved above
      (perform (op define-variable!)
               (const factorial)
               (reg val)
               (reg env))
      (assign val (const ok))
    

    The expression that is to be compiled to produce the value for the variable factorial is a lambda expression whose value is the procedure that computes factorials. Compile handles this by calling compile-lambda, which compiles the procedure body, labels it as a new entry point, and generates the instruction that will combine the procedure body at the new entry point with the run-time environment and assign the result to val. The sequence then skips around the compiled procedure code, which is inserted at this point. The procedure code itself begins by extending the procedure's definition environment by a frame that binds the formal parameter n to the procedure argument. Then comes the actual procedure body. Since this code for the value of the variable doesn't modify the env register, the optional save and restore shown above aren't generated. (The procedure code at entry2 isn't executed at this point, so its use of env is irrelevant.) Therefore, the skeleton for the compiled code becomes

      (assign val (op make-compiled-procedure)
                  (label entry2)
                  (reg env))
      (goto (label after-lambda1))
    entry2
      (assign env (op compiled-procedure-env) (reg proc))
      (assign env (op extend-environment)
                  (const (n))
                  (reg argl)
                  (reg env))
      compilation of procedure body
    after-lambda1
      (perform (op define-variable!)
               (const factorial)
               (reg val)
               (reg env))
      (assign val (const ok))
    

    A procedure body is always compiled (by compile-lambda-body) as a sequence with target val and linkage return. The sequence in this case consists of a single if expression:

    (if (= n 1)
        1
        (* (factorial (- n 1)) n))
    
    Compile-if generates code that first computes the predicate (targeted to val), then checks the result and branches around the true branch if the predicate is false. Env and continue are preserved around the predicate code, since they may be needed for the rest of the if expression. Since the if expression is the final expression (and only expression) in the sequence making up the procedure body, its target is val and its linkage is return, so the true and false branches are both compiled with target val and linkage return. (That is, the value of the conditional, which is the value computed by either of its branches, is the value of the procedure.)

      save continue, env if modified by predicate and needed by branches
      compilation of predicate, target val, linkage next
      restore continue, env if saved above
      (test (op false?) (reg val))
      (branch (label false-branch4))
    true-branch5
      compilation of true branch, target val, linkage return
    false-branch4
      compilation of false branch, target val, linkage return
    after-if3
    

    The predicate (= n 1) is a procedure call. This looks up the operator (the symbol =) and places this value in proc. It then assembles the arguments 1 and the value of n into argl. Then it tests whether proc contains a primitive or a compound procedure, and dispatches to a primitive branch or a compound branch accordingly. Both branches resume at the after-call label. The requirements to preserve registers around the evaluation of the operator and operands don't result in any saving of registers, because in this case those evaluations don't modify the registers in question.

      (assign proc
              (op lookup-variable-value) (const =) (reg env))
      (assign val (const 1))
      (assign argl (op list) (reg val))
      (assign val (op lookup-variable-value) (const n) (reg env))
      (assign argl (op cons) (reg val) (reg argl))
      (test (op primitive-procedure?) (reg proc))
      (branch (label primitive-branch17))
    compiled-branch16
      (assign continue (label after-call15))
      (assign val (op compiled-procedure-entry) (reg proc))
      (goto (reg val))
    primitive-branch17
      (assign val (op apply-primitive-procedure)
                  (reg proc)
                  (reg argl))
    after-call15
    

    The true branch, which is the constant 1, compiles (with target val and linkage return) to

      (assign val (const 1))
      (goto (reg continue))
    
    The code for the false branch is another a procedure call, where the procedure is the value of the symbol *, and the arguments are n and the result of another procedure call (a call to factorial). Each of these calls sets up proc and argl and its own primitive and compound branches. Figure [*] shows the complete compilation of the definition of the factorial procedure. Notice that the possible save and restore of continue and env around the predicate, shown above, are in fact generated, because these registers are modified by the procedure call in the predicate and needed for the procedure call and the return linkage in the branches.

    Exercise. Consider the following definition of a factorial procedure, which is slightly different from the one given above:

    (define (factorial-alt n)
      (if (= n 1)
          1
          (* n (factorial-alt (- n 1)))))
    
    Compile this procedure and compare the resulting code with that produced for factorial. Explain any differences you find. Does either program execute more efficiently than the other?

    Exercise. Compile the iterative factorial procedure

    (define (factorial n)
      (define (iter product counter)
        (if (> counter n)
            product
            (iter (* counter product)
                  (+ counter 1))))
      (iter 1 1))
    
    Annotate the resulting code, showing the essential difference between the code for iterative and recursive versions of factorial that makes one process build up stack space and the other run in constant stack space.


      \begin{figure}<pre>
{\em ;; construct the procedure and skip over code for the p...
...rial}
procedure (continued on next page).}
\addtocounter{figure}{-1}\end{figure}


    \begin{figure}<pre>
{\em ;; compute {\tt (- n 1)}, which is the argument for {\t...
...assign val (const ok))
</pre>
\vskip -10pt
\figcaption{(continued)}
\end{figure}

    Exercise. What expression was compiled to produce the code shown in figure [*]?

      \begin{figure}% latex2html id marker 27585
<pre>
(assign val (op make-compiled-...
...ge).
See exercise~\ref{ex:compiled-code}.}
\addtocounter{figure}{-1}\end{figure}


    \begin{figure}<pre>
after-call20
(assign argl (op list) (reg val))
(restore en...
...assign val (const ok))
</pre>
\vskip -10pt
\figcaption{(continued)}
\end{figure}

    Exercise. What order of evaluation does our compiler produce for operands of a combination? Is it left-to-right, right-to-left, or some other order? Where in the compiler is this order determined? Modify the compiler so that it produces some other order of evaluation. (See the discussion of order of evaluation for the explicit-control evaluator in section [*].) How does changing the order of operand evaluation affect the efficiency of the code that constructs the argument list?

    Exercise. One way to understand the compiler's preserving mechanism for optimizing stack usage is to see what extra operations would be generated if we did not use this idea. Modify preserving so that it always generates the save and restore operations. Compile some simple expressions and identify the unnecessary stack operations that are generated. Compare the code to that generated with the preserving mechanism intact.

    Exercise. Our compiler is clever about avoiding unnecessary stack operations, but it is not clever at all when it comes to compiling calls to the primitive procedures of the language in terms of the primitive operations supplied by the machine. For example, consider how much code is compiled to compute (+ a 1): The code sets up an argument list in argl, puts the primitive addition procedure (which it finds by looking up the symbol + in the environment) into proc, and tests whether the procedure is primitive or compound. The compiler always generates code to perform the test, as well as code for primitive and compound branches (only one of which will be executed). We have not shown the part of the controller that implements primitives, but we presume that these instructions make use of primitive arithmetic operations in the machine's data paths. Consider how much less code would be generated if the compiler could open-code primitives--that is, if it could generate code to directly use these primitive machine operations. The expression (+ a 1) might be compiled into something as simple as [*]

    (assign val (op lookup-variable-value) (const a) (reg env))
    (assign val (op +) (reg val) (const 1))
    
    In this exercise we will extend our compiler to support open coding of selected primitives. Special-purpose code will be generated for calls to these primitive procedures instead of the general procedure-application code. In order to support this, we will augment our machine with special argument registers arg1 and arg2. The primitive arithmetic operations of the machine will take their inputs from arg1 and arg2. The results may be put into val, arg1, or arg2.

    The compiler must be able to recognize the application of an open-coded primitive in the source program. We will augment the dispatch in the compile procedure to recognize the names of these primitives in addition to the reserved words (the special forms) it currently recognizes.[*] For each special form our compiler has a code generator. In this exercise we will construct a family of code generators for the open-coded primitives.

    aThe open-coded primitives, unlike the special forms, all need their operands evaluated. Write a code generator spread-arguments for use by all the open-coding code generators. Spread-arguments should take an operand list and compile the given operands targeted to successive argument registers. Note that an operand may contain a call to an open-coded primitive, so argument registers will have to be preserved during operand evaluation.

    bFor each of the primitive procedures =, *, -, and +, write a code generator that takes a combination with that operator, together with a target and a linkage descriptor, and produces code to spread the arguments into the registers and then perform the operation targeted to the given target with the given linkage. You need only handle expressions with two operands. Make compile dispatch to these code generators.

    cTry your new compiler on the factorial example. Compare the resulting code with the result produced without open coding.

    dExtend your code generators for + and * so that they can handle expressions with arbitrary numbers of operands. An expression with more than two operands will have to be compiled into a sequence of operations, each with only two inputs.  

    Previous: Combining Instruction Sequences Next: Lexical Addressing

      webmaster@arsdigita.org