The Costs of Introducing Assignment

SICP > Modularity, Objects, and State > Assignment and Local State > The Costs of Introducing Assignment
Previous: The Benefits of Introducing Assignment Next: The Environment Model of Evaluation

    As we have seen, the set! operation enables us to model objects that have local state. However, this advantage comes at a price. Our programming language can no longer be interpreted in terms of the substitution model of procedure application that we introduced in section [*]. Moreover, no simple model with ``nice'' mathematical properties can be an adequate framework for dealing with objects and assignment in programming languages.

    So long as we do not use assignments, two evaluations of the same procedure with the same arguments will produce the same result, so that procedures can be viewed as computing mathematical functions. Programming without any use of assignments, as we did throughout the first two chapters of this book, is accordingly known as functional programming.

    To understand how assignment complicates matters, consider a simplified version of the make-withdraw procedure of section [*] that does not bother to check for an insufficient amount:

    (define (make-simplified-withdraw balance)
      (lambda (amount)
        (set! balance (- balance amount))
        balance))
    
    (define W (make-simplified-withdraw 25))
    
    (W 20)
    5
    
    (W 10)
    -5
    
    Compare this procedure with the following make-decrementer procedure, which does not use set!:

    (define (make-decrementer balance)
      (lambda (amount)
        (- balance amount)))
    
    Make-decrementer returns a procedure that subtracts its input from a designated amount balance, but there is no accumulated effect over successive calls, as with make-simplified-withdraw:

    (define D (make-decrementer 25))
    
    (D 20)
    5
    
    (D 10)
    15
    
    We can use the substitution model to explain how make-decrementer works. For instance, let us analyze the evaluation of the expression

    ((make-decrementer 25) 20)
    
    We first simplify the operator of the combination by substituting 25 for balance in the body of make-decrementer. This reduces the expression to

    ((lambda (amount) (- 25 amount)) 20)
    
    Now we apply the operator by substituting 20 for amount in the body of the lambda expression:

    (- 25 20)
    
    The final answer is 5.

    Observe, however, what happens if we attempt a similar substitution analysis with make-simplified-withdraw:

    ((make-simplified-withdraw 25) 20)
    
    We first simplify the operator by substituting 25 for balance in the body of make-simplified-withdraw. This reduces the expression to[*]

    ((lambda (amount) (set! balance (- 25 amount)) 25) 20)
    
    Now we apply the operator by substituting 20 for amount in the body of the lambda expression:

    (set! balance (- 25 20)) 25
    
    If we adhered to the substitution model, we would have to say that the meaning of the procedure application is to first set balance to 5 and then return 25 as the value of the expression. This gets the wrong answer. In order to get the correct answer, we would have to somehow distinguish the first occurrence of balance (before the effect of the set!) from the second occurrence of balance (after the effect of the set!), and the substitution model cannot do this.

    The trouble here is that substitution is based ultimately on the notion that the symbols in our language are essentially names for values. But as soon as we introduce set! and the idea that the value of a variable can change, a variable can no longer be simply a name. Now a variable somehow refers to a place where a value can be stored, and the value stored at this place can change. In section [*] we will see how environments play this role of ``place'' in our computational model.

    Sameness and change

    The issue surfacing here is more profound than the mere breakdown of a particular model of computation. As soon as we introduce change into our computational models, many notions that were previously straightforward become problematical. Consider the concept of two things being ``the same.''

    Suppose we call make-decrementer twice with the same argument to create two procedures:

    (define D1 (make-decrementer 25))
    
    (define D2 (make-decrementer 25))
    
    Are D1 and D2 the same? An acceptable answer is yes, because D1 and D2 have the same computational behavior--each is a procedure that subtracts its input from 25. In fact, D1 could be substituted for D2 in any computation without changing the result.

    Contrast this with making two calls to make-simplified-withdraw:

    (define W1 (make-simplified-withdraw 25))
    
    (define W2 (make-simplified-withdraw 25))
    
    Are W1 and W2 the same? Surely not, because calls to W1 and W2 have distinct effects, as shown by the following sequence of interactions:

    (W1 20)
    5
    
    (W1 20)
    -15
    
    (W2 20)
    5
    
    Even though W1 and W2 are ``equal'' in the sense that they are both created by evaluating the same expression, (make-simplified-withdraw 25), it is not true that W1 could be substituted for W2 in any expression without changing the result of evaluating the expression.

    A language that supports the concept that ``equals can be substituted for equals'' in an expresssion without changing the value of the expression is said to be referentially transparent. Referential transparency is violated when we include set! in our computer language. This makes it tricky to determine when we can simplify expressions by substituting equivalent expressions. Consequently, reasoning about programs that use assignment becomes drastically more difficult.

    Once we forgo referential transparency, the notion of what it means for computational objects to be ``the same'' becomes difficult to capture in a formal way. Indeed, the meaning of ``same'' in the real world that our programs model is hardly clear in itself. In general, we can determine that two apparently identical objects are indeed ``the same one'' only by modifying one object and then observing whether the other object has changed in the same way. But how can we tell if an object has ``changed'' other than by observing the ``same'' object twice and seeing whether some property of the object differs from one observation to the next? Thus, we cannot determine ``change'' without some a priori notion of ``sameness,'' and we cannot determine sameness without observing the effects of change.

    As an example of how this issue arises in programming, consider the situation where Peter and Paul have a bank account with $100 in it. There is a substantial difference between modeling this as

    (define peter-acc (make-account 100))
    (define paul-acc (make-account 100))
    
    and modeling it as

    (define peter-acc (make-account 100))
    (define paul-acc peter-acc)
    
    In the first situation, the two bank accounts are distinct. Transactions made by Peter will not affect Paul's account, and vice versa. In the second situation, however, we have defined paul-acc to be the same thing as peter-acc. In effect, Peter and Paul now have a joint bank account, and if Peter makes a withdrawal from peter-acc Paul will observe less money in paul-acc. These two similar but distinct situations can cause confusion in building computational models. With the shared account, in particular, it can be especially confusing that there is one object (the bank account) that has two different names (peter-acc and paul-acc); if we are searching for all the places in our program where paul-acc can be changed, we must remember to look also at things that change peter-acc.[*]

    With reference to the above remarks on ``sameness'' and ``change,'' observe that if Peter and Paul could only examine their bank balances, and could not perform operations that changed the balance, then the issue of whether the two accounts are distinct would be moot. In general, so long as we never modify data objects, we can regard a compound data object to be precisely the totality of its pieces. For example, a rational number is determined by giving its numerator and its denominator. But this view is no longer valid in the presence of change, where a compound data object has an ``identity'' that is something different from the pieces of which it is composed. A bank account is still ``the same'' bank account even if we change the balance by making a withdrawal; conversely, we could have two different bank accounts with the same state information. This complication is a consequence, not of our programming language, but of our perception of a bank account as an object. We do not, for example, ordinarily regard a rational number as a changeable object with identity, such that we could change the numerator and still have ``the same'' rational number.

    Pitfalls of imperative programming

    In contrast to functional programming, programming that makes extensive use of assignment is known as imperative programming. In addition to raising complications about computational models, programs written in imperative style are susceptible to bugs that cannot occur in functional programs. For example, recall the iterative factorial program from section [*]:

    (define (factorial n)
      (define (iter product counter)
        (if (> counter n)
            product
            (iter (* counter product)
                  (+ counter 1))))
      (iter 1 1))
    
    Instead of passing arguments in the internal iterative loop, we could adopt a more imperative style by using explicit assignment to update the values of the variables product and counter:
    (define (factorial n)
      (let ((product 1)
            (counter 1))
        (define (iter)
          (if (> counter n)
              product
              (begin (set! product (* counter product))
                     (set! counter (+ counter 1))
                     (iter))))
        (iter)))
    
    This does not change the results produced by the program, but it does introduce a subtle trap. How do we decide the order of the assignments? As it happens, the program is correct as written. But writing the assignments in the opposite order
    (set! counter (+ counter 1))
    (set! product (* counter product))
    
    would have produced a different, incorrect result. In general, programming with assignment forces us to carefully consider the relative orders of the assignments to make sure that each statement is using the correct version of the variables that have been changed. This issue simply does not arise in functional programs. [*]

    The complexity of imperative programs becomes even worse if we consider applications in which several processes execute concurrently. We will return to this in section [*]. First, however, we will address the issue of providing a computational model for expressions that involve assignment, and explore the uses of objects with local state in designing simulations.

    Exercise. Consider the bank account objects created by make-account, with the password modification described in exercise [*]. Suppose that our banking system requires the ability to make joint accounts. Define a procedure make-joint that accomplishes this. Make-joint should take three arguments. The first is a password-protected account. The second argument must match the password with which the account was defined in order for the make-joint operation to proceed. The third argument is a new password. Make-joint is to create an additional access to the original account using the new password. For example, if peter-acc is a bank account with password open-sesame, then

    (define paul-acc
      (make-joint peter-acc 'open-sesame 'rosebud))
    
    will allow one to make transactions on peter-acc using the name paul-acc and the password rosebud. You may wish to modify your solution to exercise [*] to accommodate this new feature.

    Exercise. When we defined the evaluation model in section [*], we said that the first step in evaluating an expression is to evaluate its subexpressions. But we never specified the order in which the subexpressions should be evaluated (e.g., left to right or right to left). When we introduce assignment, the order in which the arguments to a procedure are evaluated can make a difference to the result. Define a simple procedure f such that evaluating (+ (f 0) (f 1)) will return 0 if the arguments to + are evaluated from left to right but will return 1 if the arguments are evaluated from right to left.  

    Previous: The Benefits of Introducing Assignment Next: The Environment Model of Evaluation

      webmaster@arsdigita.org