Mutable List Structure

SICP > Modularity, Objects, and State > Modeling with Mutable Data > Mutable List Structure
Previous: Modeling with Mutable Data Next: Representing Queues

    The basic operations on pairs--cons, car, and cdr--can be used to construct list structure and to select parts from list structure, but they are incapable of modifying list structure. The same is true of the list operations we have used so far, such as append and list, since these can be defined in terms of cons, car, and cdr. To modify list structures we need new operations.


      \begin{figure}\par\figcaption{Lists {\tt x}: {\tt ((a b) c d)} and {\tt y}: {\tt (e
f)}.}\end{figure}


      \begin{figure}% latex2html id marker 12542
\par\figcaption{Effect of {\tt (set-car! x y)} on the lists in
figure~\ref{fig:two-lists}.}\end{figure}


      \begin{figure}% latex2html id marker 12547
\par\figcaption{Effect of {\tt (define z (cons y (cdr x)))} on
the lists in figure~\ref{fig:two-lists}.}\end{figure}


      \begin{figure}% latex2html id marker 12552
\par\figcaption {Effect of {\tt (set-cdr! x y)} on the lists in
figure~\ref{fig:two-lists}.}\end{figure}

    The primitive mutators for pairs are set-car! and set-cdr!. Setcar! takes two arguments, the first of which must be a pair. It modifies this pair, replacing the car pointer by a pointer to the second argument of set-car!. [*]

    As an example, suppose that x is bound to the list ((a b) c d) and y to the list (e f) as illustrated in figure [*]. Evaluating the expression (set-car! x y) modifies the pair to which x is bound, replacing its car by the value of y. The result of the operation is shown in figure [*]. The structure x has been modified and would now be printed as ((e f) c d). The pairs representing the list (a b), identified by the pointer that was replaced, are now detached from the original structure.[*]

    Compare figure [*] with figure [*], which illustrates the result of executing (define z (cons y (cdr x))) with x and y bound to the original lists of figure [*]. The variable z is now bound to a new pair created by the cons operation; the list to which x is bound is unchanged.

    The set-cdr! operation is similar to set-car!. The only difference is that the cdr pointer of the pair, rather than the car pointer, is replaced. The effect of executing (set-cdr! x y) on the lists of figure [*] is shown in figure [*]. Here the cdr pointer of x has been replaced by the pointer to (e f). Also, the list (c d), which used to be the cdr of x, is now detached from the structure.

    Cons builds new list structure by creating new pairs, while set-car! and set-cdr! modify existing pairs. Indeed, we could implement cons in terms of the two mutators, together with a procedure get-new-pair, which returns a new pair that is not part of any existing list structure. We obtain the new pair, set its car and cdr pointers to the designated objects, and return the new pair as the result of the cons.[*]

    (define (cons x y)
      (let ((new (get-new-pair)))
        (set-car! new x)
        (set-cdr! new y)
        new))
    

    Exercise. The following procedure for appending lists was introduced in section [*]:

    (define (append x y)
      (if (null? x)
          y
          (cons (car x) (append (cdr x) y))))
    
    Append forms a new list by successively consing the elements of x onto y. The procedure append! is similar to append, but it is a mutator rather than a constructor. It appends the lists by splicing them together, modifying the final pair of x so that its cdr is now y. (It is an error to call append! with an empty x.)

    (define (append! x y)
      (set-cdr! (last-pair x) y)
      x)
    
    Here last-pair is a procedure that returns the last pair in its argument:

    (define (last-pair x)
      (if (null? (cdr x))
          x
          (last-pair (cdr x))))
    
    Consider the interaction

    (define x (list 'a 'b))
    
    (define y (list 'c 'd))
    
    (define z (append x y))
    
    z
    (a b c d)
    
    (cdr x)
    response
    
    (define w (append! x y))
    
    w
    (a b c d)
    
    (cdr x)
    response
    
    What are the missing responses? Draw box-and-pointer diagrams to explain your answer.  

    Exercise. Consider the following make-cycle procedure, which uses the last-pair procedure defined in exercise [*]:

    (define (make-cycle x)
      (set-cdr! (last-pair x) x)
      x)
    
    Draw a box-and-pointer diagram that shows the structure z created by

    (define z (make-cycle (list 'a 'b 'c)))
    
    What happens if we try to compute (last-pair z)?  

    Exercise. The following procedure is quite useful, although obscure:

    (define (mystery x)
      (define (loop x y)
        (if (null? x)
            y
            (let ((temp (cdr x)))
              (set-cdr! x y)
              (loop temp x))))
      (loop x '()))
    
    Loop uses the ``temporary'' variable temp to hold the old value of the cdr of x, since the set-cdr! on the next line destroys the cdr. Explain what mystery does in general. Suppose v is defined by (define v (list 'a 'b 'c 'd)). Draw the box-and-pointer diagram that represents the list to which v is bound. Suppose that we now evaluate (define w (mystery v)). Draw box-and-pointer diagrams that show the structures v and w after evaluating this expression. What would be printed as the values of v and w$\,$?

    Sharing and identity

    We mentioned in section [*] the theoretical issues of ``sameness'' and ``change'' raised by the introduction of assignment. These issues arise in practice when individual pairs are shared among different data objects. For example, consider the structure formed by

    (define x (list 'a 'b))
    (define z1 (cons x x))
    
    As shown in figure [*], z1 is a pair whose car and cdr both point to the same pair x. This sharing of x by the car and cdr of z1 is a consequence of the straightforward way in which cons is implemented. In general, using cons to construct lists will result in an interlinked structure of pairs in which many individual pairs are shared by many different structures.


      \begin{figure}\par\figcaption {The list {\tt z1} formed by {\tt (cons x x)}.}\end{figure}


      \begin{figure}\par\figcaption {The list {\tt z2} formed by
{\tt (cons (list 'a 'b) (list 'a 'b))}.}\end{figure}

    In contrast to figure [*], figure [*] shows the structure created by

    (define z2 (cons (list 'a 'b) (list 'a 'b)))
    
    In this structure, the pairs in the two (a b) lists are distinct, although the actual symbols are shared.[*]

    When thought of as a list, z1 and z2 both represent ``the same'' list, ((a b) a b). In general, sharing is completely undetectable if we operate on lists using only cons, car, and cdr. However, if we allow mutators on list structure, sharing becomes significant. As an example of the difference that sharing can make, consider the following procedure, which modifies the car of the structure to which it is applied:

    (define (set-to-wow! x)
      (set-car! (car x) 'wow)
      x)
    
    Even though z1 and z2 are ``the same'' structure, applying set-to-wow! to them yields different results. With z1, altering the car also changes the cdr, because in z1 the car and the cdr are the same pair. With z2, the car and cdr are distinct, so set-to-wow! modifies only the car:

    z1
    ((a b) a b)
    
    

    (set-to-wow! z1) ((wow b) wow b)

    z2 ((a b) a b)

    (set-to-wow! z2) ((wow b) a b)

    One way to detect sharing in list structures is to use the predicate eq?, which we introduced in section [*] as a way to test whether two symbols are equal. More generally, (eq? x y) tests whether x and y are the same object (that is, whether x and y are equal as pointers). Thus, with z1 and z2 as defined in figures [*] and [*], (eq? (car z1) (cdr z1)) is true and (eq? (car z2) (cdr z2)) is false.

    As will be seen in the following sections, we can exploit sharing to greatly extend the repertoire of data structures that can be represented by pairs. On the other hand, sharing can also be dangerous, since modifications made to structures will also affect other structures that happen to share the modified parts. The mutation operations set-car! and set-cdr! should be used with care; unless we have a good understanding of how our data objects are shared, mutation can have unanticipated results.[*]

    Exercise. Draw box-and-pointer diagrams to explain the effect of set-to-wow! on the structures z1 and z2 above.

    Exercise. Ben Bitdiddle decides to write a procedure to count the number of pairs in any list structure. ``It's easy,'' he reasons. ``The number of pairs in any structure is the number in the car plus the number in the cdr plus one more to count the current pair.'' So Ben writes the following procedure:

    (define (count-pairs x)
      (if (not (pair? x))
          0
          (+ (count-pairs (car x))
             (count-pairs (cdr x))
             1)))
    
    Show that this procedure is not correct. In particular, draw box-and-pointer diagrams representing list structures made up of exactly three pairs for which Ben's procedure would return 3; return 4; return 7; never return at all.  

    Exercise. Devise a correct version of the count-pairs procedure of exercise [*] that returns the number of distinct pairs in any structure. (Hint: Traverse the structure, maintaining an auxiliary data structure that is used to keep track of which pairs have already been counted.)

    Exercise. Write a procedure that examines a list and determines whether it contains a cycle, that is, whether a program that tried to find the end of the list by taking successive cdrs would go into an infinite loop. Exercise [*] constructed such lists.  

    Exercise. Redo exercise [*] using an algorithm that takes only a constant amount of space. (This requires a very clever idea.)

    Mutation is just assignment

    When we introduced compound data, we observed in section [*] that pairs can be represented purely in terms of procedures:

    (define (cons x y)
      (define (dispatch m)
        (cond ((eq? m 'car) x)
              ((eq? m 'cdr) y)
              (else (error "Undefined operation - CONS" m))))
      dispatch)
    
    (define (car z) (z 'car))
    
    (define (cdr z) (z 'cdr))
    
    The same observation is true for mutable data. We can implement mutable data objects as procedures using assignment and local state. For instance, we can extend the above pair implementation to handle set-car! and set-cdr! in a manner analogous to the way we implemented bank accounts using make-account in section [*]:

    (define (cons x y)
      (define (set-x! v) (set! x v))
      (define (set-y! v) (set! y v))
      (define (dispatch m)
        (cond ((eq? m 'car) x)
              ((eq? m 'cdr) y)
              ((eq? m 'set-car!) set-x!)
              ((eq? m 'set-cdr!) set-y!)
              (else (error "Undefined operation - CONS" m))))
      dispatch)
    
    (define (car z) (z 'car))
    
    (define (cdr z) (z 'cdr))
    
    (define (set-car! z new-value)
      ((z 'set-car!) new-value)
      z)
    
    (define (set-cdr! z new-value)
      ((z 'set-cdr!) new-value)
      z)
    

    Assignment is all that is needed, theoretically, to account for the behavior of mutable data. As soon as we admit set! to our language, we raise all the issues, not only of assignment, but of mutable data in general.[*]

    Exercise. Draw environment diagrams to illustrate the evaluation of the sequence of expressions

    (define x (cons 1 2))
    (define z (cons x x))
    (set-car! (cdr z) 17)
    
    (car x)
    17
    
    using the procedural implementation of pairs given above. (Compare exercise [*].)

    Previous: Modeling with Mutable Data Next: Representing Queues

      webmaster@arsdigita.org