previous  next  up  top

Generic Zipper: the context of a traversal

 


 

Zipper: a derivative of a data structure or of its mapping

Zipper is a functional cursor into a data structure. It lets us navigate to and change, without mutation, an item deeply buried in a tree or a nested record. The result is a new data structure, sharing much of its components with the old one. The latter remains available permitting the instant rollback of the changes. Zippers thus implement copy-on-write updates to data structures.

Zipper, the functional cursor into a data structure, is itself a data structure, derived from the original one. The derived data type D' for the recursive data type D is D with exactly one hole. Filling-in the hole -- integrating over all positions of the hole -- gives the data type D back. Zipper is the derivative, pretty much in the calculus sense, of a data type. The zipper-as-datatype view was the original presentation, described by Huet (JFP, 1997) and Hinze and Jeuring (JFP 2001); the data structure derivative was expounded by McBride.

We advocate a different view, emphasizing not the result of navigating through a data structure to a desired item and extracting it, but the process of navigation. Each data structure comes with a method to enumerate its components, representing the data structure as a stream of the nodes visited during the enumeration. To let the user focus on a item and submit an update, the enumeration process should yield control to the user once it reached an item. Co-routines provide exactly the right yielding mechanism. As Chung-chieh Shan aptly put it, ``zipper is a suspended walk.''

Whereas datatype-derivative zipper necessarily depends on the data type (different data types have different derivatives), the suspended-walk zipper is datatype-invariant. We only need a traversal procedure that applies a given function to each item of the data structure. The zipper is obtained by supplying the mapping function that yields. Since the traversal procedure can generically handle many data types (think of map), so can the zipper. Our zipper type depends only on the interface of the traversal function.

The two views of zipper are dual: they both regard zipper as the context of a data structure traversal. That context can be rendered in the functional form (the suspension, the delimited continuation of a traversal) or in the defunctionalized form, as a data structure.

References
zipper data structure < http://www.nist.gov/dads/HTML/zipper.html >

Ralf Hinze and Johan Jeuring: Weaving a Web
Journal of Functional Programming 11(6), pages 681 - 689, 2001. Technical report ICS Utrecht University, UU-CS-2001-33.

 

Zipper from any Traversable

The generic zipper can be derived for an abstract data type, whose implementation is hidden. So long as the data type implements the Data.Traversable interface -- at the very least, provides something like mapM -- it can be turned into zipper. Generically. We do not need to know any details of the data type, or if it is really a data structure, rather than an ephemeral collection whose elements are computed on the fly.

We demonstrate that zipper is the suspended traversal by deriving zipper for Data.Map. The latter is the Haskell standard, abstract data type for finite maps. The example is a bit contrived since Data.Map has a very rich interface, with many ways to access and update map's elements. On the other hand, Data.Map is the only non-trivial instance of Traversable defined in the standard library. We refer to Zipper1.lhs for a more interesting zipper, walking up and down a lambda-calculus term. Zipping into nested maps (which is a model of a file system) is described in ZipperFS below.

     data Zipper t a = ZDone (t a) 
                     | Z a (Maybe a -> Zipper t a)
     
     make_zipper :: T.Traversable t => t a -> Zipper t a
     make_zipper t = reset $ T.mapM f t >>= return . ZDone
      where
      f a = shift (\k -> return $ Z a (k . maybe a id))
The few lines above is all it takes to implement the zipper. The active `pointer' Z a k contains the current, focused on, element of the map and the function k to move forward. Evaluating k Nothing returns the cursor to the next element; evaluating k v replaces the current element with v and moves the cursor forward. The complete code below defines the function tmod to interactively traverse the map, displaying the elements one by one and offering to modify the current element, or quit the traversal. Here is a sample interaction:
     Current element: 1
     Enter Return, q or the replacement value: 
     
     Current element: 2
     Enter Return, q or the replacement value: 
     
     Current element: 6
     Enter Return, q or the replacement value: 
     42
     Current element: 24
     Enter Return, q or the replacement value: 
     q
     Done: 
     fromList [(1,1),(2,2),(3,42),(4,24),(5,120),(6,720),(7,5040),(8,40320),(9,362880),(10,3628800)]
Indeed, the third element of the map has been replaced with 42.
Version
The current version is February 2011.
References
ZipperTraversable.hs [3K]
Complete Haskell 2010 code
The code was originally posted as Zippers from any traversable on the Haskell-Cafe mailing list Tue, 31 Mar 2009 23:59:46 -0700 (PDT)

Zipper1.lhs [12K]
Introduction to generic zippers as delimited continuations
The code was originally posted as Zipper as a delimited continuation on the Haskell mailing list on Wed, 27 Apr 2005 16:17:04 -0700 (PDT)
The new version of the code demonstrates deriving zipper from a different traversal function, with a more intuitive interface, suggested by Ron de Bruijn. The zipper declaration and the navigation functions remain the same: the delimited-control--based zipper is generic indeed. The new version of the code relies on the delimcc library of delimited control in Haskell.

Chung-chieh Shan: From walking to zipping, Part 3: Caught in a zipper
< http://conway.rutgers.edu/~ccshan/wiki/blog/posts/WalkZip3/ >
Our Data.Map zipper moves only forward. Chung-chieh Shan has explained in detail how to turn any one-directional zipper into bi-directional. Again generically.

Generator from any Foldable

Zipper in Scheme
An old derivation, using call/cc to implement delimited control to model suspension.

 

Zipper-based file server/OS

We present a file server/OS with threading and exceptions realized via delimited continuations. A file system is modeled as a nested finite map, relying on zipper for navigation and updates. The outcome looks like the Unix file system, with the improved semantics: transactions; undo of any file and directory operation; snapshots; statically guaranteed the strongest, repeatable read, isolation mode for clients; pervasive copy-on-write for files and directories; built-in traversal facility; the expected behavior for cyclic directory references.

We can easily change our file server to support NFS or 9P or other distributed file system protocol. We can traverse richer terms than mere finite maps with string keys. In particular, we can use lambda-terms as our file system: one could cd into a lambda-term in bash.

The implementation of ZipperFS has no unsafe operations, no GHC let alone Unix threads, and no concurrency problems. Our threads cannot even do IO and cannot mutate any global state --- and the type system sees to it. We thus demonstrate how delimited continuations let us statically isolate effects even if the whole program eventually runs in the IO monad.

Version
The current version is February 2011.
References
zfs-talk.pdf [98K]
Expanded talk with the demo. It was originally presented as an extra demo at the Haskell Workshop 2005

ZFS.hs [18K]
The commented source code of the interrupt handler, scheduling loop, and the shell.
The code has tested with GHC 6.4-6.10 on FreeBSD and Linux.

ZipperM.hs [8K]
Zipper over the nested Map with path accumulation and sharing preservation

LowLevelIO.hs [4K]
Low-level IO operations, specifically, select
(also used by the Iteratee library)

 

Delimited continuations in operating systems

[The Abstract of the paper]
Delimited continuations are the meanings of delimited evaluation contexts in programming languages. We show they offer a uniform view of many scenarios that arise in systems programming, such as a request for a system service, an event handler for input/output, a snapshot of a process, a file system being read and updated, and a Web page. Explicitly recognizing these uses of delimited continuations helps us design a system of concurrent, isolated transactions where desirable features such as snapshots, undo, copy-on-write, reconciliation, and interposition fall out by default. It also lets us take advantage of efficient implementation techniques from programming-language research. The Zipper File System prototypes these ideas.

Joint work with Chung-chieh Shan.

Olivier Danvy has very kindly pointed out that continuations in operating systems delimited at process, job, etc. boundaries were first mentioned back in 1974 in the seminal work by Christopher Strachey and Christopher P. Wadsworth `Continuations: A mathematical semantics for handling full jumps.' Technical Monograph PRG-11, Programming Research Group, Oxford University Computing Laboratory, 1974. Reprinted in Higher-Order and Symbolic Computation, 2000, 13(1-2):135-152. Here is footnote 1 from their paper:

The reader may ask if it is any more justifiable to take a single program in isolation than it is to take a single command. The answer, of course, is that it is not, and that in the same way as command continuations are needed to explain jumps inside programs, further hierarchical levels of continuations, such as process continuations, job continuations and operating system continuations, will be needed to give the semantics of the operating system. The outer-most level (or possibly levels) are not inside the machine at all and are implemented by operator intervention. We do not discuss the use of continuations in the semantics of operating systems any further in this paper as to do so would require a fuller understanding of the differences between operating systems and programs that is yet at our disposal. It would also make the paper much too long...
Version
The current version is 1.2, June 1, 2007.
References
context-OS.pdf [134K]
Proc. CONTEXT2007: 6th International and Interdisciplinary Conference on Modeling and Using Context. Roskilde, Denmark, August 20-24, 2007. Lecture Notes in Artificial Intelligence 4635, pp. 291-302.

Poster presented at CONTEXT 2007, Roskilde U., Denmark, August 22, 2007.
< http://www.cs.rutgers.edu/~ccshan/zipper/poster.ps >

 

Zipping through several data structures to compare and annotate them

Zipping through two data structures to compare them side-by-side and annotate the differing components is a more interesting, and practical application of zippers, highlighting the incremental traversal and the functional update. The example has been motivated by a practical application to compare two XML documents disregarding the inline markup. To process large XML documents the processing has to be done inplace, minimizing copying and rebuilding of the DOM tree (SXML document). If two documents are the same, modulo the inline markup, the comparison procedure returns the original documents as they are, with no copying of the DOM tree or its branches. The example is a non-trivial generalization of the ``same fringe problem'' -- not only do we detect a single difference, we return the updated documents where each paragraph with the difference in character content is marked.
Version
The current version is 1.1, August 23, 2005.
References
Joint processing of two immutable SXML documents with update cursors

 

Zippers with several holes

Suspending the traversal of a term gives us zipper for the term. Suspending several concurrent traversals of the term gives us zipper with several holes. Concurrent traversals are based on delimited continuations, which let us view a single-threaded program as cooperatively multi-threading. We support various isolation modes of concurrent traversals: from ``uncommitted read'' to ``repeatable read'' to ``serializable'' -- and even sub-transactions. We can use either the push mode -- one cursor broadcasting its updates to the others -- or the pull mode. A cursor may decide to broadcast the accumulated updates at arbitrary checkpoints of its own choosing. A cursor may examine updates made by the other cursor and perhaps disregard some of them -- or apply in a different order, after its own updates. We are using terms zipper, cursor and transaction somewhat interchangeably: a zipper is a cursor which is by default isolated from the others and whose updates are instantly reversible.
Version
The current version is February 2011.
References
Zipper2.lhs [22K]
The detailed, commented code, originally posted as Two-hole zippers and transactions of various isolation modes on the Haskell mailing list on Tue, 10 May 2005 23:11:06 -0700 (PDT). The new version of the code uses the delimcc library.


Last updated May 4, 2011

This site's top page is http://okmij.org/ftp/

oleg-at-pobox.com or oleg-at-okmij.org
Your comments, problem reports, questions are very welcome!

Converted from HSXML by HSXML->HTML