previous   next   up   top

Staging, Program Generation, and Meta-Programming

 


 

A monadic approach for avoiding code duplication when staging memoized functions

[The Abstract of the paper]
Building program generators that do not duplicate generated code can be challenging. At the same time, code duplication can easily increase both generation time and runtime of generated programs by an exponential factor. We identify an instance of this problem that can arise when memoized functions are staged. Without addressing this problem, it would be impossible to effectively stage dynamic programming algorithms. Intuitively, direct staging undoes the effect of memoization. To solve this problem once and for all, and for any function that uses memoization, we propose a staged monadic combinator library. Experimental results confirm that the library works as expected. Preliminary results also indicate that the library is useful even when memoization is not used.

The paper introduces the continuation+state monad for code generation with let-insertion, and briefly describes abstract interpretation for generating efficient code without its intensional analysis. These are the foundational techniques for typeful multi-staged (MetaOCaml) programming.

Joint work with Walid Taha, Kedar N. Swadi, and Emir Pasalic.

Version
The current version is January 2006.
References
Proceedings of the 2006 ACM SIGPLAN Workshop on Partial Evaluation and Semantics-based Program Manipulation (PEPM), 2006, Charleston, South Carolina, USA, January 9-10, 2006, pp. 160-169.
< http://www.cs.rice.edu/~taha/publications/conference/pepm06.pdf >

Dynamic Programming Benchmark: The corresponding MetaOCaml source code
< http://www.metaocaml.org/examples/dp/ >

 

Generating optimal code with confidence

Can we generate optimal code in one pass, without further transformations such as common-subexpression-elimination, without looking into the code at all after it has been generated? The papers below answer affirmatively. The papers discuss in detail the generation of a straight-line C, Fortran, etc. code for the power-of-two FFT. The papers show that with the help of Abstract Interpretation we can exactly match the quality of code generated by the well-known system FFTW. The second paper demonstrates that our generated power-of-two FFT code has exactly the same number of floating-point operations as that in codelets of FFTW. The significant difference from FFTW is that we do not do any intensional code analysis -- the generated code is a black box and can not be processed nor manipulated any further. Moreover, the generated code cannot even be compared in equality. The reason for these restrictions is to maintain strong invariants about the generated code, e.g., that the generated code is automatically strongly typed and does not need to be typechecked. These invariants and the absence of ad hoc manipulation on the generated low-level code significantly increase our confidence in the result.

Unlike FFTW, we know precisely where savings are coming from, and which particular equivalences contribute to which improvements in the code.

The paper makes the case that ``there are benefits to focusing on writing better generators rather than on fixing the results of simple generators.''

References
A Methodology for Generating Verified Combinatorial Circuits
Joint work with Kedar N. Swadi and Walid Taha.
Proc. of EMSOFT'04, the Fourth ACM International Conference on Embedded Software, September 27-29, 2004, Pisa, Italy. ACM Press, pp. 249 - 258.
< http://www.cs.rice.edu/~taha/publications/conference/emsoft04.pdf >

The corresponding MetaOCaml source code
< http://www.metaocaml.org/examples/fft.ml >

Generating optimal FFT code and relating FFTW and Split-Radix
Matching FFTW in operation counts

 

Multi-stage programming with functors and monads: eliminating abstraction overhead from generic code

[The Abstract of the paper]
We use multi-stage programming, monads and OCaml's advanced module system to demonstrate how to eliminate all abstraction overhead from generic programs while avoiding any inspection of the resulting code. We demonstrate this clearly with Gaussian Elimination as a representative family of symbolic and numeric algorithms. We parameterize our code to a great extent -- over domain, input and permutation matrix representations, determinant and rank tracking, pivoting policies, result types, etc. -- at no run-time cost. Because the resulting code is generated just right and not changed afterward, MetaOCaml guarantees that the generated code is well-typed. We further demonstrate that various abstraction parameters (aspects) can be made orthogonal and compositional, even in the presence of name-generation for temporaries, and ``interleaving'' of aspects. We also show how to encode some domain-specific knowledge so that ``clearly wrong'' compositions can be rejected at or before generation time, rather than during the compilation or running of the generated code.

Joint work with Jacques Carette.

Version
The current version is October 2008.
References
SCP-metamonads.pdf [311K]
Science of Computer Programming, 2008, in press and online
< http://dx.doi.org/10.1016/j.scico.2008.09.008 >

< http://www.cas.mcmaster.ca/~carette/metamonads/index.html >
Besides the paper, the page points to the generator code, and the generated Gaussian Elimination codes to handle real and integer matrices using various pivoting strategies.

 

In search of a program generator to implement generic transformations for high-performance computing

[The Abstract of the paper]
The quality of compiler-optimized code for high-performance applications is far behind what optimization and domain experts can achieve by hand. Although it may seem surprising at first glance, the performance gap has been widening over time, due to the tremendous complexity increase in microprocessor and memory architectures, and to the rising level of abstraction of popular programming languages and styles. This paper explores in-between solutions, neither fully automatic nor fully manual ways to adapt a computationally intensive application to the target architecture. By mimicking complex sequences of transformations useful to optimize real codes, we show that generative programming is a practical means to implement architecture-aware optimizations for high-performance applications.

This work explores the promises of generative programming languages and techniques for the high-performance computing expert. We show that complex, architecture-specific optimizations can be implemented in a type-safe, purely generative framework. Peak performance is achievable through the careful combination of a high-level, multi-stage evaluation language -- MetaOCaml -- with low-level code generation techniques. Nevertheless, our results also show that generative approaches for high-performance computing do not come without technical caveats and implementation barriers concerning productivity and reuse. We describe these difficulties and identify ways to hide or overcome them, from abstract syntaxes to heterogeneous generators of code generators, combining high-level and type-safe multi-stage programming with a back-end generator of imperative code.

Joint work with Albert Cohen, Se'bastien Donadio, Mari'a Jesu's Garzara'n, Christoph Armin Herrmann, and David A. Padua.

Version
The current version is September, 2006.
References
< http://dx.doi.org/10.1016/j.scico.2005.10.013 >
Science of Computer Programming, Volume 62, Issue 1, September 2006, pp. 25-46.

< http://www-rocq.inria.fr/~acohen/publications/CDGHKP06.ps.gz >
From `Albert Cohen - Publications'

 

Staged let-generalization may be unsound

We demonstrate that the value restriction in a staged calculus with cross-stage persistence and reference cells is insufficient to prevent unsound generalization.

The distinguished feature of Hindley-Milner type system is generalization of let-bindings. The type inferred for a binding introduced by the let-form is generalized by quantifying generalizable free type variables. For example, the expression

     let f () = [] in (1::f(), "123"::f())

is well-typed: since the type inferred for f, unit -> 'a list, contains the generalizable type variable 'a the type is generalized to the polymorphic forall 'a. unit -> 'a list (in Hindley-Milner systems, quantifiers are often omitted). The (implicitly) quantified type variable 'a can then be instantiated to int or string, permitting different instances of f to be used in differently-typed contexts.

It has long been known that let-generalization is unsound in the presence of reference cells. For example, if the type inferred for r in the expression

     let r = ref [] in
     r := [1]; "123"::!r
       This expression !r has type int list but is here used with type string list
were generalized from 'a list ref to forall 'a. 'a list ref, the expression would have returned a list that contains a string and an int, letting us apply string operations to an int, or vice versa. A well-type program would have ``gone wrong.''

To ensure the type system soundness, there have been proposed many various restrictions on let-generalization. The most widely implemented is value restriction: the type inferred for a let-binding is generalized only if the right-hand-side of the binding syntactically has the form of a value. In other words, only values may have polymorphic types. In our examples, the let-binding for f (which de-sugars to let f = fun () -> [] in ...) binds f to what syntactically is a function, that is, a value. In contrast, the right-hand-side of the binding to r, ref [], is syntactically a non-value expression. Therefore, the type of r is not generalized, prohibiting the use of r in differently-typed contexts, int list ref vs string list ref. The type checker rejects our second example.

The value restriction has a clear intuition. We can type check our first example without polymorphism, if we first inline the definition of f into its two use places. A polymorphic binding then may be regarded as an `optimization', letting us type check f once, where it is defined, rather than at every place where f is used. A polymorphic type is an indication that the expression is inlineable; some compilers, e.g., MLton, indeed inline all polymorphic expressions so that they can be compiled without resorting to boxing. Effectful expressions (such as ref []) cannot be inlined while preserving dynamic behavior as copying them replicates effects and hence is observable. Values are inert and hence can be copied or shared at compiler's discretion, without affecting dynamic behavior.

Alas, in a staged calculus with cross-stage persistence, value restriction or its variants (such as a relaxed value restriction) turn out insufficient to ensure type soundness. We demonstrate the unsoundness on a series of examples culminating in a segmentation fault. We develop our examples interactively, by submitting expressions to the top-level of a MetaOCaml interpreter of any recent version (309 or BER 002) and observing its responses. In the transcript below, the responses are indented.

The first example uses no staging and causes no controversy:

     let f () = ref [] 
     in f() := [1]; "123"::!(f())
          - : string list = ["123"]
The let-binding de-sugars into let f = fun () -> ref [] in ... whose right-hand side is syntactically a function. Value restriction should allow generalization; both OCaml and MetaOCaml agree and accept the expression. The indented line shows the result.

What if we enclose the whole expression into MetaOCaml brackets? If an expression is well-typed, its code should be well-typed too. If we could manually enter an expression without type errors, we should be able to automatically generate that expression without type errors.

     let c =
     .<let f () = ref [] 
       in f() := [1]; "123"::!(f())>.
          val c : ('a, string list) code =
            .<let f_2 = fun () -> (ref ([])) in
              ((f_2 ()) := [1]); ("123" :: (! (f_2 ())))>.
     .! c;;
          - : string list = ["123"]
MetaOCaml accepts the quoted expression from the first example. Running the code produces the result we have already seen.

Let us add an escape, or splice: < let f () = .~(.<ref []>.) in ... >., which reduces in one step to the second example.

     let c =
     .<let f () = .~(.<ref []>.) 
       in f() := [1]; "123"::!(f())>.
     
          val c : ('a, string list) code =
            .<let f_2 = fun () -> (ref ([])) in
              ((f_2 ()) := [1]); ("123" :: (! (f_2 ())))>.
     .! c;;
          - : string list = ["123"]
Although the right-hand side of the c-binding is no longer a (present-stage) value because of the splice, evaluating the right-hand-side produces the code value of the second example, which is well-typed. More formally, despite the splice, the second-stage binding to f still looks like a binding for a function. The (second) stage value restriction should allow second-stage generalization. And it does, in MetaOCaml. The result of running the code value c is identical to that of running the previous examples.

We come to the final example, which looks as if it reduces to the earlier ones.

     let c =
     .<let f () = .~(let x = ref [] in .<x>.) 
       in f() := [1]; "123"::!(f())>.
     
          val c : ('a, string list) code =
            .<let f_2 = fun () -> (* cross-stage persistent value (as id: x) *) in
              ((f_2 ()) := [1]); ("123" :: (! (f_2 ())))>.
The example is accepted by MetaOCaml; the code value c clearly contains a cross-stage persistent value. We have created a reference cell at the present stage and ``lifted'' the cell to the future-stage, letting the generated code use the value as it is. We stress that the cross-stage persistence is vital in practice: without it, we have to write a staged version of all standard library functions. The let-binding for f is still syntactically a binding for a function, so generalization seems justified. If we try to run the generated code, we trip and fall:
     .! c;;
          segmentation fault
Although the right-hand side of the future-stage f binding was syntactically a functional value, fun () -> csprefval, all applications of that function return one and the same csprefval. Therefore, sharing or deep-copying of the function become observable -- the behavior that is inconsistent with the inferred polymorphic type for the function. The value restriction has a rarely mentioned premise: the only way to produce reference cells is to evaluate an expression like ref something. There are no syntactic values of the type of reference cells. Staging with cross-stage persistence can violate the premise: reference cells are still created as the result of evaluating an expression; that result, if lifted to the future stage, looks syntactically like a value.

The problem of restricting let-generalization to ensure soundness, considered closed long time ago, is thrown open in staged calculi.

Joint work with Chung-chieh Shan.

Version
The current version is June, 2010.
References
The MetaOCaml files: Status report and research proposal

Jacques Garrigue: Relaxing the Value Restriction.
Proc. Int. Symposium on Functional and Logic Programming, Nara, April 2004. Springer-Verlag LNCS 2998, pp. 196--213. (extended version: RIMS Preprint 1444)
< http://www.math.nagoya-u.ac.jp/~garrigue/papers/morepoly-long.pdf >
The paper gives a good survey of approaches to restrict let-generalization to ensure its soundness.

 

Computational Effects across Generated Binders

Code generation is the leading approach to making high-performance software reusable. Using a set of realistic benchmarks -- faulty power, guard insertion, loop tiling -- we demonstrate that side effects are indispensable in composable code generators, especially side effects that move open code past generated binders. We challenge the audience to implement these examples in their favorite code-generation framework.

Not just any solution is acceptable: we wish to avoid tree hacking, modification or even examination of the generated code, and its post-validation. See the slide notes for the detailed explanation of these requirements. Our goal is to generate code with compositional combinators that statically assure the results (even intermediate, open results) are well-formed and well-typed.

Template Haskell is far away from the goal. MetaOCaml is closer: generators without side-effects satisfy our requirements. Alas, such generators cannot implement our benchmarks. Without effects, let-insertion or other movement of open code past generated binders is not possible.

Granted, let-insertion without crossing binders can be done effectlessly, as well-known from partial evaluation. The cost is writing generators in the continuation-passing, or monadic style (which obscures the algorithm and makes the generators harder to use by domain experts). However, even the repeated continuation-passing transformation cannot help us insert let beyond the closest binder. We need a new CPS hierarchy.

Joint work with Chung-chieh Shan and Yukiyoshi Kameyama.

Version
The current version is September 2011.
References
talk-problems.pdf [172K]
Computational Effects across Generated Binders. Part 1: Problems and solutions
Extensively annotated slides of the talk presented at the IFIP WG2.11 meeting (Bordeaux, France, September 5, 2011) and at INRIA Paris (September 9, 2011).

talk-problems.ml [11K]
MetaOCaml code for the code generation benchmarks that emphasize effects crossing future-stage binders. Although the code implements all benchmarks, the absence of scope extrusion cannot be assured. Small mistakes can indeed result in unbound variables in the generated code.

 

Tagless-Staged: a step toward MetaHaskell

The goal of MetaHaskell is convenient and expressive code generation in Haskell that maintains lexical scope and statically ensures the results (even intermediate, open results) are well-formed and well-typed. Template Haskell falls short, assuring -- dynamically rather than statically -- the well-typedness of only the final code. Furthermore, although the generated code, after a type-check, is well-typed, it may still be ill-scoped. Template Haskell however is a good back-end on which to build libraries with the desired static guarantees.

Tagless-staged is such a front-end -- a (tagless-final) library of code generation combinators built on top of Template Haskell. The library statically assures not only that all generated code is well-formed and well-typed but also that all generated variables are bound lexically as expected. Such assurances are crucial for code generators to be written by domain experts rather than compiler writers, because the most profitable optimizations are domain-specific ones. Unlike MetaOCaml, the static guarantees are maintained in the presence of arbitrary effects, including IO. Since Tagless-staged gets by without the Q monad of Template Haskell, it questions that monad's existence.

Joint work with Chung-chieh Shan and Yukiyoshi Kameyama.

Version
The current version is September 2011.
References
TSCore.hs [19K]
The core of the code generation library

TSCPST.hs [5K]
Code generation with control effects reaching beyond the closest binder. Far-reaching let-insertion. The implementation of the CPSA applicative hierarchy.

CS-TR-11-17.pdf [193K]
< http://www.cs.tsukuba.ac.jp/techreport/data/CS-TR-11-17.pdf >
Computational Effects across Generated Binders: Maintaining future-stage lexical scope.
Technical Report CS-TR-11-17, Department of Computer Science, Graduate School of Systems and Information Engineering, University of Tsukuba.
A difficult-to-understand explanation of the library

Examples and Sample code

 

What is lexical scope and how to enforce it

If left unchecked, side effects in code generators often interact with generated binders badly to produce unexpectedly unbound variables, or worse, unexpectedly bound ones. The literature and our experience is rife with examples of these surprises, where variables with different scopes are mixed up. To prevent such surprises while still allowing arbitrary side effects to move open code past generated binders, we first define a notion of lexical scope for generated code with explicit contexts. To each generated binder, we attach a unique label that is checked against each use occurrence of the bound variable. We then introduce a static type system to assure that these checks will succeed.

We embedded this type system in a Haskell library. We used this tagless-staged library to implement statically safe let-insertion across an arbitrary number of binders for the first time.

Joint work with Chung-chieh Shan and Yukiyoshi Kameyama.

Version
The current version is September 2011.
References
Chung-chieh Shan: Computational effects across generated binders. Part 2: enforcing lexical scope
Talk presented at the IFIP Working Group 2.11 (program generation), INRIA Paris, and Cornell University, 2011.
< http://www.cs.rutgers.edu/~ccshan/metafx/talk-scope.pdf >

Tagless-Staged: a step toward MetaHaskell
Haskell code generation library that statically enforces future-stage lexical scope

Unsafe.hs [17K]
Demonstrating run-time errors that our type system statically prevents. The code shows that well-scoped de Bruijn indices do not statically determine lexical scope.

 

Closing the Stage: From Staged Code to Typed Closures

[The Abstract of the paper]
Code generation lets us write well-abstracted programs without performance penalty. Writing a correct code generator is easier than building a full-scale compiler but still hard. Typed multistage languages such as MetaOCaml help in two ways: they provide simple annotations to express code generation, and they assure that the generated code is well-typed and well-scoped. Unfortunately, the assurance only holds without side effects such as state and control. Without effects, generators often have to be written in a continuation-passing or monadic style that has proved inconvenient. It is thus a pressing open problem to combine effects with staging in a sound type system.

This paper takes a first step towards solving the problem, by translating the staging away. Our source language models MetaOCaml restricted to one future stage. It is a call-by-value language, with a sound type system and a small-step operational semantics, that supports building open code, running closed code, cross-stage persistence, and non-termination effects. We translate each typing derivation from this source language to the unstaged System F with constants. Our translation represents future-stage code using closures, yet preserves the typing, alpha-equivalence (hygiene), and (we conjecture) termination and evaluation order of the staged program.

To decouple evaluation from scope (a defining characteristic of staging), our translation weakens the typing environment of open code using a term coercion reminiscent of Goedel's translation from intuitionistic to modal logic. By converting open code to closures with typed environments, our translation establishes a framework in which to study staging with effects and to prototype staged languages. It already makes scope extrusion a type error.

Joint work with Yukiyoshi Kameyama and Chung-chieh Shan.

Version
The current version is December 2007.
References
metafx.pdf [211K]
The paper published in the Proceedings of the 2008 ACM SIGPLAN Workshop on Partial Evaluation and Semantics-based Program Manipulation (PEPM), 2008, San Francisco, USA, January 7-8; pp. 147-157, 2008.

Chung-chieh Shan. Slides of the talk at PEPM, January 7, 2008.
< http://www.cs.rutgers.edu/~ccshan/metafx/talk.pdf >

paper-examples.ml [11K]
Examples of the staged code and its translation. This file contains the complete code for all the examples in the paper plus extra examples.

power-count.ml [4K]
Computing a staged power function while tracking the number of multiplications. The example in Sec 6 of the paper.

 

Shifting the Stage: Staging with Delimited Control

[The Abstract of the paper]
It is often hard to write programs that are efficient yet reusable. For example, an efficient implementation of Gaussian elimination should be specialized to the structure and known static properties of the input matrix. The most profitable optimizations, such as choosing the best pivoting or memoization, cannot be expected of even an advanced compiler because they are specific to the domain, but expressing these optimizations directly makes for ungainly source code. Instead, a promising and popular way to reconcile efficiency with reusability is for a domain expert to write code generators.

Two pillars of this approach are types and effects. Typed multilevel languages such as MetaOCaml ensure safety: a well-typed code generator neither goes wrong nor generates code that goes wrong. Side effects such as state and control ease correctness: an effectful generator can resemble the textbook presentation of an algorithm, as is familiar to domain experts, yet insert let for memoization and if for bounds-checking, as is necessary for efficiency. However, adding effects blindly renders multilevel types unsound.

We introduce the first multilevel calculus with control effects and a sound type system. We give small-step operational semantics as well as a one-pass continuation-passing-style translation. For soundness, our calculus restricts the code generator's effects to the scope of generated binders. Even with this restriction, we can finally write efficient code generators for dynamic programming and numerical methods in direct style, like in algorithm textbooks, rather than in continuation-passing or monadic style.

Joint work with Yukiyoshi Kameyama and Chung-chieh Shan.

Version
The current version is November 2011.
References
circle-shift.pdf [1035K]
Shifting the stage: Staging with delimited control
Journal of Functional Programming 21(6):617-662, 2011.
Copyright Cambridge University Press 2011 doi:10.1017/S0956796811000256
This is the greatly extended, journal version of the paper first published in the Proc. of PEPM 2009, pp. 111-120.

Chung-chieh Shan. Slides of the talk at PEPM, January 20, 2009.
< http://www.cs.rutgers.edu/~ccshan/metafx/meta-shift-talk.pdf >

circle-shift.elf [38K]
lambda-circle calculus with shift/reset: two-stage calculus with delimited control effects
This Twelf code defines the calculus and its static (type checking) and dynamic semantics. The code contains many examples, including the staged Gibonacci example with memoization and let-insertion.

Mechanizing multilevel metatheory with control effects
Detailed description of the formalization of the extended calculus, with arbitrarily many levels

fib.ml [16K]
fib1.ml [11K]
The Gibonacci example -- generating efficient specialized versions of the generalized Fibonacci function in direct style
The two MetaOCaml files describe a progression of attempts, from an inefficient unstaged Gibonacci, memoized Gibonacci, naively staged and inefficient function and finally to the efficient memoization with let-insertion. The dangers of scope extrusion are well illustrated. The specialized Gibonacci generator is written in direct, rather than monadic or CPS style. The file fib.ml uses the explicit fix-point combinator; The other file relies on recursive definitions instead.

lcs.ml [6K]
The complete MetaOCaml code for generating optimal specialized code for the longest common subsequence: another example of dynamic meta-programming in direct style

ge_unstaged.ml [8K]
ge_gen.ml [15K]
Generating a family of Gaussian Elimination codes in direct style, without either functors or monads
The file ge_unstaged.ml is the unstaged, textbook Gaussian elimination code, in plain OCaml. The other file is the corresponding staged code, in MetaOCaml. The staged code generates a family of GE codes (with or without determinant computation, etc). We pay no performance penalty for the added flexibility.

 

The MetaOCaml files: Status report and research proposal

[The Abstract]
Custom code generation is the leading approach to reconciling generality with performance. MetaOCaml, a dialect of OCaml, is the best-developed way today to write custom code generators and assure them type-safe across multiple stages of computation. Nevertheless, the continuing interest from the community has yet to result in a mature implementation of MetaOCaml that integrates cleanly with OCaml's type checker and run-time system. Even in theory, it is unclear how staging interacts with effects, polymorphism, and user-defined data types.

We report on the status of the ongoing MetaOCaml project, focusing on the gap between theory and practice and the difficulties that arise in a full-featured staged language rather than an idealized calculus. We highlight foundational problems in type soundness and cross-stage persistence that demand investigation. We also suggest a lightweight implementation of a two-stage dialect of OCaml, as syntactic sugar.

Joint work with Chung-chieh Shan.

Version
The current version is September, 2010.
References
metaocaml-files.pdf [95K]
The extended abstract published in the informal proceedings of the 2010 ACM SIGPLAN Workshop on ML.

Chung-chieh Shan. Slides of the talk at the ML workshop. Baltimore, MD, September 26, 2010.
< http://www.cs.rutgers.edu/~ccshan/metafx/metaocaml-slides.pdf >

BER MetaOCaml: a streamlined version of MetaOCaml

 

MetaScheme, or untyped MetaOCaml

We implement the four MetaOCaml special forms -- bracket, escape, cross-stage persistence (aka `lift'), and run -- in R5RS Scheme. A Scheme system thus becomes untyped MetaOCaml, in which we can write and run code with an arbitrary number of stages. We point out how distinct the staging annotations are from Scheme's quasiquotation, despite superficial similarity. Our implementation of cross-stage persistent (CSP) values closely corresponds to the one in MetaOCaml and so helps explain the latter.

Here is a sample MetaOCaml code and its Scheme translation

     # let eta = fun f -> .<fun x -> .~(f .<x>.)>.
       in .<fun x -> .~(eta (fun y -> .<x + .~y>.))>.
     - : ('a, int -> int -> int) code = .<fun x_1 -> fun x_2 -> (x_1 + x_2)>.
     
     (let ((eta (lambda (f) (bracket (lambda (x) (escape (f (bracket x))))))))
       (bracket (lambda (x) (escape (eta (lambda (y) (bracket (+ x (escape y)))))))))
Translating MetaOCaml code into Scheme seems trivial: code values are like S-expressions, MetaOCaml's bracket is like quasiquote, escape is like unquote, and `run' is eval. If we indeed replace bracket with the quasiquote and escape with the unquote in the above Scheme code and then evaluate it, we get the S-expression (lambda (x) (lambda (x) (+ x x))) , which is a wrong result, quite different from the code expression .<fun x_1 -> fun x_2 -> (x_1 + x_2)>. produced by MetaOCaml. The latter is the code of a curried function that adds two integers. The S-expression produced by the naive Scheme translation represents a curried function that disregards the first argument and doubles the second. This demonstrates that the often mentioned `similarity' between the bracket and the quasiquote is flawed. MetaOCaml's bracket respects alpha-equivalence; in contrast, Scheme's quasiquotation, being a general form for constructing arbitrary S-expressions (not necessarily representing any code), is oblivious to the binding structure.

Our implementation still uses S-expressions for code values (so we can print them and eval-uate them) and treats escape as unquote. To maintain the hygiene, we need to make sure that every run-time evaluation of a bracket form such as (bracket (lambda (x) x)) gives '(lambda (x_new) x_new) with the unique bound variable x_new . Two examples in the source code demonstrate why static renaming of manifestly bound identifiers is not sufficient. We implement the very clever suggestion by Chung-chieh Shan and represent a staged expression such as .<(fun x -> x + 1) 3>. by the sexp-generating expression `(,(let ((x (gensym))) `(lambda (,x) (+ ,x 1))) 3) which evaluates to the S-expression ((lambda (x_1) (+ x_1 1)) 3) . Thus bracket is a complex macro that transforms its body to the sexp-generating expression, keeping track of the levels of brackets and escapes. The macro bracket is implemented as a CEK machine with the defunctionalized continuation. In our implementation, the Scheme translation of the eta-example yields the S-expression (lambda (g6) (lambda (g8) (+ g6 g8))) . Just as in MetaOCaml, the result represents the curried addition.

CSP poses another implementation problem. In MetaScheme we can write the MetaOCaml expression .<fun x -> x + 1>. as (bracket (lambda (x) (+ x 1))) , which yields an S-expression (lambda (g1) (+ g1 1)) . When we pass this code to eval, the identifier + will be looked up in the environment of eval, which is generally different from the environment that was in effect when the original bracket form was evaluated. That might not look like much of a difference since + is probably bound to the same procedure for adding numbers in either environment. This is no longer the case if we take the following MetaOCaml code and its putative MetaScheme translation:

     let test = let id u = u in
                   .<fun x -> id x>.
     
     (define test (let ((id (lambda (u) u)))
                    (bracket (lambda (x) (id x)))))
The latter definition binds test to the S-expression (lambda (x) (id x)) that contains a free identifier id , unlikely to be bound in the environment of eval. Our code values therefore should be `closures', being able to include values that are (possibly locally) bound in the environment when the code value was created. Incidentally, the problem of code closures closed over the environment of the generator also appears in syntax-case macros. R6RS editors made a choice to prohibit the occurrence of locally-bound identifiers in the output of a syntax-case transformer.

MetaOCaml among other staged systems does permit the inclusion of values from the generator stage in the generated code. Such values are called CSP; they are evident in the output of the MetaOCaml interpreter for the above test:

     val test : ('a, 'b -> 'b) code =
     .<fun x_1 -> (((* cross-stage persistent value (as id: id) *)) x_1)>.
In MetaScheme, we have to write CSP explicitly: (% e) , which is often called `lift'.
     (define test
       (let ((id (lambda (u) u)))
         (bracket (lambda (x) ((% id) x)))))
One may think that such a lifting is already available in Scheme: (define (lift x) (list 'quote x)) . Although this works in the simple cases of x being a number, a string, etc., it is neither universal nor portable: attempting this way to lift a closure or an output-port could be an error. According to R5RS, the argument of a quotation is an external representation of a Scheme datum. Closures, for example, are not guaranteed to have an external representation. For portability, we implement CSP via a reference in a global array, taking advantage of the fact that both the index (a number) and the name of the array (an identifier) have external representations and hence are trivially liftable by quotation. This is precisely the mechanism used by the current version of MetaOCaml.

The source code contains many more examples of the translation of MetaOCaml code into MetaScheme -- including the examples with many stages and CSP.

Joint work with Chung-chieh Shan.

Version
The current version is 1.4, Aug 20, 2008.
References
meta-scheme.scm [15K]
The complete implementation with many comments. It should work with any R5RS Scheme system. The code has been tested with Scheme 48 1.3, SCM 5e1, and Petite Chez Scheme 6.0a.

 

Type(class)-directed symbolic differentiation

We show how to take an ordinary numeric Haskell function, e.g., a term of the type D a => a -> a where D is a Floating-like class, and produce a function of the same type that is the symbolic derivative of the former. The original function can be given to us in a separately compiled module with no available source code. The derived function is an ordinary numeric Haskell function and can be applied to numeric arguments -- or differentiated further. The Floating-like class D currently supports arithmetic and a bit of trigonometry. We also support partial derivatives.
     test1f x = x * x + fromInteger 1
     test1 = test1f (2.0::Float)      -- 5.0
     
     test1f' x = diff_fn test1f x
     test1' = test1f' (3.0::Float)    -- 6.0
     
     test1f'' x = diff_fn test1f' x
     test1'' = test1f'' (10.0::Float) -- 2.0
The original function test1f can be evaluated numerically, test1 , or differentiated symbolically, test1f' . The result is again an ordinary numeric function (i.e., 2x), which can be applied to a numeric argument, see test1' . Alternatively, test1f' can be differentiated further.

The original function is emphatically not represented as an algebraic data type -- it is a numeric function like sin or tan . Still, we are able to differentiate it symbolically (rather than numerically or automatically). The key insight is that Haskell98 supports a sort of reflection -- or, to be precise, type-directed partial evaluation and hence term reconstruction.

Our approach also shows off the specification of the differentiation rules via type classes (which makes the rules extensible) and the emulation of GADT via type classes. In 2006, we improved the approach by developing an algebraic simplifier and by avoiding any interpretative overhead.

Version
The current version is 1.1, Nov 24, 2004.
References
differentiation.lhs [11K]
The complete literate Haskell code with more examples, including those of partial differentiation. It was originally posted as Typeful symbolic differentiation of compiled functions on the Haskell mailing list on Wed Nov 24 04:05:12 EST 2004.

Most optimal symbolic differentiation of compiled numeric functions

 

Most optimal symbolic differentiation of compiled numeric functions

We show symbolic differentiation of a wide class of numeric functions without imposing any interpretative overhead. The functions to differentiate can be given to us in a compiled form (in .hi files); their source code is not needed. We produce a (compiled, if needed) function that is an exact, algebraically simplified analytic derivative of the given function.

Our approach is reifying code into its `dictionary view', intensional analysis of typed code expressions, and staging so to evaluate under lambda. We improve the earlier, 2004 approach in algebraically simplifying the result of symbolic differentiation, and in removing interpretative overhead with the help of Template Haskell (TH). The computed derivative can be compiled down to the machine code and so it runs at full speed, as if it were written by hand to start with.

In the process, we develop a simple type system for a subset of TH code expressions (TH is, sadly, completely untyped) -- so that accidental errors can be detected early. We introduce a few combinators for the intensional analysis of such typed code expressions. We also show how to reify an identifier like (+) to a TH.Name -- by applying TH to itself. Effectively we obtain more than one stage of computation.

Our technique can be considered the inverse of the TH splicing operation: given a (compiled) numeric expression of a host program, we obtain its source view as a TH representation. The latter can be spliced back into the host program and compiled -- after, perhaps, simplification, partial evaluation, or symbolic differentiation. As an example, given the definition of the ordinary numeric, Floating a => a->a function test1f x = let y = x * x in y + 1 (which can be located in a separately compiled file), we reify it into a TH code expression, print it, differentiate it symbolically, and algebraically simplify the result:

     *Diff> test1cp
     \dx_0 -> GHC.Num.+ (GHC.Num.* dx_0 dx_0) 1
      *Diff> test1dp
     \dx_0 -> GHC.Num.+ (GHC.Num.+ (GHC.Num.* 1 dx_0) (GHC.Num.* dx_0 1)) 0
     *Diff> test1dsp
     \dx_0 -> GHC.Num.+ dx_0 dx_0
The output is produced by TH's pretty-printer. We can splice the result in a Haskell program as $(reflectQC test1ds) 2.0 and use it as ordinary, hand-written function \x -> x+x .
Version
The current version is 1.1, Dec 3, 2006.
References
Diff.txt [8K]
The message with the explanation of optimality and more examples. It was originally posted as Jones-optimal, typed, symbolic differentiation of (compiled) functions on the Haskell mailing list on Sun, 3 Dec 2006 17:56:27 -0800 (PST).

Diff.hs [9K]
The main implementation file, which defines the reification of code into TH representation, differentiation rules and algebraic simplification rules, all via the intensional analysis of the typed code. The file also includes many examples, including those of partial differentiation.

DiffTest.hs [<1K]
Running the splicing tests from Diff.hs. Due to the TH requirement, this code must be in a separate module.

TypedCode.hs [5K]
This file introduces the type Code a of typed TH code expressions. The (phantom) type parameter is the expression's type. The file defines combinators for building and analyzing these typed expressions.

TypedCodeAux.hs [2K]
Obtain the Name that corresponds to a top-level (Prelude-level) Haskell identifier by applying TH to itself.

 

In quest for staged calculus

Our main motivation comes from program generation, which is regarded as the most promising approach in high-performance computing (cf. `SPIRAL') and high-assurance embedded programming (cf. `Hume'). Staged languages such as MetaOCaml are an attractive way of such program generation. Generating code with imperative or conditional statements, or with import and other declarations, or using let insertion requires programming either in continuation-passing style (CPS) or with effects. CPS is cumbersome and particularly unattractive to domain experts -- even with the syntactic sugar such as monadic do-notation. Effects such as state or delimited control make the generator modular and notation natural to domain experts, but come with the risk of scope extrusion.

Here is the simplest example of scope extrusion in MetaOCaml caused by the effect of mutating a state:

     # let code = let x = ref .<1>. in
                  let _ = .<fun v -> .~(x := .<v>.; .<()>.)>. in
                  !x;;
     val code: ('a, int) code = .<v_1>.
     # .!code;;
     Unbound value v_1
     Exception: Trx.TypeCheckingError.
We have managed to build a piece of code with literally a free variable. Evaluating this code (by MetaOCaml's `run' operation .!) causes a paradoxical type error at run-time. The type-checker has accepted the code that it should not have. We have seen such scope extrusion arising from honest mistakes of let-insertion in real program generation with effects. No staged language today can statically prevent such mistakes.

Our goal is to make program generation convenient and safe, in particular, to statically prevent errors like scope extrusion. Developing a sound type system of staged code with effects requires a suitable calculus. To model real staged programming languages like MetaOCaml, we need a call by value calculus that supports splicing of open code, running the closed code, and cross-stage persistence. Modeling effects, especially control effects such as delimited continuations, is better with small-step operational semantics. Many formal calculi for staged programming have evolved over the last couple of decades. Here is a sample:

lambda U
untyped, supports splicing and running of code, cross-stage persistence (CSP) of values only, call by name, big-step semantics.
lambda box, lambda S4
typed, running but no splicing of code, no CSP.
lambda circle
typed, splicing but no running of code, no CSP.
lambda circle box
typed, supports splicing and limited running of code, CSP of variables, small-step call-by-value semantics.
lambda a, lambda i let
typed, supports splicing and running of code, CSP of expressions, big-step call-by-name semantics.
Alas, none of these calculi satisfy our need.

Our goal for this project is to make, with the benefit of hindsight, the existing staged calculi more uniform and their features more orthogonal to each other, so that they are easier to study, mechanize, and extend (for example, to add side effects). To be more precise, we aim to:

Joint work with Chung-chieh Shan and Yukiyoshi Kameyama.

Version
The current version is August 21, 2007.
References
Discussion at the Fifth Meeting of the IFIP Working Group 2.11 `Program Generation'.
August 21, 2007, Dragoer, Denmark.

 

Dependent open terms and the evaluation contexts that bind them

How should we represent open terms and their binding contexts, especially in Logical Frameworks (LF) with higher-order abstract syntax, and especially when one binding may depend on another? We came to this problem when formalizing the small-step semantics of staging so as to combine staging with effects in a sound type system. With staging, our redexes may be open and our evaluation contexts may be binding.

Our solution represents contexts outside-in and uses dependent types to describe the binding structure of contexts and the corresponding structure of open terms. We convinced Twelf that our our functions to decompose a term into an (open) redex and its context, and to plug an open term into its closing context are total. This totality suggests that our types adequately represent ordered dependent sequences of bindings, be they needed by an expression or provided by a context. These focusing and zipping functions let us specify the first small-step semantics for staging.

The challenge remains to represent contexts inside-out while expressing its binding structure, in particular how the continuation of a staged evaluator may ``bind off'' a later-stage variable.

Joint work with Chung-chieh Shan.

Version
The current version is June, 2007.
References
open-term-binding-ctx.pdf [76K]
The explanation of the problem, using a simple example in Scheme, MetaOCaml, and the staged calculus lambda a by Taha and Nielsen.


Last updated January 1, 2012

This site's top page is http://okmij.org/ftp/

oleg-at-pobox.com or oleg-at-okmij.org
Your comments, problem reports, questions are very welcome!

Converted from HSXML by HSXML->HTML