Sound code generation: definitions, models, challenges



Generating Code with Polymorphic let: A Ballad of Value Restriction, Copying and Sharing

Getting polymorphism and effects such as mutation to live together in the same language is a tale worth telling, under the recurring refrain of copying vs. sharing. We add new stanzas to the tale, about the ordeal to generate code with polymorphism and effects, and be sure it type-checks. Generating well-typed--by--construction polymorphic let-expressions is impossible in the Hindley-Milner type system: even the author believed that.

The polymorphic-let generator turns out to exist. We present its derivation and the application for the lightweight implementation of quotation via a novel and unexpectedly simple source-to-source transformation to code-generating combinators.

However, generating let-expressions with polymorphic functions demands more than even the relaxed value restriction can deliver. We need a new deal for let-polymorphism in ML. We conjecture the weaker restriction and implement it in a practically-useful code-generation library. Its formal justification is formulated as the research program.

The current version is February 2017
let-insertion.pdf [225K]
Electronic Proceedings in Theoretical Computer Science (EPTCS) 241, 2017, pp. 1-22 (Post-Proceedings ML/OCaml 2015) doi:10.4204/EPTCS.241.1 [17K]
OCaml code accompanying the paper
Some part of the code relies on the delimcc library of delimited control.

Staged let-generalization may be unsound
The exception to the generalization rule proposed in the paper. We conjecture this is the only exception.


Refined Environment Classifiers: Type- and Scope-Safe Code Generation with Mutable Cells

Alternative, more insightful title: Region Memory Management for Free Variables

Generating high-performance code and applying typical optimizations within the bodies of loops and functions involves moving or storing open code for later use, often in a different binding environment. There are ample opportunities for variables being left unbound or accidentally captured. It has been a tough challenge to statically ensure that by construction the generated code is nevertheless well-typed and well-scoped: all free variables in manipulated and stored code fragments shall eventually be bound, by their intended binders.

We present the calculus for code generation with mutable state that for the first time achieves type-safety and hygiene without ad hoc restrictions. The calculus strongly resembles region-based memory management, but with the orders of magnitude simpler proofs. It employs the rightly abstract representation for free variables, which, like hypothesis in natural deduction, are free from the bureaucracy of syntax imposed by the type environment or numbering conventions.

Although the calculus was designed for the sake of formalization and is deliberately bare-bone, it turns out easily implementable and not too bothersome for writing realistic program.

Joint work with Yukiyoshi Kameyama and Yuto Sudo.

master.pdf [139K]
The paper published in Springer's LNCS 10017: the Proceedings of the 14th Asian Symposium on Programming Languages and Systems (APLAS 2016). November 21-23, 2016. Hanoi, Vietnam.

APLAS-talk.pdf [86K]
Annotated slides of the talk presented at APLAS 2016 on November 22, 2016. Hanoi, Vietnam. [14K]
Embedding of the <NJ> calculus in OCaml, with many examples


New translation from nested quasiquotes to code-generation combinators: efficient, typed, easier to retrofit

Quasi-quotation (or, code templates) has long been used as a convenient tool for code generation, commonly implemented as a pre-processing/translation into code-generation combinators. The original MetaOCaml was also based on such translation, done post type checking. BER MetaOCaml employs a significantly different, efficient (especially in version N114) translation integrated with type-checking, in the least intrusive way. This paper presents the integrated efficient translation for the first time.
translation.pdf [523K]
MetaOCaml  Theory and Implementation. OCAML 2023 workshop. doi:10.48550/arXiv.2309.08207


Shifting the Stage: Staging with Delimited Control

It is often hard to write programs that are efficient yet reusable. For example, an efficient implementation of Gaussian elimination should be specialized to the structure and known static properties of the input matrix. The most profitable optimizations, such as choosing the best pivoting or memoization, cannot be expected of even an advanced compiler because they are specific to the domain, but expressing these optimizations directly makes for ungainly source code. Instead, a promising and popular way to reconcile efficiency with reusability is for a domain expert to write code generators.

Two pillars of this approach are types and effects. Typed multilevel languages such as MetaOCaml ensure safety: a well-typed code generator neither goes wrong nor generates code that goes wrong. Side effects such as state and control ease correctness: an effectful generator can resemble the textbook presentation of an algorithm, as is familiar to domain experts, yet insert let for memoization and if for bounds-checking, as is necessary for efficiency. However, adding effects blindly renders multilevel types unsound.

We introduce the first multilevel calculus with control effects and a sound type system. We give small-step operational semantics as well as a one-pass continuation-passing-style translation. For soundness, our calculus restricts the code generator's effects to the scope of generated binders. Even with this restriction, we can finally write efficient code generators for dynamic programming and numerical methods in direct style, like in algorithm textbooks, rather than in continuation-passing or monadic style.

Joint work with Yukiyoshi Kameyama and Chung-chieh Shan.

The current version is November 2011
circle-shift.pdf [1035K]
Shifting the stage: Staging with delimited control
Journal of Functional Programming 21(6):617-662, 2011. Copyright Cambridge University Press 2011 doi:10.1017/S0956796811000256
This is the greatly extended, journal version of the paper first published in the Proc. of PEPM 2009, pp. 111-120.

circle-shift.elf [38K]
lambda-circle calculus with shift/reset: two-stage calculus with delimited control effects
This Twelf code defines the calculus and its static (type checking) and dynamic semantics. The code contains many examples, including the staged Gibonacci example with memoization and let-insertion.

Mechanizing multilevel metatheory with control effects
Detailed description of the formalization of the extended calculus, with arbitrarily many levels [16K] [11K]
The Gibonacci example -- generating efficient specialized versions of the generalized Fibonacci function in direct style
The two MetaOCaml files describe a progression of attempts, from an inefficient unstaged Gibonacci, memoized Gibonacci, naively staged and inefficient function and finally to the efficient memoization with let-insertion. The dangers of scope extrusion are well illustrated. The specialized Gibonacci generator is written in direct, rather than monadic or CPS style. The file uses the explicit fix-point combinator; The other file relies on recursive definitions instead. [6K]
The complete MetaOCaml code for generating optimal specialized code for the longest common subsequence: another example of dynamic meta-programming in direct style [8K] [15K]
Generating a family of Gaussian Elimination codes in direct style, without either functors or monads
The file is the unstaged, textbook Gaussian elimination code, in plain OCaml. The other file is the corresponding staged code, in MetaOCaml. The staged code generates a family of GE codes (with or without determinant computation, etc). We pay no performance penalty for the added flexibility.


Closing the Stage: From Staged Code to Typed Closures

Code generation lets us write well-abstracted programs without performance penalty. Writing a correct code generator is easier than building a full-scale compiler but still hard. Typed multistage languages such as MetaOCaml help in two ways: they provide simple annotations to express code generation, and they assure that the generated code is well-typed and well-scoped. Unfortunately, the assurance only holds without side effects such as state and control. Without effects, generators often have to be written in a continuation-passing or monadic style that has proved inconvenient. It is thus a pressing open problem to combine effects with staging in a sound type system.

This paper takes a first step towards solving the problem, by translating the staging away. Our source language models MetaOCaml restricted to one future stage. It is a call-by-value language, with a sound type system and a small-step operational semantics, that supports building open code, running closed code, cross-stage persistence, and non-termination effects. We translate each typing derivation from this source language to the unstaged System F with constants. Our translation represents future-stage code using closures, yet preserves the typing, alpha-equivalence (hygiene), and (we conjecture) termination and evaluation order of the staged program.

To decouple evaluation from scope (a defining characteristic of staging), our translation weakens the typing environment of open code using a term coercion reminiscent of Goedel's translation from intuitionistic to modal logic. By converting open code to closures with typed environments, our translation establishes a framework in which to study staging with effects and to prototype staged languages. It already makes scope extrusion a type error.

Joint work with Yukiyoshi Kameyama and Chung-chieh Shan.

The current version is December 2007
A surprisingly simple implementation of staging
The approach in the PEPM 2008 paper below is extraordinarily complicated. It has been realized recently that the unstaging translation can be done much simply.

metafx.pdf [211K]
The paper published in the Proceedings of the 2008 ACM SIGPLAN Workshop on Partial Evaluation and Semantics-based Program Manipulation (PEPM), 2008, San Francisco, USA, January 7-8; pp. 147-157, 2008. [11K]
Examples of the staged code and its translation. The file contains the complete code for all the examples in the paper plus a few extra. The source language is MetaOCaml. The target language is supposed to be System F. We try to emulate it in plain OCaml, using first-class (record) polymorphism where needed. [4K]
Computing a staged power function while tracking the number of multiplications: The example in Sec 6 of the paper. It is easiest to write this example with a side effect such as mutable state in MetaOCaml, but such an extension (a piece of state of type int) has not been shown sound except through our translation. Second, we can write this example in pure MetaOCaml (more awkwardly) using our environment-passing translation.

Two-level staged calculus with environment classifiers, run and cross-stage persistence
The source language of the translation


Staged let-generalization may be unsound

We demonstrate that the value restriction in a staged calculus with cross-stage persistence and reference cells is insufficient to prevent unsound generalization.

The distinguished feature of Hindley-Milner type system is generalization of let-bindings. The type inferred for a binding introduced by the let-form is generalized by quantifying generalizable free type variables. For example, the expression

    let f () = [] in (1::f(), "123"::f())

is well-typed: since the type inferred for f, unit -> 'a list, contains the generalizable type variable 'a the type is generalized to the polymorphic forall 'a. unit -> 'a list (in Hindley-Milner systems, quantifiers are often omitted). The (implicitly) quantified type variable 'a can then be instantiated to int or string, permitting different instances of f to be used in differently-typed contexts.

It has long been known that let-generalization is unsound in the presence of reference cells. For example, if the type inferred for r in the expression

    let r = ref [] in
    r := [1]; "123"::!r
      This expression !r has type int list but is here used with type string list
were generalized from 'a list ref to forall 'a. 'a list ref, the expression would have returned a list that contains a string and an int, letting us apply string operations to an int, or vice versa. A well-type program would have ``gone wrong.''

To ensure the type system soundness, there have been proposed many various restrictions on let-generalization. The most widely implemented is value restriction: the type inferred for a let-binding is generalized only if the right-hand-side of the binding syntactically has the form of a value. In other words, only values may have polymorphic types. In our examples, the let-binding for f (which de-sugars to let f = fun () -> [] in ...) binds f to what syntactically is a function, that is, a value. In contrast, the right-hand-side of the binding to r, ref [], is syntactically a non-value expression. Therefore, the type of r is not generalized, prohibiting the use of r in differently-typed contexts, int list ref vs string list ref. The type checker rejects our second example.

The value restriction has a clear intuition. We can type check our first example without polymorphism, if we first inline the definition of f into its two use places. A polymorphic binding then may be regarded as an `optimization', letting us type check f once, where it is defined, rather than at every place where f is used. A polymorphic type is an indication that the expression is inlineable; some compilers, e.g., MLton, indeed inline all polymorphic expressions so that they can be compiled without resorting to boxing. Effectful expressions (such as ref []) cannot be inlined while preserving dynamic behavior as copying them replicates effects and hence is observable. Values are inert and hence can be copied or shared at compiler's discretion, without affecting dynamic behavior.

Alas, in a staged calculus with cross-stage persistence, value restriction or its variants (such as a relaxed value restriction) turn out insufficient to ensure type soundness. We demonstrate the unsoundness on a series of examples culminating in a segmentation fault. We develop our examples interactively, by submitting expressions to the top-level of a MetaOCaml interpreter of any recent version (309 or BER 002) and observing its responses. In the transcript below, the responses are indented.

The first example uses no staging and causes no controversy:

    let f () = ref []
    in f() := [1]; "123"::!(f())
         - : string list = ["123"]
The let-binding de-sugars into let f = fun () -> ref [] in ... whose right-hand side is syntactically a function. Value restriction should allow generalization; both OCaml and MetaOCaml agree and accept the expression. The indented line shows the result.

What if we enclose the whole expression into MetaOCaml brackets? If an expression is well-typed, its code should be well-typed too. If we could manually enter an expression without type errors, we should be able to automatically generate that expression without type errors.

    let c =
    .<let f () = ref []
      in f() := [1]; "123"::!(f())>.
         val c : ('a, string list) code =
           .<let f_2 = fun () -> (ref ([])) in
             ((f_2 ()) := [1]); ("123" :: (! (f_2 ())))>.
    .! c;;
         - : string list = ["123"]
MetaOCaml accepts the quoted expression from the first example. Running the code produces the result we have already seen.

Let us add an escape, or splice: .< let f () = .~(.<ref []>.) in ... >., which reduces in one step to the second example.

    let c =
    .<let f () = .~(.<ref []>.)
      in f() := [1]; "123"::!(f())>.
         val c : ('a, string list) code =
           .<let f_2 = fun () -> (ref ([])) in
             ((f_2 ()) := [1]); ("123" :: (! (f_2 ())))>.
    .! c;;
         - : string list = ["123"]
Although the right-hand side of the c-binding is no longer a (present-stage) value because of the splice, evaluating the right-hand-side produces the code value of the second example, which is well-typed. More formally, despite the splice, the second-stage binding to f still looks like a binding for a function. The (second) stage value restriction should allow second-stage generalization. And it does, in MetaOCaml. The result of running the code value c is identical to that of running the previous examples.

We come to the final example, which looks as if it reduces to the earlier ones.

    let c =
     .<let f () = .~(let x = ref [] in .<x>.)
       in f() := [1]; "123"::!(f())>.
          val c : ('a, string list) code =
            .<let f_2 = fun () -> (* cross-stage persistent value (as id: x) *) in
              ((f_2 ()) := [1]); ("123" :: (! (f_2 ())))>.
The example is accepted by MetaOCaml; the code value c clearly contains a cross-stage persistent value. We have created a reference cell at the present stage and ``lifted'' the cell to the future-stage, letting the generated code use the value as it is. We stress that the cross-stage persistence is vital in practice: without it, we have to write a staged version of all standard library functions. The let-binding for f is still syntactically a binding for a function, so generalization seems justified. If we try to run the generated code, we trip and fall:
    .! c;;
         segmentation fault
Although the right-hand side of the future-stage f binding was syntactically a functional value, fun () -> csprefval, all applications of that function return one and the same csprefval. Therefore, sharing or deep-copying of the function become observable -- the behavior that is inconsistent with the inferred polymorphic type for the function. The value restriction has a rarely mentioned premise: the only way to produce reference cells is to evaluate an expression like ref something. There are no syntactic values of the type of reference cells. Staging with cross-stage persistence can violate the premise: reference cells are still created as the result of evaluating an expression; that result, if lifted to the future stage, looks syntactically like a value.

The problem of restricting let-generalization to ensure soundness, considered closed long time ago, is thrown open in staged calculi.

Joint work with Chung-chieh Shan.

The current version is June, 2010
Generating Code with Polymorphic let: A Ballad of Value Restriction, Copying and Sharing
The problem in a broader context

Jacques Garrigue: Relaxing the Value Restriction.
Proc. Int. Symposium on Functional and Logic Programming, Nara, April 2004. Springer-Verlag LNCS 2998, pp. 196--213. (extended version: RIMS Preprint 1444)
The paper gives a good survey of approaches to restrict let-generalization to ensure its soundness.


Computational Effects across Generated Binders

Code generation is the leading approach to making high-performance software reusable. Using a set of realistic benchmarks -- faulty power, guard insertion, loop tiling -- we demonstrate that side effects are indispensable in composable code generators, especially side effects that move open code past generated binders. We challenge the audience to implement these examples in their favorite code-generation framework.

Not just any solution is acceptable: we wish to avoid tree hacking, modification or even examination of the generated code, and its post-validation. See the slide notes for the detailed explanation of these requirements. Our goal is to generate code with compositional combinators that statically assure the results (even intermediate, open results) are well-formed and well-typed.

Template Haskell is far away from the goal. MetaOCaml is closer: generators without side-effects satisfy our requirements. Alas, such generators cannot implement our benchmarks. Without effects, let-insertion or other movement of open code past generated binders is not possible.

Granted, let-insertion without crossing binders can be done effectlessly, as well-known from partial evaluation. The cost is writing generators in the continuation-passing, or monadic style (which obscures the algorithm and makes the generators harder to use by domain experts). However, even the repeated continuation-passing transformation cannot help us insert let beyond the closest binder. We need a new CPS hierarchy.

Joint work with Chung-chieh Shan and Yukiyoshi Kameyama.

The current version is September 2011
Refined Environment Classifiers: Type- and Scope-Safe Code Generation with Mutable Cells
The first solution to the problem

talk-problems.pdf [172K]
Computational Effects across Generated Binders. Part 1: Problems and solutions
Extensively annotated slides of the talk presented at the IFIP WG2.11 meeting (Bordeaux, France, September 5, 2011) and at INRIA Paris (September 9, 2011) [11K]
MetaOCaml code for the code generation benchmarks that emphasize effects crossing future-stage binders. Although the code implements all benchmarks, the absence of scope extrusion cannot be assured. Small mistakes can indeed result in unbound variables in the generated code.

Generating optimal stencil code
A real-life example of code generation with effects crossing many binders


What is lexical scope and how to enforce it

If left unchecked, side effects in code generators often interact with generated binders badly to produce unexpectedly unbound variables, or worse, unexpectedly bound ones. The literature and our experience is rife with examples of these surprises, where variables with different scopes are mixed up. To prevent such surprises while still allowing arbitrary side effects to move open code past generated binders, we first define a notion of lexical scope for generated code with explicit contexts. To each generated binder, we attach a unique label that is checked against each use occurrence of the bound variable. We then introduce a static type system to assure that these checks will succeed.

We embedded this type system in a Haskell library. We used this tagless-staged library to implement statically safe let-insertion across an arbitrary number of binders for the first time.

Joint work with Chung-chieh Shan and Yukiyoshi Kameyama.

The current version is September 2011
Chung-chieh Shan: Computational effects across generated binders. Part 2: enforcing lexical scope
Talk presented at the IFIP Working Group 2.11 (program generation), INRIA Paris, and Cornell University, 2011

Staged Haskell
Haskell code generation library that statically enforces future-stage lexical scope

Unsafe.hs [19K]
Demonstrating the run-time errors that our type system statically prevents. The code shows that well-scoped De Bruijn indices do not statically determine lexical scope.


Two-level staged calculus with environment classifiers, run and cross-stage persistence

The calculus lambda^a_v1 is the model of MetaOCaml restricted to two levels. This is a call-by-value two-level lambda-calculus that supports the manipulation and the splicing-in of the open code, the running of the closed code, and cross-staged persistence. The calculus implements Taha and Nielsen's `environment classifiers' to prevent attempts to run open code. The calculus is a small variation of the one presented in Taha and Nielsen's POPL2003 paper.

The calculus -- its syntax, dynamic and static semantics -- is implemented in Twelf. We can enter terms, infer their types and see their values. The Twelf code includes many sample terms and the examples of type inference and evaluation. The implementation was used to write all examples in the ``Closing the Stage'' (PEPM 2008) paper: The calculus was the source language of the translation described in the paper.

In addition to implementing the calculus, the Twelf code proves several meta-theoretical properties: any non-value term can be decomposed into a possibly open pre-value and the possibly binding context; primitive reductions preserve types. The proofs are very challenging: splices let evaluation happen under a future-stage lambda. The evaluation context therefore can cross the arbitrary number of dependent binders: variable binders include classifiers that must be bound, too, at that point.

The current version is 1.1, April 2008
Walid Taha and Michael Florentin Nielsen: Environment Classifiers. POPL2003

lambda-am1.elf [34K]
Twelf code
The code includes many examples of type-checking and evaluating staged terms

util.elf [<1K]
Common utilities: Natural numbers and their addition and equality


In quest for staged calculus

Our main motivation comes from program generation, which is regarded as the most promising approach in high-performance computing (cf. `SPIRAL') and high-assurance embedded programming (cf. `Hume'). Staged languages such as MetaOCaml are an attractive way of such program generation. Generating code with imperative or conditional statements, or with import and other declarations, or using let insertion requires programming either in continuation-passing style (CPS) or with effects. CPS is cumbersome and particularly unattractive to domain experts -- even with the syntactic sugar such as monadic do-notation. Effects such as state or delimited control make the generator modular and notation natural to domain experts, but come with the risk of scope extrusion.

Here is the simplest example of scope extrusion in the old MetaOCaml caused by the effect of mutating a state:

    # let code = let x = ref .<1>. in
                 let _ = .<fun v -> .~(x := .<v>.; .<()>.)>. in
    val code: ('a, int) code = .<v_1>.
    # .!code;;
    Unbound value v_1
    Exception: Trx.TypeCheckingError.
We have managed to build a piece of code with literally a free variable. Evaluating this code (by MetaOCaml's `run' operation .!) causes a paradoxical type error at run-time. The type-checker has accepted the code that it should not have. We have seen such scope extrusion arising from honest mistakes of let-insertion in real program generation with effects. No staged language today can statically prevent such mistakes.

Our goal is to make program generation convenient and safe, in particular, to statically prevent errors like scope extrusion. Developing a sound type system of staged code with effects requires a suitable calculus. To model real staged programming languages like MetaOCaml, we need a call by value calculus that supports splicing of open code, running the closed code, and cross-stage persistence. Modeling effects, especially control effects such as delimited continuations, is better with small-step operational semantics. Many formal calculi for staged programming have evolved over the last couple of decades. Here is a sample:

untyped, supports splicing and running of code, cross-stage persistence (CSP) of values only, call by name, big-step semantics.
lambda^box, lambda^S4
typed, running but no splicing of code, no CSP.
typed, splicing but no running of code, no CSP.
lambda^{circle box}
typed, supports splicing and the limited running of code, CSP of variables, small-step call-by-value semantics.
lambda^a, lambda^i_let
typed, supports splicing and running of code, CSP of expressions, big-step call-by-name semantics.
Alas, none of these calculi satisfy our need.

Our goal for this project is to make, with the benefit of hindsight, the existing staged calculi more uniform and their features more orthogonal to each other, so that they are easier to study, mechanize, and extend (for example, to add side effects). To be more precise, we aim to:

Joint work with Chung-chieh Shan and Yukiyoshi Kameyama.

The current version is August 21, 2007
Discussion at the Fifth Meeting of the IFIP Working Group 2.11 `Program Generation'. August 21, 2007, Dragoer, Denmark.

Two-level staged calculus with environment classifiers, run and cross-stage persistence
This calculus is the result of our quest. The calculus is a close model of MetaOCaml.

Shifting the Stage: Staging with Delimited Control
A simpler calculus, without cross-stage persistence and the first-class run. The calculus is significantly easier to work with. It let us add delimited control to staging. The meta-theory of the resulting calculus has been fully mechanized, with mechanically checked proofs of progress and preservation.


Dependent open terms and the evaluation contexts that bind them

How should we represent open terms and their binding contexts, especially in Logical Frameworks (LF) with higher-order abstract syntax, and especially when one binding may depend on another? We came to this problem when formalizing the small-step semantics of staging so as to combine staging with effects in a sound type system. With staging, our redexes may be open and our evaluation contexts may be binding.

Our solution represents contexts outside-in and uses dependent types to describe the binding structure of contexts and the corresponding structure of open terms. We convinced Twelf that our our functions to decompose a term into an (open) redex and its context, and to plug an open term into its closing context are total. This totality suggests that our types adequately represent ordered dependent sequences of bindings, be they needed by an expression or provided by a context. These focusing and zipping functions let us specify the first small-step semantics for staging.

The challenge remains to represent contexts inside-out while expressing its binding structure, in particular how the continuation of a staged evaluator may ``bind off'' a later-stage variable.

Joint work with Chung-chieh Shan.

The current version is June, 2007
open-term-binding-ctx.pdf [76K]
The explanation of the problem, using a simple example in Scheme, MetaOCaml, and the staged calculus lambda^a by Taha and Nielsen.