This page collects examples and sample DSLs implemented
in the typed final (``tagless-final'') style.

- A surprisingly simple implementation of staging
- lambda to SKI: a different view on the old translation
- Expressing sharing
- Effects Without Monads: Non-determinism -- embedding effectful DSLs
- Reconciling Abstraction with High Performance: A MetaOCaml approach
- The index of DSLs embedded in OCaml, in tagless-final style
- C-like DSL with arrays, loops and mutable variables and its compilation to C
- Embedded two-staged DSL with let-, letrec- and mutual-letrec--insertion in pure, plain OCaml
- Algebraic data types and measuring strictness
- Math layout
- Type-safe Formatted IO (printf and scanf) as an embedded DSL: the final view
- DSL of algebraic effects and handlers
- Tagless-Staged: a step toward MetaHaskell
- Type- and Scope-Safe Code Generation with Mutable Cells: A tagless-final OCaml embedding of a staged calculus with a rather complicated, MLF-like type system
- DSLs with reference types, pointers, mutable variables: translating OCaml-like reference-type variables to C-like mutable variables
- Evaluators, Normalizers, Reducers: from denotational to operational semantics
- Ordinary and one-pass CPS transformation
- Type-directed partial evaluation
- Parameterizing expressions by the evaluation order: call-by-name, call-by-value, call-by-need
- Linear and affine lambda-calculi
- Embedding of lambda calculus with De Bruijn Levels
- Lambek Grammars as an embedded DSL: a tool for semanticists
- Lambda: the ultimate syntax-semantics interface
- Lambek calculi
- Transformational Semantics (TS): Gradually Transforming Syntax to Semantics
- Mobile Code in C# via Finally Tagless Interpreters

- We demonstrate the ordinary and the administrative-redex--less call-by-value
Continuation Passing Style (CPS) transformations that assuredly produce
well-typed terms and are
*patently*total.Our goal here is not to evaluate, view or serialize a tagless-final term, but to transform it to another one. The result is a tagless-final term, which we can interpret in multiple ways: evaluate, view, or transform again. It is natural to require the result of transforming a well-typed term be well-typed. In the tagless-final approach that requirement is satisfied automatically: after all, only well-typed terms are expressible. We impose a more stringent requirement that a transformation be total. In particular, the fact that the transformation handles all possible cases of the source terms must be patently, syntactically clear. The complete coverage must be so clear that the metalanguage compiler should be able to see that, without the aid of extra tools.

Since the only thing we can do with tagless-final terms is to interpret them, the CPS transformer is written in the form of an interpreter. It interprets source terms yielding transformed terms, which can be interpreted in many ways. In particular, the terms can be interpreted by the CPS transformer again, yielding 2-CPS terms. CPS transformers are composable, as expected.

A particular complication of the CPS transform is that the type of the result is different from the type of the source term: the CPS transform translates not only terms but also types. Moreover, base types and the arrow type are translated in different ways. To express CPS, we need an interpreter that gives the meaning not only to terms but also to types. In particular, what the function types denote should be up to a particular interpreter. It turns out the existing tagless-final framework is up to that task: with the help of type families, we can after all define an instance of Symantics that interprets source types as CPS-transformed types.

The ordinary (Fischer or Plotkin) CPS transform introduces many administrative redices, which make the result too hard to read. Danvy and Filinski proposed a one-pass CPS transform, which relies on the metalanguage to get rid of the administrative redices. The one-pass CPS transform can be regarded as an example of the normalization-by-evaluation.

CPS.hs [10K]

Ordinary and one-pass CPS transforms and their compositionsTTFdBHO.hs [2K]

The simplest tagless-final transformer, from the De-Bruijn--based Symantics to the one based on the higher-order abstract syntaxOlivier Danvy and Andrzej Filinski. Representing Control: A Study of the CPS Transformation.

Mathematical Structures in Computer Science, 1992.

- We present the unexpectedly straightforward embedding of the staged
language that may be regarded as a simple version of MetaOCaml. In
fact, MetaOCaml quotations and splices can be translated to our
language in the most direct way. The language supports open
quotation (allowing free variables in fragments of the generated code)
and cross-stage persistence.
The staged language is embedded into OCaml in the tagless-final style, as the signature

`Code`

and its three implementations:`CodeReal`

(in terms of the ordinary OCaml),`CodeSting`

(performing code generation via strings), and`CodeCode`

(in terms of MetaOCaml). The`CodeReal`

implementation -- of the staged calculus is terms of the ordinary, unstaged OCaml -- embodies an*unstaging translation*. It is orders of magnitude simpler compared to the other such translations described in literature, requiring no fancy types or type-level operations. The staged language can be extended to support the polymorphic`let`

. **Version**- The current version is February 2017
**References**- let-insertion.pdf [225K]

Generating Code with Polymorphic let: A Ballad of Value Restriction, Copying and Sharing

Electronic Proceedings in Theoretical Computer Science (EPTCS) 241, 2017, pp. 1-22 (Post-Proceedings ML/OCaml 2015) DOI: 10.4204/EPTCS.241.1

Sec 3.1 of the paper describes a staged calculus in the form of code-generating combinators, and the translation of MetaOCaml quotations and splices into it. The section formulates the correctness properties of the translation.polylet.ml [17K]

The OCaml code for the embedding of the staged language into OCaml, and a few examplesClosing the Stage: From Staged Code to Typed Closures

The previous, immensely more complex unstaging translation{Tagless-Staged: a step toward MetaHaskell

Type- and Scope-Safe Code Generation with Mutable Cells

The tagless-final embedding, in Haskell and OCaml, of a staged calculus with a rather complicated, MLF-like type system. The type system ensures scope safety of the generated code even in the presence of effects.

- Jacques Carette has posed a problem of designing an embedded DSL for
Math layout, with subscripts, superscripts,
tables, fractions, and other standard features.
The language has to support an
*extensible*set of glyphs, such as Greek letters and special characters. It should be possible to add more glyphs later, similar to `using'`stmaryrd`

, etc. in LaTeX. Adding a new glyph should not affect the existing documents in any way. Another requirement is an extensible set of renderers (formatters), such as plain text, LaTeX, HTML+MathML, etc. One should be able to add a new formatter later, without having to re-write or even recompile the existing documents. A formatter does not have to support all layout features or glyphs. However, an attempt to render a document with an unsuitable formatter should be flagged as a type error.The tagless-final approach seems to easily fulfill all the requirements, as the enclosed code demonstrates.

**Version**- The current version is May 2016
**References**- MathLayout.hs [5K]

Haskell code outlining the embedding of the layout language, with LaTeX and HTML backends

- The paper `Not by equations alone' introduced a realistic calculus of
effects, its denotational semantics, and used it to reason and optimize
interesting, non-trivial programs with non-determinism and state.
The calculus is implemented by embedding it into pure OCaml in the tagless-final style. The implementation was used to run all the examples in the paper and test the derived equational laws. One may use it for further experimenting and reasoning with (multiple) algebraic effects.

**References**- eff-calculus.ml [27K]

The implementation with many examples

- We present a technique for compiling lambda-calculus expressions into
SKI combinators. Unlike the well-known bracket abstraction based on
(syntactic) term re-writing, our algorithm relies on a specially
chosen,
*compositional*semantic model of generally open lambda terms. The meaning of a closed lambda term is the corresponding SKI combination. For simply-typed as well as unityped terms, the meaning derivation mirrors the typing derivation. One may also view the algorithm as an algebra, or a non-standard evaluator for lambda-terms (i.e., denotational semantics).The algorithm is implemented as a tagless-final compiler for (uni)typed lambda-calculus embedded as a DSL into OCaml. Its type preservation is clear even to OCaml. The correctness of both the algorithm and of its implementation becomes clear.

Our algorithm is easily amenable to optimizations. In particular, its output and the running time can both be made linear in the size (i.e., the number of all constructors) of the input De Bruijn-indexed term.

**Version**- The current version is May 2018
**References**- ski.pdf [354K]

The paper presented on May 9, 2018 at FLOPS 2018 (Nagoya, Japan) and published in Springer's LNCS 10818, pp. 33-50, 2018

doi:10.1007/978-3-319-90686-7_3skconv.ml [34K]

The complete code accompanying the paper

The code presents the basic SKI translation and all optimizations described in the paper, along with several interpreters.

- Many DSLs deal not only with integers, floats or strings but also with
composite data types like tuples, options, records, arrays,
lists. They too can be represented in tagless-final style. Arrays are
straightforward: we only need to introduce, similarly to OCaml's Array
module, the operations
`make`

,`length`

,`get`

and`set`

. Algebraic data types are more interesting: we want to be able to not only construct them but also deconstruct, or patter-match. Programming languages with algebraic data types typically have a dedicated syntax for pattern-matching. Reproducing such syntax sugar in an embedded DSL is difficult -- but not impossible if the host language is expressive enough. This article shows the example, introducing an eDSL with lists and using it for standard pattern-match--based list processing. Lists are a prototypical example of algebraic data types, encompassing tuples (or, products), variants (or, sums), and recursion.As an illustration of multiple interpretations, afforded by the tagless-final style, we provide several interpretations of lists: strict, non-strict and lazy (memoized). We also define `operation counting' interpreters, which count cons/closure construction and deconstruction. We then investigate how efficiently left-fold can be emulated via right-fold, answering the question if left-fold is superfluous. Does the answer depend on the strictness of the list and the evaluation strategy -- call-by-name, call-by-value or call-by-need? Analysis of call-by-need is particularly tricky, so any firm counts are welcome. In the next article, we extend the counting to report not only the total amount of cons/closure allocations but the peak/average allocation rate: i.e., the space complexity of the algorithm. We use the extended counting to verify fusion in different strategies, and check if zip can be time- and space-efficiently expressed via a fold.

As the basis we take the simplest EDSL with functions and integer arithmetic, described a decade earlier. It was embedded in OCaml and defined by the signature below. The EDSL was meant to illustrate various evaluation strategies: call-by-value, call-by-name and call-by-need. For the sake of such generality, we introduced the abstract

`('a,'b) arrow`

type of DSL functions (normally, the arrow type constructor of the host language suffices).module type EDSL = sig type 'a exp (* representation of terms *) type ('a,'b) arrow (* The arrow type *) val int : int -> int exp val (+) : int exp -> int exp -> int exp val (-) : int exp -> int exp -> int exp val lam : ('a exp -> 'b exp) -> ('a,'b) arrow exp val (/) : ('a,'b) arrow exp -> ('a exp -> 'b exp) (* application *) end

As syntax sugar, we introduce let-expressions for the embedded language (not to be confused with OCaml`let`

s) and the often occurring two-argument eDSL functions (which should remind one of`zip_with`

):let (let-) : 'a exp -> ('a exp -> 'b exp) -> 'b exp = fun e body -> lam body / e let lam2 : ('a exp -> 'b exp -> 'c exp) -> ('a,('b,'c)arrow)arrow exp = fun f -> lam @@ fun x -> lam @@ fun y -> f x y

They are defined generically for any implementation of`EDSL`

. One may wonder if`let-`

is the mere reverse application:`fun e body -> body e`

. Not quite: it depends on the interpreter/evaluation strategy, as we shall see later.We now extend the EDSL, adding the type of embedded language lists

`'a list`

and the familiar list constructors`nil`

and`cons`

:type +'a list (* List type of the EDLS *) val nil : 'a list exp val cons : 'a exp -> 'a list exp -> 'a list exp

That is, eDSL lists are constructed by either`nil`

or`cons`

; the latter requires, as the second argument, an already constructed list. This is only one part of the standard inductive definition of lists, however. The second part says that lists are constructed*only*with`nil`

and`cons`

. The total (as we aim and assume) operation`decon`

expresses that fact constructively: for any list`l`

it tells exactly which constructor,`nil`

or`cons`

, was used to build it; in case of`cons`

, we also obtain the arguments used with that constructor.val decon : 'a list exp -> 'w exp -> ('a exp -> 'a list exp -> 'w exp) -> 'w exp

It is supposed to satisfy the equationsdecon onnil oncons nil === onnil decon onnil oncons (cons x l) === oncons x l

We call this deconstructor `lazy'. We may even emphasize laziness by defining the deconstructor asval decon : 'a list exp -> (unit -> 'w exp) -> ('a exp -> 'a list exp -> 'w exp) -> 'w exp

but this is not needed. Another way to write the deconstructor is to rely on option and tuples of the host language, if any:val decon : 'a list exp -> ('a exp * 'a list exp) option

We call it strict, for the reasons that become clear later, when we come to the implementation of the signature. Its defining equations aredecon nil === None decon (cons x l) === Some (x,l)

Finally, we need the induction principle: if we can handle

`nil`

, and can handle`cons x l`

assuming the handling of`l`

, we can process any lists. In a word, we need`fold`

. Rather than introducing it as an eDSL primitive, we provide tools to write it, as we will be investigating different folds below. Thus, we introduce recursion as primitive -- which, together with`decon`

, lets us writes folds, and all other list processing operations.val fix : (('a,'b)arrow exp -> ('a,'b)arrow exp) -> ('a,'b)arrow exp

The EDSL is thus defined, and we may try writing list-processing functions. The first is

`map`

; to remind, it can be defined in OCaml aslet map : ('a -> 'b) -> 'a list -> 'b list = fun f lst -> let rec loop l = match l with | [] -> [] | h::t -> f h :: loop t in loop lst

In our EDSL, it is written as follows (using the lazy`decon`

):let map : ('a,'b)arrow exp -> 'a list exp -> 'b list exp = fun fexp lst -> let- f = fexp in let loop = fix @@ fun loop -> lam @@ fun l -> decon l nil (fun h t -> cons (f / h) (loop / t)) in loop / lst

With the strict`decon`

, the similarity with the OCaml`map`

code is closer:let map : ('a,'b)arrow exp -> 'a list exp -> 'b list exp = fun fexp lst -> let- f = fexp in let loop = fix @@ fun loop -> lam @@ fun l -> match decon l with | None -> nil | Some (h,t) -> cons (f / h) (loop / t) in loop / lst

We can even use pattern-matching. One may notice that our EDSL`map`

is actually an OCaml function with two arguments: that is, to the EDSL it is a macro. This macro-form of`map`

is more convenient to use (and, if needed, can be converted to the EDSL function by`lam2`

). Still, we have to be aware that its first argument`fexp`

is an EDSL expression, which gets used many times in the loop. Therefore, it better be shared, using`let-`

(if an interpreter of EDSL supports sharing: some don't, as we shall see.)A few other examples:

`append`

, left and right fold, and`zip_with`

(called`map2`

in OCaml):let append : 'a list exp -> 'a list exp -> 'a list exp = fun l1 l2 -> let loop = fix @@ fun loop -> lam @@ fun l1 -> match decon l1 with | None -> l2 | Some (h,t) -> cons h (loop / t) in loop / l1 let fold_right : ('a,('z,'z)arrow)arrow exp -> 'z exp -> 'a list exp -> 'z exp = fun fexp z lst -> let- f = fexp in let loop = fix @@ fun self -> lam @@ fun lst -> match decon lst with | None -> z | Some (h,t) -> f / h / (self / t) in loop / lst let fold_left : ('z,('a,'z)arrow)arrow exp -> 'z exp -> 'a list exp -> 'z exp = fun fexp z lst -> let- f = fexp in let loop = fix @@ fun self -> lam2 @@ fun z lst -> match decon lst with | None -> z | Some (h,t) -> self / (f / z / h) / t in loop / z / lst let zip_with : ('a,('b,'c)arrow)arrow exp -> 'a list exp -> 'b list exp -> 'c list exp = fun fexp l1 l2 -> let- f = fexp in let loop = fix @@ fun loop -> lam2 @@ fun l1 l2 -> match decon l1 with | None -> nil | Some (h1,t1) -> match decon l2 with | None -> nil | Some (h2,t2) -> cons (f / h1 / h2) (loop / t1 / t2) in loop / l1 / l2

All in all, the EDSL code is quite close to how would we write it in OCaml. The accompanying code shows more examples, of defining and using list-processing functions.To run the examples, we need an implementation of the EDSL. The first that comes to mind is meta-circular one: EDSL functions are OCaml functions and EDSL lists are normal OCaml lists. It is indeed straightforward and left as an exercise to the reader.

We will be mostly interesting with the call-by-name, call-by-value and call-by-need family of interpreters, called

`CBName`

,`CBValue`

and`CBLazy`

(see the earlier article: Parameterizing expressions by the evaluation order: call-by-name, call-by-value, call-by-need). They all share the same representation of EDSL expressions and functions, as:type 'a exp = unit -> 'a type ('a,'b)arrow = (unit -> 'a) -> (unit ->'b)

In fact, the only difference between the different strategies is the implementation of`lam`

. The extension to lists is likewise almost uniform. Here are non-strict lists:module ListName(S:EDSL with type 'a exp = unit -> 'a and type ('a,'b)arrow = (unit -> 'a) -> (unit ->'b)) = struct include S type 'a list = Nil | Cons of 'a exp * 'a list exp let nil : 'a list exp = fun () -> Nil let cons : 'a exp -> 'a list exp -> 'a list exp = fun x l -> fun () -> Cons (x,l) let decon : 'a list exp -> ('a exp * 'a list exp) option = fun lst -> match lst () with | Nil -> None | Cons (h,t) -> Some (h,t) let fix : (('a,'b)arrow exp -> ('a,'b)arrow exp) -> ('a,'b)arrow exp = fun f -> fun () -> let rec fp () = f fp () in fp () end

Indeed, here`cons`

does not evaluate its arguments. It should also become clear why`decon`

was called `strict'.Strict lists differ only in the interpretation of

`cons`

: now it does evaluate its arguments.module ListValue(S:EDSL with type 'a exp = unit -> 'a and type ('a,'b)arrow = (unit -> 'a) -> (unit ->'b)) = struct include ListName(S) let cons : 'a exp -> 'a list exp -> 'a list exp = fun x l -> fun () -> let xv = x () and lv = l () in Cons ((fun () -> xv),(fun () -> lv)) end

In lazy lists,`cons`

arguments are evaluated on demand and shared:module ListLazy(S:EDSL with type 'a exp = unit -> 'a and type ('a,'b)arrow = (unit -> 'a) -> (unit ->'b)) = struct include ListName(S) let share : 'a exp -> 'a exp = fun e -> let r = ref e in fun () -> let v = !r () in r := (fun () -> v); v let cons : 'a exp -> 'a list exp -> 'a list exp = fun x l -> fun () -> Cons (share x, share l) end

We may run the list examples with`ListName(CBName)`

,`ListValue(CBValue)`

and`ListLazy(CBLazy)`

(or even`ListLazy(CBValue)`

) interpreters and confirm they all produce the same results.But efficiency probably differs. How can we check? By defining a counting interpreter (actually, a transformer) that modifies the definitions of

`cons`

,`decon`

and`lam`

to count cons-tructions, decon-structions and building and applying of (selected) closures. See the accompanying source code for detail (the functor`ListCnt`

).To get the feel for how the counters work, consider the following program.

let- dummy = int 0 in let- l = cons (int 0) (cons (int 1) nil) in let- unused = append l l in let- l2 = append l l in fold_right (lam2 (+)) (int 0) l2

The call-by-value interpreter with strict lists reports: 6 cons constructed, 8 deconstructed, 9 closures created and 12 applied. Indeed,`l`

contributes 2 cons constructions, the`append`

in`unused`

contributes 2 constructions and 2 deconstructions, the append in`l2`

contributes the same. Finally,`fold_right`

walks the whole list and hence does 4 deconstructions. The total matches the report by the counting interpreter. As to closures, each`let-`

contributes one closure construction and application.`lam2`

contributes one (outer) closure construction, whose application to two arguments contributes 1 closure construction and 2 applications. Again, the totals agree with the measurements.For lazy lists with the lazy interpreter, we obtain for the same program 4 cons constructed and 6 deconstructed; the closure statistics is the same. The lazy interpreter does not evaluate

`unused`

, and so the`append l l`

contribution (2 cons and 2 decons) is not happening. For unstrict lists with the call-by-name interpreter, we obtain 6 cons constructed and deconstructed and 12 closures created and applied. The`unused`

is again not evaluated; however, due to the lack of sharing, the construction of`l`

and`lam2 (+)`

happens repeatedly.We now apply the counting framework to the question considered a few years ago: if fold-left is worth it, given that it is easily expressible via right-fold:

let fold_left_via_right : ('z,('a,'z)arrow)arrow exp -> 'z exp -> 'a list exp -> 'z exp = fun fexp z lst -> let- f = fexp in fold_right (lam2 @@ fun e a -> lam @@ fun z -> a / (f / z / e)) (lam Fun.id) lst / z

All other list operations are expressible via right fold. For example,`map`

:let map : ('a,'b)arrow exp -> 'a list exp -> 'b list exp = fun fexp lst -> let- f = fexp in fold_right (lam2 @@ fun e l -> cons (f / e) l) nil lst

We conduct an experiment: define the following three benchmarks and apply to two lists, of sizes 4 and 16, resp.

let tmr l = fold_right (lam2 (Fun.flip (-))) (int 0) l let tml l = fold_left (lam2 (-)) (int 0) l let tmlr l = fold_left_via_right (lam2 (-)) (int 0) l

The count of constructed and deconstructed cons-cell is predictable: exactly the size of the input list -- regardless of strictness of lists or the interpreter. Left- and right- fold processing (benchmarks`tml`

and`tmr`

) happen to have the same complexity: the same cons and closure use, again for all interpreters. The following table reports (closure creation, closure application) counts, for different interpreters:~ left- or right- left-via-right Value, size 4 ( 5, 8) ( 22, 28) Value, size 16 (17, 32) ( 70, 100) Lazy, size 4 ( 5, 8) ( 22, 28) Lazy, size 16 (17, 32) ( 70, 100) Name, size 4 ( 8, 8) ( 28, 28) Name, size 16 (32, 32) (100, 100)

Since there is no `dead' code, call-by-value with strict lists and call-by-need with lazy lists have the same complexity. Call-by-name is predictably worse: more closure construction due to lack of sharing. Expressing left-fold via right-fold is about 3 times worse, regardless of the evaluation strategy. We'd better keep left-fold as a primitive. **Version**- The current version is August 2022
**References**- call_by_any.ml [7K]

The basic EDLS, which appeared in Parameterizing expressions by the evaluation order: call-by-name, call-by-value, call-by-needlists.ml [19K]

The complete code accompanying the article, with more examplesLeft fold vs. right fold: theory vs. practice

An old article with the analysis of left- and right- folds

- One may think that only those DSL can be embedded in Haskell whose
type system is a subset of that of Haskell. To counter that impression
we show how to faithfully embed typed
*linear*lambda calculus. Any bound variable must be referenced exactly once in abstraction's body. As before, only well-typed and well-formed terms are representable. Haskell as the metalanguage will statically reject as ill-typed the attempts to represent terms with a bound variable referenced several times -- or, as in the K combinator, never.We build on the embedding of the ordinary simply-typed lambda calculus with De Bruijn indices described earlier. An object term of the type

`a`

was represented as a value of the type`repr h a`

, where the binary type constructor`repr`

is a member of the class`Symantics`

. Here`h`

stands for the type environment, assigning types to free variables (`hypotheses') of a term. Linear lambda calculus regards bound variables as representing resources; referencing a variable consumes the resource. We use the type environment for tracking the state of resources: available or consumed. The type environment becomes the*type state*. We follow the approach described in Variable (type)state `monad'.We represent linear lambda terms by Haskell values of the type

`repr hi ho a`

, where`hi`

stands for the variable state before evaluating the term and`ho`

represents the state after evaluating the term. To be more precise,`hi`

and`ho`

are the types of the variable states. We can determine the types and hence the state of the variables statically: As usual, the type checker does abstract interpretation. In our tagless-final encoding,`lam`

has the following typelam :: repr (F a,hi) (U,ho) b -> repr hi ho (a->b)

The expression`(lam e)`

has the type`repr hi ho (a->b)`

provided the body of abstraction,`e`

, has the type`repr (F a,hi) (U,ho) b`

. That is, in the environment extended with a term of the type`a`

, the body must produce the value of type`b`

. The body must consume the term at the top of the environment, changing the type of the first environment cell from`F a`

to`U`

(the type of the used variable).A trivial modification turns the embedding of the linear lambda-calculus into that of the affine lambda-calculus, which allows to ignore bound variables. K combinator becomes expressible.

LinearLC.hs [11K]

Commented code defining the typed linear lambda calculus and its two interpreters, to evaluate and to show linear lambda terms. Later we add general abstractions imposing no constraints on the use of bound variables.Jeff Polakow: Embedding a full linear Lambda calculus in Haskell

Proceedings of the ACM SIGPLAN Haskell Symposium, 2015, pp. 177-188.

Polakow's tagless-final linear lambda-calculus interpreter relies on higher-order abstract syntax, rather than De Bruijn indices of LinearLC.hs. He implements the full linear lambda calculus with additives and units.

- We present a tagless-final embedding of simply-typed pure lambda
calculus with De Bruijn
*levels*. Unlike De Bruijn indices, the embedding of the De Bruijn-levels calculus was thought to be impossible without type-level functions, which excludes OCaml and many other languages.In the standard (original) presentation of lambda calculus, variables are denoted by symbolic names (strings); a binding location, `lambda', is labeled by the name of the variable it binds. De Bruijn's insight was to replace names with integers -- `nameless dummies' -- which identify the corresponding binding occurrence by the position in the sequence of nested lambdas. De Bruijn indices count the position `inside-out' (so that index 0 points to the innermost lambda), whereas De Bruijn levels count in the opposite direction (hence the outermost lambda is at level 0). For example, the S combinator, which is

`\f\g\x. (f x) g x`

in the namefull notation, is written as`lam lam lam 2 0 (1 0)`

in the De Bruijn indices notation, and`lam lam lam 0 2 (1 2)`

in the De Bruijn levels notation. With indices or levels, we no longer have to label binding locations, and, mainly, no longer need alpha-conversion. We can compare terms without needing to rename anything.In the indices notation,

`lam 0`

denotes the identity function -- whether this term appears by itself or as part of a larger term, for example,`lam lam ((lam 0) (1 0))`

(which is`\f\x. (\z.z) (f x)`

). In contrast, De Bruijn-levels terms are inherently unmodular: the identity function is represented by`lam 0`

if it appears standalone. In`\f\x. (\z.z) (f x)`

, the very same identity subterm is denoted as`lam 2`

. On the other hand, the term`\x\f. (\y\g. g y z) (f x)`

in the indices notation becomes`lam lam (lam lam (0 1 3)) (0 1)`

. The variable`x`

, bound by the outermost lambda, is denoted either by`1`

or by`3`

, depending on its occurrence within the term. The same term in the levels notation,`lam lam (lam lam (3 2 0)) (1 0)`

, shows that all occurrences of the same variable are denoted by the same integer. De Bruijn-levels terms are therefore much easier to read.Terms in the indices notation are, however, much easier to embed. The typing rules of the calculus (for

`z`

and`lam`

) give a hint:t, G |- z : t t, G |- e : t' ------------------ G |- lam e : t->t'

The type environment`G`

is an ordered sequence (a list);`lam`

pushes the argument type to the top of the list, and`z`

(the variable of the index 0) retrieves it from there. This symmetry leads to a simple implementation, with`G`

represented by a nested tuple. Here is the tagless-final embedding, as an OCaml module signaturemodule type LamIx = sig type ('g,'a) repr val z: ('a*'g,'a) repr val s: ('g,'a) repr -> (_*'g,'a) repr val lam: ('a*'g,'b) repr -> ('g,'a->'b) repr val (/): ('g,'a->'b) repr -> ('g,'a) repr -> ('g,'b) repr (* application *) end

The inferred type of the sample term`lam (lam (s z / s (s z) / z))`

, namely,`('a * 'b, ('a -> 'c -> 'd) -> 'c -> 'd) L.repr`

shows it to be an open term of the type`('a -> 'c -> 'd) -> 'c -> 'd`

with one free variable of the type`'a`

. We aim at the similar typeful embedding of the levels-based calculus: the inferred type of the OCaml value should tell not just the type of the term but also its free variables and their types.However, the embedding of the levels-based calculus seems next to impossible. Again, consider the representative typing rules:

t, G |- z : t G, t |- e : t' ------------------ G |- lam e : t->t'

Whereas`z`

retrieves the type from one end of the typing environment`G`

,`lam`

pushes its binding variable type to the other end. Our embedding should be able to instruct the type-checker to push and pop items to/from a*queue*of an arbitrary length. Type-level functions seem inevitable.Let us think more of the inherent unmodularity of De Bruijn levels. Let

`v0`

denote a variable with the level`0`

and`v1`

denote the variable with the level`1`

. The open term`v0 v1`

has two free variables -- which should be apparent in the type of its embedding. If we use this term to build a bigger term`lam (v0 v1)`

, which of the two variables the`lam`

binds? We cannot actually tell. If`lam (v0 v1)`

is standalone,`lam`

binds`v0`

and`v1`

is left free. If the term is a part of`lam (lam (v0 v1))`

, the inner`lam`

binds`v1`

. As a part of`lam (lam (lam (v0 v1)))`

, the innermost`lam`

binds neither`v0`

nor`v1`

. When building terms bottom-up, from fragments, we may hence need to somehow tell the level of a`lam`

in the complete term -- or assign the level to a`lam`

. This is the solution. It relies on the library of heterogeneous lists`Hlist_simple`

, which provides just what we need: a lens, to retrieve an element from a heterogeneous list based on its `index', and to replace the element (with a value of a different type).Here is the embedding of the De Bruijn-levels calculus, again as an OCaml signature:

module type LamLev = sig type ('g,'a) repr val v: ('g,'a,_,_) lens -> ('g,'a) repr val lam: ('g1,_,'a,'g) lens -> ('g,'b) repr -> ('g1,'a->'b) repr val (/): ('g,'a->'b) repr -> ('g,'a) repr -> ('g,'b) repr (* application *) end

(We could have used`z`

and`s`

from the indices embedding; we went for the uniformity of reference and binding occurrences.) The type`('g,'a,'b,'g1) lens`

of the`Hlist_simple`

library is the type of the lens into the heterogeneous sequence`'g`

to retrieve an element of the type`'a`

. If that element is later replaced by a value of the type`'b`

, the entire sequence type changes to`'g1`

. To construct values of the`lens`

type, the library provides the constructors`LHere`

and`LNext`

.The running example is now written as

`lam n1 (lam n2 (v n1 / v n0 / v n2))`

, where we have defined`n0`

to be`LHere`

,`n1`

to be`LNext n0`

and`n2`

to be`LNext n1`

. The inferred type is`('a * ('b * ('c * 'd)), ('a -> 'e -> 'f) -> 'e -> 'f) L.repr`

. It tells that the term has the type`('a -> 'e -> 'f) -> 'e -> 'f`

and has one free variable at the level 0 of the type`'a`

. The type also tells that the term uses two other slots in the environment (for bound variables). Leaking the information about bound variables betrays from the unmodularity of De Bruijn levels. One may also view this information as a hint as to how much space to reserve in the environment at run-time.In conclusion, by bringing the labels to binding occurrences, we made the construction of terms a bit more predictable. Mainly, we were able to embed the De Bruijn-levels calculus in a host language with GADTs, without resorting to full-blown type-level programming or dependent types.

I thank Simon Charlow for the insightful question.

**Version**- The current version is May 2020
**References**- db_levels.ml [7K]

Complete (sans`Hlist_simple`

) OCaml code and examples, including two interpreters: the evaluator and the pretty printer. For the sake of comparison, the code also contains the tagless-final embedding of the lambda calculus with De Bruijn indices.Typed heterogeneous collections: Lightweight HList

The dependency: the`Hlist_simple`

library

- Olivier Danvy's original POPL96 paper on type-directed partial
evaluation used an untyped target language, represented as an
algebraic data type. Type preservation was not apparent and had
to be proved. In our presentation, the result of reification is a
*typed*expression, in the tagless-final form. Type preservation of reification is now syntactically apparent and is verified by the Haskell type-checker. In the tagless-final presentation, reification and reflection seem particularly symmetric, elegant and insightful.TDPE.hs [6K]

Tagless-final presentation of type-directed partial evaluationToTDPE.hs [<1K]

The imported module with sample functions to reify. Compiling this module makes for a nicer example.<http://www.brics.dk/~danvy/tdpe-ln.pdf>

Olivier Danvy: Lecture notes on type-directed partial evaluation. The lecture notes are based on his POPL96 paper.

- We demonstrate a tagless-final embedding of the simply-typed
lambda-calculus with integers and constants, and its uniform
interpretations with three different evaluation strategies:
call-by-name, call-by-value and call-by-need. The three interpreters
share almost all the code, differing only in the interpretation of
object-language abstractions (
`lam`

-forms). The semantics of abstraction is indeed what sets the three evaluation orders apart. The examples also implement tracing of the interpreter actions, to make clear which expressions are evaluated, in which sequence and how many times. The interpreters are assuredly and patently type-preserving: the host language -- OCaml and Haskell -- guarantee that. We thus obtain a higher-order embedded domain-specific language with the selectable evaluation order.In Haskell, we define the DSL as a type class

`Symantics`

. The type class declaration specifies the syntax and its instances define the semantics of the language. Our DSL is typed. We use the Haskell type constant`Int`

to also denote the DSL type of integers (although we could have used a different name). DSL function types are represented in Haskell by the type synonym`Arr repr a b`

:class Symantics repr where int :: Int -> repr Int -- integer literals add :: repr Int -> repr Int -> repr Int -- addition sub :: repr Int -> repr Int -> repr Int -- and subtraction app :: repr (Arr repr a b) -> repr a -> repr b -- application type Arr repr a b = repr a -> repr b

This somewhat unusual representation of DSL functions permits the uniform implementation of evaluation strategies. The class`Semantics`

specifies the part of our calculus that is interpreted the same way in all evaluation strategies. Only DSL abstractions are interpreted differently; therefore, we make them a distinct DSL `feature', defined by its own type class:class SymLam repr where lam :: (repr a -> repr b) -> repr (Arr repr a b)

After introducing the convenient `macro'`let_`

(which could have been called `bind') we write a sample object language term as follows:let_ :: (Symantics repr, SymLam repr) => repr a -> (repr a -> repr b) -> repr b let_ x y = (lam y) `app` x -- The representation of the lambda-calculus term -- (\z x -> let y = x + x in y + y) (100 - 10) (5 + 5) t2 :: (Symantics repr, SymLam repr) => repr Int t2 = (lam $ \z -> lam $ \x -> let_ (x `add` x) $ \y -> y `add` y) `app` (int 100 `sub` int 10) `app` (int 5 `add` int 5)

The Haskell expression`t2`

represents the lambda-calculus term`(\z x -> let y = x + x in y + y) (100 - 10) (5 + 5)`

; the variable`z`

does not occur in the body of the abstraction whereas`x`

occurs twice.We are ready to write interpreters for our DSLs. We will interpret DSL expressions of the type

`a`

as Haskell values of the type`m a`

where`m`

is some`MonadIO`

.`IO`

is needed to print the trace of interpreter actions. The representation is common to the three evaluation strategies. To distinguish them (when interpreting`lam`

later on) we add a phantom type parameter, the label`l`

. It will take the values`Name`

,`Value`

and`Lazy`

; more strategies may be added later on.newtype S l m a = S { unS :: m a } deriving (Monad, Applicative, Functor, MonadIO)

Here is the code common to all interpreters regardless of the evaluation strategy. The reasons to parameterize the interpreter over

`MonadIO`

is to print out the evaluation trace, so that we can see the difference among the evaluation strategies in the number of performed additions and subtractions.instance MonadIO m => Symantics (S l m) where int = return add x y = do a <- x b <- y liftIO $ putStrLn "Adding" return (a + b) sub x y = do a <- x b <- y liftIO $ putStrLn "Subtracting" return (a - b) app x y = x >>= ($ y)

Call-by-name is defined by the following interpretation of DSL abstractions:data Name -- The label instance Monad m => SymLam (S Name m) where lam f = return f

The abstraction, when applied, will receive an unevaluated argument expression -- of the type`repr a`

, or`S Name m a`

-- using (substituting) it as it is.We evaluate the sample term under call-by-name

runName :: S Name m a -> m a runName x = unS x t2SN = runName t2 >>= print

obtaining the result 40 and observing from the trace that subtraction was not performed: the value of`int 100 `sub` int 10`

was not needed to compute the result of`t2`

. On the other hand, the sub-expression`int 5 `add` int 5`

was evaluated four times.Call-by-value differs from call-by-name only in the interpretation of abstractions:

data Value instance Monad m => SymLam (S Value m) where -- lam f = return (\x -> (f . return) =<< x) -- or, in the pointless notation, as below: lam f = return (f . return =<<)

As in the call-by-name evaluator, the lambda-abstraction receives an unevaluated argument expression -- evaluating it before passing its result to the abstraction body`f`

. This is literally the definition of call-by-value. The very same sample term`t2`

can be interpreted differently:runValue :: S Value m a -> m a runValue x = unS x t2SV = runValue t2 >>= print

giving in the end the same result 40. Although the result of the subtraction`int 100 `sub` int 10`

in`t2`

was not needed, the trace shows it performed. On the other hand, the argument sub-expression`int 5 `add` int 5`

was evaluated only once. In call-by-value, arguments are evaluated exactly once.The call-by-need evaluator differs from the others again in one line, the interpretation of abstractions:

data Lazy instance MonadIO m => SymLam (S Lazy m) where -- lam f = return (\x -> f =<< share x) -- Or, in the pointless notation lam f = return ((f =<<) . share)

The evaluation of the body of the abstraction always starts by lazy sharing the argument expression. (The implementation of`share`

is straightforward; see the code for details.) Again, this is the definition of call-by-need. We run the very same term`t2`

with the new evaluator, obtaining the same result 40 and observing from the execution trace that subtraction was not evaluated (because it was not needed) but the needed argument expression`int 5 `add` int 5`

was evaluated once. In call by need, application arguments are evaluated*at most*once. **Version**- The current version is 1.2, March 17, 2010
**References**- CBAny.hs [7K]

The complete code with examples and testscall_by_any.ml [7K]

The OCaml version of the code, using modules and functorsCB.hs [7K]

The older version of the code

It was originally posted as ``CBN, CBV, Lazy in the same final tagless framework'' on the Haskell-Cafe mailing list on Thu, 8 Oct 2009 00:54:14 -0700 (PDT).CB98.hs [7K]

The same code in Haskell98, proving that the tagless final approach indeed requires fewer fancy type system features. The fancy features like the type families, used in CB.hs, add convenience and gloss. One can do without them however. The code can be (and has been) easily re-written in OCaml, which has no typeclasses, type families and other bleeding-edge features.Parameterized syntax: interpreters without run-time interpretive overhead

Evaluating the same Scheme source language expression using call-by-value, call-by-reference, and call-by-name evaluation strategies.

- Lambek calculus is a resource-sensitive calculus introduced by Lambek
in 1958, almost three decades before linear logic. Like linear logic,
Lambek calculus does not have the weakening rule. In fact, in the
non-associative Lambek calculus NL, the antecedent of a sequent is a
tree and there are
*no*structural rules at all. Lambek calculus is hence the purest, simplest, and the earliest substructural logic. Adding the associativity and commutativity rules (that is, treating the antecedent as a sequence rather than a tree, and allowing exchange) turns Lambek calculus into a fragment of the Multiplicative Linear Logic (MILL).The most noticeable difference of Lambek calculus from the conventional or linear lambda-calculus is its

*directional*implications and abstractions. There are two function types typically written as`B / A`

and`A \ B`

, called left/right slash-types rather than arrow types. The function of the type`B / A`

accepts the argument of the type`A`

on the right;`A \ B`

accepts the`A`

argument on the left. There are hence two rules for eliminating implications and, correspondingly, two rules for introducing them, which bring in the power of hypothetical reasoning. Although the NL calculus per se has no structural rules, various NL theories add so-called structural postulates: the ways to rearrange the antecedent structure in particular limited ways. All these features set Lambek calculi even farther apart from the Haskell type system. And yet it can be embedded in Haskell, in the tagless-final approach. All and only valid derivations are represented by Haskell values of a particular`repr`

type. One tagless-final interpreter prints the ``yield'', the constants used in the derivation. In linguistic applications the yield spells out the sentence whose parse is represented by the derivation. Other interpreters transform the derivation to a logical formula that stands for the meaning of the sentence.A so-called symmetric, Lambek-Grishin calculus has, in addition to directional implications, directional co-implications and interesting symmetric structural rules for moving between the antecedent and the consequent structures. It too can be represented in the tagless-final style. The semantic interpretation builds the meaning formula in the continuation-passing style.

**References**- Michael Moortgat: Typelogical Grammar

The Stanford Encyclopedia of Philosophy (Spring 2014 Edition), Edward N. Zalta (ed.)

<http://plato.stanford.edu/entries/typelogical-grammar/>Lambek Grammars as an embedded DSL: a tool for semanticists

Embedding of Lambek Grammars based on the associative Lambek calculusHOCCG.hs [28K]

Explanation of the code

Non-associative Lambek calculus NL with the non-standard semantic interpretation. Applications to quantification, non-canonical coordination and gapping.LG.hs [10K]

Tagless-final embedding of the Lambek-Grishin symmetric calculus and its 1-CPS translation

Our starting point is the regular CBV CPS translation for lambda-LG, described on p. 697 of the paper by Michael Moortgat ``Symmetric Categorial Grammar''. J. Philos. Logic, 2009. The original translation (eq (20) of the paper) yields many administrative beta redices. The present translation uses lightweight staging to remove such redices in the process.

- Sandro Magi shows that different interpretations of the same DSL term may not only involve different run-time systems but also occur on different hosts.
**Version**- The current version is June 23, 2009
**References**- <http://higherlogics.blogspot.com/2009/06/mobile-code-in-c-via-finally-tagless.html>

The article by Sandro Magi