Higher-kinded polymorphism -- the abstraction over a type
*constructor* to be later supplied with arguments -- is often needed,
for expressing generic operations over collections or embedding typed
DSLs, particularly in tagless-final style. Typically, the abstracted type
constructor is not arbitrary, but must implement a particular
interface (e.g., an abstract sequence) -- so-called bounded
polymorphism.

OCaml does not support higher-kinded polymorphism directly: OCaml type variables range over types rather than type constructors, and type constructors may not appear in type expressions without being applied to the right number of arguments. Nevertheless, higher-kinded polymorphism is expressible in OCaml -- in fact, in several, more or less cumbersome ways. The less cumbersome ways are particularly less known, and kept being rediscovered. This page summarizes the different ways of expressing, and occasionally avoiding, higher-kinded polymorphism. They are collected from academic papers and messages on the caml-list spread over the years -- and adjusted to fit the story and differently explained.

- Introduction
- Why higher-kinded polymorphism is not supported directly in OCaml
- Higher-kinded functions as Functors
- Yallop and White's Lightweight higher-kinded polymorphism
- Sidestepping higher-kinded polymorphism
- Algebras
- What's in a higher-kinded type name
- Conclusions

- ``Polymorphism abstracts types, just as functions abstract
values. Higher-kinded polymorphism takes things a step further,
abstracting both types and types constructors, just as higher-order
functions abstract both first-order values and functions.'' -- write
Yallop and White (FLOPS 2014).
This remarkably concise summary is worth expounding upon, to demonstrate how (bounded) higher-kinded polymorphism tends to arise. The example introduced here is used all throughout the page.

Summing up numbers frequently occurs in practice; abstracting from concrete numbers leads to a function -- an operation that can be uniformly performed on any collection (list) of numbers:

let rec sumi : int list -> int = function [] -> 0 | h::t -> h + sumi t

We may further abstract over`0`

and the operation`+`

, which itself is a function (a parameterized value, so to speak). The result is a higher-order function:let rec foldi (f: int->int->int) (z: int) : int list -> int = function [] -> z | h::t -> f h (foldi f z t)

Folding over a list, say, of floating-point numbers proceeds similarly, so we may abstract yet again -- this time not over values but over the type`int`

, replacing it with a type variable:let rec fold (f: 'a->'a->'a) (z: 'a) : 'a list -> 'a = function [] -> z | h::t -> f h (fold f z t)

thus giving us the polymorphic function: the function that describes an operation performed over lists of various types, uniformly. The operation`f`

and the value`z`

can be collected into a parameterized recordtype 'a monoid = {op: 'a->'a->'a; unit: 'a}

The earlier`fold`

then takes the formlet rec foldm (m: 'a monoid) : 'a list -> 'a = function [] -> m.unit | h::t -> m.op h (foldm m t)

When using`foldm`

on a concrete list of the type`t list`

, the type variable`'a`

gets instantiated to the type`t`

of the elements of this list. The type is not completely arbitrary, however: there must exist the value`t monoid`

, to be passed to`foldm`

as the argument. We say the type`t`

must (at least) implement/support the`'a monoid`

interface; the`t monoid`

value is then the witness that`t`

indeed does so. Hence the polymorphism in`foldm`

is*bounded*.*Exercise:*if`'a monoid`

really describes a monoid,`op x unit = x`

holds. Write a more optimal version of`foldm`

(and its subsequent variants) taking advantage of this identity.A file, a string, an array, a sequence -- all can be folded over in the same way. Any collection is foldable so long as it supports the deconstruction operation, which tells if the collection is empty, or gives its element and the rest of the sequence. One is tempted to abstract again -- this time not over a mere type like

`int`

or`int list`

, but over a type constructor such as`list`

, and introducetype ('a,'F) seq = {decon: 'a 'F -> ('a * 'a 'F) option}

This is a hypothetical OCaml: the type variable 'F (with the upper-case name) is to be instantiated not with types but one-argument type constructors: technically, one says it has the*higher-kind*`* -> *`

rather than the ordinary kind`*`

of types and ordinary type variables such as`'a`

. The record`seq`

is, hence, higher-kind polymorphic. The function`foldm`

then generalizes tolet rec folds (m: 'a monoid) (s: ('a,'F) seq) : 'a 'F -> 'a = fun c -> match s.decon c with None -> m.unit | Some (h,t) -> m.op h (folds m s t)

Again,`'F`

is instantiated not with just any type constructor, but only that for which we can find the value`('a,'F) seq`

; thus`folds`

exhibits*bounded higher-kinded*polymorphism.Alas, higher-kind type variables are not possible in OCaml. The next section explains why. The following sections tell what we can do in OCaml instead. There are several alternatives. In some, the end result ends up looking almost exactly as the above imagined higher-kind--polymorphic code.

- Higher-kind type variables are not supported in OCaml. Yallop and
White's FLOPS 2014 paper (Sec. 1.1) explains why: in a word, type
aliasing. For completeness, we recount their explanation here, with
modifications.
Consider the following two modules:

module Tree = struct type 'a t = Leaf | Branch of 'a t * 'a * 'a t end module TreeA : dcont = struct type 'a t = ('a * 'a) Tree.t end

Here,`'a Tree.t`

is a data type: a fresh type, distinct from all other existing types. On the other hand,`'a TreeA.t`

is an alias: as its declaration says, it is equal to an existing type, viz.`('a * 'a) Tree.t`

.Suppose OCaml had higher-kind

`* -> *`

type variables, such as`'F`

hypothesized in the previous section. Type checking is, in the end, solving/checking type equalities, such as`'a 'F = 'b 'G`

. If higher-kind type variables ranged only over*data type*constructors, the solution is easy:`'a = 'b`

and`'F = 'G`

: a data type is fresh, hence equal only to itself. This is the situation in Haskell. To ensure that only data type constructors can be substituted for higher-kind type variables, a Haskell compiler keeps track of type aliases, even across module boundaries. Module system in Haskell is rather simple, so such tracking is unproblematic.Module system of ML is, in contrast, sophisticated. It has functors, signatures, etc., and extensively relies on type aliases, for example:

module F(T: sig type 'a t val empty: 'a t end) = struct type 'a ft = 'a T.t end

If we preclude substitution of type aliases for higher-kind type variables, we severely restrict expressiveness. For example,`'a ft`

above is a type alias; hence`F(TRee).ft`

cannot be substituted for a higher-kind type variable, even though one may feel`F(TRee).ft`

is the same as`Tree.t`

, which is substitutable.On the other hand, if we allow type aliases to be substituted for higher-kind type variables, the equivalence of

`'a 'F = 'b 'G`

and`'a = 'b, 'F = 'G`

breaks down. Indeed, consider`(int*int) 'F = int 'G`

. This equation now has the solution:`'F = Tree.t`

and`'G = TreeA.t`

. Parameterized type aliases like`'a TreeA.t`

are type functions, and type expressions like`int TreeA.t`

are applications of those functions, expanding to the right-hand-side of the alias declaration with`'a`

substituted for`int`

. Thus, with type aliases, the type equality problem becomes the higher-order unification problem, which is not decidable. **References**- Jeremy Yallop and Leo White: Lightweight higher-kinded polymorphism. FLOPS 2014.

- Although OCaml does not support higher-kind type variables, higher-kinded
polymorphism is not out of the question. There are other ways of
parameterizing by a type constructor: the module system (functor)
abstraction is the first to come to mind.
It is however rather verbose and cumbersome. Let us see.
We now re-write the hypothetical higher-kind--polymorphic OCaml code at the end of [Introduction] in the real OCaml -- by raising the level, so to speak, from term-level to module-level. The hypothetical record

type ('a,'F) seq = {decon: 'a 'F -> ('a * 'a 'F) option}

becomes the module signaturemodule type seq_i = sig type 'a t (* sequence type *) val decon : 'a t -> ('a * 'a t) option end

which represents the higher-kind type variable`'F`

, not supported in OCaml, with an ordinary type constructor`t`

(type constant). Different implementations of`seq_i`

(see, e.g.,`ListS`

below) instantiate`'a t`

in their own ways; hence`t`

does in effect act like a variable. The hypothetical higher-kind--polymorphic functionlet rec folds (m: 'a monoid) (s: ('a,'F) seq) : 'a 'F -> 'a = fun c -> match s.decon c with None -> m.unit | Some (h,t) -> m.op h (folds m s t)

becomes the functor, parameterized by the`seq_i`

signature:module FoldS(S:seq_i) = struct let rec fold (m: 'a monoid) : 'a S.t -> 'a = fun c -> match S.decon c with None -> m.unit | Some (h,t) -> m.op h (fold m t) end

We got what we wanted: abstraction over a sequence. To use it to define other higher-kinded polymorphic functions, such as`sums`

to sum up a sequence, we also need functors. Functors are infectious, one may say.module SumS(S:seq_i) = struct open S open FoldS(S) let sum : int t -> int = fold monoid_plus end

Finally, an example of instantiating and using higher-kind--polymorphic functions: summing a list. First we need an instance of

`seq_i`

for a list: the witness that a list is a sequence.module ListS = struct type 'a t = 'a list let decon = function [] -> None | h::t -> Some (h,t) end

which we pass to the`SumS`

functor:let 6 = let module M = SumS(ListS) in M.sum [1;2;3]

The accompanying code shows another example: using the same`SumS`

to sum up an array, which also can be made a sequence.Thus in this approach, all higher-kind--polymorphic functions are functors, which leads to verbosity, awkwardness and boilerplate. For example, we cannot even write a

`SumS`

application as`SumS(ListS).sum [1;2;3]`

; we have to use the verbose expression above. **References**- HKPoly_seq.ml [11K]

The complete code with tests and other examples

- Perhaps surprisingly, higher-kinded polymorphism can always be reduced
to the ordinary polymorphism, as Yallop and White's FLOPS 2014 paper
cleverly demonstrated. They explained their approach as
defunctionalization. Here we recap it and explain in a different way.
Consider the type

`'a list`

again. It is a parameterized type:`'a`

is the type of elements, and`list`

is the name of the collection: `the base name', so to speak. The combination of the element type and the base name can be expressed differently, for example, as`('a,list_name) app`

, where`('a,'b) app`

is some fixed type, and`list_name`

is the ordinary type that tells the base name. The fact that the two representations are equivalent is witnessed by the bijection:inj: 'a list -> ('a,list_name) app prj: ('a,list_name) app -> 'a list

Here is a way to implement it. First, we introduce the dedicated `pairing' data type. It is extensible, to let us define as many pairings as needed.

type ('a,'b) app = ..

For`'a list`

, we have:type list_name type ('a,'b) app += List_name : 'a list -> ('a,list_name) app

In this case the bijection`'a list <-> ('a,list_name) app`

is:let inj x = List_name x and let prj (List_name x) = x

and the two functions are indeed inverses of each other.*Exercise:*Actually, that the above`inj`

and`prj`

are the inverses of each other is not as straightforward. It requires a side-condition, which is satisfied in our case. State it.In this new representation of the polymorphic list as

`('a,list_name) app`

, the base name`list_name`

is the ordinary (kind`*`

) type. Abstraction over it is straightforward: replacing with a type variable. The base-name-polymorphism is, hence, the ordinary polymorphism. We can then write the desired sequence-polymorphic`folds`

almost literally as the hypothetical code at the end of [Introduction]:type ('a,'n) seq = {decon: ('a,'n) app -> ('a * ('a,'n) app) option} let rec folds (m: 'a monoid) (s: ('a,'n) seq) : ('a,'n) app -> 'a = fun c -> match s.decon c with None -> m.unit | Some (h,t) -> m.op h (folds m s t)

Instead of`'a 'F`

we write`('a,'n) app`

. That's it. Using`folds`

in other higher-kinded functions is straightforward, as if it were a regular polymorphic function (which it actually is):let sums s c = folds monoid_plus s c (* val sums : (int, 'a) seq -> (int, 'a) app -> int = <fun> *)

Type annotations are not necessary: the type inference works. Here is a usage example, summing a list:let list_seq : ('a,list_name) seq = {decon = fun (List_name l) -> match l with [] -> None | h::t -> Some (h,List_name t)} let 6 = sums list_seq (List_name [1;2;3])

There is still a bit of awkwardness remains: the user have to think up the base name like

`list_name`

and the tag like`List_name`

, and ensure uniqueness. Yallop and White automate using the module system, see the code accompanying this page, or Yallop and White's paper (and the Opam package `higher').We shall return to Yallop and White's approach later on this page, with another perspective and implementation.

**References**- Jeremy Yallop and Leo White: Lightweight higher-kinded
polymorphism. FLOPS 2014.
HKPoly_seq.ml [11K]

The complete code with tests and other examples

- At times, higher-kinded polymorphism can be avoided altogether: upon
close inspection it may turn out that the problem at hand does not
actually require higher-kinded polymorphism. In fact, our running
example is such a problem.
Let us examine the sequence interface, parameterized both by the type of the sequence elements and the sequence itself. The definition that first comes to mind, which cannot be written as such in OCaml, is (from Introduction):

type ('a,'F) seq = {decon: 'a 'F -> ('a * 'a 'F) option}

It has a peculiarity: the sole operation`decon`

consumes and produces sequences of the same type`'a 'F`

(i.e., the same sort of sequence with the elements of the same type). That is,`'F`

always occurs as the type`'a 'F`

, where`'a`

is`seq`

's parameter:`'a`

and`'F`

do not vary independently. Therefore, there is actually no higher-kinded polymorphism here. The sequence interface can be written simply astype ('a,'t) seq = {decon: 't -> ('a * 't) option}

with`folds`

taking*exactly*the desired form:let rec folds (m: 'a monoid) (s: ('a,'t) seq) : 't -> 'a = fun c -> match s.decon c with None -> m.unit | Some (h,t) -> m.op h (folds m s t)

It is the ordinary polymorphic function. There is no problem in using it to define other such sequence-polymorphic functions, e.g.:let sums s c = folds monoid_plus s c (* val folds : 'a monoid -> ('a, 't) seq -> 't -> 'a = <fun> *)

and applying it, say, to a list:let list_seq : ('a,'a list) seq = {decon = function [] -> None | h::t -> Some (h,t)} let 6 = sums list_seq [1;2;3]

*Exercise:*Consider the interface of collections that may be `mapped', in the hypothetical OCaml with higher-kind type variables:type ('a,'b,'F) ftor = {map: ('a->'b) -> ('a 'F -> 'b 'F)}

Now`'F`

is applied to different types. Can this interface be still expressed using the ordinary polymorphism, or higher-kinded polymorphism is really needed here?Looking very closely at the higher-kinded polymorphic interface

`('a,'F) seq`

and the ordinary polymorphic`('a,'t) seq`

, one may notice that the latter is larger. The higher-kinded interface describes only polymorphic sequences such as`'a list`

, whereas`('a,'t) seq`

applies also to files, strings, buffers, etc. Such an enlargement is welcome here: we can apply the same`folds`

to sequences whose structure is optimized for the type of their elements. In Haskell terms,`('a,'t) seq`

corresponds to `data families', a later Haskell extension. Here is an example, of applying`folds`

to a string, which is not a polymorphic sequence:let string_seq : (char,int*string) seq = {decon = fun (i,s) -> if i >= String.length s || i < 0 then None else Some (s.[i], (i+1, s))} let 'c' = folds monoid_maxchar string_seq (0,"bca")

We can hence use`folds`

with any collection, polymorphic or not, for which there is an implementation of the`('a,'t) seq`

interface. We have encountered the old, and very useful trick: enlarging the type but restricting the set of its values by having to be able to define `witnesses' such as`('a,'t) seq`

.*Exercise:*Yallop and White's approach can also deal with non-polymorphic collections. Use it to implement`string_seq`

. **References**- HKPoly_seq.ml [11K]

The complete code with tests and other examples

- The running example, the
`seq`

interface, was about deconstruction of sequences: technically, about co-algebras. Let us now turn to construction: building of values using a fixed set of operations, which can be considered an embedded DSL. The abstraction over a DSL implementation gives rise to polymorphism. If the embedded DSL is typed, the polymorphism becomes higher-kinded -- as commonly seen in DSL embeddings in tagless-final style.Here we briefly recount how the higher-kinded polymorphism arises in DSL embeddings, and how it can be hidden away. The key idea is initial algebra, which is, by definition, the abstraction over any concrete algebra of the same signature, i.e., the abstraction over DSL implementations.

Our running example in this section is a simple programming language with integers and booleans: a dialect of the language used in Chap. 3 of Pierce's `Types and Programming Languages' (TAPL). Here is the familiar tagless-final embedding in OCaml. The grammar of the language is represented as an OCaml signature:

module type sym = sig type 'a repr val int : int -> int repr val add : int repr -> int repr -> int repr val iszero : int repr -> bool repr val if_ : bool repr -> 'a repr -> 'a repr -> 'a repr end

The language is typed; therefore, the type`'a repr`

, which represents DSL terms, is indexed by the term's type: an`int`

or a`bool`

. The signature`sym`

also defines the type system of the DSL: almost like in TAPL, but with the typing rules written in a vertical-space--economic way.Here is a sample term of the DSL:

module SymEx1(I:sym) = struct open I let t1 = add (add (int 1) (int 2)) (int 3) (* intermediate binding *) let res = if_ (iszero t1) (int 0) (add t1 (int 1)) end

It is written as a functor parameterized by`sym`

: a DSL implementation is abstracted out. The term is polymorphic over`sym`

and, hence, may be evaluated in any implementation of the DSL. Since`sym`

contains a higher-kinded type`repr`

, the polymorphism is higher-kinded.The just presented (tagless-final) DSL embedding followed the approach described in [Higher-kinded functions as Functors]. Let us move away from functors to ordinary terms. Actually, we never quite escape functors, but we hide them in terms, relying on first-class modules. As we have seen, a DSL term of the type

`int`

such as`SymEx1`

is the functorfunctor (I:sym) -> sig val res : int I.repr end

To abstract over`int`

, we wrap it into a modulemodule type symF = sig type a module Term(I:sym) : sig val res : a I.repr end end

which can then be turned into ordinary polymorphic type:type 'a sym_term = (module (symF with type a = 'a))

which lets us represent the functor`SymEx1`

as an ordinary OCaml value:let sym_ex1 : _ sym_term = (module struct type a = int module Term = SymEx1 end)

Here, the type annotation is needed. However, we let the type of the term to be`_`

, as a schematic variable. OCaml infers it as`int`

. If we have an implementation of`sym`

, say, module`R`

, we can use it to run the example (and obtain the`sym_ex1`

's value in`R`

's interpretation):let _ = let module N = (val sym_ex1) in let module M = N.Term(R) in M.res

The type

`'a sym_term`

can itself implement the`sym`

signature, in a `tautological' sort of way:module SymSelf : (sym with type 'a repr = 'a sym_term) = struct type 'a repr = 'a sym_term let int : int -> int repr = fun n -> let module M(I:sym) = struct let res = I.int n end in (module struct type a = int module Term = M end) let add : int repr -> int repr -> int repr = fun (module E1) (module E2) -> let module M(I:sym) = struct module E1T = E1.Term(I) module E2T = E2.Term(I) let res = I.add (E1T.res) (E2T.res) end in (module struct type a = int module Term = M end) ... end

That was a mouthful. But writing`sym`

DSL terms becomes much easier, with no functors and no type annotations. The earlier`sym_ex1`

can now be written aslet sym_ex1 = let open SymSelf in let t1 = add (add (int 1) (int 2)) (int 3) in (* intermediate binding *) if_ (iszero t1) (int 0) (add t1 (int 1))

It can be evaluated in`R`

or other implementation as shown before.Technically,

`SymSelf`

is the initial algebra: an implementation of the DSL that can be mapped to any other implementation, and in a unique way. That means its terms like`sym_ex1`

can be evaluated in any`sym`

DSL implementation: they are polymorphic over DSL implementation.On the down-side, we have

`SymSelf`

, which is the epitome of boilerplate: utterly trivial and voluminous code that has to be written. On the up side, writing DSL terms cannot be easier: no type annotations, no functors, no implementation passing -- and no overt polymorphism, higher-kind or even the ordinary kind. Still, the terms can be evaluated in any implementation of the DSL.*Exercise:*Apply Yallop and White's method to this DSL example. Hint: the first example in Yallop and White's paper, monad representations, is an example of a DSL embedding in tagless-final style. **References**- HKPoly_tf.ml [13K]

The complete code with tests and detailed developmentInitial Algebra

The initial algebra construction using first-class functors, in the case of one-sorted algebras (corresponding to untyped DSLs)Stephen Dolan: phantom type. Message on the caml-list posted on Mon, 27 Apr 2015 12:51:11 +0100

- We now look back at Yallop and White's approach of reducing
higher-kinded polymorphism to the ordinary polymorphism, from a
different perspective. It gives if not a new insight, at least new
implementations.
A polymorphic type like

`'a list`

represents a family of types, indexed by a type (of list elements, in this example). A higher-kinded type abstraction such as`'a 'F`

with the hypothetical (in OCaml) higher-kind type variable`'F`

is the abstraction over a family name, so to speak, while still keeping track of the index. Here is another way of accomplishing such an abstraction.Consider the existential type

`exists a. a list`

(realizable in OCaml in several ways, although not in the shown notation. We will keep the notation for clarity). The existential is now the ordinary, rank`*`

type and can be abstracted in a type variable, e.g.,`'d`

. The `family name' is, hence, the family type with the hidden index. We have lost track of the index, however. Therefore, we tack it back, ending up with the type`('a,'d) hk`

. Thus`(t,exists a. a list) hk`

is meant to be the same as`t list`

(for any type`t`

).There is a problem however:

`('a,'d) hk`

is a much bigger type. We need the condition that in`(t,exists a. a list) hk`

, the index`t`

is exactly the one that we hid in the existential quantification -- we need dependent pairs, not supported in OCaml. Remember the old trick, however: we may have a bigger type so long as we control the producers of its values and ensure only the values satisfying the condition are built. To be concrete, we must make certain that the only way to produce`('a,'d) hk`

values is by using functions like`inj: 'a list -> ('a, exists a. a list) hk`

that expose the same index they hide. At some point the type checker will demand a proof: when implementing the inverse mapping`('a, exists a. a list) hk -> 'a list`

and extracting the list out of the existential. There are several ways of going about the proof.The simplest is to give our word -- that the condition always holds for all

`('a,'d) hk`

values actually produced, and we have a proof of that on some piece of paper or in a`.v`

file. This leads to the exceptionally simple implementation, which does nothing at all (all of its operations are the identity).module HK : sig type ('a,'d) hk (* abstract *) module MakeHK : functor (S: sig type 'a t end) -> sig type anyt (* also abstract *) val inj : 'a S.t -> ('a,anyt) hk val prj : ('a,anyt) hk -> 'a S.t end end = struct type ('a,'d) hk = 'd module MakeHK(S:sig type 'a t end) = struct type anyt = Obj.t let inj : 'a S.t -> ('a,anyt) hk = Obj.repr let prj : ('a,anyt) hk -> 'a S.t = Obj.obj end end

The accompanying code shows a different, also quite simple implementation without any`Obj`

magic.After enriching the

`sym`

signature of the DSL from the previous section with fake higher-kinded types:module type sym_hk = sig include sym include module type of HK.MakeHK(struct type 'a t = 'a repr end) end

we can write the earlier`SymEx1`

example as a function (a term) rather than a functor:let sym_ex1 (type d) (module I:(sym_hk with type anyt=d)) : (_,d) HK.hk = let open I in let t1 = add (add (int 1) (int 2)) (int 3) |> inj in (* intermediate term *) let res = if_ (iszero t1) (int 0) (add t1 (int 1)) in inj res

It can be evaluated simply assym_ex1 (module RHK) |> RHK.prj

where`RHK`

is a module implementing`sym_hk`

. Incidentally, if`SHK`

is another module implementing`sym_hk`

and we attempt`sym_ex1 (module RHK) |> SHK.prj`

, we discover that`(int,RHK.anyt) bk`

and`(int,SHK.anyt) bk`

are actually different types. Although`HK`

does not do anything (at runtime), it does maintain safety and soundness. **References**- HKPoly_tf.ml [13K]

The complete code with tests and detailed development

- We have surveyed various ways of abstracting over a type
constructor -- or, writing interface-parameterized terms
when the interface involves a polymorphic type. Even if the
language does not support type-constructor--polymorphism directly,
such interface parameterization can still be realized, as:
- interface abstraction as a functor abstraction
- reducing higher-kinded polymorphism to ordinary polymorphism, by establishing a bijection between type constructors and ordinary types
- hiding the polymorphism over DSL implementations behind initial algebra (if the interface is algebraic, as often happens in tagless-final DSL embeddings)
- and in some problems, higher-kinded polymorphism is not actually needed, on close inspection