This is a joint work with Chung-chieh Shan. We describe
several approaches to *lightweight* dependent-type
programming, letting us gain experience with dependent types on
existing programming language systems, such as the available Haskell
or ML compilers.

All these lightweight approaches rely on type-level proxies for values, so we can statically express properties (e.g., equality, inequality) of the values that are not generally known until the run time.

``This much is clear: many programmers are already finding practical uses for the approximants to dependent types which mainstream functional languages (especially Haskell) admit, by hook or by crook.''[WDTM].

In general, dependent type is a type that is predicated upon
a (dynamic) value [WDTM], [LtU-DT]. Most commonly,
this dependency of a type takes the form of a term indexing
within a family of types. As Neel Krishnaswami well put it,
``dependent types are the Curry-Howard interpretation of first-order
logic'' [LtU-DT]. That is, propositions correspond to
types, and so a formula `P(n)`

expressing an assertion
about a particular element `n`

in the domain of discourse
corresponds to a type that depends on the particular (dynamic) value --
or on an expression, term, that will compute that dynamic value.
Therefore, any program specification that can be written in
first-order logic can be expressed as a typechecking
constraint. ``Dependently typed programs are, by their nature, proof
carrying code'' [WDTM].

The obvious main issue with dependent typechecking is deciding type equality. In the type indexing approach, types may include terms, and so deciding if two types are equal requires determination if two terms compute (or denote) the same value. The extensional equality of terms is, in presence of general recursion, generally undecidable. The issue of dependent types and general recursion, and various solutions, are elaborated in [WDTM].

Because most (statically typed) programming
languages do not permit value expressions appear in
type expressions, one may think that these languages cannot encode in
their types the fact, e.g., that some proposition `P(n)`

holds over all or some integers `n`

. The type system of
Haskell and ML permits types (predicates) that are parameterized over
type variables. However, the type variables range over types -- sets of
values -- rather than over individual values in the domain of
discourse.

Abstract data types, however, already offer the simplest way to encode
*some* propositions about values. For example, in OCaml,
evaluating

module S1 = Set.Make(struct type t = int let compare x y = if x = y then 0 else if x > y then -1 else 1 end);;yields

module S1 : sig type elt = int type t val empty : t val is_empty : t -> bool val mem : elt -> t -> bool val add : elt -> t -> t val singleton : elt -> t ... endwhere the type of the set,

`S1.t`

, is abstract. The only
way to obtain values of that type is to use the members of the module
`S1`

. That is, whatever set-related invariant is enforced
by the code of module `S1`

, remains in force throughout the
program. In particular, if `S1`

enforces the invariants P(n) === S1.mem n (S1.singleton n) = true Q(n) === forall m \in int. m /= n -> S1.mem m (S1.singleton n) = falsethese invariants hold all throughout, for

`S1.t`

represents, at the level
of types, the above properties of integers. The `Q(n)`

is actually quite strong and cannot be
efficiently represented by a run-time contract, due to the `forall m \in int`

part.
Of course the invariant actually enforced by an implementation of the data type may differ from the invariant we think it enforces. Thus an implementation becomes a ``trusted kernel'', which has to be verified, using formal tools if necessary. It goes without saying that verifying an implementation of the Set interface is far easier than verifying the whole program that uses Set.

It should be pointed out that guarantees of abstract data types come from the lack of reflection, extension, and introspection of abstract data types. OCaml, incidentally, is weaker in this respect because polymorphic equality can break some of the invariants. So, reflection, extension and introspection facilities of a language can be viewed as vices rather than virtues.

The paper and the talk on lightweight static capabilities (see below) discuss this topic in more (formal) detail. The idea of abstract data type's statically representing and enforcing sophisticated properties of run-time values was Robin Milner's principal idea in the design of Edinburgh LCF back in the early 1970s.

The type system of Haskell or ML lets us express propositions
of the form `P(n)`

where `n`

is either a type or
a type variable that ranges over types. Seemingly, we cannot express a
proposition `P(n)`

where `n`

ranges over
individual members of a set (e.g., individual integers) rather than
over sets (i.e., types of integers). This inexpressibility is superficial
however: if we define particular types such that the corresponding
sets of values consist of only one value, the conceptual barrier
between elements and sets of elements disappears. For example:

data Zero = Zero data Succ a = Succ a

Here, there is only one (non-bottom) value of the type `Zero`

, there is only one (non-bottom) value of the type `Succ (Succ Zero)`

, etc. There is the obvious correspondence between
the family of types induced by the two declarations and
non-negative integers. With more sophistication (well-formedness
constraints expressed in type classes) we can easily represent, in
types, propositions about ``integers''. Quite sophisticated
propositions can indeed be expressed this way: [Blume01],
[number-parameterized-types]. Singleton types have been
introduced by Hayashi (1991) and first used for dependent type
programming by Xi.

A language with higher-rank types or existential type
quantification lets us introduce type proxies for the values. We can
associate a value with `some`

type in such a way that type
equality entails value equality, even if the values in question are
not known statically (e.g., read from a file).

The paper [Implicit-Config] used singleton-type families to generate such proxies. The paper demonstrated, for example, how to write arithmetic expressions over implicit modulus and statically guarantee that within a (potentially, quite complex and impure expression) the modulus is the same -- even if the modulus is not known until the run time.

One can build type proxies for values without resorting to singleton types. Existential quantification suffices [branding]. This approach shares with abstract data types the insight that opaque, unforgeable types can represent propositions on values. This idea is extended and formalized in the next section.

This paper [lightweight-static-capabilities] is
written with Chung-chieh Shan, who presented it at the PLPV 2006
workshop. We describe a modular programming style that harnesses
modern type systems to verify safety conditions in practical systems.
This style has three ingredients: (i) A compact kernel of trust that
is specific to the problem domain; (ii) Unique names (*capabilities*) that confer rights and certify properties, so as to
extend the trust from the kernel to the rest of the application; (iii)
Static (type) proxies for dynamic values. We illustrate our approach
using examples from the dependent-type literature, but our
programs are written in Haskell and OCaml today, so our techniques are
compatible with imperative code, native mutable arrays, and general
recursion. The three ingredients of this programming style call
for (1) an expressive core language, (2) higher-rank polymorphism, and (3)
phantom types.

This paper demonstrates a lightweight notion of *static
capabilities* that brings together increasingly expressive type
systems and increasingly accessible program verification. Like many
programmers, we want to assure safety conditions: array indices remain
within bounds; modular arithmetic operates on numbers with the same
modulus; a file or database handle is used only while open; and so on.
The safety conditions protect objects such as arrays, modular numbers,
and files. Our overarching view is that a static capability
authorizes access to a protected object and simultaneously certifies
that a safety condition holds. Rather than proposing a new language
or system, our contribution is to substantiate the slogan that types
are capabilities, today: we use concrete and straightforward code in
Haskell and OCaml to illustrate that a programming language with an
appropriately expressive type system is a static capability
language.

The follow-up paper [lightweight-static-resources] describes further applications, for safe embedded and systems programming, ensuring, among other properties, proper alignment when accessing raw memory areas. The paper also introduces kind-level static capabilities to enforce invariants of type-level programming.

- The so-called `strict' type system that assures safety of list or array accesses is presented in [lsc-talk]. The system has a dependent-type flavor. The talk then shows how a relaxation of the system to the Hindley-Milner system with higher-ranked types can preserve the safety properties, provided branding operations are isolated in a separate kernel, and the kernel correctly implements the formally stated semantics of these operations.
- safety.elf [17K]

Twelf code with proofs of soundness of the `strict' type system for the language made of System F plus lists and non-empty lists. The progress theorem assures that a well-typed program shall not attempt to take the head or tail of an empty list. - safety-array.elf [32K]

Twelf code verifying manual proofs of soundness of the `strict' type system for the language made of System F plus `arrays' whose type reflects their size. The progress theorem states the safety property: a well-typed program shall not attempt to access an out-of-bounds element of an array.

- Binary search with safe array access and no array bound
checks: the example from Xi and Pfenning's PLDI 1998 paper.

eliminating-array-bound-check-literally.ml [12K]

The OCaml code patterned after Xi's Dependent ML code from the PLDI 1998 paper.

eliminating-array-bound-check-literally.hs [8K]

The corresponding Haskell code. The code is written in Haskell98 with a sole extension for higher-ranked types.

eliminating-array-bound-check.lhs [9K]

A slightly more general literate Haskell version that accounts for Haskell arrays with arbitrary lower and upper bounds. - Eliminating array bound checks in
*multiple*arrays

eliminating-mult-array-bound-check.lhs [13K]

folding over multiple arrays of various sizes. - KMP-deptype.hs [14K]

The Haskell code for Knuth-Morris-Pratt (packed) string matching. The safety of all string/array access operations is guaranteed statically. - reverse.ml [2K]

The OCaml code for the introductory example of the paper: list reversal with safe list access. - reverse.ml [2K]

The OCaml code for the introductory example of the paper: list reversal with safe list access.

The lightweight approaches depend on a trusted kernel. How to
make sure the trusted library deserves our trust? The same question
exists for any other dependent-type system: how we make sure that our
Oracle is correct, the typing rules are correct, and, mainly, that the
implementation of those rules in a *compiler* is correct?

I have heard a similar question asked of J. Strother Moore and J. Harrison. J. Strother Moore said that most of ACL2 is built by bootstrapping, from lemmas and strategies that ACL2 itself has proven. However, the core of ACL2 just has to be trusted. ACL2 has been used for quite a while and so there is a confidence in its soundness. Incidentally, both NSA and NIST found this argument persuasive, when they accepted proofs by ACL2 as evidence of high assurance, in awarding Orange book A1 and IFIPS 140-1 ratings -- the highest security ratings -- to some products.

In a great more detail, this issue is explored in Randy Pollack's ``How to believe a machine-checked proof''[Believing].

Incidentally, even in Mathematics, it is not at all resolved what exactly constitutes a mathematical proof and how much trust, including personal trust, is involved: [Theoretical-Math].

Section 2, ``Related work'' of [WDTM] gives a comprehensive list of genuine Dependent-Type systems, such as Cayenne, ATS, Epigram, DML, Omega.

[WDTM] Thorsten Altenkirch, Conor McBride, and James McKinna: Why Dependent Types Matter. April 2005.

<http://www.cs.nott.ac.uk/~txa/publ/ydtm.pdf>

[LtU-DT] Dependent types: literature, implementations and limitations ?

A thread on *Lambda the Ultimate* started on Nov 21, 2005

<http://lambda-the-ultimate.org/node/view/1129>

[Implicit-Config] Implicit configurations -- or, type classes reflect the values
of types

Joint work with Chung-chieh Shan.

[Blume01] Matthias Blume: No-Longer-Foreign: Teaching an ML compiler to speak C ``natively''

In BABEL'01: First workshop on multi-language infrastructure and
interoperability, September 2001, Firenze, Italy.

<http://people.cs.uchicago.edu/~blume/pub.html>

[lightweight-static-capabilities] lightweight-static-capabilities.pdf [334K]

Lightweight static capabilities. Joint work with Chung-chieh Shan.

Electr. Notes Theor. Comput. Sci, 174(7), pp. 79-104, 2007.

[lsc-talk] Lightweight static capabilities. Slides of the talk at the IJCAR'06 workshop ``Programming Languages meet Program Verification'' (PLPV'06). Seattle, WA, August 21, 2006.

<http://www.cs.rutgers.edu/~ccshan/capability/talk.pdf>

[number-parameterized-types] Number-parameterized types

[branding] Eliminating Array Bound Checking through Non-dependent types

[tagless-typed] Tagless typed staged interpreters with no dependent types

[tagless-typed-tc] Metatypechecking: Staged Typed Compilation into GADT via typeclasses. Computing the type from a value

[XiThesis] Hongwei Xi: Dependent Types in Practical Programming

Ph.D thesis, Carnegie Mellon University, September 1998.

<http://www.cs.bu.edu/~hwxi/>

[Believing] Randy Pollack: How to believe a machine-checked proof

<http://www.dcs.ed.ac.uk/home/rap/export/believing.ps.gz>

[lightweight-static-resources] Lightweight static resources: Typed memory areas and time-parameterized monads for safe embedded and systems programming

[Theoretical-Math] Arthur Jaffe, Frank Quinn: ``Theoretical mathematics'': Toward a
cultural synthesis of mathematics and theoretical physics

Bull.Am.Math.Soc. 29 (1993) 1-13.

<http://arxiv.org/abs/math.HO/9307227>

William P. Thurston: On Proof And Progress In Mathematics

Bull.Am.Math.Soc. 30 (1994) 167-177.

<http://arxiv.org/abs/math.HO/9404236>

Michael Atiyah, Armand Borel, G. J. Chaitin, Daniel Friedan,
James Glimm, Jeremy J. Gray, Morris W. Hirsch, Saunder MacLane, Benoit
B. Mandelbrot, David Ruelle, Albert Schwarz, Karen Uhlenbeck, Rene'
Thom, Edward Witten, Christopher Zeeman:
Responses to ``Theoretical Mathematics: Toward a cultural synthesis of
mathematics and theoretical physics'', by A. Jaffe and F. Quinn

Bull.Am.Math.Soc. 30 (1994) 178-207.

<http://arxiv.org/abs/math.HO/9404229>

Arthur Jaffe, Frank Quinn: Response To Comments On ``Theoretical Mathematics''

Bull.Am.Math.Soc. 30 (1994) 208-211.

<http://arxiv.org/abs/math/9404231>

This site's top page is **http://okmij.org/ftp/**

oleg-at-pobox.com or oleg-at-okmij.org

Your comments, problem reports, questions are very welcome!

Your comments, problem reports, questions are very welcome!

Converted from SXML by SXML->HTML