This is a joint work with Chung-chieh Shan. We describe several approaches to lightweight dependent-type programming, letting us gain experience with dependent types on existing programming language systems, such as the available Haskell or ML compilers.
All these lightweight approaches rely on type-level proxies for values, so we can statically express properties (e.g., equality, inequality) of the values that are not generally known until the run time.
``This much is clear: many programmers are already finding practical uses for the approximants to dependent types which mainstream functional languages (especially Haskell) admit, by hook or by crook.''[WDTM].
In general, dependent type is a type that is predicated upon
a (dynamic) value [WDTM], [LtU-DT]. Most commonly,
this dependency of a type takes the form of a term indexing
within a family of types. As Neel Krishnaswami well put it,
``dependent types are the Curry-Howard interpretation of first-order
logic'' [LtU-DT]. That is, propositions correspond to
types, and so a formula
P(n) expressing an assertion
about a particular element
n in the domain of discourse
corresponds to a type that depends on the particular (dynamic) value --
or on an expression, term, that will compute that dynamic value.
Therefore, any program specification that can be written in
first-order logic can be expressed as a typechecking
constraint. ``Dependently typed programs are, by their nature, proof
carrying code'' [WDTM].
The obvious main issue with dependent typechecking is deciding type equality. In the type indexing approach, types may include terms, and so deciding if two types are equal requires determination if two terms compute (or denote) the same value. The extensional equality of terms is, in presence of general recursion, generally undecidable. The issue of dependent types and general recursion, and various solutions, are elaborated in [WDTM].
Because most (statically typed) programming
languages do not permit value expressions appear in
type expressions, one may think that these languages cannot encode in
their types the fact, e.g., that some proposition
holds over all or some integers
n. The type system of
Haskell and ML permits types (predicates) that are parameterized over
type variables. However, the type variables range over types -- sets of
values -- rather than over individual values in the domain of
Abstract data types, however, already offer the simplest way to encode some propositions about values. For example, in OCaml, evaluating
module S1 = Set.Make(struct type t = int let compare x y = if x = y then 0 else if x > y then -1 else 1 end);;yields
module S1 : sig type elt = int type t val empty : t val is_empty : t -> bool val mem : elt -> t -> bool val add : elt -> t -> t val singleton : elt -> t ... endwhere the type of the set,
S1.t, is abstract. The only way to obtain values of that type is to use the members of the module
S1. That is, whatever set-related invariant is enforced by the code of module
S1, remains in force throughout the program. In particular, if
S1enforces the invariants
P(n) === S1.mem n (S1.singleton n) = true Q(n) === forall m \in int. m /= n -> S1.mem m (S1.singleton n) = falsethese invariants hold all throughout, for all integers. The abstract type
S1.trepresents, at the level of types, the above properties of integers. The static property
Q(n)is actually quite strong and cannot be efficiently represented by a run-time contract, due to the
forall m \in intpart.
Of course the invariant actually enforced by an implementation of the data type may differ from the invariant we think it enforces. Thus an implementation becomes a ``trusted kernel'', which has to be verified, using formal tools if necessary. It goes without saying that verifying an implementation of the Set interface is far easier than verifying the whole program that uses Set.
It should be pointed out that guarantees of abstract data types come from the lack of reflection, extension, and introspection of abstract data types. OCaml, incidentally, is weaker in this respect because polymorphic equality can break some of the invariants. So, reflection, extension and introspection facilities of a language can be viewed as vices rather than virtues.
The paper and the talk on lightweight static capabilities (see below) discuss this topic in more (formal) detail. The idea of abstract data type's statically representing and enforcing sophisticated properties of run-time values was Robin Milner's principal idea in the design of Edinburgh LCF back in the early 1970s.
The type system of Haskell or ML lets us express propositions
of the form
n is either a type or
a type variable that ranges over types. Seemingly, we cannot express a
n ranges over
individual members of a set (e.g., individual integers) rather than
over sets (i.e., types of integers). This inexpressibility is superficial
however: if we define particular types such that the corresponding
sets of values consist of only one value, the conceptual barrier
between elements and sets of elements disappears. For example:
data Zero = Zero data Succ a = Succ a
Here, there is only one (non-bottom) value of the type
Zero, there is only one (non-bottom) value of the type
Succ (Succ Zero), etc. There is the obvious correspondence between
the family of types induced by the two declarations and
non-negative integers. With more sophistication (well-formedness
constraints expressed in type classes) we can easily represent, in
types, propositions about ``integers''. Quite sophisticated
propositions can indeed be expressed this way: [Blume01],
[number-parameterized-types]. Singleton types have been
introduced by Hayashi (1991) and first used for dependent type
programming by Xi.
A language with higher-rank types or existential type
quantification lets us introduce type proxies for the values. We can
associate a value with
some type in such a way that type
equality entails value equality, even if the values in question are
not known statically (e.g., read from a file).
The paper [Implicit-Config] used singleton-type families to generate such proxies. The paper demonstrated, for example, how to write arithmetic expressions over implicit modulus and statically guarantee that within a (potentially, quite complex and impure expression) the modulus is the same -- even if the modulus is not known until the run time.
One can build type proxies for values without resorting to singleton types. Existential quantification suffices [branding]. This approach shares with abstract data types the insight that opaque, unforgeable types can represent propositions on values. This idea is extended and formalized in the next section.
This paper [lightweight-static-capabilities] is written with Chung-chieh Shan, who presented it at the PLPV 2006 workshop. We describe a modular programming style that harnesses modern type systems to verify safety conditions in practical systems. This style has three ingredients: (i) A compact kernel of trust that is specific to the problem domain; (ii) Unique names (capabilities) that confer rights and certify properties, so as to extend the trust from the kernel to the rest of the application; (iii) Static (type) proxies for dynamic values. We illustrate our approach using examples from the dependent-type literature, but our programs are written in Haskell and OCaml today, so our techniques are compatible with imperative code, native mutable arrays, and general recursion. The three ingredients of this programming style call for (1) an expressive core language, (2) higher-rank polymorphism, and (3) phantom types.
This paper demonstrates a lightweight notion of static capabilities that brings together increasingly expressive type systems and increasingly accessible program verification. Like many programmers, we want to assure safety conditions: array indices remain within bounds; modular arithmetic operates on numbers with the same modulus; a file or database handle is used only while open; and so on. The safety conditions protect objects such as arrays, modular numbers, and files. Our overarching view is that a static capability authorizes access to a protected object and simultaneously certifies that a safety condition holds. Rather than proposing a new language or system, our contribution is to substantiate the slogan that types are capabilities, today: we use concrete and straightforward code in Haskell and OCaml to illustrate that a programming language with an appropriately expressive type system is a static capability language.
The follow-up paper [lightweight-static-resources] describes further applications, for safe embedded and systems programming, ensuring, among other properties, proper alignment when accessing raw memory areas. The paper also introduces kind-level static capabilities to enforce invariants of type-level programming.
The lightweight approaches depend on a trusted kernel. How to make sure the trusted library deserves our trust? The same question exists for any other dependent-type system: how we make sure that our Oracle is correct, the typing rules are correct, and, mainly, that the implementation of those rules in a compiler is correct?
I have heard a similar question asked of J. Strother Moore and J. Harrison. J. Strother Moore said that most of ACL2 is built by bootstrapping, from lemmas and strategies that ACL2 itself has proven. However, the core of ACL2 just has to be trusted. ACL2 has been used for quite a while and so there is a confidence in its soundness. Incidentally, both NSA and NIST found this argument persuasive, when they accepted proofs by ACL2 as evidence of high assurance, in awarding Orange book A1 and IFIPS 140-1 ratings -- the highest security ratings -- to some products.
In a great more detail, this issue is explored in Randy Pollack's ``How to believe a machine-checked proof''[Believing].
Incidentally, even in Mathematics, it is not at all resolved what exactly constitutes a mathematical proof and how much trust, including personal trust, is involved: [Theoretical-Math].
Section 2, ``Related work'' of [WDTM] gives a comprehensive list of genuine Dependent-Type systems, such as Cayenne, ATS, Epigram, DML, Omega.
[WDTM] Thorsten Altenkirch, Conor McBride, and James McKinna: Why Dependent Types Matter. April 2005.
[LtU-DT] Dependent types: literature, implementations and limitations ?
A thread on Lambda the Ultimate started on Nov 21, 2005
[Implicit-Config] Implicit configurations -- or, type classes reflect the values
Joint work with Chung-chieh Shan.
[Blume01] Matthias Blume: No-Longer-Foreign: Teaching an ML compiler to speak C ``natively''
In BABEL'01: First workshop on multi-language infrastructure and interoperability, September 2001, Firenze, Italy.
[lightweight-static-capabilities] lightweight-static-capabilities.pdf [334K]
Lightweight static capabilities. Joint work with Chung-chieh Shan.
Electr. Notes Theor. Comput. Sci, 174(7), pp. 79-104, 2007.
[lsc-talk] Lightweight static capabilities. Slides of the talk at the IJCAR'06 workshop ``Programming Languages meet Program Verification'' (PLPV'06). Seattle, WA, August 21, 2006.
[number-parameterized-types] Number-parameterized types
[branding] Eliminating Array Bound Checking through Non-dependent types
[tagless-typed] Tagless typed staged interpreters with no dependent types
[tagless-typed-tc] Metatypechecking: Staged Typed Compilation into GADT via typeclasses. Computing the type from a value
[XiThesis] Hongwei Xi: Dependent Types in Practical Programming
Ph.D thesis, Carnegie Mellon University, September 1998.
[Believing] Randy Pollack: How to believe a machine-checked proof
[lightweight-static-resources] Lightweight static resources: Typed memory areas and time-parameterized monads for safe embedded and systems programming
[Theoretical-Math] Arthur Jaffe, Frank Quinn: ``Theoretical mathematics'': Toward a
cultural synthesis of mathematics and theoretical physics
Bull.Am.Math.Soc. 29 (1993) 1-13.
William P. Thurston: On Proof And Progress In Mathematics
Bull.Am.Math.Soc. 30 (1994) 167-177.
Michael Atiyah, Armand Borel, G. J. Chaitin, Daniel Friedan, James Glimm, Jeremy J. Gray, Morris W. Hirsch, Saunder MacLane, Benoit B. Mandelbrot, David Ruelle, Albert Schwarz, Karen Uhlenbeck, Rene' Thom, Edward Witten, Christopher Zeeman: Responses to ``Theoretical Mathematics: Toward a cultural synthesis of mathematics and theoretical physics'', by A. Jaffe and F. Quinn
Bull.Am.Math.Soc. 30 (1994) 178-207.
Arthur Jaffe, Frank Quinn: Response To Comments On ``Theoretical Mathematics''
Bull.Am.Math.Soc. 30 (1994) 208-211.
This site's top page is http://okmij.org/ftp/
Converted from SXML by SXML->HTML