Stretching Type Classes

Haskell with only one typeclass
Class-parameterized classes, and the type-level logarithm
Type-class overloaded functions: second-order typeclass programming with backtracking
Overcoming Overlapping
Implementing, and understanding type classes
Choosing a type-class instance based on the context
Local instances: frequently discussed and still not implemented
Dynamic dispatch on a class of a type
Model checking of Functional Dependencies
Type improvement constraint, local functional dependencies, and a type-level typecase
In defense of UndecidableInstances

Class-parameterized classes, and the type-level logarithm

We show invertible, terminating, 3-place addition, multiplication, and exponentiation relations on type-level unary, Peano numerals, where any two operands determine the third. We also show the invertible factorial relation. This gives us all common arithmetic operations on Peano numerals, including n-base discrete logarithm, n-th root, and the inverse of factorial. The latter operations and division are defined generically, as inverses of exponentiation, factorial and multiplication, resp. It takes only a couple of lines to define each. The inverting method can work with any representation of (type-level) numerals, binary or decimal.

The inverter itself is generic: it is a type-class function, that is, a type-class parameterized by the type-class to `invert'. The inverter is a simple higher-order for-loop on types. There has been a proposal on Haskell' to give equal rights to types and classes. In Haskell98+multi-parameter type classes, classes are already first-class, for all practical purposes. We can easily define (potentially, higher-order) type functions on type classes.

It becomes relatively straightforward then to implement RSA in types. That prompted Graham Klyne to remark on Haskell Cafe ``Methinks this gives a whole new meaning to "type security"''.

Version

The current version is 1.1, Feb 2, 2006.

References

PeanoArithm.lhs [8K]
The literate Haskell source code with tests
The code was originally posted as Class-parameterized classes, and the type-level logarithm on the Haskell mailing list on Thu, 2 Feb 2006 22:42:08 -0800 (PST)

Haskell with only one typeclass

Type classes are so ingrained in Haskell that one can hardly think about the language without them. And yet, if we remove the type class declaration and the standard type classes, leaving the language with a single, fixed, pre-defined type class with a single method, no expressivity is lost. We can still write all Haskell98 type-class programming idioms including constructor classes, as well as multi-parameter type classes and even some functional dependencies. Adding the equality constraint gives us all functional dependencies, bounded existentials, and even associated data types. Besides clarifying the role of type classes as method bundles, we propose a simpler model of overloading resolution than that of Hall et al.

The article below introduces a subset of Haskell (called Haskell1) with only one, pre-defined typeclass C with only one method ac. One can add more instances to C but cannot define any more typeclasses or any more methods. Despite the restriction, we implement overloaded numerical functions, show and, minBound. We define monads and their extension, restricted monads, as well as monad transformers, e.g., MonadError, demonstrating that Haskell1 supports some functional dependencies.

To represent the rest of functional dependencies, we introduce Haskell1' as an extension of Haskell1 with the pre-defined constraint TypeCast. That constraint is not user-extensible and can be regarded built-in. In fact, it is already the built-in constraint t1 ~ t2 since GHC 6.12. We implement in Haskell1' the motivating example from the paper `Associated Types with Class' by Chakravarty, Keller, Peyton Jones and Marlow (POPL2005). Finally we introduce the analogue of Haskell98 classes -- method bundles -- and use them for defining bounded existentials.

Haskell1 is not a new language and requires no new compilers. It is merely a subset of Haskell; the `removal' of typeclass declarations is a matter of discipline rather than that of syntax.

Gerrit van den Geest has kindly pointed out another re-design of Haskell type classes: System O of Odersky, Wadler, and Wehr. System O however imposes the restriction that each function overloaded over the type a must have the type of the form a -> t, with a not free in t. Therefore, System O cannot express, for example, fromInteger and minBound. Haskell1 has no such restrictions and can express any overloaded function of Haskell. Restricting Haskell to only one particularly chosen typeclass is no restriction at all.

Version

The current version is 1.1, February 2007.

References

Haskell1.txt [10K]
The complete article
It was originally posted on the Haskell mailing list on Wed, 28 Feb 2007 23:56:47 -0800 (PST)

Class1.hs [7K]
Expressing Haskell98 typeclass idioms in Haskell1

Class2.hs [10K]
Haskell1' and its expressivity
In addition to the single, pre-defined class C, we assume TypeCast. The latter is not user-extensible and may be regarded as a pre-defined constraint. To define TypeCast however, we need the undecidable instances extension. Nothing else requires this extension; furthermore, TypeCast is closed and non-recursive and so is certainly decidable.

Restricted Monads

Local instances: frequently discussed and still not implemented

Local type-class instances regularly come up in various discussions and forums. This is not the recent trend: they have been the topic of several academic papers written around 2000. In particular, local instances appeared in the context of a configuration problem: setting and propagating various preference parameters such as page width for the pretty-printer, or tolerances and rounding modes for numerical code. They go back even farther: the very first papers about type classes and parametric overloading, by Kaes in 1988 and Wadler and Blott in 1989, discussed local instances. Kaes' prototype language even implemented them. Why they are still not available in Haskell?

No doubt local instances are sorely missed. However, there are several different ways of adding them to Haskell -- each with is its own semantics, each with its own problems, which may seem to outweigh the benefits. This article illustrates the problems on simple examples.

Let us assume Haskell with local instances, defined by the let instance form. Consider the following function:

     f1 :: Int -> String
     f1 x = 
       let instance Show Int where show _ = "one"
       in show x

whose type has no Show constraint: show is being applied to x::Int and there is the matching Show instance in scope. Thus f1 1 should evaluate to "one". In the next example, however,

     f2 :: Show a => a -> String
     f2 x = 
       let instance Show Int where show _ = "one"
       in show x

the variable x is polymorphic and the overloading of show cannot be resolved, hence the Show constraint in the function's type. What should f2 (1::Int) return?

One may argue that a function should close over, or capture, local instances just as it closes over local variables. If captured instances are not reflected in the type of the function, however, there is an unpleasant consequence: parametricity breaks and free theorems are invalidated. To illustrate, consider the following variation of f2:

     f3 :: Show a => a -> String
     f3 x = 
       let instance Show Int
         where show x = if x == 0 then "zero" else "nonzero"
       in show x

Its type says that f3 handles its argument solely through the caller-provided show implementations, that is, Show instances that are in scope of its invocation. Therefore, in the following program with the single Show instance,

     instance Show Int where show _ = "const"
     f3_Int :: Int -> String
     f3_Int x = f3 x

f3_Int should be a constant function -- as the consequence of a free theorem. With our implementation of f3, f3_Int 0 and f3_Int 1 give, however, different results. To preserve parametricity, the closed instances must somehow appear in the function's signature -- preferably in a way that preserves principle types and hence the type inference. This complicates the type system.

In the opposite approach, local instances are not closed over -- rather, they behave like dynamically-bound variables. We would then expect f2 (1::Int) to return "1". This approach is also reasonable, and also problematic. GHC, seeing our call in which the polymorphic f2 is applied to an Int argument, may decide to specialize f2 to that type. Such specializations, which eliminate indirect type-class method calls, are crucial for good performance, and GHC indeed does them. In our case, the compiler will generate a specialized version of f2:

     f2_Int :: Int -> String
     f2_Int x = 
       let instance Show Int where show _ = "one"
       in show x

It looks just like the f1 we started with, which, when applied to 1 returned "one". The specialization thus changed the function's behavior. Without local instances, specialization was easy to understand: GHC may, at will, create a copy of a polymorphic function, instantiate it at a specific type, and replace calls to the polymorphic function at that type with the calls to the specialized version. Users, or tools, can perform such a procedure ourselves when optimizing or refactoring code. With unclosed local instances, specialization changes the program behavior.

To complicate the matters further, what should f4 (1::Int) return for the following f4?

     f4 :: Show a => a -> String
     f4 x = 
       let instance Show Int where show _ = "two"
       in f2 x

Stefan Kaes, the person who introduced ``parametric overloading'' (type classes, in modern term) chose the ``closure semantics''. In his system all instances are local. Duggan and J. Ophel (2002) have perhaps the most comprehensive treatment of local instances. Their proposal supports both closure and non-closure semantics, distinguishing them by so-called open and closed kinds. They do complicate the type system.

Yet another approach, advocated in ``Implicit Configurations'' (HW 2004), restricts local instances so that the problems we have seen do not arise. The restriction outlaws all the examples shown so far. Users are only allowed to define something like the following

     data Any = forall s. Any s  -- existential
     any = Any ()
     
     fq :: (forall a. Show a => a -> String) -> String
     fq k = case any of
             Any (some::s) ->
               let instance Show (s,Int) where ...
               in k (some,10)

That is, a local instance must contain an ``opaque'' type like the s above. Informally, only those local instances are permitted that cannot possibly overlap with any other instance in scope. Is this restriction too severe? Only experience can tell.

In conclusion, the problem of local instances at present is the engineering problem, to be resolved by the empirical evaluation of the trade-offs of various proposals. Incidentally, the ``Implicit Configurations'' paper pointed out a way to emulate the proposed local instances within Haskell of 2004 -- so that one can experiment and accumulate the experience.

I thank Gesh @gesh.uni.cx and Jun Inoue for insightful discussions.

Version

The current version is October 2014.

References

D. Duggan and J. Ophel: Open and closed scopes for constrained genericity.
Theoretical Computer Science, 275, 215-258, 2002.
< http://dx.doi.org/10.1016/S0304-3975(01)00129-3 >

Implicit configurations -- or, type classes reflect the values of types
This paper described all proposed approaches to the configuration problem, including implicit parameters, global variables, the Reader monad, and local instances.

Philip L. Wadler and Stephen Blott: How to Make Ad-Hoc Polymorphism Less Ad Hoc
Proc. POPL 1989, pp. 60-76.

The pioneering work of Stefan Kaes
The parametric overloading system introduced in Kaes' paper was built around local instances, and even local type classes.

Atze Dijkstra, Gerrit van den Geest, Bastiaan Heeren, S. Doaitse Swierstra: Modelling Scoped Instances with Constraint Handling Rules
Technical Report, Utrecht University, 2007
The paper discusses in great detail the problems posed by local instances. It proposes an interesting solution, which lets the programmer specify the instance resolution heuristics.

Type-class overloaded functions: second-order typeclass programming with backtracking

We demonstrate functions polymorphic over classes of types. Each instance of such (2-polymorphic) function uses ordinary 1-polymorphic methods, to generically process values of many types, members of that 2-instance type class. The typeclass constraints are thus manipulated as first-class entities. We also show how to write typeclass instances with back-tracking: if one instance does not apply, the typechecker will chose the `next' instance -- in the precise meaning of `next'.

We show a method to describe classes of types in a concise way, using unions, class differences and unrestricted comprehension. These classes of types may be either closed or open (extensible). After that set up, we can write arbitrarily many functions overloaded over these type classes. An instance of our function for a specific type class may use polymorphic functions to generically process all members of that type class. Our functions are hence second-order polymorphic.

Version

The current version is 1.1, November 2006.

References

poly2.txt [14K]
The complete article with explanations
It was originally posted on the Haskell mailing list on Sun, 19 Nov 2006 16:54:34 -0800 (PST)

poly2.hs [9K]
The complete Haskell source code

is-of-class.hs [4K]
Small self-containing example of classifying data based on their types
This self-contained code defines the function is_of_class to statically check if an object of the type x is a member of the class c. A class of types is defined by enumeration, set-union, set-difference, and unrestricted comprehension.
The code was originally posted as Re: Trying to avoid duplicate instances on the Haskell-Cafe mailing list Tue, 13 May 2008 23:04:29 -0700 (PDT)

Choosing a type-class instance based on the context

Type classes let us overload operations based on the type of an expression. A frequently expressed wish is to overload based on the class to which expression's type belongs. For example, we want to define an overloaded operation print to be equivalent to putStrLn . show when applied to showable expressions, whose types are the members of the class Show. For other types, the operation print should do something different (e.g., print that no show function is available, or, for Typeable expressions, write their type instead).

The naive approach clearly does not work

     instance Show a => Print a where
       print x = putStrLn (show x)
     instance        Print a where
       print x = putStrLn "No show method"

because the two instances have the identical heads Print a and so considered duplicates by the type-checker. The WikiPage below describes the working solution, relying on multi-parameter type classes with functional dependencies and the type level predicate describing the membership in the class Show. The trick is to re-write a constraint (C a) which succeeds of fails, into a predicate constraint (C' a flag), which always succeeds, but once discharged, unifies flag with a type-level Boolean HTrue or HFalse.

Joint work with Simon Peyton-Jones.

Version

The current version is 1.1, April 2008.

References

The WikiPage with the explanation of the technique and of its several variations
< http://haskell.org/haskellwiki/GHC/AdvancedOverlap >

Type improvement constraint, local functional dependencies, and a type-level typecase
The method for defining type-level predicates

Dynamic dispatch on a class of a type
A more `dynamic' solution

Type-class overloaded functions: second-order typeclass programming with backtracking
An example of building classes of types and defining their membership predicate

Dynamic dispatch on a class of a type

This message gives an example of a dynamic type class cast in Haskell. We want to dispatch on a class of a type rather on a type itself. In other words, we would like to simulate IsInstanceOf. The problem was originally posed by Hal Daume III, who wrote on the Haskell mailing list:

i'm hoping to be able to simulate a sort of dynamic dispatch based on class instances. basically a function which takes a value and depending on what classes it is an instance of, does something. I've been trying for a while to simulate something along the lines of:
     	class Foo a where { foo :: a -> Bool }
     	class Bar a where { bar :: a -> Bool }
     	foo x
     	  | typeOf x `instanceOf` Foo = Just (foo x)
     	  | typeOf x `instanceOf` Bar = Just (bar x)
     	  | otherwise                 = Nothing

The following code shows how to implement a dynamic dispatch on a type class context. No unsafe operations are used. The test at the end demonstrates that we can indeed simulate Hal's desired example.

Version

The current version is 1.1, Mar 23, 2003.

References

class-based-dispatch.lhs [4K]
The literate Haskell source code and a test
The code was originally posted as simulating dynamic dispatch on the Haskell mailing list on Sun, 23 Mar 2003 13:41:48

Hal Daume III: simulating dynamic dispatch. The message with the statement of the problem, posted on the Haskell mailing list on Mar 20, 2003.
< http://www.haskell.org/pipermail/haskell/2003-March/011518.html >

Choosing a type-class instance based on the context
An alternative: a static dispatch on a class of a type

Model checking of Functional Dependencies

Given multi-parameter type-class declarations with functional dependencies and a set of their instances, we explain how to check if the instances conform to a functional dependency. If the check fails we give a counter-example, which is more helpful than compiler error messages. Our checker, which is a simple Prolog code, fills the real need nowadays: regrettably, GHC no longer does the functional dependency conformance check when the UndecidableInstances extension is on. The unconformant instances are admitted and cause problems, but at a later time and place and accompanied with even harder to understand error messages.

Functional dependencies are quite well understood, see the works by M. P. Jones and M. Sulzmann. According to Mark P. Jones, they are inspired and quite akin to functional dependencies in relational databases. In a class declaration

     class C a b | a -> b where cm :: a -> b

the functional dependency a->b asserts the following implication

     	(*)  C a1 b1, C a2 b2, a1 ~ a1 ===> b1 ~ b2

where ~ denotes type equality. We view the type-class as defining a predicate on types: C t1 t2 holds for types t1 and t2 if there is an instance of C that matches C t1 t2, that is, will be selected for t1 and t2. On one hand, the implication (*) lets us derive the proof of type equality, to be used for improving other types and resolving further constraints. For example, if we enter the following instance and a definition

     instance C Bool Int where ...
     
     f x = cm (not x)

GHC infers for f the type Bool -> Int. Without the functional dependency, the inferred type is polymorphic:

C Bool b => Bool ->
b

. If the context in which f is used does not constrain f's result type, the user will have to write a type annotation to help the compiler select the instance for C. Since our instance tells the compiler that C Bool Int holds, the implication (*) says C Bool b ===> b ~ Int. The constraint C Bool b can hence be discharged from the type of f and b improved to Int.

A functional dependency is also a restriction on the set of instances. If we enter instances of C such that the implication (*) is violated, the compiler should reject the program. Sometimes, the error message makes it hard to understand what exactly is wrong with the instances. The simple Prolog code below is intended to help, by printing a counter-example of the violation of the functional dependency. The code works with and without overlapping instances. The program is a simple model checker for implications like (*).

Our first example has one type class and two instances

     class C a b | a -> b
     instance C Bool Int
     instance C [a] [b]

Each instance is encoded as a Prolog clause:

     c(i1,bool,int).
     c(i2,[_A],[_B]).

In Haskell, lower-case type identifiers are variables and upper-case ones are type constants. In Prolog, it's the other way around. The first argument of c identifies the instance. Underscored variables are singleton variables (if we omit the underscores, we get a warning from the interpreter).

Here is the model-checking code itself

     ?- G = (c(_,X,Y), c(_,X,Y1)), 
        G, Y \== Y1,
        print('counterexample: '), instantiate(G), print(G), nl.

we search for types X, Y and Y1 such that c(_,X,Y) and c(_,X,Y1) both hold and Y is different from Y1. If the search succeeds, we print out the found types. (Actually, we print out a grounding of the found types, which makes the print-out nicer.) The above program does find a counterexample: c(i2, [t2], [t3]), c(i2, [t2], [t4]). That is, one can select an instance for C [t2] [t3] and for C [t2] [t4], where t2, t3, t4 are some distinct types; in particular t3 is different from t4. Since the implication (*) is violated, the program must be rejected.

The Prolog code used syntactic disequality \==, which is defined as the negation of ==. The ordinary Prolog equality Term1 = Term2 holds if there exists a substitution for free variables in Term1 and Term2 that makes the terms identical. This equality is decided by unification. The syntactic equality Term1 == Term2 holds if the terms are identical for any substitution. If X and Y are two free variables, then X = Y holds (making these variables shared) but X == Y fails. Correspondingly, X \== Y succeeds (and X \== X fails).

Our second example has overlapping instances:

     class C a b | a -> b
     instance C Bool Int
     instance C a b

encoded as

     c(i1,bool,int).
     c(i2,_A,_B).

The same Prolog query produces the counter-example: c(i1, bool, int), c(i2, bool, t1). That is, C Bool Int can be resolved (to the first instance) and C Bool t1 with t1 different from Int can also be resolved, to the second instance. The program should be rejected.

A violation of a functional dependency is usually harder to see. Here is a real example, posted on the Haskell mailing list by Wolfgang Jeltsch in Aug 2003.

     class C a b c | a b -> c where ...
     instance C a b c => C a (x,y,b) c where ...
     instance C a (a,c,b) c where ...

Its encoding in Prolog, the model checking query and the found counter-example are as follows

     c(c2,   A,tup(A,C,_B), C).
     c(c1(I),A,tup(_X,_Y,B),C) :- c(I,A,B,C).
     
     ?- G = (c(_,X,Y,Z), c(_,X,Y,Z1)), 
        G, Z \== Z1,
        print('counterexample: '), instantiate(G), print(G), nl.
     
     %% counterexample: c(c2,     t11, tup(t11, t12, tup(t11, t13, t14)), t12), 
     %%                 c(c1(c2), t11, tup(t11, t12, tup(t11, t13, t14)), t13)

We have described how to check if the instances of a type class with a functional dependency conform to the dependency. The method is model checking of the implication represented by the functional dependency. If the check fails, it produces a counter-example of a concrete set of instances that violate the dependency.

Version

The current version is 2003; minor changes in 2011.

References

fd-check.prl [5K]
Complete Prolog code for all the examples
The code has more examples, with and without overlapping, of dependencies among several type class parameters or several type classes.

Peter J. Stuckey and Martin Sulzmann: The theory of overloading.
TOPLAS, 2005, v27, N6, 1216-1269

Mark P. Jones, Iavor S. Diatchki: Language and Program Design for Functional Dependencies.
Haskell Symposium 2008
< http://web.cecs.pdx.edu/~mpj/pubs/fundeps-design.html >

Iavor S. Diatchki: TypeFamilies vs. FunctionalDependencies & type-level recursion
Message posted on the Haskell' list on Wed, 15 Jun 2011 10:10:14 -0700.
The message explains the GHC bug of failing to validate instances with respect to functional dependency, occurring when UndecidableInstances is on. Iavor suggests a fix.

In defense of UndecidableInstances

Class instance and type family instance declarations are subject to a set of strict conditions: Paterson and Coverage Conditions for class instances (Section 7.6.3. ``Instance declarations'' of GHC User's Guide, version 6.10.4) and similar conditions for type families (Section 7.7.2.2.3. ``Decidability of type synonym instances''). The conditions ensure that the process of resolving class constraints or normalizing type function applications always terminates.

The conditions are quite constrictive, and GHC offers a way to lift them, with the LANGUAGE pragma UndecidableInstances. The pragma is shunned however. There are good reasons for that attitude: if we lift the conditions, we can write instances that send the type checker into infinite loop (albeit only potentially: the termination is insured by the recursion depth limit, set by the flag -fcontext-stack). Section 7.6.3.2. ``Undecidable instances'' describes two such (quite subtle) cases.

It is also true that Paterson and Coverage Conditions are sufficient but by no means necessary to ensure the decidability of type normalization. There are patently total type functions rejected by the Conditions. This article shows a simple example, and argues for more acceptance towards UndecidableInstances. The extension is not always bad, and should not be automatically frowned upon.

Our example is simple indeed:

     {-# LANGUAGE TypeFamilies #-}
     
     type family I x :: *
     type instance I x = x
     
     type family B x :: *
     type instance B x = I x

The code is not accepted by GHC 6.8.3 and GHC 6.10.4. The compiler complains:

     Application is no smaller than the instance head
       in the type family application: I x
     (Use -fallow-undecidable-instances to permit this)
     In the type synonym instance declaration for `B'

Yet there cannot possibly be any diverging reduction sequence here! The type family I is the identity, and the type family B is its alias. There is no recursion. The fact that type families are open is not relevant: type families in our code are effectively closed, because one cannot define any more instance for I and B (or risk overlap, which is rightfully not supported for type families).

GHC complains because it checks termination instance-by-instance. To see the termination in the above type program, we should consider the instances I and B together. Then we will see that I does not refer to B, so there are no loops. This global termination check -- for a set of instances -- is beyond the abilities of GHC. One may argue that this is the right decision: after all, GHCi is not a full-blown theorem prover.

Thus there are patently decidable type programs that require UndecidableInstances. That extension should not be categorically stigmatized.

In conclusion, UndecidableInstances is not a dangerous flag. It will never cause the type-checker to accept a program that `goes wrong.' The only bad consequence of using the flag is type checker's might be telling us that it cannot decide if our program is well-typed, given the context-stack--depth limit. We may ask the type-checker to try a bit harder (with a larger depth limit), or look through our program and find the problem.

UndecidableInstances are quite like the primitive recursion criterion: all primitive recursive functions surely terminate; non-primitive recursive functions generally don't. Still there are many classes of non-primitive recursive functions that are total. To see their totality, one has to use more complex criteria.

Last updated February 4, 2015

This site's top page is http://okmij.org/ftp/

oleg-at-okmij.org
Your comments, problem reports, questions are very welcome!

Converted from HSXML by HSXML->HTML