Type equality predicates decide if two types are equal and return the result of the comparison as a type-level boolean. A type equality predicate ought to be total for ground types, that is, types with no free type variables. A more general type matching predicate compares the shape of two types, their top-level type constructors, etc.
Type equality predicates are deeply related to overlapping type-class
instances. OverlappingInstances
is one of the most controversial
Haskell extensions, whose semantics is the least clear and whose
interactions with existentials, separate compilation and other
Haskell features leave lingering doubts of soundness. Yet
OverlappingInstances
are too useful to remove -- in part because
they decide type equalities. Several examples below tell
how useful OverlappingInstances
really are.
Yet there is a well-behaving, semantically clear way to implement
type equalities and general type pattern-matching that not only
avoids OverlappingInstances
but subsumes them. The code using
OverlappingInstances
can be simply and mechanically modified to
avoid that extension. We thus propose to eliminate
OverlappingInstances as redundant.
OverlappingInstances
The most common, and historically the first, type equality predicate
TypeEq
was introduced in HList. The type checker simplifies the
type expression (originally, constraint) TypeEq t1 t2
to the dedicated
type HTrue
if the type checker considers the types t1
and t2
the
same. If the type checker regards t1
and t2
as different, TypeEq t1
t2
is simplified to the dedicated type HFalse
. For ground types --
types without free type variables -- TypeEq t1 t2
always simplifies, to
either HFalse
or HTrue
. If the type under comparison contains a
free type variable, the type checker may be still be able to decide
disequality: for example, when comparing the unground types
Maybe a
and [b]
. Otherwise, the type checker defers
the simplification of TypeEq t1 t2
, waiting for t1
and t2
become sufficiently instantiated by the context.
The TypeEq
predicate is used all throughout the HList and OOHaskell
libraries. In fact, TypeEq
is as ubiquitous in HList as Eq
is ubiquitous
in the standard Data.List
library. Most operations on
type-level collections -- such as look-ups, deletions,
replacements, checking for duplicates -- require equality comparisons.
The predicate TypeEqTotal t1 t2
is like TypeEq t1 t2
but always
simplifies to either HTrue
or HFalse
, regardless of how well the
types t1
and t2
are instantiated. A type variable is considered
equal only to itself. Therefore, TypeEqTotal
decides the identity
of two (unground) types: TypeEqTotal t1 t2
simplifies to HTrue
if
and only if t1
and t2
are equal in every possible grounding.
A type equality assertion is not a predicate, but is closely
related. Whereas an equality predicate applied to two types
simplifies to HTrue
or HFalse
depending on the success of the comparison,
the assertion simplifies to nothing
(or, gets discharged) when the types are unifiable, that is,
equal or could be made equal by instantiating type variables.
If the types are not unifiable the assertion -- and
the overall type checking -- fails.
If two types can be made equal by instantiating type variables,
the assertion does such an instantiation, as a side effect.
The type equality assertion was also introduced in HList, under the
name TypeCast t1 t2
, which nowadays can be written
as t1 ~ t2
. For ground types t1
and t2
,
TypeEq t1 t2 ~ HTrue
is equivalent to t1 ~ t2
and
TypeEq t1 t2 ~ HFalse
asserts the disequality of the two types.
One might wish for a predicate version of TypeCast t1 t2
, which
we call TypeUni t1 t2
.
If the types t1
and t2
are equal or could be made equal by
instantiating type variables, TypeUni t1 t2
simplifies to
HTrue
and, like TypeCast
, applies the unifying substitution
to t1
and t2
. Otherwise, TypeUni t1 t2
simplifies to
HFalse
and the type-checking continues. Alas, TypeUni
is
incoherent. For example, the type expression
(TypeUni x y, TypeUni x (), TypeUni y Bool)
(assuming previously uninstantiated x
and y
) simplifies to
(HTrue,HTrue,HFalse)
if the simplifications are done
left-to-right. If the type checker chooses the opposite order,
the result will be (HFalse,HTrue,HTrue)
.
TypeEq
is described in Sec 9.TypeCast: Type improvement constraint, local functional dependencies, and a type-level typecase
Type-level type introspection, equality and matching
The overlapping-free implementation of TypeEQ
that also supports wildcards
Reversing Haskell typechecker: converting from undefined
to defined
Implementation of TypeEqTotal, which was called TypeEq, as a functional relation. See the section `Equality predicate of type schemas' in the accompanying literal Haskell code.
TypeEq
, described its three
implementations. We summarize the description for ease of reference.
The naive implementation of TypeEq
specifies the predicate
by enumerating all the cases, as shown below. It does not scale.
data HTrue; data HFalse type family TypeEq x y :: * type instance TypeEq Bool Bool = HTrue type instance TypeEq Bool Char = HFalse type instance TypeEq Char Bool = HFalse type instance TypeEq Char Char = HTrue
A better approach is to relate each type to a unique type-level representation
that is easy to analyze and compare. For example, we map
each type to a type-level natural number, using an auxiliary injective
type function TypeCode
:
data Z; data S n -- Peano numerals type family TypeCode x :: * -- Type -to- Nat type instance TypeCode () = Z type instance TypeCode Bool = S Z type instance TypeCode Char = S (S Z)Composite types like
Either t1 t2
can be encoded similarly (e.g., with
Goedel numbers). The TypeEq
predicate then is defined generically,
in a single type instance:type family TypeEq x y :: * type instance TypeEq x y = TypeEqNat (TypeCode x) (TypeCode y)We assume a type-level function
TypeEqNat
to compare Peano
numerals. To extend the universe of comparable types with a new
type, we merely need to add one instance of TypeCode
; the
definition of TypeEq
is unaffected. This approach does scale. We
still need to manually coordinate the assignment of TypeCode
s
across modules and libraries to ensure uniqueness. Goedel numbers are
big, opaque, and cumbersome to analyze. Later we introduce a better
representation that overcomes these drawbacks and also supports
matching.
The `best' implementation of TypeEq
, in three lines of code,
works for all types, present or declared in the future.
In the HList paper, TypeEq
was introduced
as a functional relation (a multi-parameter type class with a functional
dependency). TypeEq
could also be defined as a type function
(in GHC parlance, `type family'), as we have just shown. The most generic
implementation below, however, can be defined only as a functional
relation, since it relies on overlapping instances.
Type families do not permit overlapping, for a good reason.
class TypeEq x y b | x y -> b instance TypeEq x x HTrue instance b ~ HFalse => TypeEq x y bIf the type checker can see that two types
t1
and t2
are the same
so that the first TypeEq
instance can be chosen, the constraint
TypeEq t1 t2 b
simplifies to b ~ HTrue
. Otherwise the third
argument of TypeEq
is unified with HFalse
. The word `otherwise'
betrays the complementation, `the default case', which is at the core
of overlapping instances. We see the first intimations of the
usefulness of overlapping instances and of their relation with type
equalities.Int->Bool
with ()->Char
because both types have the same top constructor
(->)
. That `distinguishing the types of one shape from the
others' is again about the default case and a specific
exception. OverlappingInstances
are seemingly required; on the
other hand, that extension could be impossible to use. In the
vector-space example, the signature of the scalar-vector
multiplication includes the type of the scalar field, which is
uniquely determined by, or associated with, the vector space. Alas,
associated data types cannot be used with OverlappingInstances
;
even functional dependencies are problematic.
Here we resolve the problem by demonstrating how to use
OverlappingInstances
-based type-constructor equality with associated
types. Later on we re-implement type-shape comparison without
resorting to OverlappingInstances
.
Our implementation here reduces the type-shape comparison to the
previously defined TypeEq
, applied to `normalized' types. The
normalization replaces `insignificant' parts of the composite type
with ()
, as follows. (Alas, the procedure cannot be defined
generically for all arities since type classes have fixed kinds.)
class Normalize t r | t -> r instance (c ()) ~ r => Normalize (c x) r instance (c () ()) ~ r => Normalize (c x y) r instance (c () () ()) ~ r => Normalize (c x y z) r -- add instances for type constructors of higher arities instance y ~ x => Normalize x y -- The default instanceThe last instance is the catch-all for all the types not handled by more specific transformations. Again we resort to overlapping instances.
NotAFunction.hs [4K]
An application of the type shape comparison: distinguishing function
types from any other types when implementing vector spaces as function
spaces. This code demonstrates the mixing of functional relations,
TypeEq
and Normalize
, with associated type families. The mixing
is possible, but difficult and verbose. The difficulty
prompts us to seek the implementation of TypeEq
without
overlapping instances. TypeEq
could be defined as a type function
then.
Type-level type introspection, equality and matching
Re-implementation of type matching, including the functional vector spaces example, without overlapping instances.
Nicolas Frisby: topEq for types
A message posted on the Haskell-Cafe mailing list on Thu Nov 16 18:59:05 EST 2006. The message motivated comparing types by shape, and described one solution.
How to write an instance for not-a-function: an earlier, ad hoc solution.
In June 2010, Ryan Newton has posed a problem of optimizing a GMap
package of finite maps with the common interface and
keys of various types. The module GMap defines the type class
GMapKey
and optimized instances for Int
, Int32
, pairs, sums,
lists, etc. When the key is a pair (Int16,Int16)
, it is worthwhile to
pack the key into a single Int32
value and use the GMap Int32
instance,
rather than using the general GMap (a,b)
instance for pairs.
Overlapping instances are ideal for such specific overrides to
general behavior. Alas, the class GMapKey
has associated data types
and so does not permit overlapping instances.
We demonstrate an ungainly work-around, which moves overlapping into
an auxiliary type class with a functional dependency. The type class
`classifies' keys into disjoint subsets, with overlapping to
distinguish special cases (of keys that could be packed into a single word)
from the default case. The result of the classification is handled by the
original GMapKey
class.
Later we re-implement the example without OverlappingInstances
; the
work-around will not be needed then.
GMapSpecT.hs [6K]
Overlapping-free implementation of the example, in the approach described below.
OverlappingInstances
, GHC lets us add a special case to an
existing set of overloaded functions. Giving the general case and a
set of exceptions is truly useful. The examples on this page -- and
many more on Hackage -- speak for such mode of programming and hence
for overlapping instances. Why are they so controversial?
Overlapping instances are controversial because they straddle a
contradiction. They embody ``negation as failure'': the general
type class instance is chosen for the given type only when all more
specific instances failed to match the type. Negation-as-failure
presupposes closed world, or a fixed set of instances. However, type
classes are open: the user may add more instances at any time, in
the same or different modules. The unlimited specialization
is indeed possible, for example, in the following
infinite chain of polymorphic types a > [a] > [[a]] > [[[a]]] > ....
.
When compiling a module, GHC cannot know if a more specific instance
exists in another, perhaps not yet written, module. Still, GHC may have to
choose an instance when compiling (monomorphic) terms in the module.
Since abandoning separate compilation is not an option, GHC employs various
heuristics, which could occasionally lead to surprising results. GHC
User's Guide, in the section about overlapping instances, describes an
example of incoherence -- of different instance choices made in
different parts of the program for the same type. The interaction of
overlapping instances with existential quantification is another gray
area. If a type class includes associated data or types, overlapping
is unsound -- as in segmentation-fault unsound. This bodes ill
for OverlappingInstances
.
Since implementations of overlapping instances are based on heuristics rather than a sound theory, different implementations have different behavior. For example, Hugs treats overlapping in subtly different ways from GHC; therefore, the type equality predicate of HList, based on overlapping, sometimes works in Hugs and sometimes doesn't. HList is hardly useful in Hugs.
The Glorious Glasgow Haskell Compilation System User's Guide, Version 7.0.4. Section 7.6.3.4. Overlapping instances
OverlappingInstances
. These type
equality predicates later let us get rid of overlapping in the
general case.The scalable, overlapping-free type-level equality predicate has been described already in the HList paper, recalled earlier on this page. The key was relating each type to a unique type-level representation that is easy to analyze and compare. Type-level numerals are one such representation, which suffers from two drawbacks: the difficulty of implementing matching, and the cumbersome manual process to ensure unique representation. We now describe the representation that overcomes the first drawback and alleviates the second. Template Haskell or a bit of GHC support can ensure unique representation automatically.
The new type representation TYPEREP
is much easier to work with than Goedel
numbers. It is a type-level equivalent of Data.Typeable.TypeRep
and can
be informally described by the following declarations:
data TyCon_name -- name associated with each (registered) type constructor data LIST a = NIL | a :/ LIST a data TYPEREP = TYPEREP (TyCon_name, LIST TYPEREP)
TYPEREP
is however a type-level data-structure. Strictly speaking, we
need data kinds to define it. For now, the above declarations
should serve as a guidance. Also as an informal guide, we write
kind signatures for our type-level functions. Like type signatures
of ordinary functions, kind signatures aid understanding. For now,
they are informal and have to be given in comments.
To compare type constructor names, we injectively map them to
something easily comparable, for example, natural numbers
(Data.Typeable
uses a similar trick):
-- TC_code :: TyCon_name -> NAT type family TC_code tycon :: *
The total type function TREPEQ
compares two TYPEREP
for equality.
The function is defined below in closed form, once and for all:
-- TYPEREP -> TYPEREP -> BOOL type family TREPEQ x y :: * type instance TREPEQ (tc1, targ1) (tc2, targ2) = AND (NatEq (TC_code tc1) (TC_code tc2)) (TREPEQL targ1 targ2) -- LIST TYPEREP -> LIST TYPEREP -> BOOL type family TREPEQL xs ys :: * type instance TREPEQL NIL NIL = HTrue type instance TREPEQL NIL (h :/ t) = HFalse type instance TREPEQL (h :/ t) NIL = HFalse type instance TREPEQL (h1 :/ t1) (h2 :/ t2) = AND (TREPEQ h1 h2) (TREPEQL t1 t2)where the total
NatEq :: NAT -> NAT -> BOOL
compares two type-level natural
numbers. TYPEREP
supports matching, or wildcards, represented
by a type ANY
(with a distinguished TC_code
, e.g., zero). The
type function TREPEQW :: TYPEREP -> TYPEREP -> BOOL
, the variant of
TREPEQ
, compares two TYPEREP
assuming that the ANY
type is equal to
any other type. A type-level list membership predicate-- ((*,*) -> BOOL) -> * -> LIST * -> BOOL type family Member f x l :: *checks for an occurrence of an element in a type-level list, using the given element equality predicate, specified as
AC_TREPEQ
(for equality) or AC_TREPEQW
(for matching).
The injective type function TYPEOF :: * -> TYPEREP
, the analogue of
Data.Typeable.typeOf
, returns the representation of a type. The
function is defined by instances. Here are a few:
data ANY -- The wildcard, for matching data TRN_any type instance TC_code TRN_any = Z -- distinguished for TRN_any type instance TYPEOF ANY = (TRN_any, NIL) data TRN_unit type instance TC_code TRN_unit = S Z type instance TYPEOF () = (TRN_unit, NIL) data TRN_maybe type instance TC_code TRN_maybe = S (S Z) type instance TYPEOF (Maybe a) = (TRN_maybe, (TYPEOF a) :/ NIL) data TRN_arrow type instance TC_code TRN_arrow = S (S (S Z)) type instance TYPEOF (a -> b) = (TRN_arrow, (TYPEOF a) :/ (TYPEOF b) :/ NIL)
The instances for TYPEOF
and TC_code
are regular and could be,
hopefully, derived by GHC or a similar tool automatically, similar to
the deriving of Data.Typeable
. For now, we can use Template Haskell
to derive the TYPEOF
instances for a new data type. For example,
for Foo :: *
, $(tderive ''Foo)
may expand to something like this:
data TRN_package_name_module_name_Foo type instance TC_code TRN_package_name_module_name_Foo = <very long type-level numeral that spells package_name_module_name_Foo> type instance TYPEOF Foo = (TRN_package_name_module_name_Foo, NIL)
Long unary numerals spelling full type constructor names
are clearly unappealing. Help is on the way: GHC will soon support
efficient numeric kinds and type-level strings. Ideally,
GHC would support TYPEREP
and TYPEOF
the way it currently
supports Data.Typeable
-- deriving TYPEOF
, assigning
unique TC_code
s and efficiently comparing them.
In principle, GHC could even support ground type equality as a
built-in type-level function. However, mere equality comparison of
types is not enough to eliminate overlapping instances. We need type
comparison at all kinds, and we need matching. The TYPEREP
approach
fulfills all the needs.
In conclusion, the injective type function TYPEOF
reifying a type to its
representation TYPEREP
gives rise to the type equality comparison and
matching. Template Haskell or other such automated deriving of
TYPEOF
instances can ensure the uniqueness of
the representation. Numeric and string kinds, coming to GHC,
will greatly improve the efficiency of type-level comparison.
TYPEREP
,
type-level TypeOf
, and type-level functions for comparison and
matching -- along with a small library for higher-order functional
programming at the type level. The library uses neither overlapping instances
nor functional dependencies. The code includes a few tests.NotAFunctionT.hs [3K]
An example of type matching: re-implementing NotAFunction.hs, this time without overlapping instances.
OverlappingInstances
altogether. We re-write the
earlier code that relied on overlapping into the overlapping-free
code, in a regular and almost mechanical way. We gain a sound
overlapping of type families and associated data types. The new
approach requires no changes to GHC and does not affect the FC
calculus or the instance resolution algorithm. It works right now.
We may still define overloaded functions by giving the default
general instance and a set of special cases. The contradiction of
OverlappingInstances
is avoided because the set of special cases is
closed.
Our first example is eliminating overlapping in the overloaded
function that converts lists to strings, with the special cases for
lists of booleans and characters: [True,False]
should be converted
to the string "10"
:
class ShowList a where showl :: [a] -> String -- General instance instance Show a => ShowList a where ... -- Special instances, for lists of specific types instance ShowList Bool where ... instance ShowList Char where showl = id
To eliminate overlapping, we re-write the example as follows.
We rely on the TTypeable library for type introspection, TYPEOF
,
and for the type-level list membership predicate Member
.
class ShowList a where showl :: [a] -> String -- Enumeration of special cases type Special = TYPEOF Bool :/ TYPEOF Char :/ NIL -- Discriminator instance (flag ~ (Member AC_TREPEQ (TYPEOF a) Special), ShowList' flag a) => ShowList a where showl = showl' (undefined::flag) -- A replica of the original class and its instances class ShowList' flag a where showl' :: flag -> [a] -> String -- General instance instance Show a => ShowList' HFalse a where ... -- Special instances, for lists of specific types instance ShowList' HTrue Bool where ... instance ShowList' HTrue Char where showl' _ = id
The single ShowList
instance checks to see if the type in question is
mentioned in the list of exceptions Special
, thus discriminating
between the general and special cases. The class ShowList'
and
its instances are the replica of the original ShowList
,
with the extra parameter flag
, which resolves overlapping.
Modulo the dummy flag
parameter, the code of ShowList
's members
is unchanged. Special
defines the list of specializations. The members
of the list may contain wildcards ANY
and so the set of types to
specialize among could be infinite -- but their lower specialization
bound is fixed. Adding new special instances does not require
modifications to the existing instances but does require changes
to the code to extend Special
, adding TYPEREP
s of those special
types. The type alias Special
may be in a module of its own,
along with other configuration parameters.
In general, suppose we have declared class C a
with
N special, mutually non-overlapping instances, and one
catch-all instance C a
. To eliminate overlapping, we define
an auxiliary class C' flag a
with N special and one catch-all
instance, all non-overlapping, and a type-level list Special
.
The original class C
will have one, dispatching instance. Our
approach thus calls for one extra class, one extra instance and
one type-level list. In a more complex case of full overlap
class C a instance C [Int] instance C [a] instance C athe members of
Special
include wildcards. We use matching rather than
the simple equality when searching through Special
. The rest
remains the same.The approach lets us implement hypothetical closed type instances. Such an extension is frequently mentioned -- but not implemented in GHC. The implementation will be non-trivial, requiring the change in the instance resolution algorithm. Our approach gives closed type families now.
-- hypothetical code type instance F where F Int = Bool F a = [a]Here is the complete code emulating the same closed type family with overlapping, in the existing GHC 7.0.4.
type family F a :: * type instance F x = F' (TREPEQ (TYPEOF x) (TYPEOF Int)) x type family F' flag x type instance F' HTrue x = Bool type instance F' HFalse x = [x] newtype W x = W (F x) w1 = W True :: W Int w2 = W "abc" :: W CharWe stress that
TREPEQ
is the total equality comparison predicate
defined in closed form, once and for all.There have been other proposals to eliminate overlapping instances: closed type families, instance chains, disequality constraints. Only instance chains have been implemented, in an experimental Haskell-like system. None of these proposal have been realized in GHC and none are easily implementable, requiring significant changes to the instance resolution algorithm or System FC, with unknown interactions with other features. All these proposals to sound overlapping also make the set of special cases closed, thus taking the closed-world assumption.
We have described the re-writing procedure that eliminates the use of
OverlappingInstances
. It is the only presently available way
to sound overlapping, requiring no changes to the type checker or
its theory. We advocate abolishing overlapping instances,
claiming it results in no loss of expressiveness.
ExampleD.hs [4K]
Implementing the challenging example posed by David Mazi`eres.
The example seemed to require overlapping type constructor instances,
and hence *->*
-type equality.
GMapSpecT.hs [6K]
Adding a special case to an existing library of overloaded functions: re-implementing the example GMapSpec.hs, this time without overlapping instances.
The described new approach to type equality and overlapping elimination was first discussed on the Haskell' mailing list in June-August 2011.
oleg-at-pobox.com or oleg-at-okmij.org
Your comments, problem reports, questions are very welcome!
Converted from HSXML by HSXML->HTML