An introduction to Generic Programming and Generic Haskell
Johan Jeuring
March 18, 2002
Overview of the lecture
- Different kinds of polymorphism
- Data types in Haskell
- Equality on different data types
- Equality as a type-indexed value
- Binary encoding as a type-indexed value
Introduction
A generic program is a program that the programmer writes once, but which
works for many different data types.
This lecture introduces data types and generic programs, and shows how generic
programs can be implemented in Generic Haskell.
Parametric polymorphism
Parametric polymorphic functions are ‘generic’ functions that work on any type.
For example, the function flatten flattens a list of lists:
flatten [[1,2],[3,4],[5]] = [1,2,3,4,5]
Function flatten is defined by:
flatten        :: [[a]] -> [a]
flatten []     = []
flatten (x:xs) = x ++ flatten xs
This is an example of a parametric polymorphic function: the type variable a is
arbitrary. Other examples are the identity function id :: a -> a, the map
function, the fold function and many other prelude functions.
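The definition of flatten can be tried out directly. The following is a minimal sketch in plain Haskell; the names flattenedInts and flattenedStrings are illustrative only and do not appear in the lecture.

```haskell
-- flatten works for any element type a: it never inspects the elements.
flatten :: [[a]] -> [a]
flatten []     = []
flatten (x:xs) = x ++ flatten xs

-- The same definition used at two different element types
-- (both values are hypothetical examples).
flattenedInts :: [Int]
flattenedInts = flatten [[1,2],[3,4],[5]]

flattenedStrings :: [String]
flattenedStrings = flatten [["generic"],[],["Haskell"]]
```

Because flatten never looks at the list elements, a single definition serves both [Int] and [String].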
Ad hoc polymorphism
Ad hoc polymorphic functions are functions that can be reused on a limited
number of types. For example:
2 + 4
3.14 + 2.7666
In Haskell, ad hoc polymorphic functions are methods of a class:
class (Eq a, Show a) => Num a where
  (+) :: a -> a -> a
  ...
The instances for integers and floats should be given:
instance Num Int where...
instance Num Float where...
The operator (+) is an example of an ad hoc polymorphic function.
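To see the mechanism without clashing with the Prelude's Num class, the sketch below uses a hypothetical class Addable with a single method add. The class, its instances and the example values are assumptions made for illustration; they are not part of the lecture or of the Prelude.

```haskell
-- A hypothetical class standing in for Num, with one overloaded method.
class Addable a where
  add :: a -> a -> a

-- Every supported type needs its own instance.
instance Addable Int where
  add = (+)

instance Addable Double where
  add = (+)

-- Instances need not share an implementation: here add is concatenation.
instance Addable [b] where
  add = (++)

sumInt :: Int
sumInt = add 2 4

joined :: String
joined = add "ad" "hoc"
```

A call such as add True False is rejected by the type checker, because no instance for Bool has been given: the function is reusable only on the limited set of types that have instances.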
‘Generic’ polymorphism or polytypism
Some functions do the same thing for different data types, but still you have to
write instances on the data types. For example, for the equality function we have
on lists of integers:
eqList               :: [Int] -> [Int] -> Bool
eqList []     []     = True
eqList (m:ms) (n:ns) = m==n && eqList ms ns
eqList []     (n:ns) = False
eqList (m:ms) []     = False
and on trees:
data Tree = Leaf
          | Node Tree Int Tree
eqTree                           :: Tree -> Tree -> Bool
eqTree Leaf         Leaf         = True
eqTree (Node l n r) (Node v m w) =
  eqTree l v && n==m && eqTree r w
eqTree _            _            = False
What is generic programming?
- Many programs work in essentially the same way, but on different data types.
- But still separate code has to be written for each data type.
- Examples:
  - Summing or collecting all values in a data structure.
  - Printing and parsing values.
  - Systematically changing all values in a data structure (mapping).
- The approach to generic programming we take works by induction on the
  structure of types.
Haskell data types: simple examples
A Haskell data type consists of one or more constructors which each take zero or
more arguments. Here is a simple non-recursive data type:
data WeekDay = Monday | Tuesday | Wednesday | Thursday
             | Friday | Saturday | Sunday
ex :: WeekDay
ex = Monday
The built-in data type of polymorphic, recursive lists can be seen as defined by
data [a] = [] | (:) a [a]
ex :: [Int]
ex = [3,-10]
Haskell data types: regular
A regular data type is a data type in which occurrences of the type on the
right-hand side of the = symbol are equal to the occurrence on the left-hand side.
Here are some (more) examples. A type of user-defined lists:
data List a = Nil | Cons a (List a)
ex :: List Int
ex = Cons 3 (Cons (-10) Nil)
The trees defined on a previous slide are also regular. And the following type of
infinite lists, which uses field labels, is also regular:
data InfList a = InfList{ ihead::a , itail::InfList a}
ex :: InfList Bool
ex = InfList {ihead=True, itail=ex}
Haskell data types: higher-order
The type of rose trees consists of values that may have lists of children.
data Rose a = Fork a [Rose a]
ex :: Rose String
ex = Fork "A" [Fork "rose" [],Fork "is" [],Fork "a" []]
If we abstract from the data type lists in this declaration, we obtain the data type
of generalised rose trees.
data GRose f a = GFork a (f (GRose f a))
ex :: GRose [] Int
ex = GFork 3 [GFork 10 [],GFork 2 [GFork (-3) []]]
This is a higher-order data type.
Haskell data types: fixed-points
We can define a data type as an initial fixed-point in Haskell, using the following
newtype (instead of data type) for fixed-points.
newtype Fix f = In (f (Fix f))
For example, we can define a type for lists using Fix:
data ListBase a b = NilL | ConsL a b
type FixList a = Fix (ListBase a)
ex :: FixList Int
ex = In (ConsL 3 (In (ConsL 5 (In NilL))))
Fix is also a higher-order data type.
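One way to see that FixList a describes the same values as ordinary lists is to convert between the two. The conversion functions toFixList and fromFixList below are hypothetical helpers added for illustration; they are not part of the lecture.

```haskell
newtype Fix f = In (f (Fix f))

data ListBase a b = NilL | ConsL a b

type FixList a = Fix (ListBase a)

-- Wrap every layer of an ordinary list in the In constructor.
toFixList :: [a] -> FixList a
toFixList []     = In NilL
toFixList (x:xs) = In (ConsL x (toFixList xs))

-- Unwrap the layers again.
fromFixList :: FixList a -> [a]
fromFixList (In NilL)         = []
fromFixList (In (ConsL x xs)) = x : fromFixList xs

-- The example value from the slide.
ex :: FixList Int
ex = In (ConsL 3 (In (ConsL 5 (In NilL))))
```

ListBase itself is not recursive: the recursion is supplied entirely by Fix, which ties the second type argument of ListBase back to the list type.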
Haskell data types: non-regular or nested
Non-regular or nested types also appear in programming problems. For example,
to specify perfectly balanced trees, we can use the following data type:
data Perfect a = ZeroP a | SuccP (Perfect (Fork a))
data Fork a = Fork a a
ex :: Perfect Int
ex = SuccP (SuccP (SuccP (ZeroP (Fork (Fork (Fork 2 3)
                                            (Fork 5 7))
                                      (Fork (Fork 11 13)
                                            (Fork 17 19))))))
Haskell data types: corresponding to XML
The following XML Document Type Definition:
<!DOCTYPE BookDTD [
<!ELEMENT book (introduction,chapter*)>
<!ELEMENT introduction (prologue|preface)>
<!ELEMENT prologue (#PCDATA)>
<!ELEMENT preface (#PCDATA)>
<!ELEMENT chapter (#PCDATA)>
]>
can be represented by the following Haskell data type:
data Book         = Book Introduction [Chapter]
data Introduction = IPrologue Prologue | IPreface Preface
data Prologue     = Prologue String
data Preface      = Preface String
data Chapter      = Chapter String
Example: equality
Checking whether two values of the same data type are equal is easy: use
deriving Eq and ==.
data Tree = Leaf
          | Node Tree Int Tree deriving Eq
test = Node Leaf 3 Leaf == Leaf
That doesn’t always work, for example on higher-order data types.
We can use the following recipe:
- Check whether the two values are in the same alternative:
  - If not, they are not equal.
  - Otherwise, they are equal only if all components are equal.
The last item needs to be checked by using the appropriate equality functions.
Equality on regular data types
We have seen equality functions on regular data types on the slide on generic
polymorphism.
eqList               :: [Int] -> [Int] -> Bool
eqList []     []     = True
eqList (m:ms) (n:ns) = m==n && eqList ms ns
eqList []     (n:ns) = False
eqList (m:ms) []     = False
Equality on trees:
eqTree                           :: Tree -> Tree -> Bool
eqTree Leaf         Leaf         = True
eqTree (Node l n r) (Node v m w) =
  eqTree l v && n==m && eqTree r w
eqTree _            _            = False
Equality and type abstraction
If we abstract from the integers in the tree:
data Tree a = Leaf
            | Node (Tree a) a (Tree a)
then we need to supply a function telling us how to compare values of type a.
The equality function now becomes:
eqTree :: (a -> a -> Bool) -> Tree a -> Tree a -> Bool
eqTree eqA Leaf         Leaf         = True
eqTree eqA (Node l a r) (Node v b w) =
  eqTree eqA l v && eqA a b && eqTree eqA r w
eqTree eqA _            _            = False
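Passing the element comparison as an argument means the caller decides what "equal" means for the labels. A minimal sketch in plain Haskell; the comparison eqIgnoreCase and the example trees are hypothetical and only serve to show two different choices for eqA.

```haskell
import Data.Char (toLower)

data Tree a = Leaf
            | Node (Tree a) a (Tree a)

eqTree :: (a -> a -> Bool) -> Tree a -> Tree a -> Bool
eqTree _   Leaf         Leaf         = True
eqTree eqA (Node l a r) (Node v b w) =
  eqTree eqA l v && eqA a b && eqTree eqA r w
eqTree _   _            _            = False

-- A hypothetical element comparison that ignores case.
eqIgnoreCase :: Char -> Char -> Bool
eqIgnoreCase c d = toLower c == toLower d

t1, t2 :: Tree Char
t1 = Node Leaf 'a' Leaf
t2 = Node Leaf 'A' Leaf

-- The same trees compared with two different element equalities.
strictResult, relaxedResult :: Bool
strictResult  = eqTree (==) t1 t2          -- labels differ under (==)
relaxedResult = eqTree eqIgnoreCase t1 t2  -- labels agree when case is ignored
```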
Equality on higher-order data types
Since higher-order data types take type constructors as arguments, their equality
functions need equality function constructors as arguments.
eqGRose ::
  (forall b.(b->b->Bool) -> f b -> f b -> Bool) ->
  (a -> a -> Bool) ->
  GRose f a -> GRose f a -> Bool
eqGRose eqF eqA (GFork a xs) (GFork b ys) =
  eqA a b && eqF (eqGRose eqF eqA) xs ys
For example, here is a legal call of eqGRose:
eqGRose eqTree
        (==)
        (GFork 3 Leaf)
        (GFork 5 (Node Leaf (GFork 5 Leaf) Leaf))
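The type of eqGRose takes a polymorphic function as an argument, so it needs rank-2 types. The following is a minimal sketch, assuming GHC with the RankNTypes extension. The list equality eqList (here parameterised over the element equality) and the example values r1, r2 and l1 are added for illustration.

```haskell
{-# LANGUAGE RankNTypes #-}

data Tree a    = Leaf | Node (Tree a) a (Tree a)
data GRose f a = GFork a (f (GRose f a))

eqTree :: (a -> a -> Bool) -> Tree a -> Tree a -> Bool
eqTree _   Leaf         Leaf         = True
eqTree eqA (Node l a r) (Node v b w) =
  eqTree eqA l v && eqA a b && eqTree eqA r w
eqTree _   _            _            = False

-- List equality, parameterised over the element equality (assumed helper).
eqList :: (a -> a -> Bool) -> [a] -> [a] -> Bool
eqList _   []     []     = True
eqList eqA (x:xs) (y:ys) = eqA x y && eqList eqA xs ys
eqList _   _      _      = False

-- eqF must work for every element type b, because it is used at
-- b = GRose f a in the recursive position.
eqGRose :: (forall b. (b -> b -> Bool) -> f b -> f b -> Bool)
        -> (a -> a -> Bool)
        -> GRose f a -> GRose f a -> Bool
eqGRose eqF eqA (GFork a xs) (GFork b ys) =
  eqA a b && eqF (eqGRose eqF eqA) xs ys

-- The call from the slide: children are stored in a Tree.
r1, r2 :: GRose Tree Int
r1 = GFork 3 Leaf
r2 = GFork 5 (Node Leaf (GFork 5 Leaf) Leaf)

roseTreeResult :: Bool
roseTreeResult = eqGRose eqTree (==) r1 r2

-- The same function with lists as the container.
l1 :: GRose [] Int
l1 = GFork 3 [GFork 10 [], GFork 2 [GFork (-3) []]]

roseListResult :: Bool
roseListResult = eqGRose eqList (==) l1 l1
```

The call on the slide is legal in the sense that it type-checks; its result is False, since the root labels 3 and 5 differ.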
Equality on nested data types
In order to define equality on nested data types we use so-called polymorphic
recursion.
eqFork :: (a -> a -> Bool) -> Fork a -> Fork a -> Bool
eqFork eqA (Fork a b) (Fork c d) = eqA a c && eqA b d
eqPerfect ::
  (a -> a -> Bool) ->
  Perfect a -> Perfect a -> Bool
eqPerfect eqA (ZeroP a) (ZeroP b) = eqA a b
eqPerfect eqA (SuccP p) (SuccP q) =
  eqPerfect (eqFork eqA) p q
eqPerfect eqA _         _         = False
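The recursive call of eqPerfect is at type Perfect (Fork a) rather than Perfect a, which is what makes the recursion polymorphic. A minimal runnable sketch; the example values small, smallChanged and shallow are hypothetical.

```haskell
data Fork a    = Fork a a
data Perfect a = ZeroP a | SuccP (Perfect (Fork a))

eqFork :: (a -> a -> Bool) -> Fork a -> Fork a -> Bool
eqFork eqA (Fork a b) (Fork c d) = eqA a c && eqA b d

-- The type signature cannot be omitted: Haskell does not infer the
-- types of polymorphically recursive functions.
eqPerfect :: (a -> a -> Bool) -> Perfect a -> Perfect a -> Bool
eqPerfect eqA (ZeroP a) (ZeroP b) = eqA a b
eqPerfect eqA (SuccP p) (SuccP q) = eqPerfect (eqFork eqA) p q
eqPerfect _   _         _         = False

-- Two perfect trees of depth 2 that differ in one leaf,
-- and one tree of depth 1.
small, smallChanged, shallow :: Perfect Int
small        = SuccP (SuccP (ZeroP (Fork (Fork 2 3) (Fork 5 7))))
smallChanged = SuccP (SuccP (ZeroP (Fork (Fork 2 3) (Fork 5 8))))
shallow      = SuccP (ZeroP (Fork 2 3))
```

Each SuccP wraps the element equality in one more eqFork, so by the time a ZeroP is reached, eqA compares values of exactly the nested Fork type stored there.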
Defining type-indexed values
- How can we define equality in general?
- We need to know how to handle
  - Different alternatives: disjoint sums.
  - Components in a constructor: tuples.
  - Constructors and field labels.
  - Primitive types.
  - Function space constructors.
- We must be able to rewrite every data type using the above constructs.
- The transformations are straightforward to do by hand. If you want to have a
  compiler perform them in a structured way, on the other hand...
Data types as sums of products of ...
Haskell’s data construct combines several features in a single form: type
abstraction, type recursion, (labelled) sums, and (possibly labelled) products. For
example:
Tree   = Unit :+: (Tree :*: Int :*: Tree)
List a = Unit :+: (a :*: List a)
So a list is a sum of two components, where the first component is ‘empty’, and
the second component is a pair consisting of an element and a list. So :+: or
Sum represents sums, and :*: or Prod represents products.
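Read this way, a data type comes with a pair of conversions between its values and the sum-of-products view. The sketch below spells these out for List a in plain Haskell, writing Sum and Prod for :+: and :*: at the type level. The names ListRep, toRep, fromRep and toBuiltin are hypothetical and not part of Generic Haskell.

```haskell
data Unit     = Unit
data Sum a b  = Inl a | Inr b
data Prod a b = a :*: b

data List a = Nil | Cons a (List a)

-- List a = Unit :+: (a :*: List a); only the top layer is translated,
-- the tail is still an ordinary List a.
type ListRep a = Sum Unit (Prod a (List a))

toRep :: List a -> ListRep a
toRep Nil         = Inl Unit
toRep (Cons x xs) = Inr (x :*: xs)

fromRep :: ListRep a -> List a
fromRep (Inl Unit)       = Nil
fromRep (Inr (x :*: xs)) = Cons x xs

-- Helper for inspecting results as built-in lists.
toBuiltin :: List a -> [a]
toBuiltin Nil         = []
toBuiltin (Cons x xs) = x : toBuiltin xs

ex :: List Int
ex = Cons 3 (Cons (-10) Nil)
```

The two conversions are mutually inverse, so nothing is lost by working on the representation instead of the original type.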
Encoding constructors
The above encoding of data types as sums of products does not mention
constructor names yet. Here is the correct encoding:
Tree = Con conLeaf Unit :+:
       Con conNode (Tree :*: Int :*: Tree)
conLeaf :: ConDescr
conLeaf = ConDescr{ conName="Leaf", conArity=0,...}
conNode = ...
List a = Con conNil Unit :+: Con conCons (a :*: List a)
conNil, conCons :: ConDescr
Constructor descriptions
Constructors are encoded as follows:
data Con a = Con ConDescr a
data ConDescr = ConDescr
{ conName :: String
, conArity :: Int
, ...
}
ConDescr contains information about the constructor, such as:
- Its name,
- Its number of arguments,
- Whether it has field labels,
- ...
Defining type-indexed values
Type-indexed values are now defined by induction on the structure of types. So,
for example, for equality on trees:
Tree = Con conLeaf Unit :+:
       Con conNode (Tree :*: Int :*: Tree)
we have to specify how to define equality on sums, on products, on constructors,
on units, and on integers.
Defining type-indexed values: sums
Disjoint sums are defined as follows:
data Sum a b = Inl a | Inr b
Sum is isomorphic to the Either type. The code for equality on sums:
eq {| :+: |} :: (a -> a -> Bool) -> (b -> b -> Bool) ->
                (Sum a b -> Sum a b -> Bool)
eq {| :+: |} eqA eqB (Inl a) (Inl b) = eqA a b
eq {| :+: |} eqA eqB (Inl a) (Inr b) = False
eq {| :+: |} eqA eqB (Inr b) (Inl a) = False
eq {| :+: |} eqA eqB (Inr a) (Inr b) = eqB a b
Defining type-indexed values: products
Products (tuples) are defined as follows:
data Prod a b = a :*: b
There is a special case for 0-tuples:
data Unit = Unit
The code for equality on products:
eq {| Unit |} :: Unit -> Unit -> Bool
eq {| Unit |} Unit Unit = True
eq {| :*: |} :: (a -> a -> Bool) -> (b -> b -> Bool) ->
                (Prod a b -> Prod a b -> Bool)
eq {| :*: |} eqA eqB (a :*: b) (c :*: d) =
  eqA a c && eqB b d
Defining type-indexed values: constructors
Constructors:
data Con a = Con ConDescr a
Equality does not need the constructor description information:
eq {| Con c |} eqA (Con _ a) (Con _ b) = eqA a b
The same approach works for field labels.
Defining type-indexed values: primitive types
Equality on primitive types is defined as follows:
eq {| Int |}  = (==)
eq {| Char |} = (==)
eq {| IO |}   = error "equality not defined for IO types"
Defining type-indexed values: equality
And here is the complete code:
eq {| Int |}                                 = (==)
eq {| Char |}                                = (==)
eq {| IO |}                                  =
  error "equality not defined for IO types"
eq {| Unit |} Unit Unit                      = True
eq {| :*: |} eqA eqB (a :*: b) (c :*: d)     =
  eqA a c && eqB b d
eq {| :+: |} eqA eqB (Inl a) (Inl b)         = eqA a b
eq {| :+: |} eqA eqB (Inl a) (Inr b)         = False
eq {| :+: |} eqA eqB (Inr b) (Inl a)         = False
eq {| :+: |} eqA eqB (Inr a) (Inr b)         = eqB a b
eq {| Con c |} eqA (Con _ a) (Con _ b)       = eqA a b
eq {| Label l |} eqA (Label _ a) (Label _ b) = eqA a b
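The Generic Haskell code for eq cannot be run by an ordinary Haskell compiler, but its cases can be emulated by hand. The sketch below is such an emulation in plain Haskell, not the output of the Generic Haskell compiler: each case of eq becomes an ordinary function (the names eqUnit, eqSum and eqProd are hypothetical), and equality on Tree is obtained by converting to the sum-of-products view with a hand-written toRep. The Con and Label wrappers are left out for brevity.

```haskell
data Unit     = Unit
data Sum a b  = Inl a | Inr b
data Prod a b = a :*: b

-- One function per case of the type-indexed eq.
eqUnit :: Unit -> Unit -> Bool
eqUnit Unit Unit = True

eqSum :: (a -> a -> Bool) -> (b -> b -> Bool) -> Sum a b -> Sum a b -> Bool
eqSum eqA _   (Inl a) (Inl b) = eqA a b
eqSum _   eqB (Inr a) (Inr b) = eqB a b
eqSum _   _   _       _       = False

eqProd :: (a -> a -> Bool) -> (b -> b -> Bool) -> Prod a b -> Prod a b -> Bool
eqProd eqA eqB (a :*: b) (c :*: d) = eqA a c && eqB b d

data Tree = Leaf | Node Tree Int Tree

-- Tree = Unit :+: (Tree :*: Int :*: Tree)
type TreeRep = Sum Unit (Prod Tree (Prod Int Tree))

toRep :: Tree -> TreeRep
toRep Leaf         = Inl Unit
toRep (Node l n r) = Inr (l :*: (n :*: r))

-- Equality on Tree, assembled by following the structure of TreeRep.
eqTree :: Tree -> Tree -> Bool
eqTree s t =
  eqSum eqUnit (eqProd eqTree (eqProd (==) eqTree)) (toRep s) (toRep t)
```

The expression passed to eqSum has the same shape as the type TreeRep: every :+: becomes eqSum, every :*: becomes eqProd, and Int becomes (==).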
Binary encoding and decoding
The next problem we look at is to encode elements of a given data type as bit
streams.
type Bin = [Bit]
data Bit = O | I
We want to have two functions that encode and decode values to and from
bitstreams:
encode :: t -> Bin
decode :: Bin -> t
with the property that their composition decode . encode is the identity
function. These are idealised types; we will give the exact types later.
Binary encoding on integer lists
Here is an instance of the encoding function on lists of integers.
encodeIList        :: [Int] -> Bin
encodeIList []     = [O]
encodeIList (x:xs) = I:encodeInt x ++ encodeIList xs
To decode such a stream, we use the function decodeIList:
decodeIList        :: Bin -> ([Int],Bin)
decodeIList []     = error "decodeIList"
decodeIList (O:xs) = ([],xs)
decodeIList (I:xs) = let (i,xs')   = decodeInt xs
                         (is,xs'') = decodeIList xs'
                     in (i:is,xs'')
For these functions we have
fst (decodeIList (encodeIList xs)) = xs
provided encodeInt and decodeInt satisfy the same property.
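The lecture does not define encodeInt and decodeInt. To make the round trip checkable, the sketch below assumes a fixed-width encoding of 8 bits, most significant bit first, which is valid only for integers from 0 to 255. This encoding and the constant intWidth are assumptions made purely for illustration.

```haskell
data Bit = O | I deriving (Eq, Show)
type Bin = [Bit]

-- ASSUMPTION: integers are stored in 8 bits, most significant bit first.
-- This only works for values in the range 0..255.
intWidth :: Int
intWidth = 8

encodeInt :: Int -> Bin
encodeInt n = [ bit k | k <- [intWidth - 1, intWidth - 2 .. 0] ]
  where bit k = if (n `div` (2 ^ k)) `mod` 2 == 1 then I else O

decodeInt :: Bin -> (Int, Bin)
decodeInt bin = (foldl step 0 bits, rest)
  where
    (bits, rest) = splitAt intWidth bin
    step acc I = 2 * acc + 1
    step acc O = 2 * acc

-- O marks the end of the list, I announces another element.
encodeIList :: [Int] -> Bin
encodeIList []     = [O]
encodeIList (x:xs) = I : encodeInt x ++ encodeIList xs

-- Returns the decoded list together with the unconsumed bits.
decodeIList :: Bin -> ([Int], Bin)
decodeIList []     = error "decodeIList"
decodeIList (O:xs) = ([], xs)
decodeIList (I:xs) =
  let (i,  xs')  = decodeInt xs
      (is, xs'') = decodeIList xs'
  in  (i : is, xs'')
```

Returning the remaining bits alongside the result is what lets decoders be chained: each one consumes its own prefix and hands the rest to the next.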
Binary encoding on trees
For trees we have
encodeTree              :: Tree -> Bin
encodeTree Leaf         = [O]
encodeTree (Node l a r) =
  I:encodeTree l ++ encodeInt a ++ encodeTree r
decodeTree        :: Bin -> (Tree,Bin)
decodeTree []     = error "decodeTree"
decodeTree (O:xs) = (Leaf,xs)
decodeTree (I:xs) = let (l,xs')   = decodeTree xs
                        (i,xs'')  = decodeInt xs'
                        (r,xs''') = decodeTree xs''
                    in (Node l i r,xs''')
These functions satisfy the same property again.
Type-indexed binary encoding
The type-indexed function encode is defined as follows.
encode {| Int |} i                   = encodeInt i
encode {| Unit |} Unit               = []
encode {| :*: |} encA encB (a :*: b) = encA a ++ encB b
encode {| :+: |} encA encB (Inl a)   = O:encA a
encode {| :+: |} encA encB (Inr b)   = I:encB b
encode {| Con c |} encA (Con _ a)    = encA a
Type-indexed binary decoding
The type-indexed function decode is defined as follows.
decode {| Int |} bin               = decodeInt bin
decode {| Unit |} bin              = (Unit,bin)
decode {| :*: |} decA decB bin     =
  let (a,bin')  = decA bin
      (b,bin'') = decB bin'
  in (a:*:b,bin'')
decode {| :+: |} decA decB (O:bin) =
  let (a,bin') = decA bin in (Inl a, bin')
decode {| :+: |} decA decB (I:bin) =
  let (b,bin') = decB bin in (Inr b, bin')
decode {| Con c |} decA bin        = decA bin
Again, the pair of functions encode/decode satisfy the property. The proof of
this property is more complicated, however.
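The round-trip property can at least be tested on examples by emulating the type-indexed functions in plain Haskell. The sketch below is a hand-written emulation, not Generic Haskell output. The type synonyms Encoder and Decoder, the combinator names, the conversions toRep and fromRep, and the 8-bit integer encoding (valid for 0 to 255 only) are all assumptions made for illustration. The Con case is omitted.

```haskell
data Bit = O | I deriving (Eq, Show)
type Bin = [Bit]

data Unit     = Unit
data Sum a b  = Inl a | Inr b
data Prod a b = a :*: b

type Encoder a = a -> Bin
type Decoder a = Bin -> (a, Bin)

-- One encoder per structure constructor, mirroring the cases of encode.
encUnit :: Encoder Unit
encUnit Unit = []

encSum :: Encoder a -> Encoder b -> Encoder (Sum a b)
encSum encA _    (Inl a) = O : encA a
encSum _    encB (Inr b) = I : encB b

encProd :: Encoder a -> Encoder b -> Encoder (Prod a b)
encProd encA encB (a :*: b) = encA a ++ encB b

-- One decoder per structure constructor, mirroring the cases of decode.
decUnit :: Decoder Unit
decUnit bin = (Unit, bin)

decSum :: Decoder a -> Decoder b -> Decoder (Sum a b)
decSum decA _    (O:bin) = let (a, bin') = decA bin in (Inl a, bin')
decSum _    decB (I:bin) = let (b, bin') = decB bin in (Inr b, bin')
decSum _    _    []      = error "decSum: empty bit stream"

decProd :: Decoder a -> Decoder b -> Decoder (Prod a b)
decProd decA decB bin =
  let (a, bin')  = decA bin
      (b, bin'') = decB bin'
  in  (a :*: b, bin'')

-- ASSUMPTION: 8-bit integers, most significant bit first (0..255 only).
encodeInt :: Encoder Int
encodeInt n =
  [ if (n `div` (2 ^ k)) `mod` 2 == 1 then I else O | k <- [7, 6 .. 0 :: Int] ]

decodeInt :: Decoder Int
decodeInt bin = (foldl step 0 bits, rest)
  where
    (bits, rest) = splitAt 8 bin
    step acc I = 2 * acc + 1
    step acc O = 2 * acc

data Tree = Leaf | Node Tree Int Tree deriving (Eq, Show)
type TreeRep = Sum Unit (Prod Tree (Prod Int Tree))

toRep :: Tree -> TreeRep
toRep Leaf         = Inl Unit
toRep (Node l n r) = Inr (l :*: (n :*: r))

fromRep :: TreeRep -> Tree
fromRep (Inl Unit)              = Leaf
fromRep (Inr (l :*: (n :*: r))) = Node l n r

-- Both functions follow the shape of TreeRep.
encodeTree :: Encoder Tree
encodeTree t =
  encSum encUnit (encProd encodeTree (encProd encodeInt encodeTree)) (toRep t)

decodeTree :: Decoder Tree
decodeTree bin =
  let (rep, rest) =
        decSum decUnit (decProd decodeTree (decProd decodeInt decodeTree)) bin
  in  (fromRep rep, rest)
```

On Tree, the assembled encoder produces the same bits as the hand-written encodeTree shown earlier: [O] for a Leaf, and I followed by the left subtree, the label and the right subtree for a Node.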
Conclusions and next lecture
This lecture has shown:
- several forms of polymorphism;
- different kinds of data types in Haskell;
- equality on several data types;
- type-indexed equality;
- type-indexed binary encoding and decoding.
Next lecture:
- what is the type of type-indexed values? Kind-indexed types!
- which functions can be defined as type-indexed functions?
- the library of Generic Haskell.