Functional Programming in SML - Compound Data Types
So far this series concentrated on building functional abstractions. The functions themselves manipulated only simple data, like numbers. But simple data types are not sufficient for modelling real-world objects that happen to be combinations of many parts.
This post and the few that follows will deal with compound data. Let’s start with the most important compound data object in functional programming - the list.
A list is a finite sequence of elements. (A list may also be infinite, but let’s explore this bizzare idea in a later post). The elements of a list are enclosed in opening and closing square-brackets and separated by comma.
- [1, 2, 3]; (* val it = [1,2,3] : int list *) - [#"a", #"e", #"i", #"o", #"u"]; (* val it = [#"a",#"e",#"i",#"o",#"u"] : char list *)
An empty list is denoted by
. The constant
nil is a synonym for the empty list.
As SML is statically typed, all the elements of a list must be of the same type.
The type of a list is denoted as
t list, where
t is the type of the elements.
The preceding example shows an
int list and a
Behind the scenes, a list is constructed using two primitives - the constant
nil (empty list) and the infix operator
cons (for “construct”). The following example shows a verbose version of building the list
[1, 2, 3] using
- 1::2::3::nil; val it = [1,2,3] : int list
This means, a list conforms to either one of two “data patterns” -
- an empty list.
- h::t - where
his the head of the list and
tis its tail.
The head of a list points to the first element and tail points to the rest of the elements.
The tail of a properly constructed list will always be another list or
The elements can be other complex objects like tuples or lists.
The following program defines an employee database. The database is represented as a list, each employee being
a tuple of type
id:int * first_name:string * last_name:string * salary:real:
- val employee_db = [(1, "Mat", "J", 4500.0), (2, "Sue", "P", 5600.20), (3, "Joe", "C", 2300.0)]; (* val employee_db = [(1,"Mat","J",4500.0),(2,"Sue","P",5600.20),(3,"Joe","C",2300.0)] : (int * string * string * int) list *)
Once you have a list, you would want to access its elements.
hd function lets you fetch the first element in a list and the
returns its “tail” or the rest of the elements:
- hd [10, 20, 30]; (* val it = 10 : int *) - tl [10, 20, 30]; (* val it = [20,30] : int list *)
Assume that you want to access the
nth employee in the “database” that we defined earlier.
Using calls to
tl, you can define a function that achieves this:
fun nth n ls x = if ls =  then x else if n = 0 then (hd ls) else nth (n - 1) (tl ls) x (* val nth = fn : int -> ''a list -> ''a -> ''a *)
nth takes three arguments - a position of the item to find (indicated by the integer parameter
the list to search (
ls) and a value to return if
n is out-of-bounds (
- nth 3 ["a", "f", "k", "g"] "z"; (* val it = "g" : string *) - nth 10 ["a", "f", "k", "g"] "z"; (* val it = "z" : string *)
The inferred type of the function indicates that the list
ls is polymorphic, i.e it can be a list with
elements of any type, with one restriction as we shall see now.
You might be wondering why the type variable
a has two single-quotes as prefix, instead of just one.
This means that, though the list is polymorphic, its elements should belong to a class of types known as
equality types. The values of these types can be tested for equality with the
Most basic types - integer, boolean, character, and string - are equality types.
Tuples and lists whose elements are of equality types can also be compared using the
The first clause of the
if expression in the
nth function makes this assumption and applies the
to the list. Hence the type of
ls is inferred as
''a list instead of
'a list. So only lists with values belonging to
an equality type can be passed to
Earlier we saw that a list has the pattern
h is the first element (head) and
t are the rest of the
elements (tail). SML allows us to define functions on the basis of the “pattern” of its arguments. We can redefine
using this idea. Instead of the
if expression, we will write the function as a bunch of clauses that matches the patterns
of the arguments and takes appropriate action.
fun nth _  x = x | nth 0 (h::_) _ = h | nth n (_::t) x = nth (n-1) t x (* val nth = fn : int -> 'a list -> 'a -> 'a *)
The clauses are evaluated in the order they are defined. So we need to handle the base-case of the list being empty in the
first clause itself. In this case, we return the
x. We have no use for the
position parameter here, so we ignore that with the wildcard
The next clause checks the case where
n has become
0. This means we have reached the item we are looking for, which is at the
head of the list. The head and tail of the list is matched by the pattern (h::t). Here, we return
h and ignore other values.
The last clause recursively calls
nth with the decremented value of
n and the tail of the list. This recursion will eventually reach either
the first or second clauses, thus terminating the function.
With pattern matching, we no longer need the
= check on the list. Now the inferred type of the list is
'a list, indicating
that there are no longer any restrictions on the types of elements.
The number of elements in a list is reported by the built-in function
- length [10, 20, 30]; (* val it = 3 : int *)
As an exercise in pattern matching, let us define our own version of
length. To avoid
a name conflict, let us call this function
fun count xs = let fun count_loop n  = n | count_loop n (_::t) = count_loop (n+1) t in count_loop 0 xs end
count function makes use of an internal function called
The first argument of
count_loop (i.e the integer
n) is incremented until the list becomes empty.
The final value of
n will be the total number of elements in the list.
- count [10, 20, 30]; (* val it = 3 : int *)
Merging two lists
There are times we want to merge two lists into one. The following function does exactly that:
fun merge  ys = ys | merge xs  = xs | merge (x::xs) ys = x::(merge xs ys)
Let’s see if
merge works as expected:
- merge [1, 2, 3] [4, 5]; (* val it = [1,2,3,4,5] : int list *)
The recursive calls to
merge performed the following operations:
This clearly shows that
merge need not iterate over the second list. So it can run in time
proportional to the length of the first list.
Once we have a list, it should be possible to do something to each of its elements. For instance, you might want to give a salary increase to all employees in our employee database.
We can solve this problem by defining a higher-order function that applies
a function to each element in a list and returns the results in a new list. Traditionally this function is known as
fun map _  =  | map f (x::xs) = (f x)::(map f xs)
Let’s first try
map on a simple list:
- map (fn x => x * x) [1, 2, 3, 4, 5]; (* val it = [1,4,9,16,25] : int list *)
Now let’s attempt the salary increment task. The following function increments the salary of a single employee by a given percentage and returns a new employee record with the updated salary. Notice how it destructures an employee records by pattern matching a tuple:
fun salary_inc percentage (id, fname, lname, sal:real) = (id, fname, lname, sal + (sal * percentage))
By default, SML infers the arguments of arithmetic operators to be of type
But we want the salary to be a
real value, so we have type annotated
Now to provide a 10% salary raise to all employees, just map
salary_inc over the employee database:
- map (fn e => salary_inc 0.1 e) employee_db; (* val it = [(1,"Mat","J",4950.0),(2,"Sue","P",6160.22),(3,"Joe","C",2530.0)] : (int * string * string * real) list *)
What if we want to give a hike to only those employees whose salary is below
We will solve this problem in a number of short steps.
First let’s define a function to check if an employee’s salary is below a given threshold:
fun sal_below threshold (_, _, _, sal:real) = sal < threshold
Next we need a function that sifts a list through a predicate.
This function should return a list with all items in the original list
for which the predicate returned
true. Let’s call this function
fun filter _  =  | filter predic (x::xs) = if (predic x) then x::(filter predic xs) else filter predic xs
To find all employees with salaries less than
5000 we just need to pass a partial function
sal_below along with the
employee_db list to
- val employees_for_incr = filter (sal_below 5000.0) employee_db; (* val employees_for_incr = [(1,"Mat","J",4500.0),(3,"Joe","C",2300.0)] : (int * string * string * real) list *)
Now we can pass the filtered list to
map to increment the salaries:
- map (fn e => salary_inc 0.1 e) employees_for_incr; (* val it = [(1,"Mat","J",4950.0),(3,"Joe","C",2530.0)] : (int * string * string * real) list *)
This post showed you how to maintain a collection of objects using lists and how to write functions to manipulate these collections. It is interesting that to process items in a list, we never had to resort to imperative features like loops. We were able to do this using patterns, recursion and higher-order functions. This ability to express complex data processing declaratively is one of the strongest selling points of functional programming.
The SML basis library defines many useful functions for list processing, including efficient versions of
filter. Please get familiar with these functions by refering
to the online manual.
This post concludes our introduction to the basics of SML. The time has come to move on to some advanced topics in functional programming.
Let’s begin that journey by looking at the functional approach to algorithm and data structure implementation. The next couple of posts will cover this topic which is important to the whole enterprise of software engineering.
As you make progress through the posts on advanced functional programming, you will also learn more features of the SML language, like sum types, modules etc.
Note that name and e-mail are required for posting comments