So far this series concentrated on building functional abstractions. The functions themselves manipulated only simple data, like numbers. But simple data types are not sufficient for modelling real-world objects that happen to be combinations of many parts.

This post and the few that follows will deal with compound data. Let’s start with the most important compound data object in functional programming - the list.

## Making Lists

A list is a finite sequence of elements. (A list may also be infinite, but let’s explore this bizzare idea in a later post). The elements of a list are enclosed in opening and closing square-brackets and separated by comma.

- [1, 2, 3];
(* val it = [1,2,3] : int list *)

- [#"a", #"e", #"i", #"o", #"u"];
(* val it = [#"a",#"e",#"i",#"o",#"u"] : char list *)


An empty list is denoted by []. The constant nil is a synonym for the empty list.

As SML is statically typed, all the elements of a list must be of the same type. The type of a list is denoted as t list, where t is the type of the elements. The preceding example shows an int list and a char list.

Behind the scenes, a list is constructed using two primitives - the constant nil (empty list) and the infix operator :: called cons (for “construct”). The following example shows a verbose version of building the list [1, 2, 3] using :: and nil:

- 1::2::3::nil;
val it = [1,2,3] : int list


This means, a list conforms to either one of two “data patterns” -

1. [] - an empty list.
2. h::t - where h is the head of the list and t is its tail.

The head of a list points to the first element and tail points to the rest of the elements. The tail of a properly constructed list will always be another list or nil.

The elements can be other complex objects like tuples or lists. The following program defines an employee database. The database is represented as a list, each employee being a tuple of type id:int * first_name:string * last_name:string * salary:real:

- val employee_db = [(1, "Mat", "J", 4500.0), (2, "Sue", "P", 5600.20), (3, "Joe", "C", 2300.0)];
(* val employee_db = [(1,"Mat","J",4500.0),(2,"Sue","P",5600.20),(3,"Joe","C",2300.0)]
: (int * string * string * int) list *)


## Basic Operations

Once you have a list, you would want to access its elements. The hd function lets you fetch the first element in a list and the tl function returns its “tail” or the rest of the elements:

- hd [10, 20, 30];
(* val it = 10 : int *)

- tl [10, 20, 30];
(* val it = [20,30] : int list *)


Assume that you want to access the nth employee in the “database” that we defined earlier. Using calls to hd and tl, you can define a function that achieves this:

fun nth n ls x =
if ls = [] then x
else if n = 0 then (hd ls)
else nth (n - 1) (tl ls) x
(* val nth = fn : int -> ''a list -> ''a -> ''a *)


The function nth takes three arguments - a position of the item to find (indicated by the integer parameter n), the list to search (ls) and a value to return if n is out-of-bounds (x).

- nth 3 ["a", "f", "k", "g"] "z";
(* val it = "g" : string *)

- nth 10 ["a", "f", "k", "g"] "z";
(* val it = "z" : string *)


The inferred type of the function indicates that the list ls is polymorphic, i.e it can be a list with elements of any type, with one restriction as we shall see now.

You might be wondering why the type variable a has two single-quotes as prefix, instead of just one. This means that, though the list is polymorphic, its elements should belong to a class of types known as equality types. The values of these types can be tested for equality with the = and <> operators. Most basic types - integer, boolean, character, and string - are equality types.

Tuples and lists whose elements are of equality types can also be compared using the = and <> operators. The first clause of the if expression in the nth function makes this assumption and applies the = operator to the list. Hence the type of ls is inferred as ''a list instead of 'a list. So only lists with values belonging to an equality type can be passed to nth.

### Pattern Matching

Earlier we saw that a list has the pattern h::t, where h is the first element (head) and t are the rest of the elements (tail). SML allows us to define functions on the basis of the “pattern” of its arguments. We can redefine nth using this idea. Instead of the if expression, we will write the function as a bunch of clauses that matches the patterns of the arguments and takes appropriate action.

fun nth _ [] x = x
| nth 0 (h::_) _ = h
| nth n (_::t) x = nth (n-1) t x
(* val nth = fn : int -> 'a list -> 'a -> 'a *)


The clauses are evaluated in the order they are defined. So we need to handle the base-case of the list being empty in the first clause itself. In this case, we return the not-found value x. We have no use for the position parameter here, so we ignore that with the wildcard _ pattern.

The next clause checks the case where n has become 0. This means we have reached the item we are looking for, which is at the head of the list. The head and tail of the list is matched by the pattern (h::t). Here, we return h and ignore other values.

The last clause recursively calls nth with the decremented value of n and the tail of the list. This recursion will eventually reach either the first or second clauses, thus terminating the function.

With pattern matching, we no longer need the = check on the list. Now the inferred type of the list is 'a list, indicating that there are no longer any restrictions on the types of elements.

#### Counting elements

The number of elements in a list is reported by the built-in function length:

- length [10, 20, 30];
(* val it = 3 : int *)


As an exercise in pattern matching, let us define our own version of length. To avoid a name conflict, let us call this function count.

fun count xs =
let fun count_loop n [] = n
| count_loop n (_::t) = count_loop (n+1) t
in
count_loop 0 xs
end


The count function makes use of an internal function called count_loop. The first argument of count_loop (i.e the integer n) is incremented until the list becomes empty. The final value of n will be the total number of elements in the list.

- count [10, 20, 30];
(* val it = 3 : int *)


### Merging two lists

There are times we want to merge two lists into one. The following function does exactly that:

fun merge [] ys = ys
| merge xs [] = xs
| merge (x::xs) ys = x::(merge xs ys)


Let’s see if merge works as expected:

- merge [1, 2, 3] [4, 5];
(* val it = [1,2,3,4,5] : int list *)


The recursive calls to merge performed the following operations:

1::2::3::[4, 5]


This clearly shows that merge need not iterate over the second list. So it can run in time proportional to the length of the first list.

## Enumeration

Once we have a list, it should be possible to do something to each of its elements. For instance, you might want to give a salary increase to all employees in our employee database.

We can solve this problem by defining a higher-order function that applies a function to each element in a list and returns the results in a new list. Traditionally this function is known as map.

fun map _ []      = []
| map f (x::xs) = (f x)::(map f xs)


Let’s first try map on a simple list:

- map (fn x => x * x) [1, 2, 3, 4, 5];
(* val it = [1,4,9,16,25] : int list *)


Now let’s attempt the salary increment task. The following function increments the salary of a single employee by a given percentage and returns a new employee record with the updated salary. Notice how it destructures an employee records by pattern matching a tuple:

fun salary_inc percentage (id, fname, lname, sal:real) =
(id, fname, lname, sal + (sal * percentage))


By default, SML infers the arguments of arithmetic operators to be of type int. But we want the salary to be a real value, so we have type annotated sal with real.

Now to provide a 10% salary raise to all employees, just map salary_inc over the employee database:

- map (fn e => salary_inc 0.1 e) employee_db;
(* val it = [(1,"Mat","J",4950.0),(2,"Sue","P",6160.22),(3,"Joe","C",2530.0)]
: (int * string * string * real) list *)


### Filtering data

What if we want to give a hike to only those employees whose salary is below 5000? We will solve this problem in a number of short steps.

First let’s define a function to check if an employee’s salary is below a given threshold:

fun sal_below threshold (_, _, _, sal:real) =
sal < threshold


Next we need a function that sifts a list through a predicate. This function should return a list with all items in the original list for which the predicate returned true. Let’s call this function filter:

fun filter _ [] = []
| filter predic (x::xs) =
if (predic x) then
x::(filter predic xs)
else
filter predic xs


To find all employees with salaries less than 5000 we just need to pass a partial function initialized from sal_below along with the employee_db list to filter:

- val employees_for_incr = filter (sal_below 5000.0) employee_db;
(* val employees_for_incr = [(1,"Mat","J",4500.0),(3,"Joe","C",2300.0)]
: (int * string * string * real) list *)


Now we can pass the filtered list to map to increment the salaries:

- map (fn e => salary_inc 0.1 e) employees_for_incr;
(* val it = [(1,"Mat","J",4950.0),(3,"Joe","C",2530.0)]
: (int * string * string * real) list *)


## Conclusion

This post showed you how to maintain a collection of objects using lists and how to write functions to manipulate these collections. It is interesting that to process items in a list, we never had to resort to imperative features like loops. We were able to do this using patterns, recursion and higher-order functions. This ability to express complex data processing declaratively is one of the strongest selling points of functional programming.

The SML basis library defines many useful functions for list processing, including efficient versions of map and filter. Please get familiar with these functions by refering to the online manual.

This post concludes our introduction to the basics of SML. The time has come to move on to some advanced topics in functional programming.

Let’s begin that journey by looking at the functional approach to algorithm and data structure implementation. The next couple of posts will cover this topic which is important to the whole enterprise of software engineering.

As you make progress through the posts on advanced functional programming, you will also learn more features of the SML language, like sum types, modules etc.

Note that name and e-mail are required for posting comments