The first part of this series introduced CSP, a model for writing concurrent programs. In this second part, we will look at another model of concurrent computation - the actor model. We will also see how real-world programs can be developed in the actor model using the Erlang programming language.

Both CSP and the actor model express computations as autonomous tasks interacting with each other. The tasks in CSP communicate through synchronous channels. This means that a task sending a message blocks until another task receives that message at the other end of the channel. Likewise, a task waiting for a message blocks until a message appears on the channel.

In contrast, messaging in the actor model is asynchronous. Each actor (or task) has a mailbox attached to it. The actor can poll the mailbox for new messages and perform appropriate actions. The asynchronous nature of messaging makes it possible to simulate activities as they take place in the real world. Objects can simultaneously receive multiple stimuli from their environment and choose which one to respond to and when.

Hello Actor World!

The Erlang programming language has built its concurrency primitives around ideas from the actor model. In this section we will write a very simple concurrent program in Erlang. This will help us get familiar with the syntax and basic constructs of the language.

An Erlang program is organized as modules. A module will mostly consist of functions. A module may choose to export some of its functions so that they can be called from the outside. In other words, the exported functions will become the public interface of the module.

The following program defines a simple module hello that exports a function named start. When invoked, this function creates two actors that send a greeting to each other and exit. The module also exports greeter, because spawn looks up the function to run by module and function name. (Note that function exports are specified as name/arity, where name is the name of the function and arity is its parameter count.)

-module(hello).
-export([start/0, greeter/2]).

start() ->
    AlicePID = spawn(hello, greeter, ["Alice", recv]),
    spawn(hello, greeter, ["Bob", AlicePID]).

greeter(Name, recv) ->
    receive
    {FromPID, FromName, Message} ->
        io:format("~s says: ~s~n", [FromName, Message]),
        FromPID ! {self(), Name, "hi, there"}
    end;
greeter(Name, ToPID) ->
    ToPID ! {self(), Name, "hello!"},
    greeter(Name, recv).

Note: Variable names in Erlang must start with an uppercase letter or an underscore. Identifiers that start with a lowercase letter are known as atoms. They are used to represent named constant values. Functions and modules are identified by atoms. (An atom may also start with an uppercase letter, but in that case it must be enclosed in single quotes, as in 'Alice'.)
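The difference between variables and atoms can be seen in a small sketch. The module atoms_demo and the function describe are hypothetical names used only for illustration; they are not part of the hello program.

```erlang
-module(atoms_demo).
-export([describe/1]).

%% Hypothetical helper for illustration only.
%% Status is a variable (it starts with an uppercase letter).
%% ok, error and 'Quoted Atom' are atoms, i.e. named constants.
describe(Status) ->
    case Status of
        ok -> "all good";
        error -> "something failed";
        'Quoted Atom' -> "quoted atoms may contain capitals and spaces";
        _ -> "unknown"   % _ matches any other value
    end.
```

Calling atoms_demo:describe(ok) selects the first clause, while any value that is not one of the three atoms falls through to the last one.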

The function start calls the spawn BIF (built-in function) to start a new actor or task in the system. Spawn takes three arguments - the name of a module where the function we want to call is defined, the name of the function and a list of values. This function will be called concurrently in the new task with the values in the list as arguments.

In this example, we start two actors using the same greeter function. If the second argument to the function is the atom recv, the actor waits for a message to appear in its mailbox, prints it, sends a reply to the sender and exits. If the PID (process ID) of an actor is passed as the second argument, the actor sends a message to that actor and then waits for a reply to come back. As message passing is asynchronous, the PID of the sender has to be packaged and sent along with the message if the sender expects a reply. The receiver can use the PID packaged in the message to send back the reply. In the hello program, messages are packaged as tuples of the form {SenderPID, SenderName, TextMessage}.
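As a smaller illustration of spawn and of packaging the sender's PID, here is a minimal sketch. The module spawn_demo and the function double are made-up names, not part of the hello program. Note that double/2 has to be exported, since spawn with three arguments calls the function by module and function name.

```erlang
-module(spawn_demo).
-export([run/0, double/2]).

%% Runs in the spawned process: compute a value and send it
%% back to the process whose PID was passed in as Parent.
double(Parent, N) ->
    Parent ! {result, 2 * N}.

run() ->
    %% The argument list carries our own PID so the new process can reply.
    Pid = spawn(spawn_demo, double, [self(), 21]),
    receive
        {result, Value} -> {is_pid(Pid), Value}
    end.
```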

As you might have noticed, a message is dispatched to the process by the ! operator. Its first operand is the process identifier to which the message should be dispatched and its second operand is the message itself, which can be any valid Erlang value or term. Messages are retrieved from the mailbox using the receive expression.
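The asynchronous mailbox can be observed with a process that sends messages to itself. The sketch below uses hypothetical names (mailbox_demo, the high and low tags). Both sends return immediately, and receive then picks the first message in the mailbox that matches one of its patterns, which need not be the oldest one.

```erlang
-module(mailbox_demo).
-export([run/0]).

run() ->
    Self = self(),
    %% Sending never blocks; both messages simply queue up in our mailbox.
    Self ! {low, "tidy up"},
    Self ! {high, "put out fire"},
    %% This receive only matches {high, _}, so the second message is
    %% taken first and the {low, _} message stays in the mailbox.
    First = receive {high, Task1} -> Task1 end,
    Second = receive {low, Task2} -> Task2 end,
    {First, Second}.
```

This is how an actor can choose which stimulus to respond to and when.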

You must save the hello module into a file called hello.erl. Now this program can be compiled and executed from the Erlang shell as:

1> c(hello).
{ok,hello}
2> hello:start().
Bob says: hello!
Alice says: hi, there

A Name Server

As an extended example of programming with actors, let us model a name server. A name server is a service that translates a textual name to an often numeric identifier. An example of a name server is the server component of the DNS which translates domain or host names into numeric IP addresses.

In this example, we implement the name server as an actor that responds to two messages - upsert and lookup. The upsert message adds a name-identifier mapping to the server. If the name is already mapped, its identifier is updated. The lookup message accepts a name and responds with the identifier mapped to it. The server makes no assumptions about the specific format for names and identifiers. Here is the complete code for the function that implements the message handling for the name server process:

-module(ns).
-export([start/0, server_loop/1]).

server_loop(Table) ->
    receive
    {upsert, From, {Name, Identifier}} ->
        From ! {ok, Name},
        server_loop(Table#{Name => Identifier});
    {lookup, From, Name} ->
        From ! maps:find(Name, Table),
        server_loop(Table);
    _ -> invalid_message
    end.

start() -> register(name_server, spawn(ns, server_loop, [#{}])).

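The server manages the registrations in a variable called Table, which is an instance of the map data structure. As Erlang variables cannot be modified once bound, an upsert passes an updated copy of the map to the next call of server_loop. A message that matches neither upsert nor lookup ends the loop, and with it the server process.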

The register BIF associates a name with the process started by spawn. This name can be used instead of the process ID to communicate with the process. This becomes useful if the process is restarted. Its process ID will change, but it can be registered under the same name (here the name we used is name_server). Other processes can continue to communicate with the process using this name, oblivious to the fact that it was restarted.
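Here is a minimal sketch of sending to a registered name rather than to a PID. The module register_demo and the name echo_server are hypothetical. The whereis BIF returns the PID currently registered under a name.

```erlang
-module(register_demo).
-export([run/0, echo/0]).

%% A tiny server that echoes messages back until told to stop.
echo() ->
    receive
        {From, Msg} -> From ! {echo, Msg}, echo();
        stop -> ok
    end.

run() ->
    register(echo_server, spawn(register_demo, echo, [])),
    Pid = whereis(echo_server),        % look up the PID behind the name
    echo_server ! {self(), hello},     % send using the name, not the PID
    Reply = receive {echo, M} -> M end,
    echo_server ! stop,                % the name is released when the process exits
    {is_pid(Pid), Reply}.
```

Registering a name that is already taken fails with a badarg error, so a restarted process can only reuse the name once the old process is gone.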

We also need a client process to interact with the name server. This is implemented in a module named nsclient:

-module(nsclient).
-export([upsert/2, lookup/1]).
-export([upsert_on/4, lookup_on/3]).

upsert(Name, Identifier) ->
    upsert_on(name_server, self(), Name, Identifier).

lookup(Name) ->
    lookup_on(name_server, self(), Name).

upsert_on(To, From, Name, Identifier) ->
    To ! {upsert, From, {Name, Identifier}},
    receive
    {ok, Response} ->
        {ok, Response};
    _ -> 
        {error, failure}
    end.

lookup_on(To, From, Name) ->
    To ! {lookup, From, Name},
    receive
    {ok, Identifier} ->
        {ok, Identifier};
    error ->
        {not_found, Name}
    end.

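The client module exports two convenience functions, upsert and lookup, which talk to the process registered locally as name_server. They are built on upsert_on and lookup_on, which are also exported and take the address of the server (To) and the address for replies (From) as explicit arguments.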
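One thing to be aware of is that the receive expressions in the client wait indefinitely if no reply ever arrives. A possible variation, shown here only as a sketch, adds an after clause with a timeout. The module name nsclient_timeout, the extra Timeout argument (in milliseconds) and the {error, timeout} result are assumptions made for this example, not part of the nsclient module.

```erlang
-module(nsclient_timeout).
-export([lookup_on/4]).

%% Hypothetical variant of lookup_on/3 that gives up after Timeout ms.
lookup_on(To, From, Name, Timeout) ->
    To ! {lookup, From, Name},
    receive
        {ok, Identifier} -> {ok, Identifier};
        error -> {not_found, Name}
    after Timeout ->
        %% No reply arrived within Timeout milliseconds.
        {error, timeout}
    end.
```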

We can now test the name server and client from the Erlang shell:

1> c(ns).
{ok,ns}
2> c(nsclient).
{ok,nsclient}
3> ns:start().
true
4> nsclient:upsert(abc, 1).
{ok,abc}
5> nsclient:upsert(def, 2).
{ok,def}
6> nsclient:lookup(abc).
{ok,1}
7> nsclient:lookup(def).
{ok,2}
8> nsclient:lookup(xyz).
{not_found,xyz}

A Distributed Name Server

Distributed programs are made up of components running on networks of computers, coordinating their activities by message passing. Distributed applications can finish work faster because different parts of the program run in parallel on different machines. They also tend to be more reliable, because the program may continue to function even if one or more machines fail.

It’s easy to write distributed programs in Erlang, as actors can be spawned on remote machines. Messaging primitives like send and receive work transparently over a network. This enables a concurrent Erlang program to run some of its tasks on remote machines and take advantage of the benefits that come with distribution.

In this section, we will make the Name Server program run on a dedicated Erlang node. A node is a self-contained Erlang system with its own address space and set of processes.

To run a distributed version of the Name Server, we need to start an Erlang node first. This can be accomplished by the following shell command:

$ erl -sname ns@localhost

The command will start an Erlang node named ns on localhost. Once we land in the Erlang shell, we can start the Name Server:

(ns@localhost)1> c(ns).
{ok,ns}
(ns@localhost)2> ns:start().
true

Now the Name Server is running on its own node, waiting for clients to send it messages. To send a message to a process on a remote node, the ! (send) operator needs the registered name of the target process and the name of the node on which that process is running. These are passed to the send operator as a tuple of the form {process_name, node_name}. The upsert_on and lookup_on functions in the nsclient module need no changes to support remote messaging; we just need to pass To and From arguments that carry the appropriate process and node names. Let us experiment with this by starting a client on a new Erlang node:

$ erl -sname cl@localhost

(cl@localhost)1> c(nsclient).
{ok,nsclient}
(cl@localhost)2> register(name_client, self()).
true
(cl@localhost)3> To = {name_server, ns@localhost}.
{name_server,ns@localhost}
(cl@localhost)4> From = {name_client, cl@localhost}.
{name_client,cl@localhost}
(cl@localhost)5> nsclient:upsert_on(To, From, abc, 1).
{ok,abc}
(cl@localhost)6> nsclient:upsert_on(To, From, def, 2).
{ok,def}
(cl@localhost)7> nsclient:lookup_on(To, From, abc).
{ok,1}
(cl@localhost)8> nsclient:lookup_on(To, From, xyz).
{not_found,xyz}
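The {process_name, node_name} form of addressing used in this session can also be tried on a single node, because the node() BIF returns the name of the node the caller is running on. The following is a minimal sketch with hypothetical names (remote_send_demo, demo_inbox). It starts the helper process with the one-argument form of spawn, which takes a fun instead of a module and function name.

```erlang
-module(remote_send_demo).
-export([run/0]).

run() ->
    Parent = self(),
    %% Helper process: forward the first message it gets to Parent, then exit.
    Pid = spawn(fun() ->
                    receive Any -> Parent ! {forwarded, Any} end
                end),
    register(demo_inbox, Pid),
    %% Address the process as {RegisteredName, NodeName}. Here the node
    %% is our own, but the same tuple form works for a remote node.
    {demo_inbox, node()} ! ping,
    receive
        {forwarded, Reply} -> Reply
    after 1000 ->
        timeout
    end.
```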

Congratulations, we have our first concurrent, distributed program running!

Conclusion

Erlang makes it fun and easy to write programs that run on multi-core, networked machines. The intention of this post was to whet your appetite for distributed, concurrent programming and convince you that programming in this model need not be seen as rocket science. You just need to pick the right tool for the job.

As this post was meant to be an introduction, we left out some important topics. For instance, we did not discuss error handling and fault tolerance in the context of actors. You might also be wondering about the complexities and security concerns associated with deploying real-world distributed programs. Erlang developers have dealt with these problems and captured their experiences as libraries of “patterns”. We will talk about these interesting topics in future articles, so make sure you keep checking back on the Anvetsu blog!

