In computing, a thread is an independent path of execution within a process. A process can have multiple threads.

Python provides a threading API via its $threading$ module.

In this article we are going to discuss six techniques or recipes for working with threads. These are:

  1. Creating a thread of execution
  2. Creating your own Thread class
  3. Customising the Thread class for finer control
  4. Saving and returning state from Thread classes
  5. Using TLS or Thread Local Storage
  6. Waiting for Thread Completion

Let’s $start()$ and get $run()$ning !

1. Creating a Thread

To build a new thread, we need to instantiate the $Thread$ class of the $threading$ module and pass it a target function or callable.

def greet():
    for i in range(10):
        print('Hello {}'.format(i+1))

Our target function greets the world 10 times. Let us run it in its own thread.

>>> import threading
>>> t = threading.Thread(target=greet)
>>> t.start()
Hello 1
Hello 2
Hello 3
Hello 4
Hello 5
>>> Hello 6
Hello 7
Hello 8
Hello 9
Hello 10

>>> 

Did you notice how the Python prompt got interspersed with the messages? This is because our function is running in its own thread, so it does not block the main thread of the interpreter.
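A target function can also take arguments. The following is a minimal sketch, assuming a variant of greet that accepts a name and a count (these parameters and the messages list are made up for illustration). The $args$ and $kwargs$ parameters of $Thread$ carry the values to the target.

```python
import threading

messages = []   # illustrative only: collects what the thread printed

def greet(name, times=3):
    """Greet `name` the given number of times."""
    for i in range(times):
        message = 'Hello {} {}'.format(name, i + 1)
        messages.append(message)
        print(message)

# args supplies positional arguments, kwargs supplies keyword arguments.
t = threading.Thread(target=greet, args=('world',), kwargs={'times': 2})
t.start()
t.join()   # wait for the thread so that all output is complete
```

Note that $args$ must be a tuple, hence the trailing comma in ('world',).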

While this is fun, it's not very powerful, as it does not let you build custom thread classes that package specific routines as reusable code. For that you need to sub-class Thread.

2. Creating your own Thread class

class MyThread(threading.Thread):
    """ My custom thread sub-class """
    
    def greet(self):
        for i in range(10):
            print('Hello {}'.format(i+1))
            
    def run(self):
        self.greet()

Ok. So what was done here? A brief tour.

  1. A new class MyThread was created, inheriting from the threading.Thread class.
  2. The greet function went inside the class as a method.
  3. A method run was implemented, which called greet.

Does it work? Let us see.

>>> t = MyThread()
>>> t.start()
Hello 1
Hello 2
Hello 3
Hello 4
Hello 5
Hello 6
Hello 7
Hello 8
Hello 9
>>> Hello 10

Yes, it does.

The magic here is in the $start$ method, which sets up some state and then gets $run$ to run in a separate thread of execution. Any code inside $run$ executes in the new thread only when it is reached this way. (By the way, you should never call the $run$ method explicitly. Doing so just executes it in the calling thread, like any ordinary method.)
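The difference between calling $start$ and calling $run$ directly can be checked with a small experiment. This is only a sketch, and the WhoRunsMe class is a made-up name. It records which thread its $run$ method executed in, using $threading.current_thread()$.

```python
import threading

class WhoRunsMe(threading.Thread):
    """Records the thread in which run() actually executes."""

    def __init__(self):
        super().__init__()
        self.ran_in = None

    def run(self):
        self.ran_in = threading.current_thread()

caller = threading.current_thread()

direct = WhoRunsMe()
direct.run()          # plain method call: no new thread is created
print(direct.ran_in is caller)

started = WhoRunsMe()
started.start()       # start() creates the thread, which then calls run()
started.join()
print(started.ran_in is started)
```

Both checks print True: the direct call ran in the caller's thread, while the started one ran in the new thread itself.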

3. Finer Control - Customising your Thread class

The MyThread class we built goes on and does its own stuff and terminates. Is there a way to control and parameterize what it does, say by passing arguments from outside?

Yes, we can override the class's __init__ method for this. But this time let's do something more useful than greeting the world.

class PrimeThread(threading.Thread):
    """ Compute prime numbers *till* a given number as input """

    def __init__(self, sentinel):
        self._sentinel = sentinel
        super().__init__(group=None)
        
    def compute(self):
        """ Compute primes till sentinel value """

        for i in range(2, self._sentinel+1):
            for j in range(2, round(i**0.5) + 1):
                if i % j == 0:
                    break
            else:
                yield i

    def run(self):
        for n in self.compute():
            print(n)

The PrimeThread class computes the prime numbers up to and including a given value and prints them. The code calls this upper limit the sentinel.

>>> t = PrimeThread(50)
>>> t.start()
2
3
5
>>> 7
11
13
17
19
23
29
31
37
41
43
47

>>>

Fine. So what was done here? A quick tour.

  1. We added an __init__ method that accepts the sentinel value. This was stored as _sentinel in the instance dictionary. The method also calls the base class __init__ so that the thread itself is set up properly.
  2. We have a $compute$ method that computes prime numbers starting from 2 to the sentinel value and yields them.
  3. The $run$ method iterates over the values yielded by $compute$ and prints each prime number.
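The same pattern extends to the options that $Thread$ itself understands. Below is a minimal sketch, assuming a made-up CountdownThread class, that forwards extra keyword arguments such as name to the base class while keeping its own parameter.

```python
import threading

class CountdownThread(threading.Thread):
    """Hypothetical example: prints the numbers from start_from down to 1."""

    def __init__(self, start_from, **kwargs):
        # Initialise the base class first, forwarding options such as
        # name= or daemon= that threading.Thread already accepts.
        super().__init__(**kwargs)
        self.start_from = start_from

    def run(self):
        for n in range(self.start_from, 0, -1):
            print(self.name, n)

t = CountdownThread(3, name='countdown')
t.start()
t.join()
```

Calling the base class __init__ before setting your own attributes is the conventional order, though the classes in this article do it the other way round and that works too.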

4. Saving State

I guess that was a lot of unbridled, threaded fun. But why are we printing things? Is there anything that prevents us from saving state inside a Thread sub-class and using it later?

Not really, nothing. So let's do that.

Presenting the PrimesThread class.

class PrimesThread(threading.Thread):
    """ Compute prime numbers *till* a given number as input, and keep state """

    def __init__(self, sentinel):
        self._sentinel = sentinel
        self.primes = []
        super().__init__(group=None)
        
    def compute(self):
        """ Compute primes till sentinel value """

        for i in range(2, self._sentinel+1):
            for j in range(2, round(i**0.5) + 1):
                if i % j == 0:
                    break
            else:
                yield i

    def run(self):
        for p in self.compute():
            self.primes.append(p)

Cool. So what did we do here? Another tour.

  1. Created a new instance attribute named primes, which is a list.
  2. Appended each prime number to this list inside $run$.

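Does it work? It should. Let us see. (The session below assumes the classes were saved in a module named threading1, which has already been imported.)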

>>> t = threading1.PrimesThread(50)
>>> t.start()

(Wait a bit.)

>>> t
<PrimesThread(Thread-9, stopped 140003133458176)>
>>> t.primes
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

Ok, it works, but just wait a second. Since primes is a list set on the thread instance, theoretically it's possible for an outside thread or object to modify it while our thread is running, correct?

Yes, it is. In other words, though the above code does the job of keeping thread state, it does not do it in a $thread-safe$ way.
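To make the concern concrete, here is a small sketch using a made-up SlowAppender class. Its $run$ method waits on a $threading.Event$ so that the main thread is certain to get at the list first. The class and attribute names are assumptions for illustration only.

```python
import threading

class SlowAppender(threading.Thread):
    """Hypothetical thread that appends to self.items once allowed to go."""

    def __init__(self):
        super().__init__()
        self.items = []
        self.go = threading.Event()

    def run(self):
        self.go.wait()                   # pause until the event is set
        self.items.append('from worker')

t = SlowAppender()
t.start()
t.items.append('from outside')   # nothing stops other code touching the list
t.go.set()                       # now let the worker continue
t.join()
print(t.items)
```

The final list contains the outside value as well as the worker's own, which shows that the attribute is shared by everything holding a reference to the thread object.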

5. Enter the TLS …

TLS or Thread Local Storage is a way to keep data that belongs to a single thread: each thread sees only its own values, and they are not visible to any other thread in the program. In Python it is provided by $threading.local$. It is a quick and smart way to keep local thread data inside a safe, i.e. thread-safe.

class PrimetlsThread(threading.Thread):
    """ Compute prime numbers *till* a given number as input - using TLS """

    def __init__(self, sentinel):
        self._sentinel = sentinel
        self.primes = []
        super().__init__(group=None)
        
    def compute(self):
        """ Compute primes till sentinel value """

        for i in range(2, self._sentinel+1):
            for j in range(2, round(i**0.5) + 1):
                if i % j == 0:
                    break
            else:
                yield i

    def run(self):
        mydata = threading.local()
        mydata.primes = []
        
        for p in self.compute():
            mydata.primes.append(p)

        self.primes = mydata.primes[:]

Another terse tour of the code.

  1. In this class, the $run$ method uses a TLS variable named mydata to keep the primes list local to the thread during computation.
  2. Upon finishing, it copies this value to the instance’s primes variable. It is important to do this; otherwise, the data kept in TLS is lost when the thread finishes.
  3. This way the data is safe from tampering while the thread is running.
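The per-thread behaviour of $threading.local$ is easiest to see when a single local object is shared by several threads. The following is a minimal sketch; the names store, worker and seen are made up for illustration.

```python
import threading

store = threading.local()   # one object, but separate attributes per thread
store.label = 'main'        # set in the main thread only
seen = {}                   # ordinary dict, used just to report back

def worker(label):
    # The attribute set by the main thread is not visible in this thread.
    had_label = hasattr(store, 'label')
    store.label = label
    seen[label] = (had_label, store.label)

threads = [threading.Thread(target=worker, args=(name,)) for name in ('a', 'b')]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(seen)
print(store.label)
```

Each worker starts without a label attribute and ends up with its own value, while the main thread still sees 'main'.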

6. Thread Completion

In the examples so far, we asked our threads for simple computations, so they were mostly over by the time we looked.

What if we asked our thread to compute primes up to a million? That will take a while, right? How do we know the thread is done?

>>> t = threading1.PrimetlsThread(1000000)
>>> t.start()
>>> t
<PrimetlsThread(Thread-13, started 140003133458176)>
>>> t.join()
>>> len(t.primes)
78498

The $join$ method suspends the calling thread till the thread on which it is called has finished. So that is how you wait for a long-running thread to complete, at the cost of your own thread getting blocked.

There are other ways to do this too. Perhaps in a future article.
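If blocking indefinitely is not acceptable, $join$ also accepts a timeout, and $is_alive()$ tells you whether the thread is still running afterwards. Here is a minimal sketch, where an Event stands in for a long computation. The names done and wait_for_signal are assumptions for illustration.

```python
import threading

done = threading.Event()

def wait_for_signal():
    """Stand-in for a long computation: runs until `done` is set."""
    done.wait()

t = threading.Thread(target=wait_for_signal)
t.start()

t.join(timeout=0.1)            # give up waiting after 0.1 seconds
still_running = t.is_alive()   # join() with a timeout does not say why it returned
print(still_running)

done.set()                     # let the worker finish
t.join()                       # now wait without a timeout
print(t.is_alive())
```

Since $join$ always returns None, checking $is_alive()$ is the way to tell a timeout apart from a normal finish.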

Till then, Adios.

