The dictionary {} is one of the most versatile built-in types in Python. Because it offers efficient storage of heterogeneous data with average-case $O(1)$ key access, it is also one of the most widely used data structures in the language.

In this post, I want to discuss how one can extend and inherit from the dict type in Python to create custom user dictionaries. We will start with a few simple use cases and go on to discuss more general, parameterized classes which are almost like patterns.

Extending {}

Python allows one to inherit from built-in types, so implementing a custom dictionary class is rather straightforward.

Let us look at a couple of examples.

The Case of the Caseless Dict

Let's implement a custom dictionary class that ignores the case of its keys when those keys are strings. Keys of other types are left untouched.

(Please note that all code is Python3.x only.)

class CaselessDict(dict):
    """ A dictionary sub-class with case-insensitive keys """
    def __setitem__(self, key, value):
        if isinstance(key, str):
            key = key.casefold()
        super().__setitem__(key, value)

    def __getitem__(self, key):
        if isinstance(key, str):
            key = key.casefold()
        return super().__getitem__(key)

Our aptly named CaselessDict class doesn’t distinguish between, say, $KEY$ and $key$ in its keys.

>>> d = CaselessDict()
>>> d['PYTHON'] = 'Versatile'
>>> d
{'python': 'Versatile'}
>>> d['python']
'Versatile'

The same string in any case variation accesses or modifies the same key.

>>> d['Python']
'Versatile'
>>> d['Python'] = 'Wonderful'
>>> d['PYTHON']
'Wonderful'
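
One caveat worth keeping in mind: overriding only __setitem__ and __getitem__ leaves other entry points untouched. The dict constructor, update(), get() and the in operator do not go through these two methods, so they see the raw keys. The following is a minimal sketch of one way to cover the most common of them. It is not part of the class above; the name FullCaselessDict and the helper _fold are made up for illustration.

```python
class FullCaselessDict(dict):
    """Sketch: case-insensitive keys across more of the dict API."""

    @staticmethod
    def _fold(key):
        # Only strings are normalized; other hashable keys pass through.
        return key.casefold() if isinstance(key, str) else key

    def __init__(self, *args, **kwargs):
        super().__init__()
        # Route initial data through update() so the keys get folded.
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        super().__setitem__(self._fold(key), value)

    def __getitem__(self, key):
        return super().__getitem__(self._fold(key))

    def __contains__(self, key):
        return super().__contains__(self._fold(key))

    def get(self, key, default=None):
        return super().get(self._fold(key), default)

    def update(self, *args, **kwargs):
        # dict(...) accepts the same argument forms as dict.update().
        for key, value in dict(*args, **kwargs).items():
            self[key] = value


d = FullCaselessDict({'PYTHON': 'Versatile'})
print('python' in d)       # True
print(d.get('Python'))     # Versatile
```

Methods such as pop(), setdefault() and __delitem__ would need the same treatment; they are left out here to keep the sketch short.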

The Adventure of the Missing Punctuations

Let's look at another example, where one needs a $normalized$ dictionary key which removes punctuation and other extra characters from a key before setting it in the dictionary.

(This kind of dictionary is useful in text-processing tasks, where normalizing words down to their most basic form - dropping punctuation, case-folding and quite often $stemming$ - helps to optimize computation and save space.)

Let me present - The PunctSafeDict ($Punct$, not $Punk$)

import string

def filter_punct(instr):
    """ Remove all punctuation from 'instr' """
    return ''.join(filter(lambda x: x not in string.punctuation, instr))

class PunctSafeDict(dict):
    """ A dictionary which is punctuation safe for its keys.
    It removes all punctuations from a key """

    def __setitem__(self, key, value):
        if isinstance(key, str):
            key = filter_punct(key)
        super().__setitem__(key, value)

    def __getitem__(self, key):
        if isinstance(key, str):
            key = filter_punct(key)
        return super().__getitem__(key)

The implementation is quite straightforward. Before setting a key in the dictionary or getting it back, a function named filter_punct strips the key of all punctuation. So the key that is set is a normalized one with all punctuation removed, and using the same function in __getitem__ mirrors the process, so the lookup also becomes punctuation-safe.

>>> p = PunctSafeDict()
>>> p["Madam, I'm Adam."] = 'Palindrome'
>>> p.keys()
dict_keys(['Madam Im Adam'])
>>> p['!!!Python!!!'] = 'Language'
>>> p.keys()
dict_keys(['Madam Im Adam', 'Python'])
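
As an aside, the same filtering can be expressed with str.translate, which is often quicker than joining characters one by one for long strings. The sketch below is an alternative, not the function used in this post; the names filter_punct_translate and _PUNCT_TABLE are made up for illustration.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans('', '', string.punctuation)


def filter_punct_translate(instr):
    """Remove all ASCII punctuation from 'instr' using str.translate."""
    return instr.translate(_PUNCT_TABLE)


print(filter_punct_translate("Madam, I'm Adam."))   # Madam Im Adam
print(filter_punct_translate('!!!Python!!!'))       # Python
```

Either way, string.punctuation covers ASCII punctuation only, so characters such as curly quotes are left in the key.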

I am Caseless and Punct Safe

Now that we’ve got a case of the caseless and a feel of the punct-safe, can we create a dictionary having both the effects? In other words a caseless-punct-safe dictionary.

Python’s multiple inheritance comes to the rescue here and provides a rather elegant solution without having to rewrite code.

class CaselessPunctSafeDict(CaselessDict, PunctSafeDict):
    pass

This avatar is born with the super powers of both the parents.

>>> cpsd = CaselessPunctSafeDict()
>>> cpsd["Madam, I'm Adam."] = 'Palindrome'
>>> cpsd["!!!Python!!!"] = 'Powerful'
>>> cpsd["Hello, world!"] = 'nice to meet you'
>>> cpsd
{'madam im adam': 'Palindrome', 'python': 'Powerful', 'hello world': 'nice to meet you'}

As you can see, it dutifully removed all punctuation and case-folded the keys.
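
Why does this work without any new code? Each parent calls super().__setitem__, and with multiple inheritance super() follows the method resolution order (MRO) of the child class rather than the parent's own base. The sketch below repeats trimmed copies of the two classes (only __setitem__) so that it runs on its own, and prints the MRO to show the chain the key travels through.

```python
import string


def filter_punct(instr):
    return ''.join(ch for ch in instr if ch not in string.punctuation)


class CaselessDict(dict):
    def __setitem__(self, key, value):
        if isinstance(key, str):
            key = key.casefold()
        super().__setitem__(key, value)      # next class in the MRO


class PunctSafeDict(dict):
    def __setitem__(self, key, value):
        if isinstance(key, str):
            key = filter_punct(key)
        super().__setitem__(key, value)      # next class in the MRO


class CaselessPunctSafeDict(CaselessDict, PunctSafeDict):
    pass


print([c.__name__ for c in CaselessPunctSafeDict.__mro__])
# ['CaselessPunctSafeDict', 'CaselessDict', 'PunctSafeDict', 'dict', 'object']

d = CaselessPunctSafeDict()
d['Hello, World!'] = 1
print(d)  # {'hello world': 1}
```

The key is case-folded first, then stripped of punctuation, and only then stored by dict itself.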

Combining Pre-processing Functions - The CustomDict

If you analyze what we did with our two classes and their child, you will realize that in each case we effectively wrote a key pre-processing function and called it in both methods, where the key is set and where it is retrieved. In the first case the function case-folded the key, and in the second it stripped the key of all punctuation.

We can generalize this idea into a $CustomDict$, a custom dictionary that receives a list of arbitrary pre-processing functions and applies them to its keys. There is no need to bake the logic into the dictionary class; the pre-processing logic is supplied as functions from outside. This also gives our new solution a functional flavour.

Presenting the $CustomDict$ class.

class CustomDict(dict):
    """ Custom dictionary class with parameterized key
    pre-processing functions (pfuncs) """

    def __setitem__(self, key, value):
        if isinstance(key, str):
            # Loop through pfuncs and
            # process key in order
            for func in self.pfuncs:
                key = func(key)
        super().__setitem__(key, value)

    def __getitem__(self, key):
        if isinstance(key, str):
            for func in self.pfuncs:
                key = func(key)
        return super().__getitem__(key)

The CustomDict class is versatile and parameterizable because it allows its keys to be manipulated by a list of arbitrary pre-processing functions (pfuncs), which are set on the class at the time of its creation. (We will see how to do this in a moment.)

The logic of key manipulation is brought in from the outside rather than written on the inside - hence the code is functional and parametric.
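
Seen this way, the loop over pfuncs is a left-to-right pipeline: the output of one function becomes the input of the next. A minimal sketch of the same idea using functools.reduce follows; the helper name apply_pfuncs is hypothetical and is not used by the CustomDict class above.

```python
from functools import reduce
import string


def apply_pfuncs(key, pfuncs):
    """Feed 'key' through each function in 'pfuncs', left to right."""
    return reduce(lambda acc, func: func(acc), pfuncs, key)


pfuncs = [
    str.casefold,
    lambda s: ''.join(ch for ch in s if ch not in string.punctuation),
]

print(apply_pfuncs("Madam, I'm Adam.", pfuncs))  # madam im adam
print(apply_pfuncs("KEY", []))                   # KEY
```

With an empty list the key passes through unchanged, which is the same behaviour the explicit for loop has.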

The CustomDict class provides a blueprint or base-class to create similar classes. We will use the magic of metaclasses to create the sub-classes.

Creating CustomDict classes - Metaclass as Factory

If you have read a bit about Python meta-classes, you may recall that they can be used as “class factories”. Remember, I said “class factories”, not “instance factories” here. Metaclasses provide a way to create new classes (or types) from an existing type in a rather dynamic way.

Presenting $CustomDictFactory$, a metaclass that also works as a factory for creating families of $CustomDict$ classes.

class CustomDictFactory(type):
    """ A factory class for CustomDict classes """
    def __new__(cls, name, pfuncs):
        # returns a class, not an instance!
        return type(name, (CustomDict,), {'pfuncs': pfuncs})

NOTE: Observe that the CustomDict class should not be instantiated directly, but only via CustomDictFactory.

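We have overridden the __new__ method of the class, so the constructor itself works as the class creator. (This may seem a bit odd, as a constructor is supposed to return objects of its own type, but this is perfectly fine for metaclasses.) Note that the returned class is built by a plain type(...) call, so its own metaclass is type; CustomDictFactory acts purely as a factory here.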
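
To make the mechanics concrete, here is a small self-contained sketch that inspects what the factory returns. CustomDict is trimmed to __setitem__ only, and the class name Lower is made up for this demonstration.

```python
class CustomDict(dict):
    """Trimmed copy of the blueprint class, for a self-contained demo."""
    def __setitem__(self, key, value):
        if isinstance(key, str):
            for func in self.pfuncs:
                key = func(key)
        super().__setitem__(key, value)


class CustomDictFactory(type):
    def __new__(cls, name, pfuncs):
        # Builds and returns a new subclass of CustomDict.
        return type(name, (CustomDict,), {'pfuncs': pfuncs})


Lower = CustomDictFactory('Lower', [str.casefold])

print(isinstance(Lower, type))         # True: the call produced a class
print(issubclass(Lower, CustomDict))   # True: it inherits the blueprint
print(type(Lower) is type)             # True: its metaclass is plain type
print(Lower.pfuncs == [str.casefold])  # True: pfuncs is a class attribute
```

Since pfuncs lives on the generated class, every instance of Lower shares the same list of functions.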

Let us create a sub-class of CustomDict, which does the same job as CaselessPunctSafeDict but this time in a purely functional and dynamic fashion.

>>> D = CustomDictFactory('CaselessPunktSafeDict',[lambda x: x.casefold(),filter_punct])
>>> D
<class 'customdict.CaselessPunktSafeDict'>

Note that we have created a class, not an instance. To create the instance, we need to instantiate this class.

>>> d = D()
>>> d["Madam, I'm Adam."] = 'Palindrome'
>>> d
{'madam im adam': 'Palindrome'}
>>> d["@@@Python"] = 'Cool'
>>> d
{'madam im adam': 'Palindrome', 'python': 'Cool'}

As you can see, this performs the same job as the statically defined CaselessPunctSafeDict, but there is no static code here. Everything is assembled together and built-up dynamically, except the blueprint (template methods) provided by the CustomDict class.

You may have observed that the CustomDict should not be instantiated directly, though nothing in the code prevents it from being so. The interested reader can try and use Python’s abc module to perhaps reimplement CustomDict as an abstract class to enforce this.
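
As a starting point, here is a minimal sketch that uses an explicit check in __init__ rather than abc. Whether abc's abstract-method enforcement applies to classes that inherit from built-ins such as dict is worth verifying before relying on it, so the sketch avoids the question. The names GuardedCustomDict and Folded are hypothetical.

```python
class GuardedCustomDict(dict):
    """Hypothetical variant of CustomDict that refuses direct use."""
    pfuncs = None  # generated subclasses are expected to override this

    def __init__(self):
        if self.pfuncs is None:
            raise TypeError(
                f"{type(self).__name__} has no pfuncs; "
                "create a subclass that defines them"
            )
        super().__init__()

    def __setitem__(self, key, value):
        if isinstance(key, str):
            for func in self.pfuncs:
                key = func(key)
        super().__setitem__(key, value)


# A subclass built the same way the factory builds them.
Folded = type('Folded', (GuardedCustomDict,), {'pfuncs': [str.casefold]})

try:
    GuardedCustomDict()
except TypeError as exc:
    print('refused:', exc)

d = Folded()
d['KEY'] = 1
print(d)   # {'key': 1}
```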

Flexible Class Creation

You may at this point ask what is the benefit of all this rather clever and dynamic code - if we can achieve the same using multiple inheritance. Let me present another problem for a custom dictionary class.

Imagine you need a custom dictionary class that also requires the keys to be devoid of any spaces - in other words, the key strings need to be normalized using three steps.

  1. Lowercase (case-fold)
  2. Removal of punctuations
  3. Removal of spaces

Doing this statically would require creating another class and/or modifying code, or multiple inheritance from three base classes (ugly!). But with our flexible CustomDictFactory, it’s just one extra function - and one defined anonymously at that.

>>> pfuncs = [lambda x: ''.join(filter(None, x.split())), lambda x: x.casefold(), filter_punct]
>>> D = CustomDictFactory('CaselessSpacelessPunktSafeDict', pfuncs)
>>> D
<class 'customdict.CaselessSpacelessPunktSafeDict'>

>>> d=D()
>>> d["Madam, I'm Adam."] = "Palindrome"
>>> d
{'madamimadam': 'Palindrome'}
>>> d["Walking, talking - it's all fun!"] = 'sentence'
>>> d
{'madamimadam': 'Palindrome', 'walkingtalkingitsallfun': 'sentence'}

Briefly, we introduced a new (anonymous) function as the first item in pfuncs - which strips a sentence of all spaces. The resultant dict sub-class had the cumulative effect of all three functions - which solved our problem.

Using Metaclasses - A More Direct Approach

There is another way to use meta-classes and bake-in the logic of the parametric functions into the dictionary sub-class directly by using the metaclass as its type.

class CustomDictType(type):
    """ Custom dictionary type which sets a list of pfuncs """
    def __call__(cls, *args, **kwargs):
        """ Overriding () of the class """

        # Caveat: pfuncs is stored on the class, so every call rebinds
        # it for all existing instances; other arguments are ignored.
        cls.pfuncs = kwargs.get('pfuncs', [])
        return super().__call__()

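Here we create a custom type named CustomDictType and override its __call__ method, which controls how instances of any class using this type as its metaclass are created. In this case we dynamically inject a list of pfuncs into the class and thereby into the instance. Because the list lives on the class, it is shared by every instance, and a later call with different pfuncs replaces it for all of them.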

Here is the modified dictionary sub-class, which uses this type as its metaclass.

class MyCustomDict(dict, metaclass=CustomDictType):
    """ A custom dictionary sub-class using a custom metaclass
    for parameterized pre-processing of keys using functions """
    def __setitem__(self, key, value):
        if isinstance(key, str):
            for func in self.pfuncs:
                key = func(key)
        super().__setitem__(key, value)

    def __getitem__(self, key):
        if isinstance(key, str):
            for func in self.pfuncs:
                key = func(key)
        return super().__getitem__(key) 

Here is how you can use this code, reusing the same pfuncs definition as before.

>>> d2 = MyCustomDict(pfuncs=pfuncs)
>>> d2
{}
>>> d2["Madam, I'm Adam."] = "Palindrome"
>>> d2["I love Python, it's so dynamic!!"] = 'sentence'
>>> d2
{'madamimadam': 'Palindrome', 'ilovepythonitssodynamic': 'sentence'}

The main difference here is that class and instance creation happen in one shot: you don’t get a modified sub-class but the instance directly, and every instance is an instance of the same class, MyCustomDict.
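
One consequence of sharing a single class is that pfuncs stored on the class is shared too, so creating a second dictionary with different pfuncs changes the behaviour of the first. If that is not wanted, a possible variation is to attach pfuncs to each instance instead. The sketch below assumes that design; PerInstanceDictType and PerInstanceDict are hypothetical names, not classes from this post.

```python
class PerInstanceDictType(type):
    """Hypothetical metaclass: attaches pfuncs to each new instance."""
    def __call__(cls, *, pfuncs=()):
        instance = super().__call__()     # normal instance creation
        instance.pfuncs = list(pfuncs)    # stored on the instance only
        return instance


class PerInstanceDict(dict, metaclass=PerInstanceDictType):
    pfuncs = ()  # class-level default until the metaclass sets one

    def _prepare(self, key):
        if isinstance(key, str):
            for func in self.pfuncs:
                key = func(key)
        return key

    def __setitem__(self, key, value):
        super().__setitem__(self._prepare(key), value)

    def __getitem__(self, key):
        return super().__getitem__(self._prepare(key))


lower = PerInstanceDict(pfuncs=[str.casefold])
upper = PerInstanceDict(pfuncs=[str.upper])
lower['Key'] = 1
upper['Key'] = 2
print(lower)  # {'key': 1}
print(upper)  # {'KEY': 2}
```

Each dictionary keeps its own list of functions, while both remain instances of the same class.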

Dynamic class creation using metaclasses and types is an interesting and powerful feature of Python. It allows you to build higher-level abstractions and objects, and provides a powerful alternative for solving problems by composing code blocks - classes and functions - together rather than rewriting code and linking objects via static approaches like inheritance.

Some further references to explore concepts presented here are provided below.

  1. Understanding Python metaclasses
  2. Supercharge your classes
  3. Case insensitive dictionary in Python
  4. A PEP for key-transforming dictionary in Python
