Documentation: https://docs.python.org/3/library/collections.html#collections.defaultdict

Often you need to store data (counters, sums, lists of sub-entities, etc.) about some entities. A dict is enough for this, with one caveat: before updating the information for a particular entity you need to check that its key exists in the dict, otherwise you will get a KeyError. I don't know about you, but I have written code like this many times:

'''
dct = {}

# Initialize the counter before the first increment to avoid a KeyError
if 'key' not in dct:
    dct['key'] = 0

dct['key'] += 1
'''

While this works perfectly well, it does not help readability. There is a better, more Pythonic solution: defaultdict. https://docs.python.org/3/library/collections.html#collections.defaultdict

It is a subclass of the regular dict that accepts a default_factory, which is called every time you look up a missing key with dct[key]; the value it returns is stored under that key and handed back to you. This way the code above can be simplified as follows:


from collections import defaultdict


dct = defaultdict(int)

dct['key'] += 1
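
One detail of the lookup is easy to miss, so here is a small sketch of it. The factory runs only when a missing key is indexed with square brackets; membership tests and .get() leave the dict untouched. The key name 'missing' is just a placeholder for this example.

```python
from collections import defaultdict

dct = defaultdict(int)

# Membership tests and .get() do not call the factory or add keys.
print('missing' in dct)    # False
print(dct.get('missing'))  # None
print(len(dct))            # 0

# Indexing a missing key calls int(), stores the result, and returns it.
print(dct['missing'])      # 0
print(len(dct))            # 1
```

So merely reading dct[key] can grow the dict; if you only want to peek, use .get() or the in operator.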

Notes:

  • The default mechanism is a factory, not a value. That is, it must be callable with no arguments, for instance a built-in type, a named function, or a lambda.
  • If you initialize a defaultdict without a factory, or with None, it acts like a regular dict, including raising a KeyError when you access a non-existent key.
  • default_factory is a writable attribute, so you can change it after initializing the dict:
from collections import defaultdict


dct = defaultdict(int)
dct['int'] += 1
dct.default_factory = list
dct['list'].append(0)
print(dct)

defaultdict(<class 'list'>, {'int': 1, 'list': [0]})
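
The first two notes can be checked with a short sketch as well. The variable names zeros and plain are made up for the illustration, and the exact wording of the error messages may differ between Python versions, so it is printed rather than assumed.

```python
from collections import defaultdict

# A plain value is not callable, so the constructor rejects it.
try:
    defaultdict(0)
except TypeError as exc:
    print('TypeError:', exc)

# A zero-argument lambda wraps the value in a factory.
zeros = defaultdict(lambda: 0)
zeros['a'] += 1
print(zeros['a'])  # 1

# Without a factory, a missing key raises KeyError, as in a regular dict.
plain = defaultdict()
try:
    plain['missing']
except KeyError as exc:
    print('KeyError:', exc)
print(plain.default_factory)  # None
```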

Examples:


from collections import defaultdict

defaultdict(int)  # Default value is 0, since int() called without arguments returns 0
defaultdict(list)  # Default value is a new empty list
defaultdict(lambda: 10)  # Any zero-argument lambda; here the default value is 10
defaultdict(your_function)  # Any named function that can be called without arguments
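
To see these factories at work, here is a minimal runnable sketch of the use cases mentioned at the start (counters and lists of sub-entities). The word list and the default_settings function are hypothetical, invented purely for this example.

```python
from collections import defaultdict


def default_settings():
    # Hypothetical named factory, invented for this example.
    return {'enabled': True}


words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

counts = defaultdict(int)    # counter per first letter
groups = defaultdict(list)   # words per first letter
for word in words:
    counts[word[0]] += 1
    groups[word[0]].append(word)

print(dict(counts))  # {'a': 2, 'b': 2, 'c': 1}
print(dict(groups))  # {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}

settings = defaultdict(default_settings)
print(settings['alice'])  # {'enabled': True}
# The factory is called once per missing key, so each entry gets its own dict.
print(settings['alice'] is settings['bob'])  # False
```

Because the factory is called afresh for every missing key, mutable defaults such as lists and dicts are never shared between keys.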