Documentation: https://docs.python.org/3/library/collections.html#collections.defaultdict
Often you need to store data (counters, sums, lists of sub-entities, etc.) about some entities. A dict will do, with one caveat: before updating the information for a particular entity you need to check that its key exists in the dict, otherwise you will get a KeyError. I don't know about you, but I have written code like this many times:
dct = {}
if 'key' not in dct:
    dct['key'] = 0
dct['key'] += 1
While it works perfectly well, it does little for readability. There is a better, more Pythonic solution: defaultdict.
It is a subclass of the regular dict that accepts a default_factory, a callable invoked without arguments every time you look up a missing key with dct[key]; the value it returns is stored under that key and returned. This way the code above can be simplified as follows:
from collections import defaultdict
dct = defaultdict(int)
dct['key'] += 1
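One detail worth seeing in action: the factory is only triggered by subscript access. The sketch below (the key name 'a' is just an illustration) shows that membership tests and .get() leave the dict untouched, while dct['a'] inserts the default.

```python
from collections import defaultdict

dct = defaultdict(int)

# Membership tests and .get() do not call the factory.
print('a' in dct)    # False
print(dct.get('a'))  # None
print(len(dct))      # 0

# Subscript access on a missing key calls int(), stores 0, and returns it.
print(dct['a'])      # 0
print(len(dct))      # 1
```

So merely reading a missing key with dct[key] adds it to the dict, which matters if you later iterate over the keys.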
Notes:
- The default mechanism is a factory, not a value, i.e. it must be callable: for instance a built-in type, a named function, or a lambda.
- If you create a defaultdict without a factory, or with None, it acts as a regular dict, including raising a KeyError when you access a non-existent key.
- default_factory is a writable attribute, so you can change it after creating the dict:
from collections import defaultdict
dct = defaultdict(int)
dct['int'] += 1
dct.default_factory = list
dct['list'].append(0)
print(dct)
defaultdict(<class 'list'>, {'int': 1, 'list': [0]})
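The first two notes can be checked directly. This is a minimal sketch; the variable names (plain, raised_key_error, raised_type_error) are made up for the illustration.

```python
from collections import defaultdict

# Without a factory, default_factory is None and missing keys raise KeyError.
plain = defaultdict()
try:
    plain['missing']
    raised_key_error = False
except KeyError:
    raised_key_error = True

# A plain value is not accepted as a factory: 0 is not callable.
try:
    defaultdict(0)
    raised_type_error = False
except TypeError:
    raised_type_error = True

print(raised_key_error, raised_type_error)  # True True
```

If you want a constant default such as 0 or 10, wrap it in a callable, e.g. int or lambda: 10.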
Examples:
from collections import defaultdict
defaultdict(int) # Default value is 0, since int() without arguments returns 0
defaultdict(list) # Default value is a new empty list
defaultdict(lambda: 10) # Default value is 10; any zero-argument lambda works
defaultdict(your_function) # Any named function that can be called without arguments
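To tie this back to the use cases from the introduction (sums and lists of sub-entities), here is a small sketch. The orders data and the names totals and items are invented for the example.

```python
from collections import defaultdict

# Hypothetical input: (customer, amount) pairs.
orders = [('alice', 30), ('bob', 12), ('alice', 5)]

totals = defaultdict(int)   # running sum per customer
items = defaultdict(list)   # all amounts per customer

for customer, amount in orders:
    totals[customer] += amount       # missing key starts at 0
    items[customer].append(amount)   # missing key starts as []

print(dict(totals))  # {'alice': 35, 'bob': 12}
print(dict(items))   # {'alice': [30, 5], 'bob': [12]}
```

Neither loop body needs an "if key not in dct" check, which is exactly the boilerplate defaultdict removes.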