You often need to store data (counters, sums, lists, etc.) about some entities. A regular Python dictionary will do, with one caveat: before updating the value for a particular entity, you need to check whether the corresponding key exists in the dictionary; otherwise you will get a KeyError. I don't know about you, but I have written code like this many times:
dct = {}
if 'key' not in dct:
    dct['key'] = 0
dct['key'] += 1
A better way
While this works perfectly well, it does not help readability. There is a better, more Pythonic solution: defaultdict (documentation).
It is a subclass of the regular dict that accepts a default_factory, which is called every time you look up a missing key with dct[key]; the value it returns is stored under that key and handed back. This way, the code above can be simplified as follows:
from collections import defaultdict
dct = defaultdict(int)
dct['key'] += 1
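The same idea carries over to the lists mentioned at the start. As a minimal sketch, the example below groups words by their first letter with defaultdict(list); the names words, by_letter, and grouped, as well as the sample data, are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample data, used only for illustration.
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

by_letter = defaultdict(list)
for word in words:
    # A missing key triggers list(), so append() always has a list to work on.
    by_letter[word[0]].append(word)

# Convert to a plain dict for a cleaner printout.
grouped = dict(by_letter)
print(grouped)
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
```

With a regular dict, the loop body would need the same "if key not in dct" check as the counter example.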
Notes
- The default mechanism is a factory, not a value. That is, it must be callable: for instance a built-in type, a named function, or a lambda.
- If you initialize a defaultdict without a factory or with None, it acts as a regular dict, including raising a KeyError when you access a non-existent key.
- default_factory is a writable attribute, and you can change it after initializing the dict:
from collections import defaultdict
dct = defaultdict(int)
dct['int'] += 1
dct.default_factory = list
dct['list'].append(0)
# defaultdict(<class 'list'>, {'int': 1, 'list': [0]})
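The first two notes can be checked with a short experiment. This is only a sketch; the variable names (rejected, zeros, plain, raised) are illustrative and not part of any API.

```python
from collections import defaultdict

# Note 1: the argument must be callable (or None); a plain value is rejected.
try:
    defaultdict(0)
    rejected = False
except TypeError:
    rejected = True

# Wrapping the constant in a lambda turns it into a factory.
zeros = defaultdict(lambda: 0)
zeros['hits'] += 1

# Note 2: without a factory, a missing key raises KeyError as in a plain dict.
plain = defaultdict()
try:
    plain['missing']
    raised = False
except KeyError:
    raised = True

print(rejected, zeros['hits'], raised)  # True 1 True
```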
Examples
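from collections import defaultdict

def your_function():  # Hypothetical placeholder: any callable that takes no arguments
    return 'unknown'

defaultdict(int)  # int() returns 0
defaultdict(list)  # list() returns an empty list
defaultdict(lambda: 10)  # Lambda returning an arbitrary constant
defaultdict(your_function)  # Arbitrary named function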
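It may also help to see what a lookup actually does to the dictionary. The sketch below assumes a hypothetical scores dictionary with a lambda factory; it shows that indexing a missing key inserts it, while membership tests and get() leave the dictionary untouched.

```python
from collections import defaultdict

scores = defaultdict(lambda: 10)

# Indexing a missing key calls the factory, stores the result, and returns it.
first = scores['alice']
print(first)              # 10
print('alice' in scores)  # True

# Membership tests and get() do not call the factory or insert anything.
print('bob' in scores)    # False
print(scores.get('bob'))  # None
print(len(scores))        # 1
```

Because a plain lookup inserts the key, prefer "in" or get() when you only want to check for a key without growing the dictionary.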