Comprehensions are a quick and easy way to build collections (lists, dictionaries, sets) in Python. In simple cases, they should be easy to write and, more importantly, to read. However, when I was taking my first steps in Python, I tried to avoid comprehensions as much as possible: most of them looked convoluted and not easy at all. Better to stick to the good old for loop, I thought. In time my attitude changed, of course: I just needed to dive deeper to understand the syntax and the use cases for comprehensions.

1. Structure of comprehensions

There are several types of comprehensions in Python, depending on the type of collection you are building (more on them below), but all of them follow the same structure.

1.1. Basic structure

For example, this is the basic structure of a list comprehension:

lst = [expression for item in iterable]

Let’s break it down:

  • expression – a general term for anything that is or evaluates to a value (i.e. that can be stored in a variable). Functions fit in as well: a function object can be stored in a variable, and a function call evaluates to a value (None by default). Expressions are contrasted with statements – commands to the Python interpreter to do something: for and while loops, import, return, if, and others.

    Note: if you need to ‘push’ several values at once into the resulting list (as a tuple), you must wrap them in parentheses explicitly. Implied tuples, as in a return x, y statement, are not allowed. For example:

    lst = [(expression_1, expression_2) for item in iterable]

  • iterable – any expression that can be iterated over: a string, list, set, dictionary, generator, custom object, etc.

  • item – the current element of the iterable; the comprehension goes through the elements one by one. It is available in the expression.

A classic example of a list comprehension is generating a list of integers:

lst = [i for i in range(10)]
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
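
As a small illustration of the note on tuples above, the sketch below builds a list of (number, square) pairs and then iterates over a string instead of a range. The variable names are made up for this example.

```python
# Each element of the result is an explicit tuple (n, n * n).
squares = [(n, n * n) for n in range(4)]
print(squares)
# [(0, 0), (1, 1), (2, 4), (3, 9)]

# The iterable can be any iterable object, for example a string.
letters = [ch.upper() for ch in "abc"]
print(letters)
# ['A', 'B', 'C']
```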

1.2. Condition in comprehension (filter expression)

You can add a condition at the end of a comprehension:

lst = [expression for item in iterable if condition]

The condition is any expression; it is evaluated for its truth value. If the condition is true, the element is processed by the comprehension; if it is false, the element is skipped. The item is available in the condition.

For example, let’s generate a list of even numbers:

[i for i in range(10) if i % 2 == 0]
# [0, 2, 4, 6, 8]

Note: you cannot use an else clause here: you will get a syntax error. In other words, the condition at the end is used only to skip certain elements.
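
A minimal sketch of both points: the condition only needs to be truthy or falsy, and an else in the filter position is rejected by the parser. The sample data is invented, and compile() is used here only to trigger the syntax error without stopping the script.

```python
names = ["Ann", "", "Bob", None, "Eve"]

# Empty strings and None are falsy, so they are skipped.
non_empty = [name for name in names if name]
print(non_empty)
# ['Ann', 'Bob', 'Eve']

# An else after the filter condition does not parse.
try:
    compile("[i for i in range(3) if i else 0]", "<example>", "eval")
    else_allowed = True
except SyntaxError:
    else_allowed = False
print(else_allowed)
# False
```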

1.3. Ternary operator

If you want to modify a value added to the resulting sequence, but not skip it, you can use a ternary operator (aka conditional expression or inline if) in the expression part:

lst = [
    expression if condition else another_expression
    for item in iterable
]

Let’s generate a list of integers in which every odd number is negated:

[i if i % 2 == 0 else -i for i in range(10)]
# [0, -1, 2, -3, 4, -5, 6, -7, 8, -9]

The ternary operator is not part of the comprehension syntax – it is just an ordinary expression. But the two are often used together, and it is important to distinguish the ternary operator from a condition (filter) in a comprehension.
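
To make the distinction concrete, here is a sketch that uses both in one comprehension: the filter at the end drops multiples of 3, while the ternary operator at the start negates the odd numbers that remain.

```python
result = [
    i if i % 2 == 0 else -i   # ternary: changes the value
    for i in range(10)
    if i % 3 != 0             # filter: skips the element entirely
]
print(result)
# [-1, 2, 4, -5, -7, 8]
```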

1.4. Chained comprehensions

Comprehensions can be chained at the cost of their readability. They come in handy when you need to iterate over nested collections, for instance, a matrix (a list of lists). The classic example is flattening a matrix:

matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

flat_matrix = [item for row in matrix for item in row]
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

At first glance, they are not intuitive at all. Let’s get deeper into their structure:

lst = [
    resulting_expression
    for outer_item in outer_iterable
    for inner_item in inner_iterable
]

Both outer_item and inner_item are available within the resulting expression, and outer_item is also available in inner_iterable.

Let’s take a less abstract example (borrowed from here): we have a text that is a list of sentences, and each sentence is, in turn, a list of words. We need to get all the words into one list. This comprehension does it:

text = [
    ["ST ToS", "is", "the", "best"],
    ["after", "the", "Voyager"],
    ["Any", "objections?"]
]

words = [word for sentence in text for word in sentence]
#  ["ST ToS", "is", "the", "best",
#  "after", "the", "Voyager",
#  "Any", "objections?"]

The same can be done with a couple of nested for loops:

words = []
for sentence in text:
    for word in sentence:
        words.append(word)

More on how to read and understand comprehensions below.
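
For comparison, the standard library offers itertools.chain.from_iterable, which also flattens one level of nesting. The sketch below checks that it gives the same list as the chained comprehension; the shortened text list is made up for the example.

```python
from itertools import chain

text = [["live", "long"], ["and", "prosper"]]

words_comprehension = [word for sentence in text for word in sentence]
words_chain = list(chain.from_iterable(text))

print(words_comprehension)
# ['live', 'long', 'and', 'prosper']
print(words_comprehension == words_chain)
# True
```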

1.5. Nested comprehensions

Let’s say we need to generate a 3x3 matrix of consecutive numbers. We can do it in a loop:

cols_num, rows_num = 3, 3
matrix = []

for i in range(rows_num):
    row = []
    for j in range(cols_num):
        row.append(cols_num * i + j)
    matrix.append(row)

# [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

The same result can be achieved by nesting two comprehensions:

cols_num, rows_num = 3, 3

matrix = [
    [cols_num * i + j for j in range(cols_num)]
    for i in range(rows_num)
]
# [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

Here the ‘inner’ comprehension, which generates a row, is the expression (output) of the outer comprehension, which combines the rows into a matrix.
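
Another common use of a nested comprehension is transposing a matrix. This is only a sketch, and it assumes that all rows have the same length:

```python
matrix = [[0, 1, 2], [3, 4, 5]]

# The outer comprehension walks over column indexes,
# the inner one collects that column from every row.
transposed = [[row[col] for row in matrix] for col in range(len(matrix[0]))]
print(transposed)
# [[0, 3], [1, 4], [2, 5]]
```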

2. Types of comprehensions

There are three types of comprehensions in Python: list, dictionary, and set. Each creates a collection of the corresponding type. There are also generator expressions, which are not comprehensions per se but follow the same syntax.

2.1. List comprehensions

Above we have already reviewed the list comprehensions in detail – they are the oldest and the most used in the language.

2.2. Dictionary comprehensions

Structure of a dictionary comprehension:

dct = {
    key_expression: value_expression
    for item in iterable [if condition]
}

Usually item is two variables, for instance when you are transforming a list of pairs (tuples) into a dictionary:

pairs = [("apples", 10), ("bananas", 2), ("mangos", 1), ("avocados", 1)]

fruits = {fruit: number for fruit, number in pairs}
# {'apples': 10, 'bananas': 2, 'mangos': 1, 'avocados': 1}

You can still use a ternary operator in the key_expression and value_expression, but only in each expression separately. For example, if there is only one item of a particular fruit, we want to add it to the ‘other’ bucket. We might try doing it like this, applying the condition to the key of the dictionary being constructed:

fruits = {
    fruit if number > 1 else 'other': number
    for fruit, number in pairs
}
# {'apples': 10, 'bananas': 2, 'other': 1}

However, note that two fruits satisfied the condition and should have been counted in the ‘other’ bucket, but it holds only 1. The problem is that each time the comprehension assigns a value to a key that already exists in the dictionary, it overwrites the previous value.

The result we want can be achieved with an assignment expression – the so-called ‘walrus’ operator, which was introduced in Python 3.8:

other_counter = 0

fruits = {
    fruit if number > 1 else "other":
    number if number > 1 else (other_counter := other_counter + number)
    for fruit, number in pairs
}
# {'apples': 10, 'bananas': 2, 'other': 2}

The walrus operator does two things:

  1. Computes the value on the right and assigns it to the variable on the left, exactly like other_counter = other_counter + number would.
  2. Evaluates to the assigned value. That is why it is an expression and not a statement like =, which produces no value.
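
Whether the walrus version is readable is debatable. For comparison, here is a sketch of the same bucketing done with a plain loop and dict.get(), which needs no external counter:

```python
pairs = [("apples", 10), ("bananas", 2), ("mangos", 1), ("avocados", 1)]

fruits = {}
for fruit, number in pairs:
    key = fruit if number > 1 else "other"
    # dict.get() returns 0 when the key is not in the dictionary yet.
    fruits[key] = fruits.get(key, 0) + number

print(fruits)
# {'apples': 10, 'bananas': 2, 'other': 2}
```

Note that, unlike the comprehension, this loop would also add up the numbers for a fruit name that appears in pairs more than once.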

2.3. Set comprehensions

Set comprehensions work the same as list comprehensions, except that they are enclosed in curly braces:

{expression for item in iterable [if condition]}

For example:

{i for i in range(10)}
# {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}

Since sets can hold only unique items, if a set comprehension produces an item more than once, only one copy will be in the set.
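
A short sketch of that deduplication, with an invented list of names. The result is sorted only to get a stable order for printing, since sets are unordered.

```python
words = ["Spock", "spock", "Kirk", "KIRK", "Bones"]

# Different spellings collapse into one item after lower().
unique_names = {word.lower() for word in words}
print(sorted(unique_names))
# ['bones', 'kirk', 'spock']
```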

2.4. Generator expressions/comprehensions

Generators are a topic in themselves. In short, they produce a sequence element by element, one at a time. This saves memory, since the whole sequence is never stored: new elements are generated on demand.

The most used generator-like object is range() – technically it is not a generator but a lazy sequence type, yet it behaves similarly, computing each item on demand.

Generator comprehensions (actually called generator expressions) follow the same syntax as list comprehensions, but use round brackets:

g = (i for i in range(10000))
sum(g)
# 49995000

This example is more memory-efficient than first creating a list and then summing it.
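
One rough way to see the difference is sys.getsizeof(), as in the sketch below. It measures only the container object itself, not the integers it refers to, so treat it as an indication rather than a precise memory profile.

```python
import sys

numbers_list = [i for i in range(10000)]
numbers_gen = (i for i in range(10000))

# The list stores 10000 references; the generator stores only its state.
print(sys.getsizeof(numbers_list) > sys.getsizeof(numbers_gen))
# True
print(sum(numbers_list) == sum(numbers_gen))
# True
```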

The example above is not very useful: it duplicates what range already does. Let’s make it better by adding a condition so that the generator returns only even numbers:

g = (i for i in range(10) if i % 2 == 0)

for i in g:
    print(i)
# 0
# 2
# 4
# 6
# 8
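
One thing to keep in mind: unlike a list, a generator can be consumed only once. A short sketch:

```python
g = (i for i in range(10) if i % 2 == 0)

first = next(g)   # takes one element from the generator
rest = list(g)    # consumes everything that is left
again = list(g)   # the generator is exhausted now

print(first, rest, again)
# 0 [2, 4, 6, 8] []
```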

3. How to read comprehensions

The order of expressions and statements in comprehensions is almost the same as in for-loops. The only exception: the resulting expression which is added to the final sequence goes first.

Let’s look at an example, somewhat convoluted, but illustrative, numbering ‘actions’ on the data:

negative_text = [
    ['This', 'is', 'a', 'negative', 'text'],
    ['You', 'can', 'not', 'change', 'it']
]

flat_positive_text = []

for sentence in negative_text:  # 1
    if len(sentence) > 1:  # 2
        for word in sentence:  # 3
            if word != 'not':  # 4
                flat_positive_text.append(  # 5
                    word if word != 'negative' else 'positive'  # 5
                )  # 5

#  ['This', 'is', 'a', 'positive', 'text', 'You', 'can', 'change', 'it']

Here we again flatten a list of sentences with three additional twists:

  1. We keep only sentences that are longer than one word. Others are too small to matter.
  2. We skip the word “not”.
  3. We substitute the word “negative” with “positive”.

Now in comprehension mode. The line breaks are not required, but they help readability and make the parallel with the loop easier to see.

flat_positive_text = [
    word if word != 'negative' else 'positive'  # 5
    for sentence in negative_text  # 1
    if len(sentence) > 1  # 2
    for word in sentence  # 3
    if word != 'not'  # 4
]
#  ['This', 'is', 'a', 'positive', 'text', 'You', 'can', 'change', 'it']

All clauses (for and if) in the comprehension go in the same order, word for word, as in the loops. Only the expression has moved to the beginning, and we do not need list.append(), since it is implied by the comprehension syntax.

4. Scope of variables in comprehensions

Comprehensions have their own scope. The iteration variable(s):

  1. do not modify existing variables:

    i = 'value'
    lst = [i for i in range(10)]
    print(i)
    # value

  2. are not available outside the comprehension:

    lst = [i for i in range(10)]
    print(i)
    # NameError: name 'i' is not defined

In other words, they behave the same as in a function scope:

i = 'value'

def func():
    i = 0

func()
print(i)
# value
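
By contrast, a regular for loop does not create a scope of its own, so its variable stays around after the loop. A sketch with made-up names:

```python
for j in range(3):
    pass
print(j)  # the loop variable is still defined after the loop
# 2

squares = [k * k for k in range(3)]
try:
    k
    leaked = True
except NameError:
    leaked = False
print(leaked)  # the comprehension variable is not
# False
```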

5. When (not) to use comprehensions

Before we answer that, we need to consider why comprehensions were added to the language in the first place. PEP 202, which proposed list comprehensions, says:

List comprehensions provide a more concise way to create lists in situations where map() and filter() and/or nested loops would currently be used.

Therefore, the rationale behind comprehensions is to make code more concise and readable. Since comprehensions are compiled to specialized bytecode (they avoid looking up and calling list.append() on every iteration), they are usually a bit faster than an equivalent for loop, but I would argue that this is not important and should not be the sole reason for using them outside of very niche cases. More on the performance of comprehensions here.

Therefore, use comprehensions when they make your code more readable. I like how it is put in the Google Python Style Guide:
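
If you want to check the speed difference on your own machine, a rough sketch with the standard timeit module could look like this. The function names and sizes are arbitrary, and the timings depend on the Python version and hardware, so no numbers are shown.

```python
import timeit

def with_loop(n=1000):
    result = []
    for i in range(n):
        result.append(i * 2)
    return result

def with_comprehension(n=1000):
    return [i * 2 for i in range(n)]

# Each function is called 200 times; timeit returns the total time in seconds.
loop_time = timeit.timeit(with_loop, number=200)
comp_time = timeit.timeit(with_comprehension, number=200)
print(f"loop: {loop_time:.4f}s, comprehension: {comp_time:.4f}s")
```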

Okay to use for simple cases. Each portion must fit on one line: mapping expression, for clause, filter expression. Multiple for clauses or filter expressions are not permitted. Use loops instead when things get more complicated.

Their most complicated acceptable example:

descriptive_name = [
    transform({'key': key, 'value': value}, color='black')
    for key, value in generate_iterable(some_input)
    if complicated_condition_is_met(key, value)
]

Anything more complicated will reduce readability instead of improving it. A performance win is very unlikely to justify the loss of readability.

6. Common use cases

Below are some common cases for using comprehensions.

6.1. Filtering

Example: select the clients whose plan has expired in order to contact them:

clients  # list of Client objects

clients_to_contact = [
    client for client in clients if client.is_plan_expired is True
]

Note: It is not necessary to use is True in the condition, but I find it more expressive and readable.
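
To make the snippet runnable, here is a sketch with a hypothetical Client dataclass. The class and its name and is_plan_expired fields are assumptions made for this example, not a real API.

```python
from dataclasses import dataclass

@dataclass
class Client:  # hypothetical class, for illustration only
    name: str
    is_plan_expired: bool

clients = [
    Client("Ann", True),
    Client("Bob", False),
    Client("Eve", True),
]

clients_to_contact = [
    client for client in clients if client.is_plan_expired is True
]
print([client.name for client in clients_to_contact])
# ['Ann', 'Eve']
```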

6.2. Get the values of a certain property of objects

items = [
    {"id": 1, "price": 1.99}, 
    {"id": 2, "price": 2.99}, 
    {"id": 3, "price": 3.5}
]

prices = [item["price"] for item in items]
# [1.99, 2.99, 3.5]

6.3. Turn pair values into a dictionary

pairs = [("apples", 10), ("bananas", 2), ("mangos", 1)]

fruits = {fruit: number for fruit, number in pairs}
# {'apples': 10, 'bananas': 2, 'mangos': 1}
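
When the pairs need no transformation, the built-in dict() does the same job; a comprehension pays off once keys or values have to be changed. A small sketch, in which the inversion assumes that the values are unique:

```python
pairs = [("apples", 10), ("bananas", 2), ("mangos", 1)]

fruits = {fruit: number for fruit, number in pairs}
print(fruits == dict(pairs))
# True

# Swap keys and values (later duplicates would overwrite earlier ones).
by_number = {number: fruit for fruit, number in fruits.items()}
print(by_number)
# {10: 'apples', 2: 'bananas', 1: 'mangos'}
```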

6.4. Generate a list or matrix

Example: a 5 x 5 matrix of random floats from -1 to 1:

import random

rows_num, cols_num = 5, 5

matrix = [
    [random.uniform(-1, 1) for _ in range(cols_num)]
    for _ in range(rows_num)
]
# Sample output for a 3 x 3 matrix (values differ on every run):
# [[0.12177242274032363, 0.6106040351658775, -0.07131039288774255],
# [-0.8164461988134857, -0.14904520472004235, -0.587013387046897],
# [0.1191384124509558, 0.9415088022597389, 0.5419266415433419]]

Note: Since we do not need the range values here, we named them _, as per the Python convention for values to ignore. Using the same variable name in both comprehensions is not a problem: each comprehension has its own scope, and neither of them uses the value of _.
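
A related pitfall: building a matrix by multiplying a list repeats references to the same row, while a nested comprehension creates a new row each time. A minimal sketch:

```python
rows_num, cols_num = 2, 3

# The same row object is referenced twice.
aliased = [[0] * cols_num] * rows_num
# A new row is created on every iteration.
independent = [[0] * cols_num for _ in range(rows_num)]

aliased[0][0] = 1
independent[0][0] = 1

print(aliased)
# [[1, 0, 0], [1, 0, 0]]
print(independent)
# [[1, 0, 0], [0, 0, 0]]
```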