Performance of application-defined functions and collations in SQLite

When writing the post on Unicode case-insensitive search in SQLite I got interested in the performance of application-defined functions and collations: they must introduce overhead, the question is how much and whether it is something to be concerned about.

To answer these questions I made a simple test suite measuring the performance of both built-in SQLite and application-defined functions and collations – the full code is at the very end of the post. The basic idea of the performance test:

Create 10,000 rows of random strings (same across all tests).
For every function/collation: run a SELECT query for each of the strings, i.e. 10,000 times. Without an index, it will force a full table scan each time. In total it gives 100m calls to the function/collation being tested.
Build an index that can be used in the queries above.
Run the same queries with an index.
Repeat it 10 times to smooth out the effect of random factors and average the result.

I omit some details, feel free to look into the code. For bonus credit, we count the number of calls to some of the functions and collations. All performance was measured without counters since they introduce overhead that would distort the results.

Note on absolute numbers vs. relative comparison

In this kind of test, it does not make sense to compare absolute numbers: different hardware will produce different results. What took my computer 30 seconds to do, may take yours 15, 42, or any other number of seconds. The purpose of this exercise is to test the relative performance of the functions and whether it matters.

All figures are in seconds and include running the test in full, meaning in most cases 100m calls. One notable exception: relevant indexes are built only once per test.

Application-defined functions

We have three main test subjects:

def do_nothing(a: str):
    return a


def get_len(a: str):
    return len(a)


def casefold(a: str):
    return a.casefold()

Function	Description	Time, 10k queries
No function	To establish a baseline for how long pure calls take	4.372
`do_nothing`	Does nothing, establishes baseline overhead for calling an application-defined function compared to running no function	23.877
`length`	Built-in string length function to compare with custom get_length	5.799
`get_length`	Custom string length function; the only function that has identical built-in counterpart	22.0
`lower`	Built-in function to returning string in lower case, works only for ASCII symbols	12.012
`casefold`	Returns casefolded (i.e. “aggressively lowered” string) with full Unicode support	29.102

Conclusion: application-defined functions have relatively significant overhead: for 100m calls it amounted to 23.877 - 4.372 ≈ 19 seconds. To put it in perspective the overhead of calling casefold function amounted to 19 / 29.102 ≈ 65% of its running time.

Application-defined collations

Main contestants in the collations category:

def do_nothing_collation(a: str, b: str):
    return -1


def comparison_collation(a: str, b: str):
    if a < b:
        return -1
    if a > b:
        return 1
    return 0


def unicode_nocase(a: str, b: str):
    if a.casefold() < b.casefold():
        return -1
    if a.casefold() > b.casefold():
        return 1
    return 0


def unicode_nocase_opt(a: str, b: str):
    a = a.casefold()
    b = b.casefold()
    if a < b:
        return -1
    if a > b:
        return 1
    return 0

Collation	Description	Time, 10k queries
`do_nothing_collation`	Always returns -1, meaning that the first element is less than the second, to establish a baseline	18.855
`comparison_collation`	Simply compares incoming strings, case-sensitive	23.355
`nocase`	Built-in case-insensitive collation, works only for ASCII symbols	4.593
`unicode_nocase`	Case-insensitive search with full Unicode support; naive implementation: in the worst case it casefolds each string twice	36.272
`unicode_nocase_opt`	Optimized version of previous collation: it casefolds the strings only once and stores them in case they are needed the second time	32.741

Conclusions:

Do-nothing application-defined collation performed similarly to do-nothing function, implying similar overhead – in total 14 seconds on 100m calls.
From a purely computational standpoint collations are slower than SQL functions: collations operate with two strings, contain conditions and comparisons. During full table lookups, a function will be faster.
The optimized version of the collation is about 10% faster compared to naive. Other tweaks – new variable vs. overwriting and assignment vs. walrus operator vs. unpacking – did not affect the performance in any measurable way.

Indexes

The next question is how indexes affect the performance of application-defined functions and collations. The short answer is: indexes make the queries almost instantaneous, eliminating all meaningful overhead.

Function/collation	Without index, 10k queries	Time to build index	With index, 10k queries
`lower` function	12.012	0.004	0.016
`casefold` function	29.102	0.006	0.007
`nocase` built-in collation	4.593	0.005	0.036
`unicode_nocase` collation	36.272	0.04	0.027
`unicode_nocase_opt` collation	32.741	0.053	0.069

Conclusions:

In all cases, 10k SELECT queries with an index took less than 0.1 seconds, compared to 5-20+ seconds without an index.
Building an index is a relatively expensive operation – one building of an index took longer than 10k queries using it (except for unicode_nocase which is likely just a fluctuation).

Number of calls

As a side effect, the test suite also counts the number of calls to functions/collations when building and using indexes. With application-defined functions, there were no surprises: the function was called N (number of elements) times when building an index and was not called at all when using it.

With collations the situation is different. When building an index on a table with 10k random strings the collation was called about 124k times, which is approximately O(N * log N) and corresponds to what we should expect according to the SQLite documentation.

The collation is still used when traversing the index tree. The number of times is approximately log N, i.e. 14 times for an index of 10k items.

Side note: selecting all items without an index with ORDER BY triggers the same number of calls as building an index both for SQL functions and collations.

Conclusion: does the overhead matter?

Not much for several reasons.

First, sure, application-defined functions and collations involve relatively significant overhead, but they are still very fast, thanks to the overall efficiency of SQLite. If your database is small and requests are not very frequent, you will not see any issues even without indexes.

Second, if your database is large, you will need indexes anyway. With them, the performance overhead will be eliminated.

Finally, you do not have a choice. If you are thinking of application-defined functions or collations, you have to do something that SQLite itself cannot and any solution will have its cost, regardless of the solution.

If anything, this for the lack of a better word “research” underscores the importance of using the right indexes and their effect on performance.

Code

I see no use in publishing this code on GitHub. The code below requires no additional dependencies and can be run with Python 3.8+.

Code

import random
import sqlite3
import string
from time import time
from typing import Callable, List, Optional, Tuple
from pprint import pprint


"""
Functions and collations to be tests.
Allows for counting number of calls with a global counter
"""

# Global variable to count number of calls in some functions
counter = 0


def casefold(a: str):
    return a.casefold()


def casefold_c(a: str):
    global counter
    counter += 1
    return a.casefold()


def get_len(a: str):
    return len(a)


def get_len_c(a: str):
    global counter
    counter += 1
    return len(a)


def do_nothing(a: str):
    return a


def do_nothing_c(a: str):
    global counter
    counter += 1
    return a


def do_nothing_collation_c(a: str, b: str):
    global counter
    counter += 1
    return -1


def do_nothing_collation(a: str, b: str):
    return -1


def comparison_collation(a: str, b: str):
    if a < b:
        return -1
    if a > b:
        return 1
    return 0


def unicode_nocase_collation_naive_c(a: str, b: str):
    global counter
    counter += 1
    if a.casefold() < b.casefold():
        return -1
    if a.casefold() > b.casefold():
        return 1
    return 0


def unicode_nocase_collation_naive(a: str, b: str):
    if a.casefold() < b.casefold():
        return -1
    if a.casefold() > b.casefold():
        return 1
    return 0


def unicode_nocase_collation_new_var(a: str, b: str):
    ac = a.casefold()
    bc = b.casefold()
    if ac < bc:
        return -1
    if ac > bc:
        return 1
    return 0


def unicode_nocase_collation_new_var_decomposition(a: str, b: str):
    ac, bc = a.casefold(), b.casefold()
    if ac < bc:
        return -1
    if ac > bc:
        return 1
    return 0


def unicode_nocase_collation_new_var_walrus(a: str, b: str):
    if (ac := a.casefold()) < (bc := b.casefold()):
        return -1
    if ac > bc:
        return 1
    return 0


def unicode_nocase_collation_var_over(a: str, b: str):
    a = a.casefold()
    b = b.casefold()
    if a < b:
        return -1
    if a > b:
        return 1
    return 0


def unicode_nocase_collation_var_over_walrus(a: str, b: str):
    if (a := a.casefold()) < (b := b.casefold()):
        return -1
    if a > b:
        return 1
    return 0


def unicode_nocase_collation_var_over_decomposition(a: str, b: str):
    a, b = a.casefold(), b.casefold()
    if a < b:
        return -1
    if a > b:
        return 1
    return 0


"""Helpers and test runner"""


def random_string(length: int = 16):
    """
    Generate random ASCII characters (both cases)
    and digits string of required length
    """
    characters = string.ascii_letters + string.digits
    return "".join(random.choice(characters) for _ in range(length))


def set_up_db(rows) -> sqlite3.Connection:
    # DB setup
    connection = sqlite3.connect(":memory:")
    cursor = connection.cursor()
    cursor.execute("CREATE TABLE test (id INTEGER, text VARCHAR);")

    # Insert rows
    cursor.executemany(
        "INSERT INTO test (id, text) VALUES (?, ?)",
        rows
    )
    connection.commit()
    return connection


def run_query(cursor: sqlite3.Cursor, query: str, rows=None) -> float:
    """Run query and return duration."""
    counter = 0
    start_time = time()
    if rows is None:
        cursor.execute(query)
        cursor.fetchall()
    else:
        for row in rows:
            cursor.execute(query, [row[1]])
            cursor.fetchall()
    end_time = time()
    return end_time - start_time


def test_performance(
    rows: List[Tuple[int, str]],
    mode: str = "",
    callable: Optional[Callable] = None,
    name: str = "",
    printout: bool = False,
):
    # Set up DB
    connection = set_up_db(rows)
    cursor = connection.cursor()
    print_counter = False

    timing = {
        "select_no_index": 0.0,
        "order_no_index": 0.0,
        "build_index": 0.0,
        "select_with_index": 0.0,
        "order_with_index": 0.0,
    }

    if callable is not None:
        name = callable.__name__.upper()

    if name.endswith("_C"):  # Convention for testing number of runs
        print_counter = True

    if mode == "collation":
        select_query = f"SELECT id, text FROM test WHERE text = ? COLLATE {name};"
        order_query = f"SELECT id, text FROM test ORDER BY text COLLATE {name} ASC "
        index_query = f"CREATE INDEX idx1 ON test (text COLLATE {name});"
        if callable is not None:
            connection.create_collation(name, callable)

    else:  # mode == 'function' and catch all
        select_query = f"SELECT id, text FROM test WHERE {name}(text) = ?;"
        order_query = f"SELECT id, text FROM test ORDER BY {name}(text) ASC"
        index_query = f"CREATE INDEX idx1 ON test ({name}(text));"
        if callable is not None:
            connection.create_function(name, 1, callable, deterministic=True)

    if printout:
        print(f"Testing {mode} {name}")

    # Run SELECT query without index
    timing["select_no_index"] = run_query(cursor, select_query, rows)
    if printout:
        print(f"{timing['select_no_index']:.3f} sec - search without index")
        if print_counter:
            print(f"calls = {counter:,}, {counter/len(rows):,.1f} per query")

    # Run ORDER BY query without an index
    timing["order_no_index"] = run_query(cursor, order_query)
    if printout:
        print(
            f"{timing['order_no_index']:.3f} sec - select all items with ORDER BY without an index"
        )
        if print_counter:
            print(f"calls = {counter:,}, {counter/len(rows):,.1f} per item")

    # Build index
    timing["build_index"] = run_query(cursor, index_query)
    if printout:
        print(f"{timing['build_index']:.3f} sec - built index")
        if print_counter:
            print(f"calls = {counter:,}, {counter/len(rows):,.1f} per item")

    # Run query with an index
    timing["select_with_index"] = run_query(cursor, select_query, rows)
    if printout:
        print(f"{timing['select_with_index']:.3f} sec - search with index")
        if print_counter:
            print(f"calls = {counter:,}, {counter/len(rows):,.1f} per query")

    # Run ORDER BY query with an index
    timing["order_with_index"] = run_query(cursor, order_query)
    if printout:
        print(
            (
                f"{timing['order_with_index']:.3f} sec - select all items with ORDER BY with an index"
            )
        )
        if print_counter:
            print(f"calls = {counter:,}, {counter/len(rows):,.1f} per item")

        print()

    connection.close()

    return timing, name


"""Running tests"""


def main():
    ROWS_NUMBER = 10000
    PASSES_NUMBER = 10
    printout = False

    tests = [
        ## Functions
        # Establish baseline performance
        ("function", None, ""),
        ("function", do_nothing),
        ("function", None, "LENGTH"),
        ("function", get_len),
        # Compare lowering functions
        ("function", None, "LOWER"),
        ("function", casefold),
        ("function", casefold_c),
        ## Collations
        # Establish baseline performance
        ("collation", do_nothing_collation),
        # Simplest collation, case-sensitive comparison
        ("collation", comparison_collation),
        # Compare built-in collation with naive implementation
        ("collation", None, "BINARY"),
        ("collation", None, "NOCASE"),
        ("collation", unicode_nocase_collation_naive),
        ("collation", unicode_nocase_collation_naive_c),
        # Compare implementations using a new variable for casefolded strings
        ("collation", unicode_nocase_collation_new_var),
        ("collation", unicode_nocase_collation_new_var_walrus),
        ("collation", unicode_nocase_collation_new_var_decomposition),
        # Compare implementations overwriting passed variables with casefolded strings
        ("collation", unicode_nocase_collation_var_over),
        ("collation", unicode_nocase_collation_var_over_walrus),
        ("collation", unicode_nocase_collation_var_over_decomposition),
    ]

    rows = [(i, random_string()) for i in range(ROWS_NUMBER)]

    for test in tests:
        results = {
            "select_no_index": 0.0,
            "order_no_index": 0.0,
            "build_index": 0.0,
            "select_with_index": 0.0,
            "order_with_index": 0.0,
        }
        for _ in range(PASSES_NUMBER):
            timing, name = test_performance(rows, *test, printout=printout)
            results = {k: results[k] + timing[k] for k in results.keys()}

        print(name)
        pprint({k: round(results[k] / PASSES_NUMBER, 3) for k in results.keys()})
        print()


if __name__ == "__main__":
    main()

Application-defined functions#

Application-defined collations#

Indexes#

Number of calls#

Conclusion: does the overhead matter?#

Code#

Application-defined functions

Application-defined collations

Indexes

Number of calls

Conclusion: does the overhead matter?

Code