When writing the post on Unicode case-insensitive search in SQLite I got interested in the performance of application-defined functions and collations: they must introduce overhead; the question is how much, and whether it is something to be concerned about.
To answer these questions I made a simple test suite measuring the performance of both built-in and application-defined SQLite functions and collations – the full code is at the very end of the post. The basic idea of the performance test:
- Create 10,000 rows of random strings (same across all tests).
- For every function/collation: run a `SELECT` query for each of the strings, i.e. 10,000 times. Without an index, each query forces a full table scan. In total this gives 100m calls to the function/collation being tested.
- Build an index that can be used in the queries above.
- Run the same queries with the index.
- Repeat it all 10 times to smooth out the effect of random factors and average the results.
I omit some details here; feel free to look into the code. For bonus credit, the suite also counts the number of calls to some of the functions and collations. All timings were measured without the counters, since they introduce overhead that would distort the results.
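To make the setup concrete, here is a condensed sketch of a single pass. The full suite at the end of the post plugs the function or collation under test into the query and adds `ORDER BY` queries, indexes, call counters, and averaging over 10 passes; the placeholder strings below stand in for the random ones:

```python
import sqlite3
from time import time

connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE test (id INTEGER, text VARCHAR);")
rows = [(i, f"row-{i}") for i in range(10_000)]  # placeholder strings
cursor.executemany("INSERT INTO test (id, text) VALUES (?, ?)", rows)

start = time()
for _, text in rows:
    # No index on text, so each query is a full table scan
    cursor.execute("SELECT id, text FROM test WHERE text = ?;", [text])
    cursor.fetchall()
print(f"{time() - start:.3f} sec for 10k full-scan queries")
```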
Note on absolute numbers vs. relative comparison
In this kind of test, it does not make sense to compare absolute numbers: different hardware will produce different results. What took my computer 30 seconds may take yours 15, 42, or any other number of seconds. The purpose of this exercise is to compare the relative performance of the functions and to see whether the differences matter.
All figures are in seconds and include running the test in full, meaning in most cases 100m calls. One notable exception: relevant indexes are built only once per test.
Application-defined functions
We have three main test subjects:
```python
def do_nothing(a: str):
    return a

def get_len(a: str):
    return len(a)

def casefold(a: str):
    return a.casefold()
```
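These are plugged into SQLite with sqlite3's `create_function`, just as the test runner at the end of the post does. `deterministic=True` (available since Python 3.8) is what later allows SQLite to use the function in an index:

```python
import sqlite3

connection = sqlite3.connect(":memory:")
# 1 is the number of arguments the SQL function accepts;
# deterministic=True permits the function in index expressions
connection.create_function("casefold", 1, casefold, deterministic=True)

cursor = connection.cursor()
print(cursor.execute("SELECT casefold('Straße');").fetchone())  # ('strasse',)
```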
| Function | Description | Time, 10k queries |
|---|---|---|
| No function | Establishes a baseline for how long the bare queries take | 4.372 |
| `do_nothing` | Does nothing; establishes the baseline overhead of calling an application-defined function compared to running no function | 23.877 |
| `length` | Built-in string length function, to compare with the custom `get_len` | 5.799 |
| `get_len` | Custom string length function; the only custom function with an identical built-in counterpart | 22.0 |
| `lower` | Built-in function returning the string in lower case; works only for ASCII symbols | 12.012 |
| `casefold` | Returns the casefolded (i.e. “aggressively lowered”) string, with full Unicode support | 29.102 |
Conclusion: application-defined functions have relatively significant overhead: for 100m calls it amounted to 23.877 - 4.372 ≈ 19 seconds. To put it in perspective, the overhead of calling the `casefold` function amounted to 19 / 29.102 ≈ 65% of its running time.
Application-defined collations
Main contestants in the collations category:
```python
def do_nothing_collation(a: str, b: str):
    return -1

def comparison_collation(a: str, b: str):
    if a < b:
        return -1
    if a > b:
        return 1
    return 0

def unicode_nocase(a: str, b: str):
    if a.casefold() < b.casefold():
        return -1
    if a.casefold() > b.casefold():
        return 1
    return 0

def unicode_nocase_opt(a: str, b: str):
    a = a.casefold()
    b = b.casefold()
    if a < b:
        return -1
    if a > b:
        return 1
    return 0
```
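A collation is registered with `create_collation` and invoked with the `COLLATE` keyword, the same way the test runner below wires it up:

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.create_collation("unicode_nocase", unicode_nocase)

cursor = connection.cursor()
cursor.execute("CREATE TABLE test (id INTEGER, text VARCHAR);")
cursor.execute("INSERT INTO test VALUES (1, 'STRASSE');")
# 'straße'.casefold() == 'STRASSE'.casefold() == 'strasse', so the row matches
print(cursor.execute(
    "SELECT id FROM test WHERE text = 'straße' COLLATE unicode_nocase;"
).fetchall())  # [(1,)]
```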
| Collation | Description | Time, 10k queries |
|---|---|---|
| `do_nothing_collation` | Always returns -1 (meaning the first element is less than the second), to establish a baseline | 18.855 |
| `comparison_collation` | Simply compares the incoming strings, case-sensitively | 23.355 |
| `nocase` | Built-in case-insensitive collation; works only for ASCII symbols | 4.593 |
| `unicode_nocase` | Case-insensitive comparison with full Unicode support; naive implementation: in the worst case it casefolds each string twice | 36.272 |
| `unicode_nocase_opt` | Optimized version of the previous collation: it casefolds the strings only once and stores them in case they are needed a second time | 32.741 |
Conclusions:
- The do-nothing application-defined collation performed similarly to the do-nothing function, implying similar overhead: about 14 seconds in total per 100m calls.
- From a purely computational standpoint, collations do more work than SQL functions: they operate on two strings and contain conditions and comparisons. During full table scans, a function will therefore be faster.
- The optimized version of the collation is about 10% faster than the naive one. Other tweaks – new variable vs. overwriting, and plain assignment vs. walrus operator vs. unpacking – did not affect the performance in any measurable way.
Indexes
The next question is how indexes affect the performance of application-defined functions and collations. The short answer is: indexes make the queries almost instantaneous, eliminating all meaningful overhead.
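Concretely, the two kinds of indexes used in the tests look like this: an expression index for the function case (which is why the functions are registered as deterministic) and a collated index for the collation case. A query only uses such an index if it repeats the same expression or collation; the index names here are illustrative, the runner itself uses a single index named `idx1`:

```python
import sqlite3

connection = sqlite3.connect(":memory:")
# casefold and unicode_nocase as defined earlier in the post
connection.create_function("casefold", 1, casefold, deterministic=True)
connection.create_collation("unicode_nocase", unicode_nocase)
cursor = connection.cursor()
cursor.execute("CREATE TABLE test (id INTEGER, text VARCHAR);")

# Expression index: used by queries like WHERE casefold(text) = ?
cursor.execute("CREATE INDEX idx_func ON test (casefold(text));")

# Collated index: used by queries like WHERE text = ? COLLATE unicode_nocase
cursor.execute("CREATE INDEX idx_coll ON test (text COLLATE unicode_nocase);")
```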
| Function/collation | Without index, 10k queries | With index, 10k queries | Time to build index |
|---|---|---|---|
| `lower` function | 12.012 | 0.004 | 0.016 |
| `casefold` function | 29.102 | 0.006 | 0.007 |
| `nocase` built-in collation | 4.593 | 0.005 | 0.036 |
| `unicode_nocase` collation | 36.272 | 0.04 | 0.027 |
| `unicode_nocase_opt` collation | 32.741 | 0.053 | 0.069 |
Conclusions:
- In all cases, 10k `SELECT` queries with an index took less than 0.1 seconds, compared to roughly 4.5-36 seconds without one.
- Building an index is a relatively expensive operation: a single index build took longer than 10k queries using it (except for `unicode_nocase`, which is likely just a fluctuation).
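A quick way to confirm that the queries actually hit the index instead of falling back to a scan (not part of the suite, just a sanity check, continuing the setup from the sketch above):

```python
plan = cursor.execute(
    "EXPLAIN QUERY PLAN SELECT id, text FROM test WHERE casefold(text) = ?;",
    ["abc"],
).fetchall()
# Expect a detail string along the lines of:
#   SEARCH test USING INDEX idx_func (<expr>=?)
print(plan)
```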
Number of calls
As a side effect, the test suite also counts the number of calls to functions/collations when building and using indexes. With application-defined functions there were no surprises: the function was called N (the number of elements) times when building an index and was not called at all when using it.
With collations the situation is different. When building an index on a table with 10k random strings, the collation was called about 124k times – approximately O(N * log N) (for N = 10k, N × log₂ N ≈ 133k) – which corresponds to what we should expect according to the SQLite documentation.
The collation is still used when traversing the index tree: approximately log N times per query, i.e. about 14 times for an index of 10k items.
Side note: selecting all items without an index with `ORDER BY` triggers the same number of calls as building an index, both for SQL functions and collations.
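The counting itself is done with the `_c` variants of the functions and a global counter (the `_C` suffix convention in the code below). A decorator would achieve the same without duplicating each function – a sketch of that alternative, not what the suite uses:

```python
from functools import wraps

def counted(func):
    """Wrap a function/collation so that each call bumps wrapper.calls."""
    @wraps(func)  # preserves __name__, which the test runner uses for naming
    def wrapper(*args):
        wrapper.calls += 1
        return func(*args)
    wrapper.calls = 0
    return wrapper

# Register the wrapped function instead of a hand-written casefold_c:
counted_casefold = counted(casefold)
connection.create_function("casefold", 1, counted_casefold, deterministic=True)
# ... run the queries under test, then read counted_casefold.calls
```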
Conclusion: does the overhead matter?
Not much, for several reasons.
First, sure, application-defined functions and collations involve relatively significant overhead, but they are still very fast, thanks to the overall efficiency of SQLite. If your database is small and requests are not very frequent, you will not see any issues even without indexes.
Second, if your database is large, you will need indexes anyway. With them, the performance overhead will be eliminated.
Finally, you do not really have a choice. If you are considering application-defined functions or collations, you need to do something that SQLite itself cannot, and any solution will have its cost.
If anything, this – for lack of a better word – “research” underscores the importance of using the right indexes and their effect on performance.
Code
I see no use in publishing this code on GitHub. The code below requires no additional dependencies and can be run with Python 3.8+.
```python
import random
import sqlite3
import string
from time import time
from typing import Callable, List, Optional, Tuple
from pprint import pprint

"""
Functions and collations to be tested.
Allows counting the number of calls with a global counter.
"""

# Global variable to count the number of calls in some functions
counter = 0

def casefold(a: str):
    return a.casefold()

def casefold_c(a: str):
    global counter
    counter += 1
    return a.casefold()

def get_len(a: str):
    return len(a)

def get_len_c(a: str):
    global counter
    counter += 1
    return len(a)

def do_nothing(a: str):
    return a

def do_nothing_c(a: str):
    global counter
    counter += 1
    return a

def do_nothing_collation_c(a: str, b: str):
    global counter
    counter += 1
    return -1

def do_nothing_collation(a: str, b: str):
    return -1

def comparison_collation(a: str, b: str):
    if a < b:
        return -1
    if a > b:
        return 1
    return 0

def unicode_nocase_collation_naive_c(a: str, b: str):
    global counter
    counter += 1
    if a.casefold() < b.casefold():
        return -1
    if a.casefold() > b.casefold():
        return 1
    return 0

def unicode_nocase_collation_naive(a: str, b: str):
    if a.casefold() < b.casefold():
        return -1
    if a.casefold() > b.casefold():
        return 1
    return 0

def unicode_nocase_collation_new_var(a: str, b: str):
    ac = a.casefold()
    bc = b.casefold()
    if ac < bc:
        return -1
    if ac > bc:
        return 1
    return 0

def unicode_nocase_collation_new_var_decomposition(a: str, b: str):
    ac, bc = a.casefold(), b.casefold()
    if ac < bc:
        return -1
    if ac > bc:
        return 1
    return 0

def unicode_nocase_collation_new_var_walrus(a: str, b: str):
    if (ac := a.casefold()) < (bc := b.casefold()):
        return -1
    if ac > bc:
        return 1
    return 0

def unicode_nocase_collation_var_over(a: str, b: str):
    a = a.casefold()
    b = b.casefold()
    if a < b:
        return -1
    if a > b:
        return 1
    return 0

def unicode_nocase_collation_var_over_walrus(a: str, b: str):
    if (a := a.casefold()) < (b := b.casefold()):
        return -1
    if a > b:
        return 1
    return 0

def unicode_nocase_collation_var_over_decomposition(a: str, b: str):
    a, b = a.casefold(), b.casefold()
    if a < b:
        return -1
    if a > b:
        return 1
    return 0

"""Helpers and test runner"""

def random_string(length: int = 16):
    """
    Generate a random string of the required length
    from ASCII letters (both cases) and digits.
    """
    characters = string.ascii_letters + string.digits
    return "".join(random.choice(characters) for _ in range(length))

def set_up_db(rows) -> sqlite3.Connection:
    # DB setup
    connection = sqlite3.connect(":memory:")
    cursor = connection.cursor()
    cursor.execute("CREATE TABLE test (id INTEGER, text VARCHAR);")
    # Insert rows
    cursor.executemany(
        "INSERT INTO test (id, text) VALUES (?, ?)",
        rows
    )
    connection.commit()
    return connection

def run_query(cursor: sqlite3.Cursor, query: str, rows=None) -> float:
    """Run a query (once per row, if rows are given) and return the duration."""
    global counter  # reset the call counter before each measurement phase
    counter = 0
    start_time = time()
    if rows is None:
        cursor.execute(query)
        cursor.fetchall()
    else:
        for row in rows:
            cursor.execute(query, [row[1]])
            cursor.fetchall()
    end_time = time()
    return end_time - start_time

def test_performance(
    rows: List[Tuple[int, str]],
    mode: str = "",
    callable: Optional[Callable] = None,
    name: str = "",
    printout: bool = False,
):
    # Set up DB
    connection = set_up_db(rows)
    cursor = connection.cursor()
    print_counter = False
    timing = {
        "select_no_index": 0.0,
        "order_no_index": 0.0,
        "build_index": 0.0,
        "select_with_index": 0.0,
        "order_with_index": 0.0,
    }
    if callable is not None:
        name = callable.__name__.upper()
        if name.endswith("_C"):  # Convention for counting the number of calls
            print_counter = True
    if mode == "collation":
        select_query = f"SELECT id, text FROM test WHERE text = ? COLLATE {name};"
        order_query = f"SELECT id, text FROM test ORDER BY text COLLATE {name} ASC"
        index_query = f"CREATE INDEX idx1 ON test (text COLLATE {name});"
        if callable is not None:
            connection.create_collation(name, callable)
    else:  # mode == 'function' and catch-all
        select_query = f"SELECT id, text FROM test WHERE {name}(text) = ?;"
        order_query = f"SELECT id, text FROM test ORDER BY {name}(text) ASC"
        index_query = f"CREATE INDEX idx1 ON test ({name}(text));"
        if callable is not None:
            connection.create_function(name, 1, callable, deterministic=True)
    if printout:
        print(f"Testing {mode} {name}")
    # Run SELECT queries without an index
    timing["select_no_index"] = run_query(cursor, select_query, rows)
    if printout:
        print(f"{timing['select_no_index']:.3f} sec - search without index")
    if print_counter:
        print(f"calls = {counter:,}, {counter/len(rows):,.1f} per query")
    # Run ORDER BY query without an index
    timing["order_no_index"] = run_query(cursor, order_query)
    if printout:
        print(
            f"{timing['order_no_index']:.3f} sec - select all items with ORDER BY without an index"
        )
    if print_counter:
        print(f"calls = {counter:,}, {counter/len(rows):,.1f} per item")
    # Build the index
    timing["build_index"] = run_query(cursor, index_query)
    if printout:
        print(f"{timing['build_index']:.3f} sec - built index")
    if print_counter:
        print(f"calls = {counter:,}, {counter/len(rows):,.1f} per item")
    # Run SELECT queries with an index
    timing["select_with_index"] = run_query(cursor, select_query, rows)
    if printout:
        print(f"{timing['select_with_index']:.3f} sec - search with index")
    if print_counter:
        print(f"calls = {counter:,}, {counter/len(rows):,.1f} per query")
    # Run ORDER BY query with an index
    timing["order_with_index"] = run_query(cursor, order_query)
    if printout:
        print(
            f"{timing['order_with_index']:.3f} sec - select all items with ORDER BY with an index"
        )
    if print_counter:
        print(f"calls = {counter:,}, {counter/len(rows):,.1f} per item")
    print()
    connection.close()
    return timing, name

"""Running tests"""

def main():
    ROWS_NUMBER = 10000
    PASSES_NUMBER = 10
    printout = False
    tests = [
        ## Functions
        # Establish baseline performance
        ("function", None, ""),
        ("function", do_nothing),
        ("function", None, "LENGTH"),
        ("function", get_len),
        # Compare lowering functions
        ("function", None, "LOWER"),
        ("function", casefold),
        ("function", casefold_c),
        ## Collations
        # Establish baseline performance
        ("collation", do_nothing_collation),
        # Simplest collation, case-sensitive comparison
        ("collation", comparison_collation),
        # Compare built-in collations with the naive implementation
        ("collation", None, "BINARY"),
        ("collation", None, "NOCASE"),
        ("collation", unicode_nocase_collation_naive),
        ("collation", unicode_nocase_collation_naive_c),
        # Compare implementations using a new variable for casefolded strings
        ("collation", unicode_nocase_collation_new_var),
        ("collation", unicode_nocase_collation_new_var_walrus),
        ("collation", unicode_nocase_collation_new_var_decomposition),
        # Compare implementations overwriting passed variables with casefolded strings
        ("collation", unicode_nocase_collation_var_over),
        ("collation", unicode_nocase_collation_var_over_walrus),
        ("collation", unicode_nocase_collation_var_over_decomposition),
    ]
    rows = [(i, random_string()) for i in range(ROWS_NUMBER)]
    for test in tests:
        results = {
            "select_no_index": 0.0,
            "order_no_index": 0.0,
            "build_index": 0.0,
            "select_with_index": 0.0,
            "order_with_index": 0.0,
        }
        for _ in range(PASSES_NUMBER):
            timing, name = test_performance(rows, *test, printout=printout)
            results = {k: results[k] + timing[k] for k in results.keys()}
        print(name)
        pprint({k: round(results[k] / PASSES_NUMBER, 3) for k in results.keys()})
        print()

if __name__ == "__main__":
    main()
```