Sqlite User-Defined Functions
The sqlite_udf
playhouse module contains a number of user-definedfunctions, aggregates, and table-valued functions, which you may find useful.The functions are grouped in collections and you can register theseuser-defined extensions individually, by collection, or register everything.
Scalar functions are functions which take a number of parameters and return asingle value. For example, converting a string to upper-case, or calculatingthe MD5 hex digest.
Aggregate functions are like scalar functions that operate on multiple rows ofdata, producing a single result. For example, calculating the sum of a list ofintegers, or finding the smallest value in a particular column.
Table-valued functions are simply functions that can return multiple rows ofdata. For example, a regular-expression search function that returns all thematches in a given string, or a function that accepts two dates and generatesall the intervening days.
Note
To use table-valued functions, you will need to build theplayhouse._sqlite_ext
C extension.
Registering user-defined functions:
- db = SqliteDatabase('my_app.db')
- # Register *all* functions.
- register_all(db)
- # Alternatively, you can register individual groups. This will just
- # register the DATE and MATH groups of functions.
- register_groups(db, 'DATE', 'MATH')
- # If you only wish to register, say, the aggregate functions for a
- # particular group or groups, you can:
- register_aggregate_groups(db, 'DATE')
- # If you only wish to register a single function, then you can:
- from playhouse.sqlite_udf import gzip, gunzip
- db.register_function(gzip, 'gzip')
- db.register_function(gunzip, 'gunzip')
Using a library function (“hostname”):
- # Assume we have a model, Link, that contains lots of arbitrary URLs.
- # We want to discover the most common hosts that have been linked.
- query = (Link
- .select(fn.hostname(Link.url).alias('host'), fn.COUNT(Link.id))
- .group_by(fn.hostname(Link.url))
- .order_by(fn.COUNT(Link.id).desc())
- .tuples())
- # Print the hostname along with number of links associated with it.
- for host, count in query:
- print('%s: %s' % (host, count))
Functions, listed by collection name
Scalar functions are indicated by (f)
, aggregate functions by (a)
, andtable-valued functions by (t)
.
CONTROL_FLOW
ifthen_else
(_cond, truthy[, falsey=None])- Simple ternary-type operator, where, depending on the truthiness of the
cond
parameter, either thetruthy
orfalsey
value will bereturned.
DATE
Parameters:date_str – A datetime, encoded as a string.Returns:The datetime with any timezone info stripped off.
The time is not adjusted in any way, the timezone is simply removed.
Parameters:
- nseconds (int) – Number of seconds, total, in timedelta.
- glue (str) – Fragment to join values.Returns:Easy-to-read description of timedelta.
Example, 86471 -> “1 day, 1 minute, 11 seconds”
Parameters:datetime_value – A date-time.Returns:Minimum difference between any two values in list.
Aggregate function that computes the minimum difference between any twodatetimes.
Parameters:datetime_value – A date-time.Returns:Average difference between values in list.
Aggregate function that computes the average difference between consecutivevalues in the list.
Parameters:datetime_value – A date-time.Returns:Duration from smallest to largest value in list, in seconds.
Aggregate function that computes the duration from the smallest to thelargest value in the list, returned in seconds.
Parameters:
- start (datetime) – Start datetime
- stop (datetime) – Stop datetime
- step_seconds (int) – Number of seconds comprising a step.
Table-value function that returns rows consisting of the date/+time valuesencountered iterating from start to stop, step_seconds
at a time.
Additionally, if start does not have a time component and step_seconds isgreater-than-or-equal-to one day (86400 seconds), the values returned willbe dates. Conversely, if start does not have a date component, values willbe returned as times. Otherwise values are returned as datetimes.
Example:
- SELECT * FROM date_series('2017-01-28', '2017-02-02');
- value
- -----
- 2017-01-28
- 2017-01-29
- 2017-01-30
- 2017-01-31
- 2017-02-01
- 2017-02-02
FILE
Parameters:filename (str) – Filename to extract extension from.Returns:Returns the file extension, including the leading “.”.
Parameters:filename (str) – Filename to read.Returns:Contents of the file.
HELPER
Parameters:
- data (bytes) – Data to compress.
- compression (int) – Compression level (9 is max).Returns:Compressed binary data.
Parameters:data (bytes) – Compressed data.Returns:Uncompressed binary data.
Parameters:url (str) – URL to extract hostname from.Returns:hostname portion of URL
Parameters:key – Key to toggle.
Toggle a key between True/False state. Example:
- >>> toggle('my-key')
- True
- >>> toggle('my-key')
- False
- >>> toggle('my-key')
- True
Parameters:
- key – Key to set/retrieve.
- value – Value to set.Returns:Value associated with key.
Store/retrieve a setting in memory and persist during lifetime ofapplication. To get the current value, only specify the key. To set a newvalue, call with key and new value.
clear_toggles
()- Clears all state associated with the
toggle()
function.
clear_settings
()- Clears all state associated with the
setting()
function.
MATH
Parameters:
- start (int) – Start of range (inclusive)
- end (int) – End of range(not inclusive)
- step (int) – Interval at which to return a value.
Return a random integer between [start, end)
.
Parameters:
- mean (float) – Mean value
- sigma (float) – Standard deviation
Parameters:s (str) – String to convert to number.Returns:Integer, floating-point or NULL on failure.
Parameters:val – Numbers in list.Returns:The mode, or most-common, number observed.
Aggregate function which calculates mode of values.
Parameters:val – ValueReturns:Min difference between two values.
Aggregate function which calculates the minimal distance between twonumbers in the sequence.
Parameters:val – ValueReturns:Average difference between values.
Aggregate function which calculates the average distance between twoconsecutive numbers in the sequence.
Parameters:val – ValueReturns:The range from the smallest to largest value in sequence.
Aggregate function which returns range of values observed.
Parameters:val – ValueReturns:The median, or middle, value in a sequence.
Aggregate function which calculates the middle value in a sequence.
Note
Only available if you compiled the _sqlite_udf
extension.
STRING
dameraulevenshtein_dist
(_s1, s2)- Computes the edit distance from s1 to s2 using the damerau variant of thelevenshtein algorithm.
Note
Only available if you compiled the _sqlite_udf
extension.
Note
Only available if you compiled the _sqlite_udf
extension.
strdist
(_s1, s2)- Computes the edit distance from s1 to s2 using the standard librarySequenceMatcher’s algorithm.
Note
Only available if you compiled the _sqlite_udf
extension.
Parameters:
- regex (str) – Regular expression
- search_string (str) – String to search for instances of regex.
Table-value function that searches a string for substrings that matchthe provided regex
. Returns rows for each match found.
Example:
- SELECT * FROM regex_search('\w+', 'extract words, ignore! symbols');
- value
- -----
- extract
- words
- ignore
- symbols