Django-MySQL Documentation#

Django-MySQL extends Django’s built-in MySQL and MariaDB support with their specific features not available on other databases.
If you’re new, check out the Exposition to see all the features in action, or get started with Installation. Otherwise, take your pick:
Exposition#
Every feature in whistle-stop detail.
Checks#
Extra checks added to Django’s check framework to ensure your Django and MySQL configurations are optimal.
$ ./manage.py check
?: (django_mysql.W001) MySQL strict mode is not set for database connection 'default'
...
QuerySet Extensions#
Django-MySQL comes with a number of extensions to QuerySet that can be installed in a number of ways - e.g. adding the QuerySetMixin to your existing QuerySet subclass.
Approximate Counting#
SELECT COUNT(*) ... can become a slow query, since it requires a scan of all rows; the approx_count function solves this by returning the estimated count that MySQL keeps in metadata. You can call it directly:
Author.objects.approx_count()
Or if you have pre-existing code that calls count() on a QuerySet you pass it, such as the Django Admin, you can set the QuerySet to try approx_count first automatically:
qs = Author.objects.all().count_tries_approx()
# Now calling qs.count() will try approx_count() first
Query Hints#
Use MySQL’s query hints to optimize the SQL your QuerySets generate:
Author.objects.straight_join().filter(book_set__title__startswith="The ")
# Does SELECT STRAIGHT_JOIN ...
‘Smart’ Iteration#
Sometimes you need to modify every single instance of a model in a big table, without creating any long running queries that consume large amounts of resources. The ‘smart’ iterators traverse the table by slicing it into primary key ranges which span the table, performing each slice separately, and dynamically adjusting the slice size to keep them fast:
# Some authors to fix
bad_authors = Author.objects.filter(address="Nowhere")
# Before: bad, we can't fit all these in memory
for author in bad_authors.all():
    pass

# After: good, takes small dynamically adjusted slices, wraps in atomic()
for author in bad_authors.iter_smart():
    author.address = ""
    author.save()
    author.send_apology_email()
Integration with pt-visual-explain#
For interactive debugging of queries, this captures the query that the QuerySet represents, and passes it through EXPLAIN and pt-visual-explain to get a visual representation of the query plan:
>>> Author.objects.all().pt_visual_explain()
Table scan
rows 1020
+- Table
table myapp_author
Model Fields#
Fields that use MariaDB/MySQL-specific features!
Dynamic Columns Field#
Use MariaDB’s Dynamic Columns for storing arbitrary, nested dictionaries of values:
class ShopItem(Model):
    name = models.CharField(max_length=200)
    attrs = DynamicField()
>>> ShopItem.objects.create(name="Camembert", attrs={"smelliness": 15})
>>> ShopItem.objects.create(name="Brie", attrs={"smelliness": 5, "squishiness": 10})
>>> ShopItem.objects.filter(attrs__smelliness_INTEGER__gte=10)
[<ShopItem: Camembert>]
EnumField#
A field class for using MySQL’s ENUM type, which allows strings that are restricted to a set of choices to be stored in a space efficient manner:
class BookCover(Model):
    color = EnumField(choices=["red", "green", "blue"])
FixedCharField#
A field class for using MySQL’s CHAR type, which allows strings to be stored at a fixed width:
class Address(Model):
    zip_code = FixedCharField(max_length=10)
Resizable Text/Binary Fields#
Django’s TextField and BinaryField fields are fixed at the MySQL level to use the maximum size class for the BLOB and TEXT data types - these fields allow you to use the other sizes, and migrate between them:
class BookBlurb(Model):
    blurb = SizedTextField(size_class=3)
    # Has a maximum length of 16MiB, compared to plain TextField which has
    # a limit of 4GB (!)
BIT(1) Boolean Fields#
Some database systems, such as the Java Hibernate ORM, don’t use MySQL’s bool data type for storing boolean flags and instead use BIT(1). This field class allows you to interact with those fields:
class HibernateModel(Model):
    some_bool = Bit1BooleanField()
    some_nullable_bool = NullBit1BooleanField()
Field Lookups#
ORM extensions to built-in fields:
>>> Author.objects.filter(name__sounds_like="Robert")
[<Author: Robert>, <Author: Rupert>]
Aggregates#
MySQL’s powerful GROUP_CONCAT function is added as an aggregate, allowing you to bring back the concatenation of values from a group in one query:
>>> author = Author.objects.annotate(book_ids=GroupConcat("books__id")).get(
... name="William Shakespeare"
... )
>>> author.book_ids
"1,2,5,17,29"
Database Functions#
MySQL-specific database functions for the ORM:
>>> Author.objects.annotate(
... full_name=ConcatWS("first_name", "last_name", separator=" ")
... ).first().full_name
"Charles Dickens"
Migration Operations#
MySQL-specific operations for django migrations:
from django.db import migrations
from django_mysql.operations import InstallPlugin
class Migration(migrations.Migration):
    dependencies = []

    operations = [InstallPlugin("metadata_lock_info", "metadata_lock_info.so")]
Cache#
An efficient backend for Django’s cache framework using MySQL features:
cache.set("my_key", "my_value") # Uses only one query
cache.get_many(["key1", "key2"]) # Only one query to do this too!
cache.set("another_key", some_big_value) # Compressed above 5kb by default
Locks#
Use MySQL as a locking server for arbitrarily named locks:
with Lock("ExternalAPI", timeout=10.0):
    do_some_external_api_stuff()
Status#
Easy access to global or session status variables:
if global_status.get("Threads_running") > 100:
    raise BorkError("Server too busy right now, come back later")
Management Commands#
The dbparams command helps you include your database parameters from settings in command-line tools:
$ mysqldump $(python manage.py dbparams) > dump.sql
Test Utilities#
Set some MySQL server variables on a test case for every method or just a specific one:
class MyTests(TestCase):
    @override_mysql_variables(SQL_MODE="ANSI")
    def test_it_works_in_ansi_mode(self):
        self.run_it()
Installation#
Requirements#
Python 3.8 to 3.12 supported.
Django 3.2 to 5.0 supported.
MySQL 8.0 supported.
MariaDB 10.4 to 10.8 supported.
mysqlclient 1.3 to 1.4 supported.
Installation#
Install it with pip:
$ python -m pip install django-mysql
Or add it to your project’s requirements.txt.
Add 'django_mysql' to your INSTALLED_APPS setting:
INSTALLED_APPS = [
    ...,
    "django_mysql",
    ...,
]
Django-MySQL comes with some extra checks to ensure your database configuration is optimal. It’s best to run these now that you’ve installed it, to see if there is anything to fix:
$ python manage.py check --database default
(The --database argument is new in Django 3.1.)
For help fixing any warnings, see Checks.
Are your tests slow? Check out my book Speed Up Your Django Tests which covers loads of ways to write faster, more accurate tests.
Extending your QuerySets#
Half the fun features are extensions to QuerySet. You can add these to your project in a number of ways, depending on what is easiest for your code - all imported from django_mysql.models.
- class Model#
The simplest way to add the QuerySet extensions - this is a subclass of Django’s Model that sets objects to use the Django-MySQL extended QuerySet (below) via QuerySet.as_manager(). Simply change your model base to get the goodness:
# from django.db.models import Model - no more!
from django_mysql.models import Model

class MySuperModel(Model):
    pass  # TODO: come up with startup idea.
- class QuerySet#
The second way to add the extensions - use this to replace your model’s default manager:
from mythings import MyBaseModel
from django_mysql.models import QuerySet

class MySuperDuperModel(MyBaseModel):
    objects = QuerySet.as_manager()
    # TODO: what fields should this model have??
If you are using a custom manager, you can combine this like so:
from django.db import models
from django_mysql.models import QuerySet

class MySuperDuperManager(models.Manager):
    pass

class MySuperDuperModel(models.Model):
    objects = MySuperDuperManager.from_queryset(QuerySet)()
    # TODO: fields
- class QuerySetMixin#
The third way to add the extensions, and the container class for the extensions. Add this mixin to your custom QuerySet class to add in all the fun:
from django.db.models import Model
from django_mysql.models import QuerySetMixin
from stackoverflow import CopyPasteQuerySet

class MySplendidQuerySet(QuerySetMixin, CopyPasteQuerySet):
    pass

class MySplendidModel(Model):
    objects = MySplendidQuerySet.as_manager()
    # TODO: profit
- add_QuerySetMixin(queryset)#
A final way to add the extensions, useful when you don’t control the model class - for example with built-in Django models. This function creates a subclass of a QuerySet’s class that has the QuerySetMixin added in and applies it to the QuerySet:
from django.contrib.auth.models import User
from django_mysql.models import add_QuerySetMixin

qs = User.objects.all()
qs = add_QuerySetMixin(qs)
# Now qs has all the extensions!
The extensions are described in QuerySet Extensions.
Checks#
Django-MySQL adds some extra checks to Django’s system check framework to advise on your database configuration. If triggered, the checks give a brief message, and a link here for documentation on how to fix it.
Warning
From Django 3.1 onwards, database checks are not automatically run in most situations. You should use the --database argument to manage.py check to run the checks. For example, with just one database connection you can run manage.py check --database default.
Note
A reminder: as per the Django docs, you can silence individual checks in your settings. For example, if you determine django_mysql.W002 doesn’t require your attention, add the following to settings.py:
SILENCED_SYSTEM_CHECKS = [
"django_mysql.W002",
]
django_mysql.W001: Strict Mode#
This check has been removed since Django itself includes such a check, mysql.W002, since version 1.10. See its documentation.
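For reference, the usual fix is still to enable strict mode. A sketch of doing so per-connection via init_command (setting it permanently in the server configuration is preferable), as also referenced in the note under W002 below:
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "my_database",
        "OPTIONS": {
            # Assumed per-connection setting; my.cnf is the permanent option
            "init_command": "SET sql_mode='STRICT_TRANS_TABLES'",
        },
    }
}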
django_mysql.W002: InnoDB Strict Mode#
InnoDB Strict Mode is similar to the general Strict Mode, but for InnoDB. It escalates several warnings around InnoDB-specific statements into errors. Normally this just affects per-table settings for compression. It’s recommended you activate this, but it’s not very likely to affect you if you don’t.
As above, the easiest way to set this is to add it to init_command in your DATABASES setting:
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "my_database",
        "OPTIONS": {
            "init_command": "SET innodb_strict_mode=1",
        },
    }
}
Note
If you use this along with the init_command for W001, combine them as SET sql_mode='STRICT_TRANS_TABLES', innodb_strict_mode=1.
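Putting the two together, the OPTIONS entry becomes a single combined init_command - a sketch:
"OPTIONS": {
    # Combined fix for W001 (strict mode) and W002 (InnoDB strict mode)
    "init_command": "SET sql_mode='STRICT_TRANS_TABLES', innodb_strict_mode=1",
},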
Also, as above for django_mysql.W001, it’s better that you set it permanently for the server with SET GLOBAL and a configuration file change.
django_mysql.W003: utf8mb4#
MySQL’s utf8 character set does not include support for the largest, 4 byte characters in UTF-8; this basically means it cannot support emoji and custom Unicode characters. The utf8mb4 character set was added to support all these characters, and there’s really little point in not using it. Django currently suggests using the utf8 character set for backwards compatibility, but it’s likely to move in time.
It’s strongly recommended you change to the utf8mb4 character set and convert your existing utf8 data as well, unless you’re absolutely sure you’ll never see any of these ‘supplementary’ Unicode characters (note: it’s very easy for users to type emoji on phone keyboards these days!).
Also see this classic blogpost: How to support full Unicode in MySQL databases.
The easiest way to set this up is to make a couple of changes to your DATABASES settings. First, add OPTIONS with charset to your MySQL connection, so MySQLdb connects using the utf8mb4 character set. Second, add TEST with COLLATION and CHARSET as below, so Django creates the test database, and thus all tables, with the right character set:
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "my_database",
        "OPTIONS": {
            # Tell MySQLdb to connect with 'utf8mb4' character set
            "charset": "utf8mb4",
        },
        # Tell Django to build the test database with the 'utf8mb4' character set
        "TEST": {
            "CHARSET": "utf8mb4",
            "COLLATION": "utf8mb4_unicode_ci",
        },
    }
}
Note this does not transform the database, tables, and columns that already exist. Follow the examples in the ‘How to’ blog post link above to fix your database, tables, and character set. It’s planned to add a command to Django-MySQL to help you do this, see Issue 216.
QuerySet Extensions#
MySQL-specific Model and QuerySet extensions. To add these to your Model/Manager/QuerySet trifecta, see Installation. Methods below are all QuerySet methods; where standalone forms are referred to, they can be imported from django_mysql.models.
Approximate Counting#
- django_mysql.models.approx_count(fall_back=True, return_approx_int=True, min_size=1000)#
By default a QuerySet’s count() method runs SELECT COUNT(*) on a table. Whilst this is fast for MyISAM tables, for InnoDB it involves a full table scan to produce a consistent number, due to MVCC keeping several copies of rows when under transaction. If you have lots of rows, you will notice this as a slow query - Percona have some more details.
This method returns the approximate count found by running EXPLAIN SELECT COUNT(*) .... It can be out by 30-50% in the worst case, but in many applications it is closer, and is good enough, such as when presenting many pages of results but users will only practically scroll through the first few. For example:
>>> Author.objects.count()  # slow
509741
>>> Author.objects.approx_count()  # fast, with some error
531140
Three arguments are accepted:
- fall_back=True
If True and the approximate count cannot be calculated, count() will be called and returned instead, otherwise ValueError will be raised.
The approximation can only be found for objects.all(), with no filters, distinct() calls, etc., so it’s reasonable to fall back.
- return_approx_int=True
When True, an int is not returned (except when falling back), but instead a subclass called ApproximateInt. This is for all intents and purposes an int, apart from when cast to str, it renders as e.g. ‘Approximately 12345’ (internationalization ready). Useful for templates you can’t edit (e.g. the admin) and you want to communicate that the number is not 100% accurate. For example:
>>> print(Author.objects.approx_count())  # ApproximateInt
Approximately 531140
>>> print(Author.objects.approx_count() + 0)  # plain int
531140
>>> print(Author.objects.approx_count(return_approx_int=False))  # plain int
531140
- min_size=1000
The threshold at which to use the approximate algorithm; if the approximate count comes back as less than this number, count() will be called and returned instead, since it should be so small as to not bother your database. Set to 0 to disable this behaviour and always return the approximation.
The default of 1000 is a bit pessimistic - most tables won’t take long when calling COUNT(*) on tens of thousands of rows, but it could be slow for very wide tables.
- django_mysql.models.count_tries_approx(activate=True, fall_back=True, return_approx_int=True, min_size=1000)#
This is the ‘magic’ method to make pre-existing code, such as Django’s admin, work with approx_count. Calling count_tries_approx sets the QuerySet up such that then calling count will call approx_count instead, with the given arguments.
To unset this, call count_tries_approx with activate=False.
To ‘fix’ an Admin class with this, simply do the following (assuming Author inherits from django_mysql’s Model):
class AuthorAdmin(ModelAdmin):

    def get_queryset(self, request):
        qs = super(AuthorAdmin, self).get_queryset(request)
        return qs.count_tries_approx()
You’ll be able to see this is working on the pagination due to the word ‘Approximately’ appearing.
You can do this at a base class for all your ModelAdmin subclasses to apply the magical speed increase across your admin interface.
Query Hints#
The following methods add extra features to the ORM which allow you to access some MySQL-specific syntax. They do this by inserting special comments which pass through Django’s ORM layer and get re-written by a function that wraps the lower-level cursor.execute().
Because not every user wants these features and there is a (small) overhead to every query, you must activate this feature by adding to your settings:
DJANGO_MYSQL_REWRITE_QUERIES = True
Once you’ve done this, the following methods will work.
- django_mysql.models.label(comment)#
Allows you to add an arbitrary comment to the start of the query, as the second thing after the keyword. This can be used to ‘tag’ queries so that when they show in the slow_log or another monitoring tool, you can easily back track to the python code generating the query. For example, imagine constructing a QuerySet like this:
qs = Author.objects.label("AuthorListView").all()
When executed, this will have SQL starting:
SELECT /*AuthorListView*/ ...
You can add arbitrary labels, and as many of them as you wish - they will appear in the order added. They will work in SELECT and UPDATE statements, but not in DELETE statements due to limitations in the way Django performs deletes.
You should not pass user-supplied data in for the comment. As a basic protection against accidental SQL injection, passing a comment featuring */ will raise a ValueError, since that would prematurely end the comment. Due to executable comments, however, the comment is still prone to some forms of injection.
This is also a feature - by not including spaces around your string, you may use this injection to use executable comments to add hints that are otherwise not supported, or to use MySQL 5.7+ optimizer hints.
- django_mysql.models.straight_join()#
Adds the STRAIGHT_JOIN hint, which forces the join order during a SELECT. Note that you can’t force Django’s join order, but it tends to be in the order that the tables get mentioned in the query.
Example usage:
# Note from Adam: sometimes the optimizer joined books -> author, which
# is slow. Force it to do author -> books.
Author.objects.distinct().straight_join().filter(books__age=12)[:10]
The MariaDB docs also have a good page, Index Hints: How to Force Query Plans, which covers some cases when you might want to use STRAIGHT_JOIN.
- django_mysql.models.sql_small_result()#
Adds the SQL_SMALL_RESULT hint, which avoids using a temporary table in the case of a GROUP BY or DISTINCT.
Example usage:
# Note from Adam: we have very few distinct birthdays, so using a
# temporary table is slower
Author.objects.values("birthday").distinct().sql_small_result()
- django_mysql.models.sql_big_result()#
Adds the SQL_BIG_RESULT hint, which forces using a temporary table in the case of a GROUP BY or DISTINCT.
Example usage:
# Note from Adam: for some reason the optimizer didn’t use a temporary
# table for this, so we force it
Author.objects.distinct().sql_big_result()
- django_mysql.models.sql_buffer_result()#
Adds the SQL_BUFFER_RESULT hint, which forces the optimizer to use a temporary table to process the result. This is useful to free locks as soon as possible.
Example usage:
# Note from Adam: seeing a lot of throughput on this table. Buffering
# the results makes the queries less contentious.
HighThroughputModel.objects.filter(x=y).sql_buffer_result()
- django_mysql.models.sql_cache()#
Adds the SQL_CACHE hint, which means the result set will be stored in the Query Cache. This only has an effect when the MySQL system variable query_cache_type is set to 2 or DEMAND.
Warning
The query cache was removed in MySQL 8.0, and is disabled by default from MariaDB 10.1.7.
Example usage:
# Fetch recent posts, cached in MySQL for speed
recent_posts = BlogPost.objects.sql_cache().order_by("-created")[:5]
Docs: MariaDB.
- django_mysql.models.sql_no_cache()#
Adds the SQL_NO_CACHE hint, which means the result set will not be fetched from or stored in the Query Cache. This only has an effect when the MySQL system variable query_cache_type is set to 1 or ON.
Warning
The query cache was removed in MySQL 8.0, and is disabled by default from MariaDB 10.1.7.
Example usage:
# Avoid caching all the expired sessions, since we’re about to delete
# them
deletable_session_ids = (
    Session.objects.sql_no_cache().filter(expiry__lt=now()).values_list("id", flat=True)
)
Docs: MariaDB.
- django_mysql.models.sql_calc_found_rows()#
Adds the SQL_CALC_FOUND_ROWS hint, which means the total count of matching rows will be calculated when you only take a slice. You can access this count with the found_rows attribute of the QuerySet after filling its result cache, by e.g. iterating it.
This can be faster than taking the slice and then again calling .count() to get the total count.
Warning
This is deprecated in MySQL 8.0.17+.
Example usage:
>>> can_drive = Customer.objects.filter(age=21).sql_calc_found_rows()[:10]
>>> len(can_drive)  # Fetches the first 10 from the database
10
>>> can_drive.found_rows  # The total number of 21 year old customers
1942
- django_mysql.models.use_index(*index_names, for_=None, table_name=None)#
Adds a USE INDEX hint, which affects the index choice made by MySQL’s query optimizer for resolving the query.
Note that index names on your tables will normally have been generated by Django and contain a hash fragment. You will have to check your database schema to determine the index name.
If you pass any non-existent index names, MySQL will raise an error. This means index hints are especially important to test in the face of future schema changes.
for_ restricts the scope that the index hint applies to. By default it applies to all potential index uses during the query; you may supply one of 'JOIN', 'ORDER BY', or 'GROUP BY' to restrict the index hint to only be used by MySQL for index selection in their respective stages of query execution. For more information see the MySQL/MariaDB docs (link below).
table_name is the name of the table that the hints are for. By default, this will be the name of the table of the model that the QuerySet is for, however you can supply any other table that may be joined into the query (from e.g. select_related()). Be careful - there is no validation on the table name, and if it does not exist in the final query it will be ignored. Also it is injected raw into the resultant SQL, so you should not pass user data, otherwise you may open the potential for SQL injection.
Note that USE INDEX accepts no index names to mean ‘use no indexes’, i.e. table scans only.
Example usage:
# SELECT ... FROM `author` USE INDEX (`name_12345`) WHERE ...
>>> Author.objects.use_index("name_12345").filter(name="John")

# SELECT ... FROM `author` USE INDEX (`name_12345`, `name_age_678`) WHERE ...
>>> Author.objects.use_index("name_12345", "name_age_678").filter(name="John")

# SELECT ... FROM `author` USE INDEX FOR ORDER BY (`name_12345`) ... ORDER BY `name`
>>> Author.objects.use_index("name_12345", for_="ORDER BY").order_by("name")

# SELECT ... FROM `book` INNER JOIN `author` USE INDEX (`authbook`) ...
>>> Book.objects.select_related("author").use_index("authbook", table_name="author")
- django_mysql.models.force_index(*index_names, for_=None)#
Similar to the above use_index(), but adds a FORCE INDEX hint. Note that unlike use_index() you must supply at least one index name. For more information, see the MySQL/MariaDB docs.
- django_mysql.models.ignore_index(*index_names, for_=None)#
Similar to the above use_index(), but adds an IGNORE INDEX hint. Note that unlike use_index() you must supply at least one index name. For more information, see the MySQL/MariaDB docs.
‘Smart’ Iteration#
Here’s a situation we’ve all been in - we screwed up, and now we need to fix the data. Let’s say we accidentally set the address of all authors without an address to “Nowhere”, rather than the blank string. How can we fix them??
The simplest way would be to run the following:
Author.objects.filter(address="Nowhere").update(address="")
Unfortunately with a lot of rows (‘a lot’ being dependent on your database server and level of traffic) this will stall other access to the table, since it will require MySQL to read all the rows and to hold write locks on them in a single query.
To solve this, we could try updating a chunk of authors at a time; such code tends to get ugly/complicated pretty quickly:
min_id = 0
max_id = 1000
biggest_author_id = Author.objects.order_by("-id")[0].id
while True:
    Author.objects.filter(id__gte=min_id, id__lte=...)
    # I'm not even going to type this all out, it's so much code
Here’s the solution to this boilerplate with added safety features - ‘smart’ iteration! There are two classes; one yields chunks of the given QuerySet, and the other yields the objects inside those chunks. Nearly every data update can be thought of in one of these two methods.
- class django_mysql.models.SmartChunkedIterator(queryset, atomically=True, status_thresholds=None, pk_range=None, chunk_time=0.5, chunk_size=2, chunk_min=1, chunk_max=10000, report_progress=False, total=None)#
Implements a smart iteration strategy over the given queryset. There is a method iter_smart_chunks that takes the same arguments on the QuerySetMixin so you can just:
bad_authors = Author.objects.filter(address="Nowhere")

for author_chunk in bad_authors.iter_smart_chunks():
    author_chunk.update(address="")
Iteration proceeds by yielding primary-key based slices of the queryset, and dynamically adjusting the size of the chunk to try and take chunk_time seconds. In between chunks, the wait_until_load_low() method of GlobalStatus is called to ensure the database is not under high load.
Warning
Because of the slicing by primary key, there are restrictions on what QuerySets you can use, and a ValueError will be raised if the queryset doesn’t meet them. Specifically, only QuerySets on models with integer-based primary keys, which are unsliced, and have no order_by will work.
There are a lot of arguments and the defaults have hopefully been picked sensibly, but please check them for your case!
- queryset#
The queryset to iterate over; if you’re calling via .iter_smart_chunks then you don’t need to set this since it’s the queryset you called it on.
- atomically=True
If true, wraps each chunk in a transaction via django’s transaction.atomic(). Recommended for any write processing.
- status_thresholds=None
A dict of status variables and their maximum tolerated values to be checked against after each chunk with wait_until_load_low().
When set to None, the default, GlobalStatus will use its default of {"Threads_running": 10}. Set to an empty dict to disable status checking - but this is not really recommended, as it can save you from locking up your site with an overly aggressive migration.
Threads_running is the most recommended variable to check against, and is copied from the default behaviour of pt-online-schema-change. The default value of 10 threads is deliberately conservative to avoid locking small database servers. You should tweak it up based upon the live activity of your server - check the running thread count during normal traffic and add some overhead.
- pk_range=None
Controls the primary key range to iterate over with slices. By default, with pk_range=None, the QuerySet will be searched for its minimum and maximum pk values before starting. On QuerySets that match few rows, or whose rows aren’t evenly distributed, this can still execute a long blocking table scan to find these two rows. You can remedy this by giving a value for pk_range (see the sketch after this argument list):
If set to 'all', the range will be the minimum and maximum PK values of the entire table, excluding any filters you have set up - that is, for Model.objects.all() for the given QuerySet’s model.
If set to a 2-tuple, it will be unpacked and used as the minimum and maximum values respectively.
Note
The iterator determines the minimum and maximum at the start of iteration and does not update them whilst iterating, which is normally a safe assumption, since if you’re “fixing things” you probably aren’t creating any more bad data. If you do need to process every row then set pk_range to have a maximum far greater than what you expect would be reached by inserts that occur during iteration.
- chunk_time=0.5
The time in seconds to aim for each chunk to take. The chunk size is dynamically adjusted to try and match this time, via a weighted average of the past and current speed of processing. The default and algorithm is taken from the analogous pt-online-schema-change flag --chunk-time.
- chunk_size=2
The initial size of the chunk that will be used. As this will be dynamically scaled and can grow fairly quickly, the initial size of 2 should be appropriate for most use cases.
- chunk_min=1
The minimum number of objects in a chunk. You do not normally need to tweak this since the dynamic scaling works very well, however it might be useful if your data has a lot of “holes” or if there are other constraints on your application.
- chunk_max=10000
The maximum number of objects in a chunk, a kind of sanity bound. Acts to prevent harm in the case of iterating over a model with a large ‘hole’ in its primary key values, e.g. if only ids 1-10k and 100k-110k exist, then the chunk ‘slices’ could grow very large in between 10k and 100k since you’d be “processing” the non-existent objects 10k-100k very quickly.
- report_progress=False
If set to true, displays a running counter and summary on sys.stdout. Useful for interactive use. The message looks like this:
AuthorSmartChunkedIterator processed 0/100000 objects (0.00%) in 0 chunks
And uses \r to erase itself when re-printing to avoid spamming your screen. At the end Finished! is printed on a new line.
- total=None
By default the total number of objects to process will be calculated with approx_count(), with fall_back set to True. This count() query could potentially be big and slow. total allows you to pass in the total number of objects for processing, if you can calculate it in a cheaper way, for example if you have a read-replica to use.
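As referenced under pk_range above, a minimal usage sketch - the bounds are hypothetical, and the keyword arguments shown (pk_range, report_progress) are the ones documented in this list:
bad_authors = Author.objects.filter(address="Nowhere")

# Assumed bounds: supplying them skips the initial MIN/MAX primary key query
for author in bad_authors.iter_smart(pk_range=(1, 1000000), report_progress=True):
    author.address = ""
    author.save()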
- class django_mysql.models.SmartIterator#
A convenience subclass of SmartChunkedIterator that simply unpacks the chunks for you. Can be accessed via the iter_smart method of QuerySetMixin.
For example, rather than doing this:
bad_authors = Author.objects.filter(address="Nowhere")

for authors_chunk in bad_authors.iter_smart_chunks():
    for author in authors_chunk:
        author.send_apology_email()
You can do this:
bad_authors = Author.objects.filter(address="Nowhere")

for author in bad_authors.iter_smart():
    author.send_apology_email()
All the same arguments as SmartChunkedIterator are accepted.
- class django_mysql.models.SmartPKRangeIterator#
A subclass of SmartChunkedIterator that doesn’t return the chunk’s QuerySet but instead returns the start and end primary keys for the chunk. This may be useful when you want to iterate but the slices need to be used in a raw SQL query. Can be accessed via the iter_smart_pk_ranges method of QuerySetMixin.
For example, rather than doing this:
for authors_chunk in Author.objects.iter_smart_chunks():
    limits = authors_chunk.aggregate(min_pk=Min("pk"), max_pk=Max("pk"))
    authors = Author.objects.raw(
        """
        SELECT name from app_author
        WHERE id >= %s AND id <= %s
        """,
        (limits["min_pk"], limits["max_pk"]),
    )
    # etc...
…you can do this:
for start_pk, end_pk in Author.objects.iter_smart_pk_ranges():
    authors = Author.objects.raw(
        """
        SELECT name from app_author
        WHERE id >= %s AND id < %s
        """,
        (start_pk, end_pk),
    )
    # etc...
In the first format we were forced to perform a dumb query to determine the primary key limits set by SmartChunkedIterator, due to the QuerySet not otherwise exposing this information.
Note
There is a subtle difference between the two versions. In the first the end boundary, max_pk, is a closed bound, whereas in the second, the end_pk from iter_smart_pk_ranges is an open bound. Thus the <= changes to a <.
All the same arguments as SmartChunkedIterator are accepted.
Integration with pt-visual-explain#
How does MySQL really execute a query? The EXPLAIN statement (docs: MySQL / MariaDB) gives a description of the execution plan, and the pt-visual-explain tool can format this in an understandable tree.
This function is a shortcut to turn a QuerySet into its visual explanation, making it easy to gain a better understanding of what your queries really end up doing.
- django_mysql.models.pt_visual_explain(display=True)#
Call on a QuerySet to print its visual explanation, or with display=False to return it as a string. It prepends the SQL of the query with ‘EXPLAIN’ and passes it through the mysql and pt-visual-explain commands to get the output. You therefore need the MySQL client and Percona Toolkit installed where you run this.
Example:
>>> Author.objects.all().pt_visual_explain()
Table scan
rows 1020
+- Table
table myapp_author
Can also be imported as a standalone function if you want to use it on a QuerySet that does not have the QuerySetMixin added, e.g. for built-in Django models:
>>> from django_mysql.models import pt_visual_explain
>>> pt_visual_explain(User.objects.all())
Table scan
rows 1
+- Table
table auth_user
Model Fields#
More MySQL and MariaDB specific ways to store data!
Field classes should always be imported from django_mysql.models, similar to the home of Django’s native fields in django.db.models.
DynamicField#
MariaDB has a feature called Dynamic Columns that allows you to store different sets of columns for each row in a table. It works by storing the data in a blob and having a small set of functions to manipulate this blob. (Docs).
Django-MySQL supports the named Dynamic Columns of MariaDB 10.0+, as opposed to the numbered format of 5.5+. It uses the mariadb-dyncol python package to pack and unpack Dynamic Columns blobs in Python rather than in MariaDB (mostly due to limitations in the Django ORM).
- class django_mysql.models.DynamicField(spec=None, **kwargs)#
A field for storing Dynamic Columns. The Python data type is dict. Keys must be strs and values must be one of the supported value types in mariadb-dyncol:
str
int
float
datetime.date
datetime.datetime
datetime.time
A nested dict conforming to this spec too
Note that there are restrictions on the range of values supported for some of these types, and that decimal.Decimal objects are not yet supported though they are valid in MariaDB. For more information consult the mariadb-dyncol documentation.
Values may also be None, though they will then not be stored, since dynamic columns do not store NULL, so you should use .get() to retrieve values that may be None.
To use this field, you’ll need to:
Use MariaDB 10.0.2+
Install mariadb-dyncol (python -m pip install mariadb-dyncol)
Use either the utf8mb4 or utf8 character set for your database connection
These are all checked by the field and you will see sensible errors for them when Django’s checks run if you have a DynamicField on a model.
- spec#
This is an optional type specification that checks that the named columns, if present, have the given types. It is validated against on save() to ensure type safety (unlike normal Django validation which is only used in forms). It is also used for type information for lookups (below).
spec should be a dict with string keys and values that are the type classes you expect. You can also nest another such dictionary as a value for validating nested dynamic columns.
For example:
import datetime

class SpecModel(Model):
    attrs = DynamicField(
        spec={
            "an_integer_key": int,
            "created_at": datetime.datetime,
            "nested_columns": {
                "lat": int,
                "lon": int,
            },
        }
    )
This will enforce the following rules:
instance.attrs['an_integer_key'], if present, is an int
instance.attrs['created_at'], if present, is a datetime.datetime
instance.attrs['nested_columns'], if present, is a dict
instance.attrs['nested_columns']['lat'], if present, is an int
instance.attrs['nested_columns']['lon'], if present, is an int
Trying to save a DynamicField with data that does not match the rules of its spec will raise TypeError. There is no automatic casting, e.g. between int and float. Note that columns not in spec will still be allowed and have no type enforced.
For example:
>>> SpecModel.objects.create(attrs={"an_integer_key": 1})  # Fine
>>> SpecModel.objects.create(attrs={"an_integer_key": 2.0})
Traceback (most recent call last):
...
TypeError: Key 'an_integer_key' should be of type 'int'
>>> SpecModel.objects.create(attrs={"non_spec_key": "anytype"})  # Fine
DynamicFields in Forms#
By default a DynamicField has no form field, because there isn’t really a practical way to edit its contents. If required, it is possible to add extra form fields to a ModelForm that then update specific dynamic column names on the instance in the form’s save().
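For instance, a minimal sketch of that approach - plain Django forms code, not an API provided by Django-MySQL; the model (with an attrs = DynamicField(), such as the ShopItem in the next section) and the 'size' column are assumptions:
from django import forms


class ShopItemForm(forms.ModelForm):
    size = forms.CharField(required=False)

    class Meta:
        model = ShopItem  # assumed model with attrs = DynamicField()
        fields = ["name"]

    def save(self, commit=True):
        instance = super().save(commit=False)
        # Copy the extra form field into the dynamic column
        if self.cleaned_data["size"]:
            instance.attrs["size"] = self.cleaned_data["size"]
        if commit:
            instance.save()
        return instance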
Querying DynamicField#
You can query by names, including nested names. In cases where names collide with existing lookups (e.g. you have a column named 'exact'), you might want to use the ColumnGet database function. You can also use the ColumnAdd and ColumnDelete functions for atomically modifying the contents of dynamic columns at the database layer.
We’ll use the following example model:
from django.db import models
from django_mysql.models import DynamicField, Model

class ShopItem(Model):
    name = models.CharField(max_length=200)
    attrs = DynamicField(
        spec={
            "size": str,
        }
    )

    def __str__(self):
        return self.name
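As a rough sketch of the ColumnAdd and ColumnDelete functions mentioned above, applied to this model - the import path django_mysql.models.functions and the argument shapes shown are assumptions to verify against the Database Functions documentation:
# Assumed import location for the dynamic-column functions
from django_mysql.models.functions import ColumnAdd, ColumnDelete

# Atomically set/overwrite the 'size' column on matching rows
ShopItem.objects.filter(name="T-Shirt").update(attrs=ColumnAdd("attrs", {"size": "Large"}))

# Atomically remove the 'size' column
ShopItem.objects.filter(name="T-Shirt").update(attrs=ColumnDelete("attrs", "size"))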
Exact Lookups#
To query based on an exact match, just use a dictionary.
For example:
>>> ShopItem.objects.create(name="Camembert", attrs={"smelliness": 15})
>>> ShopItem.objects.create(name="Cheddar", attrs={"smelliness": 15, "hardness": 5})
>>> ShopItem.objects.filter(attrs={"smelliness": 15})
[<ShopItem: Camembert>]
>>> ShopItem.objects.filter(attrs={"smelliness": 15, "hardness": 5})
[<ShopItem: Cheddar>]
Name Lookups#
To query based on a column name, use that name as a lookup with one of the below SQL types added after an underscore. If the column name is in your field’s spec, you can omit the SQL type and it will be extracted automatically - this includes keys in nested dicts.
The list of SQL types is:
BINARY - dict (a nested DynamicField)
CHAR - str
DATE - datetime.date
DATETIME - datetime.datetime
DOUBLE - float
INTEGER - int
TIME - datetime.time
These will also use the correct Django ORM field so chained lookups based on that type are possible, e.g. dynamicfield__age_INTEGER__gte=20.
Beware that getting a named column can always return NULL if the column is not defined for a row.
For example:
>>> ShopItem.objects.create(name="T-Shirt", attrs={"size": "Large"})
>>> ShopItem.objects.create(
... name="Rocketship",
... attrs={"speed_mph": 300, "dimensions": {"width_m": 10, "height_m": 50}},
... )
# Basic template: DynamicField + '__' + column name + '_' + SQL type
>>> ShopItem.objects.filter(attrs__size_CHAR="Large")
[<ShopItem: T-Shirt>]
# As 'size' is in the field's spec, there is no need to give the SQL type
>>> ShopItem.objects.filter(attrs__size="Large")
[<ShopItem: T-Shirt>]
# Chained lookups are possible based on the data type
>>> ShopItem.objects.filter(attrs__speed_mph_INTEGER__gte=100)
[<ShopItem: Rocketship>]
# Nested keys can be looked up
>>> ShopItem.objects.filter(attrs__dimensions_BINARY__width_m_INTEGER=10)
[<ShopItem: Rocketship>]
# Nested DynamicFields can be queried as ``dict``s, as per the ``exact`` lookup
>>> ShopItem.objects.filter(attrs__dimensions_BINARY={"width_m": 10, "height_m": 50})
[<ShopItem: Rocketship>]
# Missing keys are always NULL
>>> ShopItem.objects.filter(attrs__blablabla_INTEGER__isnull=True)
[<ShopItem: T-Shirt>, <ShopItem: Rocketship>]
List Fields#
Legacy
These field classes are only maintained for legacy purposes. They aren’t recommended as comma separation is a fragile serialization format.
For new uses, you’re better off using Django 3.1’s JSONField that works with all database backends. On earlier versions of Django, you can use django-jsonfield-backport.
Two fields that store lists of data, grown-up versions of Django’s CommaSeparatedIntegerField, cousins of django.contrib.postgres’s ArrayField. There are two versions: ListCharField, which is based on CharField and appropriate for storing lists with a small maximum size, and ListTextField, which is based on TextField and therefore suitable for lists of (near) unbounded size (the underlying LONGTEXT MySQL datatype has a maximum length of 2^32 - 1 bytes).
- class django_mysql.models.ListCharField(base_field, size=None, **kwargs)#
A field for storing lists of data, all of which conform to the base_field.
- base_field#
The base type of the data that is stored in the list. Currently, must be IntegerField, CharField, or any subclass thereof - except for ListCharField itself.
- size#
Optionally set the maximum number of items in the list. This is only checked on form validation, not on model save!
As ListCharField is a subclass of CharField, any CharField options can be set too. Most importantly you’ll need to set max_length to determine how many characters to reserve in the database.
Example instantiation:
from django.db.models import CharField, Model
from django_mysql.models import ListCharField

class Person(Model):
    name = CharField()
    post_nominals = ListCharField(
        base_field=CharField(max_length=10),
        size=6,
        max_length=(6 * 11),  # 6 * 10 character nominals, plus commas
    )
In Python simply set the field’s value as a list:
>>> p = Person.objects.create(name="Horatio", post_nominals=["PhD", "Esq."])
>>> p.post_nominals
['PhD', 'Esq.']
>>> p.post_nominals.append("III")
>>> p.post_nominals
['PhD', 'Esq.', 'III']
>>> p.save()
Validation on save()
When performing the list-to-string conversion for the database, ListCharField performs some validation, and will raise ValueError if there is a problem, to avoid saving bad data. The following are invalid:
Any member containing a comma in its string representation
Any member whose string representation is the empty string
The default form field is SimpleListField.
- class django_mysql.models.ListTextField(base_field, size=None, **kwargs)#
The same as ListCharField, but backed by a TextField and therefore much less restricted in length. There is no max_length argument.
Example instantiation:
from django.db.models import IntegerField, Model
from django_mysql.models import ListTextField

class Widget(Model):
    widget_group_ids = ListTextField(
        base_field=IntegerField(),
        size=100,  # Maximum of 100 ids in list
    )
Querying List Fields#
Warning
These fields are not built-in datatypes, and the filters use one or more SQL functions to parse the underlying string representation. They may slow down on large tables if your queries are not selective on other columns.
contains#
The contains lookup is overridden on ListCharField and ListTextField to match where the list field contains the given element, using MySQL’s FIND_IN_SET function (docs: MariaDB / MySQL).
For example:
>>> Person.objects.create(name="Horatio", post_nominals=["PhD", "Esq.", "III"])
>>> Person.objects.create(name="Severus", post_nominals=["PhD", "DPhil"])
>>> Person.objects.create(name="Paulus", post_nominals=[])
>>> Person.objects.filter(post_nominals__contains="PhD")
[<Person: Horatio>, <Person: Severus>]
>>> Person.objects.filter(post_nominals__contains="Esq.")
[<Person: Horatio>]
>>> Person.objects.filter(post_nominals__contains="DPhil")
[<Person: Severus>]
>>> Person.objects.filter(
... Q(post_nominals__contains="PhD") & Q(post_nominals__contains="III")
... )
[<Person: Horatio>]
Note
ValueError will be raised if you try contains with a list. It’s not possible without using AND in the query, so you should add the filters for each item individually, as per the last example.
len#
A transform that converts to the number of items in the list. For example:
>>> Person.objects.filter(post_nominals__len=0)
[<Person: Paulus>]
>>> Person.objects.filter(post_nominals__len=2)
[<Person: Severus>]
>>> Person.objects.filter(post_nominals__len__gt=2)
[<Person: Horatio>]
Index lookups#
This class of lookups allows you to index into the list to check if the first occurrence of a given element is at a given position. There are no errors if it exceeds the size of the list. For example:
>>> Person.objects.filter(post_nominals__0="PhD")
[<Person: Horatio>, <Person: Severus>]
>>> Person.objects.filter(post_nominals__1="DPhil")
[<Person: Severus>]
>>> Person.objects.filter(post_nominals__100="VC")
[]
Warning
The underlying function, FIND_IN_SET, is designed for sets, i.e. comma-separated lists of unique elements. It therefore only allows you to query about the first occurrence of the given item. For example, this is a non-match:
>>> Person.objects.create(name="Cacistus", post_nominals=["MSc", "MSc"])
>>> Person.objects.filter(post_nominals__1="MSc")
[] # Cacistus does not appear because his first MSc is at position 0
This may be fine for your application, but be careful!
Note
FIND_IN_SET uses 1-based indexing for searches on comma-based strings when writing raw SQL. However these indexes use 0-based indexing to be consistent with Python.
Note
Unlike the similar feature on django.contrib.postgres’s ArrayField, ‘Index transforms’, these are lookups, and only allow direct value comparison rather than continued chaining with the base-field lookups. This is because the field is not a native list type in MySQL.
ListF() expressions#
Similar to Django’s F expression, this allows you to perform atomic add and remove operations on list fields at the database level:
>>> from django_mysql.models import ListF
>>> Person.objects.filter(post_nominals__contains="PhD").update(
... post_nominals=ListF("post_nominals").append("Sr.")
... )
2
>>> Person.objects.update(post_nominals=ListF("post_nominals").pop())
3
Or with attribute assignment to a model:
>>> horatio = Person.objects.get(name="Horatio")
>>> horatio.post_nominals = ListF("post_nominals").append("DSocSci")
>>> horatio.save()
- class django_mysql.models.ListF(field_name)#
You should instantiate this class with the name of the field to use, and then call one of its methods.
Note that unlike F, you cannot chain the methods - the SQL involved is a bit too complicated, and thus only single operations are supported.
- append(value)#
Adds the value of the given expression to the (right hand) end of the list, like list.append:
>>> Person.objects.create(name="Horatio", post_nominals=["PhD", "Esq.", "III"])
>>> Person.objects.update(post_nominals=ListF("post_nominals").append("DSocSci"))
>>> Person.objects.get().full_name
"Horatio Phd Esq. III DSocSci"
- appendleft(value)#
Adds the value of the given expression to the (left hand) end of the list, like deque.appendleft:
>>> Person.objects.update(post_nominals=ListF("post_nominals").appendleft("BArch"))
>>> Person.objects.get().full_name
"Horatio BArch Phd Esq. III DSocSci"
- pop()#
Takes one value from the (right hand) end of the list, like list.pop:
>>> Person.objects.update(post_nominals=ListF("post_nominals").pop())
>>> Person.objects.get().full_name
"Horatio BArch Phd Esq. III"
- popleft()#
Takes one value off the (left hand) end of the list, like deque.popleft:
>>> Person.objects.update(post_nominals=ListF("post_nominals").popleft())
>>> Person.objects.get().full_name
"Horatio Phd Esq. III"
Warning
All the above methods use SQL expressions with user variables in their queries, all of which start with @tmp_. This shouldn’t affect you much, but if you use user variables in your queries, beware of any conflicts.
Set Fields#
Legacy
These field classes are only maintained for legacy purposes. They aren’t recommended as comma separation is a fragile serialization format.
For new uses, you’re better off using Django 3.1’s JSONField that works with all database backends. On earlier versions of Django, you can use django-jsonfield-backport.
Two fields that store sets of a base field in comma-separated strings - cousins of Django’s CommaSeparatedIntegerField. There are two versions: SetCharField, which is based on CharField and appropriate for storing sets with a small maximum size, and SetTextField, which is based on TextField and therefore suitable for sets of (near) unbounded size (the underlying LONGTEXT MySQL datatype has a maximum length of 2^32 - 1 bytes).
- class django_mysql.models.SetCharField(base_field, size=None, **kwargs)#
A field for storing sets of data, which all conform to the base_field.
- base_field#
The base type of the data that is stored in the set. Currently, must be IntegerField, CharField, or any subclass thereof - except for SetCharField itself.
- size#
Optionally set the maximum number of elements in the set. This is only checked on form validation, not on model save!
As SetCharField is a subclass of CharField, any CharField options can be set too. Most importantly you’ll need to set max_length to determine how many characters to reserve in the database.
Example instantiation:
from django.db.models import IntegerField, Model
from django_mysql.models import SetCharField

class LotteryTicket(Model):
    numbers = SetCharField(
        base_field=IntegerField(),
        size=6,
        max_length=(6 * 3),  # 6 two digit numbers plus commas
    )
In Python simply set the field’s value as a set:
>>> lt = LotteryTicket.objects.create(numbers={1, 2, 4, 8, 16, 32})
>>> lt.numbers
{1, 2, 4, 8, 16, 32}
>>> lt.numbers.remove(1)
>>> lt.numbers.add(3)
>>> lt.numbers
{32, 3, 2, 4, 8, 16}
>>> lt.save()
Validation on save()
When performing the set-to-string conversion for the database, SetCharField performs some validation, and will raise ValueError if there is a problem, to avoid saving bad data. The following are invalid:
If there is a comma in any member’s string representation
If the empty string is stored.
The default form field is SimpleSetField.
- class django_mysql.models.SetTextField(base_field, size=None, **kwargs)#
The same as SetCharField, but backed by a TextField and therefore much less restricted in length. There is no max_length argument.
Example instantiation:
from django.db.models import CharField, Model
from django_mysql.models import SetTextField

class Post(Model):
    tags = SetTextField(
        base_field=CharField(max_length=32),
    )
Querying Set Fields#
Warning
These fields are not built-in datatypes, and the filters use one or more SQL functions to parse the underlying string representation. They may slow down on large tables if your queries are not selective on other columns.
contains#
The contains lookup is overridden on SetCharField and SetTextField to match where the set field contains the given element, using MySQL’s FIND_IN_SET function (docs: MariaDB / MySQL).
For example:
>>> Post.objects.create(name="First post", tags={"thoughts", "django"})
>>> Post.objects.create(name="Second post", tags={"thoughts"})
>>> Post.objects.create(name="Third post", tags={"tutorial", "django"})
>>> Post.objects.filter(tags__contains="thoughts")
[<Post: First post>, <Post: Second post>]
>>> Post.objects.filter(tags__contains="django")
[<Post: First post>, <Post: Third post>]
>>> Post.objects.filter(Q(tags__contains="django") & Q(tags__contains="thoughts"))
[<Post: First post>]
Note
ValueError will be raised if you try contains with a set. It’s not possible without using AND in the query, so you should add the filters for each item individually, as per the last example.
len#
A transform that converts to the number of items in the set. For example:
>>> Post.objects.filter(tags__len=1)
[<Post: Second post>]
>>> Post.objects.filter(tags__len=2)
[<Post: First post>, <Post: Third post>]
>>> Post.objects.filter(tags__len__lt=2)
[<Post: Second post>]
SetF() expressions#
Similar to Django’s F expression, this allows you to perform an atomic add or remove on a set field at the database level:
>>> from django_mysql.models import SetF
>>> Post.objects.filter(tags__contains="django").update(
... tags=SetF("tags").add("programming")
... )
2
>>> Post.objects.update(tags=SetF("tags").remove("thoughts"))
2
Or with attribute assignment to a model:
>>> post = Post.objects.earliest("id")
>>> post.tags = SetF("tags").add("python")
>>> post.save()
- class django_mysql.models.SetF(field_name)#
You should instantiate this class with the name of the field to use, and then call one of its two methods with a value to be added/removed.
Note that unlike F, you cannot chain the methods - the SQL involved is a bit too complicated, and thus you can only perform a single addition or removal.
- add(value)#
Takes an expression and returns a new expression that will take the value of the original field and add the value to the set if it is not contained:
post.tags = SetF("tags").add("python")
post.save()
- remove(value)#
Takes an expression and returns a new expression that will remove the given item from the set field if it is present:
post.tags = SetF("tags").remove("python")
post.save()
Warning
Both of the above methods use SQL expressions with user variables in their queries, all of which start with @tmp_. This shouldn’t affect you much, but if you use user variables in your queries, beware of any conflicts.
EnumField#
Using a CharField with a limited set of strings leads to inefficient data storage since the string value is stored over and over on disk. MySQL’s ENUM type allows a more compact representation of such columns by storing the list of strings just once and using an integer in each row to refer to which string is there. EnumField allows you to use the ENUM type with Django.
- class django_mysql.models.EnumField(choices, **kwargs)#
A subclass of Django’s CharField that uses a MySQL ENUM for storage.
choices is a standard Django argument for any field class, however it is required for EnumField. It can either be a list of strings, or a list of two-tuples of strings, where the first element in each tuple is the value used, and the second the human readable name used in forms. The best way to form it is with Django’s TextChoices enumeration type.
For example:
from django.db import models
from django_mysql.models import EnumField

class BookCoverColour(models.TextChoices):
    RED = "red"
    GREEN = "green"
    BLUE = "blue"

class BookCover(models.Model):
    colour = EnumField(choices=BookCoverColour.choices)
Warning
It is possible to append new values to choices in migrations, as well as edit the human readable names of existing choices.
However, editing or removing existing choice values will error if MySQL Strict Mode is on, and replace the values with the empty string if it is not.
Also the empty string has strange behaviour with ENUM, acting somewhat like NULL, but not entirely. It is therefore recommended you ensure Strict Mode is on.
FixedCharField#
Django’s CharField uses the VARCHAR data type, which uses variable storage space depending on string length. This normally saves storage space, but for columns with a fixed length, it adds a small overhead.
The alternative CHAR data type avoids that overhead, but it has the surprising behaviour of removing trailing space characters, and consequently ignoring them in comparisons. FixedCharField provides a Django field for using CHAR. This can help you interface with databases created by other systems, but it’s not recommended for general use, due to the trailing space behaviour.
- class django_mysql.models.FixedCharField(*args, max_length: int, **kwargs)#
A subclass of Django’s CharField that uses the CHAR data type. max_length is as with CharField, but must be within the range 0-255.
For example:
from django_mysql.models import FixedCharField

class Address(Model):
    zip_code = FixedCharField(max_length=5)
Resizable Text/Binary Fields#
Django’s TextField and BinaryField fields are fixed at the MySQL level to use the maximum size class for the BLOB and TEXT data types. This is fine for most applications, however if you are working with a legacy database, or you want to be stricter about the maximum size of data that can be stored, you might want one of the other sizes.
The following field classes are simple subclasses that allow you to provide an extra parameter to determine which size class to use. They work with migrations, allowing you to swap them for the existing Django class and then use a migration to change their size class. This might help when taking over a legacy database for example.
- class django_mysql.models.SizedTextField(size_class: int, **kwargs)#
A subclass of Django’s
TextField
that allows you to use the other sizes ofTEXT
data type. Setsize_class
to:1
for aTINYTEXT
field, which has a maximum length of 255 bytes2
for aTEXT
field, which has a maximum length of 65,535 bytes3
for aMEDIUMTEXT
field, which has a maximum length of 16,777,215 bytes (16MiB)4
for aLONGTEXT
field, which has a maximum length of 4,294,967,295 bytes (4GiB)
- class django_mysql.models.SizedBinaryField(size_class, **kwargs)#
A subclass of Django's BinaryField that allows you to use the other sizes of BLOB data type. Set size_class to:
1 for a TINYBLOB field, which has a maximum length of 255 bytes
2 for a BLOB field, which has a maximum length of 65,535 bytes
3 for a MEDIUMBLOB field, which has a maximum length of 16,777,215 bytes (16MiB)
4 for a LONGBLOB field, which has a maximum length of 4,294,967,295 bytes (4GiB)
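As a brief illustration (the Attachment model below is hypothetical, not from the documentation), swapping in these fields only requires passing size_class:
from django.db import models

from django_mysql.models import SizedBinaryField, SizedTextField


class Attachment(models.Model):
    # size_class=3 selects MEDIUMTEXT / MEDIUMBLOB (up to 16MiB)
    body = SizedTextField(size_class=3)
    data = SizedBinaryField(size_class=3)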
BIT(1) Boolean Fields#
Some database systems, such as the Java Hibernate ORM, don’t use MySQL’s
bool
data type for storing boolean flags and instead use BIT(1)
.
Django’s default BooleanField
and NullBooleanField
classes can’t work
with this.
The following subclasses are boolean fields that work with BIT(1)
columns
that will help when connecting to a legacy database. If you are using
inspectdb
to generate models from the database, use these to replace the
TextField
output for your BIT(1)
columns.
- class Bit1BooleanField#
A subclass of Django's BooleanField that uses the BIT(1) column type instead of bool.
- class NullBit1BooleanField#
Note
Django deprecated NullBooleanField in version 3.1 and retains it only for use in old migrations. NullBit1BooleanField is similarly deprecated.
A subclass of Django's NullBooleanField that uses the BIT(1) column type instead of bool.
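As a minimal sketch (the LegacyFlag model and its column are hypothetical, not from the documentation), replacing the inspectdb output for a BIT(1) column looks like this:
from django.db import models

from django_mysql.models import Bit1BooleanField


class LegacyFlag(models.Model):
    # Column defined as BIT(1) by another system, e.g. Hibernate
    is_active = Bit1BooleanField()

    class Meta:
        db_table = "legacy_flag"
        managed = False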
Field Lookups#
ORM extensions for filtering. These are all automatically added for the
appropriate field types when django_mysql
is in your INSTALLED_APPS
.
Note that lookups specific to included
model fields are documented with the field, rather
than here.
Case-sensitive String Comparison#
MySQL string comparison has a case-sensitivity dependent on the collation of
your tables/columns, as the Django manual describes.
However, it is possible to query in a case-sensitive manner even when your data
is not stored with a case-sensitive collation, using the BINARY
keyword.
The following lookup adds that capability to the ORM for
CharField
, TextField
, and
subclasses thereof.
case_exact#
Exact, case-sensitive match for character columns, no matter the underlying collation:
>>> Author.objects.filter(name__case_exact="dickens")
[]
>>> Author.objects.filter(name__case_exact="Dickens")
[<Author: Dickens>]
Soundex#
MySQL implements the Soundex algorithm with its SOUNDEX
function,
allowing you to find words sounding similar to each other (in
English only, regrettably). These lookups allow you to use that function in the
ORM and are added for CharField
and
TextField
.
soundex#
Match a given soundex string:
>>> Author.objects.filter(name__soundex="R163")
[<Author: Robert>, <Author: Rupert>]
SQL equivalent:
SELECT ... WHERE SOUNDEX(`name`) = 'R163'
sounds_like#
Match the SOUNDEX
of the given string:
>>> Author.objects.filter(name__sounds_like="Robert")
[<Author: Robert>, <Author: Rupert>]
SQL equivalent:
SELECT ... WHERE `name` SOUNDS LIKE 'Robert'
Aggregates#
MySQL-specific database aggregates for the ORM.
The following can be imported from django_mysql.models
.
- class django_mysql.models.BitAnd(column)#
Returns an int of the bitwise AND of all input values, or 18446744073709551615 (a BIGINT UNSIGNED with all bits set to 1) if no rows match.
Example usage:
>>> Book.objects.create(bitfield=29)
>>> Book.objects.create(bitfield=15)
>>> Book.objects.all().aggregate(BitAnd("bitfield"))
{'bitfield__bitand': 13}
- class django_mysql.models.BitOr(column)#
Returns an int of the bitwise OR of all input values, or 0 if no rows match.
Example usage:
>>> Book.objects.create(bitfield=29)
>>> Book.objects.create(bitfield=15)
>>> Book.objects.all().aggregate(BitOr("bitfield"))
{'bitfield__bitor': 31}
- class django_mysql.models.BitXor(column)#
Returns an int of the bitwise XOR of all input values, or 0 if no rows match.
Example usage:
>>> Book.objects.create(bitfield=11)
>>> Book.objects.create(bitfield=3)
>>> Book.objects.all().aggregate(BitXor("bitfield"))
{'bitfield__bitxor': 8}
- class django_mysql.models.GroupConcat(column, distinct=False, separator=',', ordering=None)#
An aggregate that concatenates values from a column of the grouped rows. Useful mostly for bringing back lists of ids in a single query.
Example usage:
>>> from django_mysql.models import GroupConcat
>>> author = Author.objects.annotate(book_ids=GroupConcat("books__id")).get(
...     name="William Shakespeare"
... )
>>> author.book_ids
"1,2,5,17,29"
Warning
MySQL will truncate the value at the value of group_concat_max_len, which by default is quite low at 1024 characters. You should probably increase it if you're using this for any sizeable groups.
Optional arguments:
- distinct=False
If set to
True
, removes duplicates from the group.
- separator=','
By default the separator is a comma. You can use any other string as a separator, including the empty string.
Warning
Due to limitations in the Django aggregate API, this is not protected against SQL injection. Don’t pass in user input for the separator.
- ordering=None
By default no guarantee is made on the order the values will be in pre-concatenation. Set ordering to 'asc' to sort them in ascending order, and 'desc' for descending order. For example:
>>> Author.objects.annotate(book_ids=GroupConcat("books__id", ordering="asc"))
Database Functions#
MySQL/MariaDB-specific database functions for the ORM.
The following can be imported from django_mysql.models.functions
.
Control Flow Functions#
- class django_mysql.models.functions.If(condition, true, false=None)#
Evaluates the expression condition and returns the value of the expression true if true, and the result of expression false if false. If false is not given, it will be Value(None), i.e. NULL.
Usage example:
>>> Author.objects.annotate(
...     is_william=If(Q(name__startswith="William "), True, False)
... ).values_list("name", "is_william")
[('William Shakespeare', True), ('Ian Fleming', False), ('William Wordsworth', True)]
Numeric Functions#
- class django_mysql.models.functions.CRC32(expression)#
Computes a cyclic redundancy check value and returns a 32-bit unsigned value. The result is
NULL
if the argument isNULL
. The argument is expected to be a string and (if possible) is treated as one if it is not.Usage example:
>>> Author.objects.annotate(description_crc=CRC32("description"))
String Functions#
- class django_mysql.models.functions.ConcatWS(*expressions, separator=',')#
ConcatWS
stands for Concatenate With Separator and is a special form of Concat (included in Django). It concatenates all of its argument expressions as strings with the given separator. Since NULL values are skipped, unlike in Concat, you can use the empty string as a separator and it acts as a NULL-safe version of Concat.
If separator is a string, it will be turned into a Value. If you wish to join with the value of a field, you can pass in an F object for that field.
Usage example:
>>> Author.objects.annotate(sales_list=ConcatWS("sales_eu", "sales_us"))
- class django_mysql.models.functions.ELT(number, values)#
Given a numerical expression
number
, it returns the numberth element from values, 1-indexed. If number is less than 1 or greater than the number of expressions, it will return None. It is the complement of the Field function.
Note that if number is a string, it will refer to a field, whereas members of values that are strings will be wrapped with Value automatically and thus interpreted as the given string. This is for convenience with the most common usage pattern where you have the list pre-loaded in Python, e.g. a choices field. If you want to refer to a column, use Django's F() class.
Usage example:
>>> # Say Person.life_state is either 1 (alive), 2 (dead), or 3 (M.I.A.)
>>> Person.objects.annotate(state_name=ELT("life_state", ["Alive", "Dead", "M.I.A."]))
- class django_mysql.models.functions.Field(expression, values)#
Given an
expression
and a list of stringsvalues
, returns the 1-indexed location of theexpression
’s value invalues
, or 0 if not found. This is commonly used withorder_by
to keep groups of elements together. It is the complement of theELT
function.Note that if
expression
is a string, it will refer to a field, whereas if any member ofvalues
is a string, it will automatically be wrapped withValue
and refer to the given string. This is for convenience with the most common usage pattern where you have the list of things pre-loaded in Python, e.g. in a field’schoices
. If you want to refer to a column, use Django’sF()
class.Usage example:
>>> # Females, then males - but other values of gender (e.g. empty string) first >>> Person.objects.all().order_by(Field("gender", ["Female", "Male"]))
XML Functions#
- class django_mysql.models.functions.UpdateXML(xml_target, xpath_expr, new_xml)#
Returns the XML fragment
xml_target
with the single match for xpath_expr replaced with the XML fragment new_xml. If nothing matches xpath_expr, or if multiple matches are found, the original xml_target is returned unchanged.
This can be used for single-query updates of text fields containing XML.
Note that if xml_target is given as a string, it will refer to a column, whilst if either xpath_expr or new_xml are strings, they will be used as strings directly. If you want xpath_expr or new_xml to refer to columns, use Django's F() class.
Usage example:
# Remove 'sagacity' from all authors' xml_attrs
>>> Author.objects.update(xml_attrs=UpdateXML("xml_attrs", "/sagacity", ""))
- class django_mysql.models.functions.XMLExtractValue(xml_frag, xpath_expr)#
Returns the text (
CDATA
) of the first text node which is a child of the element(s) in the XML fragment xml_frag matched by the XPath expression xpath_expr. In SQL this function is called ExtractValue; the class has the XML prefix to make it clearer what kind of values it extracts.
Note that if xml_frag is given as a string, it will refer to a column, whilst if xpath_expr is a string, it will be used as a string. If you want xpath_expr to refer to a column, use Django's F() class.
Usage example:
# Count the number of authors with 'sagacity' in their xml_attrs
>>> num_authors_with_sagacity = (
...     Author.objects.annotate(
...         has_sagacity=XMLExtractValue("xml_attrs", "count(/sagacity)")
...     )
...     .filter(has_sagacity="1")
...     .count()
... )
Regexp Functions#
Note
These work with MariaDB 10.0.5+ only, which includes PCRE regular expressions and these extra functions to use them. More information can be found in its documentation.
- class django_mysql.models.functions.RegexpInstr(expression, regex)#
Returns the 1-indexed position of the first occurrence of the regular expression
regex
in the string value of expression, or 0 if it was not found.
Note that if expression is given as a string, it will refer to a column, whilst if regex is a string, it will be used as a string. If you want regex to refer to a column, use Django's F() class.
Docs: MariaDB.
Usage example:
>>> Author.objects.annotate(name_pos=RegexpInstr("name", r"ens")).filter(name_pos__gt=0)
[<Author: Charles Dickens>, <Author: Robert Louis Stevenson>]
- class django_mysql.models.functions.RegexpReplace(expression, regex, replace)#
Returns the string value of
expression
with all occurrences of the regular expression regex replaced by the string replace. If no occurrences are found, then the subject is returned as is.
Note that if expression is given as a string, it will refer to a column, whilst if either regex or replace are strings, they will be used as strings. If you want regex or replace to refer to columns, use Django's F() class.
Docs: MariaDB.
Usage example:
>>> Author.objects.create(name="Charles Dickens")
>>> Author.objects.create(name="Roald Dahl")
>>> qs = Author.objects.annotate(
...     surname_first=RegexpReplace("name", r"^(.*) (.*)$", r"\2, \1")
... ).order_by("surname_first")
>>> qs
[<Author: Roald Dahl>, <Author: Charles Dickens>]
>>> qs[0].surname_first
"Dahl, Roald"
- class django_mysql.models.functions.RegexpSubstr(expression, regex)#
Returns the part of the string value of
expression
that matches the regular expressionregex
, or an empty string ifregex
was not found.Note that if
expression
is given as a string, it will refer to a column, whilst ifregex
is a string, it will be used as a string. If you wantregex
to refer to a column, use Django’sF()
class.Docs: MariaDB.
Usage example:
>>> Author.objects.create(name="Euripides") >>> Author.objects.create(name="Frank Miller") >>> Author.objects.create(name="Sophocles") >>> Author.objects.annotate(name_has_space=CharLength(RegexpSubstr("name", r"\s"))).filter( ... name_has_space=0 ... ) [<Author: Euripides>, <Author: Sophocles>]
Information Functions#
- class django_mysql.models.functions.LastInsertId(expression=None)#
With no argument, returns the last value added to an auto-increment column, or set by another call to
LastInsertId
with an argument. With an argument, sets the ‘last insert id’ value to the value of the given expression, and returns that value. This can be used to implement simpleUPDATE ... RETURNING
style queries.This function also has a class method:
- get(using=DEFAULT_DB_ALIAS)#
Returns the value set by a call to
LastInsertId()
with an argument, by performing a single query. It is stored per-connection, hence you may need to pass the alias of the connection that set the LastInsertId as using.
Note
Any queries on the database connection between setting LastInsertId and calling LastInsertId.get() can reset the value. These might come from Django, which can issue multiple queries for update() with multi-table inheritance, or for delete() with cascading.
Usage examples:
>>> Countable.objects.filter(id=1).update(counter=LastInsertId("counter") + 1)
1
>>> # Get the pre-increase value of 'counter' as stored on the server
>>> LastInsertId.get()
242
>>> Author.objects.filter(id=1, age=LastInsertId("age")).delete()
1
>>> # We can also use the stored value directly in a query
>>> Author.objects.filter(id=2).update(age=LastInsertId())
1
>>> Author.objects.get(id=2).age
35
JSON Database Functions#
These functions work with data stored in Django’s JSONField
on MySQL and
MariaDB only. JSONField
is built in to Django 3.1+ and can be installed on
older Django versions with the
django-jsonfield-backport
package.
These functions use JSON paths to address content inside JSON documents - for more information on their syntax, refer to the docs: MySQL / MariaDB.
- class django_mysql.models.functions.JSONExtract(expression, *paths, output_field=None)#
Given
expression
that resolves to some JSON data, extract the given JSON paths. If there is a single path, the plain value is returned; if there is more than one path, the output is a JSON array with the list of values represented by the paths. If the expression does not match for a particular JSON object, returns NULL.
If only one path is given, output_field may also be given as a model field instance like IntegerField(), into which Django will load the value; the default is JSONField(), as it supports all return types including the array of values for multiple paths.
Note that if expression is a string, it will refer to a field, whereas members of paths that are strings will be wrapped with Value automatically and thus interpreted as the given string. If you want any of paths to refer to a field, use Django's F() class.
Usage examples:
>>> # Fetch a list of tuples (id, size_or_None) for all ShopItems
>>> ShopItem.objects.annotate(size=JSONExtract("attrs", "$.size")).values_list("id", "size")
[(1, '3m'), (3, '5nm'), (8, None)]
>>> # Fetch the distinct values of attrs['colours'][0] for all items
>>> ShopItem.objects.annotate(
...     primary_colour=JSONExtract("attrs", "$.colours[0]")
... ).distinct().values_list("primary_colour", flat=True)
['Red', 'Blue', None]
- class django_mysql.models.functions.JSONKeys(expression, path=None)#
Given
expression
that resolves to some JSON data containing a JSON object, return the keys in that top-level object as a JSON array, or ifpath
is given, return the keys at that path. If the path does not match, or ifexpression
is not a JSON object (e.g. it contains a JSON array instead), returnsNULL
.Note that if
expression
is a string, it will refer to a field, whereas if path is a string it will be wrapped with Value automatically and thus interpreted as the given string. If you want path to refer to a field, use Django's F() class.
>>> # Fetch the top-level keys for the first item
>>> ShopItem.objects.annotate(keys=JSONKeys("attrs")).values_list("keys", flat=True)[0]
['size', 'colours', 'age', 'price', 'origin']
>>> # Fetch the keys in 'origin' for the first item
>>> ShopItem.objects.annotate(keys=JSONKeys("attrs", "$.origin")).values_list(
...     "keys", flat=True
... )[0]
['continent', 'country', 'town']
- class django_mysql.models.functions.JSONLength(expression, path=None)#
Given
expression
that resolves to some JSON data, return the length of that data, or ifpath
is given, return the length of the data at that path. If the path does not match, or ifexpression
isNULL
it returnsNULL
.As per the MySQL documentation, the length of a document is determined as follows:
The length of a scalar is 1.
The length of an array is the number of array elements.
The length of an object is the number of object members.
The length does not count the length of nested arrays or objects.
Note that if
expression
is a string, it will refer to a field, whereas if path is a string it will be wrapped with Value automatically and thus interpreted as the given string. If you want path to refer to a field, use Django's F() class.
>>> # Which ShopItems have more than three colours?
>>> ShopItem.objects.annotate(num_colours=JSONLength("attrs", "$.colours")).filter(
...     num_colours__gt=3
... )
[<ShopItem: Rainbow Wheel>, <ShopItem: Hard Candies>]
- class django_mysql.models.functions.JSONInsert(expression, data)#
Given
expression
that resolves to some JSON data, adds to it using the dictionarydata
of JSON paths to new values. If any JSON path in thedata
dictionary does not match, or ifexpression
isNULL
, it returnsNULL
. Paths that already exist in the original data are ignored.Note that if
expression
is a string, it will refer to a field, whereas keys and values within the data dictionary will be wrapped with Value automatically and thus interpreted as the given string. If you want a key or value to refer to a field, use Django's F() class.
>>> # Add power_level = 0 for those items that don't have power_level
>>> ShopItem.objects.update(attrs=JSONInsert("attrs", {"$.power_level": 0}))
- class django_mysql.models.functions.JSONReplace(expression, data)#
Given
expression
that resolves to some JSON data, replaces existing paths in it using the dictionarydata
of JSON paths to new values. If any JSON path within thedata
dictionary does not match, or ifexpression
isNULL
, it returnsNULL
. Paths that do not exist in the original data are ignored.Note that if
expression
is a string, it will refer to a field, whereas keys and values within thepairs
dictionary will be wrapped withValue
automatically and thus interpreted as the given string. If you want a key or value to refer to a field, use Django’sF()
class.>>> # Reset all items' monthly_sales to 0 directly in MySQL >>> ShopItem.objects.update(attrs=JSONReplace("attrs", {"$.monthly_sales": 0}))
- class django_mysql.models.functions.JSONSet(expression, data)#
Given
expression
that resolves to some JSON data, updates it using the dictionarydata
of JSON paths to new values. If any of the JSON paths within the data dictionary does not match, or ifexpression
isNULL
, it returnsNULL
. All paths can be modified - those that did not exist before and those that did.Note that if
expression
is a string, it will refer to a field, whereas keys and values within thedata
dictionary will be wrapped withValue
automatically and thus interpreted as the given string. If you want a key or value to refer to a field, use Django’sF()
class.>>> # Modify 'size' value to '10m' directly in MySQL >>> shop_item = ShopItem.objects.latest() >>> shop_item.attrs = JSONSet("attrs", {"$.size": "10m"}) >>> shop_item.save()
- class django_mysql.models.functions.JSONArrayAppend(expression, data)#
Given
expression
that resolves to some JSON data, adds to it using the dictionarydata
of JSON paths to new values. If a path selects an array, the new value will be appended to it. On the other hand, if a path selects a scalar or object value, that value is autowrapped within an array and the new value is added to that array. If any of the JSON paths within the data dictionary does not match, or ifexpression
isNULL
, it returnsNULL
.Note that if
expression
is a string, it will refer to a field, whereas keys and values within thedata
dictionary will be wrapped withValue
automatically and thus interpreted as the given string. If you want a key or value to refer to a field, use Django’sF()
class.>>> # Append the string '10m' to the array 'sizes' directly in MySQL >>> shop_item = ShopItem.objects.latest() >>> shop_item.attrs = JSONArrayAppend("attrs", {"$.sizes": "10m"}) >>> shop_item.save()
Dynamic Columns Functions#
These are MariaDB 10.0+ only, and for use with DynamicField
.
- class django_mysql.models.functions.AsType(expression, data_type)#
A partial function that should be used as part of a
ColumnAdd
expression when you want to ensure thatexpression
will be stored as a given typedata_type
. The possible values fordata_type
are the same as documented for theDynamicField
lookups.Note that this is not a valid standalone function and must be used as part of
ColumnAdd
- see below.
- class django_mysql.models.functions.ColumnAdd(expression, to_add)#
Given
expression
that resolves to aDynamicField
(most often a field name), add/update with the dictionaryto_add
and return the new Dynamic Columns value. This can be used for atomic single-query updates on Dynamic Columns.Note that you can add optional types (and you should!). These can not be drawn from the
spec
of theDynamicField
due to ORM restrictions, so there are no guarantees about the types that will get used if you do not. To add a type cast, wrap the value with anAsType
(above) - see examples below.Docs: MariaDB.
Usage examples:
>>> # Add default 'for_sale' as INTEGER 1 to every item
>>> ShopItem.objects.update(attrs=ColumnAdd("attrs", {"for_sale": AsType(1, "INTEGER")}))
>>> # Fix some data
>>> ShopItem.objects.filter(attrs__size="L").update(
...     attrs=ColumnAdd("attrs", {"size": AsType("Large", "CHAR")})
... )
- class django_mysql.models.functions.ColumnDelete(expression, *to_delete)#
Given
expression
that resolves to aDynamicField
(most often a field name), delete the columns listed by the other expressionsto_delete
, and return the new Dynamic Columns value. This can be used for atomic single-query deletions on Dynamic Columns.Note that strings in
to_delete
will be wrapped withValue
automatically and thus interpreted as the given string - if they weren’t, Django would interpret them as meaning “the value in this (non-dynamic) column”. If you do mean that, useF('fieldname')
.Docs: MariaDB.
Usage examples:
>>> # Remove 'for_sail' and 'for_purchase' from every item
>>> ShopItem.objects.update(attrs=ColumnDelete("attrs", "for_sail", "for_purchase"))
- class django_mysql.models.functions.ColumnGet(expression, name, data_type)#
Given
expression
that resolves to aDynamicField
(most often a field name), return the value of the columnname
when cast to the typedata_type
, orNULL
/None
if the column does not exist. This can be used to select a subset of column values when you don’t want to fetch the whole blob. The possible values fordata_type
are the same as documented for theDynamicField
lookups.Docs: MariaDB.
Usage examples:
>>> # Fetch a list of tuples (id, size_or_None) for all items
>>> ShopItem.objects.annotate(size=ColumnGet("attrs", "size", "CHAR")).values_list(
...     "id", "size"
... )
>>> # Fetch the distinct values of attrs['seller']['url'] for all items
>>> ShopItem.objects.annotate(
...     seller_url=ColumnGet(ColumnGet("attrs", "seller", "BINARY"), "url", "CHAR")
... ).distinct().values_list("seller_url", flat=True)
Migration Operations#
MySQL-specific migration operations
that can all be imported from django_mysql.operations
.
Install Plugin#
- class django_mysql.operations.InstallPlugin(name, soname)[source]#
An
Operation
subclass that installs a MySQL plugin. RunsINSTALL PLUGIN name SONAME soname
, but does a check to see if the plugin is already installed to make it more idempotent.- name#
This is a required argument. The name of the plugin to install.
- soname#
This is a required argument. The name of the library to install the plugin from. Note that on MySQL you must include the extension (e.g.
.so
,.dll
) whilst on MariaDB you may skip it to keep the operation platform-independent.
Example usage:
from django.db import migrations
from django_mysql.operations import InstallPlugin


class Migration(migrations.Migration):
    dependencies = []

    operations = [
        # Install https://mariadb.com/kb/en/mariadb/metadata_lock_info/
        InstallPlugin("metadata_lock_info", "metadata_lock_info.so")
    ]
Install SOName#
- class django_mysql.operations.InstallSOName(soname)[source]#
MariaDB only.
An
Operation
subclass that installs a MariaDB plugin library. One library may contain multiple plugins that work together, this installs all of the plugins in the named library file. RunsINSTALL SONAME soname
. Note that unlikeInstallPlugin
, there is no idempotency check to see if the library is already installed, since there is no way of knowing if all the plugins inside the library are installed.Docs: MariaDB.
- soname#
This is a required argument. The name of the library to install the plugin from. You may skip the file extension (e.g.
.so
,.dll
) to keep the operation platform-independent.
Example usage:
from django.db import migrations
from django_mysql.operations import InstallSOName


class Migration(migrations.Migration):
    dependencies = []

    operations = [
        # Install https://mariadb.com/kb/en/mariadb/metadata_lock_info/
        InstallSOName("metadata_lock_info")
    ]
Alter Storage Engine#
- class django_mysql.operations.AlterStorageEngine(name, to_engine, from_engine=None)[source]#
An
Operation
subclass that alters the model’s table’s storage engine. Because Django has no knowledge of storage engines, you must provide the previous storage engine for the operation to be reversible.- name#
This is a required argument. The name of the model to alter.
- to_engine#
This is a required argument. The storage engine to move the model to.
- from_engine#
This is an optional argument. The storage engine the model is moving from. If you do not provide this, the operation is not reversible.
Note
If you’re using this to move from MyISAM to InnoDB, there’s a page for you in the MariaDB knowledge base - Converting Tables from MyISAM to InnoDB.
Example usage:
from django.db import migrations
from django_mysql.operations import AlterStorageEngine


class Migration(migrations.Migration):
    dependencies = []

    operations = [AlterStorageEngine("Pony", from_engine="MyISAM", to_engine="InnoDB")]
Form Fields#
The following can be imported from django_mysql.forms
.
SimpleListField#
- class django_mysql.forms.SimpleListField(base_field, max_length=None, min_length=None)[source]#
A simple field which maps to a list, with items separated by commas. It is represented by an HTML
<input>
. Empty items, resulting from leading, trailing, or double commas, are disallowed.- base_field#
This is a required argument.
It specifies the underlying form field for the set. It is not used to render any HTML, but it does process and validate the submitted data. For example:
>>> from django import forms
>>> from django_mysql.forms import SimpleListField
>>> class NumberListForm(forms.Form):
...     numbers = SimpleListField(forms.IntegerField())
...
>>> form = NumberListForm({"numbers": "1,2,3"})
>>> form.is_valid()
True
>>> form.cleaned_data
{'numbers': [1, 2, 3]}
>>> form = NumberListForm({"numbers": "1,2,a"})
>>> form.is_valid()
False
- max_length#
This is an optional argument which validates that the list does not exceed the given length.
- min_length#
This is an optional argument which validates that the list reaches at least the given length.
User friendly forms
SimpleListField
is not particularly user friendly in most cases, however
it’s better than nothing.
SimpleSetField#
- class django_mysql.forms.SimpleSetField(base_field, max_length=None, min_length=None)[source]#
A simple field which maps to a set, with items separated by commas. It is represented by an HTML
<input>
. Empty items, resulting from leading, trailing, or double commas, are disallowed.- base_field#
This is a required argument.
It specifies the underlying form field for the set. It is not used to render any HTML, but it does process and validate the submitted data. For example:
>>> from django import forms
>>> from django_mysql.forms import SimpleSetField
>>> class NumberSetForm(forms.Form):
...     numbers = SimpleSetField(forms.IntegerField())
...
>>> form = NumberSetForm({"numbers": "1,2,3"})
>>> form.is_valid()
True
>>> form.cleaned_data
{'numbers': set([1, 2, 3])}
>>> form = NumberSetForm({"numbers": "1,2,a"})
>>> form.is_valid()
False
- max_length#
This is an optional argument which validates that the set does not exceed the given length.
- min_length#
This is an optional argument which validates that the set reaches at least the given length.
User friendly forms
SimpleSetField
is not particularly user friendly in most cases, however
it’s better than nothing.
Validators#
The following can be imported from django_mysql.validators
.
- class django_mysql.validators.ListMaxLengthValidator[source]#
A subclass of django’s
MaxLengthValidator
with list-specific wording.
- class django_mysql.validators.ListMinLengthValidator[source]#
A subclass of django’s
MinLengthValidator
with list-specific wording.
- class django_mysql.validators.SetMaxLengthValidator[source]#
A subclass of django’s
MaxLengthValidator
with set-specific wording.
- class django_mysql.validators.SetMinLengthValidator[source]#
A subclass of django’s
MinLengthValidator
with set-specific wording.
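As a quick illustrative sketch (not taken from the documentation), these validators behave like any other Django validator; here ListMaxLengthValidator caps a list at three items:
from django.core.exceptions import ValidationError

from django_mysql.validators import ListMaxLengthValidator

validator = ListMaxLengthValidator(3)
validator([1, 2, 3])  # passes silently
try:
    validator([1, 2, 3, 4])
except ValidationError as error:
    print(error.messages)  # list-specific wording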
Cache#
A MySQL-specific backend for Django’s cache framework.
MySQLCache#
An efficient cache backend using a MySQL table, an alternative to Django’s
database-agnostic DatabaseCache
. It has the following advantages:
Each operation uses only one query, including the *_many methods. This is unlike DatabaseCache which uses multiple queries for nearly every operation.
Automatic client-side zlib compression for objects larger than a given threshold. It is also easy to subclass and add your own serialization or compression schemes.
Faster probabilistic culling behaviour during write operations, which you can also turn off and execute in a background task. This can be a bottleneck with Django's DatabaseCache since it culls on every write operation, executing a SELECT COUNT(*) which requires a full table scan.
Integer counters with atomic incr() and decr() operations, like the MemcachedCache backend.
Usage#
To use, add an entry to your CACHES
setting with:
BACKEND
set todjango_mysql.cache.MySQLCache
LOCATION
set totablename
, the name of the table to use. This name can be whatever you want, as long as it’s a valid table name that’s not already being used in your database.
For example:
CACHES = {
"default": {
"BACKEND": "django_mysql.cache.MySQLCache",
"LOCATION": "my_super_cache",
}
}
You then need to make the table. The schema is not compatible with that of
DatabaseCache
, so if you are switching, you will need to create a fresh
table.
Use the management command mysql_cache_migration
to output a migration that
creates tables for all the MySQLCache
instances you have configured. For
example:
$ python manage.py mysql_cache_migration
from django.db import migrations
class Migration(migrations.Migration):
dependencies = [
# Add a dependency in here on an existing migration in the app you
# put this migration in, for example:
# ('myapp', '0001_initial'),
]
operations = [
migrations.RunSQL(
"""
CREATE TABLE `my_super_cache` (
cache_key varchar(255) CHARACTER SET utf8 COLLATE utf8_bin
NOT NULL PRIMARY KEY,
value longblob NOT NULL,
value_type char(1) CHARACTER SET latin1 COLLATE latin1_bin
NOT NULL DEFAULT 'p',
expires BIGINT UNSIGNED NOT NULL
);
""",
"DROP TABLE `my_super_cache`"
),
]
Save this to a file in the migrations
directory of one of your project’s
apps, and add one of your existing migrations to the file’s dependencies
.
You might want to customize the SQL at this time, for example switching the
table to use the MEMORY
storage engine.
Django requires you to install
sqlparse
to run the RunSQL
operation in the migration, so make sure it is installed.
Once the migration has run, the cache is ready to work!
Multiple Databases#
If you use this with multiple databases, you’ll also need to set up routing
instructions for the cache table. This can be done with the same method
that is described for DatabaseCache
in the Django manual, except that the application name is django_mysql.
Note
Even if you aren’t using multiple MySQL databases, it may be worth using routing anyway to put all your cache operations on a second connection - this way they won’t be affected by transactions your main code runs.
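For example, a database router along these lines (a sketch only; the CacheRouter name and the 'cache_db' alias are assumptions, not part of Django-MySQL) would send the cache table's reads, writes, and migrations to a separate connection:
class CacheRouter:
    """Route django_mysql cache operations to the 'cache_db' connection."""

    def db_for_read(self, model, **hints):
        return "cache_db" if model._meta.app_label == "django_mysql" else None

    def db_for_write(self, model, **hints):
        return "cache_db" if model._meta.app_label == "django_mysql" else None

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        if app_label == "django_mysql":
            return db == "cache_db"
        return None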
Extra Details#
MySQLCache
is fully compatible with Django’s cache API, but it also extends
it and there are, of course, a few details to be aware of.
incr/decr#
Like MemcachedCache
(and unlike DatabaseCache
), incr
and decr
are atomic operations, and can only be used with int
values. They have the
range of MySQL’s SIGNED BIGINT
(-9223372036854775808 to
9223372036854775807).
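For instance (a minimal sketch; the 'hits' key is illustrative), counters can be bumped atomically without a read-modify-write cycle:
from django.core.cache import cache

cache.set("hits", 0)
cache.incr("hits")      # atomic increment, returns 1
cache.incr("hits", 10)  # increment by 10, returns 11
cache.decr("hits", 2)   # atomic decrement, returns 9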
max_allowed_packet#
MySQL has a setting called max_allowed_packet
, which is the maximum size of
a query, including data. This therefore constrains the size of a cached value,
but you’re more likely to run up against it first with the
get_many
/set_many
operations.
The MySQL 8.0 default is 4MB, and the MariaDB 10.2 default is 16MB. Most applications should be fine with these limits. You can tweak the setting as high as 1GB - if this isn’t enough, you should probably be considering another solution!
culling#
MySQL is designed to store data forever, and thus doesn’t have a direct way of
setting expired rows to disappear. The expiration of old keys and the limiting
of rows to MAX_ENTRIES
is therefore performed in the cache backend by
performing a cull operation when appropriate. This deletes expired keys first,
then if more than MAX_ENTRIES
keys remain, it deletes 1 /
CULL_FREQUENCY
of them. The options and strategy are described in more
detail in the Django manual.
Django’s DatabaseCache
performs a cull check on every write operation.
This runs a SELECT COUNT(*)
on the table, which means a full-table scan.
Naturally, this takes a bit of time and becomes a bottleneck for medium or
large cache tables. MySQLCache
helps you solve this in two
ways:
The cull-on-write behaviour is probabilistic, by default running on 1% of writes. This is set with the
CULL_PROBABILITY
option, which should be a number between 0 and 1. For example, if you want to use the same cull-on-every-write behaviour as used by DatabaseCache (you probably don't), set CULL_PROBABILITY to 1.0:
CACHES = {
    "default": {
        "BACKEND": "django_mysql.cache.MySQLCache",
        "LOCATION": "some_table_name",
        "OPTIONS": {"CULL_PROBABILITY": 1.0},
    }
}
The
cull()
method is available as a public method so you can set up your own culling schedule in background processing, never affecting any user-facing web requests. Set CULL_PROBABILITY to 0, and then set up your task. For example, if you are using celery you could use a task like this:
@shared_task
def clear_caches():
    caches["default"].cull()
    caches["other_cache"].cull()
This functionality is also available as the management command
cull_mysql_caches
, which you might run as a cron job. It performscull()
on all of yourMySQLCache
instances, or you can give it names to just cull those. For example, this:$ python manage.py cull_mysql_caches default other_cache
…will call
caches['default'].cull()
andcaches['other_cache'].cull()
.
You can also disable the MAX_ENTRIES
behaviour, which avoids the SELECT
COUNT(*)
entirely, and makes cull()
only delete expired keys. To do this,
set MAX_ENTRIES
to -1:
CACHES = {
"default": {
"BACKEND": "django_mysql.cache.MySQLCache",
"LOCATION": "some_table_name",
"OPTIONS": {"MAX_ENTRIES": -1},
}
}
Note that you should then of course monitor the size of your cache table well, since it has no bounds on its growth.
compression#
Like the other Django cache backends, stored objects are serialized with
pickle
(except for integers, which are stored as integers so that the
incr()
and decr()
operations will work). If pickled data has a size
in bytes equal to or greater than the threshold defined by the option
COMPRESS_MIN_LENGTH
, it will be compressed with zlib
in Python before
being stored, reducing the on-disk size in MySQL and network costs for storage
and retrieval. The zlib level is set by the option COMPRESS_LEVEL
.
COMPRESS_MIN_LENGTH
defaults to 5000, and COMPRESS_LEVEL
defaults to
the zlib
default of 6. You can tune these options - for example, to
compress all objects >= 100 bytes at the maximum level of 9, pass the options
like so:
CACHES = {
"default": {
"BACKEND": "django_mysql.cache.MySQLCache",
"LOCATION": "some_table_name",
"OPTIONS": {"COMPRESS_MIN_LENGTH": 100, "COMPRESS_LEVEL": 9},
}
}
To turn compression off, set COMPRESS_MIN_LENGTH
to 0. The options only
affect new writes - any compressed values already in the table will remain
readable.
custom serialization#
You can implement your own serialization by subclassing MySQLCache
. It uses
two methods that you should override.
Values are stored in the table with two columns - value
, which is the blob
of binary data, and value_type
, a single latin1 character that specifies
the type of data in value
. MySQLCache by default uses three codes for
value_type
:
i
- The blob is an integer. This is used so that counters can be deserialized by MySQL during the atomic incr() and decr() operations.
p
- The blob is a pickled Python object.
z
- The blob is a zlib-compressed pickled Python object.
For future compatibility, MySQLCache
reserves all lower-case letters. For
custom types you can use upper-case letters.
The methods you need to override (and probably call super()
from) are:
- django_mysql.cache.encode(obj)#
Takes an object and returns a tuple
(value, value_type)
, ready to be inserted as parameters into the SQL query.
- django_mysql.cache.decode(value, value_type)#
Takes the pair of
(value, value_type)
as stored in the table and returns the deserialized object.
Studying the source of MySQLCache
will probably give you the best way to
extend these methods for your use case.
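As an illustrative sketch (the JSONMySQLCache subclass and its 'J' code are assumptions, not part of the library), a subclass could add a JSON serialization path while deferring everything else to the parent class:
import json

from django_mysql.cache import MySQLCache


class JSONMySQLCache(MySQLCache):
    def encode(self, obj):
        # Store dicts and lists as JSON under the custom upper-case code 'J'
        if isinstance(obj, (dict, list)):
            return json.dumps(obj).encode("utf-8"), "J"
        return super().encode(obj)

    def decode(self, value, value_type):
        if value_type == "J":
            return json.loads(value)
        return super().decode(value, value_type)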
prefix methods#
Three extension methods are available to work with sets of keys sharing a common prefix. Whilst these would not be efficient on other cache backends such as memcached, in an InnoDB table the keys are stored in order so range scans are easy.
To use these methods, it must be possible to reverse-map the “full” key stored
in the database to the key you would provide to cache.get
, via a ‘reverse
key function’. If you have not set KEY_FUNCTION
, MySQLCache
will use
Django’s default key function, and can therefore default the reverse key
function too, so you will not need to add anything.
However, if you have set KEY_FUNCTION
, you will also need to supply
REVERSE_KEY_FUNCTION
before the prefix methods can work. For example,
with a simple custom key function that ignores key_prefix
and version
,
you might do this:
def my_key_func(key, key_prefix, version):
return key # Ignore key_prefix and version
def my_reverse_key_func(full_key):
# key_prefix and version still need to be returned
key_prefix = None
version = None
return key, key_prefix, version
CACHES = {
"default": {
"BACKEND": "django_mysql.cache.MySQLCache",
"LOCATION": "some_table_name",
"KEY_FUNCTION": my_key_func,
"REVERSE_KEY_FUNCTION": my_reverse_key_func,
}
}
Once you’re set up, the following prefix methods can be used:
- django_mysql.cache.delete_with_prefix(prefix, version=None)#
Deletes all keys that start with the string
prefix
. If version is not provided, it will default to the VERSION setting. Returns the number of keys that were deleted. For example:
>>> cache.set_many({"Car1": "Blue", "Car4": "Red", "Truck3": "Yellow"})
>>> cache.delete_with_prefix("Truck")
1
>>> cache.get("Truck3")
None
Note
This method does not require you to set the reverse key function.
- django_mysql.cache.get_with_prefix(prefix, version=None)#
Like
get_many
, returns a dict of key to value for all keys that start with the string prefix. If version is not provided, it will default to the VERSION setting. For example:
>>> cache.set_many({"Car1": "Blue", "Car4": "Red", "Truck3": "Yellow"})
>>> cache.get_with_prefix("Truck")
{'Truck3': 'Yellow'}
>>> cache.get_with_prefix("Ca")
{'Car1': 'Blue', 'Car4': 'Red'}
>>> cache.get_with_prefix("")
{'Car1': 'Blue', 'Car4': 'Red', 'Truck3': 'Yellow'}
- django_mysql.cache.keys_with_prefix(prefix, version=None)#
Returns a set of all the keys that start with the string
prefix
. If version is not provided, it will default to the VERSION setting. For example:
>>> cache.set_many({"Car1": "Blue", "Car4": "Red", "Truck3": "Yellow"})
>>> cache.keys_with_prefix("Car")
set(['Car1', 'Car4'])
Changes#
Versions 0.1.10 -> 0.2.0#
Initially, in Django-MySQL version 0.1.10, MySQLCache
did not force the
columns to use case sensitive collations; in version 0.2.0 this was fixed. You
can upgrade by adding a migration with the following SQL, if you replace
yourtablename
:
ALTER TABLE yourtablename
MODIFY cache_key varchar(255) CHARACTER SET utf8 COLLATE utf8_bin
NOT NULL,
MODIFY value_type char(1) CHARACTER SET latin1 COLLATE latin1_bin
NOT NULL DEFAULT 'p';
Or as a reversible migration:
from django.db import migrations
class Migration(migrations.Migration):
dependencies = []
operations = [
migrations.RunSQL(
"""
ALTER TABLE yourtablename
MODIFY cache_key varchar(255) CHARACTER SET utf8 COLLATE utf8_bin
NOT NULL,
MODIFY value_type char(1) CHARACTER SET latin1 COLLATE latin1_bin
NOT NULL DEFAULT 'p'
""",
"""
ALTER TABLE yourtablename
MODIFY cache_key varchar(255) CHARACTER SET utf8 NOT NULL,
MODIFY value_type char(1) CHARACTER SET latin1 NOT NULL DEFAULT 'p'
""",
)
]
Locks#
The following can be imported from django_mysql.locks
.
- class django_mysql.locks.Lock(name, acquire_timeout=10.0, using=None)[source]#
MySQL can act as a locking server for arbitrary named locks (created on the fly) via its
GET_LOCK
function - sometimes called 'User Locks' since they are user-specific, and don't lock tables or rows. They can be useful for your code to limit its access to some shared resource.
This class implements a user lock and acts as either a context manager (recommended), or a plain object with acquire and release methods similar to threading.Lock. These call the MySQL functions GET_LOCK, RELEASE_LOCK, and IS_USED_LOCK to manage it.
The lock is only re-entrant (acquirable multiple times) on MariaDB.
Basic usage:
from django_mysql.exceptions import TimeoutError
from django_mysql.locks import Lock

try:
    with Lock("my_unique_name", acquire_timeout=2.0):
        mutually_exclusive_process()
except TimeoutError:
    print("Could not get the lock")
For more information on user locks refer to the
GET_LOCK
documentation on MySQL or MariaDB.Warning
As the documentation warns, user locks are unsafe to use if you have replication running and your replication format (
binlog_format
) is set toSTATEMENT
. Most environments havebinlog_format
set toMIXED
because it can be more performant, but do check.- name#
This is a required argument.
Specifies the name of the lock. Since user locks share a global namespace on the MySQL server, it will automatically be prefixed with the name of the database you use in your connection from DATABASES and a full stop, in case multiple apps are using different databases on the same server.
MySQL enforces a maximum length on the total name (including the DB prefix that Django-MySQL adds) of 64 characters. MariaDB doesn’t enforce any limit. The practical limit on MariaDB is maybe 1 million characters or more, so most sane uses should be fine.
- acquire_timeout=10.0
The time in seconds to wait to acquire the lock, as will be passed to
GET_LOCK()
. Defaults to 10 seconds.
- using=None
The connection alias from
DATABASES
to use. Defaults to Django’sDEFAULT_DB_ALIAS
to use your main database connection.
- is_held()[source]#
Returns True iff a query to
IS_USED_LOCK()
reveals that this lock is currently held.
- holding_connection_id()[source]#
Returns the MySQL
CONNECTION_ID()
of the holder of the lock, orNone
if it is not currently held.
- acquire()[source]#
For using the lock as a plain object rather than a context manager, similar to
threading.Lock.acquire
. Note you should normally usetry
/finally
to ensure unlocking occurs.Example usage:
from django_mysql.locks import Lock

lock = Lock("my_unique_name")

lock.acquire()
try:
    mutually_exclusive_process()
finally:
    lock.release()
- release()[source]#
Also for using the lock as a plain object rather than a context manager, similar to
threading.Lock.release
. For example, see above.
- classmethod held_with_prefix(prefix, using=DEFAULT_DB_ALIAS)[source]#
Queries the held locks that match the given prefix, for the given database connection. Returns a dict of lock names to the
CONNECTION_ID()
that holds the given lock.Example usage:
>>> Lock.held_with_prefix("Author") {'Author.1': 451, 'Author.2': 457}
Note
Works with MariaDB 10.0.7+ only, when the
metadata_lock_info
plugin is loaded. You can install this in a migration using the InstallSOName operation, like so:
from django.db import migrations
from django_mysql.operations import InstallSOName


class Migration(migrations.Migration):
    dependencies = []

    operations = [
        # Install https://mariadb.com/kb/en/mariadb/metadata_lock_info/
        InstallSOName("metadata_lock_info")
    ]
- class django_mysql.locks.TableLock(write=None, read=None, using=None)[source]#
MySQL allows you to gain a table lock to prevent modifications to the data during reads or writes. Most applications don’t need to do this since transaction isolation should provide enough separation between operations, but occasionally this can be useful, especially in data migrations or if you are using a non-transactional storage such as MyISAM.
This class implements table locking and acts as either a context manager (recommended), or a plain object with
acquire()
and release() methods similar to threading.Lock. It uses the transactional pattern from the MySQL manual to ensure all the necessary steps are taken to lock tables properly. Note that locking has no timeout and blocks until held.
Basic usage:
from django_mysql.locks import TableLock

with TableLock(read=[MyModel1], write=[MyModel2]):
    fix_bad_instances_of_my_model2_using_my_model1_data()
- read#
A list of models or raw table names to lock at the
READ
level. Any models using multi-table inheritance will also lock their parents.
- write#
A list of models or raw table names to lock at the
WRITE
level. Any models using multi-table inheritance will also lock their parents.
- using=None
The connection alias from
DATABASES
to use. Defaults to Django’sDEFAULT_DB_ALIAS
to use your main database connection.
- acquire()[source]#
For using the lock as a plain object rather than a context manager, similar to
threading.Lock.acquire
. Note you should normally usetry
/finally
to ensure unlocking occurs.Example usage:
from django_mysql.locks import TableLock table_lock = TableLock(read=[MyModel1], write=[MyModel2]) table_lock.acquire() try: fix_bad_instances_of_my_model2_using_my_model1_data() finally: table_lock.release()
- release()[source]#
Also for using the lock as a plain object rather than a context manager, similar to
threading.Lock.release
. For example, see above.
Note
Transactions are not allowed around table locks, and an error will be raised if you try and use one inside of a transaction. A transaction is created to hold the locks in order to cooperate with InnoDB. There are a number of things you can’t do whilst holding a table lock, for example accessing tables other than those you have locked - see the MySQL/MariaDB documentation for more details.
Note
Table locking works on InnoDB tables only if the
innodb_table_locks
is set to 1. This is the default, but may have been changed for your environment.
Status#
MySQL gives you metadata on the server status through its SHOW GLOBAL
STATUS
and SHOW SESSION STATUS
commands. These classes make it easy to
get this data, as well as providing utility methods to react to it.
The following can all be imported from django_mysql.status
.
- class django_mysql.status.GlobalStatus(name, using=None)[source]#
Provides easy access to the output of
SHOW GLOBAL STATUS
. These statistics are useful for monitoring purposes, or ensuring queries your code creates aren’t saturating the server.Basic usage:
from django_mysql.status import global_status

# Wait until a quiet moment
while global_status.get("Threads_running") >= 5:
    time.sleep(1)

# Log all status variables
logger.log("DB status", extra=global_status.as_dict())
Note that
global_status
is a pre-existing instance for the default database connection from DATABASES. If you're using more than one database connection, you should instantiate the class:
>>> from django_mysql.status import GlobalStatus
>>> GlobalStatus(using="replica1").get("Threads_running")
47
To see the names of all the available variables, refer to the documentation: MySQL / MariaDB. They vary based upon server version, plugins installed, etc.
- using=None
The connection alias from
DATABASES
to use. Defaults to Django’sDEFAULT_DB_ALIAS
to use your main database connection.
- get(name)#
Returns the current value of the named status variable. The name may not include SQL wildcards (
%
). If it does not exist, KeyError will be raised.
The result set for SHOW STATUS returns values in strings, so numbers and booleans will be cast to their respective Python types - int, float, or bool. Strings are left as-is.
- get_many(names)#
Returns a dictionary of names to current values, fetching them in a single query. The names may not include wildcards (
%
).Uses the same type-casting strategy as
get()
.
- as_dict(prefix=None)#
Returns a dictionary of names to current values. If
prefix
is given, only those variables starting with the prefix will be returned.prefix
should not end with a wildcard (%
) since that will be automatically appended.Uses the same type-casting strategy as
get()
.
- wait_until_load_low(thresholds={'Threads_running': 10}, timeout=60.0, sleep=0.1)[source]#
A helper method similar to the logic in
pt-online-schema-change
for waiting with –max-load.Polls global status every
sleep
seconds until every variable named inthresholds
is at or below its specified threshold, or raises adjango_mysql.exceptions.TimeoutError
if this does not occur withintimeout
seconds. Settimeout
to 0 to never time out.thresholds
defaults to{'Threads_running': 10}
, which is the default variable used inpt-online-schema-change
, but with a lower threshold of 10 that is more suitable for small servers. You will very probably need to tweak it to your server.You can use this method during large background operations which you don’t want to affect other connections (i.e. your website). By processing in small chunks and waiting for low load in between, you sharply reduce your risk of outage.
- class django_mysql.status.SessionStatus(name, connection_name=None)[source]#
This class is the same as GlobalStatus apart from it runs
SHOW SESSION STATUS
, so some variables are restricted to the current connection only, rather than the whole server. For which, you should refer to the documentation: MySQL / MariaDB.Also it doesn’t have the
wait_until_load_low
method, which only makes sense in a global setting.Example usage:
from django_mysql.status import session_status read_operations = session_status.get("Handler_read")
And for a different connection:
from django_mysql.status import SessionStatus replica1_reads = SessionStatus(using="replica1").get("Handler_read")
Management Commands#
MySQL-specific management commands. These are automatically available with
your manage.py
when you add django_mysql
to your INSTALLED_APPS
.
dbparams
command#
Outputs your database connection parameters in a form suitable for inclusion in
other CLI commands, helping avoid copy/paste errors and accidental copying of
passwords to shell history files. Knows how to output parameters in two formats
- for mysql
related tools, or the DSN format that some percona tools take.
For example:
$ python manage.py dbparams && echo # 'echo' adds a newline
--user=ausername --password=apassword --host=ahost.example.com mydatabase
$ mysql $(python manage.py dbparams) # About the same as 'manage.py dbshell'
$ mysqldump $(python manage.py dbparams) | gzip -9 > backup.sql.gz # Neat!
The format of parameters is:
python manage.py dbparams [--mysql | --dsn] <optional-connection-alias>
If the database alias is given, it should be alias of a connection from the
DATABASES
setting; defaults to ‘default’. Only MySQL connections are
supported - the command will fail for other connection vendors.
Mutually exclusive format flags:
--mysql
#
Default, so shouldn’t need passing. Allows you to do, e.g.:
$ mysqldump $(python manage.py dbparams) | gzip -9 > backup.sql.gz
Which will translate to include all the relevant flags, including your database.
--dsn
#
Outputs the parameters in the DSN format, which is what many percona tools take, e.g.:
$ pt-duplicate-key-checker $(python manage.py dbparams --dsn)
Note
If you are using SSL to connect, the percona tools don’t support SSL
configuration being given in their DSN format; you must pass them via a
MySQL configuration file instead. dbparams
will output a warning
on stderr if this is the case. For more info see the percona blog.
Test Utilities#
The following can be imported from django_mysql.test.utils
.
- django_mysql.test.utils.override_mysql_variables(using='default', **options)[source]#
Overrides MySQL system variables for a test method or for every test method in a class, similar to Django’s
override_settings
. This can be useful when you're testing code that must run under multiple MySQL environments (like most of django-mysql). For example:
@override_mysql_variables(SQL_MODE="MSSQL")
class MyTests(TestCase):
    def test_it_works_in_mssql(self):
        run_it()

    @override_mysql_variables(SQL_MODE="ANSI")
    def test_it_works_in_ansi_mode(self):
        run_it()
During the first test, the
SQL_MODE
will beMSSQL
, and during the second, it will beANSI
; each slightly changes the allowed SQL syntax, meaning they are useful to test.Note
This only sets the system variables for the session, so if the tested code closes and re-opens the database connection the change will be reset.
- django_mysql.test.utils.using#
The connection alias to set the system variables for, defaults to ‘default’.
Exceptions#
Various exception classes that can be raised by django_mysql
code. They can
imported from the django_mysql.exceptions
module.
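For example, code that waits on a user lock or on server load typically catches the library's TimeoutError (a small sketch using classes documented above; the task function is illustrative):
from django_mysql.exceptions import TimeoutError
from django_mysql.locks import Lock

try:
    with Lock("nightly-report", acquire_timeout=5.0):
        generate_report()  # hypothetical long-running task
except TimeoutError:
    print("Another worker is already generating the report")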
Contributing#
Run the tests#
Install tox.
Run a supported version of MySQL or MariaDB. This is easiest with the official Docker images. For example:
docker run --detach --name mariadb -e MYSQL_ROOT_PASSWORD=hunter2 --publish 3306:3306 mariadb:10.7
Run the tests by passing environment variables with your connection parameters. For the above Docker command:
DB_HOST=127.0.0.1 DB_USER=root DB_PASSWORD='hunter2' tox -e py310-django40
tox environments are split per Python and Django version.
You can run a subset of tests by passing them after
--
like:DB_HOST=127.0.0.1 DB_USER=root DB_PASSWORD='hunter2' tox -e py310-django40 -- tests/testapp/test_cache.py
You can also pass other pytest arguments after the
--
.
Changelog#
4.12.0 (2023-10-11)#
Support Django 5.0.
4.11.0 (2023-07-10)#
Drop Python 3.7 support.
4.10.0 (2023-06-16)#
Support Python 3.12.
4.9.0 (2023-02-25)#
Support Django 4.2.
Drop support for MySQL 5.7 and MariaDB 10.3. They will both reach EOL this year, and Django 4.2 does not support them.
4.8.0 (2022-12-06)#
Make
MySQLCache.touch()
returnTrue
if the key was touched,False
otherwise. This return value was missing since the method was added for Django 2.1.Fix a bug where set fields’
contains
lookups would put SQL parameters in the wrong order.Remove deprecated database functions which exist in Django 3.0+:
Sign
MD5
SHA1
SHA2
.
4.7.1 (2022-08-11)#
Ensure that changing
choices
on anEnumField
triggers a migration on Django 4.1.
4.7.0 (2022-06-05)#
Support Python 3.11.
Support Django 4.1.
Drop support for MariaDB 10.2, as it is end of life.
4.6.0 (2022-05-10)#
Drop support for Django 2.2, 3.0, and 3.1.
Support MariaDB 10.7 and 10.8.
Drop
django_mysql.utils.connection_is_mariadb
. On Django 3.0+ you can simply check:connection.vendor == "mysql" and connection.mysql_is_mariadb
Deprecate database functions which exist in Django 3.0+:
Sign
MD5
SHA1
SHA2
4.5.0 (2022-01-23)#
Drop
pt_fingerprint()
. Its complicated threading code leaked processes. Switch to callingpt-fingerprint
directly withsubprocess.run()
instead.Add model
FixedCharField
for storing fixed width strings using aCHAR
type.Thanks to Caleb Ely in PR #883.
4.4.0 (2022-01-10)#
Drop Python 3.6 support.
4.3.0 (2021-12-07)#
Fix DynamicField.deconstruct() to correctly handle blank.
Make JSON database functions work on MariaDB.
4.2.0 (2021-10-05)#
Support Python 3.10.
4.1.0 (2021-09-28)#
Support Django 4.0.
4.0.0 (2021-08-24)#
Test with MariaDB 10.6.
Add type hints.
Drop support for MySQL 5.6 and MariaDB 10.1, as they are both end of life.
Drop the fix_datetime_columns management command, which was useful when upgrading from MySQL < 5.6.
Drop the check for strict mode (django_mysql.W001) as this has been included in Django itself since version 1.10.
Drop the Greatest, Least, Abs, Ceiling, Floor and Round database functions as they exist in Django core now. Swap to importing them from django.db.models.functions.
Drop the JSONField model and form fields. Django 3.1 provides a JSONField implementation that works with all database backends; use that instead. If you are on an earlier version of Django, use django-jsonfield-backport.
Make JSON database functions work with Django’s JSONField, and the backport. They remain MySQL only.
Drop HANDLER functionality. This was not particularly robustly implemented and is somewhat dangerous to use due to its potential for dirty reads.
3.12.0 (2021-06-11)#
Fix index hints for tables with aliases.
Thanks to Henrik Aarnio in PR #786.
Stop distributing tests to reduce package size. Tests are not intended to be run outside of the tox setup in the repository. Repackagers can use GitHub’s tarballs per tag.
3.11.1 (2021-01-26)#
Pass the chunk_size argument through in QuerySetMixin.iterator(). (Issue #752)
3.11.0 (2021-01-25)#
Support Django 3.2.
Rework system checks for Django 3.1, which made database checks optional. To run the checks you now need to pass the --database argument to the check command, for example python manage.py check --database default.
3.10.0 (2020-12-09)#
Drop Python 3.5 support.
Support Python 3.9.
Deprecate the Greatest, Least, Abs, Ceiling, Floor and Round database functions as they exist in Django core now. Using the Django-MySQL versions now triggers a DeprecationWarning.
Deprecate JSONField. Django 3.1 provides a JSONField implementation that works with all database backends; use that instead. If you are on an earlier version of Django, use django-jsonfield-backport.
3.9.0 (2020-10-11)#
Move license from BSD to MIT License.
Fix the form field for JSONField to not use the ensure_ascii flag, making it support all unicode characters.
3.8.1 (2020-07-27)#
Fix one more RemovedInDjango40Warning message for django.utils.translation.
3.8.0 (2020-07-27)#
Drop Django 2.0 and 2.1 support.
Test with MariaDB 10.5.
Drop testing with MariaDB 10.0 (Django only officially supports MariaDB 10.1+ anyway).
Fix RemovedInDjango40Warning messages for django.utils.translation.
3.7.1 (2020-06-24)#
Fix query rewriting to install for recreated database connections. (Issue #677)
3.7.0 (2020-06-15)#
Add Django 3.1 support.
3.6.0 (2020-06-09)#
Changed query rewriting to use Django’s database instrumentation. (Issue #644)
Added JSONIn lookup which only works with literal values (not with expressions nor subqueries).
Fix JSONContains to make it work with scalar values again. (PR #668)
3.5.0 (2020-05-04)#
Add MySQL 8 support.
3.4.0 (2020-04-16)#
Prevent collections.abc.Sequence warning.
Drop Django 1.11 support. Only Django 2.0+ is supported now.
Prevent JSONField from adding CAST(... AS JSON) for str, int, and float objects.
3.3.0 (2019-12-10)#
Update Python support to 3.5-3.8.
Converted setuptools metadata to configuration file. This meant removing the __version__ attribute from the package. If you want to inspect the installed version, use importlib.metadata.version("django-mysql") (docs / backport).
Fix GroupConcat to work with both separator and ordering set. (PR #596)
3.2.0 (2019-06-14)#
Update Python support to 3.5-3.7, as 3.4 has reached its end of life.
Always cast SQL params to tuples in ORM code.
3.1.0 (2019-05-17)#
Remove authors file and documentation page. This was showing only 4 out of the 17 total contributors.
Tested on Django 2.2. No changes were needed for compatibility.
3.0.0.post1 (2019-03-05)#
Remove universal wheel. Version 3.0.0 has been pulled from PyPI after being up for 3 hours to fix mistaken installs on Python 2.
3.0.0 (2019-03-05)#
Drop Python 2 support, only Python 3.4+ is supported now.
2.5.0 (2019-03-03)#
Drop Django 1.8, 1.9, and 1.10 support. Only Django 1.11+ is supported now.
2.4.1 (2018-08-18)#
Django 2.1 compatibility - no code changes were required, releasing for PyPI trove classifiers and documentation.
2.4.0 (2018-07-31)#
Added JSONArrayAppend database function that wraps the respective JSON-modifying function from MySQL 5.7.
2.3.1 (2018-07-22)#
Made EnumField escape its arguments in a pymysql-friendly fashion.
2.3.0 (2018-06-19)#
Started testing with MariaDB 10.3.
Changed GlobalStatus.wait_until_load_low() to increase the default number of allowed running threads from 5 to 10, to account for the new default threads in MariaDB 10.3.
Added encoder and decoder arguments to JSONField for customizing the way JSON is encoded and decoded from the database.
Added a touch method to the MySQLCache to refresh cache keys, as added in Django 2.1.
Use a temporary database connection in system checks to avoid application startup stalls.
2.2.2 (2018-04-24)#
Fixed some crashes from DynamicField instances without explicit spec definitions.
Fixed a crash in system checks for ListCharField and SetCharField instances missing max_length.
2.2.1 (2018-04-14)#
Fixed JSONField.deconstruct() to not break the path for subclasses.
2.2.0 (2017-12-04)#
Add output_field argument to JSONExtract function.
Improved DB version checks for JSONField and DynamicField so you can have just one connection that supports them.
Django 2.0 compatibility.
2.1.1 (2017-10-10)#
Changed subprocess imports for compatibility with Google App Engine.
Made MySQLCache.set_many return a list as per Django 2.0.
2.1.0 (2017-06-11)#
Django 1.11 compatibility
Some fixes to work with new versions of mysqlclient
2.0.0 (2017-05-28)#
Fixed JSONField model field string serialization. This is a small backwards incompatible change. Storing strings mostly used to crash with MySQL error -1 “error totally whack”, but in the case your string was valid JSON, it would store it as a JSON object at the MySQL layer and deserialize it when returned. For example you could do this:
>>> mymodel.attrs = '{"foo": "bar"}'
>>> mymodel.save()
>>> mymodel = MyModel.objects.get(id=mymodel.id)
>>> mymodel.attrs
{'foo': 'bar'}
The new behaviour now correctly returns what you put in:
>>> mymodel.attrs
'{"foo": "bar"}'
Removed the connection.is_mariadb monkey patch. This is a small backwards incompatible change. Instead of using it, use django_mysql.utils.connection_is_mariadb.
1.2.0 (2017-05-14)#
Only use Django’s vendored six (django.utils.six). Fixes usage of EnumField and field lookups when six is not installed as a standalone package.
Added JSONInsert, JSONReplace and JSONSet database functions that wrap the respective JSON-modifying functions from MySQL 5.7.
Fixed JSONField to work with Django’s serializer framework, as used in e.g. dumpdata.
Fixed the JSONField form field so that it doesn’t overquote inputs when redisplaying the form due to invalid user input.
1.1.1 (2017-03-28)#
Don’t allow NaN in JSONField because MySQL doesn’t support it
1.1.0 (2016-07-22)#
Dropped Django 1.7 support
Made the query hint functions raise RuntimeError if you haven’t activated the query-rewriting layer in settings.
1.0.9 (2016-05-12)#
Fixed some features to work when there are non-MySQL databases configured
Fixed JSONField to allow control characters, which MySQL does - but not in a top-level string, only inside a JSON object/array.
1.0.8 (2016-04-08)#
SmartChunkedIterator now fails properly for models whose primary key is a non-integer foreign key.
pty is no longer imported at the top level in django_mysql.utils, fixing Windows compatibility.
1.0.7 (2016-03-04)#
Added new JSONField class backed by the JSON type added in MySQL 5.7.
Added database functions JSONExtract, JSONKeys, and JSONLength that wrap the JSON functions added in MySQL 5.7, which can be used with the JSON type columns as well as JSON data held in text/varchar columns.
Added If database function for simple conditionals.
1.0.6 (2016-02-26)#
Now MySQL 5.7 compatible
The final message from SmartChunkedIterator is now rounded to the nearest second.
Lock and TableLock classes now have acquire() and release() methods for using them as normal objects rather than context managers
1.0.5 (2016-02-10)#
Added manage.py command fix_datetime_columns that outputs the SQL necessary to fix any datetime columns into datetime(6), as required when upgrading a database to MySQL 5.6+, or MariaDB 5.3+.
SmartChunkedIterator output now includes the total time taken and number of objects iterated over in the final message.
1.0.4 (2016-02-02)#
Fixed the new system checks to actually work
1.0.3 (2016-02-02)#
Fixed EnumField so that it works properly with forms, and does not accept the max_length argument.
SmartChunkedIterator output has been fixed for reversed iteration, and now includes a time estimate.
Added three system checks that give warnings if the MySQL configuration can (probably) be improved.
1.0.2 (2016-01-24)#
New function add_QuerySetMixin allows adding the QuerySetMixin to arbitrary QuerySets, for when you can’t edit a model class.
Added field class EnumField that uses MySQL’s ENUM data type.
1.0.1 (2015-11-18)#
Added chunk_min argument to SmartChunkedIterator
1.0.0 (2015-10-29)#
Changed version number to 1.0.0 to indicate maturity.
Added DynamicField for using MariaDB’s Named Dynamic Columns, and related database functions ColumnAdd, ColumnDelete, and ColumnGet.
SmartChunkedIterator with report_progress=True correctly reports ‘lowest pk so far’ when iterating in reverse.
Fix broken import paths during deconstruct() for subclasses of all fields: ListCharField, ListTextField, SetCharField, SetTextField, SizedBinaryField and SizedTextField
Added XML database functions - UpdateXML and XMLExtractValue.
0.2.3 (2015-10-12)#
Allow approx_count on QuerySets for which only query hints have been used
Added index query hints to QuerySet methods, via query-rewriting layer
Added ordering parameter to GroupConcat to specify the ORDER BY clause
Added sql_calc_found_rows() query hint that calculates the total rows that match when you only take a slice, which becomes available on the found_rows attribute
Made SmartChunkedIterator work with reverse()’d QuerySets
0.2.2 (2015-09-03)#
SmartChunkedIterator now takes an argument chunk_size as the initial chunk size
SmartChunkedIterator now allows models whose primary key is a ForeignKey
Added iter_smart_pk_ranges which is similar to iter_smart_chunks but yields only the start and end primary keys for each chunk, in a tuple.
Added prefix methods to MySQLCache - delete_with_prefix, get_with_prefix, keys_with_prefix
Added Bit1BooleanField and NullBit1BooleanField model fields that work with boolean fields built by other databases that use the BIT(1) column type
0.2.1 (2015-06-22)#
Added Regexp database functions for MariaDB - RegexpInstr, RegexpReplace, and RegexpSubstr
Added the option to not limit the size of a MySQLCache by setting MAX_ENTRIES = -1.
MySQLCache performance improvements in get, get_many, and has_key
Added a query-rewriting layer which allows the use of MySQL query hints such as STRAIGHT_JOIN via QuerySet methods, as well as adding label comments to track where queries are generated.
Added TableLock context manager
0.2.0 (2015-05-14)#
More database functions added - Field and its complement ELT, and LastInsertId
Case sensitive string lookup added to the ORM for CharField and TextField
Migration operations added - InstallPlugin, InstallSOName, and AlterStorageEngine
Extra ORM aggregates added - BitAnd, BitOr, and BitXor
MySQLCache is now case-sensitive. If you are already using it, an upgrade ALTER TABLE and migration is provided at the end of the cache docs.
(MariaDB only) The Lock class gained a class method held_with_prefix to query held locks matching a given prefix
SmartIterator bugfix for chunks with 0 objects slowing iteration; such chunks most often occur on tables with primary key “holes”
Now tested against Django master for cutting edge users and forwards compatibility
0.1.10 (2015-04-30)#
Added the MySQLCache backend for use with Django’s caching framework, a more efficient version of DatabaseCache
Fix a ZeroDivision error in WeightedAverageRate, which is used in smart iteration
0.1.9 (2015-04-20)#
pt_visual_explain no longer executes the given query before fetching its EXPLAIN
New pt_fingerprint function that wraps the pt-fingerprint tool efficiently
For List fields, the new ListF class allows you to do atomic append or pop operations from either end of the list in a single query
For Set fields, the new SetF class allows you to do atomic add or remove operations from the set in a single query
The @override_mysql_variables decorator has been introduced which makes testing code with different MySQL configurations easy
The is_mariadb property gets added onto Django’s MySQL connection class automatically
A race condition in determining the minimum and maximum primary key values for smart iteration was fixed.
0.1.8 (2015-03-31)#
Add Set and List fields which can store comma-separated sets and lists of a base field with MySQL-specific lookups
Support MySQL’s GROUP_CONCAT as an aggregate!
Add a functions module with many MySQL-specific functions for the new Django 1.8 database functions feature
Allow access of the global and session status for the default connection from a lazy singleton, similar to Django’s connection object
Fix a different recursion error on count_tries_approx
0.1.7 (2015-03-25)#
Renamed connection_name argument to using on Lock, GlobalStatus, and SessionStatus classes, for more consistency with Django.
Fix recursion error on QuerySetMixin when using count_tries_approx
0.1.6 (2015-03-21)#
Added support for HANDLER statements as a QuerySet extension
Now tested on Django 1.8
Add pk_range argument for ‘smart iteration’ code
0.1.5 (2015-03-11)#
Added manage.py command dbparams for outputting database parameters in formats useful for shell scripts
0.1.4 (2015-03-10)#
Fix release process
0.1.3 (2015-03-08)#
Added pt_visual_explain integration on QuerySet
Added soundex-based field lookups for the ORM
0.1.2 (2015-03-01)#
Added get_many to GlobalStatus
Added wait_until_load_low to GlobalStatus which allows you to wait for any high load on your database server to dissipate.
Added smart iteration classes and methods for QuerySets that allow efficient iteration over very large sets of objects slice-by-slice.
0.1.1 (2015-02-23)#
Added Model and QuerySet subclasses which add the approx_count method
0.1.0 (2015-02-12)#
First release on PyPI
Locks
GlobalStatus and SessionStatus