Query caching allows avoiding repeated queries to the database by caching the results locally for faster retrieval.
To enable queries, set the query cache local path in dj.config['query_cache']
and activate the query caching with conn.set_query_cache(query_cache)
where conn
is the connection object.
A reference to the connection object is kept as a property of the schema or table objects, e.g. schema.connection
. Alternatively, the function dj.conn()
returns the currently active connection object.
The query_cache
argument is an aribtrary string serving to differentiate cache states; setting a new value will effectively start a new cache, triggering retrieval of new values once.
Setting query_cache=None
turns off query caching.
While query caching is enabled, any insert
or delete
calls and any transactions are disabled and will raise an error. This ensures that stale data are not used for updating the database in violation of data integrity.
To clear the cache, delete the contents of the folder specified in dj.config['query_cache']
.
A complete example¶
import datajoint as dj
import numpy as np
import os
Create a table to store image data and populate it
schema = dj.schema('test_query_caching')
@schema
class Image(dj.Manual):
definition = """
image_number : int
---
image : longblob
"""
Image.insert1((1, np.random.randn(300,300)))
Time query without caching.
%%timeit
r = (Image & 'image_number=1').fetch1('image')
403 ms ± 72.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Enable query caching and note that queries are sped up.
dj.config['query_cache']=os.path.expanduser('~/tmp')
conn = dj.conn()
conn.set_query_cache('s0')
%%timeit
r = (Image & 'image_number=1').fetch1('image')
2.91 ms ± 57.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Any attempts to insert or delete data will result in an error while query caching is on.
Image.delete()
---------------------------------------------------------------------------
DataJointError Traceback (most recent call last)
<ipython-input-8-3eb8731b0f22> in <module>
----> 1 Image.delete()
~/dev/datajoint-python/datajoint/table.py in delete(self, transaction, safemode)
384 if transaction:
385 if not self.connection.in_transaction:
--> 386 self.connection.start_transaction()
387 else:
388 if not safemode:
~/dev/datajoint-python/datajoint/connection.py in start_transaction(self)
333 if self.in_transaction:
334 raise errors.DataJointError("Nested connections are not supported.")
--> 335 self.query('START TRANSACTION WITH CONSISTENT SNAPSHOT')
336 self._in_transaction = True
337 logger.info("Transaction started")
~/dev/datajoint-python/datajoint/connection.py in query(self, query, args, as_dict, suppress_warnings, reconnect)
273 use_query_cache = bool(self._query_cache)
274 if use_query_cache and not re.match(r"\s*(SELECT|SHOW)", query):
--> 275 raise errors.DataJointError("Only SELECT queries are allowed when query caching is on.")
276 if use_query_cache:
277 if not config['query_cache']:
DataJointError: Only SELECT queries are allowed when query caching is on.
Turn off query caching. Data manua
# Turn off query caching
conn.set_query_cache(None)
Image.delete()
Deleting 1 rows from `test_query_caching`.`image`
Commit deletes? [yes, No]: yes
Deletes committed.