Skip to article frontmatterSkip to article content

Caching

Query caching allows avoiding repeated queries to the database by caching the results locally for faster retrieval.

To enable queries, set the query cache local path in dj.config['query_cache'] and activate the query caching with conn.set_query_cache(query_cache) where conn is the connection object.

A reference to the connection object is kept as a property of the schema or table objects, e.g. schema.connection. Alternatively, the function dj.conn() returns the currently active connection object.

The query_cache argument is an aribtrary string serving to differentiate cache states; setting a new value will effectively start a new cache, triggering retrieval of new values once.

Setting query_cache=None turns off query caching.

While query caching is enabled, any insert or delete calls and any transactions are disabled and will raise an error. This ensures that stale data are not used for updating the database in violation of data integrity.

To clear the cache, delete the contents of the folder specified in dj.config['query_cache'].

A complete example

import datajoint as dj
import numpy as np
import os

Create a table to store image data and populate it

schema = dj.schema('test_query_caching')

@schema
class Image(dj.Manual):
    definition = """
    image_number : int
    ---
    image : longblob
    """
Image.insert1((1, np.random.randn(300,300)))

Time query without caching.

%%timeit
r = (Image & 'image_number=1').fetch1('image')
403 ms ± 72.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Enable query caching and note that queries are sped up.

dj.config['query_cache']=os.path.expanduser('~/tmp')
conn = dj.conn()
conn.set_query_cache('s0')
%%timeit
r = (Image & 'image_number=1').fetch1('image')
2.91 ms ± 57.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Any attempts to insert or delete data will result in an error while query caching is on.

Image.delete()
---------------------------------------------------------------------------
DataJointError                            Traceback (most recent call last)
<ipython-input-8-3eb8731b0f22> in <module>
----> 1 Image.delete()

~/dev/datajoint-python/datajoint/table.py in delete(self, transaction, safemode)
    384         if transaction:
    385             if not self.connection.in_transaction:
--> 386                 self.connection.start_transaction()
    387             else:
    388                 if not safemode:

~/dev/datajoint-python/datajoint/connection.py in start_transaction(self)
    333         if self.in_transaction:
    334             raise errors.DataJointError("Nested connections are not supported.")
--> 335         self.query('START TRANSACTION WITH CONSISTENT SNAPSHOT')
    336         self._in_transaction = True
    337         logger.info("Transaction started")

~/dev/datajoint-python/datajoint/connection.py in query(self, query, args, as_dict, suppress_warnings, reconnect)
    273         use_query_cache = bool(self._query_cache)
    274         if use_query_cache and not re.match(r"\s*(SELECT|SHOW)", query):
--> 275             raise errors.DataJointError("Only SELECT queries are allowed when query caching is on.")
    276         if use_query_cache:
    277             if not config['query_cache']:

DataJointError: Only SELECT queries are allowed when query caching is on.

Turn off query caching. Data manua

# Turn off query caching
conn.set_query_cache(None)
Image.delete()
Deleting 1 rows from `test_query_caching`.`image`
Commit deletes? [yes, No]: yes
Deletes committed.