Datomic Antipatterns: Conn as a Value

  • clojure
  • datomic

One of the most common faux pas I see in the Datomic-wilds is creating functions that eschew the database as a value (i.e. functions that accept db as an argument). Instead, they treat the connection as a value: their functions accept conn as an argument or, worse, retrieve the connection from some global state.

The Trouble with Conn

At best, this practice can save you a bit of typing. At worst, it’ll cost you a lot of time. Connections as values mean your functions are hard to compose, hard to test and in the worst case, liable to produce inconsistent results.

Below is a very simple Datomic function that can be shown to exhibit all three of these bad behaviors.

Bad: Connection as a Value

(def conn (d/connect "datomic:free://some-db"))

;; A typical *connection as a value*-like function.
;; This version pulls the conn from the ambient environment.
(defn low-volume-accounts [limit]
  (d/q '[:find ?account
         :in $ ?limit
         :where [?account :sales/volume ?vol]
                [(<= ?vol ?limit)]]
       (d/db conn)
       limit))

Let’s illustrate problem-by-problem where connection as a value falls flat and show how database as a value works better.


Composability

low-volume-accounts is a perfectly fine function, but what happens when your boss walks up and adds the oh-so-important requirement: “low volume isn’t a set number, it’s calculated from all of the other volumes”? That certainly throws a wrench into things.

Your first instinct might be to lump that functionality into the existing low-volume-accounts, but I implore you not to. Collecting data from the database in one-query-to-rule-them-all fashion is learned behavior (a.k.a. Stockholm syndrome) from our time with traditional RDBMSs. Unlike traditional databases, Datomic queries naturally (and quite often) decompose into multiple steps. This is no accident; it is by design.

So you write calculate-low-volume-limit, refactor low-volume-accounts into accounts-under-volume, and compose the two together like so:

(defn accounts-under-volume [limit]
  (d/q '[:find ?account
         :in $ ?limit
         :where [?account :sales/volume ?vol]
                [(<= ?vol ?limit)]]
       (d/db conn)
       limit))

(defn calculate-low-volume-limit []
  ...)

(defn low-volume-accounts []
  (let [volume-limit (calculate-low-volume-limit)
        account-ids (accounts-under-volume volume-limit)]
    (map #(d/entity (d/db conn) (first %)) account-ids)))

How about that composability! You stuck the two together and it does the thing. Well, that is to say, it does the thing most of the time.


Consistency

So how does low-volume-accounts fall apart? What happens when the actual database on the other end of conn gets updated by another client? It mutates. While successive (d/db conn) invocations can return different databases, any single value they return cannot change. Once you have it, it is yours and it will not change out from under you. It is immutable.

While low-volume-accounts seems to return accurate results, it’s bound to bite you eventually. You can almost guarantee it’s not going to be at an opportune time, either: under high load, during peak hours, when you know heads are going to roll.
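To make the race concrete, here’s a minimal sketch of how two snapshots taken from the same conn can disagree (the second client’s other-conn and its tx-data are hypothetical, not from this post’s codebase):

```clojure
;; Snapshot #1: an immutable value, frozen at this instant.
(def db-1 (d/db conn))

;; Meanwhile, some other client writes to the same database:
;; @(d/transact other-conn [{:db/id account-id :sales/volume 9000}])

;; Snapshot #2: a *different* immutable value.
(def db-2 (d/db conn))

;; Every query against db-1 answers as of snapshot #1, forever.
;; But a function that calls (d/db conn) more than once may silently
;; mix facts from db-1 and db-2 within a single "result".
```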

For that reason, you should always accept db as an argument. No ambient state. No conn. You’ll see this very same sensibility reflected in Datomic’s own APIs: almost every database function accepts db, except for the ones that write (transact, etc.).

With those sobering thoughts, you can re-write low-volume-accounts to query against a consistent, immutable database value:

(defn accounts-under-volume [db limit]
  (d/q '[:find ?account
         :in $ ?limit
         :where [?account :sales/volume ?vol]
                [(<= ?vol ?limit)]]
       db
       limit))

(defn calculate-low-volume-limit [db]
  ...)

(defn low-volume-accounts []
  (let [db (d/db conn) ;; The rubber has to hit the road somewhere...
        volume-limit (calculate-low-volume-limit db)
        account-ids (accounts-under-volume db volume-limit)]
    (map #(d/entity db (first %)) account-ids)))

All’s well, right?


Testability

Unfortunately, it isn’t. As you set out to test low-volume-accounts, you discover you’ll need to write a bunch of fixtures to set up and clean the database between test cases.


There is a better way. You may know Datomic can travel back in time, but did you know it can travel forward, too? The handy with function takes your boring old database, along with a vector of transaction data (commonly tx-data), and returns a new database value as if that transaction had actually happened. The catch is, it hasn’t; it just looks that way.
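Here’s a minimal sketch of d/with in action (the :sales/volume value is made up; it assumes the schema used throughout this post):

```clojure
;; d/with takes a database value and tx-data, and returns a map whose
;; :db-after entry is a new database value with the transaction
;; speculatively applied. Nothing is sent to the transactor.
(let [tx-data  [{:db/id (d/tempid :db.part/user)
                 :sales/volume 7}]
      db-after (:db-after (d/with (d/db conn) tx-data))]
  ;; db-after behaves as if the new account exists; the real database
  ;; on the other end of conn is untouched.
  (accounts-under-volume db-after 10))
```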

Here’s low-volume-accounts fully ready for time travel:

;; Where we're going, we don't need transactions...
(defn low-volume-accounts [db]
  (let [volume-limit (calculate-low-volume-limit db)
        account-ids (accounts-under-volume db volume-limit)]
    (map #(d/entity db (first %)) account-ids)))

Now, it’s easy enough to write tests that set up the database once. Where you used to transact relevant seed data or mutations for a test, you now use with to prepare a suitable reality. (As an aside, you’ll also want to separate the generation of tx-data from the transacting itself.)
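Separating the two might look like this sketch (add-account-tx is a hypothetical helper, not from this post’s codebase):

```clojure
;; Pure: builds tx-data and touches no connection. Easy to test on its own.
(defn add-account-tx [volume]
  [{:db/id (d/tempid :db.part/user)
    :sales/volume volume}])

;; In production, hand the tx-data to the transactor:
;;   @(d/transact conn (add-account-tx 42))
;; In tests, hand the very same tx-data to d/with instead:
;;   (:db-after (d/with (d/db conn) (add-account-tx 42)))
```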

Ultimately, you’ll end up with something like this:

(ns big.finance-test
  (:use big.finance clojure.test)
  (:require [datomic.api :as d]))

(use-fixtures :once transact-schema)

(defn base-db
  "Deferred so the schema fixture has run before we grab a database value."
  []
  (d/db conn))

(def seed-accounts
  [ ... ])

(defn add-accounts-tx
  "Produce transaction data to add a number of accounts to a database"
  [accounts]
  ;; Transform accounts into a sequence of entity-maps
  ;; [{:db/id -1001 :sales/volume 42} ...]
  ...)

(deftest low-volume-accounts-test
  (let [db (->> (add-accounts-tx seed-accounts)
                (d/with (base-db))
                :db-after)]
    (is (= ...
           (low-volume-accounts db)))))


TL;DR Always prefer db as an argument to your database functions rather than a conn (explicit or otherwise). When you do so, your functions will be more composable, consistent in the face of change, and easier to test.
