Datomic Antipatterns: Conn as a Value

  • clojure
  • datomic

One of the most common faux pas I see in the Datomic wilds is creating functions that eschew the database as a value (i.e. they don’t accept db as an argument) and instead treat the connection as a value. These functions accept conn as an argument or, worse, retrieve the connection from some global state.

The Trouble with Conn

At best, this practice can save you a bit of typing. At worst, it’ll cost you a lot of time. Connections as values mean your functions are hard to compose, hard to test, and, in the worst case, liable to produce inconsistent results.

Below is a very simple Datomic function that can be shown to exhibit all three of these bad behaviors.

Bad: Connection as a Value

(def conn (d/connect "datomic:free://localhost:4334/some-db"))

;; A typical *connection as a value*-like function.
;; This version pulls the conn from the ambient environment
(defn low-volume-accounts [limit]
  (d/q '[:find ?account
         :in $ ?limit
         :where [?account :sales/volume ?vol]
                [(<= ?vol ?limit)]]
       (d/db conn)
       limit))

Let’s illustrate problem-by-problem where connection as a value falls flat and show how database as a value works better.

Composability

low-volume-accounts is a perfectly fine function, but what happens when your boss walks up and adds the oh-so-important requirement: “low volume isn’t a set number, it’s calculated from all of the other volumes.” That certainly throws a wrench into things.

Your first instinct might be to lump that functionality into the existing low-volume-accounts, but I implore you not to. Collecting data from the database in one-query-to-rule-them-all fashion is learned behavior (a.k.a. Stockholm Syndrome) from our time with traditional RDBMSs. Unlike traditional databases, Datomic queries naturally (and quite often) decompose into multiple steps. This is no accident; this is by design.

So you write calculate-low-volume-limit, refactor low-volume-accounts into accounts-under-volume, and compose the two together like so:

(defn accounts-under-volume [limit]
  (d/q '[:find ?account
         :in $ ?limit
         :where [?account :sales/volume ?vol]
                [(<= ?vol ?limit)]]
       (d/db conn)
       limit))

(defn calculate-low-volume-limit []
  ...)

(defn low-volume-accounts []
  (let [volume-limit (calculate-low-volume-limit)
        account-ids (accounts-under-volume volume-limit)]
    (map #(d/entity (d/db conn) %) account-ids)))

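calculate-low-volume-limit is left blank above; purely for illustration, here’s one hypothetical way it might derive the limit from the other volumes (the 25%-of-average threshold is an assumption, and it still leans on the ambient conn like the rest of this version):

;; A sketch only: treat "low volume" as 25% of the average volume.
(defn calculate-low-volume-limit []
  (let [volumes (d/q '[:find [?vol ...]
                       :where [_ :sales/volume ?vol]]
                     (d/db conn))
        average (if (seq volumes)
                  (/ (reduce + volumes) (count volumes))
                  0)]
    (long (* 0.25 average))))
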
How about that composability! You stuck the two together and it does the thing. Well, that is to say, it does the thing most of the time.

Consistency

So how does low-volume-accounts fall apart? What happens when the actual database on the other end of conn gets updated by another client? It mutates. But while successive (d/db conn) invocations may return different databases, the value returned by any one of them cannot change. Once you have it, it is yours and it will not change out from under you. It is immutable.
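
To see that immutability at the REPL, here’s a small sketch (it assumes the conn from earlier and a :sales/volume attribute already installed; the tempid style is just one option):

(let [db-before (d/db conn)]
  ;; Simulate another client writing to the same database.
  @(d/transact conn [{:db/id (d/tempid :db.part/user)
                      :sales/volume 7}])
  ;; The connection now hands out a newer database value,
  ;; but the value we already captured hasn't moved.
  (println "current t: " (d/basis-t (d/db conn)))  ; a later basis t
  (println "captured t:" (d/basis-t db-before)))   ; unchanged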

While low-volume-accounts seems to return accurate results, notice that it grabs (d/db conn) twice: once inside accounts-under-volume and again when it maps d/entity over the results. If another client transacts in between, those two calls see different databases and your results no longer line up. It’s bound to bite you eventually, and you can almost guarantee it won’t be at an opportune time: under high load, during peak hours, right when you know heads are going to roll.

For that reason, you should always accept db as an argument. No ambient state. No conn. You’ll see this very same sensibility reflected in Datomic’s own APIs: almost every database function accepts db, except for the ones that write (transact, etc.).
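
A quick, illustrative sampling of that split (db, account-id and tx-data here are placeholders, not bindings from this post):

;; Reads take a database value:
(d/q '[:find ?e :where [?e :sales/volume]] db)
(d/entity db account-id)
(d/pull db '[*] account-id)
(d/as-of db #inst "2015-01-01")

;; Writes (and only writes) take the connection:
@(d/transact conn tx-data)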

With those sobering thoughts, you can re-write low-volume-accounts to query against a consistent, immutable database value:

(defn accounts-under-volume [db limit]
  (d/q '[:find ?account
         :in $ ?limit
         :where [?account :sales/volume ?vol]
                [(<= ?vol ?limit)]]
       db
       limit))

(defn calculate-low-volume-limit [db]
  ...)

(defn low-volume-accounts []
  (let [db (d/db conn) ;; The rubber has to hit the road somewhere...
        volume-limit (calculate-low-volume-limit db)
        account-ids (accounts-under-volume db volume-limit)]
    (map #(d/entity db %) account-ids)))

All’s well, right?

Testability

Unfortunately it isn’t. As you set out to test low-volume-accounts, you discover you’ll need to write a bunch of fixtures to set up and clean the database between test cases.

Stop!

There is a better way. You may know Datomic can travel back in time, but did you know it can travel forward too? The handy with function takes your boring, old database along with a vector of transaction data (commonly tx-data) and returns a map whose :db-after key is a new database value, as if that transaction had actually happened. The catch is, it hasn’t; it just looks that way.
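
Here’s a minimal sketch of with in action (the :sales/volume datum is made up; nothing is written to storage):

(let [result (d/with (d/db conn)
                     [{:db/id (d/tempid :db.part/user)
                       :sales/volume 3}])]
  ;; result contains :db-before, :db-after, :tx-data and :tempids.
  ;; :db-after is a database value that already "includes" the new datum,
  ;; even though nothing was transacted against conn.
  (:db-after result))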

Here’s low-volume-accounts fully ready for time travel:

;; Where we're going, we don't need transactions...
(defn low-volume-accounts [db]
  (let [volume-limit (calculate-low-volume-limit db)
        account-ids (accounts-under-volume db volume-limit)]
    (map #(d/entity db %) account-ids)))

Now, it’s easy enough to write tests that set up the database once. Where you used to transact relevant seed data or mutations for a test, you now use with to prepare a suitable reality. (As an aside, you’ll also want to separate the generation of tx-data from the transacting itself.)

Ultimately, you’ll end up with something like this:

(ns big.finance-test
  (:use clojure.test
        big.finance)
  (:require [datomic.api :as d]))

(use-fixtures :once transact-schema)

(def seed-accounts
  [ ... ])

(defn add-accounts-tx
  "Produce transaction data to add a number of accounts to a database"
  [accounts]
  ;; Transform accounts into a sequence of entity-maps
  ;; [{:db/id -1001 :sales/volume 42} ...]
  )

(deftest low-volume-accounts-test
  (let [base-db (d/db conn) ;; grab the db after the schema fixture has run
        db (->> (add-accounts-tx seed-accounts)
                (d/with base-db)
                :db-after)]
    (is (= ...
           (low-volume-accounts db)))))

Summary

TL;DR Always prefer db as an argument to your database functions rather than a conn (explicit or otherwise). When you do so, your functions will be more composable, consistent in the face of change, and easier to test.
