Precheck Assertions for SQL Tests

Goal: Guard against "errors of equality" where your test passes, but not for the reasons you thought it would.

Dall-e rendering of SQL Testing Framekwork

Setup

It's the end of the quarter at StreamBuster, the buzzy tech startup that's disrupting the streaming video market. At StreamBuster we believe that there's a billion dollar market for the game-changing convenience that comes from driving to a store, picking up a physical disk (if there's a good one in stock anyway) and driving home with it to enjoy with friends and family.Think of the thrill of the hunt, when a movie you want to watch is in stock. Think about how much accomplishment you'll feel about having "brought home the bacon" to an adoring spouse and children. Streaming is dead, people want StreamBuster stores.

The executive team has decided that a friendly competition between stores would motivate the teams and increase sales. You've been asked to create a report that shows sales by store by month.

See the SQL Testing Framework 101 tutorial for database setup instructions to follow along on your computer.

Get to work

The query is simple enough:

SELECT
   store.store_id
  ,DATE_PART('month', payment_date) AS payment_month
  ,SUM(payment.amount) AS total_payment
FROM store
JOIN staff USING(store_id)
JOIN payment USING(staff_id)
GROUP BY store_id, DATE_PART('month', payment_date)
;

Here we make use of the USING keyword to keep the query short and easy to read.

With a query that does a roll-up like this, rolling stores and dates up into an aggregate, it's usually a good idea to spot-check a few to make sure that the roll-up is right. This is straightforward with SQL Testing Framework:

\set query /queries/sales-by-store.sql
:setup_test
WITH text AS (
  SELECT 'Store #3 has the right number of sales' AS value
), expect AS (
  SELECT
    SUM(payment.amount) AS value
  FROM store
  JOIN staff USING(store_id)
  JOIN payment USING(staff_id)
  WHERE store_id = 3
    AND DATE_PART('month', payment_date) = 3
), actual AS (
  SELECT
    total_payment AS value
  FROM "/queries/sales-by-store.sql"
  WHERE store_id = 3
    AND payment_month = 3
)
:evaluate_test
:cleanup_test

We're just checking to make sure that the roll up for Store #3 in March is correct. The test runner gives us a passing test:

 actual | expect | did_pass |                  text
--------+--------+----------+----------------------------------------
        |        | true     | Store #1 has the right number of sales
(1 row)

Wait a minute, the runner says our test passed, but why are actual and expect both empty? That's not what we were expecting!

Precheck Assertions

Upon closer inspection, we find that there is no Store #3.

postgres=# SELECT * FROM store;
 store_id | manager_staff_id | address_id |     last_update
----------+------------------+------------+---------------------
        1 |                1 |          1 | 2006-02-15 09:57:12
        2 |                2 |          2 | 2006-02-15 09:57:12
(2 rows)

This "error of equality" is a common source of test-related errors. Our test was technically correct, but because of the way we wrote it, our intent wasn't actually tested.

SQL Testing Framework has a "Precheck Assertions" feature to help guard against this kind of error. We need to test that the expect and actual are equal, and also that they contain reasonable values.

Using the prechecks temporary table, we can easily check for errors of equality:

\set query /queries/sales-by-store.sql
:setup_test
WITH text AS (
  SELECT 'Store #3 has the right number of sales' AS value
), expect AS (
  SELECT
    SUM(payment.amount) AS value
  FROM store
  JOIN staff USING(store_id)
  JOIN payment USING(staff_id)
  WHERE store_id = 3
    AND DATE_PART('month', payment_date) = 3
), precheck_not_blank AS (
  INSERT INTO :"prechecks" (value)
  SELECT COALESCE(SUM(value) > 0, FALSE)
  FROM expect
), actual AS (
  SELECT
    total_payment AS value
  FROM "/queries/sales-by-store.sql"
  WHERE store_id = 3
    AND payment_month = 3
)
:evaluate_test
:cleanup_test

The new stuff is in the precheck_not_blank common table expression.

, precheck_not_blank AS (
  INSERT INTO :"prechecks" (value)
  SELECT COALESCE(SUM(value) > 0, FALSE)
  FROM expect

The prechecks table has one BOOLEAN column that the test runner will inspect. Now our test runner output looks like this:

 actual | expect | did_pass |                  text
--------+--------+----------+----------------------------------------
        |        | f-pre    | Store #3 has the right number of sales
(1 row)

In the did_pass column, the value of f-pre tells us that the test failed because of a failed precheck assertion.

Let's fix our test, but using Store #1 as the sample, and run again:

\set query /queries/sales-by-store.sql
:setup_test
WITH text AS (
  SELECT 'Store #1 has the right number of sales' AS value
), expect AS (
  SELECT
    SUM(payment.amount) AS value
  FROM store
  JOIN staff USING(store_id)
  JOIN payment USING(staff_id)
  WHERE store_id = 1
    AND DATE_PART('month', payment_date) = 3
), precheck_not_blank AS (
  INSERT INTO :"prechecks" (value)
  SELECT COALESCE(SUM(value) > 0, FALSE)
  FROM expect
), actual AS (
  SELECT
    total_payment AS value
  FROM "/queries/sales-by-store.sql"
  WHERE store_id = 1
    AND payment_month = 3
)
:evaluate_test
:cleanup_test
 actual  |  expect  | did_pass |                  text
----------+----------+----------+----------------------------------------
 11776.83 | 11776.83 | true     | Store #1 has the right number of sales
(1 row)

Much better, now our test is passing, and it's passing for the right reasons- because Store #1 exists, has sales, and has the right amount of sales.

Even More Precheck Assertions

We can add as many precheck assertions as we like. Unless they are all TRUE the test will be marked as f-pre. Here's an example with two precheck assertions:

\set query /queries/sales-by-store.sql
:setup_test
WITH text AS (
  SELECT 'Store #1 has the right number of sales' AS value
), expect AS (
  SELECT
    SUM(payment.amount) AS value
  FROM store
  JOIN staff USING(store_id)
  JOIN payment USING(staff_id)
  WHERE store_id = 1
    AND DATE_PART('month', payment_date) = 3
), precheck_not_blank AS (
  INSERT INTO :"prechecks" (value)
  SELECT COALESCE(SUM(value) > 0, FALSE)
  FROM expect
), precheck_that_always_fails AS (
 INSERT INTO :"prechecks" (value)
 SELECT FALSE
), actual AS (
  SELECT
    total_payment AS value
  FROM "/queries/sales-by-store.sql"
  WHERE store_id = 1
    AND payment_month = 3
)
:evaluate_test
:cleanup_test

Of course this a contrived example, you'd never actually want a precheck assertion that always fails. For demonstration purposes, our test runner output is failing again:

 actual  |  expect  | did_pass |                  text
----------+----------+----------+----------------------------------------
 11776.83 | 11776.83 | f-pre    | Store #1 has the right number of sales
(1 row)

If you have lots of precheck assertions and need to know which one specifically failed, the result of each precheck is available in the test_results table:

postgres=# SELECT
   id
  ,precheck
  ,filename
  ,text
FROM stf.test_results
;

 id | precheck |           filename           |                        text
----+----------+------------------------------+-----------------------------------------------------
 31 | {}       | sales-by-store.sql           | Store #3 has the right number of sales
 32 | {f}      | sales-by-store.sql           | Store #3 has the right number of sales
 33 | {t}      | sales-by-store.sql           | Store #1 has the right number of sales
 34 | {f,t}    | sales-by-store.sql           | Store #1 has the right number of sales
(34 rows)

Once again, SQL Testing Framework has saved us from making a silly mistake!

The code for this example is available on Github

Up next: testing for skew in the data. Learn how