The much delayed followup to last year’s post Vector denormalisation in PostgreSQL. I’ve been wrestling with rewrite rules and triggers, and discovering it’s not as straightforward as I first thought.
In the previous post, I explained how to improve the performance of simple vector data, using a little denormalisation. Here I discuss how to write a normalised interface to the vector data, using views.
View for horizontal vectorisation
Recall that the horizontal vectorisation scheme — in which each vector has elements for a fixed set of positions — appeared to perform better for most queries on the sensor data. Our vector data has been denormalised into a form where each tuple contains the values for all sensors at a moment in time:
CREATE TABLE hlog ( time TIMESTAMP NOT NULL, values DOUBLE PRECISION NOT NULL, CONSTRAINT hlog_pk PRIMARY KEY (time) );
This schema has been denormalised as a performance optimisation, which means it’s no longer the natural, normalised schema for that data. For general query purposes it can be convenient to hide this denormalisation, and present the data in its proper normalised form. This is easily done with a view:
CREATE VIEW log AS SELECT time, i AS sensor_id, values[i] AS value FROM hlog, generate_series(1,36) AS gs(i) WHERE values[i] IS NOT NULL;
Updating the view
We can also hide the denormalised form when updating the data. There are two possible approaches: using rewrite rules, and using triggers. To fully support updates, our system needs to support several cases:
- Adding an element to an existing vector.
- Creating a new vector and adding one element to it.
- Attempting to add an element that already exists, and getting a unique constraint exception.
- Updating a single element to a new value.
- Updating an element with a new sensor_id (i.e. a different position in the vector).
- Updating an element with a new time (i.e. a different vector, which may or may not already exist).
- Deleting an element.
- Deleting or updating a non-existent element, and seeing no change.
- Deleting the last element in a vector, and having the empty vector be automatically removed.
It also needs to handle queries that perform modifications to multiple rows in the view, with the same requirements for every row changed.
Rewrite rules translate INSERT/UPDATE/DELETE queries into other queries. One of their uses is to let operations on views update the data in the underlying tables. As it turns out, pure rewrite rules are not sufficient to satisfy all the above requirements for updates on the log view. I’ll present some partly-adequate rules, and explain their shortcomings later.
The first rule is INSERT, and it needs to do two things: insert a tuple into hlog if no matching tuple for that time exists yet; and update the specific element of hlog.values to the new value. Ideally, it should also replicate the primary key checking that would be in force on an ordinary table, by checking for duplicates.
CREATE OR REPLACE RULE log_ins AS ON INSERT TO log DO INSTEAD ( -- Try to insert if the element already exists; this will synthesise a uniqueness constraint error. INSERT INTO hlog ("time", "values") SELECT DISTINCT new."time", ARRAY::double precision WHERE EXISTS (SELECT * FROM hlog WHERE hlog."time" = new."time" AND hlog."values"[new.sensor_id] IS NOT NULL); -- Insert a new row if it doesn't exist. INSERT INTO hlog ("time", "values") SELECT DISTINCT new."time", ARRAY::double precision WHERE NOT EXISTS ( SELECT * FROM hlog WHERE hlog."time" = new."time"); -- Update the element on the row. UPDATE hlog SET "values"[gs.i] = new.value FROM generate_series(1, 36) gs(i) WHERE hlog."time" = new."time" AND new.sensor_id = gs.i; );
Next is DELETE, which should set the element to NULL, if it exists. As an optimisation, it can remove any tuple containing all NULLs.
CREATE OR REPLACE RULE log_del AS ON DELETE TO log DO INSTEAD ( -- Update the element to the null VALUE, if it exists. UPDATE hlog SET "values"[old.sensor_id] = NULL::double precision WHERE hlog."time" = old."time" AND hlog."values"[old.sensor_id] IS NOT NULL; -- Delete the row if it is completely empty (i.e. all NULL). DELETE FROM hlog WHERE hlog."time" = old."time" AND TRUE = ALL (SELECT unnest(hlog."values") IS NULL)); ;
Finally, the UPDATE rule should remove the element if it exists, and then insert it again separately. The trick is that if time is one of the fields being altered, then separate tuples are affected in each step. It is possible to add enough statements to the rule to cover these cases. But that duplicates a lot of statements from the INSERT and DELETE rules; why don’t we just issue INSERT and DELETE against the view and rely on the rules engine to rewrite that into the appropriate statements from those rules:
CREATE OR REPLACE RULE log_upd AS ON UPDATE TO log DO INSTEAD ( DELETE FROM log WHERE log."time" = old."time" AND log.sensor_id = old.sensor_id AND log.value = old.value; INSERT INTO log ("time", sensor_id, value) VALUES (new."time", new.sensor_id, new.value); );
Such update rules can work on simple views that represent straightforward relational operations on their underlying tables. But our view is not one of those: in particular, multiple rows of the view log can map to a single row in the underlying table hlog. This causes problems for the rewrite rules.
An update query pulls a list of rows to update from the view, and then updates each one in sequence. In our case, the same underlying row may need to be updated many times, if multiple elements in it are changed by the query. But this is not supported in a single query execution: each row is updated once at most. (Under the hood, a row modification is done with respect to the snapshot taken immediately prior to that query; if the query has already modified the row, the snapshot version will again be used as the base for subsequent modifications in the query.)
With the above rules, if we attempt to update many elements, we’ll see that only one change has taken effect.
Triggers can be set up on tables and views, and are called when certain operations, such as INSERT, UPDATE, and DELETE are applied to the table. For views, we can use triggers for these operations to update the underlying tables.
-- Trigger function for inserts, updates and deletes on the view log. CREATE OR REPLACE FUNCTION log_trig() RETURNS TRIGGER LANGUAGE 'plpgsql' AS $$ BEGIN -- Check for inappropriate use. IF TG_TABLE_NAME != 'log' OR TG_OP NOT IN ('INSERT', 'UPDATE', 'DELETE') THEN RAISE NOTICE 'trigger % unexpectedly called with % operation and % relation!', TG_NAME, TG_OP, TG_TABLE_NAME; END IF; IF TG_OP IN ('UPDATE', 'DELETE') THEN -- If the row and element does not exist, change nothing and return NULL. IF NOT EXISTS (SELECT * FROM hlog WHERE hlog."time" = old."time" AND hlog."values"[old.sensor_id] IS NOT NULL) THEN RETURN NULL; END IF; -- Update the element on the row. UPDATE hlog SET "values"[old.sensor_id] = NULL WHERE hlog."time" = old."time"; -- Delete the row if it's now empty. DELETE FROM hlog WHERE hlog."time" = old."time" AND TRUE = ALL (SELECT unnest(hlog."values") IS NULL); END IF; IF TG_OP IN ('INSERT', 'UPDATE') THEN -- If the sensor_id is out of range (1..36), raise an exception. IF NEW.sensor_id NOT BETWEEN 1 AND 36 THEN RAISE numeric_value_out_of_range USING HINT = 'sensor_id must be between 1 and 36.'; END IF; -- If the row and element already exists, then raise a unique violation exception. IF EXISTS (SELECT * FROM hlog WHERE hlog."time" = new."time" AND hlog."values"[new.sensor_id] IS NOT NULL) THEN RAISE unique_violation USING DETAIL = 'Key (time, sensor_id) = (' || NEW."time" || ', ' || NEW.sensor_id || ') already exists.'; END IF; -- Insert a new row if it doesn't exist. INSERT INTO hlog ("time", "values") SELECT new."time", ARRAY::double precision WHERE NOT EXISTS (SELECT * FROM hlog WHERE hlog."time" = new."time"); -- Update the element on the row. UPDATE hlog SET "values"[new.sensor_id] = new.value WHERE hlog."time" = new."time"; RETURN NEW; ELSE -- Op was DELETE. RETURN OLD; END IF; END; $$; -- Create the a row-level trigger for all INSERT, UPDATE and DELETE events on the view. CREATE TRIGGER log_trg_iud INSTEAD OF INSERT OR UPDATE OR DELETE ON log FOR EACH ROW EXECUTE PROCEDURE log_trig();
Here, we have implemented the trigger in two parts, deletion and insertion. UPDATE statements involve both parts. In each part, the appropriate checks are made (that data exists to be changed; or that data does not exist and can be safely inserted, and the incoming data is in range). Updates are handled slightly inefficiently; the code assumes a worst-case update, of the last element in a row being moved to a new, non-extant row. An optimisation could check that the update’s time field is unchanged, and then simply update a row once, instead of updating it to remove the element and then updating it to insert it.
Using rules and triggers to implement traditional modifiability at the row level has a significant performance cost. In both cases, every single update requires multiple queries to be run. In the case of rules, this includes a largely redundant query on the original view, which is joined with a similar query to determine which rows in the underlying table to update. With triggers, the actual queries run are more direct, but have to be done separately for every single row changed.
For practical use, perhaps it is better to acknowledge the denormalised table and write a loader function that efficiently populates the underlying tables directly.
View for vertical vectorisation
What if the data is vectorised using the vertical scheme? The data is denormalised into vectors by minute. The underlying table can be populated using the denorm aggregate function:
CREATE TABLE vlog ( first_time TIMESTAMP NOT NULL, sensor_id INTEGER NOT NULL, values DOUBLE PRECISION NOT NULL, CONSTRAINT vlog_pk PRIMARY KEY (first_time, sensor_id) ); INSERT INTO vlog SELECT date_trunc('minute', time), sensor_id, denorm(extract('seconds' FROM time - date_trunc('minute', time))::integer, value) FROM original_log GROUP BY 1, 2;
And the renormalised view definition is:
CREATE VIEW log AS SELECT first_time + interval '1 second' * i AS time, sensor_id, values[i] AS value FROM vlog, generate_series(0,59) AS gs(i) WHERE values[i] IS NOT NULL;
Similar rules and triggers (with the same problems for rules) remain. The mapping from the view to the underlying table is different.
- Changes to the sensor_id field now affect two rows in the underlying table.
- The time field is mapped differently: part of it is used to identify the underlying row, and the remaining part is used to identify the element with that row’s values array. In the trigger, there is scope for a single-row update optimisation where the time field changes only in the second part.
The rule and trigger definitions can be adapted from those used in horizontal vectorisation above. Determining the underlying row to be updates for a given time can be done using the
An index trick
A remaining issue is: can a query on this view use the index on log.first_time? Unfortunately a query on this basic view can’t use that index, because the search expression depends on
gs.i. We can’t even use a function index, because
i comes from the other (generated) table. The query for selecting an arbitrary time requires a full scan:
EXPLAIN ANALYZE SELECT * FROM log WHERE time = '2011-01-01 16:57:35'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.00..5020793.00 rows=829333 width=527) (actual time=9716.074..13008.464 rows=100 loops=1) Join Filter: ((vlog."values"[gs.i] IS NOT NULL) AND ((vlog.first_time + ('00:00:01'::interval * (gs.i)::double precision)) = '2011-01-01 16:57:35'::timestamp without time zone)) -> Seq Scan on vlog (cost=0.00..13573.00 rows=166700 width=523) (actual time=0.007..3593.088 rows=166700 loops=1) -> Function Scan on generate_series gs (cost=0.00..10.00 rows=1000 width=4) (actual time=0.000..0.013 rows=60 loops=166700) Total runtime: 13008.527 ms (5 rows)
There are ways to improve this. The first is to add first_time as a column to the view, and allow the query to use that for filtering by time. The down side is that this reveals a detail from the denormalisation, which should be irrelevant to queries on the view.
There is another method that preserves the nice, simple, normalised properties of the view. We can maintain a table vlog_time containing every distinct time, and rewrite the view as:
CREATE VIEW log_fast AS SELECT time, sensor_id, values[i] FROM vlog, generate_series(0,59) AS gs(i), vlog_time AS lt WHERE values[i] IS NOT NULL AND date_trunc('minute', lt.time) = first_time AND extract('second' from (lt.time - first_time)) = i;
A query such as
SELECT * FROM log WHERE time BETWEEN '1:00:00' AND '2:00:00' can use an index on
vlog_time.time to select a subset of that table, which is then joined to log using the index on log.first_time. The join condition
date_trunc('minute', lt.time) = first_time is an indexable expression on one side, but on the other side it is an indexed field in vlog.
EXPLAIN ANALYZE SELECT * FROM log_fast WHERE time = '2011-01-01 16:57:35'; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------- Hash Join (cost=198.25..265.75 rows=497 width=527) (actual time=197.254..197.449 rows=100 loops=1) Hash Cond: ((gs.i)::double precision = date_part('second'::text, (lt."time" - vlog.first_time))) Join Filter: (vlog."values"[gs.i] IS NOT NULL) -> Function Scan on generate_series gs (cost=0.00..10.00 rows=1000 width=4) (actual time=0.019..0.037 rows=60 loops=1) -> Hash (cost=197.00..197.00 rows=100 width=531) (actual time=197.159..197.159 rows=100 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 55kB -> Nested Loop (cost=0.00..197.00 rows=100 width=531) (actual time=166.383..196.907 rows=100 loops=1) -> Index Scan using vlog_time_pkey on vlog_time lt (cost=0.00..8.28 rows=1 width=8) (actual time=0.028..0.030 rows=1 loops=1) Index Cond: ("time" = '2011-01-01 16:57:35'::timestamp without time zone) -> Index Scan using vlog_pk on vlog (cost=0.00..187.22 rows=100 width=523) (actual time=166.328..196.780 rows=100 loops=1) Index Cond: (first_time = date_trunc('minute'::text, lt."time")) Total runtime: 197.531 ms (12 rows)
The query plan may be messier but it is a lot faster.