We have a new way to understand the performance of views, indexes, and materialized views in Materialize. By mapping runtime data about low-level dataflow operators up to a sensible intermediate language, you’ll be better able to identify and refine computationally expensive parts of your queries.
Databases have a lot of `EXPLAIN`ing to do
Databases typically offer some way to understand how queries run. Postgres, for example, has the `EXPLAIN` statement; running `EXPLAIN ... query ...` presents the user with a summary of how the plan will be run (what kind of joins, etc.) along with an estimate of the cost.1 Postgres’s `EXPLAIN ANALYZE` statement does one better: it actually runs the query, collecting information as it goes; when the query terminates, it displays the plan, annotated with runtime data like memory used and time spent in each part of the plan.
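For example, in Postgres you might profile an aggregation like this (a minimal sketch; the `orders` table here is hypothetical):
-- run the query and report per-node timing and row counts
EXPLAIN ANALYZE
SELECT customer_id, COUNT(*)
FROM orders
GROUP BY customer_id;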
Materialize has been able to `EXPLAIN` queries for a long time, but adding in runtime feedback is harder. What would it mean to `EXPLAIN ANALYZE CREATE INDEX ...`? When should Materialize stop reporting information? Indexes in Materialize don’t have an ‘end time’! What if you want information about an index that’s already running?
We’ve implemented a new way to glean insights into how your indexes and materialized views are running. To understand how to use it, let’s take a quick detour through how Materialize compiles your SQL queries down to dataflows.
Materialize’s compilation pipeline
Materialize compiles SQL through a series of intermediate languages: a high-level intermediate language (HIR), a mid-level intermediate language (MIR), and a low-level intermediate language (LIR). A SQL query is translated to an HIR query, which is then translated into one or more MIR queries. Our optimizer does the bulk of its decision making in MIR: planning joins, removing redundancies, and identifying patterns Materialize can run particularly effectively. The compiler then lowers MIR into LIR, our final intermediate representation. LIR is abstract enough to still count as a ‘high-level’ plan, but low-level enough to explicitly map out all of the details of the plan: how to aggregate, which indexes to use, etc. Having fixed the plan in LIR, translating it to dataflows---the actual runtime engine of Materialize---is straightforward.
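You can peek at these stages yourself with `EXPLAIN`. As a rough sketch: Materialize’s `RAW PLAN` variant corresponds roughly to HIR, `OPTIMIZED PLAN` to MIR, and `PHYSICAL PLAN` to LIR.
-- peek at the compilation pipeline for a trivial query
EXPLAIN RAW PLAN FOR SELECT 1;
EXPLAIN OPTIMIZED PLAN FOR SELECT 1;
EXPLAIN PHYSICAL PLAN FOR SELECT 1;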
Materialize already tracks a variety of runtime information in the `mz_introspection` schema. But this runtime information is attributed to the dataflow operators that Materialize actually runs. After running our compiler, these dataflow operators don’t look anything like the original SQL query! A SQL query might have hundreds of dataflow operators, and it takes real expertise to know which operators correspond to which parts of the query.
To bridge that expertise gap, we’ve created a source map, called `mz_introspection.mz_lir_mapping`. We map ranges of dataflow operators up to LIR operators---the fixed plans that are the last intermediate representation in our compiler. Using `mz_lir_mapping`, you can relate performance statistics---like total computation time and memory usage---to a high-level representation, as seen in our `EXPLAIN` or Postgres’s `EXPLAIN ANALYZE`.
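Before we put it to work, it can help to see the relation’s shape for yourself; one easy way (a sketch, nothing more) is to peek at a single row:
-- inspect mz_lir_mapping's columns: global_id, lir_id, operator,
-- parent_lir_id, nesting, operator_id_start, operator_id_end
SELECT * FROM mz_introspection.mz_lir_mapping LIMIT 1;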
Mapping dataflow metrics up to LIR
It’s easiest to get a feel for what `mz_lir_mapping` does for you by example. Let’s start by generating a sample database tracking which customers bought which products.
CREATE TABLE sales (product_id BIGINT NOT NULL, customer_id BIGINT NOT NULL);
CREATE INDEX idx_sales_by_product ON sales(product_id);
-- generates 100 products and 10k customers, each buying 16 products
INSERT INTO sales (product_id, customer_id)
WITH
product_seed(salt) AS
(VALUES ('abc'), ('def'), ('ghi'), ('jkl'), ('mno'), ('qrs'), ('tuv'), ('wxy'),
('zAB'), ('CDE'), ('FGH'), ('IJK'), ('LMN'), ('OPQ'), ('RST'), ('UVW'))
SELECT seahash(product_seed.salt || customer_id::text) % 100 AS product_id,
customer_id
FROM generate_series(1, 10000) AS customer_id
CROSS JOIN product_seed;
Since Materialize is deterministic, we’ll generate random-feeling data deterministically, using a hash with salt.2
The resulting distribution of purchases across products is fairly uniform:
SELECT MIN(count), AVG(count), ROUND(STDDEV(count), 1) AS stddev, MAX(count)
FROM (SELECT COUNT(product_id) AS count
FROM sales
GROUP BY product_id
ORDER BY count DESC);
| min | avg | stddev | max |
|------|------|--------|------|
| 1485 | 1600 | 41.2 | 1698 |
With our toy database populated, let’s explore how an analytics query performs: who are the top 5 buyers of the top 5 products? First, let’s create some views: one for the top 5 products, one to count purchases of those popular products, and one for the top 5 buyers of each of those top 5 products.
-- top 5 most popular products
CREATE VIEW popular_products(product_id) AS
SELECT product_id
FROM sales
GROUP BY product_id
ORDER BY COUNT(product_id) DESC
LIMIT 5;
-- number of purchases of each popular product, per customer
CREATE VIEW popular_purchases(product_id, customer_id, count) AS
SELECT product_id, customer_id, COUNT(customer_id)
FROM popular_products
JOIN sales
USING (product_id)
GROUP BY product_id, customer_id;
-- top 5 buyers of each of the top five popular products
CREATE VIEW top_buyers(customer_id, product_id, count) AS
SELECT customer_id, product_id, count
FROM (SELECT DISTINCT product_id FROM popular_purchases) products,
LATERAL ( SELECT customer_id, count FROM popular_purchases
WHERE product_id = products.product_id
ORDER BY count DESC
LIMIT 5)
ORDER BY product_id, count DESC;
Having defined these views, let’s index `top_buyers` by `product_id`, so our analytics dashboard can quickly look up who the top buyers of our top products are.
CREATE INDEX idx_top_buyers ON top_buyers(product_id);
Now that we have the `idx_top_buyers` index, let’s understand its performance using the `mz_lir_mapping` source mapping.
Attributing memory usage
Materialize’s incremental view maintenance trades space for time: we’re able to give consistent, up-to-the-~~minute~~ second (!) answers by caching appropriately. Since caching uses memory, if we’re optimizing a query, then managing memory is the name of the game. Let’s combine the new source mapping `mz_introspection.mz_lir_mapping` (how is our query implemented?) with the metrics data in `mz_introspection.mz_arrangement_sizes` (how much memory are we using?) and the catalog data in `mz_catalog.mz_objects` (what’s defined?) to see how much memory each operator is consuming:
-- attribute memory usage
SELECT mo.name AS name, global_id, lir_id, parent_lir_id,
REPEAT(' ', nesting * 2) || operator AS operator,
pg_size_pretty(SUM(size)) AS size
FROM mz_introspection.mz_lir_mapping mlm
LEFT JOIN mz_introspection.mz_arrangement_sizes mas
ON ( mlm.operator_id_start <= mas.operator_id
AND mas.operator_id < mlm.operator_id_end)
JOIN mz_catalog.mz_objects mo
ON (mlm.global_id = mo.id)
WHERE mo.name LIKE '%top_buyers%'
GROUP BY mo.name, global_id, lir_id, operator, parent_lir_id, nesting
ORDER BY global_id, lir_id DESC;
| name | global_id | lir_id | parent_lir_id | operator | size |
|------|-----------|--------|---------------|----------|------|
| top_buyers | u195 | 11 | null | TopK::Basic 10 | 1011 kB |
| top_buyers | u195 | 10 | 11 | MapFilterProject 9 | null |
| top_buyers | u195 | 9 | 10 | Reduce::Accumulable 8 | 316 kB |
| top_buyers | u195 | 8 | 9 | Join::Differential 6 » 7 | null |
| top_buyers | u195 | 7 | 8 | Get::PassArrangements u191 | null |
| top_buyers | u195 | 6 | 8 | Arrange 5 | 1738 bytes |
| top_buyers | u195 | 5 | 6 | MapFilterProject 4 | null |
| top_buyers | u195 | 4 | 5 | TopK::Basic 3 | 36 kB |
| top_buyers | u195 | 3 | 4 | Arrange 2 | null |
| top_buyers | u195 | 2 | 3 | Reduce::Accumulable 1 | 20 kB |
| top_buyers | u195 | 1 | 2 | Get::Arrangement u191 | null |
| idx_top_buyers | u196 | 13 | null | Arrange 12 | 1882 bytes |
| idx_top_buyers | u196 | 12 | 13 | Get::PassArrangements u195 | null |
The results show information for two high-level objects: the `top_buyers` view and the `idx_top_buyers` index. Each of these has a `global_id`---an internal identifier for that object. Each object has several LIR operators, with most of the operators living in the view.
Each LIR operator represents a high-level part of the plan; you can read more about it in our `EXPLAIN` docs. Operators form a tree, like expressions in any programming language. We’ve used `mz_lir_mapping`’s `nesting` field to indent the operators nicely; ordering by `lir_id` descending puts the operators in the correct order.3
In `mz_lir_mapping`, these operators are always a single line, of the form `OperatorName c1 c2 ...`, where each `ci` is a “child ID”---the `lir_id` of one of the inputs of that operator. So `TopK::Basic 10` indicates a `TopK` operator that reads from the operator with `lir_id` 10 (which, in our example, is a `MapFilterProject`). The `Join::Differential 6 » 7` line indicates a differential join of the inputs with `lir_id`s 6 and 7.
The `size` column sums up the sizes in `mz_arrangement_sizes` for every dataflow operator used to implement a given LIR operator. Looking at the `size` column above, it seems that the outermost `TopK` is the expensive one.
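If you only want a headline number per object, you can roll the same join all the way up (this is just a variant of the query above, not a separate feature):
-- total arrangement memory per object
SELECT mo.name AS name, pg_size_pretty(SUM(size)) AS total_size
FROM mz_introspection.mz_lir_mapping mlm
LEFT JOIN mz_introspection.mz_arrangement_sizes mas
     ON ( mlm.operator_id_start <= mas.operator_id
      AND mas.operator_id < mlm.operator_id_end)
JOIN mz_catalog.mz_objects mo
     ON (mlm.global_id = mo.id)
WHERE mo.name LIKE '%top_buyers%'
GROUP BY mo.name;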
It’s worth getting a sense of how much runtime data we’ve just aggregated together when analyzing this relatively simple query. Most LIR operators correspond to quite a few dataflow operators; many have arrangements. Let’s adapt our query to count dataflow operators and arrangements in the `top_buyers` view:
SELECT REPEAT(' ', nesting * 2) || operator AS operator,
operator_id_end - operator_id_start AS dataflow_ops,
COUNT(mas.size) AS arrangements
FROM mz_introspection.mz_lir_mapping mlm
LEFT JOIN mz_introspection.mz_arrangement_sizes mas
ON ( mlm.operator_id_start <= mas.operator_id
AND mas.operator_id < mlm.operator_id_end)
WHERE global_id = 'u195'
GROUP BY lir_id, operator, nesting, operator_id_end, operator_id_start
ORDER BY lir_id desc;
| operator | dataflow_ops | arrangements |
|----------|--------------|--------------|
| TopK::Basic 10 | 162 | 16 |
| MapFilterProject 9 | 9 | 0 |
| Reduce::Accumulable 8 | 38 | 3 |
| Join::Differential 6 » 7 | 18 | 0 |
| Get::PassArrangements u191 | 0 | 0 |
| Arrange 5 | 15 | 1 |
| MapFilterProject 4 | 7 | 0 |
| TopK::Basic 3 | 162 | 16 |
| Arrange 2 | 4 | 0 |
| Reduce::Accumulable 1 | 38 | 3 |
| Get::Arrangement u191 | 9 | 0 |
If we tried to attribute memory at the dataflow level to `top_buyers`, we would have to poke through these hundreds of dataflow operators and dozens of arrangements. But with our source map above, it’s easy to get a structured summary of dataflow metrics: the outermost `TopK::Basic 10` operator uses the lion’s share of memory.
Setting hints for `TopK` queries
The `TopK` operator works by building a tower of arrangements of decreasing size: even if you’ve only asked for the top 5 elements, Materialize can only incrementally maintain the view if somewhere it maintains a complete ranking. (The tower helps us keep latency low and incremental maintenance cheap.) By default, Materialize will allocate eight generously sized levels for the arrangements in a `TopK`. Our toy example is so small, we’re surely wasting a lot of that space. Let’s use the `LIMIT INPUT GROUP SIZE` hint to tell Materialize the expected group size on the input---which informs how tall to make the tower. But what hint should we give?
Materialize already uses runtime data to offer hints on group sizing for existing dataflows, via `mz_introspection.mz_expected_group_size_advice`. But if we simply take a peek, we’ll see that there are two `TopK` operators (corresponding to `TopK::Basic 10` and `TopK::Basic 3` in `mz_lir_mapping`):
SELECT * FROM mz_introspection.mz_expected_group_size_advice;
| dataflow_id | dataflow_name | region_id | region_name | levels | to_cut | savings (bytes) | hint |
|-------------|---------------|-----------|-------------|--------|--------|-----------------|------|
| 96 | Dataflow: materialize.public.idx_top_buyers | 33066 | TopK | 8 | 6 | 20582 | 255.0 |
| 96 | Dataflow: materialize.public.idx_top_buyers | 33329 | TopK | 8 | 5 | 715344 | 4095.0 |
Which `TopK` corresponds to which `LIMIT` clause in our query? An astute observer might have a guess from having attributed memory usage; an experienced field engineer might have a guess from the `region_id`. But with `mz_lir_mapping`, we don’t have to guess:
-- topk hints
SELECT mo.name AS name, mlm.global_id AS global_id, lir_id, parent_lir_id,
REPEAT(' ', nesting * 2) || operator AS operator,
levels, to_cut, pg_size_pretty(savings) AS savings, hint
FROM mz_introspection.mz_lir_mapping mlm
JOIN mz_introspection.mz_dataflow_global_ids mdgi
ON (mlm.global_id = mdgi.global_id)
LEFT JOIN mz_introspection.mz_expected_group_size_advice megsa
ON ( megsa.dataflow_id = mdgi.id
AND mlm.operator_id_start <= megsa.region_id
AND megsa.region_id < mlm.operator_id_end)
JOIN mz_catalog.mz_objects mo
ON (mlm.global_id = mo.id)
WHERE mo.name LIKE '%top_buyers%'
ORDER BY mlm.global_id, lir_id DESC;
| name | global_id | lir_id | parent_lir_id | operator | levels | to_cut | savings | hint |
|------|-----------|--------|---------------|----------|--------|--------|---------|------|
| top_buyers | u195 | 11 | null | TopK::Basic 10 | 8 | 5 | 699 kB | 4095.0 |
| top_buyers | u195 | 10 | 11 | MapFilterProject 9 | null | null | null | null |
| top_buyers | u195 | 9 | 10 | Reduce::Accumulable 8 | null | null | null | null |
| top_buyers | u195 | 8 | 9 | Join::Differential 6 » 7 | null | null | null | null |
| top_buyers | u195 | 7 | 8 | Get::PassArrangements u191 | null | null | null | null |
| top_buyers | u195 | 6 | 8 | Arrange 5 | null | null | null | null |
| top_buyers | u195 | 5 | 6 | MapFilterProject 4 | null | null | null | null |
| top_buyers | u195 | 4 | 5 | TopK::Basic 3 | 8 | 6 | 20 kB | 255.0 |
| top_buyers | u195 | 3 | 4 | Arrange 2 | null | null | null | null |
| top_buyers | u195 | 2 | 3 | Reduce::Accumulable 1 | null | null | null | null |
| top_buyers | u195 | 1 | 2 | Get::Arrangement u191 | null | null | null | null |
| idx_top_buyers | u196 | 13 | null | Arrange 12 | null | null | null | null |
| idx_top_buyers | u196 | 12 | 13 | Get::PassArrangements u195 | null | null | null | null |
The outermost `TopK`---the one responsible for so much memory---should be sized a little larger than the inner one. Making the fix is not so hard: `DROP` the old definitions and recreate them with the limits in place:
DROP VIEW popular_products CASCADE;
CREATE VIEW popular_products(product_id) AS
SELECT product_id
FROM sales
GROUP BY product_id
OPTIONS (LIMIT INPUT GROUP SIZE = 255)
ORDER BY COUNT(product_id) DESC
LIMIT 5;
CREATE VIEW popular_purchases(product_id, customer_id, count) AS
SELECT product_id, customer_id, COUNT(customer_id)
FROM popular_products
JOIN sales
USING (product_id)
GROUP BY product_id, customer_id;
CREATE VIEW top_buyers(customer_id, product_id, count) AS
SELECT customer_id, product_id, count
FROM (SELECT DISTINCT product_id FROM popular_purchases) products,
LATERAL ( SELECT customer_id, count FROM popular_purchases
WHERE product_id = products.product_id
OPTIONS (LIMIT INPUT GROUP SIZE = 4095)
ORDER BY count DESC
LIMIT 5)
ORDER BY product_id, count DESC;
CREATE INDEX idx_top_buyers ON top_buyers(product_id);
Let’s rerun our memory attribution query. We’ll see a roughly 70% reduction in memory usage for both of the `TopK` operators:
| name | global_id | lir_id | parent_lir_id | operator | size |
|------|-----------|--------|---------------|----------|------|
| top_buyers | u199 | 11 | null | TopK::Basic 10 | 305 kB |
| top_buyers | u199 | 10 | 11 | MapFilterProject 9 | null |
| top_buyers | u199 | 9 | 10 | Reduce::Accumulable 8 | 316 kB |
| top_buyers | u199 | 8 | 9 | Join::Differential 6 » 7 | null |
| top_buyers | u199 | 7 | 8 | Get::PassArrangements u191 | null |
| top_buyers | u199 | 6 | 8 | Arrange 5 | 1738 bytes |
| top_buyers | u199 | 5 | 6 | MapFilterProject 4 | null |
| top_buyers | u199 | 4 | 5 | TopK::Basic 3 | 9272 bytes |
| top_buyers | u199 | 3 | 4 | Arrange 2 | null |
| top_buyers | u199 | 2 | 3 | Reduce::Accumulable 1 | 20 kB |
| top_buyers | u199 | 1 | 2 | Get::Arrangement u191 | null |
| idx_top_buyers | u200 | 13 | null | Arrange 12 | 1882 bytes |
| idx_top_buyers | u200 | 12 | 13 | Get::PassArrangements u199 | null |
(Notice that the `global_id`s have changed, because we `DROP`ped and recreated the views and index.)
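If you lose track of which `global_id` is current, you can always look it up by name in the catalog (the same table we joined against earlier):
-- find the current global_ids for our view and index
SELECT id, name
FROM mz_catalog.mz_objects
WHERE name IN ('top_buyers', 'idx_top_buyers');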
What’s next?
Eventually, we’ll build syntax like `EXPLAIN ANALYZE INDEX ...` around queries like these---once we know which information helps the most. For now, we’ve documented these and some other common debugging queries; we expect our field engineering team and users to extend these queries and adapt them to their own ends.
There’s a separate language design problem, too: what’s the right level of abstraction for `EXPLAIN`? If you run `EXPLAIN PLAN` on a query today, we give a very detailed static plan, with many lines for each LIR operator. If you query `mz_lir_mapping`, you’ll see a terse, one-line description for each LIR operator. What’s the right level of detail? As users get more experience debugging their live queries, we’ll get a better sense of what to show and what to hide.
These source maps have already turned arduous, manual tasks that took hours into quick glances that take minutes. Live debugging information makes it much easier to write better queries… so fire up the Materialize emulator and play around with these new features!
1. These cost estimates are used in Postgres’s query planning, but they should be taken with a grain of salt.↩
2. The values here should be stable across versions of Materialize---seahash values should only change at their major version bumps.↩
3. `lir_id`s number the nodes of the LIR abstract syntax tree in a left-to-right, post-order traversal.↩