This blog examines the performance of Materialize vs. Aurora PostgreSQL read replicas for computationally intensive workloads. We demonstrate that Materialize outperforms Aurora for complex queries over relatively small data volumes.
Specifically, for the same on-demand cost, Materialize delivers 100x greater throughput with 1000x lower latency. And unlike other solutions that offload computation from OLTP databases, Materialize does so without sacrificing correctness or requiring external change data capture (CDC) tools to move data between systems.
Read on to learn what our benchmark tests uncovered about Materialize and Aurora PostgreSQL.
Aurora PostgreSQL: Not Designed for Complex, Read-Intensive Queries
OLTP databases like Aurora PostgreSQL are the backbone of modern applications. They excel at handling high volumes of transactional operations, ensuring data integrity, and delivering fast, atomic transactions.
However, these strengths can become limitations. Intricate joins, aggregations, and data transformations that modern applications thrive on can bog down an OLTP system, leading to performance bottlenecks and a degraded user experience.
To mitigate this load, many organizations implement read replicas to distribute read operations across multiple database copies. While this offloading can help with performance, it does not fully address the challenges posed by complex queries. Read replicas are designed for straightforward read operations and often struggle with tasks involving multiple joins, large aggregations, and otherwise non-trivial data transformations.
Materialize offers a radically different approach by providing efficient incremental computation. Unlike traditional read replicas, Materialize shifts the computational burden from the read phase to the write phase by precomputing views and incrementally materializing results.
This ensures data remains fresh and up-to-date, delivering low-latency results. Materialize transforms how we handle complex queries—making them faster while maintaining consistency—and ultimately solves the challenges that traditional read replicas and caching mechanisms cannot.
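To make the distinction concrete, here is a toy Python sketch (purely illustrative, not Materialize's actual implementation) contrasting recomputing an aggregate on every read with maintaining it incrementally on every write:

```python
# Toy illustration of the difference between recomputing on read
# (the read-replica pattern) and incremental view maintenance.

class RecomputeOnRead:
    """Read-replica style: store raw rows, aggregate at query time."""
    def __init__(self):
        self.sales = []  # list of (product_id, price) rows

    def insert(self, product_id, price):
        self.sales.append((product_id, price))  # writes are cheap

    def avg_price(self, product_id):
        # Every read pays for a full scan over the raw rows.
        prices = [p for pid, p in self.sales if pid == product_id]
        return sum(prices) / len(prices)


class IncrementalView:
    """Materialize style: fold each write into precomputed state."""
    def __init__(self):
        self.state = {}  # product_id -> (row_count, running_total)

    def insert(self, product_id, price):
        # The (small) aggregation cost is paid at write time.
        count, total = self.state.get(product_id, (0, 0.0))
        self.state[product_id] = (count + 1, total + price)

    def avg_price(self, product_id):
        # Reads are constant-time lookups against maintained state.
        count, total = self.state[product_id]
        return total / count
```

Both approaches return the same answer; the difference is where the work happens. The read-replica pattern pays the scan cost on every query, while the incremental pattern folds each write into precomputed state so reads become cheap lookups.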
Benchmarking Use Case: Dynamic Pricing for an Online Retailer
Now that we’ve explained the conceptual differences, let’s perform a quantitative comparison of Materialize and Aurora PostgreSQL. We’ll use dynamic pricing for an online retailer as a sample use case.
In this use case, the price of any given item fluctuates based on available inventory, snap promotions, popularity, and other factors. The goal is a scalable solution that offers low latency and always displays the current price for an item.
The logic for dynamic pricing is encapsulated in a standard SQL view definition, which references six tables [1]. The entire database contains only 1GB of data across all tables. You can see the code for the view below:
CREATE VIEW dynamic_pricing AS
WITH recent_prices AS (
SELECT grp.product_id, AVG(price) AS avg_price
FROM (SELECT DISTINCT product_id FROM sales) grp,
LATERAL (
SELECT product_id, price
FROM sales
WHERE sales.product_id = grp.product_id
ORDER BY sale_date DESC LIMIT 10
) sub
GROUP BY grp.product_id
),
promotion_effect AS (
SELECT
p.product_id,
MIN(pr.promotion_discount) AS promotion_discount
FROM promotions pr
JOIN products p ON pr.product_id = p.product_id
WHERE pr.active = TRUE
GROUP BY p.product_id
),
popularity_score AS (
SELECT
s.product_id,
RANK() OVER (PARTITION BY p.category_id ORDER BY COUNT(s.sale_id) DESC) AS popularity_rank,
COUNT(s.sale_id) AS sale_count
FROM sales s
JOIN products p ON s.product_id = p.product_id
GROUP BY s.product_id, p.category_id
),
inventory_status AS (
SELECT
i.product_id,
SUM(i.stock) AS total_stock,
RANK() OVER (ORDER BY SUM(i.stock) DESC) AS stock_rank
FROM inventory i
GROUP BY i.product_id
),
high_demand_products AS (
SELECT
p.product_id,
AVG(s.sale_price) AS avg_sale_price,
COUNT(s.sale_id) AS total_sales
FROM products p
JOIN sales s ON p.product_id = s.product_id
GROUP BY p.product_id
HAVING COUNT(s.sale_id) > (SELECT AVG(total_sales) FROM (SELECT COUNT(*) AS total_sales FROM sales GROUP BY product_id) subquery)
),
dynamic_pricing AS (
SELECT
p.product_id,
p.base_price,
CASE
WHEN pop.popularity_rank <= 3 THEN 1.2
WHEN pop.popularity_rank BETWEEN 4 AND 10 THEN 1.1
ELSE 0.9
END AS popularity_adjustment,
rp.avg_price,
COALESCE(1.0 - (pe.promotion_discount / 100), 1) AS promotion_discount,
CASE
WHEN inv.stock_rank <= 3 THEN 1.1
WHEN inv.stock_rank BETWEEN 4 AND 10 THEN 1.05
ELSE 1
END AS stock_adjustment,
CASE
WHEN p.base_price > rp.avg_price THEN 1 + (p.base_price - rp.avg_price) / rp.avg_price
ELSE 1 - (rp.avg_price - p.base_price) / rp.avg_price
END AS demand_multiplier,
hd.avg_sale_price,
CASE
WHEN p.product_name ILIKE '%cheap%' THEN 0.8
ELSE 1.0
END AS additional_discount
FROM products p
LEFT JOIN recent_prices rp ON p.product_id = rp.product_id
LEFT JOIN promotion_effect pe ON p.product_id = pe.product_id
JOIN popularity_score pop ON p.product_id = pop.product_id
LEFT JOIN inventory_status inv ON p.product_id = inv.product_id
LEFT JOIN high_demand_products hd ON p.product_id = hd.product_id
)
SELECT
dp.product_id,
dp.base_price * dp.popularity_adjustment * dp.promotion_discount * dp.stock_adjustment * dp.demand_multiplier * dp.additional_discount AS adjusted_price
FROM dynamic_pricing dp;
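To make the arithmetic in the final SELECT concrete, here is a Python sketch of the same multiplier logic. The function mirrors the view's CASE and COALESCE expressions; the input values you pass it are hypothetical, for illustration only:

```python
def adjusted_price(base_price, avg_price, popularity_rank,
                   promotion_discount_pct, stock_rank, product_name):
    """Mirror the multiplier logic of the dynamic_pricing view."""
    # Popularity: top sellers get a markup, the long tail a markdown.
    if popularity_rank <= 3:
        popularity_adjustment = 1.2
    elif 4 <= popularity_rank <= 10:
        popularity_adjustment = 1.1
    else:
        popularity_adjustment = 0.9

    # Promotions: COALESCE(1.0 - discount / 100, 1) in the view;
    # None plays the role of a missing promotion row.
    promotion_multiplier = (1.0 - promotion_discount_pct / 100
                            if promotion_discount_pct is not None else 1.0)

    # Inventory: well-stocked products absorb a small markup.
    if stock_rank <= 3:
        stock_adjustment = 1.1
    elif 4 <= stock_rank <= 10:
        stock_adjustment = 1.05
    else:
        stock_adjustment = 1.0

    # Demand: both CASE branches in the view algebraically reduce
    # to base_price / avg_price.
    demand_multiplier = base_price / avg_price

    additional_discount = 0.8 if "cheap" in product_name.lower() else 1.0

    return (base_price * popularity_adjustment * promotion_multiplier
            * stock_adjustment * demand_multiplier * additional_discount)
```

For example, a product with a base price of 100.00, an average recent sale price of 100.00, popularity rank 2, a 10% active promotion, and stock rank 20 comes out to 100 × 1.2 × 0.9 = 108.00.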
The benchmark test queries dynamic_pricing for the current price of a single product at different scales. All measurements are taken via dbbench on an otherwise idle c5a.4xlarge EC2 instance:
SELECT * FROM dynamic_pricing WHERE product_id = $1
Configurations for Benchmark Testing
Aurora PostgreSQL Configuration
To set up Aurora PostgreSQL for benchmarking, we connected to a db.r6gd.xlarge read replica instance with read optimizations enabled. All queries in PostgreSQL were executed against this instance.
The primary keys for each table were set, along with secondary indexes based on the view’s needs. Foreign keys were also established to provide the query optimizer with as much information as possible to enhance performance. You can view the code below:
ALTER TABLE products
ADD CONSTRAINT products_pkey PRIMARY KEY (product_id);
ALTER TABLE categories
ADD CONSTRAINT categories_pkey PRIMARY KEY (category_id);
ALTER TABLE suppliers
ADD CONSTRAINT suppliers_pkey PRIMARY KEY (supplier_id);
ALTER TABLE sales
ADD CONSTRAINT sales_pkey PRIMARY KEY (sale_id);
ALTER TABLE inventory
ADD CONSTRAINT inventory_pkey PRIMARY KEY (inventory_id);
ALTER TABLE promotions
ADD CONSTRAINT promotions_pkey PRIMARY KEY (promotion_id);
ALTER TABLE public.inventory
ADD CONSTRAINT inventory_product_id_fkey FOREIGN KEY (product_id) REFERENCES public.products (product_id);
ALTER TABLE public.promotions
ADD CONSTRAINT promotions_product_id_fkey FOREIGN KEY (product_id) REFERENCES public.products (product_id);
ALTER TABLE public.sales
ADD CONSTRAINT sales_product_id_fkey FOREIGN KEY (product_id) REFERENCES public.products (product_id);
CREATE INDEX idx_products_product_name ON products(product_name);
CREATE INDEX idx_sales_product_id ON sales(product_id);
CREATE INDEX idx_sales_sale_date ON sales(sale_date);
CREATE INDEX idx_sales_product_id_sale_date ON sales(product_id, sale_date);
CREATE INDEX idx_promotions_product_id ON promotions(product_id);
CREATE INDEX idx_promotions_active ON promotions(active);
CREATE INDEX idx_promotions_product_id_active ON promotions(product_id, active);
CREATE INDEX idx_inventory_product_id ON inventory(product_id);
Materialize Configuration
We configured a 50cc cluster on the Materialize side, which has approximately the same on-demand cost as the Aurora read replica. Materialize connects directly to the primary writer instance to consume changes using the built-in Postgres source [2].
The system maintains consistency by applying all changes within an upstream transaction atomically. Unlike other CDC tools, Materialize will never show partial writes. Within Materialize, the same view is defined and indexed directly.
Indexing a view in Materialize initiates the pre-computation of results and ensures they are kept up-to-date as the upstream database is modified. Queries executed against this view leverage the index to short-circuit any work that is already completed, significantly improving query performance. The code is as follows:
CREATE INDEX ON dynamic_pricing (product_id);
Overview of Test Scenarios
We designed three test scenarios to comprehensively evaluate the performance of Materialize versus read-optimized Aurora PostgreSQL read replicas. These scenarios simulate common real-world use cases and stress-test the systems under varying conditions.
Scenario 1: Single Database Connection with Continuous Writes
In this scenario, a single database connection runs one SELECT query at a time as fast as possible. Simultaneously, a continuous stream of writes (inserts, updates, and deletes) is performed on the database. This tests the ability of each system to maintain read performance under write-heavy conditions.
Scenario 2: Ten Database Connections with Continuous Writes
Here, ten database connections issue SELECT queries concurrently while a continuous stream of writes is performed on the database. This scenario tests each system’s overall robustness and concurrency handling under mixed read/write workloads.
Scenario 3: 100 Database Connections with Continuous Writes
In this final scenario, 100 database connections issue SELECT queries concurrently while the database undergoes a continuous stream of writes. This scenario assesses the robustness and scalability of each system under a read-heavy workload with simultaneous writes.
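The three scenarios differ only in the number of concurrent readers. The actual measurements were taken with dbbench, but the read side of each scenario can be sketched with a hypothetical Python harness like the one below, where query_fn stands in for executing the SELECT over a real database connection:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_readers(query_fn, connections, duration_s):
    """Run `connections` workers, each issuing query_fn back to back
    for duration_s seconds; return per-query latencies and throughput."""
    deadline = time.monotonic() + duration_s

    def worker():
        latencies = []
        while time.monotonic() < deadline:
            start = time.monotonic()
            query_fn()  # stand-in for: SELECT ... WHERE product_id = $1
            latencies.append(time.monotonic() - start)
        return latencies

    with ThreadPoolExecutor(max_workers=connections) as pool:
        futures = [pool.submit(worker) for _ in range(connections)]
        all_latencies = [lat for f in futures for lat in f.result()]

    tps = len(all_latencies) / duration_s
    return all_latencies, tps
```

Scenario 1 corresponds to connections=1, Scenario 2 to connections=10, and Scenario 3 to connections=100, each with the continuous write stream running alongside.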
Test Scenarios: Results for Materialize and Aurora PostgreSQL
Scenario 1
Aurora PostgreSQL
Performance: 80 transactions (0.131 TPS)
Mean Latency: 7.639s ± 13.38ms
P99 Latency: 7.757s
Max Latency: 7.892s
Transactions:
4.294967296s - 8.589934592s \[ 80]: ██████████████████████████████████████████████████
Materialize
Performance: 46,254 transactions (77.088 TPS)
Mean Latency: 12.953ms ± 176.35µs
P99 Latency: 56.124ms
Max Latency: 268.435ms
Transactions:
4.194304ms - 8.388608ms \[ 28845]: ██████████████████████████████████████████████████
8.388608ms - 16.777216ms \[ 9252]: ████████████████
16.777216ms - 33.554432ms \[ 6623]: ███████████▍
33.554432ms - 67.108864ms \[ 1195]: ██
67.108864ms - 134.217728ms \[ 244]: ▍
Scenario 2
Aurora PostgreSQL
Performance: 277 transactions (0.451 TPS)
Mean Latency: 21.983s ± 225.77ms
P99 Latency: 24.516s
Max Latency: 25.076s
Transactions:
8.589934592s - 17.179869184s \[ 3]: ▌
17.179869184s - 34.359738368s \[ 274]: ██████████████████████████████████████████████████
Materialize
Performance: 96,221 transactions (160.365 TPS)
Mean Latency: 29.35ms ± 114.129µs
P99 Latency: 644.438ms
Max Latency: 825.438ms
Transactions:
4.194304ms - 8.388608ms \[ 5]: ▏
8.388608ms - 16.777216ms \[ 7445]: ██████▎
16.777216ms - 33.554432ms \[ 78498]: ██████████████████████████████████████████████████
33.554432ms - 67.108864ms \[ 3777]: ███████████████████▍
67.108864ms - 134.217728ms \[ 662]: █▍
134.217728ms - 268.435456ms \[ 44]: ▏
268.435456ms - 536.870912ms \[ 687]: ▌
536.870912ms - 825.438664ms \[ 5103]: ████▎
Scenario 3
Aurora PostgreSQL
The Aurora read replica crashed under the load, and we were unable to complete the benchmark.
Materialize
Performance: 276,486 transactions (460.719 TPS)
Mean Latency: 216.996ms ± 583.073µs
P99 Latency: 713.052ms
Max Latency: 3.975s
Transactions:
16.777216ms - 33.554432ms \[ 52]: ▏
33.554432ms - 67.108864ms \[ 949]: ▏
67.108864ms - 134.217728ms \[ 20931]: █████
134.217728ms - 268.435456ms \[207898]: ██████████████████████████████████████████████████
268.435456ms - 536.870912ms \[ 36042]: ████████▋
536.870912ms - 1.073741824s \[ 10397]: ██▌
1.073741824s - 2.147483648s \[ 216]: ▏
2.147483648s - 3.975392663s \[ 1]: ▏
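The bucket boundaries in the latency histograms above are successive powers of two in nanoseconds (for example, 8.388608ms is 2^23 ns and 16.777216ms is 2^24 ns). A minimal sketch of that bucketing, for readers who want to reproduce the binning:

```python
def p2_bucket(latency_ns):
    """Return the power-of-two histogram bucket (lo, hi] in
    nanoseconds that contains latency_ns."""
    hi = 1
    while hi < latency_ns:
        hi *= 2  # walk up the power-of-two boundaries
    return (hi // 2, hi)
```

For instance, Scenario 1's mean Materialize latency of 12.953ms (12,953,000 ns) falls in the 8.388608ms - 16.777216ms bucket.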
Conclusion: Materialize Outperforms Aurora PostgreSQL
In modern applications, computationally intensive workloads are the norm. And our benchmarking demonstrated that for complex queries, Materialize significantly outperforms Aurora PostgreSQL read replicas. As an operational data platform, Materialize is ideal for offloading queries from OLTP systems.
Some of the most critical findings from our benchmarking tests include:
Higher throughput and lower latency: Materialize achieves 100x greater throughput and 1000x lower latency than Aurora PostgreSQL for complex queries over small data volumes.
Efficient resource utilization: Materialize’s CPU usage never exceeded 13%, compared to Aurora’s frequent CPU alerts and eventual crash under high load.
Consistent fresh data: Materialize maintains a lag of less than 1 second, ensuring near-real-time data accuracy without the risk of stale results. Additionally, Materialize has a feature in private preview called real-time recency that guarantees users can always read their writes across systems, making it ideal for use cases that can never tolerate stale results.
Stay tuned for more benchmarking tests from us! And feel free to post on our Twitter or LinkedIn to request benchmarking tests.
Appendix
[1] Table Definitions
CREATE TABLE products (
product_id SERIAL PRIMARY KEY,
product_name VARCHAR(255) NOT NULL,
base_price NUMERIC(10, 2) NOT NULL,
category_id INTEGER NOT NULL,
supplier_id INTEGER NOT NULL,
available BOOLEAN NOT NULL
);
CREATE TABLE categories (
category_id SERIAL PRIMARY KEY,
category_name VARCHAR(255) NOT NULL
);
CREATE TABLE suppliers (
supplier_id SERIAL PRIMARY KEY,
supplier_name VARCHAR(255) NOT NULL
);
CREATE TABLE sales (
sale_id SERIAL PRIMARY KEY,
product_id INTEGER NOT NULL,
sale_price NUMERIC(10, 2) NOT NULL,
sale_date TIMESTAMP NOT NULL,
price NUMERIC(10, 2) NOT NULL
);
CREATE TABLE inventory (
inventory_id SERIAL PRIMARY KEY,
product_id INTEGER NOT NULL,
stock INTEGER NOT NULL,
warehouse_id INTEGER NOT NULL,
restock_date TIMESTAMP NOT NULL
);
CREATE TABLE promotions (
promotion_id SERIAL PRIMARY KEY,
product_id INTEGER NOT NULL,
promotion_discount NUMERIC(10, 2) NOT NULL,
start_date TIMESTAMP NOT NULL,
end_date TIMESTAMP NOT NULL,
active BOOLEAN NOT NULL
);