Performance Testing a Postgres Database vs Elasticsearch 5: Column Statistics
24 January 2017 on . 2 minutes to read
This is the first post on benchmarking a postgres database vs a (1 node) elasticsearch instance. The subject of this test are numeric column statistics, based on 10 Million products inserted into both the database and elasticsearch index.
Up to date list of articles diving into my ecommerce performance investigations:
- Intro to the Ecommerce SaaS Benchmark Application
- Postgres Database vs Elasticsearch Numeric Column Statistics
- Postgres Find Where vs Elasticsearch Sub Aggregations
- Simple Elasticsearch Aggregation vs Postgres Counts Benchmark
Rails.logger.level = :info
Benchmark.ips do |x|
column = :brand_id
x.report("Product Brand ID Elasticsearch Stats") {Product.elasticsearch_stats(column)}
x.report("Product Brand ID PG Stats") {Product.pg_stats(column)}
x.compare!
end
Warming up --------------------------------------
Product Brand ID Elasticsearch Stats
42.000 i/100ms
Product Brand ID PG Stats
1.000 i/100ms
Calculating -------------------------------------
Product Brand ID Elasticsearch Stats
451.179 (± 8.4%) i/s - 2.268k in 5.066563s
Product Brand ID PG Stats
3.249 (± 0.0%) i/s - 17.000 in 5.236520s
Comparison:
Product Brand ID Elasticsearch Stats: 451.2 i/s
Product Brand ID PG Stats: 3.2 i/s - 138.86x slower
Point, blouses Elasticsearch.
Further Breakdown:
column = :brand_id
Product.elasticsearch_stats(column)
Product Search (3.6ms) curl http://localhost:9200/products_development/_search?pretty -d '{"query":{"match_all":{}},"size":0,"from":0,"timeout":"11s","_source":false,"aggs":{"brand_id_stats":{"stats":{"field":"brand_id"}}}}'
{
"avg" => 502.5,
"count" => 1000000,
"max" => 1002.0,
"min" => 3.0,
"sum" => 502500000.0
}
Product.pg_stats(column)
(115.7ms) SELECT AVG("products"."brand_id") FROM "products"
(75.1ms) SELECT COUNT(*) FROM "products"
(0.3ms) SELECT MAX("products"."brand_id") FROM "products"
(0.2ms) SELECT MIN("products"."brand_id") FROM "products"
(110.1ms) SELECT SUM("products"."brand_id") FROM "products"
{
:avg => 502.5,
:count => 1000000,
:max => 1002,
:min => 3,
:sum => 502500000
}
As you can see here, the operational time for all the complex queries take an order of magnitude longer than the entire elasticsearch query. The min and max queries are quick, but I’d bet that’s because these operations were performed on an indexed numerical column. I’ll update this post in the future against a non indexed column for comparison.
If you enjoy having free time and the peace of mind that a professional is on your side, then you’d love to have me work on your project.
Contact or view a list of available services to see how I’ll make your life better, easier and bring satisfaction back into you running your business.