
Postgres versions

  • PostgreSQL 13.2 on x86_64-pc-linux-musl, compiled by gcc (Alpine 10.2.1_pre1) 10.2.1 20201203, 64-bit
  • PostgreSQL 11.11 (Debian 11.11-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit

Query

SELECT count(*) FROM "addresses" AS a1 WHERE ST_DWithin('0101000020E61000001EE1B4E0455F0340E92807B309664840'::geography, a1.geom, '25000');

The result is as expected, but the query is slow on PG11 (1.1 s) and even slower on PG13 (2.9 s).

Table definition

                                                              Table "public.addresses"
   Column    |              Type              | Collation | Nullable |                Default                | Storage  | Stats target | Description 
-------------+--------------------------------+-----------+----------+---------------------------------------+----------+--------------+-------------
 id          | bigint                         |           | not null | nextval('addresses_id_seq'::regclass) | plain    |              | 
 description | text                           |           |          |                                       | extended |              | 
 country     | character varying(255)         |           |          |                                       | extended |              | 
 locality    | character varying(255)         |           |          |                                       | extended |              | 
 region      | character varying(255)         |           |          |                                       | extended |              | 
 postal_code | character varying(255)         |           |          |                                       | extended |              | 
 street      | text                           |           |          |                                       | extended |              | 
 geom        | geometry                       |           |          |                                       | main     |              | 
 inserted_at | timestamp(0) without time zone |           | not null |                                       | plain    |              | 
 updated_at  | timestamp(0) without time zone |           | not null |                                       | plain    |              | 
 url         | character varying(255)         |           | not null |                                       | extended |              | 
 origin_id   | character varying(255)         |           |          |                                       | extended |              | 
 type        | character varying(255)         |           |          |                                       | extended |              | 
Indexes:
    "addresses_pkey" PRIMARY KEY, btree (id)
    "addresses_origin_id_index" UNIQUE, btree (origin_id)
    "addresses_url_index" UNIQUE, btree (url)
    "idx_addresses_geom" gist (geom)
Referenced by:
    TABLE "actors" CONSTRAINT "actors_physical_address_id_fkey" FOREIGN KEY (physical_address_id) REFERENCES addresses(id)
    TABLE "events" CONSTRAINT "events_physical_address_id_fkey" FOREIGN KEY (physical_address_id) REFERENCES addresses(id)
Access method: heap

Cardinalities

SELECT ST_SRID(geom) AS srid, count(*) from addresses group by 1;
 srid | count  
------+--------
      |      1
 4326 | 265011

Query plans

PG11

Finalize Aggregate  (cost=52276.51..52276.52 rows=1 width=8) (actual time=1082.209..1095.487 rows=1 loops=1)
  Buffers: shared hit=3444 read=4611
  ->  Gather  (cost=52276.29..52276.50 rows=2 width=8) (actual time=1082.180..1095.468 rows=3 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=3444 read=4611
        ->  Partial Aggregate  (cost=51276.29..51276.30 rows=1 width=8) (actual time=897.135..897.138 rows=1 loops=3)
              Buffers: shared hit=3444 read=4611
              ->  Parallel Seq Scan on addresses a1  (cost=0.00..51272.61 rows=1472 width=0) (actual time=6.796..894.827 rows=1960 loops=3)
                    Filter: (((geom)::geography && '0101000020E61000001EE1B4E0455F0340E92807B309664840'::geography) AND ('0101000020E61000001EE1B4E0455F0340E92807B309664840'::geography && _st_expand((geom)::geography, '25000'::double precision)) AND _st_dwithin('0101000020E61000001EE1B4E0455F0340E92807B309664840'::geography, (geom)::geography, '25000'::double precision, true))
                    Rows Removed by Filter: 86378
                    Buffers: shared hit=3444 read=4611
Planning Time: 1.167 ms
Execution Time: 1095.650 ms

PG13 (jit = off)

Finalize Aggregate  (cost=2770301.18..2770301.19 rows=1 width=8) (actual time=2801.774..2830.421 rows=1 loops=1)
  Buffers: shared hit=8227
  ->  Gather  (cost=2770300.97..2770301.18 rows=2 width=8) (actual time=2800.661..2830.384 rows=3 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=8227
        ->  Partial Aggregate  (cost=2769300.97..2769300.98 rows=1 width=8) (actual time=2639.846..2639.849 rows=1 loops=3)
              Buffers: shared hit=8227
              ->  Parallel Seq Scan on addresses a1  (cost=0.00..2769300.94 rows=11 width=0) (actual time=63.094..2637.528 rows=1960 loops=3)
                    Filter: st_dwithin('0101000020E61000001EE1B4E0455F0340E92807B309664840'::geography, (geom)::geography, '25000'::double precision, true)
                    Rows Removed by Filter: 86378
                    Buffers: shared hit=8227
Planning Time: 0.424 ms
Execution Time: 2830.699 ms

There seem to be two issues: 1/ the spatial index is not used; 2/ PG13 is 3x slower on the same machine with the same data.

Setop
  • The easiest way to figure it out is probably to force it to use the same plan, by for example dropping the index addresses_pkey on v13, and comparing the plans then. – jjanes Sep 28 '21 at 21:28
  • `addresses_pkey` is the primary key index. Don't think it can be dropped. – Setop Sep 28 '21 at 21:29
  • just drop the constraint. This is a non-prod system, right? – jjanes Sep 28 '21 at 21:31
  • If you don't want to drop the constraint, you could still learn a bit about what is going on in the planner's "mind" by looking at `explain analyse SELECT * FROM "addresses" AS a1 WHERE ST_DWithin('0101000020E61000001EE1B4E0455F0340E92807B309664840'::geography, a1."geom"::geography, '25000');` – jjanes Sep 28 '21 at 21:34
  • 1
    Please, to count `null` cases properly, run the cardinality query again as `SELECT ST_SRID(geom) AS srid, count(*) from addresses group by 1;` – Erwin Brandstetter Oct 03 '21 at 16:22
  • @ErwinBrandstetter, done – Setop Oct 03 '21 at 18:29

1 Answer


The more pressing issue is the bad query. It shouldn't take > 1 sec in either instance. Your filter operates on a1."geom"::geography. This cast disables any plain index on addresses.geom, including the spatial index you have:

"idx_addresses_geom" gist (geom)

So we see ST_DWithin() only as a FILTER, while it should be the index condition to identify qualifying addresses quickly. You need an expression index instead. See:
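If the query must keep operating on geography, one option is an expression index matching the cast, so the planner can use an index scan for the geography predicate (a sketch; the index name is illustrative):

```sql
-- Sketch: an expression index over the geography cast, so the predicate
-- ST_DWithin(<geography>, geom::geography, 25000) can use an index scan.
CREATE INDEX idx_addresses_geog ON addresses USING gist ((geom::geography));
```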

Your query counts the number of events within 25 km of a given point. The third parameter of ST_DWithin() is the distance, measured in meters for geography. But for geometry, the unit is defined by the spatial reference system (SRID). The PostGIS manual:

For geometry: The distance is specified in units defined by the spatial reference system of the geometries. For this function to make sense, the source geometries must be in the same coordinate system (have the same SRID).

Your example input is a geography containing a point with SRID 4326:

SELECT ST_SRID ('0101000020E61000001EE1B4E0455F0340E92807B309664840'::geography);  -- 4326

The unit of measurement for SRID 4326 is degrees. Either operate with geography (more precise, more expensive) or with geometry and a different SRID to keep using meters.
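To see the scale of the unit mismatch: at the latitude of the sample point (roughly 48.8°N), one degree spans about 111 km north-south and about 73 km east-west, so a bare 25000 passed to a geometry in SRID 4326 would mean 25000 degrees, not meters. A back-of-the-envelope check (plain spherical approximation, not a geodesic library):

```python
import math

# Rough spherical approximation with a mean Earth radius; an
# illustration of the unit mismatch, not geodesic-grade math.
EARTH_RADIUS_M = 6_371_000.0

def meters_per_degree(lat_deg: float) -> tuple[float, float]:
    """Return (meters per degree of latitude, meters per degree of longitude)."""
    lat_m = math.pi * EARTH_RADIUS_M / 180.0          # ~111 km everywhere
    lon_m = lat_m * math.cos(math.radians(lat_deg))   # shrinks toward the poles
    return lat_m, lon_m

lat_m, lon_m = meters_per_degree(48.8)  # latitude of the sample point
print(round(lat_m), round(lon_m))       # meters per degree at 48.8°N
print(round(25_000 / lat_m, 3))         # 25 km expressed in degrees of latitude
```

So a 25 km radius corresponds to roughly 0.22 degrees of latitude, three orders of magnitude away from the literal 25000.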

Solution

The best solution depends on more undisclosed information. My example solution operates on transformed geometries in index and query:

CREATE INDEX idx_addresses_geom_27571 ON addresses USING gist (ST_Transform(geom, 27571));

SELECT count(*)  -- ①
FROM   addresses a
JOIN   events e ON e.physical_address_id = a.id
WHERE  ST_DWithin(ST_Transform('0101000020E61000001EE1B4E0455F0340E92807B309664840'::geometry, 27571)
                , ST_Transform(a.geom, 27571)
                , '25000');  -- ②

① A minor improvement: assuming events.begins_on is defined NOT NULL, count(*) is equivalent in this query, and a bit faster.

② The main issue, as described above.

I picked SRID 27571 to accommodate measurement in meters for geometries in northern France (as indicated by your example). Picking the most suitable reference system is up to you.

Obviously, you only need to transform input geometries that don't already have the right SRID.

Related:

Performance regression in Postgres 13 / PostGIS?

For the fixed query, I'd expect Postgres 13 to be a bit faster than Postgres 11 on an equivalent and properly configured server.

But why is the original, bad query faster with Postgres 11?

Postgres 11:

 Filter: (((geom)::geography && '0101000020E61000001EE1B4E0455F0340E92807B309664840'::geography) AND ('0101000020E61000001EE1B4E0455F0340E92807B309664840'::geography && _st_expand((geom)::geography, '25000'::double precision)) AND _st_dwithin('0101000020E61000001EE1B4E0455F0340E92807B309664840'::geography, (geom)::geography, '25000'::double precision, true))

Postgres 13:

Filter: st_dwithin('0101000020E61000001EE1B4E0455F0340E92807B309664840'::geography, (geom)::geography, '25000'::double precision, true)

In Postgres, index support is bound to operators, not functions. Older versions of PostGIS had to "hack" the system by using inline-able functions to bring in the operators required for index usage. So ST_DWithin(geog1, geog2, 25000) was replaced with ((geog2 && geog1) AND (geog1 && _st_expand(geog2, 25000)) AND _st_dwithin(geog1, geog2, 25000, true)) to add bounding-box operators. See:

There was a major change to this whole architecture in Postgres 12 and PostGIS 3.0. Quoting the release notes for PostGIS 3.0:

4341, Using "support function" API in PgSQL 12+ to replace SQL inlining as the mechanism for providing index support under ST_Intersects, et al

Since then, PostGIS functions that are supposed to use an index have an attached "support function" instead of being rewritten with function inlining. The Postgres manual:

For target functions that return boolean, it may be possible to convert a function call appearing in WHERE into an indexable operator clause or clauses. The converted clauses might be exactly equivalent to the function's condition, or they could be somewhat weaker (that is, they might accept some values that the function condition does not). In the latter case the index condition is said to be lossy; it can still be used to scan an index, but the function call will have to be executed for each row returned by the index to see if it really passes the WHERE condition or not. To create such conditions, the support function must implement the SupportRequestIndexCondition request type.
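The lossy-index mechanics described in the quote can be illustrated outside the database: a cheap bounding-box overlap test (the role of the && operator) prefilters candidates, and the exact distance predicate rechecks each survivor. A toy pure-Python sketch on a flat plane; all names here are illustrative:

```python
import math

Point = tuple[float, float]

def bbox_overlaps(p: Point, center: Point, radius: float) -> bool:
    # Lossy prefilter: accepts every point whose square bounding box
    # intersects the search box (may include false positives).
    return abs(p[0] - center[0]) <= radius and abs(p[1] - center[1]) <= radius

def within_distance(p: Point, center: Point, radius: float) -> bool:
    # Exact recheck, analogous to evaluating the ST_DWithin condition.
    return math.dist(p, center) <= radius

points = [(0.0, 0.0), (3.0, 4.0), (6.0, 0.0), (4.5, 4.5)]
center, radius = (0.0, 0.0), 5.0

candidates = [p for p in points if bbox_overlaps(p, center, radius)]
matches = [p for p in candidates if within_distance(p, center, radius)]
print(candidates)  # (4.5, 4.5) passes the box test but fails the exact recheck
print(matches)
```

The index scan plays the part of `bbox_overlaps`; the Filter line in the plan plays the part of `within_distance`.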

The query plan for my adapted query + index (as detailed above) looks like this for me in Postgres 13 and PostGIS 3.1.3:

'Aggregate  (cost=52.48..52.48 rows=1 width=8)'
'  ->  Index Scan using idx_addresses_geom_27571 on addresses a  (cost=0.26..52.47 rows=1 width=0)'
'        Index Cond: (st_transform(geom, 27571) && st_expand(''0101000020B36B00006E51041D4B8022411FF9F0303F1E3141''::geometry, ''25000''::double precision))'
'        Filter: st_dwithin(''0101000020B36B00006E51041D4B8022411FF9F0303F1E3141''::geometry, st_transform(geom, 27571), ''25000''::double precision)'

Note the added index condition (st_transform(geom, 27571) && st_expand('0101000020B36B00006E51041D4B8022411FF9F0303F1E3141'::geometry, '25000'::double precision)). But the Filter line sticks to the original functional expression.

It would seem that, without index support, executing the rewritten functional expression with bounding-box operators in Postgres 11 is substantially cheaper than the new implementation in Postgres 13.

Either you are not using the appropriate PostGIS version for Postgres 13 (???), or you have found a regression that might be worth looking into.

Asides

timestamp(0) without time zone may not be the best choice. The expression rounds the timestamp, and when rounding up, the stored timestamp may technically lie in the future. Executing something like inserted_at <= now() in the same transaction may then fail you, and you'll never find out why. Well, until just now. See:
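A quick way to observe the rounding (output depends on the clock; roughly half the time the rounded value is ahead of the exact one):

```sql
-- timestamp(0) rounds to the nearest second rather than truncating,
-- so the stored value can lie up to half a second in the future.
SELECT clock_timestamp()::timestamp     AS exact_now
     , clock_timestamp()::timestamp(0) AS rounded;  -- may exceed exact_now
```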

Also timestamptz is typically the better choice for international data. See:

varchar(255) hardly ever makes sense. See:

You have a single row with geom IS NULL. You might fix that and declare the column NOT NULL.
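A sketch of that cleanup, assuming the stray row can simply be deleted (backfill it instead if the row matters):

```sql
-- Remove (or fix) the single row with a NULL geom, then forbid new ones.
DELETE FROM addresses WHERE geom IS NULL;
ALTER TABLE addresses ALTER COLUMN geom SET NOT NULL;
```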

Erwin Brandstetter
  • Thanks for your detailed explanation. I tested every proposal : remove cast, add geo index, simplify query (I edit my question with it). Same result : same timing (PG11 : 1.2s, PG13 : 3s), same execution plan :( – Setop Sep 29 '21 at 08:13
  • about degrees versus meters, I'm surprised because the result I get in the app seems correct : I get point of interest within 25km of the point I give. I'll double check. – Setop Sep 29 '21 at 08:17
  • @Setop: The result you get ... for what query? – Erwin Brandstetter Sep 29 '21 at 11:18
  • the one you mention in your answer `SELECT count(*) -- ② FROM addresses a JOIN events e ON e.physical_address_id = a.id WHERE ST_DWithin(', a.geom, '25000');` – Setop Sep 29 '21 at 13:56
  • And what's the SRID of your geometries? – Erwin Brandstetter Sep 29 '21 at 15:33
  • `SELECT ST_SRID(geom) g, count(ST_SRID (geom)) from addresses group by g;` -> `4326 | 265011`. Do you think it is not the right one to use ? – Setop Sep 30 '21 at 11:32
  • 1
    @Setop: Instead of this piecemeal strategy in comments, please provide all relevant data in the question. See instructions here: https://dba.stackexchange.com/tags/postgresql-performance/info Also disclose SRID of data in table columns and input data. The query should use the index (unless your search area covers major parts of all rows in the table). – Erwin Brandstetter Sep 30 '21 at 13:18
  • I reworked the question following the instructions. Hope it makes sense. – Setop Oct 03 '21 at 13:11
  • The answer is still the same. But I spelled it out in more detail now. – Erwin Brandstetter Oct 04 '21 at 00:36
  • And I think I found an explanation for the performance degradation of the bad query, too, now. – Erwin Brandstetter Oct 04 '21 at 01:41
  • 1
    Score ! PG11 : 400ms, PG13 : 375ms. Big thanks – Setop Oct 05 '21 at 17:56