Thread: Too few rows expected by Planner on partitioned tables
Hello,
A description of what you are trying to achieve and what results you expect:
Our database is growing by about 2.5 million rows per table per day (2 tables at the moment). Because of that, we decided to partition the data, especially since we are pre-aggregating the data for weeks, months, quarters and years. Every aggregation is stored in a separate partition:
Days: ..._yYYYYmMMd (base data)
Weeks: ..._yYYYYmMMw (aggregated all weeks of the month)
Months: ..._yYYYYmMM (aggregated month)
etc.
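For illustration, attaching one such partition could look roughly like this (a sketch with made-up values for the location_statistics table defined below; the real DDL is generated by the create_partition_tables function shown further down):

CREATE TABLE location_statistics_y2019m03d
    PARTITION OF location_statistics
    FOR VALUES IN ('[2019-03-01,2019-03-02)', '[2019-03-02,2019-03-03)');  -- one daterange per day of the month, truncated here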
Our problem is that the planner always predicts one row to be returned, although only part of the primary key is queried. This quickly makes performance unworkable - a query involving only a few days already takes dozens of seconds. All tables are analyzed and pg_stats looks reasonable IMHO.
PostgreSQL version number you are running:

postgres=# SELECT version();
                                                      version
------------------------------------------------------------------------------------------------------------------
 PostgreSQL 12.3 (Debian 12.3-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
(1 row)

postgres=# SELECT name, current_setting(name), source
postgres-# FROM pg_settings
postgres-# WHERE source NOT IN ('default', 'override');
              name              |             current_setting             |        source
--------------------------------+-----------------------------------------+----------------------
 application_name               | psql                                    | client
 checkpoint_completion_target   | 0.9                                     | configuration file
 client_encoding                | UTF8                                    | client
 cluster_name                   | 12/main                                 | configuration file
 DateStyle                      | ISO, DMY                                | configuration file
 default_text_search_config     | pg_catalog.german                       | configuration file
 dynamic_shared_memory_type     | posix                                   | configuration file
 effective_cache_size           | 6GB                                     | configuration file
 effective_io_concurrency       | 200                                     | configuration file
 enable_partitionwise_aggregate | on                                      | configuration file
 enable_partitionwise_join      | on                                      | configuration file
 external_pid_file              | /var/run/postgresql/12-main.pid         | configuration file
 lc_messages                    | de_DE.UTF-8                             | configuration file
 lc_monetary                    | de_DE.UTF-8                             | configuration file
 lc_numeric                     | de_DE.UTF-8                             | configuration file
 lc_time                        | de_DE.UTF-8                             | configuration file
 listen_addresses               | *                                       | configuration file
 log_line_prefix                | %m [%p] %q%u@%d                         | configuration file
 log_timezone                   | Etc/UTC                                 | configuration file
 maintenance_work_mem           | 512MB                                   | configuration file
 max_connections                | 300                                     | configuration file
 max_parallel_workers           | 2                                       | configuration file
 max_stack_depth                | 2MB                                     | environment variable
 max_wal_size                   | 2GB                                     | configuration file
 max_worker_processes           | 2                                       | configuration file
 min_wal_size                   | 256MB                                   | configuration file
 port                           | 5432                                    | configuration file
 random_page_cost               | 1.1                                     | configuration file
 shared_buffers                 | 2GB                                     | configuration file
 ssl                            | on                                      | configuration file
 ssl_cert_file                  | /etc/ssl/certs/ssl-cert-snakeoil.pem    | configuration file
 ssl_key_file                   | /etc/ssl/private/ssl-cert-snakeoil.key  | configuration file
 stats_temp_directory           | /var/run/postgresql/12-main.pg_stat_tmp | configuration file
 temp_buffers                   | 256MB                                   | configuration file
 TimeZone                       | Etc/UTC                                 | configuration file
 unix_socket_directories        | /var/run/postgresql                     | configuration file
 work_mem                       | 128MB                                   | configuration file
(37 rows)

Operating system and version: Linux dev 5.4.44-2-pve #1 SMP PVE 5.4.44-2 (Wed, 01 Jul 2020 16:37:57 +0200) x86_64 GNU/Linux
On a quad core virtualized machine with SSD storage and 16GB RAM.
What program you're using to connect to PostgreSQL: psql and IntelliJ
I'm trying to gather as much information as possible and focus on just one of the two tables (the problem persists in both, though):
-------------------------------------------------------------------------------------------------------
Structure:

CREATE TABLE location_statistics
(
daterange daterange NOT NULL,
spatial_feature_id INTEGER,
visitor_profile_id INTEGER,
activity_type_combination_id INTEGER,
activity_chain_id INTEGER NOT NULL,
visitors REAL,
dwell_time INTEGER,
travel_time INTEGER,
n INTEGER NOT NULL DEFAULT 1,
PRIMARY KEY ( daterange, spatial_feature_id, visitor_profile_id, activity_type_combination_id,
activity_chain_id),
FOREIGN KEY (daterange) REFERENCES dateranges (daterange) ON DELETE CASCADE ON UPDATE CASCADE,
FOREIGN KEY (spatial_feature_id) REFERENCES spatial_features (id) ON DELETE CASCADE ON UPDATE CASCADE,
FOREIGN KEY (visitor_profile_id) REFERENCES visitor_profiles (id) ON DELETE CASCADE ON UPDATE CASCADE,
FOREIGN KEY (activity_type_combination_id) REFERENCES activity_type_combinations (id) ON DELETE RESTRICT ON UPDATE CASCADE,
FOREIGN KEY (activity_chain_id) REFERENCES activity_chains (id) ON DELETE CASCADE ON UPDATE CASCADE
) PARTITION BY LIST (daterange);
-------------------------------------------------------------------------------------------------------
Creation of partitions:

CREATE OR REPLACE FUNCTION create_partition_tables(additional_dates TEXT[] = NULL)
RETURNS VOID
VOLATILE
LANGUAGE plpgsql
AS
$$
DECLARE
new_partition RECORD;
BEGIN
FOR new_partition IN
(
SELECT for_values_str,
master_table,
partition_name
FROM resolve_existing_partitions((additional_dates))
WHERE NOT existing
)
LOOP
EXECUTE ' CREATE TABLE '
|| new_partition.partition_name
|| ' PARTITION OF '
|| new_partition.master_table
|| ' FOR VALUES IN (' || new_partition.for_values_str || ')';
RAISE NOTICE 'Partition % for % created',new_partition.partition_name, new_partition.master_table;
END LOOP;
END
$$;
-------------------------------------------------------------------------------------------------------
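For reference, the function can simply be called without arguments, since additional_dates defaults to NULL; the exact text format it expects otherwise is defined by resolve_existing_partitions(), which is not shown here:

SELECT create_partition_tables();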
Size of table:
SELECT schemaname,relname,n_live_tup
FROM pg_stat_user_tables
where relname like 'location_statistics_y2019m03%'
ORDER BY n_live_tup DESC;
    schemaname     |            relname             | n_live_tup
-------------------+--------------------------------+------------
 mobility_insights | location_statistics_y2019m03d |   23569853
 mobility_insights | location_statistics_y2019m03w |   19264373
 mobility_insights | location_statistics_y2019m03  |   18105295
-------------------------------------------------------------------------------------------------------
select * from pg_stats
where tablename = 'location_statistics_y2019m03w';
schemaname tablename attname inherited null_frac avg_width n_distinct most_common_vals most_common_freqs histogram_bounds correlation most_common_elems most_common_elem_freqs elem_count_histogrammobility_insights location_statistics_y2019m03w daterange false 0 14 -1mobility_insights location_statistics_y2019m03w spatial_feature_id false 0 4 600 {12675,7869,7867,7892,7915,7963,12677,12683,12237,7909,7868,9478,7914,11309,7913,7911,12509,9510,7962,10547,9559,10471,11782,10590,9552,10554,9527,10488,12680,9546,11330,11409,9595,12293,10845,11469,10531,10467,9525,7927,11115,10541,10544,9509,9515,10637,10486,10859,9703,9591,11195,11657,7878,7938,7910,9560,9565,9532,11016,12435,12525,9578,7973,9558,10536,12650,9516,9547,7871,10537,10923,10812,12546,9574,12454,9511,10435,11840,7926,12540,8187,10469,7935,9504,9536,11203,7964,9484,10534,10538,12391,10888,8237,9501,9517,12516,10927,11102,7985,10527} {0.11813333630561829,0.06599999964237213,0.03723333403468132,0.031433332711458206,0.027033332735300064,0.023233333602547646,0.022333333268761635,0.0212333332747221,0.021166667342185974,0.02083333395421505,0.02033333294093609,0.0201666671782732,0.02006666734814644,0.019200000911951065,0.018833333626389503,0.01823333278298378,0.01510000042617321,0.014333332888782024,0.013633333146572113,0.013399999588727951,0.01146666705608368,0.011300000362098217,0.011233333498239517,0.011033332906663418,0.009666666388511658,0.009233333170413971,0.008433333598077297,0.007966666482388973,0.007966666482388973,0.007466666866093874,0.007300000172108412,0.007199999876320362,0.006566666532307863,0.006500000134110451,0.005799999926239252,0.00570000009611249,0.005166666582226753,0.004833333194255829,0.004766666796058416,0.004666666500270367,0.00423333328217268,0.0041333334520459175,0.004100000020116568,0.003966666758060455,0.0038333332631736994,0.0037666666321456432,0.003700000001117587,0.0035000001080334187,0.003433333244174719,0.0033666666131466627,0.0033333334140479565,0.003100000089034438,0.002933333395048976,0.00286666676402092,0.00283333333209157,0.00283333333209157,0.0026666666381061077,0.0024666667450219393,0.0024333333130925894,0.0024333333130925894,0.0024333333130925894,0.0023333332501351833,0.002266666619107127,0.002266666619107127,0.002266666619107127,0.002266666619107127,0.002233333420008421,0.002233333420008421,0.002199999988079071,0.002199999988079071,0.002199999988079071,0.002166666556149721,0.002166666556149721,0.002133333357051015,0.002099999925121665,0.0020666667260229588,0.0020666667260229588,0.0020666667260229588,0.002033333294093609,0.002033333294093609,0.0019333333475515246,0.0018666667165234685,0.0018333332845941186,0.0018333332845941186,0.0018333332845941186,0.0017999999690800905,0.0017666666535660625,0.0017666666535660625,0.0017666666535660625,0.0017666666535660625,0.0017666666535660625,0.0017333333380520344,0.0017000000225380063,0.0017000000225380063,0.0016666667070239782,0.0016333333915099502,0.0015999999595806003,0.0015999999595806003,0.001500000013038516,0.001500000013038516} 
{7870,7891,7906,7917,7954,7965,7966,7969,7974,7977,7979,7984,7986,8132,8171,8194,9479,9482,9488,9491,9493,9496,9498,9499,9503,9507,9512,9513,9520,9521,9524,9526,9530,9534,9537,9541,9544,9554,9562,9570,9573,9577,9581,9583,9586,9599,9675,9736,10436,10442,10450,10464,10482,10491,10495,10510,10513,10515,10516,10523,10529,10535,10539,10543,10553,10575,10602,10718,10756,10816,10882,10902,10928,11008,11025,11064,11158,11276,11316,11382,11486,11538,11602,11673,11731,11766,11775,11835,11906,12052,12088,12130,12277,12356,12383,12397,12408,12471,12545,12627,12678} 0.11252771mobility_insights location_statistics_y2019m03w visitor_profile_id false 0 4 9806 {3081,3114,2739,3642,2445,103,1625,1874,4005,2282,1550,3792,5564,750,1526,4427,2993,4881,1498,2682,5345,5601,8210,1613,2407,5019,1944,2266,3690,4529,4354,1218,11605,4126,5453,11698,11988,4207,6935,559,9151,12020,12048,12006,12049,3695,4874,5596,5945,6740,1366,7186,101,2026,5694,9152,4446,5788,8892,9365,11619,12027,871,5943,7567,7936,7939,8653,437,3971,5733,5961,7872,2728,3358,4154,4605,6187,9057,1967,4625,4837,5784,8910,1482,2036,6268,7557,8835,9,576,933,1686,2145,2229,3000,3692,4645,4666,5386} {0.0024666667450219393,0.0023333332501351833,0.002300000051036477,0.002199999988079071,0.0020666667260229588,0.002033333294093609,0.002033333294093609,0.002033333294093609,0.0019666666630655527,0.0019333333475515246,0.0019000000320374966,0.0018666667165234685,0.0018666667165234685,0.0018333332845941186,0.0018333332845941186,0.0018333332845941186,0.0017999999690800905,0.0017333333380520344,0.0016333333915099502,0.0015999999595806003,0.0015666666440665722,0.0015666666440665722,0.0015666666440665722,0.0015333333285525441,0.001500000013038516,0.001466666697524488,0.00143333338201046,0.00143333338201046,0.00143333338201046,0.00143333338201046,0.00139999995008111,0.001366666634567082,0.001366666634567082,0.0013333333190530539,0.0013333333190530539,0.0013333333190530539,0.0013333333190530539,0.0012666666880249977,0.0012666666880249977,0.0012333333725109696,0.0012333333725109696,0.0012333333725109696,0.0012333333725109696,0.0012000000569969416,0.0012000000569969416,0.0011666666250675917,0.0011666666250675917,0.0011666666250675917,0.0011666666250675917,0.0011666666250675917,0.0011333333095535636,0.0011333333095535636,0.0010999999940395355,0.0010999999940395355,0.0010999999940395355,0.0010999999940395355,0.0010666666785255075,0.0010666666785255075,0.0010666666785255075,0.0010666666785255075,0.0010666666785255075,0.0010666666785255075,0.0010333333630114794,0.0010333333630114794,0.0010333333630114794,0.0010333333630114794,0.0010333333630114794,0.0010333333630114794,0.0010000000474974513,0.0010000000474974513,0.0010000000474974513,0.0010000000474974513,0.0010000000474974513,0.0009666666737757623,0.0009666666737757623,0.0009666666737757623,0.0009666666737757623,0.0009666666737757623,0.0009666666737757623,0.0009333333582617342,0.0009333333582617342,0.0009333333582617342,0.0009333333582617342,0.0009333333582617342,0.0008999999845400453,0.0008999999845400453,0.0008999999845400453,0.0008999999845400453,0.0008999999845400453,0.0008666666690260172,0.0008666666690260172,0.0008666666690260172,0.0008666666690260172,0.0008666666690260172,0.0008666666690260172,0.0008666666690260172,0.0008666666690260172,0.0008666666690260172,0.0008666666690260172,0.0008666666690260172} 
{1,89,222,365,497,628,786,886,987,1108,1200,1320,1459,1584,1677,1812,1953,2080,2183,2306,2436,2581,2690,2798,2871,3018,3138,3294,3391,3525,3678,3783,3917,3992,4097,4253,4362,4442,4564,4693,4788,4897,5045,5157,5285,5414,5520,5630,5722,5843,5941,6041,6217,6444,6683,6892,7117,7330,7544,7730,7906,8076,8273,8471,8645,8789,8931,9063,9227,9378,9519,9610,9657,10667,10998,11483,11760,11960,12181,12262,12336,12440,12519,12629,13608,13782,13974,14116,14278,15670,16742,17892,18814,20657,23107,26119,31244,39466,59333,68728,83799} -0.03462254mobility_insights location_statistics_y2019m03w activity_type_combination_id false 0 4 145 {6,1,8,10,59,28,5,2,67,14,4,11,12,3,9,133,23,90,25,45,92,32,213,37,50,182,71,89,29,33,46,195,61,84,43,17,20,106,18,160,95,137,15,125,203,214,206,218,107,105,143,85,211,27,38,221,126,79,135,217,175,128,42,108,120,159,208,76,130} {0.15360000729560852,0.14463333785533905,0.11789999902248383,0.06403333693742752,0.056533332914114,0.04636666551232338,0.035466667264699936,0.033533334732055664,0.02669999934732914,0.026133334264159203,0.023900000378489494,0.0203000009059906,0.019866665825247765,0.019233332946896553,0.01876666583120823,0.011966666206717491,0.01126666646450758,0.010066666640341282,0.009533333592116833,0.009499999694526196,0.00860000029206276,0.008366666734218597,0.0077666668221354485,0.00706666661426425,0.006899999920278788,0.006866666488349438,0.006599999964237213,0.006466666702181101,0.00566666666418314,0.004999999888241291,0.004533333238214254,0.004533333238214254,0.004333333112299442,0.0041333334520459175,0.004000000189989805,0.0033333334140479565,0.0031999999191612005,0.0031333332881331444,0.0025333333760499954,0.002166666556149721,0.0020666667260229588,0.0016333333915099502,0.0015333333285525441,0.00143333338201046,0.0013000000035390258,0.0013000000035390258,0.0012666666880249977,0.0012666666880249977,0.0011666666250675917,0.0010333333630114794,0.0010333333630114794,0.0009333333582617342,0.0009333333582617342,0.0007666666642762721,0.000733333348762244,0.000733333348762244,0.000699999975040555,0.0006666666595265269,0.0006666666595265269,0.0006666666595265269,0.0006333333440124989,0.0006000000284984708,0.0004666666791308671,0.0004666666791308671,0.0004666666791308671,0.0004333333345130086,0.0004333333345130086,0.000366666674381122,0.000366666674381122} {22,26,36,54,54,54,64,64,70,77,87,88,96,97,98,98,101,112,114,114,118,119,127,127,131,138,145,148,148,151,151,153,153,155,155,155,163,164,164,165,166,169,169,170,170,173,176,180,184,187,187,187,194,194,201,201,201,219,227,227,228,231,231,232,233,233,251,256,272,274,286,303,303,315,324,490} 0.027344994mobility_insights location_statistics_y2019m03w activity_chain_id false 0 4 75638 {5161,5206,5162,5184,5195,5323,5397,5815,6530,5216,7603,6545,5153,6332,6981,7432,5818,5415,5596,7121,7531,5359,5618,5967,6393,7884,14611,21593,355,5325,5986,6407,23475,5213,6039,6385,6621,6849,9910,10026,11114,15860,164,165,200,5165,5262,5890,6043,6231,6659,6950,7251,7284,8228,8456,8923,9212,9851,9886,12203,12983,14685,16472,21550,43,271,307,992,5220,5243,5481,5482,5509,5516,5532,5603,5621,5757,5917,6026,6063,6139,6146,6210,6214,6464,6499,6671,6728,6758,6889,7010,7173,7643,8032,8081,8290,9676,10875} 
{0.002133333357051015,0.0017999999690800905,0.00143333338201046,0.0011333333095535636,0.0010333333630114794,0.000699999975040555,0.000699999975040555,0.000699999975040555,0.0006666666595265269,0.0006333333440124989,0.0006333333440124989,0.0005000000237487257,0.0004666666791308671,0.0004666666791308671,0.0004666666791308671,0.0004666666791308671,0.0004333333345130086,0.000366666674381122,0.000366666674381122,0.000366666674381122,0.000366666674381122,0.00033333332976326346,0.00033333332976326346,0.00033333332976326346,0.00033333332976326346,0.00033333332976326346,0.00033333332976326346,0.00033333332976326346,0.0003000000142492354,0.0003000000142492354,0.0003000000142492354,0.0003000000142492354,0.0003000000142492354,0.00026666666963137686,0.00026666666963137686,0.00026666666963137686,0.00026666666963137686,0.00026666666963137686,0.00026666666963137686,0.00026666666963137686,0.00026666666963137686,0.00026666666963137686,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503} {16,3832,5935,6980,8254,9534,11187,13024,15280,17910,20278,23752,27191,30933,35166,39736,44912,84588,87937,91731,96462,98710,99978,101481,102822,104232,105743,107178,108599,109896,111309,112882,114244,115636,117258,118951,120523,122033,123500,124882,126475,127916,129472,131137,132751,134476,135966,137506,139103,140651,142235,143923,145489,147256,148803,150223,151772,153331,155019,156745,158504,160131,161734,163321,164954,166505,168223,169899,171482,173009,174615,176117,177796,179595,181180,182924,184591,186335,188152,189909,191799,193278,194998,196949,198845,200761,202607,204272,206366,208030,209664,211457,213181,214854,216416,218122,219912,221852,223592,225495,227061} -0.13226064mobility_insights location_statistics_y2019m03w visitors false 0 4 141556 {2.231728,2.515927,1.690992,2.716124,1.666667,4.006526,4.547657,2.685691,2.042206,2.0369,2.907664,3.202489,3.321924,5,2.21855,0.357143,1.781995,2.773392,2.430318,3.585561,0.251593,0.294118,0.333333,0.416667,0.47619,1.997838,2.901269,3.665649,0.083864,0.166667,0.228721,0.278577,0.284229,0.3125,0.375056,0.833333,2.434593,2.616505,2.744186,2.95092,3.26703,3.7,3.959243} 
{0.0008999999845400453,0.0007999999797903001,0.0006333333440124989,0.0005666666547767818,0.0005333333392627537,0.0005333333392627537,0.0005333333392627537,0.0005000000237487257,0.0004333333345130086,0.00039999998989515007,0.00039999998989515007,0.000366666674381122,0.000366666674381122,0.000366666674381122,0.00033333332976326346,0.0003000000142492354,0.0003000000142492354,0.0003000000142492354,0.00026666666963137686,0.00026666666963137686,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503,0.00019999999494757503} {2e-06,0.00196,0.003629,0.00529,0.00717,0.00941,0.011622,0.013755,0.016387,0.019173,0.022388,0.02522,0.028369,0.031243,0.03431,0.037177,0.04011,0.043427,0.046591,0.04976,0.052685,0.05561,0.058774333,0.061956,0.065245,0.068608,0.072032,0.0754775,0.078632,0.081756,0.084959,0.088382,0.091822,0.095209,0.098459,0.102495,0.106105,0.109757,0.113244,0.116785,0.120467,0.124337,0.128564,0.132854,0.136804,0.140986,0.145268,0.149572,0.153727,0.157896,0.162,0.166096,0.170477,0.174326,0.178639,0.182968,0.187422,0.191749,0.19638,0.200433,0.205387,0.209918,0.214573,0.218993,0.224327,0.229155,0.234454,0.239658,0.244123,0.249223,0.254667,0.260309,0.265922,0.271871,0.277339,0.283247,0.289332,0.296549,0.303343,0.309744,0.317473,0.325838,0.335268,0.344108,0.352898,0.363003,0.3743145,0.387081,0.401563,0.420192,0.440096,0.461973,0.490929,0.528797,0.574014,0.652174,0.7746,1.056453,1.79342,2.771285,14.935622} 0.010959746mobility_insights location_statistics_y2019m03w dwell_time false 0 4 45441 {84600,82800,3600,4500,5400,8100,22500,24300,19800,85499,6300,7200,20700,23400,28800,3722,9000,15300,21600,10800,9900,10802,17100,79200,85500,11700,13500,14400,18900,25200,12600,16200,18000,83700,900,3672,3785,3885,5395,5803,5882,7227,27000,27900,43200,80100} {0.002199999988079071,0.0010000000474974513,0.0008333333535119891,0.0007999999797903001,0.0006666666595265269,0.0006000000284984708,0.0006000000284984708,0.0006000000284984708,0.0005666666547767818,0.0005666666547767818,0.0004666666791308671,0.0004666666791308671,0.0004666666791308671,0.0004666666791308671,0.0004666666791308671,0.00039999998989515007,0.00039999998989515007,0.00039999998989515007,0.00039999998989515007,0.000366666674381122,0.00033333332976326346,0.00033333332976326346,0.00033333332976326346,0.00033333332976326346,0.00033333332976326346,0.0003000000142492354,0.0003000000142492354,0.0003000000142492354,0.0003000000142492354,0.0003000000142492354,0.00026666666963137686,0.00026666666963137686,0.00026666666963137686,0.00026666666963137686,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356} 
{901,2191,3633,3768,3915,4052,4205,4339,4491,4656,4827,5001,5185,5397,5634,5858,6082,6301,6551,6807,7083,7382,7726,8047,8396,8763,9194,9619,9983,10422,10807,11222,11641,12068,12558,13041,13493,13974,14398,14902,15401,15892,16457,16919,17431,17930,18442,18975,19508,20077,20672,21238,21709,22227,22779,23430,24002,24556,25239,26011,26758,27547,28312,29178,29973,30780,31617,32484,33460,34584,35745,36979,38294,39664,41203,42960,44652,46476,48492,50223,52200,54421,56359,58815,61658,64739,67538,70443,73060,75490,77594,79466,80991,82197,83188,83999,84836,85406,85738,86091,86400} -0.066642396mobility_insights location_statistics_y2019m03w travel_time false 0 4 11756 {0,5,2700,900,3600,1800,3599,10,425,810,1680,2245} {0.5346666574478149,0.0006666666595265269,0.0005000000237487257,0.0004666666791308671,0.00039999998989515007,0.000366666674381122,0.00026666666963137686,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356,0.00023333333956543356} {2,139,279,423,551,648,752,852,937,1024,1112,1195,1286,1375,1451,1540,1631,1760,1861,1958,2058,2162,2264,2367,2470,2575,2683,2805,2912,3013,3146,3270,3373,3513,3604,3709,3824,3951,4067,4205,4328,4437,4532,4681,4841,5002,5147,5291,5452,5602,5763,5924,6060,6223,6390,6554,6719,6917,7109,7294,7490,7698,7904,8095,8299,8537,8724,8982,9242,9536,9775,10066,10363,10632,10933,11273,11643,12014,12368,12776,13176,13580,14021,14450,14922,15462,15934,16468,17097,17693,18538,19456,20254,21245,22403,23780,25470,27648,31072,36178,62080} 0.31811374mobility_insights location_statistics_y2019m03w n false 0 4 7 {1,2,3,4,5,6,7} {0.9218999743461609,0.04879999905824661,0.014600000344216824,0.0075333332642912865,0.0038666666951030493,0.0026000000070780516,0.000699999975040555} 0.85469824-------------------------------------------------------------------------------------------------------
Query:

EXPLAIN (ANALYZE, BUFFERS)
SELECT sum(visitors * n)
FROM location_statistics st
WHERE st.daterange = '[2019-03-04,2019-03-11)'::DATERANGE
  AND spatial_feature_id = 12675

QUERY PLAN
Aggregate  (cost=2.79..2.80 rows=1 width=8) (actual time=143.073..143.073 rows=1 loops=1)
  Buffers: shared hit=67334
  ->  Index Scan using location_statistics_y2019m03w_pkey on location_statistics_y2019m03w st  (cost=0.56..2.78 rows=1 width=8) (actual time=0.026..117.284 rows=516277 loops=1)
        Index Cond: ((daterange = '[2019-03-04,2019-03-11)'::daterange) AND (spatial_feature_id = 12675))
        Buffers: shared hit=67334
Planning Time: 0.082 ms
Execution Time: 143.095 ms
For completeness' sake:

EXPLAIN (ANALYZE, BUFFERS)
SELECT sum(visitors * n)
FROM location_statistics_y2019m03w st
WHERE st.daterange = '[2019-03-04,2019-03-11)'::DATERANGE
  AND spatial_feature_id = 12675

QUERY PLAN
Aggregate  (cost=2.79..2.80 rows=1 width=8) (actual time=156.304..156.305 rows=1 loops=1)
  Buffers: shared hit=66602 read=732
  ->  Index Scan using location_statistics_y2019m03w_pkey on location_statistics_y2019m03w st  (cost=0.56..2.78 rows=1 width=8) (actual time=0.194..111.464 rows=516277 loops=1)
        Index Cond: ((daterange = '[2019-03-04,2019-03-11)'::daterange) AND (spatial_feature_id = 12675))
        Buffers: shared hit=66602 read=732
Planning Time: 0.058 ms
Execution Time: 156.326 ms
As can be seen, the planner predicts one row to be returned, although it should be around 3% of the table (11% of the entries have the given ID, and they are distributed over 4 weeks = date ranges). Querying the partition table directly does not change this.
How can I solve this problem?
Thank you very much in advance.
Julian P. Wolf | Invenium Data Insights GmbH
julian.wolf@invenium.io | +43 664 88 199 013
Herrengasse 28 | 8010 Graz | www.invenium.io
On Tue, Jul 21, 2020 at 01:09:22PM +0000, Julian Wolf wrote:
> Our problem is, that the planner always predicts one row to be returned, although only a part of the primary key is queried. This problem exceeds feasibility of performance rapidly - a query only involving a few days already takes dozens of seconds. All tables are analyzed and pg_stats looks reasonable IMHO.
> daterange daterange NOT NULL,
> spatial_feature_id INTEGER,
> Aggregate (cost=2.79..2.80 rows=1 width=8) (actual time=143.073..143.073 rows=1 loops=1)
> Buffers: shared hit=67334
> -> Index Scan using location_statistics_y2019m03w_pkey on location_statistics_y2019m03w st (cost=0.56..2.78 rows=1 width=8) (actual time=0.026..117.284 rows=516277 loops=1)
> Index Cond: ((daterange = '[2019-03-04,2019-03-11)'::daterange) AND (spatial_feature_id = 12675))
> Buffers: shared hit=67334
>
> As can be seen, the planner predicts one row to be returned, although it should be around 3% (11% of the entries are of the given ID, which are distributed over 4 weeks = date ranges) of the table. Using the partition table directly, does not change this fact.

Is there a correlation between daterange and spacial_feature_id ?

Are the estimates good if you query on *only* daterange? spacial_feature_id ?

Maybe what you need is:
https://www.postgresql.org/docs/devel/sql-createstatistics.html
CREATE STATISTICS stats (dependencies) ON daterange, spacial_feature_id FROM location_statistics;
ANALYZE location_statistics;

--
Justin
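If the dependencies-statistics route is tried, the resulting statistics object can be checked in the catalogs afterwards, e.g. with something like:

SELECT stxrelid::regclass, stxname, stxkind
FROM pg_statistic_ext;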
Hello Justin,
thank you very much for your fast response.
> Is there a correlation between daterange and spacial_feature_id ?
I am not entirely sure what you mean by that. Basically, no, they are not correlated - spatial features are places on a map, date ranges are time periods. But since they are both part of the primary key of this particular table, they are correlated insofar as together they uniquely identify a row.
> Are the estimates good if you query on *only* daterange? spacial_feature_id ?
Unfortunately no, they are not:
------------------------------------------------------------------------------------------------------------------------------------------------
EXPLAIN (ANALYZE , BUFFERS)
SELECT sum(visitors * n)
FROM location_statistics st
WHERE st.daterange = '[2019-03-04,2019-03-11)'::DATERANGE
QUERY PLAN
Aggregate (cost=2.79..2.80 rows=1 width=8) (actual time=1143.393..1143.393 rows=1 loops=1)
Buffers: shared hit=304958
-> Index Scan using location_statistics_y2019m03w_pkey on location_statistics_y2019m03w st (cost=0.56..2.78 rows=1 width=8) (actual time=0.024..931.645 rows=4296639 loops=1)
Index Cond: (daterange = '[2019-03-04,2019-03-11)'::daterange)
Buffers: shared hit=304958
Planning Time: 0.080 ms
Execution Time: 1143.421 ms
------------------------------------------------------------------------------------------------------------------------------------------------
EXPLAIN (ANALYZE , BUFFERS)
SELECT sum(visitors * n)
FROM location_statistics_y2019m03w st
WHERE st.daterange = '[2019-03-04,2019-03-11)'::DATERANGE
QUERY PLAN
Aggregate (cost=2.79..2.80 rows=1 width=8) (actual time=1126.819..1126.820 rows=1 loops=1)
Buffers: shared hit=304958
-> Index Scan using location_statistics_y2019m03w_pkey on location_statistics_y2019m03w st (cost=0.56..2.78 rows=1 width=8) (actual time=0.023..763.852 rows=4296639 loops=1)
Index Cond: (daterange = '[2019-03-04,2019-03-11)'::daterange)
Buffers: shared hit=304958
Planning Time: 0.046 ms
Execution Time: 1126.845 ms
------------------------------------------------------------------------------------------------------------------------------------------------
Filtering only on spatial_feature_id is not the same query, as the table contains 4 different date ranges. Furthermore, there is no index for this operation, so I can only run the query against one partition; otherwise it would take days.
EXPLAIN (ANALYZE , BUFFERS)
SELECT sum(visitors * n)
FROM location_statistics_y2019m03w st
WHERE spatial_feature_id = 12675
QUERY PLAN
Finalize Aggregate (cost=288490.25..288490.26 rows=1 width=8) (actual time=1131.593..1131.593 rows=1 loops=1)
Buffers: shared hit=40156 read=139887
-> Gather (cost=288490.03..288490.24 rows=2 width=8) (actual time=1131.499..1148.872 rows=2 loops=1)
Workers Planned: 2
Workers Launched: 1
Buffers: shared hit=40156 read=139887
-> Partial Aggregate (cost=287490.03..287490.04 rows=1 width=8) (actual time=1118.578..1118.579 rows=1 loops=2)
Buffers: shared hit=40156 read=139887
-> Parallel Seq Scan on location_statistics_y2019m03w st (cost=0.00..280378.27 rows=948235 width=8) (actual time=3.544..1032.899 rows=1134146 loops=2)
Filter: (spatial_feature_id = 12675)
Rows Removed by Filter: 8498136
Buffers: shared hit=40156 read=139887
Planning Time: 0.218 ms
JIT:
Functions: 12
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 0.929 ms, Inlining 0.000 ms, Optimization 0.426 ms, Emission 6.300 ms, Total 7.655 ms
Execution Time: 1191.741 ms
The estimates seem to be good though.
Thanks in Advance
Julian
Julian P. Wolf | Invenium Data Insights GmbH
julian.wolf@invenium.io | +43 664 88 199 013
Herrengasse 28 | 8010 Graz | www.invenium.io
From: Justin Pryzby <pryzby@telsasoft.com>
Sent: Tuesday, July 21, 2020 7:27 PM
To: Julian Wolf <julian.wolf@invenium.io>
Cc: pgsql-performance Postgres Mailing List <pgsql-performance@lists.postgresql.org>
Subject: Re: Too few rows expected by Planner on partitioned tables
On Tue, Jul 21, 2020 at 01:09:22PM +0000, Julian Wolf wrote:
> Our problem is, that the planner always predicts one row to be returned, although only a part of the primary key is queried. This problem exceeds feasibility of performance rapidly - a query only involving a few days already takes dozens of seconds. All tables are analyzed and pg_stats looks reasonable IMHO.
> daterange daterange NOT NULL,
> spatial_feature_id INTEGER,
> Aggregate (cost=2.79..2.80 rows=1 width=8) (actual time=143.073..143.073 rows=1 loops=1)
> Buffers: shared hit=67334
> -> Index Scan using location_statistics_y2019m03w_pkey on location_statistics_y2019m03w st (cost=0.56..2.78 rows=1 width=8) (actual time=0.026..117.284 rows=516277 loops=1)
> Index Cond: ((daterange = '[2019-03-04,2019-03-11)'::daterange) AND (spatial_feature_id = 12675))
> Buffers: shared hit=67334
>
> As can be seen, the planner predicts one row to be returned, although it should be around 3% (11% of the entries are of the given ID, which are distributed over 4 weeks = date ranges) of the table. Using the partition table directly, does not change this fact.
Is there a correlation between daterange and spacial_feature_id ?
Are the estimates good if you query on *only* daterange? spacial_feature_id ?
Maybe what you need is:
https://www.postgresql.org/docs/devel/sql-createstatistics.html
CREATE STATISTICS stats (dependencies) ON daterange, spacial_feature_id FROM location_statistics;
ANALYZE location_statistics;
--
Justin
On Wed, Jul 22, 2020 at 06:33:17AM +0000, Julian Wolf wrote:
> Hello Justin,
>
> thank you very much for your fast response.
>
> > Is there a correlation between daterange and spacial_feature_id ?
> I am not entirely sure, what you mean by that. Basically, no, they are not correlated - spatial features are places on a map, date ranges are time periods. But, as they are both part of a primary key in this particular table, they are correlated in some way as to be a part of uniquely identifying a row.
>
> > Are the estimates good if you query on *only* daterange? spacial_feature_id ?
> Unfortunately no, they are not:

I checked and found that range types don't have "normal" statistics, and in particular seem to use a poor ndistinct estimate..

    /* Estimate that non-null values are unique */
    stats->stadistinct = -1.0 * (1.0 - stats->stanullfrac);

You could try to cheat and hardcode a different ndistinct that's "less wrong" by doing something like this:

ALTER TABLE t ALTER a SET (N_DISTINCT=-0.001); ANALYZE t;

Maybe a better way is to create an index ON: lower(range),upper(range)
And then query: WHERE (lower(a),upper(a)) = (1,112);

Since you'd be storing the values separately in the index anyway, maybe this means that range types won't work well for you for primary, searchable columns.

But if you're stuck with the schema, another kludge, if you want to do something extra weird, is to remove statistics entirely by disabling autoanalyze on the table and then manually run ANALYZE(columns) where columns doesn't include the range column. You'd have to remove the stats:

begin;
DELETE FROM pg_statistic s USING pg_attribute a
WHERE s.staattnum=a.attnum AND s.starelid=a.attrelid
  AND starelid='t'::regclass AND a.attname='a';

--
Justin
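Adapted to the weekly partition from this thread, the lower()/upper() workaround could look roughly like this (a sketch only, reusing the example query from the first mail):

-- expression index, so ANALYZE collects ordinary scalar statistics on the range bounds
CREATE INDEX ON location_statistics_y2019m03w (lower(daterange), upper(daterange), spatial_feature_id);
ANALYZE location_statistics_y2019m03w;

-- the query then filters on the bounds instead of on the range value itself
SELECT sum(visitors * n)
FROM location_statistics_y2019m03w st
WHERE (lower(st.daterange), upper(st.daterange)) = (DATE '2019-03-04', DATE '2019-03-11')
  AND st.spatial_feature_id = 12675;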
On Tue, Jul 21, 2020 at 01:09:22PM +0000, Julian Wolf wrote:
> Hello,
>
> A description of what you are trying to achieve and what results you expect:
> Our database is growing on a daily basis by about 2.5 million rows per table (2 at the moment). Because of that, we decided to partition the data, especially, as we are pre-aggregating the data for weeks, months, quarters and years. Every aggregation is stored in a separate partition:
> ...
> Our problem is, that the planner always predicts one row to be returned, although only a part of the primary key is queried. This problem exceeds feasibility of performance rapidly - a query only involving a few days already takes dozens of seconds. All tables are analyzed and pg_stats looks reasonable IMHO.
...
> PRIMARY KEY ( daterange, spatial_feature_id, visitor_profile_id, activity_type_combination_id,
>               activity_chain_id),
...
> ) PARTITION BY LIST (daterange);
> schemaname relname n_live_tup
> mobility_insights location_statistics_y2019m03d 23569853
> mobility_insights location_statistics_y2019m03w 19264373
> mobility_insights location_statistics_y2019m03 18105295
> Aggregate (cost=2.79..2.80 rows=1 width=8) (actual time=143.073..143.073 rows=1 loops=1)
> Buffers: shared hit=67334
> -> Index Scan using location_statistics_y2019m03w_pkey on location_statistics_y2019m03w st (cost=0.56..2.78 rows=1 width=8) (actual time=0.026..117.284 rows=516277 loops=1)
> Index Cond: ((daterange = '[2019-03-04,2019-03-11)'::daterange) AND (spatial_feature_id = 12675))
> Buffers: shared hit=67334

I guess this isn't actually the problem query, since it takes 143ms and not dozens of seconds. I don't know what is the problem query, but maybe it might help to create a new index on spatial_feature_id, which could be scanned rather than scanning the unique index.

Also, if daterange *and* spatial_feature_id are always *both* included, then this might work:

postgres=# CREATE STATISTICS t_stats (mcv) ON daterange,spatial_feature_id FROM t ; ANALYZE t;

--
Justin
Hi,
Thank you very much for your answers and sorry for the delayed response.
> I checked and found that range types don't have "normal" statistics, and in
> particular seem to use a poor ndistinct estimate..
> /* Estimate that non-null values are unique */
> stats->stadistinct = -1.0 * (1.0 - stats->stanullfrac);
I investigated this idea and played around with the n_distinct value, and you are absolutely right: the statistics do behave strangely with range types. Even creating extended statistics
(CREATE STATISTICS t_stats (mcv) ON daterange,spatial_feature_id FROM t ; ANALYZE t;)
doesn't change the fact.
I do get that range types were designed with GiST and range comparisons in mind, but since they are a really neat way to describe not only a date but also a granularity (i.e. "this data represents this exact week"), it would be really nice if these data types also worked well in primary keys and thus in b-tree indexes.
In my case, I replaced the daterange type with a BIGINT, which holds the exact same information at byte level. This value can be immutably converted back to a daterange and vice versa. This solved the problem for me.
Thank you very much for your time and help.
Best Regards
Julian P. Wolf | Invenium Data Insights GmbH
julian.wolf@invenium.io | +43 664 88 199 013
Herrengasse 28 | 8010 Graz | www.invenium.io
From: Justin Pryzby <pryzby@telsasoft.com>
Sent: Wednesday, July 22, 2020 4:40 PM
To: Julian Wolf <julian.wolf@invenium.io>
Cc: pgsql-performance Postgres Mailing List <pgsql-performance@lists.postgresql.org>
Subject: Re: Too few rows expected by Planner on partitioned tables
On Tue, Jul 21, 2020 at 01:09:22PM +0000, Julian Wolf wrote:
> Hello,
>
> A description of what you are trying to achieve and what results you expect:
> Our database is growing on a daily basis by about 2.5million rows per table (2 at the moment). Because of that, we decided to partition the data, especially, as we are pre-aggregating the data for weeks, months, quarters and years. Every aggregation is stored in a separate partition:
>
...
> Our problem is, that the planner always predicts one row to be returned, although only a part of the primary key is queried. This problem exceeds feasibility of performance rapidly - a query only involving a few days already takes dozens of seconds. All tables are analyzed and pg_stats looks reasonable IMHO.
...
> PRIMARY KEY ( daterange, spatial_feature_id, visitor_profile_id, activity_type_combination_id,
> activity_chain_id),
...
> ) PARTITION BY LIST (daterange);
> schemaname relname n_live_tup
> mobility_insights location_statistics_y2019m03d 23569853
> mobility_insights location_statistics_y2019m03w 19264373
> mobility_insights location_statistics_y2019m03 18105295
> Aggregate (cost=2.79..2.80 rows=1 width=8) (actual time=143.073..143.073 rows=1 loops=1)
> Buffers: shared hit=67334
> -> Index Scan using location_statistics_y2019m03w_pkey on location_statistics_y2019m03w st (cost=0.56..2.78 rows=1 width=8) (actual time=0.026..117.284 rows=516277 loops=1)
> Index Cond: ((daterange = '[2019-03-04,2019-03-11)'::daterange) AND (spatial_feature_id = 12675))
> Buffers: shared hit=67334
I guess this isn't actually the problem query, since it takes 143ms and not
dozens of seconds. I don't know what is the problem query, but maybe it might
help to create an new index on spatial_feature_id, which could be scanned
rather than scanning the unique index.
Also, if daterange *and* spatial_feature_id are always *both* included, then
this might work:
postgres=# CREATE STATISTICS t_stats (mcv) ON daterange,spatial_feature_id FROM t ; ANALYZE t;
--
Justin
Hi Justin,
thank you very much for your help and sorry for the late answer.
After testing around with your suggestions, it actually was the daterange type that caused all the problems. Messing around with the statistics value improved performance drastically but did not solve the problem. We decided to replace the daterange type with a BIGINT and compute the "id" of a daterange as its BIGINT representation (2x 4 bytes). Thus, it can be transformed in both directions immutably.
CREATE OR REPLACE FUNCTION to_daterange_id(daterange DATERANGE)
RETURNS BIGINT
IMMUTABLE
LANGUAGE plpgsql
AS
$$
BEGIN
return (extract(EPOCH FROM lower(daterange))::BIGINT << 32) |
extract(EPOCH FROM upper(daterange))::BIGINT;
end;
$$;
--------------------------------------------------------------------------------------------------------------
CREATE OR REPLACE FUNCTION to_daterange(daterange_id BIGINT)
RETURNS DATERANGE
IMMUTABLE
LANGUAGE plpgsql
AS
$$
BEGIN
RETURN daterange(to_timestamp(daterange_id >> 32)::DATE, to_timestamp(daterange_id & x'FFFFFFFF'::BIGINT)::DATE);
END;
$$;
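For illustration, a quick round-trip check of the two helpers (assuming both functions above are installed; note that the DATE casts in to_daterange depend on the session TimeZone, which is Etc/UTC in this setup):

SELECT to_daterange(to_daterange_id('[2019-03-04,2019-03-11)'::DATERANGE));
-- expected: [2019-03-04,2019-03-11)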
So there is no daterange object messing up the primary key index. Your other suggestions sadly didn't work: as the daterange was also the partition key of the table, this field was inevitably the first criterion in all queries and thus overruled every other index.
With that said and done, it would be nice if daterange objects could be used in unique indexes too. They are a great way to identify data that represents a week's, month's, etc. worth of data (similar to a two-column date representation).
Thank you very much again for your time and help
Julian
Julian P. Wolf | Invenium Data Insights GmbH
julian.wolf@invenium.io | +43 664 88 199 013
Herrengasse 28 | 8010 Graz | www.invenium.io
From: Justin Pryzby <pryzby@telsasoft.com>
Sent: Wednesday, July 22, 2020 4:40 PM
To: Julian Wolf <julian.wolf@invenium.io>
Cc: pgsql-performance Postgres Mailing List <pgsql-performance@lists.postgresql.org>
Subject: Re: Too few rows expected by Planner on partitioned tables
On Tue, Jul 21, 2020 at 01:09:22PM +0000, Julian Wolf wrote:
> Hello,
>
> A description of what you are trying to achieve and what results you expect:
> Our database is growing on a daily basis by about 2.5million rows per table (2 at the moment). Because of that, we decided to partition the data, especially, as we are pre-aggregating the data for weeks, months, quarters and years. Every aggregation is stored in a separate partition:
>
...
> Our problem is, that the planner always predicts one row to be returned, although only a part of the primary key is queried. This problem exceeds feasibility of performance rapidly - a query only involving a few days already takes dozens of seconds. All tables are analyzed and pg_stats looks reasonable IMHO.
...
> PRIMARY KEY ( daterange, spatial_feature_id, visitor_profile_id, activity_type_combination_id,
> activity_chain_id),
...
> ) PARTITION BY LIST (daterange);
> schemaname relname n_live_tup
> mobility_insights location_statistics_y2019m03d 23569853
> mobility_insights location_statistics_y2019m03w 19264373
> mobility_insights location_statistics_y2019m03 18105295
> Aggregate (cost=2.79..2.80 rows=1 width=8) (actual time=143.073..143.073 rows=1 loops=1)
> Buffers: shared hit=67334
> -> Index Scan using location_statistics_y2019m03w_pkey on location_statistics_y2019m03w st (cost=0.56..2.78 rows=1 width=8) (actual time=0.026..117.284 rows=516277 loops=1)
> Index Cond: ((daterange = '[2019-03-04,2019-03-11)'::daterange) AND (spatial_feature_id = 12675))
> Buffers: shared hit=67334
I guess this isn't actually the problem query, since it takes 143ms and not
dozens of seconds. I don't know what is the problem query, but maybe it might
help to create an new index on spatial_feature_id, which could be scanned
rather than scanning the unique index.
Also, if daterange *and* spatial_feature_id are always *both* included, then
this might work:
postgres=# CREATE STATISTICS t_stats (mcv) ON daterange,spatial_feature_id FROM t ; ANALYZE t;
--
Justin
On Wed, Aug 26, 2020, 1:37 AM Julian Wolf <julian.wolf@invenium.io> wrote:
Hi Justin,
thank you very much for your help and sorry for the late answer.
After testing around with your suggestions, it actually was the daterange type which caused all the problems. Messing around with the statistics value improved performance drastically but did not solve the problem. We decided to replace the daterange type with a BIGINT and calculate the "id" of the daterange by just using the BIGINT (2x 4 bytes) representation of the daterange. Thus, it can be transformed in both directions immutably.

CREATE OR REPLACE FUNCTION to_daterange_id(daterange DATERANGE)
RETURNS BIGINT
IMMUTABLE
LANGUAGE plpgsql
AS
$$
BEGIN
return (extract(EPOCH FROM lower(daterange))::BIGINT << 32) |
extract(EPOCH FROM upper(daterange))::BIGINT;
end;
$$;
--------------------------------------------------------------------------------------------------------------
CREATE OR REPLACE FUNCTION to_daterange(daterange_id BIGINT)
RETURNS DATERANGE
IMMUTABLE
LANGUAGE plpgsql
AS
$$
BEGIN
RETURN daterange(to_timestamp(daterange_id >> 32)::DATE, to_timestamp(daterange_id & x'FFFFFFFF'::BIGINT)::DATE);
END;
$$;
You might want to consider changing that language declaration to SQL.
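For example, the first helper rewritten as a SQL function might look roughly like this (a sketch, not taken from the thread; the body is the same expression, referencing the argument via $1):

CREATE OR REPLACE FUNCTION to_daterange_id(daterange DATERANGE)
    RETURNS BIGINT
    IMMUTABLE
    LANGUAGE sql
AS
$$
-- same packing logic as the plpgsql version, written as a single SQL expression
SELECT (extract(EPOCH FROM lower($1))::BIGINT << 32) |
       extract(EPOCH FROM upper($1))::BIGINT;
$$;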