Thread: Slaves show different results for query
Hi, I have a strange case where a SELECT for a primary key returns 0 rows on one slave, while it returns the correct 1 row onanother slave and on the master. It does however return that row on all slaves when queried with LIKE and trailing or leadingwildcard. psql version is 9.3.5, because that's what comes with Ubuntu 14.04 LTS Master runs Ubuntu 14.04 LTS Slave 1 runs also Ubuntu 14.04 LTS Slave 2 runs Mac OS X 10.7, pgsql installed via homebrew Both slaves are configured with streaming replication, and I've been using that setup for years, starting with psql 9.1,with no problems so far. Suspecting some weird problem, I already re-initialized slave 2 with a fresh backup and startedreplication from the beginning, so the database is fresh from a master copy, and is verified to be current. 2015-05-19 20:53:43.937 CEST LOG: entering standby mode 2015-05-19 20:53:43.974 CEST LOG: redo starts at 31/3F000028 2015-05-19 20:53:45.522 CEST LOG: consistent recovery state reached at 31/40CCE6E8 2015-05-19 20:53:45.523 CEST LOG: database system is ready to accept read only connections 2015-05-19 20:53:45.604 CEST LOG: started streaming WAL from primary at 31/41000000 on timeline 1 So here's the query. SELECT * FROM MyTable WHERE email = 'foo@example.com'; This returns 1 row on master and slave 1, but 0 on slave 2, while this query: SELECT * FROM MyTable WHERE email LIKE 'foo@example.com%'; or this one SELECT * FROM MyTable WHERE email LIKE '%foo@example.com'; returns the correct 1 row on all three systems. Note that this works with the wildcard on either end, or also somewhere inthe middle, doesn't matter. Note: "email" is the primary key on this table. This behaviour is the same with any address to be queried, and is also the same on a similar second table. This does NOToccur on any other table, which all have integer primary keys. There is also no problem when I select for other attributeson these tables. Does anyone have a hint? Thanks Maik
First, are your postgresql.conf options the same on master and slave2 with regard to memory allocation and all Planner Method Configuration options?
Next, is it possible you have a corrupted index on slave2.?SELECT n.nspname as schema,
i.relname as table,
i.indexrelname as index,
i.idx_scan,
i.idx_tup_read,
i.idx_tup_fetch,
CASE WHEN idx.indisprimary
THEN 'pkey'
WHEN idx.indisunique
THEN 'uidx'
ELSE 'idx'
END AS type,
pg_get_indexdef(idx.indexrelid),
CASE WHEN idx.indisvalid
THEN 'valid'
ELSE 'INVALID'
END as istatus,
pg_relation_size(quote_ident(n.nspname)|| '.' || quote_ident(i.relname)) as size_in_bytes,
pg_size_pretty(pg_relation_size(quote_ident(n.nspname)|| '.' || quote_ident(i.relname))) as size
FROM pg_stat_all_indexes i
JOIN pg_class c ON (c.oid = i.relid)
JOIN pg_namespace n ON (n.oid = c.relnamespace)
JOIN pg_index idx ON (idx.indexrelid = i.indexrelid )
WHERE i.relname = 'MyTable'
ORDER BY 1, 2, 3;
On Wed, May 20, 2015 at 2:52 AM, Musall Maik <lists@musall.de> wrote:
Hi,
I have a strange case where a SELECT for a primary key returns 0 rows on one slave, while it returns the correct 1 row on another slave and on the master. It does however return that row on all slaves when queried with LIKE and trailing or leading wildcard.
psql version is 9.3.5, because that's what comes with Ubuntu 14.04 LTS
Master runs Ubuntu 14.04 LTS
Slave 1 runs also Ubuntu 14.04 LTS
Slave 2 runs Mac OS X 10.7, pgsql installed via homebrew
Both slaves are configured with streaming replication, and I've been using that setup for years, starting with psql 9.1, with no problems so far. Suspecting some weird problem, I already re-initialized slave 2 with a fresh backup and started replication from the beginning, so the database is fresh from a master copy, and is verified to be current.
2015-05-19 20:53:43.937 CEST LOG: entering standby mode
2015-05-19 20:53:43.974 CEST LOG: redo starts at 31/3F000028
2015-05-19 20:53:45.522 CEST LOG: consistent recovery state reached at 31/40CCE6E8
2015-05-19 20:53:45.523 CEST LOG: database system is ready to accept read only connections
2015-05-19 20:53:45.604 CEST LOG: started streaming WAL from primary at 31/41000000 on timeline 1
So here's the query.
SELECT * FROM MyTable WHERE email = 'foo@example.com';
This returns 1 row on master and slave 1, but 0 on slave 2, while this query:
SELECT * FROM MyTable WHERE email LIKE 'foo@example.com%';
or this one
SELECT * FROM MyTable WHERE email LIKE '%foo@example.com';
returns the correct 1 row on all three systems. Note that this works with the wildcard on either end, or also somewhere in the middle, doesn't matter. Note: "email" is the primary key on this table.
This behaviour is the same with any address to be queried, and is also the same on a similar second table. This does NOT occur on any other table, which all have integer primary keys. There is also no problem when I select for other attributes on these tables.
Does anyone have a hint?
Thanks
Maik
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

Hi Melvin,
slave2 has less shared_buffers (256m vs 2048m), temp_buffers (16m vs 128m), work_mem (8m vs 16m) and maintenance_work_mem (8m vs 16m) due to hardware constraints. All other settings are the same and mostly default.
Unlikely, as I replaced the slave2 db with a fresh backup from master yesterday to rule this out.
All three (master, slave1, slave2) use the same plan:
thanks for the response.
Am 20.05.2015 um 14:26 schrieb Melvin Davidson <melvin6925@gmail.com>:First, are your postgresql.conf options the same on master and slave2 with regard to memory allocation and all Planner Method Configuration options?
Next, is it possible you have a corrupted index on slave2.?
I would suggest verifying there is no difference in the query plan between master and slave2.IE: EXPLAIN SELECT * FROM MyTable WHERE email = 'foo@example.com';
Index Scan using mytable_pk on mytable (cost=0.42..8.44 rows=1 width=205)
Index Cond: ((email)::text = 'foo@example.com'::text)
Check your indexes with:
SELECT n.nspname as schema,
i.relname as table,
i.indexrelname as index,
i.idx_scan,
i.idx_tup_read,
i.idx_tup_fetch,
CASE WHEN idx.indisprimary
THEN 'pkey'
WHEN idx.indisunique
THEN 'uidx'
ELSE 'idx'
END AS type,
pg_get_indexdef(idx.indexrelid),
CASE WHEN idx.indisvalid
THEN 'valid'
ELSE 'INVALID'
END as istatus,
pg_relation_size(quote_ident(n.nspname)|| '.' || quote_ident(i.relname)) as size_in_bytes,
pg_size_pretty(pg_relation_size(quote_ident(n.nspname)|| '.' || quote_ident(i.relname))) as size
FROM pg_stat_all_indexes i
JOIN pg_class c ON (c.oid = i.relid)
JOIN pg_namespace n ON (n.oid = c.relnamespace)
JOIN pg_index idx ON (idx.indexrelid = i.indexrelid )
WHERE i.relname = 'MyTable'
ORDER BY 1, 2, 3;
master:
schema | table | index | idx_scan | idx_tup_read | idx_tup_fetch | type | pg_get_indexdef | istatus | size_in_bytes | size
--------+---------+-----------------------------+----------+--------------+---------------+------+-----------------------------------------------------------------------------------+---------+---------------+--------
public | mytable | mytable_emailaddressref_idx | 6 | 11 | 4 | idx | CREATE INDEX mytable_emailaddressref_idx ON mytable USING btree (emailaddressref) | valid | 123609088 | 118 MB
public | mytable | mytable_pk | 1291541 | 1305655 | 1291371 | pkey | CREATE UNIQUE INDEX mytable_pk ON mytable USING btree (email) | valid | 123609088 | 118 MB
public | mytable | mytable_syncstatus_idx | 3710 | 2250428 | 0 | idx | CREATE INDEX mytable_syncstatus_idx ON mytable USING btree (syncstatus) | valid | 123609088 | 118 MB
public | mytable | mytable_userref_idx | 0 | 0 | 0 | idx | CREATE INDEX mytable_userref_idx ON mytable USING btree (userref) | valid | 123609088 | 118 MB
slave1:
schema | table | index | idx_scan | idx_tup_read | idx_tup_fetch | type | pg_get_indexdef | istatus | size_in_bytes | size
--------+---------+-----------------------------+----------+--------------+---------------+------+-----------------------------------------------------------------------------------+---------+---------------+--------
public | mytable | mytable_emailaddressref_idx | 0 | 0 | 0 | idx | CREATE INDEX mytable_emailaddressref_idx ON mytable USING btree (emailaddressref) | valid | 123609088 | 118 MB
public | mytable | mytable_pk | 3 | 13 | 3 | pkey | CREATE UNIQUE INDEX mytable_pk ON mytable USING btree (email) | valid | 123609088 | 118 MB
public | mytable | mytable_syncstatus_idx | 0 | 0 | 0 | idx | CREATE INDEX mytable_syncstatus_idx ON mytable USING btree (syncstatus) | valid | 123609088 | 118 MB
public | mytable | mytable_userref_idx | 0 | 0 | 0 | idx | CREATE INDEX mytable_userref_idx ON mytable USING btree (userref) | valid | 123609088 | 118 MB
slave2:
schema | table | index | idx_scan | idx_tup_read | idx_tup_fetch | type | pg_get_indexdef | istatus | size_in_bytes | size
--------+---------+-----------------------------+----------+--------------+---------------+------+-----------------------------------------------------------------------------------+---------+---------------+--------
public | mytable | mytable_emailaddressref_idx | 1 | 3 | 1 | idx | CREATE INDEX mytable_emailaddressref_idx ON mytable USING btree (emailaddressref) | valid | 123609088 | 118 MB
public | mytable | mytable_pk | 15 | 0 | 0 | pkey | CREATE UNIQUE INDEX mytable_pk ON mytable USING btree (email) | valid | 123609088 | 118 MB
public | mytable | mytable_syncstatus_idx | 0 | 0 | 0 | idx | CREATE INDEX mytable_syncstatus_idx ON mytable USING btree (syncstatus) | valid | 123609088 | 118 MB
public | mytable | mytable_userref_idx | 0 | 0 | 0 | idx | CREATE INDEX mytable_userref_idx ON mytable USING btree (userref) | valid | 123609088 | 118 MB
Now there are a few different numbers, although I wasn't really successful trying to quickly read up what that means. And I can understand that different stats there can lead to different performance, but I think whatever the stats are and the execution plan is, this must not lead to a false result, right?
Besides, the scales look similar in slave1 and slave2, although the query results are different.
Maik
On Wed, May 20, 2015 at 2:52 AM, Musall Maik <lists@musall.de> wrote:Hi,
I have a strange case where a SELECT for a primary key returns 0 rows on one slave, while it returns the correct 1 row on another slave and on the master. It does however return that row on all slaves when queried with LIKE and trailing or leading wildcard.
psql version is 9.3.5, because that's what comes with Ubuntu 14.04 LTS
Master runs Ubuntu 14.04 LTS
Slave 1 runs also Ubuntu 14.04 LTS
Slave 2 runs Mac OS X 10.7, pgsql installed via homebrew
Both slaves are configured with streaming replication, and I've been using that setup for years, starting with psql 9.1, with no problems so far. Suspecting some weird problem, I already re-initialized slave 2 with a fresh backup and started replication from the beginning, so the database is fresh from a master copy, and is verified to be current.
2015-05-19 20:53:43.937 CEST LOG: entering standby mode
2015-05-19 20:53:43.974 CEST LOG: redo starts at 31/3F000028
2015-05-19 20:53:45.522 CEST LOG: consistent recovery state reached at 31/40CCE6E8
2015-05-19 20:53:45.523 CEST LOG: database system is ready to accept read only connections
2015-05-19 20:53:45.604 CEST LOG: started streaming WAL from primary at 31/41000000 on timeline 1
So here's the query.
SELECT * FROM MyTable WHERE email = 'foo@example.com';
This returns 1 row on master and slave 1, but 0 on slave 2, while this query:
SELECT * FROM MyTable WHERE email LIKE 'foo@example.com%';
or this one
SELECT * FROM MyTable WHERE email LIKE '%foo@example.com';
returns the correct 1 row on all three systems. Note that this works with the wildcard on either end, or also somewhere in the middle, doesn't matter. Note: "email" is the primary key on this table.
This behaviour is the same with any address to be queried, and is also the same on a similar second table. This does NOT occur on any other table, which all have integer primary keys. There is also no problem when I select for other attributes on these tables.
Does anyone have a hint?
Thanks
Maik
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
On 05/19/2015 11:52 PM, Musall Maik wrote: > Hi, > > I have a strange case where a SELECT for a primary key returns 0 rows on one slave, while it returns the correct 1 rowon another slave and on the master. It does however return that row on all slaves when queried with LIKE and trailingor leading wildcard. > > psql version is 9.3.5, because that's what comes with Ubuntu 14.04 LTS > Master runs Ubuntu 14.04 LTS > Slave 1 runs also Ubuntu 14.04 LTS > Slave 2 runs Mac OS X 10.7, pgsql installed via homebrew > > Both slaves are configured with streaming replication, and I've been using that setup for years, starting with psql 9.1,with no problems so far. Suspecting some weird problem, I already re-initialized slave 2 with a fresh backup and startedreplication from the beginning, so the database is fresh from a master copy, and is verified to be current. > > 2015-05-19 20:53:43.937 CEST LOG: entering standby mode > 2015-05-19 20:53:43.974 CEST LOG: redo starts at 31/3F000028 > 2015-05-19 20:53:45.522 CEST LOG: consistent recovery state reached at 31/40CCE6E8 > 2015-05-19 20:53:45.523 CEST LOG: database system is ready to accept read only connections > 2015-05-19 20:53:45.604 CEST LOG: started streaming WAL from primary at 31/41000000 on timeline 1 > > > So here's the query. > > SELECT * FROM MyTable WHERE email = 'foo@example.com'; > > This returns 1 row on master and slave 1, but 0 on slave 2, while this query: > > SELECT * FROM MyTable WHERE email LIKE 'foo@example.com%'; > > or this one > > SELECT * FROM MyTable WHERE email LIKE '%foo@example.com'; > > returns the correct 1 row on all three systems. Note that this works with the wildcard on either end, or also somewherein the middle, doesn't matter. Note: "email" is the primary key on this table. > > This behaviour is the same with any address to be queried, and is also the same on a similar second table. This does NOToccur on any other table, which all have integer primary keys. There is also no problem when I select for other attributeson these tables. > > Does anyone have a hint? What are the encodings on the various machines and in the databases? > > Thanks > Maik > > > -- Adrian Klaver adrian.klaver@aklaver.com
> Am 20.05.2015 um 15:17 schrieb Adrian Klaver <adrian.klaver@aklaver.com>: > > On 05/19/2015 11:52 PM, Musall Maik wrote: >> Hi, >> >> I have a strange case where a SELECT for a primary key returns 0 rows on one slave, while it returns the correct 1 rowon another slave and on the master. It does however return that row on all slaves when queried with LIKE and trailingor leading wildcard. >> >> […] >> Does anyone have a hint? > > What are the encodings on the various machines and in the databases? All encodings UTF8, all collate and ctype en_US.UTF-8. What do you mean by "machines" exactly? This is not dependent on shellenvironment or something, I get this also via JDBC. Maik
On 05/20/2015 06:34 AM, Musall Maik wrote: > >> Am 20.05.2015 um 15:17 schrieb Adrian Klaver <adrian.klaver@aklaver.com>: >> >> On 05/19/2015 11:52 PM, Musall Maik wrote: >>> Hi, >>> >>> I have a strange case where a SELECT for a primary key returns 0 rows on one slave, while it returns the correct 1 rowon another slave and on the master. It does however return that row on all slaves when queried with LIKE and trailingor leading wildcard. >>> >>> […] > >>> Does anyone have a hint? >> >> What are the encodings on the various machines and in the databases? > > All encodings UTF8, all collate and ctype en_US.UTF-8. What do you mean by "machines" exactly? This is not dependent onshell environment or something, Actually it is: http://www.postgresql.org/docs/9.4/interactive/locale.html "Locale support refers to an application respecting cultural preferences regarding alphabets, sorting, number formatting, etc. PostgreSQL uses the standard ISO C and POSIX locale facilities provided by the server operating system. For additional information refer to the documentation of your system." The reason I ask is that the machine you are having problems with has OS X. Over the years I have seen quite a few reports on this list of OS X locale/encoding issues. I get this also via JDBC. > > Maik > > > -- Adrian Klaver adrian.klaver@aklaver.com
Adrian Klaver <adrian.klaver@aklaver.com> writes: > The reason I ask is that the machine you are having problems with has OS > X. Over the years I have seen quite a few reports on this list of OS X > locale/encoding issues. Yes. Here's the problem: OS X UTF8 locales (other than C) don't sort the same as UTF8 locales on Linux. Because of this, the index created by the master is effectively corrupt from the standpoint of the OS X slave; it's not in the correct sort order. It might sometimes find the right results anyway, but often not. You might be able to get away with the described configuration if you recreate the database using C locale, but personally I wouldn't risk it. Masters and slaves in a WAL-shipping replication configuration should be on substantially identical platforms, else you're just asking for trouble. regards, tom lane
Hi Tom, that's very useful information. Very cool to get such results so quickly here. Thanks Maik > Am 20.05.2015 um 16:05 schrieb Tom Lane <tgl@sss.pgh.pa.us>: > > Adrian Klaver <adrian.klaver@aklaver.com> writes: >> The reason I ask is that the machine you are having problems with has OS >> X. Over the years I have seen quite a few reports on this list of OS X >> locale/encoding issues. > > Yes. Here's the problem: OS X UTF8 locales (other than C) don't sort the > same as UTF8 locales on Linux. Because of this, the index created by the > master is effectively corrupt from the standpoint of the OS X slave; it's > not in the correct sort order. It might sometimes find the right results > anyway, but often not. > > You might be able to get away with the described configuration if you > recreate the database using C locale, but personally I wouldn't risk it. > Masters and slaves in a WAL-shipping replication configuration should be > on substantially identical platforms, else you're just asking for trouble. > > regards, tom lane > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general