Thread: BUG #15230: "Logical decoding" is not sensitive to client encoding setting
BUG #15230: "Logical decoding" is not sensitive to client encoding setting
From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference: 15230
Logged by: Hillel Eilat
Email address: hillel.eilat@attunity.com
PostgreSQL version: 9.4.4
Operating system: Windows 7
Description:

Logical Decoding is not sensitive to Client character encoding setting

My project uses Logical Decoding by interacting with the WAL backend through the native non-SQL interface. The plugin used is the common "test_decoding", which ships with the kit.

There is a Japanese database whose encoding is defined as "EUC_JP". Ordinarily we process the streamed data in UTF8 client encoding, so that we can maintain common, general "consumer" functions. Consequently, before issuing PQconnectdbParams(keywords, values, true), a {"client_encoding","UTF8"} pair is added. To be on the safe side, a pair of calls, PQclientEncoding(pConn) / pg_encoding_to_char(iClientEncoding), is issued afterwards to confirm that UTF8 was properly set.

Despite the above setting, the data that is streamed in does not show up in UTF8. It retains the backend server's EUC_JP encoding.

This must be a bug. One would expect decoded data that is streamed in to be subject to the client encoding.

Your assistance will be appreciated.

Regards
Hillel.
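The symptom described in the report (streamed bytes that are not UTF8 even though client_encoding reports UTF8) can be checked on the consumer side by testing whether a received buffer is well-formed UTF-8. The following is only a minimal sketch and is not part of the reporter's code: the function name is_valid_utf8 is hypothetical, and it assumes the consumer already holds the raw payload bytes and their length. EUC_JP text typically fails this check because its high bytes do not follow UTF-8 lead/continuation rules, although a failed check only shows the data is not UTF-8, not which encoding it is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the thread): returns true if buf[0..len)
 * is well-formed UTF-8. Rejects stray continuation bytes, truncated
 * sequences, overlong forms, surrogates, and code points above U+10FFFF.
 */
static bool is_valid_utf8(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    size_t i = 0;
    while (i < len) {
        unsigned char c = buf[i];
        size_t need;        /* continuation bytes expected after the lead */
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest code point allowed for this length */

        if (c < 0x80) {                 /* plain ASCII */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return false;               /* stray continuation or invalid lead */
        }

        if (len - i - 1 < need)
            return false;               /* sequence truncated at buffer end */

        for (size_t k = 1; k <= need; k++) {
            unsigned char cc = buf[i + k];
            if ((cc & 0xC0) != 0x80)
                return false;           /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;               /* overlong, out of range, surrogate */

        i += need + 1;
    }
    return true;
}
```

For example, the hiragana letter U+3042 is the bytes E3 81 82 in UTF-8 and A4 A2 in EUC-JP; the first buffer passes this check and the second does not.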
Re: BUG #15230: "Logical decoding" is not sensitive to client encoding setting
From
Euler Taveira
Date:
2018-06-05 5:29 GMT-03:00 PG Bug reporting form <noreply@postgresql.org>:
> The plugin used is the common "test_decoding", which is shipped together
> with the kit.
>
What is the test_decoding output mode? By default, it uses textual mode. Did you set binary mode (force-binary=1)?

> There is a Japanese database for which encoding is defined as ""EUC_JP".
> Ordinarily - we process the streamed data in UTF8 client encoding - thus
> maintaining a common general "consumer" functions.
> Consequently, prior to issuing PQconnectdbParams(keywords, values, true) - a
> {"client_encoding","UTF8"} couple is introduced.
> To be on the safe side - a couple of PQclientEncoding(pConn) /
> pg_encoding_to_char(iClientEncoding) is issued thereafter,
> for approving that UTF8 was properly set.
>
client_encoding should be set in the replication connection because if you set it later it won't be passed down to libpqwalreceiver.

[1] https://www.postgresql.org/docs/9.4/static/logicaldecoding-output-plugin.html#LOGICALDECODING-OUTPUT-MODE

--
Euler Taveira
Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
RE: BUG #15230: "Logical decoding" is not sensitive to client encoding setting
From
Hillel Eilat
Date:
Thanks.

1. As per your question: the default (textual) decoding mode is used.
2. In fact, client_encoding is set in the replication connection. The problem is that it does not help. Data that is streamed in is represented in the server_encoding (Japanese in this case), while we expect UTF8, which was set as client_encoding.

To be more specific, here is the essence of the piece of C code that is used for establishing the connection via PQconnectdbParams(keywords, values, true);

This is the REPLICATION connection on which "START_REPLICATION SLOT "XXXXXXX" LOGICAL LLL/SSS" is executed later. One would expect data fetched via PQgetCopyData(...) thereafter to show up in the client_encoding representation. But this is not the case...

Your clarifications will be appreciated.

Thanks
Hillel.

    char *pszClientEncoding = "UTF8"; // Set client encoding

    i = 0; // Initial Array index
    keywords[i] = "dbname";
    values[i] = pszDbName == NULL ? "replication" : pszDbName;
    i++;
    keywords[i] = "replication";
    values[i] = pszDbName == NULL ? "true" : "database";
    i++;
    keywords[i] = "fallback_application_name";
    values[i] = pszProgName;
    i++;
    if (pszDbHost)
    {
        keywords[i] = "host";
        values[i] = pszDbHost;
        i++;
    }
    if (pszDbUser)
    {
        keywords[i] = "user";
        values[i] = pszDbUser;
        i++;
    }
    if (pszDbPort)
    {
        keywords[i] = "port";
        values[i] = pszDbPort;
        i++;
    }
    if (pszClientEncoding) // Set client encoding
    {
        keywords[i] = "client_encoding";
        values[i] = pszClientEncoding;
        i++;
    }

    /* Prompting for password here is not a matter of interest (the -"W" command option) */
    //need_password = (dbgetpassword == 1 && dbpassword == NULL);
    need_password = 0; // No point in this mechanism here
    //do
    {
        if (pszDbPassword)
        {
            keywords[i] = "password";
            values[i] = pszDecryptedPassword;
        }
        else
        {
            keywords[i] = NULL;
            values[i] = NULL;
        }

        tmpconn = PQconnectdbParams(keywords, values, true);
        if (!tmpconn)
        {
            pSetup->config.logger_error((char *)pszLoggingOrg,__LINE__,kPG_LOGGER_SEVERITY_ERROR,"PQconnectdbParams(...)- Could not connect to the server.");
            return NULL;
        }

        if (PQstatus(tmpconn) == CONNECTION_BAD &&
            PQconnectionNeedsPassword(tmpconn) &&
            dbgetpassword != -1)
        {
            AT_STR->snprintf(szMsg, sizeof(szMsg), "Could not connect to server. Missing or improper password: %s",ar_PQerrorMessage(tmpconn));
            pSetup->config.logger_error((char *)pszLoggingOrg,__LINE__,kPG_LOGGER_SEVERITY_ERROR,szMsg);
            ar_PQfinish(tmpconn);
            return NULL;
        }
    }
    //while (need_password);

-----Original Message-----
From: Euler Taveira [mailto:euler@timbira.com.br]
Sent: Thursday, June 14, 2018 5:28 PM
To: Hillel Eilat <Hillel.Eilat@attunity.com>; pgsql-bugs@lists.postgresql.org
Subject: Re: BUG #15230: "Logical decoding" is not sensitive to client encoding setting

2018-06-05 5:29 GMT-03:00 PG Bug reporting form <noreply@postgresql.org>:
> The plugin used is the common "test_decoding", which is shipped
> together with the kit.
>
What is the test_decoding output mode? By default, it uses textual mode. Did you set binary mode (force-binary=1)?

> There is a Japanese database for which encoding is defined as ""EUC_JP".
> Ordinarily - we process the streamed data in UTF8 client encoding -
> thus maintaining a common general "consumer" functions.
> Consequently, prior to issuing PQconnectdbParams(keywords, values,
> true) - a {"client_encoding","UTF8"} couple is introduced.
> To be on the safe side - a couple of PQclientEncoding(pConn) /
> pg_encoding_to_char(iClientEncoding) is issued thereafter, for
> approving that UTF8 was properly set.
>
client_encoding should be set in the replication connection because if you set it later it won't be passed down to libpqwalreceiver.

[1] https://www.postgresql.org/docs/9.4/static/logicaldecoding-output-plugin.html#LOGICALDECODING-OUTPUT-MODE

--
Euler Taveira
Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
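The thread as shown does not reach a resolution. While the stream keeps arriving in the server encoding, a consumer that needs UTF8 could convert each received payload itself. The following is a rough sketch of that workaround, not something proposed by either poster: the function name eucjp_to_utf8 is hypothetical, it relies on the POSIX iconv interface rather than on libpq, and it assumes that iconv's "EUC-JP" converter is an acceptable match for PostgreSQL's EUC_JP (the two may differ for some less common characters).

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the thread): convert `inlen` bytes of
 * EUC-JP text in `in` to UTF-8, writing at most `outcap` bytes to `out`.
 * Returns the number of UTF-8 bytes written, or (size_t)-1 on failure
 * (converter unavailable, invalid input, or output buffer too small).
 * The output is not NUL-terminated.
 */
static size_t eucjp_to_utf8(const char *in, size_t inlen,
                            char *out, size_t outcap)
{
    assert(in != NULL && out != NULL);

    /* iconv_open takes the target encoding first, then the source. */
    iconv_t cd = iconv_open("UTF-8", "EUC-JP");
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    /* iconv() takes char ** for the input but does not modify the bytes. */
    char *inp = (char *)in;
    char *outp = out;
    size_t inleft = inlen;
    size_t outleft = outcap;

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);

    if (rc == (size_t)-1 || inleft != 0)
        return (size_t)-1;

    return outcap - outleft;
}
```

In a consumer loop this would be applied to each buffer returned by PQgetCopyData(...) after the protocol header has been stripped; with the test_decoding textual output the payload is text in the server encoding, which is the case described in this thread. Opening the converter once and reusing it across messages would avoid the per-call setup cost; the sketch opens it per call only to stay short.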