Re: [PATCH] Make pg_basebackup configure and start standby [Review] - Mailing list pgsql-hackers
From | Boszormenyi Zoltan |
---|---|
Subject | Re: [PATCH] Make pg_basebackup configure and start standby [Review] |
Date | |
Msg-id | 5075D350.4030706@cybertec.at Whole thread Raw |
In response to | Re: [PATCH] Make pg_basebackup configure and start standby [Review] (Boszormenyi Zoltan <zb@cybertec.at>) |
Responses |
Re: [PATCH] Make pg_basebackup configure and start standby [Review]
|
List | pgsql-hackers |
2012-10-10 20:36 keltezéssel, Boszormenyi Zoltan írta: > 2012-10-10 18:23 keltezéssel, Fujii Masao írta: >> On Wed, Oct 10, 2012 at 10:12 PM, Boszormenyi Zoltan <zb@cybertec.at> wrote: >>> Hi, >>> >>> thanks for the new review. >>> >>> 2012-10-10 08:58 keltezéssel, Amit Kapila írta: >>>> On Thursday, October 04, 2012 9:50 PM Boszormenyi Zoltan >>>>> 2012-10-04 16:43 keltezéssel, Tom Lane írta: >>>>> >>>>>> Boszormenyi Zoltan <zb@cybertec.at> writes: >>>>>>>> Did you think about something like the attached code? >>>>>>> Or rather this one, which fixes a bug so fillPGconn() and >>>>>>> PQconninfo() are symmetric and work for "requiressl". >>>>>> That's incredibly ugly. I'm not sure where we should keep the "R" >>>>>> information, but shoehorning it into the existing PQconninfoOption >>>>>> struct like that seems totally unacceptable. Either we're willing to >>>>>> break backwards compatibility on the Disp-Char strings, or we need to >>>>>> put that info somewhere else. >>>>> I hope only handling the new flag part is ugly. It can be hidden in the >>>>> PQconninfoMapping[] array and PQconninfo(conn, true) pre-filters the >>>>> list as in the attached version. >>>> Please find the review of the patch. >>>> >>>> >>>> Basic stuff: >>>> ------------ >>>> - patch apply failed at exports.txt file. Needs rebase to the current >>>> master. >>> >>> Done. >>> >>>> - Compiles cleanly with no warnings >>>> >>>> >>>> What it does: >>>> -------------- >>>> pg_basebackup does the base backup from the primary machine and also >>>> writes >>>> the recovery.conf file with the option -R, >>>> which is used for the standby to connect to primary for streaming >>>> replication. >>>> >>>> Testing: >>>> --------- >>>> 1. Test pg_basebackup with option -R to check that the recovery.conf file >>>> is >>>> written to data directory. >>>> --recovery.conf file is created in data directory. >>>> 2. Test pg_basebackup with option -R to check that the recovery.conf >>>> file is >>>> not able to create because of disk full. >>>> --Error is given as recovery.conf file is not able to create. >>>> 3. Test pg_basebackup with option -R including -h, -U, -p, -w and -W. >>>> >>>> verify the recovery.conf which is created in data directory. >>>> --All the primary connection info parameters are working fine. >>>> 4. Test pg_basebackup with conflict options (-x or -X and -R). >>>> >>>> --Error is given when the conflict options are provided to >>>> pg_basebackup. >>>> >>>> 5. Test pg_basebackup with option -R from a standby server. >>>> --recovery.conf file is created in data directory. >>>> >>>> Code Review: >>>> ------------- >>>> 1. >>>> typedef struct PQconninfoMapping { >>>> + char *keyword; >>>> + size_t member_offset; >>>> + bool for_replication; >>>> + char *conninfoValue; /* Special value >>>> mapping >>>> */ >>>> + char *connValue; >>>> +} PQconninfoMapping; >>>> >>>> Provide the better explanation of conninfoValue and connValue, how and >>>> where >>>> these are used? >>> >>> OK. This is only used for " requiressl='1' " (in the connection string) >>> and if '1' is set, PGconn.sslmode will be set to "require". All other >>> values are treated as it's being unset. This simplistic mapping >>> is used because there is only one such setting where different values >>> are used on the conninfo and the PGconn sides. >>> >>> >>>> 2. if (tmp && strncmp(tmp, mapping->conninfoValue, len) == 0) >>>> >>>> In any case if the above condition is not satisfied then the PGconn data >>>> is >>>> not filled with PQconninfoOption. >>>> Is it correct? >>> >>> Yes, it stays NULL as makeEmptyPGconn() initialized it. This case only >>> happens >>> with the "requiressl" setting with its special mapping. If you set " >>> requiressl = '0' " >>> then it means that " requiressl='1' " was not set so the PGconn side stays >>> as NULL. >>> >>> The special casing was there in the old code too and behaves the same. >>> >>> >>>> Documentation: >>>> ------------- >>>> 1. >>>> + <para> >>>> + The <literal>for_replication</> argument can be used to exclude >>>> some >>>> >>>> + options from the list which are used by the walreceiver module. >>>> + <application>pg_basebackup</application> uses this pre-filtered >>>> list >>>> >>>> + to construct <literal>primary_conninfo</> in the automatically >>>> generated >>>> + recovery.conf file. >>>> + </para> >>>> >>>> I feel the explanation should be as follows, >>>> exclude some options from the list which are not used by the walreceiver >>>> module? >>> >>> Err, no. The list excludes those[1] that *are* used (would be >>> overridden) by the walreceiver module: >>> >>> ----8<--------8<--------8<--------8<--------8<---- >>> static bool >>> libpqrcv_connect(char *conninfo, XLogRecPtr startpoint) >>> { >>> ... >>> snprintf(conninfo_repl, sizeof(conninfo_repl), >>> "%s dbname=replication replication=true >>> fallback_application_name=walreceiver", >>> conninfo); >>> ----8<--------8<--------8<--------8<--------8<---- >>> >>> [1] Actually, more than these 3 options are excluded. The deprecated >>> ones are also excluded. >>> >>> >>>> Observations/Issues: >>>> ------------------- >>>> 1. If the password contains any escape sequence characters, It is leading >>>> to >>>> problems while walreceiver connecting to primary >>>> by using the primary conninfo from recovery.conf >>>> >>>> please log an warning message or a note in document to handle such a >>>> case >>>> manually by the user. >>> >>> Done at both places. >>> >>> Also, to adapt to the style of other error messages, now all my fprintf() >>> statements >>> are prefixed with: "%s: ...", progname. >> In new patches, when I ran "pg_basebackup -D hoge -c fast -R" on MacOS, >> I got the following error message. BTW, I compiled the patched PostgreSQL >> with --enable-debug and --enable-cassert options. >> >> pg_basebackup(41751) malloc: *** error for object 0x106001af0: pointer >> being freed was not allocated >> *** set a breakpoint in malloc_error_break to debug >> Abort trap: 6 > > Can you show a backtrace? I compiled it on Fedora 17/x86_64 with > --enable-depend --enable-debug --enable-cassert. GLIBC is also smart > enough to catch improper free() calls, too. I was able to test it on OSX and I found my bug. Attached is the new code. The problem was in conninfo_init(), the last entry in the filtered list was not initialized to 0. It seems that for some reason, my Linux machine gave me pre-initialized memory. > >> $ uname -a >> Darwin hrk.local 11.4.2 Darwin Kernel Version 11.4.2: Thu Aug 23 >> 16:25:48 PDT 2012; root:xnu-1699.32.7~1/RELEASE_X86_64 x86_64 >> >> >> When tar output format is specified together with -R option, recovery.conf is >> not included in base.tar. I think it should. > > Why? This patch only promises to write the recovery.conf into the > directory specified with -D. > >> + setting up the standby. Since creating a backup for a standalone >> + server with <option>-x</option> or <option>-X</option> and adding >> + a recovery.conf to it wouldn't make a working standby, these options >> + naturally conflict. >> >> Is this right? ISTM that basically we can use pg_basebackup -x to take >> a base backup for starting the standby for now. No? > > I don't know. Pointers? > > Thanks, > Zoltán Böszörményi > -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
pgsql-hackers by date: