Home > mailing lists

Re: PQputCopyEnd doesn't adhere to its API contract - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: PQputCopyEnd doesn't adhere to its API contract
Date	September 4, 2014 21:13:53
Msg-id	CA+TgmoYgNki4Ex7Spxs2tVmiThMQcVZ7BABbrUwjZbGUjKGWow@mail.gmail.com Whole thread Raw
In response to	Re: PQputCopyEnd doesn't adhere to its API contract (David G Johnston <david.g.johnston@gmail.com>)
Responses	Re: PQputCopyEnd doesn't adhere to its API contract
List	pgsql-hackers

Tree view

On Thu, Sep 4, 2014 at 1:18 PM, David G Johnston
<david.g.johnston@gmail.com> wrote:
> Specific observations would help though that is partly the idea - I've been
> more focused on clarity and organization even if it requires deviating from
> the current general documentation style.

OK.

-   to the network connection used by <application>libpq</application>.
+   to the connection used by <application>libpq</application>.

This change is unrelated to the topic at hand and seems not to be an
improvement.

+  <note>
+   <para>
+    Please review the function notes for specific interface protocols -
+    the following is a simplified overview.
+   </para>
+  </note>

This seems pointless.  Of course general documentation will be less
specific than documentation for specific functions.

+   First, the application issues the SQL   <command>COPY</command> command via <function>PQexec</function> or one
+   of the equivalent functions.  The response
+   will either be an error or a <structname>PGresult</> object bearing
+   a status code for <literal>PGRES_COPY_OUT</literal> or
+   <literal>PGRES_COPY_IN</literal> call implied by the specified copy
+   direction (TO or FROM respectively).

This implies that the response won't be a PGresult in the error case,
but of course the function has the same C result type either way.

+  <para>
+   Second, the application should use the functions in this
+   section to receive data rows or transmit buffer loads.  Buffer loads are
+   not guaranteed to be processed until the copy transfer is completed.
+  </para>

The main change here vs. the existing text is that you're now using
the phase "buffer loads" to refer to what gets transmitted, and "data
rows" to refer to what gets received.  The existing text uses the term
"data rows" for both, which seems correct to me.  My first reaction on
reading your revised text was "wait, what's a buffer load?".

+   Third, as lastly, when the data transfer is
+   complete the client must issue a PQgetResult to "commit" the copy transfer
+   and get the final <structname>PGresult</> object that indicates

I assume you mean "and lastly", since "as lastly" doesn't sound like
good English.

-   At this point further SQL commands can be issued via
-   <function>PQexec</function>.  (It is not possible to execute other SQL
-   commands using the same connection while the <command>COPY</command>
-   operation is in progress.)

Removing this text doesn't seem like a good idea.  It's a quite
helpful clarification.  The <note> you've added in its place doesn't
seem like a good substitute for it, and more generally, I think we
should avoid the overuse of constructs like <note>.  Emphasis needs to
be used minimally or it loses its meaning.

> If this is not acceptable I'm happy to incorporate the ideas of others to
> try and get the best of both worlds.

+   <para>
+    The return value of both these function can be one of [-1, 0, 1]
+    whose meaning depends on whether you are in blocking or non-blocking mode.
+   </para>

The use of braces to list a set of possible values is not standard in
mathematics generally, our documentation, or any other documentation I
have seen.

+   <para>
+    Non-Blocking Mode: A value of 1 means that the payload was
+    placed in the queue while a -1 means an immediate and permanent failure.
+    A return value of 0 means that the queue was full: you need to try again
+    at the next wait-ready.
+   </para>

We generally avoid emphasizing captions or headings in this way.  The
markup engine has no knowledge that "Non-Blocking Mode" is special; it
will format and style this just as if you had written "This is how it
works: a value of 1 means...".  That's probably not appropriate. Our
style is to write English prose using full sentences, e.g. "In
non-blocking mode, a value of 1 means...".

-       into buffer loads of any convenient size.  Buffer-load boundaries
+       into buffer-loads of any convenient size.  Buffer-load boundaries

Well, OK, here's a mention of buffer loads in the existing
documentation.  But, when you previously used the term, you didn't
hyphenate it, but here you are changing it to be hyphenated.  Note
that the fact that it's hyphenated the second time on this line is
because the two-word term is being used as an adjective modifying
boundaries, not because every use of those two words should be
hyphenated.  (Consider how odd the previous sentence would look if I
had written it this way: ...not because every use of those two-words
should be hyphenated.  Yet the earlier hyphenation of two-word looks
correct, at least to me, for the same reasons as in the
documentation.)

-       Ends the <literal>COPY_IN</> operation successfully if
-       <parameter>errormsg</> is <symbol>NULL</symbol>.  If
-       <parameter>errormsg</> is not <symbol>NULL</symbol> then the
-       <command>COPY</> is forced to fail, with the string pointed to by
-       <parameter>errormsg</> used as the error message.  (One should not
-       assume that this exact error message will come back from the server,
-       however, as the server might have already failed the
-       <command>COPY</> for its own reasons.  Also note that the option
-       to force failure does not work when using pre-3.0-protocol
-       connections.)
+       If <parameter>errormsg</> is <symbol>NULL</symbol> this ends
+       the <literal>COPY_IN</> operation successfully; otherwise the
+       <command>COPY</> is forced to fail with the string pointed to by
+       <parameter>errormsg</> used as the error message - though
+       this exact error message may not come back
+       as server might have already failed the
+       <command>COPY</> for its own reasons.

I agree that the current wording isn't great, in that you've kind of
got to parse through those first two sentences to understand what it's
really saying.  But I don't see your rewording as an imprvement.  The
current wording at least makes one thing clear right at the outset:
we're going to end a COPY_IN.  If I were going to try to improve this
I'd probably say something like "Ends the COPY_IN operation.
Normally, errormsg will be NULL, in which case the COPY will succeed
if no error has occurred on the server side.  However, errormsg can
also be non-NULL if the client wishes to force the server to fail the
copy with an ERROR." ...and I'd keep the rest of the text as it is.

+       Note: The option to force failure does not work
+       when using pre-3.0-protocol connections.

If you want a note, you have to use <note>.  But as above, it's better
that this *isn't* a note but just an English sentence, to avoid
overuse of emphasis.

I won't go through all of the remaining hunks as the issues are
basically similar.  Most of these changes are unrelated to the actual
problem that I was complaining about here.  If you want to improve the
wording of the documentation generally, that should be a separate
patch, not tied to whatever we're doing about this specific issue;
whatever we do about this specific issue should be a minimal patch
that does only that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

pgsql-hackers by date:

From: Peter Geoghegan
Date: 04 September 2014, 20:56:02
Subject: Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From: Stephen Frost
Date: 04 September 2014, 21:17:41
Subject: Re: Commitfest status

Re: PQputCopyEnd doesn't adhere to its API contract - Mailing list pgsql-hackers

Previous

Next