Re: Problem with restoring dump (may be tsearch-related) - Mailing list pgsql-general
From | Markus Wollny |
---|---|
Subject | Re: Problem with restoring dump (may be tsearch-related) |
Date | |
Msg-id | 2266D0630E43BB4290742247C8910575014CE3C5@dozer.computec.de |
In response to | Problem with restoring dump (may be tsearch-related) ("Markus Wollny" <Markus.Wollny@computec.de>) |
List | pgsql-general |
Hi!

The &uuml; is literally in the file: we parse all of our editor's input for optimal HTML output, and the German umlauts are represented as &[v]uml;, where [v] is the corresponding vowel.

Now that you mention it, I believe that all of the strings that appear in these "parse error at or near" messages are actually preceded by an HTML umlaut or the like. Just a snippet from my first example:

psql:alldb1.sql:1122826: ERROR: parser: parse error at or near "ußerst"
    could be "äußerst" -> &auml;ußerst
psql:alldb1.sql:1122826: ERROR: parser: parse error at or near "chst"
    could be "höchst" -> h&ouml;chst
psql:alldb1.sql:1122826: ERROR: parser: parse error at or near "mmern"
    could be "kümmern" -> k&uuml;mmern
psql:alldb1.sql:1122827: ERROR: parser: parse error at or near "ren"
    could be "Türen" -> T&uuml;ren
psql:alldb1.sql:1122827: ERROR: parser: parse error at or near "rfer"
    could be "Dörfer" -> D&ouml;rfer
psql:alldb1.sql:1122827: ERROR: parser: parse error at or near "ndig"
    could be "hintergründig" -> hintergr&uuml;ndig
psql:alldb1.sql:1122828: ERROR: parser: parse error at or near "henvorteile"
    could be "Höhenvorteile" -> H&ouml;henvorteile
psql:alldb1.sql:1122828: ERROR: parser: parse error at or near "hten"
    could be "blühten" -> bl&uuml;hten
psql:alldb1.sql:1122829: ERROR: parser: parse error at or near "berqueren"
    could be "überqueren" -> &uuml;berqueren
psql:alldb1.sql:1122829: ERROR: parser: parse error at or near "cken"
    could be "Lücken" -> L&uuml;cken
psql:alldb1.sql:1122830: ERROR: parser: parse error at or near "ck"
    could be "zurück" -> zur&uuml;ck
psql:alldb1.sql:1122831: ERROR: parser: parse error at or near "hrend"
    could be "führend" -> f&uuml;hrend
psql:alldb1.sql:1122831: ERROR: parser: parse error at or near "ude"
    could be "Gebäude" -> Geb&auml;ude
psql:alldb1.sql:1122831: ERROR: parser: parse error at or near "nnen"
    could be "können" -> k&ouml;nnen
psql:alldb1.sql:1122831: ERROR: parser: parse error at or near "berzeugen"
    could be "überzeugen" -> &uuml;berzeugen

txtidx just contains substrings and ignores the HTML umlauts (a slight disadvantage we are quite happy to live with), so it only stores the substrings before or after an ampersand or semicolon anyway. That shouldn't cause any problems whatsoever, so I think we can rule out tsearch as the cause.

But why would ordinary plain text cause these parse errors? What should I do next to get to the bottom of the problem?

Regards,

Markus

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Thursday, 5 September 2002 18:23
> To: Markus Wollny
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Problem with restoring dump (may be
> tsearch-related)
>
>
> "Markus Wollny" <Markus.Wollny@computec.de> writes:
> > The entries are quite long, and I don't want to cause too much traffic,
> > so I don't dare to give you more than this one example:
>
> > Restore-attempt outputs e.g.:
> > psql:alldb1.sql:1434914: ERROR: parser: parse error at or near
> > "ckenmuskeln"
>
> Hmm. I see that string in the context
>
> > Wie sich die R&uuml;ckenmuskeln anspannen, wird im Bild aber nicht
>
> What exactly is the string that you've represented here as &uuml; ?
> Is that literally what's in the dump file, or has something helpfully
> html-ized some weird Unicode sequence?
>
> As far as I can tell, what must be happening is that the COPY data
> transfer has been terminated and the regular SQL parser is trying to
> make sense of the input starting at "ckenmuskeln anspannen,". I'm
> wondering if something is misreading the &uuml; sequence as "\." ...
> which would probably be a character-set-encoding kind of problem.
>
> regards, tom lane
>
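The observation in the message, that every token named in a "parse error at or near" message directly follows an HTML entity in the dump, could be checked mechanically rather than by eye. What follows is only a minimal sketch in Python, not something from the thread: the function names preceded_by_entity and check_errors are made up for illustration, and the second sample line is invented. It assumes the dump lines are already in memory and that each error message has been reduced to a (line number, token) pair.

```python
import re

# Matches an HTML character entity such as &uuml; or &#252; at the very end
# of a string.
ENTITY_AT_END = re.compile(r"&(?:[A-Za-z]+|#\d+);$")


def preceded_by_entity(line, token):
    """Return True if some occurrence of token in line directly follows an HTML entity."""
    start = line.find(token)
    while start != -1:
        if ENTITY_AT_END.search(line[:start]):
            return True
        start = line.find(token, start + 1)
    return False


def check_errors(dump_lines, errors):
    """errors: iterable of (1-based line number, token taken from the error message)."""
    results = []
    for line_no, token in errors:
        line = dump_lines[line_no - 1]
        results.append((line_no, token, preceded_by_entity(line, token)))
    return results


if __name__ == "__main__":
    # First line is the context quoted in the thread; the second is a
    # hypothetical line made up for this example.
    sample = [
        "Wie sich die R&uuml;ckenmuskeln anspannen, wird im Bild aber nicht",
        "Die Rosen bl&uuml;hten im Garten",
    ]
    for line_no, token, hit in check_errors(sample, [(1, "ckenmuskeln"), (2, "hten")]):
        print(line_no, token, hit)
```

If every reported token comes back True, that would support the suspicion that the entity's trailing semicolon (or the entity as a whole) is where the input gets cut, and it would narrow the search to whatever is ending the COPY data early.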