Re: Header unfolding in archived mail - Mailing list pgsql-www
From | Noah Misch |
---|---|
Subject | Re: Header unfolding in archived mail |
Date | |
Msg-id | 20131209004119.GA1266851@tornado.leadboat.com |
In response to | Header unfolding in archived mail (Noah Misch <noah@leadboat.com>) |
Responses | Re: Header unfolding in archived mail |
List | pgsql-www |
On Sat, Sep 07, 2013 at 06:07:45PM -0400, Noah Misch wrote:
> The mailing list web archives display the subject of message
> 20130603190727.GA360354@tornado.leadboat.com as follows:
>
>   Partitioning performance: cache stringToNode() ofpg_constraint.ccbin
>
> Note the lack of whitespace after "of". The original message, which you can
> see by downloading the mbox for June 2013, conveyed the subject this way:
>
>   Subject: Partitioning performance: cache stringToNode() of
>     pg_constraint.ccbin
>
> Per RFC 5322, section 2.2.3:
>
>    The process of moving from this folded multiple-line representation
>    of a header field to its single line representation is called
>    "unfolding". Unfolding is accomplished by simply removing any CRLF
>    that is immediately followed by WSP. Each header field should be
>    treated in its unfolded form for further syntactic and semantic
>    evaluation. An unfolded header field has no length restriction and
>    therefore may be indeterminately long.
>
> So, the archives should present the subject like this:
>
>   Partitioning performance: cache stringToNode() of pg_constraint.ccbin
>
> Gmane and osdir.com do so. MARC and Gmail show a space in place of the tab,
> but Gmail converts every subject-line tab to a space. I have attached a
> patch, against pgarchives.git, making its unfolding code conform to RFC 5322.
> The change also affects headers folded before a space rather than before a
> tab, such as 50E31370.5030405@cybertec.at. Those have been displaying fine
> despite the lack of unfolding because newline-space renders like a space in
> HTML. I unit-tested the change, but I did not test the full archives load.
>
>
> The "raw" message display feature seems to have its own set of rules, and I
> failed to find their implementation. Here are the subject lines for the
> aforementioned messages according to "raw" display:
>
> Subject: Partitioning performance: cache stringToNode() of pg_constraint.ccbin
> Subject: Review of "pg_basebackup and pg_receivexlog to use non-blocking socket
>     communication", was: Re: Re: [BUGS] BUG #7534: walreceiver takes
>     long time to detect n/w breakdown
>
> In one case, "\n\t" from the true raw original (in the mbox file) became " ".
> In the other case, two instances of "\n " became "\n\t". Any ideas where that
> transformation is coming from?

Ping. Any advice on how to more-thoroughly test the pgarchives.git change, or
where I might find the corresponding code affecting "raw" message display?

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com
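
For reference, the RFC 5322 unfolding rule quoted in the message can be sketched in a few lines of Python. This is only an illustration of the rule and not the patch against pgarchives.git. The function name `unfold_header` is hypothetical, and accepting a bare LF as well as CRLF is an assumption made here because mbox files usually store LF line endings.

```python
import re

# A line break (CRLF, or a bare LF as an assumption for mbox input) that is
# immediately followed by WSP (space or tab). The lookahead leaves the WSP
# character itself in place, as RFC 5322 section 2.2.3 describes.
_FOLD = re.compile(r"\r?\n(?=[ \t])")


def unfold_header(value: str) -> str:
    """Hypothetical helper: remove every line break that precedes WSP."""
    return _FOLD.sub("", value)


if __name__ == "__main__":
    folded = (
        "Partitioning performance: cache stringToNode() of\n"
        "\tpg_constraint.ccbin"
    )
    # repr() makes the retained tab visible as \t.
    print(repr(unfold_header(folded)))
```

With this rule the tab that followed the line break survives unfolding, so "of" and "pg_constraint.ccbin" stay separated. Whether a display layer then renders that tab as a space is a separate choice.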