pgsql: doc: Warn that ts_headline() output is not HTML-safe. - Mailing list pgsql-committers

From Dean Rasheed
Subject pgsql: doc: Warn that ts_headline() output is not HTML-safe.
Msg-id E1uAQuj-000SSs-2T@gemulon.postgresql.org
List pgsql-committers
doc: Warn that ts_headline() output is not HTML-safe.

Add a documentation warning to ts_headline() pointing out that, when
working with untrusted input documents, the output is not guaranteed
to be safe for direct inclusion in web pages. This is because, while
it does remove some XML tags from the input, it doesn't remove all
HTML markup, and so the result may be unsafe (e.g., it might permit
XSS attacks).

To guard against that, all HTML markup should be removed from the
input, making it plain text, or the output should be passed through an
HTML sanitizer.
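
As an illustration only (this is not part of the commit), the sketch below
shows one conservative way an application might handle ts_headline() output
before putting it in a web page. Instead of a full HTML sanitizer it escapes
the whole headline and then restores only the highlight markers. It assumes
the query passed non-HTML sentinels through the StartSel and StopSel options;
the sentinel strings and the render_headline() helper are hypothetical names
chosen for this example.

```python
import html

# Hypothetical sentinels, assumed to have been passed to ts_headline() via
# its StartSel / StopSel options in place of the default <b> and </b>.
START_SENTINEL = "[[HL]]"
STOP_SENTINEL = "[[/HL]]"


def render_headline(raw_headline: str) -> str:
    """Escape a ts_headline() result, then restore only the highlight tags.

    Any markup that survived in the headline is turned into inert text, so
    the only tags left in the result are the <b>...</b> pairs added here.
    """
    escaped = html.escape(raw_headline, quote=True)
    # The sentinels contain no characters that html.escape() rewrites,
    # so they can be located unchanged in the escaped text.
    return (
        escaped.replace(START_SENTINEL, "<b>")
        .replace(STOP_SENTINEL, "</b>")
    )


if __name__ == "__main__":
    untrusted = "[[HL]]cat[[/HL]] <img src=x onerror=alert(1)>"
    print(render_headline(untrusted))
```

If the input document itself contains the sentinel strings, the worst outcome
under this scheme is a stray or unbalanced <b> tag. Applications that need to
keep some of the original markup would need a real HTML sanitizer instead, as
the commit message recommends.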

In addition, document precisely what the default text search parser
recognises as valid XML tags, since that's what determines which XML
tags ts_headline() will remove.

Reported-by: Richard Neill <richard.neill@telos.digital>
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Backpatch-through: 13

Branch
------
REL_14_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/1ba9ffa56eb510b8d9ae57431ad61a9e1a396674

Modified Files
--------------
doc/src/sgml/textsearch.sgml | 29 ++++++++++++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)

