Re: once more: documentation search indexing - Mailing list pgsql-www
From | Andres Freund |
---|---|
Subject | Re: once more: documentation search indexing |
Date | |
Msg-id | 20210612213753.pagmpjjtixjztenj@alap3.anarazel.de |
In response to | Re: once more: documentation search indexing ("Jonathan S. Katz" <jkatz@postgresql.org>) |
Responses | Re: once more: documentation search indexing |
List | pgsql-www |
Hi,

On 2021-06-12 17:05:22 -0400, Jonathan S. Katz wrote:
> Thank you for bringing this up. I applaud the suggestion of approach.

Glad to hear it.

> > Suggested small steps:
> >
> > - add a docs/current link to https://www.postgresql.org/docs/. Often
> >   enough that's what a user wants anyway, and it's not useful to add
> >   additional steps for users and search engines to navigate to
> >   docs/current/.
>
> We do that at the very top: that is the first link in the main body.
> This was done back in Nov 2020[1]

Oh - I had not realized that at all. I think the similarity to the news
bar made me completely blend the "view the manual" element out.

> >   I can see us either making it a separate row in the versioned table,
> >   or to split the most recent released version's link into a /current/
> >   and $major link.
>
> I'm not sure if that's any different than the above right now; if there
> is something you could cite around that, I'm happy to be convinced
> otherwise.

I don't think the existing link is particularly helpful - it's just
visually too different from the other links. And doesn't indicate which
version it is for etc.

> However, I'm also not opposed to putting a (Current) link next to the
> current version in the table. I think that'd at least be helpful from a
> user perspective, if they don't click the big button up top.

Yea, I think that'd be good.

> > - put version in page titles where it makes sense. E.g. change
> >   "PostgreSQL: Documentation: 10: 6.1. Inserting Data" to
> >   "PostgreSQL 10 Documentation: 6.1. Inserting Data"
> >
> >   The current ordering doesn't seem like it has much going for it, and
> >   it can't help search engines to have the version number people might
> >   search for removed from the product name.
> >
> >   Right now this seems to contribute to less than helpful titles in
> >   search engine results. Searching anonymously for "postgres alter
> >   table" I get the less than helpful "Documentation: 12: ALTER TABLE -
> >   PostgreSQL" on google.
> >
> >   It might also be worth to go a bit further and put the documentation
> >   version *after* the page title, given that it's most likely already
> >   clear to the reader that this is about postgres. I.e. something like
> >   "ALTER TABLE - Documentation for PostgreSQL 14"
>
> I think having "PostgreSQL $MAJOR_VERSION" together would help both for
> some of the indexing issues + readability in the search engine. The
> question is around how the content is ordered in the title.
>
> Doing "PostgreSQL $MAJOR_VERSION: Documentation: $page_title" might be
> the way to go. The other thing I see done for SEO is what you suggest,
> but just hyphenated i.e. "ALTER TABLE - Documentation - PostgreSQL 14"
>
> Anyway, I'm generally in favor of combining at least "PostgreSQL
> $MAJOR_VERSION."

Yea, let's do that separately then.

WRT ordering, I do think I prefer the versions with the actual subject of
the page first - to distinguish between different PG doc pages
"PostgreSQL 14 Documentation" is really not helpful. I often have
multiple doc pages open in different tabs, and there's right now no way
to distinguish them, because there's never enough space for even just
"PostgreSQL 13: Documentation:", not to speak of an actual title.

> That all said, as stated and cited in some of those previous threads, I
> think the biggest lift is around making our documentation URLs
> canonical. After discussing with Magnus a bit, there are a few things
> that we need to consider in it:
>
> 1. Whether or not the documentation page is in "current"
> 2. If it's not in "current", which is the last version the page is a
>    part of? We make that the canonical

Yea, I know that's a potentially significant improvement. I just didn't
feel it's useful to wade into the topic because it's been discussed for
about a decade by now. And that there's things we could make easier
progress on...

> I've attached a patch that does this. The one part I'm not sure I like
> is how we treat something that is solely in "devel" -- knowing that
> eventually something in devel could end up in current. Perhaps if
> something is only in "devel", we exclude it from being part of the
> canonical tree?

Right now all of docs/devel is prevented from being indexed via
robots.txt:

Disallow: /docs/devel/

So it won't really matter for SEO purposes.

Greetings,

Andres Freund
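
For readers following the thread, the canonical-URL rule quoted above (points 1 and 2, plus the suggestion to leave devel-only pages out of the canonical tree) can be sketched roughly as below. This is only an illustration of the rule as described in the message, not the attached patch and not pgweb's actual code. The function names, the `DOCS_ROOT` constant, and the idea of passing in a list of versions that contain the page are all assumptions made for the example.

```python
from typing import Optional, Sequence

# Assumed base URL, taken from the docs link mentioned in the thread.
DOCS_ROOT = "https://www.postgresql.org/docs"


def version_key(version: str) -> tuple:
    """Sort key for version strings such as "9.6" or "13" (hypothetical helper)."""
    return tuple(int(part) for part in version.split("."))


def canonical_docs_url(
    page: str,
    versions_with_page: Sequence[str],
    current_version: str,
) -> Optional[str]:
    """Pick a canonical URL for a docs page (illustrative sketch only).

    versions_with_page lists every version that ships this page and may
    include the string "devel".
    """
    # Pages that exist only in devel get no canonical URL in this sketch,
    # following the "exclude it from the canonical tree" suggestion.
    released = [v for v in versions_with_page if v != "devel"]
    if not released:
        return None

    # Point 1: the page is part of the current release.
    if current_version in released:
        return f"{DOCS_ROOT}/current/{page}"

    # Point 2: otherwise use the last released version that has the page.
    newest = max(released, key=version_key)
    return f"{DOCS_ROOT}/{newest}/{page}"
```

Under these assumptions, a page that exists in the current release resolves to its docs/current/ URL, a page dropped in an earlier release resolves to the newest version that still has it, and a devel-only page yields `None`, which fits with docs/devel already being excluded via robots.txt.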