Re: [POC] verifying UTF-8 using SIMD instructions - Mailing list pgsql-hackers

From John Naylor
Subject Re: [POC] verifying UTF-8 using SIMD instructions
Date
Msg-id CAFBsxsHWAy+GS39rEbsczLb-3H1=P_93urv-85K0R7dUQfajwQ@mail.gmail.com
In response to Re: [POC] verifying UTF-8 using SIMD instructions  (John Naylor <john.naylor@enterprisedb.com>)
Responses Re: [POC] verifying UTF-8 using SIMD instructions
List pgsql-hackers
Here is a more polished version of the function pointer approach, now adapted to all multibyte encodings. Using the not-yet-committed tests from [1], I found a thinko that caused the test for NUL bytes to be not only wrong, but probably also elided by the compiler. Doing it correctly is noticeably slower on pure ASCII input, but still several times faster than before, so the conclusions haven't changed. I'll run full measurements later this week, but I'm sharing the patch now for review.
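To make the NUL-byte point concrete, here is a minimal sketch of one way an 8-byte ASCII fast path can reject both high-bit bytes and zero bytes. This is an illustration only, not the code in the attached patch; the names chunk_is_nonzero_ascii and ascii_prefix_len are hypothetical, and the 8-byte chunk size is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): returns true if the 8 bytes
 * at s are all ASCII (high bit clear) and none of them is NUL.
 */
static bool
chunk_is_nonzero_ascii(const unsigned char *s)
{
    const uint64_t highbits = UINT64_C(0x8080808080808080);
    uint64_t chunk;

    /* memcpy avoids unaligned-access and aliasing problems */
    memcpy(&chunk, s, sizeof(chunk));

    /* Any high bit set means a non-ASCII byte is present. */
    if (chunk & highbits)
        return false;

    /*
     * Every byte is now in 0x00..0x7f, so adding 0x7f to each byte cannot
     * carry into its neighbor. The sum has its high bit set for every
     * byte >= 1 and clear only for a zero byte.
     */
    chunk += UINT64_C(0x7f7f7f7f7f7f7f7f);
    return (chunk & highbits) == highbits;
}

/*
 * Hypothetical driver: returns how many leading bytes were accepted by
 * the fast path, always a multiple of 8. The caller would hand the rest
 * to the regular byte-at-a-time verifier.
 */
static size_t
ascii_prefix_len(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (len - i >= 8 && chunk_is_nonzero_ascii(s + i))
        i += 8;
    return i;
}
```

The zero-byte check costs an extra add, mask, and compare per chunk, which is consistent with the correct version being somewhat slower on pure ASCII than a check that only tests the high bits.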

[1] https://www.postgresql.org/message-id/11d39e63-b80a-5f8d-8043-fff04201fadc@iki.fi
