Thread: Segfault in jit tuple deforming on arm64 due to LLVM issue
Hi! I have an instance that started to consistently crash with a segfault or bus error, and most of the generated coredumps had corrupted stacks. Some salvageable frames showed the error happening within ExecRunCompiledExpr. Sure enough, running the query with jit disabled stopped the crashes.

The issue happens with the following setup:
- Ubuntu jammy on arm64, 30G
- postgresql-14 14.12-1.pgdg22.04+1
- libllvm15 1:15.0.7-0ubuntu0.22.04.3

I was able to isolate the impacted db (a pg_dump of the table was not enough, a base backup had to be used) and reproduce the issue on a debug build of PostgreSQL. This time, there was no crash but it was stuck in an infinite loop within jit tuple deforming:

#0  0x0000ec53660aa14c in deform_0_1 ()
#1  0x0000ec53660aa064 in evalexpr_0_0 ()
#2  0x0000ab8f9b322948 in ExecEvalExprSwitchContext (isNull=0xfffff47c3c87, econtext=0xab8fd0f13878, state=0xab8fd0f13c50) at executor/./build/../src/include/executor/executor.h:342
#3  ExecProject (projInfo=0xab8fd0f13c48) at executor/./build/../src/include/executor/executor.h:376

Looking at the generated assembly, the infinite loop happens between deform_0_1+140 and deform_0_1+188:

// Store the address page in the x11 register
0xec53660aa130 <deform_0_1+132>   adrp   x11, 0xec53fd308000
// Start of the infinite loop
0xec53660aa138 <deform_0_1+140>   adr    x8, 0xec53660aa138 <deform_0_1+140>
// Load the content of 0xec53fd308000[x12] into x10; x12 was 0 at that time
0xec53660aa13c <deform_0_1+144>   ldrsw  x10, [x11, x12, lsl #2]
// Add the loaded offset to x8
0xec53660aa140 <deform_0_1+148>   add    x8, x8, x10
...
// Branch to the address in x8. Since x10 was 0, x8 has the value deform_0_1+140, creating the infinite loop
0xec53660aa168 <deform_0_1+188>   br     x8

Looking at the content of 0xec53fd308000, we only see zero values stored at the address:
x/6 0xec53fd308000
0xec53fd308000: 0x00000000 0x00000000 0x00000000 0x00000000
0xec53fd308010: 0x00000000 0x00000000

The assembly matches the code generated for the find_start switch case in llvmjit_deform[1]. The content at 0xec53fd308000 should be the table of offsets from the PC used to branch to the correct attcheckattnoblocks block. As a comparison, if I execute a query not impacted by the issue (the size of the jit-compiled module seems to be a factor), I can see that the offset table was correctly filled:

x/6 0xec55fd307000
0xec55fd307000: 0x00000060 0x00000098 0x000000e8 0x00000170
0xec55fd307010: 0x0000022c 0x000002e8

I suspected something was erasing the content of the offset table, so I checked with rr. However, the memory was only initialized and nothing was ever written at this address. I was starting to suspect a possible LLVM issue and ran the query against a debug build of LLVM. It immediately triggered the following assertion[2]:

void llvm::RuntimeDyldELF::resolveAArch64Relocation(const llvm::SectionEntry &, uint64_t, uint64_t, uint32_t, int64_t): Assertion `isInt<33>(Result) && "overflow check failed for relocation"' failed.

This happens when LLVM is resolving relocations.
#5  __GI___assert_fail (assertion=0xf693f214771a "isInt<33>(Result) && \"overflow check failed for relocation\"", file=0xf693f2147269 "/var/lib/postgresql/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp", line=507, function=0xf693f214754f "void llvm::RuntimeDyldELF::resolveAArch64Relocation(const llvm::SectionEntry &, uint64_t, uint64_t, uint32_t, int64_t)") at ./assert/assert.c:101
#6  llvm::RuntimeDyldELF::resolveAArch64Relocation () at /var/lib/postgresql/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp:507
#7  llvm::RuntimeDyldELF::resolveRelocation () at /var/lib/postgresql/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp:1044
#8  llvm::RuntimeDyldELF::resolveRelocation () at /var/lib/postgresql/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp:1026
#9  llvm::RuntimeDyldImpl::resolveRelocationList () at /var/lib/postgresql/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp:1112
#10 llvm::RuntimeDyldImpl::resolveLocalRelocations () at /var/lib/postgresql/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp:157
#11 llvm::RuntimeDyldImpl::finalizeAsync() at /var/lib/postgresql/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp:1247

During the assertion failure, I have the following values:

Value:        0xfbc84fab9000
FinalAddress: 0xfbc5b9cea12c
Addend:       0x0
Result:       0x295dcf000

The Result indeed overflows the isInt<33> check, triggering the assert. Looking at the sections created by LLVM in allocateSection[3], we have 3 sections created:

.text     {Address = 0xfbc5b9cea000, AllocatedSize = 90112}
.rodata   {Address = 0xfbc84fab9000, AllocatedSize = 4096}
.eh_frame {Address = 0xfbc84fab7000, AllocatedSize = 8192}

When resolving the relocation, the offset between the .rodata section and the PC is computed and patched into the ADRP instruction. However, when a new section is allocated, LLVM requests a new memory block from the memory allocator[4].
MemGroup.Near is passed as the start hint to mmap, but that's only a hint: the kernel provides no guarantee that the newly allocated block will actually be near it. With the impacted query, there is more than a 10GB gap between the .text section and the .rodata section, making it impossible for code in .text to correctly fetch data from .rodata, as ADRP addressing is limited to a +/-4GB range. The AArch64 ABI mentions this: the GOT section should be within 4GB of the text section[5]. In this case there's no GOT section since the offsets are stored in the .rodata section, but the constraint is similar.

This is a known LLVM issue[6] that has impacted Impala, Numba and Julia. There's an open PR[7] to fix the issue by allocating all sections as a single memory block, avoiding the gaps between sections. There's also a related discussion on the llvm-rtdyld Discourse thread[8].

A possible mitigation is to switch from RuntimeDyld to JITLink, but this requires at least LLVM 15 as LLVM 14 doesn't have any significant relocation support for aarch64[9]. I did test using JITLink on my impacted db and it seems to fix the issue. JITLink has no exposed C interface though, so it requires additional wrapping.

I don't necessarily have a good answer for this issue. I've tried tweaking relocation settings and the jit code to avoid relocations, without much success. Ideally, the fix would be merged and backported in LLVM, but the PR has been open for some time now. I've seen multiple segfault reports that look similar to this issue (examples: [10], [11]) but I don't think they were linked to this LLVM bug, so I figured I would at least share my findings.
[1] https://github.com/postgres/postgres/blob/REL_14_STABLE/src/backend/jit/llvm/llvmjit_deform.c#L364-L382
[2] https://github.com/llvm/llvm-project/blob/release/14.x/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp#L501-L513
[3] https://github.com/llvm/llvm-project/blob/release/14.x/llvm/lib/ExecutionEngine/SectionMemoryManager.cpp#L41C32-L41C47
[4] https://github.com/llvm/llvm-project/blob/release/14.x/llvm/lib/ExecutionEngine/SectionMemoryManager.cpp#L94-L110
[5] https://github.com/ARM-software/abi-aa/blob/main/sysvabi64/sysvabi64.rst#7code-models
[6] https://github.com/llvm/llvm-project/issues/71963
[7] https://github.com/llvm/llvm-project/pull/71968
[8] https://discourse.llvm.org/t/llvm-rtdyld-aarch64-abi-relocation-restrictions/74616
[9] https://github.com/llvm/llvm-project/blob/release/14.x/llvm/lib/ExecutionEngine/JITLink/ELF_aarch64.cpp#L75-L84
[10] https://www.postgresql.org/message-id/flat/CABa%2BnRvwZy_5t1QF9NJNGwAf03tv_PO_Sg1FsN1%2B-3Odb1XgBA%40mail.gmail.com
[11] https://www.postgresql.org/message-id/flat/CADAf1kavcN-kY%3DvEm3MYxhUa%2BrtGFs7tym5d7Ee6Ni2cwwxGqQ%40mail.gmail.com

Regards,
Anthonin Bonnefoy
On Thu, Aug 22, 2024 at 7:22 PM Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> wrote:
> Ideally, the llvm fix will be merged and backported
> in llvm but the PR has been open for some time now.

I fear that back-porting, for the LLVM project, would mean "we fix it in main/20.x, and also back-port it to 19.x". Do distros back-port further?

Nice detective work! The JITLink change sounds interesting, and like something we need to do sooner or later.
On Thu, Aug 22, 2024 at 12:33 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> I fear that back-porting, for the LLVM project, would mean "we fix it
> in main/20.x, and also back-port it to 19.x". Do distros back-port
> further?

That's also my fear. I'm not familiar with distros' back-port policies, but eyeballing the Ubuntu package changelog[1], it seems to be mostly build fixes.

Given that there's no visible way to fix the relocation issue, I wonder if jit shouldn't be disabled for arm64 until either the RuntimeDyld fix is merged or the switch to JITLink is done. Disabling jit tuple deforming may be enough, but I'm not confident the issue won't happen in a different part.

[1] https://launchpad.net/ubuntu/+source/llvm-toolchain-16/+changelog
On Sat, Aug 24, 2024 at 12:22 AM Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> wrote:
> On Thu, Aug 22, 2024 at 12:33 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > I fear that back-porting, for the LLVM project, would mean "we fix it
> > in main/20.x, and also back-port it to 19.x". Do distros back-port
> > further?
>
> That's also my fear, I'm not familiar with distros back-port policy
> but eyeballing ubuntu package changelog[1], it seems to be mostly
> build fixes.
>
> Given that there's no visible way to fix the relocation issue, I
> wonder if jit shouldn't be disabled for arm64 until either the
> RuntimeDyld fix is merged or the switch to JITLink is done. Disabling
> jit tuple deforming may be enough but I'm not confident the issue
> won't happen in a different part.

We've experienced something a little similar before: in the early days of PostgreSQL LLVM, it didn't work at all on ARM or POWER. We sent a trivial fix[1] upstream that landed in LLVM 7; since it was a small and obvious problem and it took a long time for some distros to ship LLVM 7, we even contemplated hot-patching that LLVM function with our own copy (but, ugh, only for about 7 nanoseconds). That was before we turned JIT on by default, and it was also easier to deal with because it was an obvious consistent failure in basic tests, so packagers probably just disabled the build option on those architectures.

IIUC this one is a random and rare crash depending on malloc() and perhaps also the working size of your virtual memory dart board. (Annoyingly, I had tried to reproduce this quite a few times on small ARM systems when earlier reports came in, d'oh!)

This degree of support window mismatch is probably what triggered RHEL to develop their new rolling LLVM version policy. Unfortunately, it's the other distros that tell *us* which versions to support, and not the reverse (for example CF #4920 is about to drop support for LLVM < 14, but that will only be for PostgreSQL 18+).
Ultimately, if it doesn't work, and doesn't get fixed, it's hard for us to do much about it. But hmm, this is probably madness... I wonder if it would be feasible to detect address span overflow ourselves at a useful time, as a kind of band-aid defence...

[1] https://www.postgresql.org/message-id/CAEepm%3D39F_B3Ou8S3OrUw%2BhJEUP3p%3DwCu0ug-TTW67qKN53g3w%40mail.gmail.com
On Mon, Aug 26, 2024 at 4:33 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> IIUC this one is a random and rare crash depending on malloc() and
> perhaps also the working size of your virtual memory dart board.
> (Annoyingly, I had tried to reproduce this quite a few times on small ARM
> systems when earlier reports came in, d'oh!).

allocateMappedMemory, used when creating sections, will eventually call mmap[1], not malloc. So the amount of shared memory configured may be a factor in triggering the issue. My first attempts to reproduce the issue from scratch weren't successful either. However, trying again with different values of shared_buffers, I've managed to trigger the issue somewhat reliably.

On a clean Ubuntu jammy, I've compiled the current PostgreSQL REL_14_STABLE (6bc2bfc3) with the following options:

    CLANG=clang-14 ../configure --enable-cassert --enable-debug --prefix ~/.local/ --with-llvm

Set "shared_buffers = '4GB'" in the configuration. More may be needed but 4GB was enough for me. Create a table with multiple partitions with pgbench; the goal is to have a jit module big enough to trigger the issue:

    pgbench -i --partitions=64

Then run the following query with jit forcefully enabled:

    psql options=-cjit_above_cost=0 -c 'SELECT count(bid) from pgbench_accounts;'

If the issue was successfully triggered, it should segfault or be stuck in an infinite loop.

> Ultimately, if it doesn't work, and doesn't get fixed, it's hard for
> us to do much about it. But hmm, this is probably madness... I wonder
> if it would be feasible to detect address span overflow ourselves at a
> useful time, as a kind of band-aid defence...

There's a possible alternative, but it's definitely in the same category as the hot-patching idea. llvmjit uses LLVMOrcCreateRTDyldObjectLinkingLayerWithSectionMemoryManager to create the ObjectLinkingLayer, and it will be created with the default SectionMemoryManager[2].
It should be possible to provide a modified SectionMemoryManager that allocates all sections in a single block, and it could be restricted to the arm64 architecture. Part of me says this is probably a bad idea, but on the other hand, LLVM provides this way to plug in a custom allocator, and it would fix the issue...

[1] https://github.com/llvm/llvm-project/blob/release/14.x/llvm/lib/Support/Unix/Memory.inc#L115-L117
[2] https://github.com/llvm/llvm-project/blob/release/14.x/llvm/lib/ExecutionEngine/Orc/OrcV2CBindings.cpp#L967-L973
On Tue, Aug 27, 2024 at 11:32 AM Thomas Munro <thomas.munro@gmail.com> wrote: > SectorMemoryManager Erm, "Section". (I was working on some file system stuff at the weekend, and apparently my fingers now auto-complete "sector".)
Thanks! And that's great news. Do you want to report this experience to the PR, in support of committing it? That'd make it seem easier to consider shipping a back-ported copy...
On Tue, Aug 27, 2024 at 12:01 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> Thanks! And that's great news. Do you want to report this experience
> to the PR, in support of committing it? That'd make it seem easier to
> consider shipping a back-ported copy...

Yes, I will do that.
I've run some additional tests, mostly pgbench with options=-cjit_above_cost=0 for an extended duration on an instance that was impacted. I haven't seen any issues nor performance regressions compared to the unpatched version. I will switch the commitfest entry to Ready for Committer if there's no objection.
On Thu, Oct 17, 2024 at 10:36 PM Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> wrote:
> I've run some additional tests, mostly pgbench with
> options=-cjit_above_cost=0 for an extended duration on an instance
> that was impacted. I haven't seen any issues nor performance
> regressions compared to the unpatched version.
>
> I will switch the commitfest entry to Ready for Committer if there's
> no objection.

Thanks! I'm going to go ahead and commit this. I asked Andres if he wanted to object to this plan (as author of our LLVM stuff) and he did not.

I tried, for several frustrating days, to figure out how to solve our problem using JITLink and stay on a more "supported" path, but it all seems a bit unready, as various things don't work or aren't available in various versions in our support range. At least I now have the bones of a patch to prepare for JITLink in LLVM 20 or whenever they force our hand... I'll write about that in a new thread soon.
On Thu, Oct 31, 2024 at 6:49 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> There are a couple of cases of dual-licensed code in our tree where we
> explicitly used the Boost alternative instead of Apache 2. I plead
> complete ignorance of this topic and defer to those who know about
> such things: can we actually do this? I guess at a minimum a copy of
> the licence would need to appear somewhere -- perhaps under
> src/backend/jit/llvm?

I'm also not super knowledgeable about the licensing intricacies, but I read it the same way: a license file has to be provided due to the 4a clause. llvmlite did this when they added the patched memory manager[1].

> 4d says that if you modified the code you have
> to say so prominently, but I did that at the top (and the changes are
> completely trivial, just some #ifdef swizzling to massage some
> function prototypes to suit older LLVMs). Otherwise I understand it
> to be generally "BSD-like" (sans advert clause) but there is also some
> stuff about patents, which surely aren't relevant to this in
> practice... but I know that some projects object to it on principle
> and because it smells like contract law, or something.... not an area
> I am well informed about. Who should I be asking? (Naively, I
> wondered: could there be some kind of fair use concept for
> back-patching fixes to broken libraries that you're merely a user of
> where you can be excused from the burdens of a distributor? Yeah
> wishful thinking I'm sure.)

You mean 4b, right? LLVM doesn't seem to have any NOTICE files, so the 4d clause shouldn't apply. The top comment looks fine to notify about the source of the modified file and how it was changed. But again, I don't have much experience in this so I can't be sure.

[1]: https://github.com/numba/llvmlite/pull/1009/files#diff-80b149f35cebd583e21dfc49c0007a7fab89c3c6d07c028e4a87de0848aa2ed8
> On 31 Oct 2024, at 06:48, Thomas Munro <thomas.munro@gmail.com> wrote:
> I guess at a minimum a copy of the licence would need to appear somewhere

That's my interpretation of it as well.

> perhaps under src/backend/jit/llvm?

Since SectionMemoryManager.h is in src/backend/jit, I wonder if it should be placed in a section in src/backend/jit/README with an overview of the what and why (or maybe a new src/backend/jit/llvm/README would be even better). The license doesn't have to be in a separate file AFAICT, and including a (version of) your excellent summary in the commit message along with it would probably help readers.

-- 
Daniel Gustafsson
On Tue, Nov 5, 2024 at 9:00 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> Reasoning: Old LLVM required C++11. LLVM 9 switched to C++14. LLVM
> 14 switched to C++17. Pretty soon they'll flip to C++20 or C++23, they
> don't mess around. The corresponding -std=c++XX flag finishes up in
> our compile lines, because llvm-config --cxxflags spits it out, to
> match the features they're using in headers that we include (easy to
> spot examples being std::make_unique (C++14) and std::string_view
> (C++17)), so you might say that PostgreSQL indirectly chases C++
> standards much faster than it chases C standards. This particular
> code is a special case because it's guarded for LLVM 12+ only, so it's
> OK to use C++14 in that limited context even in back branches. We
> have to be careful that it doesn't contain C++17 code since it came
> from recent LLVM, but it doesn't seem to by inspection, and you can
> check on a machine with CXX=g++ and LLVM 14 on Linux, which uses
> -std=c++14 and fails if you add a use of <string_view> and
> std::string_view. (Warning: the system C++ standard library on Macs
> and other Clang-based systems doesn't have enough version guards so it
> won't complain, but GCC and its standard library will explicitly tell
> you not to use C++17 features in a C++14 program.)

I think the switch to C++14 happened with LLVM 10[0] while the C++17 switch happened with LLVM 16[1].
Double-checking on an Ubuntu jammy (can't install llvm versions earlier than 12 on those):

VERSIONS=(20 19 18 17 16 15 14 13 12)
for version in ${VERSIONS[@]}; do
    llvm-config-$version --cxxflags
done

-I/usr/lib/llvm-20/include -std=c++17 -fno-exceptions -funwind-tables -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
-I/usr/lib/llvm-19/include -std=c++17 -fno-exceptions -funwind-tables -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
-I/usr/lib/llvm-18/include -std=c++17 -fno-exceptions -funwind-tables -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
-I/usr/lib/llvm-17/include -std=c++17 -fno-exceptions -funwind-tables -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
-I/usr/lib/llvm-16/include -std=c++17 -fno-exceptions -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
-I/usr/lib/llvm-15/include -std=c++14 -fno-exceptions -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
-I/usr/lib/llvm-14/include -std=c++14 -fno-exceptions -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
-I/usr/lib/llvm-13/include -std=c++14 -fno-exceptions -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
-I/usr/lib/llvm-12/include -std=c++14 -fno-exceptions -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS

Which is still fine since, as you said, the code only applies for LLVM 12+, which will always have at least C++14. I've tried compiling and running against all those llvm versions and the associated stdc++ versions earlier in the thread[2] and had no issues.

> If there are no further comments, I'm going to push this to all
> branches tomorrow morning. For master only, I will remove the #if
> condition and comment about LLVM 12+, as we now require 14+.

Patch looks good to me.
[0]: https://github.com/llvm/llvm-project/blob/release/10.x/llvm/CMakeLists.txt#L53
[1]: https://github.com/llvm/llvm-project/blob/release/16.x/llvm/CMakeLists.txt#L70
[2]: https://www.postgresql.org/message-id/CAO6_XqqxEQ%3DJY%2BtYO-KQn3_pKQ3O-mPojcwG54L5eptiu1cSJQ%40mail.gmail.com