On Mon, Dec 8, 2025 at 5:18 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> a) fuzz the parser first, because it's easy and we can get interesting inputs
> b) fuzz the AST utilities, seeded with "successful" corpus members from a)
> c) stare really hard at the corpus of b) and figure out how to
> usefully mutate a PlannedStmt with it
Got stuck a bit at (c). The first two fit very well with my preferred
fuzzer setup, where I mock the world and fuzz the heck out of a tiny
corner of it. But a "mutated plan" would 1) take a lot of time for me
to design and 2) probably be counterproductive if I start chasing
impossible plans.
So I've inverted it, so that the server calls libfuzzer midquery,
instead of libfuzzer driving the code under test. That gives me *real*
plans that I can then hit with a bunch of garbage advice -- but it's
more than an order of magnitude slower, unfortunately, so I have to
seed it with the output of a+b before it gets anywhere, and then I
cannot minimize the corpus (which fills up rapidly with unoptimized
inputs) because libfuzzer isn't driving. I feel like there is
considerable room for improvement here... but I could spend a bunch of
time finding it that is then not spent fuzzing.
--
The first thing found with the new architecture is this:
-- note that f is not a partitioned table
SET pg_plan_advice.advice = 'join_order(f/e (f d))';
EXPLAIN (COSTS OFF, PLAN_ADVICE)
SELECT * FROM gt_fact f JOIN gt_dim d ON f.dim_id = d.id;
ERROR: cannot determine RTI for advice target
Test, and a quick guess at expected output, attached.
--Jacob