Increasing test coverage of WAL redo functions - Mailing list pgsql-hackers
| From | Heikki Linnakangas | 
|---|---|
| Subject | Increasing test coverage of WAL redo functions | 
| Date | |
| Msg-id | 546CA8EC.7050800@vmware.com Whole thread Raw | 
| Responses | Re: Increasing test coverage of WAL redo functions Re: Increasing test coverage of WAL redo functions | 
| List | pgsql-hackers | 
To test WAL replay, I often set up a master-standby system with
streaming replication and run "make installcheck" on the master.
However, the regression suite doesn't generate all WAL record types. I
spent some time looking at the lcov report (make coverage-html), and
crafted new tests to test those redo functions that were not otherwise
covered.
All the new test cases are related to indexes, mostly vacuuming them.
See attached. With this patch, all WAL record types are tested, although
there are still a few codepaths within the redo functions (aside from
"can't happen" error checks) that are not exercised.
There are a couple of problems with these new tests:
1. Whether the vacuum tests test what they're supposed to, depends on
what else is going on in the system. If there's another backend present
that holds onto an snapshot, vacuum won't be able to remove any rows, so
that code will go untested. Running those tests in parallel with other
tests makes it quite likely that nothing can be vacuumed.
2. These make the regression database larger. The following tables and
indexes are added:
postgres=# \d+
                          List of relations
  Schema |       Name       | Type  | Owner  |  Size   | Description
--------+------------------+-------+--------+---------+-------------
  public | btree_tall_tbl   | table | heikki | 24 kB   |
  public | btree_test_tbl   | table | heikki | 392 kB  |
  public | gin_test_tbl     | table | heikki | 588 MB  |
  public | gist_point_tbl   | table | heikki | 1056 kB |
  public | spgist_point_tbl | table | heikki | 1056 kB |
  public | spgist_text_tbl  | table | heikki | 1472 kB |
(6 rows)
postgres=# \di+
                                    List of relations
  Schema |       Name       | Type  | Owner  |      Table       |  Size
   | Descri
ption
--------+------------------+-------+--------+------------------+---------+-------
------
  public | btree_tall_idx   | index | heikki | btree_tall_tbl   | 1176 kB |
  public | btree_test_idx   | index | heikki | btree_test_tbl   | 472 kB  |
  public | gin_test_idx     | index | heikki | gin_test_tbl     | 220 MB  |
  public | gist_pointidx    | index | heikki | gist_point_tbl   | 1744 kB |
  public | spgist_point_idx | index | heikki | spgist_point_tbl | 1120 kB |
  public | spgist_text_idx  | index | heikki | spgist_text_tbl  | 440 kB  |
(6 rows)
The GIN test needs to create a huge table, to cover the case of
splitting an internal posting tree page. That 588MB table plus index is
obviously too much to include in the regular regression suite. I'm not
sure how much smaller it could be made, but it's going to be in that
ballpark anyway. It also takes a long time to run.
I think the rest are tolerable, they make the regression database about
9 MB larger, from 45 MB to 53 MB, and only take a few seconds to run, on
my laptop.
My plan is to leave out that large GIN test for now, and commit the
rest. I'll add comments to the vacuum tests explaining that it's a hit
and miss whether they manage to vacuum anything. It's still better to
have the tests even if they sometimes fail to test vacuum as intended,
than not have the tests at all. In either case, they won't fail unless
there's a bug somewhere, and they will still exercise some code that's
not otherwise tested at all.
Thoughts?
PS. The brin test case is currently in a funny position in
serial_schedule, compared to parallel_schedule. This patch moves it to
where I think it belongs.
- Heikki
		
	Attachment
pgsql-hackers by date: