Re: git: uh-oh - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: git: uh-oh |
Date | |
Msg-id | AANLkTikJ+9rZfHjEAS0a9cVwfitsTk2xRg-w3NfDyH+2@mail.gmail.com |
In response to | Re: git: uh-oh (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: git: uh-oh |
List | pgsql-hackers |
On Wed, Aug 18, 2010 at 11:03 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
>> So let's take the simplest example: a branch BRANCH1 is created from
>> trunk commit T1, then some time later another FILE1 from trunk commit T3
>> is added to BRANCH1 in commit B4.  How should this series of events be
>> represented in a git repository?
>> ...
>> The "exclusive" possibility is to ignore the fact that some of the
>> content of B4 came from trunk and to pretend that FILE1 just appeared
>> out of nowhere in commit B4 independent of the FILE1 in TRUNK:
>
>> T0 -- T1 -- T2 -------- T3 -- T4        TRUNK
>>        \
>>         B1 -- B2 -- B3 -- B4            BRANCH1
>
>> This is also wrong, because it doesn't reflect the true lineage of FILE1.
>
> Maybe not, but that *is* how things appeared in the CVS history, and
> we'd rather have a git history that looks like the CVS history than
> one that claims that boatloads of utterly unrelated commits are part
> of a branch's history.

Exactly.  IMHO, the way this should work is by starting at the
beginning of time and working forward.  At each step, we examine the
earliest revision of each file for which no git commit has yet been
written.  From among those, we select the one with the earliest
timestamp.  We then also select all other files whose most recent
unprocessed revision is nearly contemporaneous and shares the same
author and log message.  From the results, we generate a commit.  Then
we repeat.

When we arrive at a branch point, the branch gets processed separately
from the trunk.  If there is no trunk rev which has every file at the
rev where it starts on the branch, then we use some sane algorithm to
pick the best one (perhaps, the one that has the right revs of the most
files) and then insert a fixup commit on the branch to remove the
deltas and carry on as before.
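The forward-walking procedure described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not cvs2git's actual algorithm: the `Revision` record, the `FUZZ_SECONDS` window used for "nearly contemporaneous", and the function name `group_into_commits` are all hypothetical. It also reads "most recent unprocessed revision" as the earliest revision of a file not yet written to a commit, and it leaves out branch points and fixup commits entirely.

```python
from collections import namedtuple

# Hypothetical record for one per-file CVS revision; field names are assumptions.
Revision = namedtuple("Revision", "path timestamp author log")

# Assumed window (in seconds) for "nearly contemporaneous".
FUZZ_SECONDS = 300


def group_into_commits(revisions_by_file, fuzz=FUZZ_SECONDS):
    """Walk forward in time, grouping per-file revisions into commits.

    revisions_by_file maps a path to that file's revisions, oldest first.
    Returns a list of commits, each a list of Revision records.
    """
    # Index of the earliest revision of each file not yet written to a commit.
    cursor = {path: 0 for path in revisions_by_file}
    commits = []
    while True:
        # The earliest unprocessed revision of every file that still has one.
        heads = [revs[cursor[path]]
                 for path, revs in revisions_by_file.items()
                 if cursor[path] < len(revs)]
        if not heads:
            break
        # Select the one with the earliest timestamp.
        seed = min(heads, key=lambda r: r.timestamp)
        # Add other files whose pending revision is close in time and
        # shares the same author and log message.
        members = [r for r in heads
                   if r.author == seed.author
                   and r.log == seed.log
                   and r.timestamp - seed.timestamp <= fuzz]
        for r in members:
            cursor[r.path] += 1
        commits.append(members)
    return commits
```

Because only the head of each file's queue is considered at every step, a file can never contribute two revisions to the same commit, and revisions of a single file are always emitted in order.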
> The "inclusive" possibility might be tolerable if it restricted itself
> to mentioning commits that actually touched FILE1 in between its
> addition to TRUNK and its addition to BRANCH1.  So far as I can see,
> though, cvs2git is mentioning *every* commit on TRUNK between T1 and B4
> ... not even between T3 and B4, but back to the branch point.  How can
> you possibly justify that as either sane or useful?

git can't do that.  It's finding those commits by following parent
pointers from the merge commits.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
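The point about parent pointers can be illustrated with a small sketch. It models a history as a plain dict from commit name to list of parents (an assumption made for illustration, not git's real object format) and reuses the T0..T4 / B1..B4 names from the example upthread. The helper `reachable` is hypothetical. It shows that once B4 is recorded as a merge with T3 as a second parent, everything reachable from T3 becomes part of BRANCH1's history.

```python
def reachable(parents, tip):
    """Return every commit reachable from tip by following parent pointers."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return seen


# "Exclusive" history: B4 has only B3 as its parent.
exclusive = {
    "T1": ["T0"], "T2": ["T1"], "T3": ["T2"], "T4": ["T3"],
    "B1": ["T1"], "B2": ["B1"], "B3": ["B2"], "B4": ["B3"],
}

# "Inclusive" history: B4 is a merge commit with T3 as a second parent.
inclusive = dict(exclusive, B4=["B3", "T3"])

# Trunk commits that appear in BRANCH1's history only because of the merge.
extra = reachable(inclusive, "B4") - reachable(exclusive, "B4")
```

In this model `extra` contains T2 and T3: the merge parent brings in every trunk commit after the branch point up to T3, whether or not it touched FILE1.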