Git User's Manual (for version 1.5.1 or newer)
______________________________________________


Git is a fast distributed revision control system.

This manual is designed to be readable by someone with basic unix
command-line skills, but no previous knowledge of git.

<<repositories-and-branches>> and <<exploring-git-history>> explain how
to fetch and study a project using git--read these chapters to learn how
to build and test a particular version of a software project, search for
regressions, and so on.

People needing to do actual development will also want to read
<<Developing-with-git>> and <<sharing-development>>.

Further chapters cover more specialized topics.

Comprehensive reference documentation is available through the man
pages. For a command such as "git clone", just use

------------------------------------------------
$ man git-clone
------------------------------------------------

See also <<git-quick-start>> for a brief overview of git commands,
without any explanation.

Finally, see <<todo>> for ways that you can help make this manual more
complete.


[[repositories-and-branches]]
Repositories and Branches
=========================

[[how-to-get-a-git-repository]]
How to get a git repository
---------------------------

It will be useful to have a git repository to experiment with as you
read this manual.

The best way to get one is by using the gitlink:git-clone[1] command to
download a copy of an existing repository. If you don't already have a
project in mind, here are some interesting examples:

------------------------------------------------
        # git itself (approx. 10MB download):
$ git clone git://git.kernel.org/pub/scm/git/git.git
        # the linux kernel (approx. 150MB download):
$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
------------------------------------------------

The initial clone may be time-consuming for a large project, but you
will only need to clone once.

The clone command creates a new directory named after the project
("git" or "linux-2.6" in the examples above). After you cd into this
directory, you will see that it contains a copy of the project files,
together with a special top-level directory named ".git", which
contains all the information about the history of the project.

[[how-to-check-out]]
How to check out a different version of a project
-------------------------------------------------

Git is best thought of as a tool for storing the history of a collection
of files. It stores the history as a compressed collection of
interrelated snapshots of the project's contents. In git each such
version is called a <<def_commit,commit>>.

A single git repository may contain multiple branches. It keeps track
of them by keeping a list of <<def_head,heads>> which reference the
latest commit on each branch; the gitlink:git-branch[1] command shows
you the list of branch heads:

------------------------------------------------
$ git branch
* master
------------------------------------------------

A freshly cloned repository contains a single branch head, by default
named "master", with the working directory initialized to the state of
the project referred to by that branch head.

Most projects also use <<def_tag,tags>>. Tags, like heads, are
references into the project's history, and can be listed using the
gitlink:git-tag[1] command:

------------------------------------------------
$ git tag -l
v2.6.11
v2.6.11-tree
v2.6.12
v2.6.12-rc2
v2.6.12-rc3
v2.6.12-rc4
v2.6.12-rc5
v2.6.12-rc6
v2.6.13
...
------------------------------------------------

Tags are expected to always point at the same version of a project,
while heads are expected to advance as development progresses.

Create a new branch head pointing to one of these versions and check it
out using gitlink:git-checkout[1]:

------------------------------------------------
$ git checkout -b new v2.6.13
------------------------------------------------

The working directory then reflects the contents that the project had
when it was tagged v2.6.13, and gitlink:git-branch[1] shows two
branches, with an asterisk marking the currently checked-out branch:

------------------------------------------------
$ git branch
  master
* new
------------------------------------------------

If you decide that you'd rather see version 2.6.17, you can modify
the current branch to point at v2.6.17 instead, with

------------------------------------------------
$ git reset --hard v2.6.17
------------------------------------------------

Note that if the current branch head was your only reference to a
particular point in history, then resetting that branch may leave you
with no way to find the history it used to point to; so use this command
carefully.

[[understanding-commits]]
Understanding History: Commits
------------------------------

Every change in the history of a project is represented by a commit.
The gitlink:git-show[1] command shows the most recent commit on the
current branch:

------------------------------------------------
$ git show
commit 2b5f6dcce5bf94b9b119e9ed8d537098ec61c3d2
Author: Jamal Hadi Salim <hadi@cyberus.ca>
Date:   Sat Dec 2 22:22:25 2006 -0800

    [XFRM]: Fix aevent structuring to be more complete.

    aevents can not uniquely identify an SA. We break the ABI with this
    patch, but consensus is that since it is not yet utilized by any
    (known) application then it is fine (better do it now than later).

    Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
    Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/Documentation/networking/xfrm_sync.txt b/Documentation/networking/xfrm_sync.txt
index 8be626f..d7aac9d 100644
--- a/Documentation/networking/xfrm_sync.txt
+++ b/Documentation/networking/xfrm_sync.txt
@@ -47,10 +47,13 @@ aevent_id structure looks like:

    struct xfrm_aevent_id {
        struct xfrm_usersa_id sa_id;
+       xfrm_address_t saddr;
        __u32 flags;
+       __u32 reqid;
    };
...
------------------------------------------------

As you can see, a commit shows who made the latest change, what they
did, and why.

Every commit has a 40-hexdigit id, sometimes called the "object name" or the
"SHA1 id", shown on the first line of the "git show" output. You can usually
refer to a commit by a shorter name, such as a tag or a branch name, but this
longer name can also be useful. Most importantly, it is a globally unique
name for this commit: so if you tell somebody else the object name (for
example in email), then you are guaranteed that name will refer to the same
commit in their repository that it does in yours (assuming their repository
has that commit at all). Since the object name is computed as a hash over the
contents of the commit, you are guaranteed that the commit can never change
without its name also changing.

In fact, in <<git-internals>> we shall see that everything stored in git
history, including file data and directory contents, is stored in an object
with a name that is a hash of its contents.

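The link between contents and object name can be observed directly. The
following is a minimal sketch, assuming a throwaway repository and the
low-level `git hash-object` command (not otherwise covered in this
chapter), which prints the object name git would give to some file data.
It hashes file data rather than a commit, but the same rule applies:

```shell
# Work in a scratch repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

# Identical contents always yield the identical object name...
name_a=$(printf 'hello\n' | git hash-object --stdin)
name_b=$(printf 'hello\n' | git hash-object --stdin)
# ...and changing a single character yields a different name.
name_c=$(printf 'hello!\n' | git hash-object --stdin)

echo "first:   $name_a"
echo "again:   $name_b"
echo "changed: $name_c"
```

The first two names printed should match each other and differ from the
third.
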
[[understanding-reachability]]
Understanding history: commits, parents, and reachability
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Every commit (except the very first commit in a project) also has a
parent commit which shows what happened before this commit.
Following the chain of parents will eventually take you back to the
beginning of the project.

However, the commits do not form a simple list; git allows lines of
development to diverge and then reconverge, and the point where two
lines of development reconverge is called a "merge". The commit
representing a merge can therefore have more than one parent, with
each parent representing the most recent commit on one of the lines
of development leading to that point.

The best way to see how this works is using the gitlink:gitk[1]
command; running gitk now on a git repository and looking for merge
commits will help you understand how git organizes history.

In the following, we say that commit X is "reachable" from commit Y
if commit X is an ancestor of commit Y. Equivalently, you could say
that Y is a descendant of X, or that there is a chain of parents
leading from commit Y to commit X.

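Reachability can be checked mechanically by walking the chain of
parents. The sketch below assumes a scratch repository with three
commits tagged c1, c2 and c3 (hypothetical names), and a small helper
function `is_reachable` that is not part of git:

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
# Placeholder identity for the demo commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Build a straight line of history: c1 <- c2 <- c3.
for n in 1 2 3; do
    echo "$n" > file
    git add file
    git commit -q -m "commit $n"
    git tag "c$n"
done

# X is reachable from Y when X shows up in the list of commits that
# git rev-list produces by following parents from Y. (The list also
# includes Y itself.)
is_reachable() {   # usage: is_reachable X Y
    git rev-list "$2" | grep -q "$(git rev-parse "$1")"
}

if is_reachable c1 c3; then echo "c1 is reachable from c3"; fi
if is_reachable c3 c1; then
    echo "c3 is reachable from c1"
else
    echo "c3 is not reachable from c1"
fi
```
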
[[history-diagrams]]
Understanding history: History diagrams
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We will sometimes represent git history using diagrams like the one
below. Commits are shown as "o", and the links between them with
lines drawn with - / and \. Time goes left to right:


................................................
         o--o--o <-- Branch A
        /
 o--o--o <-- master
        \
         o--o--o <-- Branch B
................................................

If we need to talk about a particular commit, the character "o" may
be replaced with another letter or number.

[[what-is-a-branch]]
Understanding history: What is a branch?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When we need to be precise, we will use the word "branch" to mean a line
of development, and "branch head" (or just "head") to mean a reference
to the most recent commit on a branch. In the example above, the branch
head named "A" is a pointer to one particular commit, but we refer to
the line of three commits leading up to that point as all being part of
"branch A".

However, when no confusion will result, we often just use the term
"branch" both for branches and for branch heads.

[[manipulating-branches]]
Manipulating branches
---------------------

Creating, deleting, and modifying branches is quick and easy; here's
a summary of the commands:

git branch::
        list all branches
git branch <branch>::
        create a new branch named <branch>, referencing the same
        point in history as the current branch
git branch <branch> <start-point>::
        create a new branch named <branch>, referencing
        <start-point>, which may be specified any way you like,
        including using a branch name or a tag name
git branch -d <branch>::
        delete the branch <branch>; if the branch you are deleting
        points to a commit which is not reachable from the current
        branch, this command will fail with a warning.
git branch -D <branch>::
        even if the branch points to a commit not reachable
        from the current branch, you may know that that commit
        is still reachable from some other branch or tag. In that
        case it is safe to use this command to force git to delete
        the branch.
git checkout <branch>::
        make the current branch <branch>, updating the working
        directory to reflect the version referenced by <branch>
git checkout -b <new> <start-point>::
        create a new branch <new> referencing <start-point>, and
        check it out.

The special symbol "HEAD" can always be used to refer to the current
branch. In fact, git uses a file named "HEAD" in the .git directory to
remember which branch is current:

------------------------------------------------
$ cat .git/HEAD
ref: refs/heads/master
------------------------------------------------

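As a short walk-through of the commands summarized above, the sketch
below creates, switches to, and deletes branches in a scratch
repository. The branch names "topic" and "work" are made up for the
example, and `git symbolic-ref HEAD` is used here as another way of
asking which branch HEAD currently names:

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
# Placeholder identity for the demo commit.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
echo one > file
git add file
git commit -q -m "first"

git branch topic                  # new head at the current commit
git checkout -q -b work master    # create "work" from master and switch to it
current=$(git symbolic-ref HEAD)  # full name of the branch HEAD refers to
echo "HEAD now names: $current"

git checkout -q master
# "work" points at a commit reachable from master, so -d is allowed.
git branch -d work >/dev/null
git branch                        # lists master (current) and topic
```
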
[[detached-head]]
Examining an old version without creating a new branch
------------------------------------------------------

The git-checkout command normally expects a branch head, but will also
accept an arbitrary commit; for example, you can check out the commit
referenced by a tag:

------------------------------------------------
$ git checkout v2.6.17
Note: moving to "v2.6.17" which isn't a local branch
If you want to create a new branch from this checkout, you may do so
(now or later) by using -b with the checkout command again. Example:
  git checkout -b <new_branch_name>
HEAD is now at 427abfa... Linux v2.6.17
------------------------------------------------

The HEAD then refers to the SHA1 of the commit instead of to a branch,
and git branch shows that you are no longer on a branch:

------------------------------------------------
$ cat .git/HEAD
427abfa28afedffadfca9dd8b067eb6d36bac53f
$ git branch
* (no branch)
  master
------------------------------------------------

In this case we say that the HEAD is "detached".

This is an easy way to check out a particular version without having to
make up a name for the new branch. You can still create a new branch
(or tag) for this version later if you decide to.

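A script can tell the two states apart. The sketch below is one way to
do it, assuming a scratch repository with a tag named v1 (a made-up
name): `git symbolic-ref -q HEAD` succeeds only when HEAD names a
branch, so its failure indicates a detached HEAD.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
# Placeholder identity for the demo commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
echo one > file
git add file
git commit -q -m "first"
git tag v1
echo two > file
git commit -q -a -m "second"

# Check out the tagged commit directly; no branch is involved.
git checkout -q v1
if git symbolic-ref -q HEAD >/dev/null; then
    state="on a branch"
else
    state="detached"
fi
echo "HEAD is $state at $(git rev-parse HEAD)"

# Decide afterwards that this position deserves a branch name.
git checkout -q -b from-v1
now_on=$(git symbolic-ref HEAD)
echo "now on $now_on"
```
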
[[examining-remote-branches]]
Examining branches from a remote repository
-------------------------------------------

The "master" branch that was created at the time you cloned is a copy
of the HEAD in the repository that you cloned from. That repository
may also have had other branches, though, and your local repository
keeps branches which track each of those remote branches, which you
can view using the "-r" option to gitlink:git-branch[1]:

------------------------------------------------
$ git branch -r
  origin/HEAD
  origin/html
  origin/maint
  origin/man
  origin/master
  origin/next
  origin/pu
  origin/todo
------------------------------------------------

You cannot check out these remote-tracking branches, but you can
examine them on a branch of your own, just as you would a tag:

------------------------------------------------
$ git checkout -b my-todo-copy origin/todo
------------------------------------------------

Note that the name "origin" is just the name that git uses by default
to refer to the repository that you cloned from.

[[how-git-stores-references]]
Naming branches, tags, and other references
-------------------------------------------

Branches, remote-tracking branches, and tags are all references to
commits. All references are named with a slash-separated path name
starting with "refs"; the names we've been using so far are actually
shorthand:

    - The branch "test" is short for "refs/heads/test".
    - The tag "v2.6.18" is short for "refs/tags/v2.6.18".
    - "origin/master" is short for "refs/remotes/origin/master".

The full name is occasionally useful if, for example, there ever
exists a tag and a branch with the same name.

As another useful shortcut, the "HEAD" of a repository can be referred
to just using the name of that repository. So, for example, "origin"
is usually a shortcut for the HEAD branch in the repository "origin".

For the complete list of paths which git checks for references, and
the order it uses to decide which to choose when there are multiple
references with the same shorthand name, see the "SPECIFYING
REVISIONS" section of gitlink:git-rev-parse[1].

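To see the full name behind a shorthand, one option is the
`--symbolic-full-name` option of git rev-parse. The sketch below
assumes a scratch repository containing a branch "test" and a tag
"v1.0", both invented for the example:

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
# Placeholder identity for the demo commit.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
echo one > file
git add file
git commit -q -m "first"
git branch test
git tag v1.0

# Expand each shorthand to the full reference name it stands for.
full_branch=$(git rev-parse --symbolic-full-name test)
full_tag=$(git rev-parse --symbolic-full-name v1.0)
echo "test -> $full_branch"
echo "v1.0 -> $full_tag"

# Short and full names resolve to the same commit.
short_id=$(git rev-parse test)
long_id=$(git rev-parse refs/heads/test)
```
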
[[Updating-a-repository-with-git-fetch]]
Updating a repository with git fetch
------------------------------------

Eventually the developer cloned from will do additional work in her
repository, creating new commits and advancing the branches to point
at the new commits.

The command "git fetch", with no arguments, will update all of the
remote-tracking branches to the latest version found in her
repository. It will not touch any of your own branches--not even the
"master" branch that was created for you on clone.

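The effect can be reproduced locally. This is a sketch under the
assumption that two scratch directories, "upstream" and "mine"
(hypothetical names), stand in for the other developer's repository
and your clone of it:

```shell
base=$(mktemp -d)
# Placeholder identity for the demo commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# The other developer's repository, with one commit.
git init -q "$base/upstream"
cd "$base/upstream" || exit 1
git symbolic-ref HEAD refs/heads/master
echo one > file
git add file
git commit -q -m "first"

# Your clone of it.
git clone -q "$base/upstream" "$base/mine"

# She does more work after you cloned.
echo two > file
git commit -q -a -m "second"
upstream_tip=$(git rev-parse master)

# Fetch: the remote-tracking branch advances, your master does not.
cd "$base/mine" || exit 1
git fetch -q
tracking_tip=$(git rev-parse origin/master)
local_tip=$(git rev-parse master)
git log --pretty=oneline master..origin/master   # what is new upstream
```
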
[[fetching-branches]]
Fetching branches from other repositories
-----------------------------------------

You can also track branches from repositories other than the one you
cloned from, using gitlink:git-remote[1]:

-------------------------------------------------
$ git remote add linux-nfs git://linux-nfs.org/pub/nfs-2.6.git
$ git fetch linux-nfs
* refs/remotes/linux-nfs/master: storing branch 'master' ...
  commit: bf81b46
-------------------------------------------------

New remote-tracking branches will be stored under the shorthand name
that you gave "git remote add", in this case linux-nfs:

-------------------------------------------------
$ git branch -r
linux-nfs/master
origin/master
-------------------------------------------------

If you run "git fetch <remote>" later, the tracking branches for the
named <remote> will be updated.

If you examine the file .git/config, you will see that git has added
a new stanza:

-------------------------------------------------
$ cat .git/config
...
[remote "linux-nfs"]
        url = git://linux-nfs.org/pub/nfs-2.6.git
        fetch = +refs/heads/*:refs/remotes/linux-nfs/*
...
-------------------------------------------------

This is what causes git to track the remote's branches; you may modify
or delete these configuration options by editing .git/config with a
text editor. (See the "CONFIGURATION FILE" section of
gitlink:git-config[1] for details.)

[[exploring-git-history]]
Exploring git history
=====================

Git is best thought of as a tool for storing the history of a
collection of files. It does this by storing compressed snapshots of
the contents of a file hierarchy, together with "commits" which show
the relationships between these snapshots.

Git provides extremely flexible and fast tools for exploring the
history of a project.

We start with one specialized tool that is useful for finding the
commit that introduced a bug into a project.

[[using-bisect]]
How to use bisect to find a regression
--------------------------------------

Suppose version 2.6.18 of your project worked, but the version at
"master" crashes. Sometimes the best way to find the cause of such a
regression is to perform a brute-force search through the project's
history to find the particular commit that caused the problem. The
gitlink:git-bisect[1] command can help you do this:

-------------------------------------------------
$ git bisect start
$ git bisect good v2.6.18
$ git bisect bad master
Bisecting: 3537 revisions left to test after this
[65934a9a028b88e83e2b0f8b36618fe503349f8e] BLOCK: Make USB storage depend on SCSI rather than selecting it [try #6]
-------------------------------------------------

If you run "git branch" at this point, you'll see that git has
temporarily moved you to a new branch named "bisect". This branch
points to a commit (with commit id 65934...) that is reachable from
"master" but not from v2.6.18. Compile and test it, and see whether
it crashes. Assume it does crash. Then:

-------------------------------------------------
$ git bisect bad
Bisecting: 1769 revisions left to test after this
[7eff82c8b1511017ae605f0c99ac275a7e21b867] i2c-core: Drop useless bitmaskings
-------------------------------------------------

checks out an older version. Continue like this, telling git at each
stage whether the version it gives you is good or bad, and notice
that the number of revisions left to test is cut approximately in
half each time.

After about 13 tests (in this case), it will output the commit id of
the guilty commit. You can then examine the commit with
gitlink:git-show[1], find out who wrote it, and mail them your bug
report with the commit id. Finally, run

-------------------------------------------------
$ git bisect reset
-------------------------------------------------

to return you to the branch you were on before and delete the
temporary "bisect" branch.

Note that the version which git-bisect checks out for you at each
point is just a suggestion, and you're free to try a different
version if you think it would be a good idea. For example,
occasionally you may land on a commit that broke something unrelated;
run

-------------------------------------------------
$ git bisect visualize
-------------------------------------------------

which will run gitk and label the commit it chose with a marker that
says "bisect". Choose a safe-looking commit nearby, note its commit
id, and check it out with:

-------------------------------------------------
$ git reset --hard fb47ddb2db...
-------------------------------------------------

then test, run "bisect good" or "bisect bad" as appropriate, and
continue.

[[naming-commits]]
Naming commits
--------------

We have seen several ways of naming commits already:

    - 40-hexdigit object name
    - branch name: refers to the commit at the head of the given
      branch
    - tag name: refers to the commit pointed to by the given tag
      (we've seen branches and tags are special cases of
      <<how-git-stores-references,references>>).
    - HEAD: refers to the head of the current branch

There are many more; see the "SPECIFYING REVISIONS" section of the
gitlink:git-rev-parse[1] man page for the complete list of ways to
name revisions. Some examples:

-------------------------------------------------
$ git show fb47ddb2 # the first few characters of the object name
                    # are usually enough to specify it uniquely
$ git show HEAD^    # the parent of the HEAD commit
$ git show HEAD^^   # the grandparent
$ git show HEAD~4   # the great-great-grandparent
-------------------------------------------------

Recall that merge commits may have more than one parent; by default,
^ and ~ follow the first parent listed in the commit, but you can
also choose:

-------------------------------------------------
$ git show HEAD^1   # show the first parent of HEAD
$ git show HEAD^2   # show the second parent of HEAD
-------------------------------------------------

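The parent notation is easiest to follow on a small merge. The sketch
below assumes a scratch repository in which a branch "side" is merged
into master; the tag names t-base, t-main and t-side are invented
labels for the three ordinary commits:

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
# Placeholder identity for the demo commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo base > base.txt
git add base.txt
git commit -q -m "base"
git tag t-base
git branch side

echo main > main.txt              # master moves ahead
git add main.txt
git commit -q -m "work on master"
git tag t-main

git checkout -q side              # side moves ahead independently
echo side > side.txt
git add side.txt
git commit -q -m "work on side"
git tag t-side

git checkout -q master
git merge -q -m "merge side" side >/dev/null   # HEAD is now a merge commit

first=$(git rev-parse HEAD^1)     # the commit master was on before merging
second=$(git rev-parse HEAD^2)    # the commit that was merged in
grand=$(git rev-parse HEAD^^)     # first parent of the first parent
tilde=$(git rev-parse HEAD~2)     # same commit, written with ~
```

Here `HEAD^1` resolves to t-main, `HEAD^2` to t-side, and both `HEAD^^`
and `HEAD~2` to t-base.
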
In addition to HEAD, there are several other special names for
commits:

Merges (to be discussed later), as well as operations such as
git-reset, which change the currently checked-out commit, generally
set ORIG_HEAD to the value HEAD had before the current operation.

The git-fetch operation always stores the head of the last fetched
branch in FETCH_HEAD. For example, if you run git fetch without
specifying a local branch as the target of the operation

-------------------------------------------------
$ git fetch git://example.com/proj.git theirbranch
-------------------------------------------------

the fetched commits will still be available from FETCH_HEAD.

When we discuss merges we'll also see the special name MERGE_HEAD,
which refers to the other branch that we're merging in to the current
branch.

The gitlink:git-rev-parse[1] command is a low-level command that is
occasionally useful for translating some name for a commit to the object
name for that commit:

-------------------------------------------------
$ git rev-parse origin
e05db0fd4f31dde7005f075a84f96b360d05984b
-------------------------------------------------

[[creating-tags]]
Creating tags
-------------

We can also create a tag to refer to a particular commit; after
running

-------------------------------------------------
$ git tag stable-1 1b2e1d63ff
-------------------------------------------------

you can use stable-1 to refer to the commit 1b2e1d63ff.

This creates a "lightweight" tag. If you would also like to include a
comment with the tag, and possibly sign it cryptographically, then you
should create a tag object instead; see the gitlink:git-tag[1] man page
for details.

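The difference between the two kinds of tag can be seen by asking git
what type of object each name refers to. This sketch assumes a scratch
repository; the tag name "stable-1-annotated" and its message are made
up, and `git cat-file -t` is a low-level command that prints an
object's type:

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
# Placeholder identity for the demo commit and the tag object.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
echo one > file
git add file
git commit -q -m "first"

git tag stable-1 HEAD                                 # lightweight tag
git tag -a -m "first stable release" stable-1-annotated HEAD   # tag object

light=$(git cat-file -t stable-1)             # prints "commit"
heavy=$(git cat-file -t stable-1-annotated)   # prints "tag"
echo "stable-1 refers to a $light"
echo "stable-1-annotated refers to a $heavy"

# Both still lead to the same commit in the end.
tagged=$(git rev-parse 'stable-1-annotated^{commit}')
```
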
[[browsing-revisions]]
Browsing revisions
------------------

The gitlink:git-log[1] command can show lists of commits. On its
own, it shows all commits reachable from the parent commit; but you
can also make more specific requests:

-------------------------------------------------
$ git log v2.5..                # commits since (not reachable from) v2.5
$ git log test..master          # commits reachable from master but not test
$ git log master..test          # ...reachable from test but not master
$ git log master...test         # ...reachable from either test or master,
                                #    but not both
$ git log --since="2 weeks ago" # commits from the last 2 weeks
$ git log Makefile              # commits which modify Makefile
$ git log fs/                   # ... which modify any file under fs/
$ git log -S'foo()'             # commits which add or remove any file data
                                # matching the string 'foo()'
-------------------------------------------------

And of course you can combine all of these; the following finds
commits since v2.5 which touch the Makefile or any file under fs:

-------------------------------------------------
$ git log v2.5.. Makefile fs/
-------------------------------------------------

You can also ask git log to show patches:

-------------------------------------------------
$ git log -p
-------------------------------------------------

See the "--pretty" option in the gitlink:git-log[1] man page for more
display options.

Note that git log starts with the most recent commit and works
backwards through the parents; however, since git history can contain
multiple independent lines of development, the particular order that
commits are listed in may be somewhat arbitrary.

[[generating-diffs]]
Generating diffs
----------------

You can generate diffs between any two versions using
gitlink:git-diff[1]:

-------------------------------------------------
$ git diff master..test
-------------------------------------------------

Sometimes what you want instead is a set of patches:

-------------------------------------------------
$ git format-patch master..test
-------------------------------------------------

will generate a file with a patch for each commit reachable from test
but not from master. Note that if master also has commits which are
not reachable from test, then the combined result of these patches
will not be the same as the diff produced by the git-diff example.

[[viewing-old-file-versions]]
Viewing old file versions
-------------------------

You can always view an old version of a file by just checking out the
correct revision first. But sometimes it is more convenient to be
able to view an old version of a single file without checking
anything out; this command does that:

-------------------------------------------------
$ git show v2.5:fs/locks.c
-------------------------------------------------

Before the colon may be anything that names a commit, and after it
may be any path to a file tracked by git.

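As a small self-contained illustration, the sketch below records two
versions of a file in a scratch repository and reads the older one back
without touching the working directory. The tag name v1 and the file
contents are invented for the example:

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
# Placeholder identity for the demo commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

mkdir fs
echo "old locking code" > fs/locks.c
git add fs/locks.c
git commit -q -m "first version"
git tag v1

echo "new locking code" > fs/locks.c
git commit -q -a -m "second version"

old=$(git show v1:fs/locks.c)   # contents as of the tagged commit
new=$(cat fs/locks.c)           # working directory is left untouched
echo "at v1: $old"
echo "now:   $new"
```
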
[[history-examples]]
Examples
--------

[[counting-commits-on-a-branch]]
Counting the number of commits on a branch
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Suppose you want to know how many commits you've made on "mybranch"
since it diverged from "origin":

-------------------------------------------------
$ git log --pretty=oneline origin..mybranch | wc -l
-------------------------------------------------

Alternatively, you may often see this sort of thing done with the
lower-level command gitlink:git-rev-list[1], which just lists the SHA1's
of all the given commits:

-------------------------------------------------
$ git rev-list origin..mybranch | wc -l
-------------------------------------------------

[[checking-for-equal-branches]]
Check whether two branches point at the same history
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Suppose you want to check whether two branches point at the same point
in history.

-------------------------------------------------
$ git diff origin..master
-------------------------------------------------

will tell you whether the contents of the project are the same at the
two branches; in theory, however, it's possible that the same project
contents could have been arrived at by two different historical
routes. You could compare the object names:

-------------------------------------------------
$ git rev-parse origin
e05db0fd4f31dde7005f075a84f96b360d05984b
$ git rev-parse master
e05db0fd4f31dde7005f075a84f96b360d05984b
-------------------------------------------------

Or you could recall that the ... operator selects all commits
reachable from either one reference or the other but not
both: so

-------------------------------------------------
$ git log origin...master
-------------------------------------------------

will return no commits when the two branches are equal.

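In a script, the same test can be written as an emptiness check. The
following sketch assumes a scratch repository with a branch "copy"
(a made-up name) and a helper function `same_history` that is not part
of git:

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
# Placeholder identity for the demo commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
echo one > file
git add file
git commit -q -m "first"
git branch copy

# Two names point at the same history when no commit is reachable
# from exactly one of them.
same_history() {   # usage: same_history A B
    [ -z "$(git rev-list "$1...$2")" ]
}

if same_history master copy; then before=equal; else before=different; fi

# Advance master by one commit; "copy" stays behind.
echo two > file
git commit -q -a -m "second"

if same_history master copy; then after=equal; else after=different; fi
echo "before the new commit: $before"
echo "after the new commit:  $after"
```
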
e34caace 744[[finding-tagged-descendants]]
b181d57f
BF
745Find first tagged version including a given fix
746~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
aec053bb 747
69f7ad73
BF
748Suppose you know that the commit e05db0fd fixed a certain problem.
749You'd like to find the earliest tagged release that contains that
750fix.
751
752Of course, there may be more than one answer--if the history branched
753after commit e05db0fd, then there could be multiple "earliest" tagged
754releases.
755
756You could just visually inspect the commits since e05db0fd:
757
758-------------------------------------------------
759$ gitk e05db0fd..
760-------------------------------------------------
761
b181d57f
BF
762Or you can use gitlink:git-name-rev[1], which will give the commit a
763name based on any tag it finds pointing to one of the commit's
764descendants:
765
766-------------------------------------------------
04483524 767$ git name-rev --tags e05db0fd
b181d57f
BF
768e05db0fd tags/v1.5.0-rc1^0~23
769-------------------------------------------------
770
771The gitlink:git-describe[1] command does the opposite, naming the
772revision using a tag on which the given commit is based:
773
774-------------------------------------------------
775$ git describe e05db0fd
04483524 776v1.5.0-rc0-260-ge05db0f
b181d57f
BF
777-------------------------------------------------
778
779but that may sometimes help you guess which tags might come after the
780given commit.
781
782If you just want to verify whether a given tagged version contains a
783given commit, you could use gitlink:git-merge-base[1]:
784
785-------------------------------------------------
786$ git merge-base e05db0fd v1.5.0-rc1
787e05db0fd4f31dde7005f075a84f96b360d05984b
788-------------------------------------------------
789
790The merge-base command finds a common ancestor of the given commits,
791and always returns one or the other in the case where one is a
792descendant of the other; so the above output shows that e05db0fd
793actually is an ancestor of v1.5.0-rc1.
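
The merge-base check can be scripted. The following is a minimal sketch,
assuming a throwaway repository, a placeholder identity, and a made-up tag
name "v-demo"; it compares the merge base with the commit itself and, as a
cross-check, counts the commits of the fix that the tag does not contain.

```shell
#!/bin/sh
# Sketch only: the repository, identity, and tag name are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo base > file.txt
git add file.txt
git commit -q -m "base"
echo fix >> file.txt
git commit -q -a -m "the fix"
fix=$(git rev-parse HEAD)                   # commit we want to look for
echo more >> file.txt
git commit -q -a -m "later work"
git tag v-demo                              # tag a later commit

# If the merge base is the fix itself, the fix is an ancestor of the tag.
base=$(git merge-base "$fix" v-demo)
if [ "$base" = "$fix" ]; then contained=yes; else contained=no; fi

# Cross-check: nothing reachable from the fix is missing from the tag.
missing=$(git rev-list v-demo.."$fix" | wc -l | tr -d ' ')

echo "tag contains fix: $contained (missing commits: $missing)"
```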

Alternatively, note that

-------------------------------------------------
$ git log v1.5.0-rc1..e05db0fd
-------------------------------------------------

will produce empty output if and only if v1.5.0-rc1 includes e05db0fd,
because it outputs only commits that are not reachable from v1.5.0-rc1.

As yet another alternative, the gitlink:git-show-branch[1] command lists
the commits reachable from its arguments with a display on the left-hand
side that indicates which arguments that commit is reachable from. So,
you can run something like

-------------------------------------------------
$ git show-branch e05db0fd v1.5.0-rc0 v1.5.0-rc1 v1.5.0-rc2
! [e05db0fd] Fix warnings in sha1_file.c - use C99 printf format if
available
 ! [v1.5.0-rc0] GIT v1.5.0 preview
  ! [v1.5.0-rc1] GIT v1.5.0-rc1
   ! [v1.5.0-rc2] GIT v1.5.0-rc2
...
-------------------------------------------------

then search for a line that looks like

-------------------------------------------------
+ ++ [e05db0fd] Fix warnings in sha1_file.c - use C99 printf format if
available
-------------------------------------------------

which shows that e05db0fd is reachable from itself, from v1.5.0-rc1, and
from v1.5.0-rc2, but not from v1.5.0-rc0.

[[showing-commits-unique-to-a-branch]]
Showing commits unique to a given branch
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Suppose you would like to see all the commits reachable from the branch
head named "master" but not from any other head in your repository.

We can list all the heads in this repository with
gitlink:git-show-ref[1]:

-------------------------------------------------
$ git show-ref --heads
bf62196b5e363d73353a9dcf094c59595f3153b7 refs/heads/core-tutorial
db768d5504c1bb46f63ee9d6e1772bd047e05bf9 refs/heads/maint
a07157ac624b2524a059a3414e99f6f44bebc1e7 refs/heads/master
24dbc180ea14dc1aebe09f14c8ecf32010690627 refs/heads/tutorial-2
1e87486ae06626c2f31eaa63d26fc0fd646c8af2 refs/heads/tutorial-fixes
-------------------------------------------------

We can get just the branch-head names, and remove "master", with
the help of the standard utilities cut and grep:

-------------------------------------------------
$ git show-ref --heads | cut -d' ' -f2 | grep -v '^refs/heads/master'
refs/heads/core-tutorial
refs/heads/maint
refs/heads/tutorial-2
refs/heads/tutorial-fixes
-------------------------------------------------

And then we can ask to see all the commits reachable from master
but not from these other heads:

-------------------------------------------------
$ gitk master --not $( git show-ref --heads | cut -d' ' -f2 |
				grep -v '^refs/heads/master' )
-------------------------------------------------

Obviously, endless variations are possible; for example, to see all
commits reachable from some head but not from any tag in the repository:

-------------------------------------------------
$ gitk $( git show-ref --heads ) --not $( git show-ref --tags )
-------------------------------------------------

(See gitlink:git-rev-parse[1] for explanations of commit-selecting
syntax such as `--not`.)
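
The same selection works without gitk. Here is a small sketch, assuming a
throwaway repository with made-up branches "maint" and "tutorial-fixes",
that feeds the filtered head list to git rev-list instead. It anchors the
grep pattern with `$`, an addition not in the example above, so that a
hypothetical head such as "master-next" would not be filtered out too.

```shell
#!/bin/sh
# Sketch only: repository, identity, and branch names are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master     # name the initial branch "master"
git config user.name "Demo User"
git config user.email "demo@example.com"

echo base > file.txt
git add file.txt
git commit -q -m "base"
git branch maint                            # two other heads at "base"
git branch tutorial-fixes
echo more >> file.txt
git commit -q -a -m "only on master"

# Every head except master, one ref name per line.
others=$(git show-ref --heads | cut -d' ' -f2 | grep -v '^refs/heads/master$')

# Commits reachable from master but from none of the other heads.
# $others is left unquoted on purpose so each ref becomes its own argument.
unique=$(git rev-list master --not $others | wc -l | tr -d ' ')

echo "commits unique to master: $unique"
```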

[[making-a-release]]
Creating a changelog and tarball for a software release
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The gitlink:git-archive[1] command can create a tar or zip archive from
any version of a project; for example:

-------------------------------------------------
$ git archive --format=tar --prefix=project/ HEAD | gzip >latest.tar.gz
-------------------------------------------------

will use HEAD to produce a tar archive in which each filename is
preceded by "project/".

If you're releasing a new version of a software project, you may want
to simultaneously make a changelog to include in the release
announcement.

Linus Torvalds, for example, makes new kernel releases by tagging them,
then running:

-------------------------------------------------
$ release-script 2.6.12 2.6.13-rc6 2.6.13-rc7
-------------------------------------------------

where release-script is a shell script that looks like:

-------------------------------------------------
#!/bin/sh
stable="$1"
last="$2"
new="$3"
echo "# git tag v$new"
echo "git archive --prefix=linux-$new/ v$new | gzip -9 > ../linux-$new.tar.gz"
echo "git diff v$stable v$new | gzip -9 > ../patch-$new.gz"
echo "git log --no-merges v$new ^v$last > ../ChangeLog-$new"
echo "git shortlog --no-merges v$new ^v$last > ../ShortLog"
echo "git diff --stat --summary -M v$last v$new > ../diffstat-$new"
-------------------------------------------------

and then he just cut-and-pastes the output commands after verifying that
they look OK.

[[Finding-comments-with-given-content]]
Finding commits referencing a file with given content
-----------------------------------------------------

Somebody hands you a copy of a file, and asks which commits modified a
file such that it contained the given content either before or after the
commit. You can find out with this:

-------------------------------------------------
$ git log --raw -r --abbrev=40 --pretty=oneline -- filename |
	grep -B 1 `git hash-object filename`
-------------------------------------------------

Figuring out why this works is left as an exercise to the (advanced)
student. The gitlink:git-log[1], gitlink:git-diff-tree[1], and
gitlink:git-hash-object[1] man pages may prove helpful.

[[Developing-with-git]]
Developing with git
===================

[[telling-git-your-name]]
Telling git your name
---------------------

Before creating any commits, you should introduce yourself to git. The
easiest way to do so is to make sure the following lines appear in a
file named .gitconfig in your home directory:

------------------------------------------------
[user]
	name = Your Name Comes Here
	email = you@yourdomain.example.com
------------------------------------------------

(See the "CONFIGURATION FILE" section of gitlink:git-config[1] for
details on the configuration file.)


[[creating-a-new-repository]]
Creating a new repository
-------------------------

Creating a new repository from scratch is very easy:

-------------------------------------------------
$ mkdir project
$ cd project
$ git init
-------------------------------------------------

If you have some initial content (say, a tarball):

-------------------------------------------------
$ tar -xzvf project.tar.gz
$ cd project
$ git init
$ git add . # include everything below ./ in the first commit:
$ git commit
-------------------------------------------------

[[how-to-make-a-commit]]
How to make a commit
--------------------

Creating a new commit takes three steps:

	1. Making some changes to the working directory using your
	   favorite editor.
	2. Telling git about your changes.
	3. Creating the commit using the content you told git about
	   in step 2.

In practice, you can interleave and repeat steps 1 and 2 as many
times as you want: in order to keep track of what you want committed
at step 3, git maintains a snapshot of the tree's contents in a
special staging area called "the index."

At the beginning, the content of the index will be identical to
that of the HEAD. The command "git diff --cached", which shows
the difference between the HEAD and the index, should therefore
produce no output at that point.

Modifying the index is easy:

To update the index with the new contents of a modified file, use

-------------------------------------------------
$ git add path/to/file
-------------------------------------------------

To add the contents of a new file to the index, use

-------------------------------------------------
$ git add path/to/file
-------------------------------------------------

To remove a file from the index and from the working tree,

-------------------------------------------------
$ git rm path/to/file
-------------------------------------------------

After each step you can verify that

-------------------------------------------------
$ git diff --cached
-------------------------------------------------

always shows the difference between the HEAD and the index file--this
is what you'd commit if you created the commit now--and that

-------------------------------------------------
$ git diff
-------------------------------------------------

shows the difference between the working tree and the index file.

Note that "git add" always adds just the current contents of a file
to the index; further changes to the same file will be ignored unless
you run git-add on the file again.

When you're ready, just run

-------------------------------------------------
$ git commit
-------------------------------------------------

and git will prompt you for a commit message and then create the new
commit. Check to make sure it looks like what you expected with

-------------------------------------------------
$ git show
-------------------------------------------------

As a special shortcut,

-------------------------------------------------
$ git commit -a
-------------------------------------------------

will update the index with any files that you've modified or removed
and create a commit, all in one step.

A number of commands are useful for keeping track of what you're
about to commit:

-------------------------------------------------
$ git diff --cached # difference between HEAD and the index; what
                    # would be committed if you ran "commit" now.
$ git diff          # difference between the index file and your
                    # working directory; changes that would not
                    # be included if you ran "commit" now.
$ git diff HEAD     # difference between HEAD and working tree; what
                    # would be committed if you ran "commit -a" now.
$ git status        # a brief per-file summary of the above.
-------------------------------------------------
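
To make the role of the index concrete, here is a minimal sketch in a
throwaway repository. The file name, its contents, and the identity are
placeholders. It stages a file, edits it again, and then shows that the
commit records only the staged contents.

```shell
#!/bin/sh
# Sketch only: repository, identity, and file contents are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "first"

echo two >> file.txt
git add file.txt                            # stage the file as it is right now
echo three >> file.txt                      # later edit, never staged

staged=$(git diff --cached --name-only)     # HEAD vs. index
unstaged=$(git diff --name-only)            # index vs. working tree

git commit -q -m "second"                   # commits the index, not the file on disk
committed=$(git show HEAD:file.txt)         # contains "one" and "two" only
leftover=$(git diff --name-only)            # "three" is still an uncommitted change

echo "staged: $staged, unstaged: $unstaged, left after commit: $leftover"
```

Running "git add file.txt" once more before the commit would have picked
up the third line as well.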

[[creating-good-commit-messages]]
Creating good commit messages
-----------------------------

Though not required, it's a good idea to begin the commit message
with a single short (less than 50 character) line summarizing the
change, followed by a blank line and then a more thorough
description. Tools that turn commits into email, for example, use
the first line on the Subject line and the rest of the commit in the
body.

[[ignoring-files]]
Ignoring files
--------------

A project will often generate files that you do 'not' want to track with git.
This typically includes files generated by a build process or temporary
backup files made by your editor. Of course, 'not' tracking files with git
is just a matter of 'not' calling "`git add`" on them. But it quickly becomes
annoying to have these untracked files lying around; e.g. they make
"`git add .`" and "`git commit -a`" practically useless, and they keep
showing up in the output of "`git status`".

You can tell git to ignore certain files by creating a file called .gitignore
in the top level of your working directory, with contents such as:

-------------------------------------------------
# Lines starting with '#' are considered comments.
# Ignore any file named foo.txt.
foo.txt
# Ignore (generated) html files,
*.html
# except foo.html which is maintained by hand.
!foo.html
# Ignore objects and archives.
*.[oa]
-------------------------------------------------

See gitlink:gitignore[5] for a detailed explanation of the syntax. You can
also place .gitignore files in other directories in your working tree, and they
will apply to those directories and their subdirectories. The `.gitignore`
files can be added to your repository like any other files (just run `git add
.gitignore` and `git commit`, as usual), which is convenient when the exclude
patterns (such as patterns matching build output files) would also make sense
for other users who clone your repository.

If you wish the exclude patterns to affect only certain repositories
(instead of every repository for a given project), you may instead put
them in a file in your repository named .git/info/exclude, or in any file
specified by the `core.excludesfile` configuration variable. Some git
commands can also take exclude patterns directly on the command line.
See gitlink:gitignore[5] for the details.
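
One way to check what a set of patterns actually ignores is to list the
untracked files that survive them. The sketch below assumes a throwaway
repository and a handful of made-up file names. It uses
`git ls-files --others --exclude-standard`, which lists untracked files
while honoring .gitignore, .git/info/exclude, and `core.excludesfile`.

```shell
#!/bin/sh
# Sketch only: the repository and all file names are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# A subset of the patterns shown above.
printf '%s\n' '*.html' '!foo.html' '*.[oa]' > .gitignore

# Some generated files and some files worth tracking.
touch foo.html bar.html main.o lib.a main.c

# Untracked files that are NOT ignored, one per line.
untracked=$(git ls-files --others --exclude-standard)

echo "$untracked"
```

Here bar.html, main.o, and lib.a are filtered out, while foo.html is
listed because of the negated pattern.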

[[how-to-merge]]
How to merge
------------

You can rejoin two diverging branches of development using
gitlink:git-merge[1]:

-------------------------------------------------
$ git merge branchname
-------------------------------------------------

merges the development in the branch "branchname" into the current
branch. If there are conflicts--for example, if the same file is
modified in two different ways in the remote branch and the local
branch--then you are warned; the output may look something like this:

-------------------------------------------------
$ git merge next
 100% (4/4) done
Auto-merged file.txt
CONFLICT (content): Merge conflict in file.txt
Automatic merge failed; fix conflicts and then commit the result.
-------------------------------------------------

Conflict markers are left in the problematic files, and after
you resolve the conflicts manually, you can update the index
with the contents and run git commit, as you normally would when
creating a new file.

If you examine the resulting commit using gitk, you will see that it
has two parents, one pointing to the top of the current branch, and
one to the top of the other branch.

[[resolving-a-merge]]
Resolving a merge
-----------------

When a merge isn't resolved automatically, git leaves the index and
the working tree in a special state that gives you all the
information you need to help resolve the merge.

Files with conflicts are marked specially in the index, so until you
resolve the problem and update the index, gitlink:git-commit[1] will
fail:

-------------------------------------------------
$ git commit
file.txt: needs merge
-------------------------------------------------

Also, gitlink:git-status[1] will list those files as "unmerged", and the
files with conflicts will have conflict markers added, like this:

-------------------------------------------------
<<<<<<< HEAD:file.txt
Hello world
=======
Goodbye
>>>>>>> 77976da35a11db4580b80ae27e8d65caf5208086:file.txt
-------------------------------------------------

All you need to do is edit the files to resolve the conflicts, and then

-------------------------------------------------
$ git add file.txt
$ git commit
-------------------------------------------------

Note that the commit message will already be filled in for you with
some information about the merge. Normally you can just use this
default message unchanged, but you may add additional commentary of
your own if desired.

The above is all you need to know to resolve a simple merge. But git
also provides more information to help resolve conflicts:

[[conflict-resolution]]
Getting conflict-resolution help during a merge
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All of the changes that git was able to merge automatically are
already added to the index file, so gitlink:git-diff[1] shows only
the conflicts. It uses an unusual syntax:

-------------------------------------------------
$ git diff
diff --cc file.txt
index 802992c,2b60207..0000000
--- a/file.txt
+++ b/file.txt
@@@ -1,1 -1,1 +1,5 @@@
++<<<<<<< HEAD:file.txt
 +Hello world
++=======
+ Goodbye
++>>>>>>> 77976da35a11db4580b80ae27e8d65caf5208086:file.txt
-------------------------------------------------

Recall that the commit which will be committed after we resolve this
conflict will have two parents instead of the usual one: one parent
will be HEAD, the tip of the current branch; the other will be the
tip of the other branch, which is stored temporarily in MERGE_HEAD.

During the merge, the index holds three versions of each file. Each of
these three "file stages" represents a different version of the file:

-------------------------------------------------
$ git show :1:file.txt	# the file in a common ancestor of both branches
$ git show :2:file.txt	# the version from HEAD.
$ git show :3:file.txt	# the version from MERGE_HEAD.
-------------------------------------------------

When you ask gitlink:git-diff[1] to show the conflicts, it runs a
three-way diff between the conflicted merge result in the working tree
and stages 2 and 3, and shows only the hunks whose contents come from
both sides, mixed. A hunk whose merge result comes only from stage 2,
or only from stage 3, is not conflicting and is not shown.
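
The three stages can be inspected in a deliberately conflicted merge. The
following sketch assumes a throwaway repository whose branch names
("master", "other") and file contents are invented for the demonstration,
loosely mirroring the "Hello world" and "Goodbye" example above.

```shell
#!/bin/sh
# Sketch only: repository, identity, branches, and contents are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master     # name the initial branch "master"
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "Hello" > file.txt
git add file.txt
git commit -q -m "common ancestor"

git checkout -q -b other
echo "Goodbye" > file.txt
git commit -q -a -m "other side"

git checkout -q master
echo "Hello world" > file.txt
git commit -q -a -m "our side"

# Both sides changed the same line, so this merge stops with a conflict.
git merge other >/dev/null 2>&1 || true

base=$(git show :1:file.txt)                # stage 1: common ancestor
ours=$(git show :2:file.txt)                # stage 2: HEAD
theirs=$(git show :3:file.txt)              # stage 3: MERGE_HEAD

# Resolve by hand, update the index, and conclude the merge.
echo "Goodbye world" > file.txt
git add file.txt
git commit -q -m "Merge branch 'other'"
words=$(git rev-list --parents -n 1 HEAD | wc -w | tr -d ' ')

echo "base=$base ours=$ours theirs=$theirs ids-on-merge-line=$words"
```

The final count is 3 because `--parents` prints the merge commit followed
by its two parents.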

The diff above shows the differences between the working-tree version of
file.txt and the stage 2 and stage 3 versions. So instead of preceding
each line by a single "+" or "-", it now uses two columns: the first
column is used for differences between the first parent and the working
directory copy, and the second for differences between the second parent
and the working directory copy. (See the "COMBINED DIFF FORMAT" section
of gitlink:git-diff-files[1] for details of the format.)

After resolving the conflict in the obvious way (but before updating the
index), the diff will look like:

-------------------------------------------------
$ git diff
diff --cc file.txt
index 802992c,2b60207..0000000
--- a/file.txt
+++ b/file.txt
@@@ -1,1 -1,1 +1,1 @@@
- Hello world
 -Goodbye
++Goodbye world
-------------------------------------------------

This shows that our resolved version deleted "Hello world" from the
first parent, deleted "Goodbye" from the second parent, and added
"Goodbye world", which was previously absent from both.

Some special diff options allow diffing the working directory against
any of these stages:

-------------------------------------------------
$ git diff -1 file.txt		# diff against stage 1
$ git diff --base file.txt	# same as the above
$ git diff -2 file.txt		# diff against stage 2
$ git diff --ours file.txt	# same as the above
$ git diff -3 file.txt		# diff against stage 3
$ git diff --theirs file.txt	# same as the above.
-------------------------------------------------

The gitlink:git-log[1] and gitk[1] commands also provide special help
for merges:

-------------------------------------------------
$ git log --merge
$ gitk --merge
-------------------------------------------------

These will display all commits which exist only on HEAD or on
MERGE_HEAD, and which touch an unmerged file.

You may also use gitlink:git-mergetool[1], which lets you merge the
unmerged files using external tools such as emacs or kdiff3.

Each time you resolve the conflicts in a file and update the index:

-------------------------------------------------
$ git add file.txt
-------------------------------------------------

the different stages of that file will be "collapsed", after which
git-diff will (by default) no longer show diffs for that file.

[[undoing-a-merge]]
Undoing a merge
---------------

If you get stuck and decide to just give up and throw the whole mess
away, you can always return to the pre-merge state with

-------------------------------------------------
$ git reset --hard HEAD
-------------------------------------------------

Or, if you've already committed the merge that you want to throw away,

-------------------------------------------------
$ git reset --hard ORIG_HEAD
-------------------------------------------------

However, this last command can be dangerous in some cases--never
throw away a commit you have already committed if that commit may
itself have been merged into another branch, as doing so may confuse
further merges.

[[fast-forwards]]
Fast-forward merges
-------------------

There is one special case not mentioned above, which is treated
differently. Normally, a merge results in a merge commit, with two
parents, one pointing at each of the two lines of development that
were merged.

However, if the current branch is an ancestor of the other--so every
commit present in the current branch is already contained in the
other--then git just performs a "fast forward"; the head of the current
branch is moved forward to point at the head of the merged-in branch,
without any new commits being created.
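
A fast forward is easy to observe. In the sketch below (throwaway
repository, invented branch names "master" and "topic"), master has no
commits of its own beyond the point where topic branched off, so merging
topic only moves the master head. The `--ff` option is passed explicitly
so that a local `merge.ff` setting, if any, does not change the outcome.

```shell
#!/bin/sh
# Sketch only: repository, identity, and branch names are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master     # name the initial branch "master"
git config user.name "Demo User"
git config user.email "demo@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "first"

git checkout -q -b topic
echo two >> file.txt
git commit -q -a -m "work on topic"

git checkout -q master                      # master has not moved since the branch point
git merge -q --ff topic

master_head=$(git rev-parse master)
topic_head=$(git rev-parse topic)
merges=$(git rev-list --merges master | wc -l | tr -d ' ')

echo "heads equal: $([ "$master_head" = "$topic_head" ] && echo yes), merge commits: $merges"
```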

[[fixing-mistakes]]
Fixing mistakes
---------------

If you've messed up the working tree, but haven't yet committed your
mistake, you can return the entire working tree to the last committed
state with

-------------------------------------------------
$ git reset --hard HEAD
-------------------------------------------------

If you make a commit that you later wish you hadn't, there are two
fundamentally different ways to fix the problem:

	1. You can create a new commit that undoes whatever was done
	by the previous commit. This is the correct thing if your
	mistake has already been made public.

	2. You can go back and modify the old commit. You should
	never do this if you have already made the history public;
	git does not normally expect the "history" of a project to
	change, and cannot correctly perform repeated merges from
	a branch that has had its history changed.

[[reverting-a-commit]]
Fixing a mistake with a new commit
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Creating a new commit that reverts an earlier change is very easy;
just pass the gitlink:git-revert[1] command a reference to the bad
commit; for example, to revert the most recent commit:

-------------------------------------------------
$ git revert HEAD
-------------------------------------------------

This will create a new commit which undoes the change in HEAD. You
will be given a chance to edit the commit message for the new commit.

You can also revert an earlier change, for example, the next-to-last:

-------------------------------------------------
$ git revert HEAD^
-------------------------------------------------

In this case git will attempt to undo the old change while leaving
intact any changes made since then. If more recent changes overlap
with the changes to be reverted, then you will be asked to fix
conflicts manually, just as in the case of <<resolving-a-merge,
resolving a merge>>.
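
As an illustration that a revert adds history rather than removing it,
the sketch below (throwaway repository, made-up file contents) reverts the
most recent commit and then counts the commits. The `--no-edit` option
only skips the editor so that the example can run unattended.

```shell
#!/bin/sh
# Sketch only: repository, identity, and file contents are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo good > file.txt
git add file.txt
git commit -q -m "good change"

echo bad >> file.txt
git commit -q -a -m "bad change"

git revert --no-edit HEAD >/dev/null        # new commit undoing "bad change"

content=$(cat file.txt)                     # back to the earlier contents
count=$(git rev-list HEAD | wc -l | tr -d ' ')

echo "file contains: $content, commits in history: $count"
```

The bad commit is still present in the history, which is why this
approach is safe for changes that have already been made public.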

[[fixing-a-mistake-by-editing-history]]
Fixing a mistake by editing history
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If the problematic commit is the most recent commit, and you have not
yet made that commit public, then you may just
<<undoing-a-merge,destroy it using git-reset>>.

Alternatively, you
can edit the working directory and update the index to fix your
mistake, just as if you were going to <<how-to-make-a-commit,create a
new commit>>, then run

-------------------------------------------------
$ git commit --amend
-------------------------------------------------

which will replace the old commit by a new commit incorporating your
changes, giving you a chance to edit the old commit message first.

Again, you should never do this to a commit that may already have
been merged into another branch; use gitlink:git-revert[1] instead in
that case.

It is also possible to edit commits further back in the history, but
this is an advanced topic to be left for
<<cleaning-up-history,another chapter>>.

[[checkout-of-path]]
Checking out an old version of a file
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In the process of undoing a previous bad change, you may find it
useful to check out an older version of a particular file using
gitlink:git-checkout[1]. We've used git checkout before to switch
branches, but it has quite different behavior if it is given a path
name: the command

-------------------------------------------------
$ git checkout HEAD^ path/to/file
-------------------------------------------------

replaces path/to/file by the contents it had in the commit HEAD^, and
also updates the index to match. It does not change branches.

If you just want to look at an old version of the file, without
modifying the working directory, you can do that with
gitlink:git-show[1]:

-------------------------------------------------
$ git show HEAD^:path/to/file
-------------------------------------------------

which will display the given version of the file.
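
The difference between the two commands can be seen side by side. This is
a small sketch in a throwaway repository with an invented file. The `--`
separator is an addition here, used only to make explicit that what
follows is a path.

```shell
#!/bin/sh
# Sketch only: repository, identity, and file contents are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo v1 > file.txt
git add file.txt
git commit -q -m "version 1"
echo v2 > file.txt
git commit -q -a -m "version 2"
before=$(git rev-parse HEAD)

# Look at the old version without touching the working tree.
old=$(git show HEAD^:file.txt)
current=$(cat file.txt)

# Bring the old version back into the working tree and the index.
git checkout HEAD^ -- file.txt
restored=$(cat file.txt)
staged=$(git diff --cached --name-only)     # the index now differs from HEAD
after=$(git rev-parse HEAD)                 # HEAD itself has not moved

echo "shown: $old, on disk before: $current, on disk after: $restored"
```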
1455
e34caace 1456[[ensuring-good-performance]]
d19fbc3c
BF
1457Ensuring good performance
1458-------------------------
1459
1460On large repositories, git depends on compression to keep the history
1461information from taking up to much space on disk or in memory.
1462
1463This compression is not performed automatically. Therefore you
17217090 1464should occasionally run gitlink:git-gc[1]:
d19fbc3c
BF
1465
1466-------------------------------------------------
1467$ git gc
1468-------------------------------------------------
1469
17217090
BF
1470to recompress the archive. This can be very time-consuming, so
1471you may prefer to run git-gc when you are not doing other work.
d19fbc3c 1472
e34caace
BF
1473
1474[[ensuring-reliability]]
11e016a3
BF
1475Ensuring reliability
1476--------------------
1477
e34caace 1478[[checking-for-corruption]]
11e016a3
BF
1479Checking the repository for corruption
1480~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1481
1191ee18
BF
1482The gitlink:git-fsck[1] command runs a number of self-consistency checks
1483on the repository, and reports on any problems. This may take some
21dcb3b7
BF
1484time. The most common warning by far is about "dangling" objects:
1485
1486-------------------------------------------------
04e50e94 1487$ git fsck
21dcb3b7
BF
1488dangling commit 7281251ddd2a61e38657c827739c57015671a6b3
1489dangling commit 2706a059f258c6b245f298dc4ff2ccd30ec21a63
1490dangling commit 13472b7c4b80851a1bc551779171dcb03655e9b5
1491dangling blob 218761f9d90712d37a9c5e36f406f92202db07eb
1492dangling commit bf093535a34a4d35731aa2bd90fe6b176302f14f
1493dangling commit 8e4bec7f2ddaa268bef999853c25755452100f8e
1494dangling tree d50bb86186bf27b681d25af89d3b5b68382e4085
1495dangling tree b24c2473f1fd3d91352a624795be026d64c8841f
1496...
1497-------------------------------------------------
1498
59723040 1499Dangling objects are not a problem. At worst they may take up a little
54782859
AP
1500extra disk space. They can sometimes provide a last-resort method for
1501recovering lost work--see <<dangling-objects>> for details. However, if
1502you wish, you can remove them with gitlink:git-prune[1] or the --prune
1191ee18 1503option to gitlink:git-gc[1]:
21dcb3b7
BF
1504
1505-------------------------------------------------
1506$ git gc --prune
1507-------------------------------------------------
1508
1191ee18
BF
1509This may be time-consuming. Unlike most other git operations (including
1510git-gc when run without any options), it is not safe to prune while
1511other git operations are in progress in the same repository.
21dcb3b7 1512
e34caace 1513[[recovering-lost-changes]]
11e016a3
BF
1514Recovering lost changes
1515~~~~~~~~~~~~~~~~~~~~~~~
1516
e34caace 1517[[reflogs]]
559e4d7a
BF
1518Reflogs
1519^^^^^^^
1520
1521Say you modify a branch with gitlink:git-reset[1] --hard, and then
1522realize that the branch was the only reference you had to that point in
1523history.
1524
1525Fortunately, git also keeps a log, called a "reflog", of all the
1526previous values of each branch. So in this case you can still find the
1527old history using, for example,
1528
1529-------------------------------------------------
1530$ git log master@{1}
1531-------------------------------------------------
1532
1533This lists the commits reachable from the previous version of the head.
1534This syntax can be used to with any git command that accepts a commit,
1535not just with git log. Some other examples:
1536
-------------------------------------------------
$ git show master@{2}		# See where the branch pointed 2,
$ git show master@{3}		# 3, ... changes ago.
$ gitk master@{yesterday}	# See where it pointed yesterday,
$ gitk master@{"1 week ago"}	# ... or last week
$ git log --walk-reflogs master	# show reflog entries for master
-------------------------------------------------
1544
A separate reflog is kept for the HEAD, so

-------------------------------------------------
$ git show HEAD@{"1 week ago"}
-------------------------------------------------

will show what HEAD pointed to one week ago, not what the current branch
pointed to one week ago.  This allows you to see the history of what
you've checked out.

By default, reflog entries are kept for 90 days (30 days for entries
that are no longer reachable from the current tip of the branch), after
which they may be pruned.  See gitlink:git-reflog[1] and
gitlink:git-gc[1] to learn how to control this pruning, and see the
"SPECIFYING REVISIONS" section of gitlink:git-rev-parse[1] for details
of the @{...} syntax.
1559
1560Note that the reflog history is very different from normal git history.
1561While normal history is shared by every repository that works on the
1562same project, the reflog history is not shared: it tells you only about
1563how the branches in your local repository have changed over time.
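
The recovery described in this section can be tried out safely in a
scratch repository.  The following is only a sketch: the temporary
directory, the file name and the identity are made up for illustration,
and "git symbolic-ref --short" is used to find the branch name, which
assumes a git considerably newer than the one this manual was written
for.

```shell
# Throwaway repository; nothing here touches your real work.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two > file.txt
git commit -q -a -m "second"

branch=$(git symbolic-ref --short HEAD)    # name of the current branch
lost=$(git rev-parse HEAD)                 # the commit we are about to "lose"

# Throw away the second commit; no branch points at it any more.
git reset -q --hard HEAD^

# The reflog still records the previous value of the branch.
found=$(git rev-parse "$branch@{1}")

# Move the branch back to where it was.
git reset -q --hard "$found"
restored=$(git rev-parse HEAD)
```

After the last command, "$restored" names the same commit as "$lost".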
1564
[[dangling-object-recovery]]
Examining dangling objects
^^^^^^^^^^^^^^^^^^^^^^^^^^

In some situations the reflog may not be able to save you.  For example,
suppose you delete a branch, then realize you need the history it
contained.  The reflog is also deleted; however, if you have not yet
pruned the repository, then you may still be able to find the lost
commits in the dangling objects that git-fsck reports.  See
<<dangling-objects>> for the details.
1575
1576-------------------------------------------------
1577$ git fsck
1578dangling commit 7281251ddd2a61e38657c827739c57015671a6b3
1579dangling commit 2706a059f258c6b245f298dc4ff2ccd30ec21a63
1580dangling commit 13472b7c4b80851a1bc551779171dcb03655e9b5
1581...
1582-------------------------------------------------
1583
You can examine one of those dangling commits with, for example,
1586
1587------------------------------------------------
1588$ gitk 7281251ddd --not --all
1589------------------------------------------------
1590
which does what it sounds like: it says that you want to see the commit
history that is described by the dangling commit(s), but not the
history that is described by all your existing branches and tags.  Thus
you get exactly the history reachable from that commit that is lost.
(And notice that it might not be just one commit: git-fsck reports only
the "tip of the line" as being dangling, but there might be a whole deep
and complex commit history that was dropped.)
1598
1599If you decide you want the history back, you can always create a new
1600reference pointing to it, for example, a new branch:
1601
1602------------------------------------------------
1603$ git branch recovered-branch 7281251ddd
1604------------------------------------------------
1605
Other types of dangling objects (blobs and trees) are also possible, and
dangling objects can arise in other situations.
1608
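A deleted branch can be recovered end to end in a scratch repository,
as sketched here.  The paths, branch names and identity are
hypothetical.  One assumption deserves attention: recent versions of
git-fsck treat commits mentioned in any reflog (including the HEAD
reflog) as reachable, so this sketch passes --no-reflogs to make the
deleted branch's tip show up as dangling.

```shell
# Throwaway repository for the experiment.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"

echo base > file.txt
git add file.txt
git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)

# Do some work on a topic branch, then delete the branch by mistake.
git checkout -q -b topic
echo work >> file.txt
git commit -q -a -m "topic work"
lost=$(git rev-parse HEAD)
git checkout -q "$main"
git branch -q -D topic

# Ask fsck for dangling commits, ignoring reflog entries.
dangling=$(git fsck --no-reflogs 2>/dev/null |
	awk '$1 == "dangling" && $2 == "commit" { print $3 }')

# Give the lost history a name again.
git branch recovered-branch "$dangling"
```

In this small repository only one dangling commit is reported; in a
real repository you would inspect each candidate (for example with
gitk, as shown above) before choosing one.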
11e016a3 1609
[[sharing-development]]
Sharing development with others
===============================

[[getting-updates-with-git-pull]]
Getting updates with git pull
-----------------------------
1617
1618After you clone a repository and make a few changes of your own, you
1619may wish to check the original repository for updates and merge them
1620into your own work.
1621
1622We have already seen <<Updating-a-repository-with-git-fetch,how to
1623keep remote tracking branches up to date>> with gitlink:git-fetch[1],
1624and how to merge two branches. So you can merge in changes from the
1625original repository's master branch with:
1626
1627-------------------------------------------------
1628$ git fetch
1629$ git merge origin/master
1630-------------------------------------------------
1631
1632However, the gitlink:git-pull[1] command provides a way to do this in
1633one step:
1634
1635-------------------------------------------------
1636$ git pull origin master
1637-------------------------------------------------
1638
1639In fact, "origin" is normally the default repository to pull from,
1640and the default branch is normally the HEAD of the remote repository,
1641so often you can accomplish the above with just
1642
1643-------------------------------------------------
1644$ git pull
1645-------------------------------------------------
1646
See the descriptions of the branch.<name>.remote and branch.<name>.merge
options in gitlink:git-config[1] to learn how to control these defaults
depending on the current branch.  Also note that the --track option to
gitlink:git-branch[1] and gitlink:git-checkout[1] can be used to
automatically set the default remote branch to pull from at the time
that a branch is created:

-------------------------------------------------
$ git checkout --track -b maint origin/maint
-------------------------------------------------
1657
In addition to saving you keystrokes, "git pull" also helps you by
producing a default commit message documenting the branch and
repository that you pulled from.

(But note that no such commit will be created in the case of a
<<fast-forwards,fast forward>>; instead, your branch will just be
updated to point to the latest commit from the upstream branch.)
d19fbc3c 1665
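To see that "git pull" and "git fetch" followed by "git merge" arrive
at the same place, the sketch below sets up a scratch "upstream"
repository and two clones of it.  All directory names and the identity
are hypothetical, and only the simple fast-forward case is covered.

```shell
work=$(mktemp -d)

# A scratch upstream repository with one commit.
git init -q "$work/upstream"
cd "$work/upstream"
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first"
branch=$(git symbolic-ref --short HEAD)

# Two clones taken before upstream moves on.
git clone -q "$work/upstream" "$work/a"
git clone -q "$work/upstream" "$work/b"

# Upstream gains a new commit.
echo two > file.txt
git commit -q -a -m "second"
upstream_tip=$(git rev-parse HEAD)

# Clone a: the two-step way.
cd "$work/a"
git fetch -q
git merge -q "origin/$branch"
a_tip=$(git rev-parse HEAD)

# Clone b: the one-step way.
cd "$work/b"
git pull -q
b_tip=$(git rev-parse HEAD)
```

Both clones end up at the upstream commit, and because each update is
a fast forward, no merge commit is created in either of them.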
The git-pull command can also be given "." as the "remote" repository,
in which case it just merges in a branch from the current repository; so
the commands
1669
1670-------------------------------------------------
1671$ git pull . branch
1672$ git merge branch
1673-------------------------------------------------
1674
1675are roughly equivalent. The former is actually very commonly used.
1676
[[submitting-patches]]
Submitting patches to a project
-------------------------------

If you just have a few changes, the simplest way to submit them may
just be to send them as patches in email.

First, use gitlink:git-format-patch[1]; for example,

-------------------------------------------------
$ git format-patch origin
-------------------------------------------------
1689
1690will produce a numbered series of files in the current directory, one
1691for each patch in the current branch but not in origin/HEAD.
1692
1693You can then import these into your mail client and send them by
1694hand. However, if you have a lot to send at once, you may prefer to
1695use the gitlink:git-send-email[1] script to automate the process.
1696Consult the mailing list for your project first to determine how they
1697prefer such patches be handled.
1698
[[importing-patches]]
Importing patches to a project
------------------------------

Git also provides a tool called gitlink:git-am[1] (am stands for
"apply mailbox"), for importing such an emailed series of patches.
Just save all of the patch-containing messages, in order, into a
single mailbox file, say "patches.mbox", then run

-------------------------------------------------
$ git am -3 patches.mbox
-------------------------------------------------
1711
Git will apply each patch in order; if any conflicts are found, it
will stop, and you can fix the conflicts as described in
"<<resolving-a-merge,Resolving a merge>>".  (The "-3" option tells
git to fall back on a three-way merge when a patch does not apply
cleanly; if you would prefer it just to abort and leave your tree and
index untouched, you may omit that option.)

Once the index is updated with the results of the conflict
resolution, instead of creating a new commit, just run
1720
1721-------------------------------------------------
1722$ git am --resolved
1723-------------------------------------------------
1724
1725and git will create the commit for you and continue applying the
1726remaining patches from the mailbox.
1727
1728The final result will be a series of commits, one for each patch in
1729the original mailbox, with authorship and commit log message each
1730taken from the message containing each patch.
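
The round trip from git-format-patch to git-am can be rehearsed
locally without any email involved.  In this sketch the two
repositories, the file names and both identities are hypothetical, and
"--stdout" is used so that the whole series lands in one mailbox file
rather than in one file per patch.

```shell
work=$(mktemp -d)

# The maintainer's repository.
git init -q "$work/project"
cd "$work/project"
git config user.name "Maintainer"          # hypothetical identity
git config user.email "maintainer@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first"

# A contributor clones it and makes two commits.
git clone -q "$work/project" "$work/contrib"
cd "$work/contrib"
git config user.name "Patch Author"        # hypothetical identity
git config user.email "author@example.com"
echo two >> file.txt
git commit -q -a -m "second"
echo three >> file.txt
git commit -q -a -m "third"

# Write every commit that is not in origin into a single mailbox.
git format-patch --stdout origin > "$work/patches.mbox"
contrib_tree=$(git rev-parse 'HEAD^{tree}')

# The maintainer applies the series.
cd "$work/project"
git am -q -3 "$work/patches.mbox"
applied=$(git rev-list --count HEAD)
author=$(git log -1 --format=%an)
project_tree=$(git rev-parse 'HEAD^{tree}')
```

The maintainer's branch now holds three commits, the newest one is
attributed to "Patch Author" rather than to the maintainer, and the
resulting tree matches the contributor's.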
1731
[[public-repositories]]
Public git repositories
-----------------------

Another way to submit changes to a project is to tell the maintainer of
that project to pull the changes from your repository using
gitlink:git-pull[1].  In the section "<<getting-updates-with-git-pull,
Getting updates with git pull>>" we described this as a way to get
updates from the "main" repository, but it works just as well in the
other direction.

If you and the maintainer both have accounts on the same machine, then
you can just pull changes from each other's repositories directly;
commands that accept repository URLs as arguments will also accept a
local directory name:
1746
1747-------------------------------------------------
1748$ git clone /path/to/repository
1749$ git pull /path/to/other/repository
1750-------------------------------------------------
1751
or an ssh url:
1753
1754-------------------------------------------------
1755$ git clone ssh://yourhost/~you/repository
1756-------------------------------------------------
1757
1758For projects with few developers, or for synchronizing a few private
1759repositories, this may be all you need.
1760
However, the more common way to do this is to maintain a separate public
repository (usually on a different host) for others to pull changes
from.  This is usually more convenient, and allows you to cleanly
separate private work in progress from publicly visible work.
1765
1766You will continue to do your day-to-day work in your personal
1767repository, but periodically "push" changes from your personal
1768repository into your public repository, allowing other developers to
1769pull from that repository. So the flow of changes, in a situation
1770where there is one other developer with a public repository, looks
1771like this:
1772
                      you push
  your personal repo ------------------> your public repo
        ^                                     |
        |                                     |
        | you pull                            | they pull
        |                                     |
        |                                     |
        |               they push             V
  their public repo <------------------- their repo

We explain how to do this in the following sections.
1784
[[setting-up-a-public-repository]]
Setting up a public repository
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Assume your personal repository is in the directory ~/proj.  We
first create a new clone of the repository and tell git-daemon that it
is meant to be public:

-------------------------------------------------
$ git clone --bare ~/proj proj.git
$ touch proj.git/git-daemon-export-ok
-------------------------------------------------

The resulting directory proj.git contains a "bare" git repository--it is
just the contents of the ".git" directory, without any files checked out
around it.

Next, copy proj.git to the server where you plan to host the
public repository.  You can use scp, rsync, or whatever is most
convenient.
1805
[[exporting-via-git]]
Exporting a git repository via the git protocol
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is the preferred method.

If someone else administers the server, they should tell you what
directory to put the repository in, and what git:// url it will appear
at.  You can then skip to the section
"<<pushing-changes-to-a-public-repository,Pushing changes to a public
repository>>", below.

Otherwise, all you need to do is start gitlink:git-daemon[1]; it will
listen on port 9418.  By default, it will allow access to any directory
that looks like a git directory and contains the magic file
git-daemon-export-ok.  Passing some directory paths as git-daemon
arguments will further restrict the exports to those paths.

You can also run git-daemon as an inetd service; see the
gitlink:git-daemon[1] man page for details.  (See especially the
examples section.)
1827
[[exporting-via-http]]
Exporting a git repository via http
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The git protocol gives better performance and reliability, but on a
host with a web server set up, http exports may be simpler to set up.

All you need to do is place the newly created bare git repository in
a directory that is exported by the web server, and make some
adjustments to give web clients some extra information they need:

-------------------------------------------------
$ mv proj.git /home/you/public_html/proj.git
$ cd /home/you/public_html/proj.git
$ git --bare update-server-info
$ chmod a+x hooks/post-update
-------------------------------------------------

(For an explanation of the last two lines, see
gitlink:git-update-server-info[1], and the documentation
link:hooks.html[Hooks used by git].)
1849
1850Advertise the url of proj.git. Anybody else should then be able to
1851clone or pull from that url, for example with a commandline like:
1852
1853-------------------------------------------------
1854$ git clone http://yourserver.com/~you/proj.git
1855-------------------------------------------------
1856
1857(See also
1858link:howto/setup-git-server-over-http.txt[setup-git-server-over-http]
1859for a slightly more sophisticated setup using WebDAV which also
1860allows pushing over http.)
1861
[[pushing-changes-to-a-public-repository]]
Pushing changes to a public repository
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Note that the two techniques outlined above (exporting via
<<exporting-via-http,http>> or <<exporting-via-git,git>>) allow other
maintainers to fetch your latest changes, but they do not allow write
access, which you will need to update the public repository with the
latest changes created in your private repository.
1871
1872The simplest way to do this is using gitlink:git-push[1] and ssh; to
1873update the remote branch named "master" with the latest state of your
1874branch named "master", run
1875
1876-------------------------------------------------
1877$ git push ssh://yourserver.com/~you/proj.git master:master
1878-------------------------------------------------
1879
1880or just
1881
1882-------------------------------------------------
1883$ git push ssh://yourserver.com/~you/proj.git master
1884-------------------------------------------------
1885
As with git-fetch, git-push will complain if this does not result in
a <<fast-forwards,fast forward>>.  Normally this is a sign of
something wrong.  However, if you are sure you know what you're
doing, you may force git-push to perform the update anyway by
preceding the branch name with a plus sign:
1891
1892-------------------------------------------------
1893$ git push ssh://yourserver.com/~you/proj.git +master
1894-------------------------------------------------
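
The difference between a plain push and a forced push can be observed
with a local bare repository standing in for the public one.  This is
a sketch under stated assumptions: the paths and identity are
hypothetical, a local directory is used instead of an ssh url, and the
non-fast-forward situation is produced artificially by amending a
commit that has already been pushed.

```shell
work=$(mktemp -d)

# Personal repository with one commit.
git init -q "$work/proj"
cd "$work/proj"
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first"
branch=$(git symbolic-ref --short HEAD)

# Bare "public" repository cloned from it.
git clone -q --bare "$work/proj" "$work/proj.git"

# An ordinary fast-forward push succeeds.
echo two > file.txt
git commit -q -a -m "second"
git push -q "$work/proj.git" "$branch"

# Rewrite the commit that was just pushed.
git commit -q --amend -m "second, reworded"

# A plain push is now refused, because it is not a fast forward.
if git push -q "$work/proj.git" "$branch" 2>/dev/null
then
	plain_push=accepted
else
	plain_push=rejected
fi

# A leading plus sign forces the update.
git push -q "$work/proj.git" "+$branch"
forced=$(git --git-dir="$work/proj.git" rev-parse "$branch")
```

Anyone who had already fetched the old commit will see their copy
diverge from the public branch after such a forced update, which is
why it should be used sparingly.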
1895
Note that the target of a "push" is normally a
<<def_bare_repository,bare>> repository.  You can also push to a
repository that has a checked-out working tree, but the working tree
will not be updated by the push.  This may lead to unexpected results if
the branch you push to is the currently checked-out branch!

As with git-fetch, you may also set up configuration options to
save typing; so, for example, after

-------------------------------------------------
$ cat >>.git/config <<EOF
[remote "public-repo"]
	url = ssh://yourserver.com/~you/proj.git
EOF
-------------------------------------------------

you should be able to perform the above push with just

-------------------------------------------------
$ git push public-repo master
-------------------------------------------------

See the explanations of the remote.<name>.url, branch.<name>.remote,
and remote.<name>.push options in gitlink:git-config[1] for
details.
1921
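As an alternative to editing .git/config by hand, the same remote can
be recorded with gitlink:git-remote[1].  The sketch below runs in a
scratch repository (the directory is hypothetical) and then reads the
setting back with git-config.  Note that "git remote add" also records
a default fetch refspec, which the hand-written stanza above does not.

```shell
# Scratch repository, so that no real configuration is modified.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Record the public repository under the name "public-repo".
git remote add public-repo "ssh://yourserver.com/~you/proj.git"

# Read the stored url back from the configuration.
url=$(git config --get remote.public-repo.url)
```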
[[setting-up-a-shared-repository]]
Setting up a shared repository
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Another way to collaborate is by using a model similar to that
commonly used in CVS, where several developers with special rights
all push to and pull from a single shared repository.  See
link:cvs-migration.html[git for CVS users] for instructions on how to
set this up.

However, while there is nothing wrong with git's support for shared
repositories, this mode of operation is not generally recommended,
simply because the mode of collaboration that git supports--by
exchanging patches and pulling from public repositories--has so many
advantages over the central shared repository:

	- Git's ability to quickly import and merge patches allows a
	  single maintainer to process incoming changes even at very
	  high rates.  And when that becomes too much, git-pull provides
	  an easy way for that maintainer to delegate this job to other
	  maintainers while still allowing optional review of incoming
	  changes.
	- Since every developer's repository has the same complete copy
	  of the project history, no repository is special, and it is
	  trivial for another developer to take over maintenance of a
	  project, either by mutual agreement, or because a maintainer
	  becomes unresponsive or difficult to work with.
	- The lack of a central group of "committers" means there is
	  less need for formal decisions about who is "in" and who is
	  "out".
1952
[[setting-up-gitweb]]
Allowing web browsing of a repository
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The gitweb cgi script provides users an easy way to browse your
project's files and history without having to install git; see the file
gitweb/INSTALL in the git source tree for instructions on setting it up.

[[sharing-development-examples]]
Examples
--------

[[maintaining-topic-branches]]
Maintaining topic branches for a Linux subsystem maintainer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1968
1969This describes how Tony Luck uses git in his role as maintainer of the
1970IA64 architecture for the Linux kernel.
1971
1972He uses two public branches:
1973
 - A "test" tree into which patches are initially placed so that they
   can get some exposure when integrated with other ongoing development.
   This tree is available to Andrew for pulling into -mm whenever he
   wants.

 - A "release" tree into which tested patches are moved for final sanity
   checking, and as a vehicle to send them upstream to Linus (by sending
   him a "please pull" request).
1982
1983He also uses a set of temporary branches ("topic branches"), each
1984containing a logical grouping of patches.
1985
1986To set this up, first create your work tree by cloning Linus's public
1987tree:
1988
1989-------------------------------------------------
1990$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git work
1991$ cd work
1992-------------------------------------------------
1993
Linus's tree will be stored in the remote branch named origin/master,
and can be updated using gitlink:git-fetch[1]; you can track other
public trees using gitlink:git-remote[1] to set up a "remote" and
gitlink:git-fetch[1] to keep them up-to-date; see
<<repositories-and-branches>>.
1998
1999Now create the branches in which you are going to work; these start out
2000at the current tip of origin/master branch, and should be set up (using
2001the --track option to gitlink:git-branch[1]) to merge changes in from
2002Linus by default.
2003
2004-------------------------------------------------
2005$ git branch --track test origin/master
2006$ git branch --track release origin/master
2007-------------------------------------------------
2008
These can be easily kept up to date using gitlink:git-pull[1]:
2010
2011-------------------------------------------------
2012$ git checkout test && git pull
2013$ git checkout release && git pull
2014-------------------------------------------------
2015
2016Important note! If you have any local changes in these branches, then
2017this merge will create a commit object in the history (with no local
2018changes git will simply do a "Fast forward" merge). Many people dislike
2019the "noise" that this creates in the Linux history, so you should avoid
2020doing this capriciously in the "release" branch, as these noisy commits
2021will become part of the permanent history when you ask Linus to pull
2022from the release branch.
2023
2024A few configuration variables (see gitlink:git-config[1]) can
2025make it easy to push both branches to your public tree. (See
2026<<setting-up-a-public-repository>>.)
2027
2028-------------------------------------------------
2029$ cat >> .git/config <<EOF
2030[remote "mytree"]
2031 url = master.kernel.org:/pub/scm/linux/kernel/git/aegl/linux-2.6.git
2032 push = release
2033 push = test
2034EOF
2035-------------------------------------------------
2036
2037Then you can push both the test and release trees using
2038gitlink:git-push[1]:
2039
2040-------------------------------------------------
2041$ git push mytree
2042-------------------------------------------------
2043
2044or push just one of the test and release branches using:
2045
2046-------------------------------------------------
2047$ git push mytree test
2048-------------------------------------------------
2049
2050or
2051
2052-------------------------------------------------
2053$ git push mytree release
2054-------------------------------------------------
2055
2056Now to apply some patches from the community. Think of a short
2057snappy name for a branch to hold this patch (or related group of
2058patches), and create a new branch from the current tip of Linus's
2059branch:
2060
2061-------------------------------------------------
2062$ git checkout -b speed-up-spinlocks origin
2063-------------------------------------------------
2064
2065Now you apply the patch(es), run some tests, and commit the change(s). If
2066the patch is a multi-part series, then you should apply each as a separate
2067commit to this branch.
2068
2069-------------------------------------------------
2070$ ... patch ... test ... commit [ ... patch ... test ... commit ]*
2071-------------------------------------------------
2072
2073When you are happy with the state of this change, you can pull it into the
2074"test" branch in preparation to make it public:
2075
2076-------------------------------------------------
2077$ git checkout test && git pull . speed-up-spinlocks
2078-------------------------------------------------
2079
2080It is unlikely that you would have any conflicts here ... but you might if you
2081spent a while on this step and had also pulled new versions from upstream.
2082
Later, once enough time has passed and enough testing has been done, you
can pull the same branch into the "release" tree ready to go upstream.
This is where you see the value of keeping each patch (or patch series)
in its own branch.  It means that the patches can be moved into the
"release" tree in any order.
2087
2088-------------------------------------------------
2089$ git checkout release && git pull . speed-up-spinlocks
2090-------------------------------------------------
2091
2092After a while, you will have a number of branches, and despite the
2093well chosen names you picked for each of them, you may forget what
2094they are for, or what status they are in. To get a reminder of what
2095changes are in a specific branch, use:
2096
2097-------------------------------------------------
$ git log origin..branchname | git shortlog
2099-------------------------------------------------
2100
2101To see whether it has already been merged into the test or release branches
2102use:
2103
2104-------------------------------------------------
2105$ git log test..branchname
2106-------------------------------------------------
2107
2108or
2109
2110-------------------------------------------------
2111$ git log release..branchname
2112-------------------------------------------------
2113
2114(If this branch has not yet been merged you will see some log entries.
2115If it has been merged, then there will be no output.)
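
The "empty output means merged" rule can be checked in a scratch
repository.  In this sketch the repository, file and identity are
hypothetical; it counts commits with "git rev-list --count" (an option
of recent git versions) rather than inspecting the output of git-log,
but the a..b range has the same meaning in both commands.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"

echo base > file.txt
git add file.txt
git commit -q -m "base"

# A "test" branch and a topic branch, both starting from the same commit.
git branch test
git checkout -q -b speed-up-spinlocks
echo change >> file.txt
git commit -q -a -m "speed up spinlocks"

# Commits in the topic branch that "test" does not have yet.
before=$(git rev-list --count test..speed-up-spinlocks)

# Merge the topic branch into "test" and look again.
git checkout -q test
git merge -q speed-up-spinlocks
after=$(git rev-list --count test..speed-up-spinlocks)
```

Here "$before" is 1 and "$after" is 0: once the topic branch has been
merged, the range test..speed-up-spinlocks is empty.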
2116
2117Once a patch completes the great cycle (moving from test to release,
2118then pulled by Linus, and finally coming back into your local
2119"origin/master" branch) the branch for this change is no longer needed.
2120You detect this when the output from:
2121
2122-------------------------------------------------
2123$ git log origin..branchname
2124-------------------------------------------------
2125
2126is empty. At this point the branch can be deleted:
2127
2128-------------------------------------------------
2129$ git branch -d branchname
2130-------------------------------------------------
2131
2132Some changes are so trivial that it is not necessary to create a separate
2133branch and then merge into each of the test and release branches. For
2134these changes, just apply directly to the "release" branch, and then
2135merge that into the "test" branch.
2136
2137To create diffstat and shortlog summaries of changes to include in a "please
2138pull" request to Linus you can use:
2139
2140-------------------------------------------------
2141$ git diff --stat origin..release
2142-------------------------------------------------
2143
2144and
2145
2146-------------------------------------------------
2147$ git log -p origin..release | git shortlog
2148-------------------------------------------------
2149
2150Here are some of the scripts that simplify all this even further.
2151
-------------------------------------------------
==== update script ====
# Update a branch in my GIT tree.  If the branch to be updated
# is origin, then pull from kernel.org.  Otherwise merge
# origin/master branch into test|release branch

case "$1" in
test|release)
	git checkout "$1" && git pull . origin
	;;
origin)
	# Ask git for the ref value instead of reading the file under
	# .git/refs, so that this also works when the ref is packed.
	before=$(git rev-parse refs/remotes/origin/master)
	git fetch origin
	after=$(git rev-parse refs/remotes/origin/master)
	if [ "$before" != "$after" ]
	then
		git log "$before..$after" | git shortlog
	fi
	;;
*)
	echo "Usage: $0 origin|test|release" 1>&2
	exit 1
	;;
esac
-------------------------------------------------
2177
-------------------------------------------------
==== merge script ====
# Merge a branch into either the test or release branch

pname=$0

usage()
{
	echo "Usage: $pname branch test|release" 1>&2
	exit 1
}

# Ask git whether the branch exists; unlike a test for a file under
# .git/refs/heads, this also works when the ref is packed.
if ! git rev-parse --verify "refs/heads/$1" >/dev/null 2>&1
then
	echo "Can't see branch <$1>" 1>&2
	usage
fi

case "$2" in
test|release)
	if [ $(git log "$2..$1" | wc -c) -eq 0 ]
	then
		echo "$1 already merged into $2" 1>&2
		exit 1
	fi
	git checkout "$2" && git pull . "$1"
	;;
*)
	usage
	;;
esac
-------------------------------------------------
2210
-------------------------------------------------
==== status script ====
# report on status of my ia64 GIT tree

gb=$(tput setab 2)
rb=$(tput setab 1)
restore=$(tput setab 9)

if [ `git rev-list test..release | wc -c` -gt 0 ]
then
	echo $rb Warning: commits in release that are not in test $restore
	git log test..release
fi

for branch in `ls .git/refs/heads`
do
	if [ $branch = test -o $branch = release ]
	then
		continue
	fi

	echo -n $gb ======= $branch ====== $restore " "
	status=
	for ref in test release origin/master
	do
		if [ `git rev-list $ref..$branch | wc -c` -gt 0 ]
		then
			# Append the first letter of each ref that does not
			# yet contain this branch: t, r, or o (origin/master).
			status=$status${ref:0:1}
		fi
	done
	case $status in
	tro)
		echo $rb Need to pull into test $restore
		;;
	ro)
		echo "In test"
		;;
	o)
		echo "Waiting for linus"
		;;
	"")
		echo $rb All done $restore
		;;
	*)
		echo $rb "<$status>" $restore
		;;
	esac
	git log origin/master..$branch | git shortlog
done
-------------------------------------------------
d19fbc3c 2261
d19fbc3c 2262
[[cleaning-up-history]]
Rewriting history and maintaining patch series
==============================================
2266
2267Normally commits are only added to a project, never taken away or
2268replaced. Git is designed with this assumption, and violating it will
2269cause git's merge machinery (for example) to do the wrong thing.
2270
2271However, there is a situation in which it can be useful to violate this
2272assumption.
2273
[[patch-series]]
Creating the perfect patch series
---------------------------------

Suppose you are a contributor to a large project, and you want to add a
complicated feature, and to present it to the other developers in a way
that makes it easy for them to read your changes, verify that they are
correct, and understand why you made each change.

If you present all of your changes as a single patch (or commit), they
may find that it is too much to digest all at once.
2285
2286If you present them with the entire history of your work, complete with
2287mistakes, corrections, and dead ends, they may be overwhelmed.
2288
2289So the ideal is usually to produce a series of patches such that:
2290
	1. Each patch can be applied in order.

	2. Each patch includes a single logical change, together with a
	   message explaining the change.

	3. No patch introduces a regression: after applying any initial
	   part of the series, the resulting project still compiles and
	   works, and has no bugs that it didn't have before.

	4. The complete series produces the same end result as your own
	   (probably much messier!) development process did.

We will introduce some tools that can help you do this, explain how to
use them, and then explain some of the problems that can arise because
you are rewriting history.
4c63ff45 2306
[[using-git-rebase]]
Keeping a patch series up to date using git-rebase
--------------------------------------------------

Suppose that you create a branch "mywork" on a remote-tracking branch
"origin", and create some commits on top of it:
2313
2314-------------------------------------------------
2315$ git checkout -b mywork origin
2316$ vi file.txt
2317$ git commit
2318$ vi otherfile.txt
2319$ git commit
2320...
2321-------------------------------------------------
2322
2323You have performed no merges into mywork, so it is just a simple linear
2324sequence of patches on top of "origin":
2325
1dc71a91 2326................................................
4c63ff45
BF
 o--o--o <-- origin
        \
         o--o--o <-- mywork
1dc71a91 2330................................................
4c63ff45
BF
2331
2332Some more interesting work has been done in the upstream project, and
2333"origin" has advanced:
2334
1dc71a91 2335................................................
4c63ff45
BF
 o--o--O--o--o--o <-- origin
        \
         a--b--c <-- mywork
1dc71a91 2339................................................
4c63ff45
BF
2340
2341At this point, you could use "pull" to merge your changes back in;
2342the result would create a new merge commit, like this:
2343
1dc71a91 2344................................................
4c63ff45
BF
 o--o--O--o--o--o <-- origin
        \        \
         a--b--c--m <-- mywork
1dc71a91 2348................................................
4c63ff45
BF
2349
2350However, if you prefer to keep the history in mywork a simple series of
2351commits without any merges, you may instead choose to use
2352gitlink:git-rebase[1]:
2353
2354-------------------------------------------------
2355$ git checkout mywork
2356$ git rebase origin
2357-------------------------------------------------
2358
b181d57f
BF
2359This will remove each of your commits from mywork, temporarily saving
2360them as patches (in a directory named ".dotest"), update mywork to
2361point at the latest version of origin, then apply each of the saved
2362patches to the new mywork. The result will look like:
4c63ff45
BF
2363
2364
1dc71a91 2365................................................
4c63ff45
BF
 o--o--O--o--o--o <-- origin
                 \
                  a'--b'--c' <-- mywork
1dc71a91 2369................................................
4c63ff45 2370
b181d57f
BF
2371In the process, it may discover conflicts. In that case it will stop
2372and allow you to fix the conflicts; after fixing conflicts, use "git
2373add" to update the index with those contents, and then, instead of
2374running git-commit, just run
4c63ff45
BF
2375
2376-------------------------------------------------
2377$ git rebase --continue
2378-------------------------------------------------
2379
2380and git will continue applying the rest of the patches.
2381
2382At any point you may use the --abort option to abort this process and
2383return mywork to the state it had before you started the rebase:
2384
2385-------------------------------------------------
2386$ git rebase --abort
2387-------------------------------------------------
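
The rebase described above can be tried safely in a throwaway repository. The following is a minimal sketch, assuming a reasonably recent git; the local branch "upstream" is a hypothetical stand-in for the remote-tracking branch "origin", and all file names are made up for illustration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/upstream   # name the initial branch explicitly
git config user.name "Example User"
git config user.email "user@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "base"

# Two commits on mywork, forked from upstream.
git checkout -q -b mywork
echo a > a.txt; git add a.txt; git commit -q -m "a"
echo b > b.txt; git add b.txt; git commit -q -m "b"

# Meanwhile upstream advances.
git checkout -q upstream
echo more > upstream.txt; git add upstream.txt; git commit -q -m "upstream work"

# Replay the mywork commits on top of the new upstream tip.
git checkout -q mywork
git rebase -q upstream

fork_point=$(git merge-base upstream mywork)
upstream_tip=$(git rev-parse upstream)
merge_count=$(git rev-list --merges upstream..mywork | wc -l | tr -d ' ')
```

After the rebase, mywork forks from the tip of upstream and contains no merge commits, which matches the last diagram above.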
2388
e34caace 2389[[modifying-one-commit]]
365aa199
BF
2390Modifying a single commit
2391-------------------------
2392
2393We saw in <<fixing-a-mistake-by-editing-history>> that you can replace the
2394most recent commit using
2395
2396-------------------------------------------------
2397$ git commit --amend
2398-------------------------------------------------
2399
2400which will replace the old commit by a new commit incorporating your
2401changes, giving you a chance to edit the old commit message first.
2402
2403You can also use a combination of this and gitlink:git-rebase[1] to edit
2404commits further back in your history. First, tag the problematic commit with
2405
2406-------------------------------------------------
2407$ git tag bad mywork~5
2408-------------------------------------------------
2409
2410(Either gitk or git-log may be useful for finding the commit.)
2411
25d9f3fa
BF
2412Then check out that commit, edit it, and rebase the rest of the series
2413on top of it (note that we could check out the commit on a temporary
2414branch, but instead we're using a <<detached-head,detached head>>):
365aa199
BF
2415
2416-------------------------------------------------
25d9f3fa 2417$ git checkout bad
365aa199
BF
2418$ # make changes here and update the index
2419$ git commit --amend
25d9f3fa 2420$ git rebase --onto HEAD bad mywork
365aa199
BF
2421-------------------------------------------------
2422
25d9f3fa
BF
2423When you're done, you'll be left with mywork checked out, with the top
2424patches on mywork reapplied on top of your modified commit. You can
365aa199
BF
2425then clean up with
2426
2427-------------------------------------------------
365aa199
BF
2428$ git tag -d bad
2429-------------------------------------------------
2430
2431Note that the immutable nature of git history means that you haven't really
2432"modified" existing commits; instead, you have replaced the old commits with
2433new commits having new object names.
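
As a rough illustration of the sequence above, the sketch below edits the middle commit of a three-commit series in a scratch repository. The file names and commit messages are hypothetical, and `-m` is passed to `git commit --amend` only so that no editor is started.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/mywork
git config user.name "Example User"
git config user.email "user@example.com"

# Three commits, each touching its own file so the replay cannot conflict.
for n in 1 2 3; do
    echo "draft $n" > "file$n.txt"
    git add "file$n.txt"
    git commit -q -m "commit $n"
done

git tag bad mywork~1                  # the commit we want to change ("commit 2")
git checkout -q bad                   # detached HEAD at the tagged commit
echo "fixed 2" > file2.txt
git add file2.txt
git commit -q --amend -m "commit 2 (fixed)"
git rebase -q --onto HEAD bad mywork  # replay "commit 3" onto the amended commit
git tag -d bad > /dev/null

current_branch=$(git rev-parse --abbrev-ref HEAD)
fixed_contents=$(git show mywork~1:file2.txt)
subjects=$(git log --format=%s | tr '\n' '|')
```

The branch mywork ends up checked out, with the amended commit in the middle of the series and the later commit reapplied on top of it.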
2434
e34caace 2435[[reordering-patch-series]]
4c63ff45
BF
2436Reordering or selecting from a patch series
2437-------------------------------------------
2438
b181d57f
BF
2439Given one existing commit, the gitlink:git-cherry-pick[1] command
2440allows you to apply the change introduced by that commit and create a
2441new commit that records it. So, for example, if "mywork" points to a
2442series of patches on top of "origin", you might do something like:
2443
2444-------------------------------------------------
2445$ git checkout -b mywork-new origin
2446$ gitk origin..mywork &
2447-------------------------------------------------
2448
2449And browse through the list of patches in the mywork branch using gitk,
2450applying them (possibly in a different order) to mywork-new using
2451cherry-pick, and possibly modifying them as you go using commit
2452--amend.
2453
2454Another technique is to use git-format-patch to create a series of
2455patches, then reset the state to before the patches:
4c63ff45 2456
b181d57f
BF
2457-------------------------------------------------
2458$ git format-patch origin
2459$ git reset --hard origin
2460-------------------------------------------------
4c63ff45 2461
b181d57f
BF
2462Then modify, reorder, or eliminate patches as preferred before applying
2463them again with gitlink:git-am[1].
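
The format-patch technique can be sketched as follows, assuming a scratch repository in which the tag "base" (a hypothetical name) plays the role of "origin". The two patches touch different files, so they can be applied in the opposite order without conflicts.

```shell
set -e
repo=$(mktemp -d)
patches=$(mktemp -d)                  # keep the patch files outside the work tree
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

echo base > base.txt; git add base.txt; git commit -q -m "base"
git tag base
echo 1 > one.txt; git add one.txt; git commit -q -m "first"
echo 2 > two.txt; git add two.txt; git commit -q -m "second"

git format-patch -q -o "$patches" base   # writes 0001-first.patch, 0002-second.patch
git reset -q --hard base                 # drop both commits from the branch
git am -q "$patches"/0002-*.patch "$patches"/0001-*.patch   # reapply, reordered

reordered=$(git log --format=%s base..HEAD | tr '\n' '|')
```

Since git-log lists the newest commit first, "first" now appears ahead of "second" in the output.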
4c63ff45 2464
e34caace 2465[[patch-series-tools]]
4c63ff45
BF
2466Other tools
2467-----------
2468
b181d57f 2469There are numerous other tools, such as stgit, which exist for the
79c96c57 2470purpose of maintaining a patch series. These are outside of the scope of
b181d57f 2471this manual.
4c63ff45 2472
e34caace 2473[[problems-with-rewriting-history]]
4c63ff45
BF
2474Problems with rewriting history
2475-------------------------------
2476
b181d57f
BF
2477The primary problem with rewriting the history of a branch has to do
2478with merging. Suppose somebody fetches your branch and merges it into
2479their branch, with a result something like this:
2480
1dc71a91 2481................................................
b181d57f
BF
 o--o--O--o--o--o <-- origin
        \        \
         t--t--t--m <-- their branch
1dc71a91 2485................................................
b181d57f
BF
2486
2487Then suppose you modify the last three commits:
2488
1dc71a91 2489................................................
b181d57f
BF
                 o--o--o <-- new head of origin
                /
 o--o--O--o--o--o <-- old head of origin
1dc71a91 2493................................................
b181d57f
BF
2494
If we examine all this history together in one repository, it will
look like:
2497
1dc71a91 2498................................................
b181d57f
BF
                 o--o--o <-- new head of origin
                /
 o--o--O--o--o--o <-- old head of origin
        \        \
         t--t--t--m <-- their branch
1dc71a91 2504................................................
b181d57f
BF
2505
Git has no way of knowing that the new head is an updated version of
the old head; it treats this situation exactly the same as it would if
two developers had independently done the work on the old and new heads
in parallel. At this point, if someone attempts to merge the new head
into their branch, git will attempt to merge together the two (old and
new) lines of development, instead of trying to replace the old by the
new. The results are likely to be unexpected.
2513
2514You may still choose to publish branches whose history is rewritten,
2515and it may be useful for others to be able to fetch those branches in
2516order to examine or test them, but they should not attempt to pull such
2517branches into their own work.
2518
2519For true distributed development that supports proper merging,
2520published branches should never be rewritten.
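
The underlying reason can be seen in a small experiment. This is only a sketch in a scratch repository with made-up names: after a commit is rewritten with `git commit --amend`, the new commit is not a descendant of the old one, so git sees two parallel lines of development that share only the earlier history.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

echo base > file.txt; git add file.txt; git commit -q -m "base"
base=$(git rev-parse HEAD)

echo work >> file.txt; git add file.txt; git commit -q -m "published"
old_head=$(git rev-parse HEAD)

git commit -q --amend -m "published, reworded"   # rewrite the tip commit
new_head=$(git rev-parse HEAD)

# Exit status 0 would mean the old head is an ancestor of the new head.
if git merge-base --is-ancestor "$old_head" "$new_head"; then
    relation=descendant
else
    relation=diverged
fi
shared=$(git merge-base "$old_head" "$new_head")
```

Here `relation` ends up as "diverged", and the only history the two heads share is the "base" commit.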
2521
e34caace 2522[[advanced-branch-management]]
b181d57f
BF
2523Advanced branch management
2524==========================
4c63ff45 2525
e34caace 2526[[fetching-individual-branches]]
b181d57f
BF
2527Fetching individual branches
2528----------------------------
2529
2530Instead of using gitlink:git-remote[1], you can also choose just
2531to update one branch at a time, and to store it locally under an
2532arbitrary name:
2533
2534-------------------------------------------------
2535$ git fetch origin todo:my-todo-work
2536-------------------------------------------------
2537
2538The first argument, "origin", just tells git to fetch from the
2539repository you originally cloned from. The second argument tells git
2540to fetch the branch named "todo" from the remote repository, and to
2541store it locally under the name refs/heads/my-todo-work.
2542
2543You can also fetch branches from other repositories; so
2544
2545-------------------------------------------------
2546$ git fetch git://example.com/proj.git master:example-master
2547-------------------------------------------------
2548
2549will create a new branch named "example-master" and store in it the
2550branch named "master" from the repository at the given URL. If you
2551already have a branch named example-master, it will attempt to
59723040
BF
2552<<fast-forwards,fast-forward>> to the commit given by example.com's
2553master branch. In more detail:
b181d57f 2554
59723040
BF
2555[[fetch-fast-forwards]]
2556git fetch and fast-forwards
2557---------------------------
b181d57f
BF
2558
2559In the previous example, when updating an existing branch, "git
2560fetch" checks to make sure that the most recent commit on the remote
2561branch is a descendant of the most recent commit on your copy of the
2562branch before updating your copy of the branch to point at the new
59723040 2563commit. Git calls this process a <<fast-forwards,fast forward>>.
b181d57f
BF
2564
2565A fast forward looks something like this:
2566
1dc71a91 2567................................................
b181d57f
BF
 o--o--o--o <-- old head of the branch
           \
            o--o--o <-- new head of the branch
1dc71a91 2571................................................
b181d57f
BF
2572
2573
2574In some cases it is possible that the new head will *not* actually be
2575a descendant of the old head. For example, the developer may have
2576realized she made a serious mistake, and decided to backtrack,
2577resulting in a situation like:
2578
1dc71a91 2579................................................
b181d57f
BF
 o--o--o--o--a--b <-- old head of the branch
           \
            o--o--o <-- new head of the branch
1dc71a91 2583................................................
b181d57f
BF
2584
2585In this case, "git fetch" will fail, and print out a warning.
2586
2587In that case, you can still force git to update to the new head, as
2588described in the following section. However, note that in the
2589situation above this may mean losing the commits labeled "a" and "b",
2590unless you've already created a reference of your own pointing to
2591them.
2592
e34caace 2593[[forcing-fetch]]
b181d57f
BF
2594Forcing git fetch to do non-fast-forward updates
2595------------------------------------------------
2596
2597If git fetch fails because the new head of a branch is not a
2598descendant of the old head, you may force the update with:
2599
2600-------------------------------------------------
2601$ git fetch git://example.com/proj.git +master:refs/remotes/example/master
2602-------------------------------------------------
2603
c64415e2
BF
2604Note the addition of the "+" sign. Alternatively, you can use the "-f"
2605flag to force updates of all the fetched branches, as in:
2606
2607-------------------------------------------------
2608$ git fetch -f origin
2609-------------------------------------------------
2610
2611Be aware that commits that the old version of example/master pointed at
2612may be lost, as we saw in the previous section.
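
The difference between a plain and a forced fetch can be reproduced with two local repositories. The sketch below is an illustration under stated assumptions: it uses a recent git (for the `-C` option), temporary directories in place of a real URL, and empty commits so that no files are needed.

```shell
set -e
work=$(mktemp -d)
up="$work/upstream"
down="$work/downstream"

git init -q "$up"
git -C "$up" symbolic-ref HEAD refs/heads/master
git -C "$up" config user.name "Example User"
git -C "$up" config user.email "user@example.com"
git -C "$up" commit -q --allow-empty -m "one"
git -C "$up" commit -q --allow-empty -m "two"

git init -q "$down"
git -C "$down" fetch -q "$up" master:example-master   # creates the local branch

# The upstream developer backtracks and redoes the last commit.
git -C "$up" reset -q --hard HEAD~1
git -C "$up" commit -q --allow-empty -m "two, redone"

# A plain fetch refuses the update because it is not a fast forward.
if git -C "$down" fetch -q "$up" master:example-master 2>/dev/null; then
    plain=accepted
else
    plain=rejected
fi

# The leading "+" forces the update.
git -C "$down" fetch -q "$up" +master:example-master
forced_tip=$(git -C "$down" rev-parse example-master)
upstream_tip=$(git -C "$up" rev-parse master)
```

After the forced fetch, the original "two" commit is no longer reachable from example-master in the downstream repository.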
b181d57f 2613
e34caace 2614[[remote-branch-configuration]]
b181d57f
BF
2615Configuring remote branches
2616---------------------------
2617
2618We saw above that "origin" is just a shortcut to refer to the
79c96c57 2619repository that you originally cloned from. This information is
b181d57f 2620stored in git configuration variables, which you can see using
9d13bda3 2621gitlink:git-config[1]:
b181d57f
BF
2622
2623-------------------------------------------------
9d13bda3 2624$ git config -l
b181d57f
BF
2625core.repositoryformatversion=0
2626core.filemode=true
2627core.logallrefupdates=true
2628remote.origin.url=git://git.kernel.org/pub/scm/git/git.git
2629remote.origin.fetch=+refs/heads/*:refs/remotes/origin/*
2630branch.master.remote=origin
2631branch.master.merge=refs/heads/master
2632-------------------------------------------------
2633
2634If there are other repositories that you also use frequently, you can
2635create similar configuration options to save typing; for example,
2636after
2637
2638-------------------------------------------------
9d13bda3 2639$ git config remote.example.url git://example.com/proj.git
b181d57f
BF
2640-------------------------------------------------
2641
2642then the following two commands will do the same thing:
2643
2644-------------------------------------------------
2645$ git fetch git://example.com/proj.git master:refs/remotes/example/master
2646$ git fetch example master:refs/remotes/example/master
2647-------------------------------------------------
2648
2649Even better, if you add one more option:
2650
2651-------------------------------------------------
9d13bda3 2652$ git config remote.example.fetch master:refs/remotes/example/master
b181d57f
BF
2653-------------------------------------------------
2654
2655then the following commands will all do the same thing:
2656
2657-------------------------------------------------
52c80037
BF
2658$ git fetch git://example.com/proj.git master:refs/remotes/example/master
2659$ git fetch example master:refs/remotes/example/master
b181d57f
BF
2660$ git fetch example
2661-------------------------------------------------
2662
2663You can also add a "+" to force the update each time:
2664
2665-------------------------------------------------
$ git config remote.example.fetch +master:refs/remotes/example/master
b181d57f
BF
2667-------------------------------------------------
2668
Don't do this unless you're sure you won't mind "git fetch" possibly
throwing away commits on example/master.
2671
2672Also note that all of the above configuration can be performed by
2673directly editing the file .git/config instead of using
9d13bda3 2674gitlink:git-config[1].
b181d57f 2675
9d13bda3 2676See gitlink:git-config[1] for more details on the configuration
b181d57f 2677options mentioned above.
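
To see what these settings look like once stored, one can set them in a scratch repository and read them back. This is a small sketch that performs no network access; the URL is the same placeholder used above.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

git config remote.example.url git://example.com/proj.git
git config remote.example.fetch master:refs/remotes/example/master

url=$(git config --get remote.example.url)
refspec=$(git config --get remote.example.fetch)

# List every variable in the remote.example section.
git config --get-regexp '^remote\.example\.'
```

The same two values appear in .git/config as `url` and `fetch` entries under a `[remote "example"]` section, which is what you would write when editing the file by hand.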
d19fbc3c 2678
d19fbc3c 2679
35121930 2680[[git-internals]]
d19fbc3c
BF
2681Git internals
2682=============
2683
a536b08b
BF
2684Git depends on two fundamental abstractions: the "object database", and
2685the "current directory cache" aka "index".
b181d57f 2686
e34caace 2687[[the-object-database]]
b181d57f
BF
2688The Object Database
2689-------------------
2690
2691The object database is literally just a content-addressable collection
2692of objects. All objects are named by their content, which is
2693approximated by the SHA1 hash of the object itself. Objects may refer
2694to other objects (by referencing their SHA1 hash), and so you can
2695build up a hierarchy of objects.
2696
Every object has a "type", fixed when the object is created, which
identifies the format of the object (i.e. how it is used, and how it can
refer to other objects). There are currently four different object
types: "blob", "tree", "commit", and "tag".
b181d57f 2702
a536b08b
BF
2703A <<def_blob_object,"blob" object>> cannot refer to any other object,
2704and is, as the name implies, a pure storage object containing some
2705user data. It is used to actually store the file data, i.e. a blob
2706object is associated with some particular version of some file.
b181d57f 2707
a536b08b
BF
2708A <<def_tree_object,"tree" object>> is an object that ties one or more
2709"blob" objects into a directory structure. In addition, a tree object
2710can refer to other tree objects, thus creating a directory hierarchy.
b181d57f 2711
a536b08b
BF
A <<def_commit_object,"commit" object>> ties such directory hierarchies
together into a <<def_DAG,directed acyclic graph>> of revisions - each
"commit" is associated with exactly one tree (the directory hierarchy at
the time of the commit). In addition, a "commit" refers to zero or more
"parent" commit objects that describe the history of how we arrived at
that directory hierarchy.
b181d57f
BF
2718
2719As a special case, a commit object with no parents is called the "root"
c64415e2 2720commit, and is the point of an initial project commit. Each project
b181d57f
BF
2721must have at least one root, and while you can tie several different
2722root objects together into one project by creating a commit object which
2723has two or more separate roots as its ultimate parents, that's probably
2724just going to confuse people. So aim for the notion of "one root object
2725per project", even if git itself does not enforce that.
2726
a536b08b
BF
2727A <<def_tag_object,"tag" object>> symbolically identifies and can be
2728used to sign other objects. It contains the identifier and type of
2729another object, a symbolic name (of course!) and, optionally, a
2730signature.
b181d57f
BF
2731
2732Regardless of object type, all objects share the following
2733characteristics: they are all deflated with zlib, and have a header
2734that not only specifies their type, but also provides size information
2735about the data in the object. It's worth noting that the SHA1 hash
2736that is used to name the object is the hash of the original data
2737plus this header, so `sha1sum` 'file' does not match the object name
2738for 'file'.
2739(Historical note: in the dawn of the age of git the hash
2740was the sha1 of the 'compressed' object.)
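
The effect of the header on the object name can be checked without git at all. The sketch below assumes the `sha1sum` utility from GNU coreutils and uses a six-byte file containing "hello" and a newline; the header for a blob is the word "blob", a space, the decimal size, and a NUL byte.

```shell
content='hello'

# Size in bytes of the data, including the trailing newline.
size=$(printf '%s\n' "$content" | wc -c | tr -d ' ')

# SHA1 of the raw file data alone.
plain_sha1=$(printf '%s\n' "$content" | sha1sum | cut -d' ' -f1)

# SHA1 of "blob <size>\0" followed by the data: this is the object name.
object_name=$(printf 'blob %s\000%s\n' "$size" "$content" | sha1sum | cut -d' ' -f1)

echo "plain:  $plain_sha1"
echo "object: $object_name"
```

The second value is the name git gives to a blob with these contents (it should agree with `git hash-object` run on the same data), while the first is what `sha1sum` reports for the file itself.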
2741
2742As a result, the general consistency of an object can always be tested
2743independently of the contents or the type of the object: all objects can
2744be validated by verifying that (a) their hashes match the content of the
2745file and (b) the object successfully inflates to a stream of bytes that
4c7100a9
JH
2746forms a sequence of <ascii type without space> {plus} <space> {plus} <ascii decimal
2747size> {plus} <byte\0> {plus} <binary object data>.
b181d57f
BF
2748
2749The structured objects can further have their structure and
2750connectivity to other objects verified. This is generally done with
04e50e94 2751the `git-fsck` program, which generates a full dependency graph
b181d57f
BF
2752of all objects, and verifies their internal consistency (in addition
2753to just verifying their superficial consistency through the hash).
2754
2755The object types in some more detail:
2756
e34caace 2757[[blob-object]]
b181d57f
BF
2758Blob Object
2759-----------
2760
2761A "blob" object is nothing but a binary blob of data, and doesn't
2762refer to anything else. There is no signature or any other
2763verification of the data, so while the object is consistent (it 'is'
2764indexed by its sha1 hash, so the data itself is certainly correct), it
2765has absolutely no other attributes. No name associations, no
2766permissions. It is purely a blob of data (i.e. normally "file
2767contents").
2768
2769In particular, since the blob is entirely defined by its data, if two
2770files in a directory tree (or in multiple different versions of the
2771repository) have the same contents, they will share the same blob
2772object. The object is totally independent of its location in the
2773directory tree, and renaming a file does not change the object that
2774file is associated with in any way.
2775
2776A blob is typically created when gitlink:git-update-index[1]
2777is run, and its data can be accessed by gitlink:git-cat-file[1].
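
As a quick check of the claim that a blob depends only on its data, the sketch below hashes two differently named files with identical contents. It uses `git hash-object`, which computes an object name without storing anything; the file names are hypothetical.

```shell
set -e
dir=$(mktemp -d)
printf 'same contents\n' > "$dir/a.txt"
printf 'same contents\n' > "$dir/renamed-copy.txt"
printf 'other contents\n' > "$dir/c.txt"

blob_a=$(git hash-object "$dir/a.txt")
blob_b=$(git hash-object "$dir/renamed-copy.txt")
blob_c=$(git hash-object "$dir/c.txt")
```

The first two names are equal even though the file names differ; only the third file, whose data differs, gets a different blob name.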
2778
e34caace 2779[[tree-object]]
b181d57f
BF
2780Tree Object
2781-----------
2782
2783The next hierarchical object type is the "tree" object. A tree object
2784is a list of mode/name/blob data, sorted by name. Alternatively, the
2785mode data may specify a directory mode, in which case instead of
2786naming a blob, that name is associated with another TREE object.
2787
2788Like the "blob" object, a tree object is uniquely determined by the
2789set contents, and so two separate but identical trees will always
2790share the exact same object. This is true at all levels, i.e. it's
2791true for a "leaf" tree (which does not refer to any other trees, only
2792blobs) as well as for a whole subdirectory.
2793
2794For that reason a "tree" object is just a pure data abstraction: it
2795has no history, no signatures, no verification of validity, except
2796that since the contents are again protected by the hash itself, we can
2797trust that the tree is immutable and its contents never change.
2798
2799So you can trust the contents of a tree to be valid, the same way you
2800can trust the contents of a blob, but you don't know where those
2801contents 'came' from.
2802
2803Side note on trees: since a "tree" object is a sorted list of
2804"filename+content", you can create a diff between two trees without
2805actually having to unpack two trees. Just ignore all common parts,
2806and your diff will look right. In other words, you can effectively
2807(and efficiently) tell the difference between any two random trees by
2808O(n) where "n" is the size of the difference, rather than the size of
2809the tree.
2810
2811Side note 2 on trees: since the name of a "blob" depends entirely and
2812exclusively on its contents (i.e. there are no names or permissions
2813involved), you can see trivial renames or permission changes by
2814noticing that the blob stayed the same. However, renames with data
2815changes need a smarter "diff" implementation.
2816
2817A tree is created with gitlink:git-write-tree[1] and
2818its data can be accessed by gitlink:git-ls-tree[1].
2819Two trees can be compared with gitlink:git-diff-tree[1].
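
The sharing of identical trees can be observed directly. In this sketch (scratch repository, made-up directory names), two subdirectories with the same contents end up referring to the same tree object.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

mkdir dir1 dir2
echo data > dir1/file.txt
echo data > dir2/file.txt
git update-index --add dir1/file.txt dir2/file.txt

tree=$(git write-tree)          # top-level tree for the current index
git ls-tree "$tree"             # one "tree" entry per subdirectory

tree1=$(git rev-parse "$tree:dir1")
tree2=$(git rev-parse "$tree:dir2")
```

Both entries printed by git-ls-tree carry the same object name, because the two subdirectories contain the same name and blob.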
2820
e34caace 2821[[commit-object]]
b181d57f
BF
2822Commit Object
2823-------------
2824
2825The "commit" object is an object that introduces the notion of
2826history into the picture. In contrast to the other objects, it
2827doesn't just describe the physical state of a tree, it describes how
2828we got there, and why.
2829
2830A "commit" is defined by the tree-object that it results in, the
2831parent commits (zero, one or more) that led up to that point, and a
2832comment on what happened. Again, a commit is not trusted per se:
2833the contents are well-defined and "safe" due to the cryptographically
2834strong signatures at all levels, but there is no reason to believe
2835that the tree is "good" or that the merge information makes sense.
2836The parents do not have to actually have any relationship with the
2837result, for example.
2838
Note on commits: unlike some SCMs, commits do not contain
rename information or file mode change information. All of that is
implicit in the trees involved (the result tree, and the result trees
of the parents), so recording it again in the commit would be
redundant.
2844
2845A commit is created with gitlink:git-commit-tree[1] and
2846its data can be accessed by gitlink:git-cat-file[1].
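
A minimal sketch of building commits by hand with git-commit-tree follows, assuming a scratch repository and hypothetical file names. The first commit is a root (no parents); the second names the first as its parent with `-p`.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

echo data > file.txt
git update-index --add file.txt
tree1=$(git write-tree)
root=$(echo "initial commit" | git commit-tree "$tree1")

echo more >> file.txt
git update-index file.txt
tree2=$(git write-tree)
child=$(echo "second commit" | git commit-tree "$tree2" -p "$root")

git cat-file -p "$child"        # shows the tree, parent, author, committer, message

child_parent=$(git rev-parse "$child^")
child_tree=$(git rev-parse "$child^{tree}")
```

Note that git-commit-tree only creates the objects; no branch points at them until a reference is updated.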
2847
e34caace 2848[[trust]]
b181d57f
BF
2849Trust
2850-----
2851
2852An aside on the notion of "trust". Trust is really outside the scope
2853of "git", but it's worth noting a few things. First off, since
2854everything is hashed with SHA1, you 'can' trust that an object is
2855intact and has not been messed with by external sources. So the name
2856of an object uniquely identifies a known state - just not a state that
2857you may want to trust.
2858
2859Furthermore, since the SHA1 signature of a commit refers to the
2860SHA1 signatures of the tree it is associated with and the signatures
2861of the parent, a single named commit specifies uniquely a whole set
2862of history, with full contents. You can't later fake any step of the
2863way once you have the name of a commit.
2864
2865So to introduce some real trust in the system, the only thing you need
2866to do is to digitally sign just 'one' special note, which includes the
2867name of a top-level commit. Your digital signature shows others
2868that you trust that commit, and the immutability of the history of
2869commits tells others that they can trust the whole history.
2870
2871In other words, you can easily validate a whole archive by just
2872sending out a single email that tells the people the name (SHA1 hash)
2873of the top commit, and digitally sign that email using something
2874like GPG/PGP.
2875
2876To assist in this, git also provides the tag object...
2877
e34caace 2878[[tag-object]]
b181d57f
BF
2879Tag Object
2880----------
2881
2882Git provides the "tag" object to simplify creating, managing and
2883exchanging symbolic and signed tokens. The "tag" object at its
2884simplest simply symbolically identifies another object by containing
2885the sha1, type and symbolic name.
2886
2887However it can optionally contain additional signature information
2888(which git doesn't care about as long as there's less than 8k of
2889it). This can then be verified externally to git.
2890
2891Note that despite the tag features, "git" itself only handles content
2892integrity; the trust framework (and signature provision and
2893verification) has to come from outside.
2894
2895A tag is created with gitlink:git-mktag[1],
2896its data can be accessed by gitlink:git-cat-file[1],
2897and the signature can be verified by
2898gitlink:git-verify-tag[1].
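
For illustration, the sketch below creates an unsigned tag object and inspects it. It uses `git tag -a`, the usual higher-level way to create tag objects, rather than calling git-mktag directly; the tag name "v1.0" and the message are hypothetical.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

# A commit of the empty tree is enough to have something to tag.
tree=$(git write-tree)
commit=$(echo "initial commit" | git commit-tree "$tree")

git tag -a -m "first release" v1.0 "$commit"

git cat-file -p v1.0            # object, type, tag name, tagger, message

tag_type=$(git cat-file -t v1.0)
tagged=$(git rev-parse 'v1.0^{commit}')
```

The output of git-cat-file shows the identifier and type of the tagged object along with the symbolic name, as described above.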
2899
2900
e34caace 2901[[the-index]]
b181d57f
BF
2902The "index" aka "Current Directory Cache"
2903-----------------------------------------
2904
2905The index is a simple binary file, which contains an efficient
c64415e2 2906representation of the contents of a virtual directory. It
b181d57f
BF
2907does so by a simple array that associates a set of names, dates,
2908permissions and content (aka "blob") objects together. The cache is
2909always kept ordered by name, and names are unique (with a few very
2910specific rules) at any point in time, but the cache has no long-term
2911meaning, and can be partially updated at any time.
2912
2913In particular, the index certainly does not need to be consistent with
2914the current directory contents (in fact, most operations will depend on
2915different ways to make the index 'not' be consistent with the directory
2916hierarchy), but it has three very important attributes:
2917
2918'(a) it can re-generate the full state it caches (not just the
2919directory structure: it contains pointers to the "blob" objects so
2920that it can regenerate the data too)'
2921
2922As a special case, there is a clear and unambiguous one-way mapping
2923from a current directory cache to a "tree object", which can be
2924efficiently created from just the current directory cache without
2925actually looking at any other data. So a directory cache at any one
2926time uniquely specifies one and only one "tree" object (but has
2927additional data to make it easy to match up that tree object with what
2928has happened in the directory)
2929
2930'(b) it has efficient methods for finding inconsistencies between that
2931cached state ("tree object waiting to be instantiated") and the
2932current state.'
2933
2934'(c) it can additionally efficiently represent information about merge
2935conflicts between different tree objects, allowing each pathname to be
2936associated with sufficient information about the trees involved that
2937you can create a three-way merge between them.'
2938
79c96c57 2939Those are the ONLY three things that the directory cache does. It's a
b181d57f
BF
2940cache, and the normal operation is to re-generate it completely from a
2941known tree object, or update/compare it with a live tree that is being
2942developed. If you blow the directory cache away entirely, you generally
2943haven't lost any information as long as you have the name of the tree
2944that it described.
2945
At the same time, the index is also the
staging area for creating new trees, and creating a new tree always
involves a controlled modification of the index file. In particular,
the index file can have the representation of an intermediate tree that
has not yet been instantiated. So the index can be thought of as a
write-back cache, which can contain dirty information that has not yet
been written back to the backing store.
2953
2954
2955
e34caace 2956[[the-workflow]]
b181d57f
BF
2957The Workflow
2958------------
2959
Generally, all "git" operations work on the index file. Some operations
work *purely* on the index file (showing the current state of the
index), but most operations move data to and from the index file, either
from the database or from the working directory. Thus there are four
main combinations:
2965
e34caace 2966[[working-directory-to-index]]
b181d57f
BF
2967working directory -> index
2968~~~~~~~~~~~~~~~~~~~~~~~~~~
2969
2970You update the index with information from the working directory with
2971the gitlink:git-update-index[1] command. You
2972generally update the index information by just specifying the filename
2973you want to update, like so:
2974
2975-------------------------------------------------
2976$ git-update-index filename
2977-------------------------------------------------
2978
2979but to avoid common mistakes with filename globbing etc, the command
2980will not normally add totally new entries or remove old entries,
2981i.e. it will normally just update existing cache entries.
2982
2983To tell git that yes, you really do realize that certain files no
2984longer exist, or that new files should be added, you
2985should use the `--remove` and `--add` flags respectively.
2986
NOTE! A `--remove` flag does 'not' mean that subsequent filenames will
necessarily be removed: if the files still exist in your directory
structure, the index will be updated with their new status, not
removed. The only thing `--remove` means is that git-update-index will
treat a removed file as a valid thing, and if the file really
does not exist any more, it will update the index accordingly.
2993
As a special case, you can also do `git-update-index --refresh`, which
will refresh the "stat" information of each index entry to match the
current stat information. It will 'not' update the object status itself,
and it will only update the fields that are used to quickly test whether
an object still matches its old backing store object.
2999
e34caace 3000[[index-to-object-database]]
b181d57f
BF
3001index -> object database
3002~~~~~~~~~~~~~~~~~~~~~~~~
3003
3004You write your current index file to a "tree" object with the program
3005
3006-------------------------------------------------
3007$ git-write-tree
3008-------------------------------------------------
3009
3010that doesn't come with any options - it will just write out the
3011current index into the set of tree objects that describe that state,
3012and it will return the name of the resulting top-level tree. You can
3013use that tree to re-generate the index at any time by going in the
3014other direction:
3015
[[object-database-to-index]]
object database -> index
~~~~~~~~~~~~~~~~~~~~~~~~
3019
3020You read a "tree" file from the object database, and use that to
3021populate (and overwrite - don't do this if your index contains any
3022unsaved state that you might want to restore later!) your current
3023index. Normal operation is just
3024
3025-------------------------------------------------
3026$ git-read-tree <sha1 of tree>
3027-------------------------------------------------
3028
3029and your index file will now be equivalent to the tree that you saved
3030earlier. However, that is only your 'index' file: your working
3031directory contents have not been modified.
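A round trip through both directions might look like the sketch below.
The scratch repository and `file.txt` are assumptions made for the
example; `:file.txt` is git's notation for the copy of a path that is
recorded in the index.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q

echo "version 1" > file.txt
git update-index --add file.txt
tree=$(git write-tree)                    # index -> object database

echo "version two" > file.txt
git update-index file.txt                 # the index now records "version two"

git read-tree "$tree"                     # object database -> index
git cat-file blob :file.txt               # index content: version 1
cat file.txt                              # working directory: version two
```

Only the index went back to the saved tree; the file on disk kept its
newer contents.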
3032
[[index-to-working-directory]]
index -> working directory
~~~~~~~~~~~~~~~~~~~~~~~~~~
3036
3037You update your working directory from the index by "checking out"
3038files. This is not a very common operation, since normally you'd just
3039keep your files updated, and rather than write to your working
3040directory, you'd tell the index files about the changes in your
3041working directory (i.e. `git-update-index`).
3042
3043However, if you decide to jump to a new version, or check out somebody
3044else's version, or just restore a previous tree, you'd populate your
3045index file with read-tree, and then you need to check out the result
3046with
3047
3048-------------------------------------------------
3049$ git-checkout-index filename
3050-------------------------------------------------
3051
3052or, if you want to check out all of the index, use `-a`.
3053
3054NOTE! git-checkout-index normally refuses to overwrite old files, so
3055if you have an old version of the tree already checked out, you will
3056need to use the "-f" flag ('before' the "-a" flag or the filename) to
3057'force' the checkout.
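The effect of the "-f" flag can be seen in a small experiment. This is a
sketch under the assumption of a scratch repository with a single
made-up file, `a.txt`:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q

echo "indexed" > a.txt
git update-index --add a.txt

echo "local edit" > a.txt
# The file already exists and differs, so it is left alone.
git checkout-index a.txt 2>/dev/null || true
before=$(cat a.txt)
echo "$before"                            # local edit

# With -f, every file is rewritten from the index.
git checkout-index -f -a
after=$(cat a.txt)
echo "$after"                             # indexed
```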
3058
3059
3060Finally, there are a few odds and ends which are not purely moving
3061from one representation to the other:
3062
[[tying-it-all-together]]
Tying it all together
~~~~~~~~~~~~~~~~~~~~~
3066
3067To commit a tree you have instantiated with "git-write-tree", you'd
3068create a "commit" object that refers to that tree and the history
3069behind it - most notably the "parent" commits that preceded it in
3070history.
3071
3072Normally a "commit" has one parent: the previous state of the tree
3073before a certain change was made. However, sometimes it can have two
3074or more parent commits, in which case we call it a "merge", due to the
3075fact that such a commit brings together ("merges") two or more
3076previous states represented by other commits.
3077
3078In other words, while a "tree" represents a particular directory state
3079of a working directory, a "commit" represents that state in "time",
3080and explains how we got there.
3081
3082You create a commit object by giving it the tree that describes the
3083state at the time of the commit, and a list of parents:
3084
3085-------------------------------------------------
3086$ git-commit-tree <tree> -p <parent> [-p <parent2> ..]
3087-------------------------------------------------
3088
3089and then giving the reason for the commit on stdin (either through
3090redirection from a pipe or file, or by just typing it at the tty).
3091
3092git-commit-tree will return the name of the object that represents
3093that commit, and you should save it away for later use. Normally,
3094you'd commit a new `HEAD` state, and while git doesn't care where you
3095save the note about that state, in practice we tend to just write the
3096result to the file pointed at by `.git/HEAD`, so that we can always see
3097what the last committed state was.
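Put together, creating two commits by hand could look roughly like this.
The identity values, file name and commit messages are placeholders, and
`git update-ref` is used here as one way of recording the result in the
branch that `.git/HEAD` points at.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
# commit-tree needs to know who is committing; these values are placeholders.
export GIT_AUTHOR_NAME="A U Thor" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="A U Thor" GIT_COMMITTER_EMAIL="author@example.com"

echo "hello" > hello.txt
git update-index --add hello.txt
tree=$(git write-tree)
first=$(echo "initial commit" | git commit-tree "$tree")     # root commit, no -p

echo "hello again" > hello.txt
git update-index hello.txt
tree=$(git write-tree)
second=$(echo "second commit" | git commit-tree "$tree" -p "$first")

git update-ref HEAD "$second"             # save the new HEAD state
git log --pretty=oneline                  # lists both commits, newest first
```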
3098
Here is an ASCII art diagram by Jon Loeliger that illustrates how
the various pieces fit together.
3101
------------

                     commit-tree
                      commit obj
                       +----+
                       |    |
                       |    |
                       V    V
                    +-----------+
                    | Object DB |
                    |  Backing  |
                    |   Store   |
                    +-----------+
                       ^
           write-tree  |     |
             tree obj  |     |
                       |     |  read-tree
                       |     |  tree obj
                             V
                    +-----------+
                    |   Index   |
                    |  "cache"  |
                    +-----------+
         update-index  ^
             blob obj  |     |
                       |     |
    checkout-index -u  |     |  checkout-index
             stat      |     |  blob obj
                             V
                    +-----------+
                    |  Working  |
                    | Directory |
                    +-----------+

------------
3137
3138
[[examining-the-data]]
Examining the data
------------------
3142
3143You can examine the data represented in the object database and the
3144index with various helper tools. For every object, you can use
3145gitlink:git-cat-file[1] to examine details about the
3146object:
3147
3148-------------------------------------------------
3149$ git-cat-file -t <objectname>
3150-------------------------------------------------
3151
3152shows the type of the object, and once you have the type (which is
3153usually implicit in where you find the object), you can use
3154
3155-------------------------------------------------
3156$ git-cat-file blob|tree|commit|tag <objectname>
3157-------------------------------------------------
3158
3159to show its contents. NOTE! Trees have binary content, and as a result
3160there is a special helper for showing that content, called
3161`git-ls-tree`, which turns the binary content into a more easily
3162readable form.
3163
3164It's especially instructive to look at "commit" objects, since those
3165tend to be small and fairly self-explanatory. In particular, if you
3166follow the convention of having the top commit name in `.git/HEAD`,
3167you can do
3168
3169-------------------------------------------------
3170$ git-cat-file commit HEAD
3171-------------------------------------------------
3172
3173to see what the top commit was.
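For instance, the following sketch creates one commit by hand and then
walks from the commit to its tree and blob. The repository, identity
values and `hello.txt` are invented for the example; `HEAD:hello.txt` is
git's notation for a path inside the tree of `HEAD`.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
export GIT_AUTHOR_NAME="A U Thor" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="A U Thor" GIT_COMMITTER_EMAIL="author@example.com"

echo "hello" > hello.txt
git update-index --add hello.txt
commit=$(echo "initial commit" | git commit-tree "$(git write-tree)")
git update-ref HEAD "$commit"

git cat-file -t HEAD                      # commit
git cat-file commit HEAD                  # tree line, author, committer, message
tree=$(git cat-file commit HEAD | head -1 | cut -d' ' -f2)
git cat-file -t "$tree"                   # tree
git ls-tree "$tree"                       # readable listing of the binary tree
git cat-file blob HEAD:hello.txt          # hello
```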
3174
[[merging-multiple-trees]]
Merging multiple trees
----------------------

Git helps you do a three-way merge, which you can expand to n-way by
repeating the merge procedure an arbitrary number of times until you
finally "commit" the state. The normal situation is that you'd only do one
three-way merge (two parents), and commit it, but if you like to, you
can do multiple parents in one go.
3184
To do a three-way merge, you need the two "commit" objects
that you want to merge. Use those to find the closest common parent (a
third "commit" object), and then use those commit objects to find the
state of the directory ("tree" object) at these points.
3189
3190To get the "base" for the merge, you first look up the common parent
3191of two commits with
3192
3193-------------------------------------------------
3194$ git-merge-base <commit1> <commit2>
3195-------------------------------------------------
3196
3197which will return you the commit they are both based on. You should
3198now look up the "tree" objects of those commits, which you can easily
3199do with (for example)
3200
3201-------------------------------------------------
3202$ git-cat-file commit <commitname> | head -1
3203-------------------------------------------------
3204
3205since the tree object information is always the first line in a commit
3206object.
3207
Once you know the three trees you are going to merge (the one "original"
tree, aka the common tree, and the two "result" trees, aka the branches
you want to merge), you do a "merge" read into the index. This will
complain if it has to throw away your old index contents, so you should
make sure that you've committed those - in fact you would normally
always do a merge against your last commit (which should thus match what
you have in your current index anyway).
3215
3216To do the merge, do
3217
3218-------------------------------------------------
3219$ git-read-tree -m -u <origtree> <yourtree> <targettree>
3220-------------------------------------------------
3221
3222which will do all trivial merge operations for you directly in the
3223index file, and you can just write the result out with
3224`git-write-tree`.
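A trivial merge of this kind can be reproduced with nothing but trees.
In the sketch below the three trees are built by hand in a scratch
repository; the file names and contents are invented, and "ours" and
"theirs" simply stand for `<yourtree>` and `<targettree>`.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q

printf 'a\n' > a.txt
printf 'b\n' > b.txt
git update-index --add a.txt b.txt
orig=$(git write-tree)                    # the common "original" tree

printf 'b changed by them\n' > b.txt
git update-index b.txt
theirs=$(git write-tree)                  # one "result" tree

git read-tree "$orig"                     # go back to the original state
git checkout-index -f -u -a
printf 'a changed by us\n' > a.txt
git update-index a.txt
ours=$(git write-tree)                    # the other "result" tree

git update-index --refresh                # make sure the stat information is current
git read-tree -m -u "$orig" "$ours" "$theirs"
merged=$(git write-tree)
git ls-tree "$merged"                     # a.txt from ours, b.txt from theirs
cat a.txt b.txt
```

Each file was changed on one side only, so the merge is resolved
entirely inside the index and the result can be written out at once.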
3225
3226
[[merging-multiple-trees-2]]
Merging multiple trees, continued
---------------------------------
3230
Sadly, many merges aren't trivial. If there are files that have
been added, moved or removed, or if both branches have modified the
same file, you will be left with an index tree that contains "merge
entries" in it. Such an index tree can 'NOT' be written out to a tree
object, and you will have to resolve any such merge clashes using
other tools before you can write out the result.

You can examine such index state with the `git-ls-files --unmerged`
command. An example:
3240
3241------------------------------------------------
3242$ git-read-tree -m $orig HEAD $target
3243$ git-ls-files --unmerged
3244100644 263414f423d0e4d70dae8fe53fa34614ff3e2860 1 hello.c
3245100644 06fa6a24256dc7e560efa5687fa84b51f0263c3a 2 hello.c
3246100644 cc44c73eb783565da5831b4d820c962954019b69 3 hello.c
3247------------------------------------------------
3248
Each line of the `git-ls-files --unmerged` output begins with
the blob mode bits, blob SHA1, 'stage number', and the
filename. The 'stage number' is git's way to say which tree it
came from: stage 1 corresponds to the `$orig` tree, stage 2 to the `HEAD`
tree, and stage 3 to the `$target` tree.
3254
Earlier we said that trivial merges are done inside
`git-read-tree -m`. For example, if the file did not change
from `$orig` to `HEAD` nor `$target`, or if the file changed
from `$orig` to `HEAD` and `$orig` to `$target` the same way,
obviously the final outcome is what is in `HEAD`. What the
above example shows is that file `hello.c` was changed from
`$orig` to `HEAD` and `$orig` to `$target` in a different way.
You could resolve this by running your favorite 3-way merge
program, e.g. `diff3`, `merge`, or git's own merge-file, on
the blob objects from these three stages yourself, like this:
3265
------------------------------------------------
$ git-cat-file blob 263414f... >hello.c~1
$ git-cat-file blob 06fa6a2... >hello.c~2
$ git-cat-file blob cc44c73... >hello.c~3
$ git merge-file hello.c~2 hello.c~1 hello.c~3
------------------------------------------------
3272
3273This would leave the merge result in `hello.c~2` file, along
3274with conflict markers if there are conflicts. After verifying
3275the merge result makes sense, you can tell git what the final
3276merge result for this file is by:
3277
3278-------------------------------------------------
3279$ mv -f hello.c~2 hello.c
3280$ git-update-index hello.c
3281-------------------------------------------------
3282
3283When a path is in unmerged state, running `git-update-index` for
3284that path tells git to mark the path resolved.
3285
The above is the description of a git merge at the lowest level,
to help you understand what conceptually happens under the hood.
In practice, nobody, not even git itself, runs `git-cat-file` three
times for this. There is a `git-merge-index` program that extracts the
stages to temporary files and calls a "merge" script on it:
3291
3292-------------------------------------------------
3293$ git-merge-index git-merge-one-file hello.c
3294-------------------------------------------------
3295
and that is what the higher level `git merge -s resolve` is implemented with.
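The manual resolution described in this section can be replayed end to
end in a scratch repository. The sketch below makes a few assumptions:
the contents of `hello.c` are invented so that the two sides touch
different lines, and the stages are read with git's `:<stage>:<path>`
notation instead of copying the blob SHA1s by hand.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q

printf 'line 1\nline 2\nline 3\nline 4\nline 5\n' > hello.c
git update-index --add hello.c
orig=$(git write-tree)

printf 'line 1\nline 2\nline 3\nline 4\nline 5 (theirs)\n' > hello.c
git update-index hello.c
target=$(git write-tree)

printf 'line 1 (ours)\nline 2\nline 3\nline 4\nline 5\n' > hello.c
git update-index hello.c
ours=$(git write-tree)                    # index and working file now match "ours"

git read-tree -m "$orig" "$ours" "$target"
stages=$(git ls-files --unmerged | wc -l)
git ls-files --unmerged                   # stages 1, 2 and 3 of hello.c

git cat-file blob :1:hello.c > hello.c~1
git cat-file blob :2:hello.c > hello.c~2
git cat-file blob :3:hello.c > hello.c~3
git merge-file hello.c~2 hello.c~1 hello.c~3

mv -f hello.c~2 hello.c
git update-index hello.c                  # marks the path resolved
merged=$(git write-tree)                  # works again now
cat hello.c
```

Both sides changed the same file, so `git read-tree -m` leaves the path
unmerged even though the changes themselves do not overlap; it is
`git merge-file` that combines them.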
b181d57f 3297
[[pack-files]]
How git stores objects efficiently: pack files
----------------------------------------------
3301
3302We've seen how git stores each object in a file named after the
3303object's SHA1 hash.
3304
3305Unfortunately this system becomes inefficient once a project has a
3306lot of objects. Try this on an old project:
3307
3308------------------------------------------------
3309$ git count-objects
33106930 objects, 47620 kilobytes
3311------------------------------------------------
3312
3313The first number is the number of objects which are kept in
3314individual files. The second is the amount of space taken up by
3315those "loose" objects.
3316
You can save space and make git faster by moving these loose objects
into a "pack file", which stores a group of objects in an efficient
compressed format; the details of how pack files are formatted can be
found in link:technical/pack-format.txt[technical/pack-format.txt].
3321
3322To put the loose objects into a pack, just run git repack:
3323
3324------------------------------------------------
3325$ git repack
3326Generating pack...
3327Done counting 6020 objects.
3328Deltifying 6020 objects.
3329 100% (6020/6020) done
3330Writing 6020 objects.
3331 100% (6020/6020) done
3332Total 6020, written 6020 (delta 4070), reused 0 (delta 0)
3333Pack pack-3e54ad29d5b2e05838c75df582c65257b8d08e1c created.
3334------------------------------------------------
3335
3336You can then run
3337
3338------------------------------------------------
3339$ git prune
3340------------------------------------------------
3341
3342to remove any of the "loose" objects that are now contained in the
3343pack. This will also remove any unreferenced objects (which may be
3344created when, for example, you use "git reset" to remove a commit).
3345You can verify that the loose objects are gone by looking at the
3346.git/objects directory or by running
3347
3348------------------------------------------------
3349$ git count-objects
33500 objects, 0 kilobytes
3351------------------------------------------------
3352
3353Although the object files are gone, any commands that refer to those
3354objects will work exactly as they did before.
3355
3356The gitlink:git-gc[1] command performs packing, pruning, and more for
3357you, so is normally the only high-level command you need.
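The whole sequence can be tried on a throwaway repository. This is only
a sketch: the identity values, file name and commit messages are
placeholders, and the three small commits stand in for an "old project".

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
export GIT_AUTHOR_NAME="A U Thor" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="A U Thor" GIT_COMMITTER_EMAIL="author@example.com"

for i in 1 2 3; do
    echo "revision $i" > file.txt
    git add file.txt
    git commit -q -m "revision $i"
done

git count-objects                         # 9 loose objects: 3 blobs, 3 trees, 3 commits
git repack -q                             # copy the loose objects into a pack
git prune                                 # drop loose objects that are now packed
after=$(git count-objects)
echo "$after"                             # 0 objects, 0 kilobytes
commits=$(git rev-list HEAD | wc -l)      # history is still fully readable
```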
d19fbc3c 3358
[[dangling-objects]]
Dangling objects
----------------

The gitlink:git-fsck[1] command will sometimes complain about dangling
objects. They are not a problem.
3365
The most common cause of dangling objects is that you've rebased a
branch, or you have pulled from somebody else who rebased a branch--see
<<cleaning-up-history>>. In that case, the old head of the original
branch still exists, as does everything it pointed to. The branch
pointer itself just doesn't, since you replaced it with another one.

There are also other situations that cause dangling objects. For
example, a "dangling blob" may arise because you did a "git add" of a
file, but then, before you actually committed it and made it part of the
bigger picture, you changed something else in that file and committed
that *updated* thing - the old state that you added originally ends up
not being pointed to by any commit or tree, so it's now a dangling blob
object.
3379
3380Similarly, when the "recursive" merge strategy runs, and finds that
3381there are criss-cross merges and thus more than one merge base (which is
3382fairly unusual, but it does happen), it will generate one temporary
3383midway tree (or possibly even more, if you had lots of criss-crossing
3384merges and more than two merge bases) as a temporary internal merge
3385base, and again, those are real objects, but the end result will not end
3386up pointing to them, so they end up "dangling" in your repository.
3387
3388Generally, dangling objects aren't anything to worry about. They can
3389even be very useful: if you screw something up, the dangling objects can
3390be how you recover your old tree (say, you did a rebase, and realized
3391that you really didn't want to - you can look at what dangling objects
3392you have, and decide to reset your head to some old dangling state).
21dcb3b7 3393
For commits, you can just use:
3395
3396------------------------------------------------
3397$ gitk <dangling-commit-sha-goes-here> --not --all
3398------------------------------------------------
3399
This asks for all the history reachable from the given commit but not
from any branch, tag, or other reference. If you decide it's something
you want, you can always create a new reference to it, e.g.,
3403
3404------------------------------------------------
3405$ git branch recovered-branch <dangling-commit-sha-goes-here>
3406------------------------------------------------
3407
For blobs and trees, you can't do the same, but you can still examine
them. You can just do
3410
3411------------------------------------------------
3412$ git show <dangling-blob/tree-sha-goes-here>
3413------------------------------------------------
3414
to show what the contents of the blob were (or, for a tree, basically
what the "ls" for that directory was), and that may give you some idea
of what the operation was that left that dangling object.
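The "git add" case mentioned earlier is easy to reproduce. The sketch
below assumes a scratch repository with a made-up file `notes.txt`; it
picks the dangling blob out of the `git fsck` report, looks at it, and
then prunes it.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
export GIT_AUTHOR_NAME="A U Thor" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="A U Thor" GIT_COMMITTER_EMAIL="author@example.com"

echo "first draft" > notes.txt
git add notes.txt                         # a blob for "first draft" is written
echo "final text" > notes.txt
git add notes.txt                         # the index now points at a new blob
git commit -q -m "add notes"

# fsck reports lines of the form "dangling blob <sha1>".
dangling=$(git fsck 2>/dev/null | awk '$1 == "dangling" && $2 == "blob" { print $3 }')
recovered=$(git show "$dangling")
echo "$recovered"                         # first draft

git prune                                 # only once you are sure you do not need it
if git cat-file -e "$dangling" 2>/dev/null; then state=present; else state=pruned; fi
echo "$state"                             # pruned
```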
21dcb3b7 3418
Usually, dangling blobs and trees aren't very interesting. They're
almost always the result of either being a half-way mergebase (the blob
will often even have the conflict markers from a merge in it, if you
have had conflicting merges that you fixed up by hand), or simply
because you interrupted a "git fetch" with ^C or something like that,
leaving _some_ of the new objects in the object database, but just
dangling and useless.
3426
3427Anyway, once you are sure that you're not interested in any dangling
3428state, you can just prune all unreachable objects:
3429
3430------------------------------------------------
3431$ git prune
3432------------------------------------------------
3433
and they'll be gone. But you should only run "git prune" on a quiescent
repository - it's kind of like doing a filesystem fsck recovery: you
don't want to do that while the filesystem is mounted.

(The same is true of "git-fsck" itself, btw - but since
git-fsck never actually *changes* the repository, it just reports
on what it found, git-fsck itself is never "dangerous" to run.
Running it while somebody is actually changing the repository can cause
confusing and scary messages, but it won't actually do anything bad. In
contrast, running "git prune" while somebody is actively changing the
repository is a *BAD* idea.)
3445
[[birdview-on-the-source-code]]
A bird's-eye view of Git's source code
--------------------------------------

It is not always easy for new developers to find their way through Git's
source code. This section gives you a little guidance to show where to
start.
126640af 3453
A good place to start is with the contents of the initial commit, with:

----------------------------------------------------
$ git checkout e83c5163
----------------------------------------------------
3459
The initial revision lays the foundation for almost everything git has
today, but is small enough to read in one sitting.

Note that terminology has changed since that revision. For example, the
README in that revision uses the word "changeset" to describe what we
now call a <<def_commit_object,commit>>.

Also, we no longer call it the "cache", but the "index"; however, the
file is still called `cache.h`. There is not much reason to change it now,
especially since there is no good single name for it anyway, because it is
basically _the_ header file which is included by _all_ of Git's C sources.

If you grasp the ideas in that initial commit, you should check out a
more recent version and skim `cache.h`, `object.h` and `commit.h`.
3474
3475In the early days, Git (in the tradition of UNIX) was a bunch of programs
3476which were extremely simple, and which you used in scripts, piping the
3477output of one into another. This turned out to be good for initial
3478development, since it was easier to test new things. However, recently
3479many of these parts have become builtins, and some of the core has been
3480"libified", i.e. put into libgit.a for performance, portability reasons,
3481and to avoid code duplication.
3482
3483By now, you know what the index is (and find the corresponding data
3484structures in `cache.h`), and that there are just a couple of object types
3485(blobs, trees, commits and tags) which inherit their common structure from
3486`struct object`, which is their first member (and thus, you can cast e.g.
3487`(struct object *)commit` to achieve the _same_ as `&commit->object`, i.e.
3488get at the object name and flags).
3489
3490Now is a good point to take a break to let this information sink in.
3491
3492Next step: get familiar with the object naming. Read <<naming-commits>>.
3493There are quite a few ways to name an object (and not only revisions!).
3494All of these are handled in `sha1_name.c`. Just have a quick look at
3495the function `get_sha1()`. A lot of the special handling is done by
3496functions like `get_sha1_basic()` or the likes.
3497
3498This is just to get you into the groove for the most libified part of Git:
3499the revision walker.
3500
3501Basically, the initial version of `git log` was a shell script:
3502
3503----------------------------------------------------------------
3504$ git-rev-list --pretty $(git-rev-parse --default HEAD "$@") | \
3505 LESS=-S ${PAGER:-less}
3506----------------------------------------------------------------
3507
3508What does this mean?
3509
`git-rev-list` is the original version of the revision walker, which
_always_ printed a list of revisions to stdout. It is still functional,
and needs to be, since most new Git programs start out as scripts using
`git-rev-list`.
3514
3515`git-rev-parse` is not as important any more; it was only used to filter out
3516options that were relevant for the different plumbing commands that were
3517called by the script.
3518
3519Most of what `git-rev-list` did is contained in `revision.c` and
3520`revision.h`. It wraps the options in a struct named `rev_info`, which
3521controls how and what revisions are walked, and more.
3522
3523The original job of `git-rev-parse` is now taken by the function
3524`setup_revisions()`, which parses the revisions and the common command line
3525options for the revision walker. This information is stored in the struct
3526`rev_info` for later consumption. You can do your own command line option
3527parsing after calling `setup_revisions()`. After that, you have to call
3528`prepare_revision_walk()` for initialization, and then you can get the
3529commits one by one with the function `get_revision()`.
3530
3531If you are interested in more details of the revision walking process,
3532just have a look at the first implementation of `cmd_log()`; call
3533`git-show v1.3.0~155^2~4` and scroll down to that function (note that you
3534no longer need to call `setup_pager()` directly).
3535
3536Nowadays, `git log` is a builtin, which means that it is _contained_ in the
3537command `git`. The source side of a builtin is
3538
3539- a function called `cmd_<bla>`, typically defined in `builtin-<bla>.c`,
3540 and declared in `builtin.h`,
3541
3542- an entry in the `commands[]` array in `git.c`, and
3543
3544- an entry in `BUILTIN_OBJECTS` in the `Makefile`.
3545
3546Sometimes, more than one builtin is contained in one source file. For
3547example, `cmd_whatchanged()` and `cmd_log()` both reside in `builtin-log.c`,
3548since they share quite a bit of code. In that case, the commands which are
3549_not_ named like the `.c` file in which they live have to be listed in
3550`BUILT_INS` in the `Makefile`.
3551
3552`git log` looks more complicated in C than it does in the original script,
3553but that allows for a much greater flexibility and performance.
3554
This is again a good point to take a pause.
3556
3557Lesson three is: study the code. Really, it is the best way to learn about
3558the organization of Git (after you know the basic concepts).
3559
3560So, think about something which you are interested in, say, "how can I
3561access a blob just knowing the object name of it?". The first step is to
3562find a Git command with which you can do it. In this example, it is either
3563`git show` or `git cat-file`.
3564
3565For the sake of clarity, let's stay with `git cat-file`, because it
3566
3567- is plumbing, and
3568
3569- was around even in the initial commit (it literally went only through
3570 some 20 revisions as `cat-file.c`, was renamed to `builtin-cat-file.c`
3571 when made a builtin, and then saw less than 10 versions).
3572
So, look into `builtin-cat-file.c`, search for `cmd_cat_file()` and look at
what it does.
3575
3576------------------------------------------------------------------
3577 git_config(git_default_config);
3578 if (argc != 3)
3579 usage("git-cat-file [-t|-s|-e|-p|<type>] <sha1>");
3580 if (get_sha1(argv[2], sha1))
3581 die("Not a valid object name %s", argv[2]);
3582------------------------------------------------------------------
3583
3584Let's skip over the obvious details; the only really interesting part
3585here is the call to `get_sha1()`. It tries to interpret `argv[2]` as an
3586object name, and if it refers to an object which is present in the current
3587repository, it writes the resulting SHA-1 into the variable `sha1`.
3588
3589Two things are interesting here:
3590
- `get_sha1()` returns 0 on _success_. This might surprise some new
  Git hackers, but there is a long tradition in UNIX to return different
  negative numbers in case of different errors -- and 0 on success.

- the variable `sha1` in the function signature of `get_sha1()` is `unsigned
  char \*`, but is actually expected to be a pointer to `unsigned
  char[20]`. This variable will contain the 160-bit SHA-1 of the given
  commit. Note that whenever a SHA-1 is passed as `unsigned char \*`, it
  is the binary representation, as opposed to the ASCII representation in
  hex characters, which is passed as `char *`.
3601
3602You will see both of these things throughout the code.
3603
3604Now, for the meat:
3605
3606-----------------------------------------------------------------------------
3607 case 0:
3608 buf = read_object_with_reference(sha1, argv[1], &size, NULL);
3609-----------------------------------------------------------------------------
3610
3611This is how you read a blob (actually, not only a blob, but any type of
3612object). To know how the function `read_object_with_reference()` actually
3613works, find the source code for it (something like `git grep
3614read_object_with | grep ":[a-z]"` in the git repository), and read
3615the source.
3616
3617To find out how the result can be used, just read on in `cmd_cat_file()`:
3618
3619-----------------------------------
3620 write_or_die(1, buf, size);
3621-----------------------------------
3622
3623Sometimes, you do not know where to look for a feature. In many such cases,
3624it helps to search through the output of `git log`, and then `git show` the
3625corresponding commit.
3626
3627Example: If you know that there was some test case for `git bundle`, but
3628do not remember where it was (yes, you _could_ `git grep bundle t/`, but that
3629does not illustrate the point!):
3630
3631------------------------
3632$ git log --no-merges t/
3633------------------------
3634
3635In the pager (`less`), just search for "bundle", go a few lines back,
3636and see that it is in commit 18449ab0... Now just copy this object name,
3637and paste it into the command line
3638
3639-------------------
3640$ git show 18449ab0
3641-------------------
3642
3643Voila.
3644
3645Another example: Find out what to do in order to make some script a
3646builtin:
3647
3648-------------------------------------------------
3649$ git log --no-merges --diff-filter=A builtin-*.c
3650-------------------------------------------------
3651
3652You see, Git is actually the best tool to find out about the source of Git
3653itself!
3654
[[glossary]]
include::glossary.txt[]
3657
[[git-quick-start]]
Appendix A: Git Quick Reference
===============================

This is a quick summary of the major commands; the previous chapters
explain how these work in more detail.
3664
3665[[quick-creating-a-new-repository]]
3666Creating a new repository
3667-------------------------
3668
3669From a tarball:
3670
3671-----------------------------------------------
3672$ tar xzf project.tar.gz
3673$ cd project
3674$ git init
3675Initialized empty Git repository in .git/
3676$ git add .
3677$ git commit
3678-----------------------------------------------
3679
3680From a remote repository:
3681
3682-----------------------------------------------
3683$ git clone git://example.com/pub/project.git
3684$ cd project
3685-----------------------------------------------
3686
3687[[managing-branches]]
3688Managing branches
3689-----------------
3690
3691-----------------------------------------------
3692$ git branch # list all local branches in this repo
3693$ git checkout test # switch working directory to branch "test"
3694$ git branch new # create branch "new" starting at current HEAD
3695$ git branch -d new # delete branch "new"
3696-----------------------------------------------
3697
Instead of basing the new branch on the current HEAD (the default), use:
3699
3700-----------------------------------------------
3701$ git branch new test # branch named "test"
3702$ git branch new v2.6.15 # tag named v2.6.15
3703$ git branch new HEAD^ # commit before the most recent
3704$ git branch new HEAD^^ # commit before that
3705$ git branch new test~10 # ten commits before tip of branch "test"
3706-----------------------------------------------
3707
3708Create and switch to a new branch at the same time:
3709
3710-----------------------------------------------
3711$ git checkout -b new v2.6.15
3712-----------------------------------------------
3713
3714Update and examine branches from the repository you cloned from:
3715
3716-----------------------------------------------
3717$ git fetch # update
3718$ git branch -r # list
3719 origin/master
3720 origin/next
3721 ...
3722$ git checkout -b masterwork origin/master
3723-----------------------------------------------
3724
3725Fetch a branch from a different repository, and give it a new
3726name in your repository:
3727
3728-----------------------------------------------
3729$ git fetch git://example.com/project.git theirbranch:mybranch
3730$ git fetch git://example.com/project.git v2.6.15:mybranch
3731-----------------------------------------------
3732
3733Keep a list of repositories you work with regularly:
3734
3735-----------------------------------------------
3736$ git remote add example git://example.com/project.git
3737$ git remote # list remote repositories
3738example
3739origin
3740$ git remote show example # get details
3741* remote example
3742 URL: git://example.com/project.git
3743 Tracked remote branches
3744 master next ...
3745$ git fetch example # update branches from example
3746$ git branch -r # list all remote branches
3747-----------------------------------------------
3748
3749
3750[[exploring-history]]
3751Exploring history
3752-----------------
3753
-----------------------------------------------
$ gitk                            # visualize and browse history
$ git log                         # list all commits
$ git log src/                    # ...modifying src/
$ git log v2.6.15..v2.6.16        # ...in v2.6.16, not in v2.6.15
$ git log master..test            # ...in branch test, not in branch master
$ git log test..master            # ...in branch master, but not in test
$ git log test...master           # ...in one branch, not in both
$ git log -S'foo()'               # ...where difference contains "foo()"
$ git log --since="2 weeks ago"
$ git log -p                      # show patches as well
$ git show                        # most recent commit
$ git diff v2.6.15..v2.6.16       # diff between two tagged versions
$ git diff v2.6.15..HEAD          # diff with current head
$ git grep "foo()"                # search working directory for "foo()"
$ git grep "foo()" v2.6.15        # search old tree for "foo()"
$ git show v2.6.15:a.txt          # look at old version of a.txt
-----------------------------------------------
3772
Search for regressions:

-----------------------------------------------
$ git bisect start
$ git bisect bad                # current version is bad
$ git bisect good v2.6.13-rc2   # last known good revision
Bisecting: 675 revisions left to test after this
                                # test here, then:
$ git bisect good               # if this revision is good, or
$ git bisect bad                # if this revision is bad.
                                # repeat until done.
-----------------------------------------------

[[making-changes]]
Making changes
--------------

Make sure git knows who to blame:

------------------------------------------------
$ cat >>~/.gitconfig <<\EOF
[user]
        name = Your Name Comes Here
        email = you@yourdomain.example.com
EOF
------------------------------------------------

Select file contents to include in the next commit, then make the
commit:

-----------------------------------------------
$ git add a.txt    # updated file
$ git add b.txt    # new file
$ git rm c.txt     # old file
$ git commit
-----------------------------------------------

Or, prepare and create the commit in one step:

-----------------------------------------------
$ git commit d.txt # use latest content of d.txt only
$ git commit -a    # use latest content of all tracked files
-----------------------------------------------

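Note that "git commit -a" only picks up files git already tracks. A
small sketch, using the hypothetical files tracked.txt and
untracked.txt in a throwaway repository, shows the difference:

```shell
#!/bin/sh
# Sketch only: file names and commit messages are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo one > tracked.txt
git add tracked.txt
git commit -q -m "add tracked.txt"

echo two >> tracked.txt     # modify a file git already tracks
echo new > untracked.txt    # create a file git has never been told about

# Commit the latest content of all tracked files in one step.
git commit -q -a -m "update tracked files"

# Files changed by the new commit, and files still unknown to git.
committed=$(git diff --name-only HEAD^ HEAD)
remaining=$(git ls-files --others)
echo "committed: $committed"
echo "untracked: $remaining"
```

The new file stays out of the commit until it is added explicitly
with "git add".
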
[[merging]]
Merging
-------

-----------------------------------------------
$ git merge test   # merge branch "test" into the current branch
$ git pull git://example.com/project.git master
                   # fetch and merge in remote branch
$ git pull . test  # equivalent to git merge test
-----------------------------------------------

[[sharing-your-changes]]
Sharing your changes
--------------------

Importing or exporting patches:

-----------------------------------------------
$ git format-patch origin..HEAD # format a patch for each commit
                                # in HEAD but not in origin
$ git am mbox # import patches from the mailbox "mbox"
-----------------------------------------------

Fetch a branch in a different git repository, then merge into the
current branch:

-----------------------------------------------
$ git pull git://example.com/project.git theirbranch
-----------------------------------------------

Store the fetched branch into a local branch before merging into the
current branch:

-----------------------------------------------
$ git pull git://example.com/project.git theirbranch:mybranch
-----------------------------------------------

After creating commits on a local branch, update the remote
branch with your commits:

-----------------------------------------------
$ git push ssh://example.com/project.git mybranch:theirbranch
-----------------------------------------------

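The "mybranch:theirbranch" push can be rehearsed against a bare
repository on the local filesystem instead of an ssh URL. In this
sketch the paths "shared.git" and "work" are hypothetical.

```shell
#!/bin/sh
# Sketch only: "shared.git" and "work" are hypothetical paths.
set -e
top=$(mktemp -d)

# A bare repository stands in for the remote.
git init -q --bare "$top/shared.git"

# A working repository with one commit on a branch named "mybranch".
git init -q "$top/work"
cd "$top/work"
git config user.name "Example User"
git config user.email "user@example.com"
echo hello > file.txt
git add file.txt
git commit -q -m "initial commit"
git branch mybranch

# Update (here: create) "theirbranch" in the remote from local "mybranch".
git push -q "$top/shared.git" mybranch:theirbranch

# Both names should now refer to the same commit.
local_tip=$(git rev-parse mybranch)
remote_tip=$(git --git-dir="$top/shared.git" rev-parse theirbranch)
echo "mybranch:    $local_tip"
echo "theirbranch: $remote_tip"
```
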
When remote and local branch are both named "test":

-----------------------------------------------
$ git push ssh://example.com/project.git test
-----------------------------------------------

Shortcut version for a frequently used remote repository:

-----------------------------------------------
$ git remote add example ssh://example.com/project.git
$ git push example test
-----------------------------------------------

[[repository-maintenance]]
Repository maintenance
----------------------

Check for corruption:

-----------------------------------------------
$ git fsck
-----------------------------------------------

Recompress, remove unused cruft:

-----------------------------------------------
$ git gc
-----------------------------------------------


[[todo]]
Appendix B: Notes and todo list for this manual
===============================================

This is a work in progress.

The basic requirements:
        - It must be readable in order, from beginning to end, by
          someone intelligent with a basic grasp of the unix
          command line, but without any special knowledge of git. If
          necessary, any other prerequisites should be specifically
          mentioned as they arise.
        - Whenever possible, section headings should clearly describe
          the task they explain how to do, in language that requires
          no more knowledge than necessary: for example, "importing
          patches into a project" rather than "the git-am command".

Think about how to create a clear chapter dependency graph that will
allow people to get to important topics without necessarily reading
everything in between.

Scan Documentation/ for other stuff left out; in particular:
        howto's
        some of technical/?
        hooks
        list of commands in gitlink:git[1]

Scan email archives for other stuff left out.

Scan man pages to see if any assume more background than this manual
provides.

Simplify beginning by suggesting disconnected head instead of
temporary branch creation?

Add more good examples. Entire sections of just cookbook examples
might be a good idea; maybe make an "advanced examples" section a
standard end-of-chapter section?

Include cross-references to the glossary, where appropriate.

Document shallow clones? See draft 1.5.0 release notes for some
documentation.

Add a section on working with other version control systems, including
CVS, Subversion, and just imports of a series of release tarballs.

More details on gitweb?

Write a chapter on using plumbing and writing scripts.

Alternates, clone --reference, etc.

git unpack-objects -r for recovery