gittutorial-2(7)
================

NAME
----
gittutorial-2 - A tutorial introduction to Git: part two

SYNOPSIS
--------
[verse]
git *

DESCRIPTION
-----------

You should work through linkgit:gittutorial[7] before reading this tutorial.

The goal of this tutorial is to introduce two fundamental pieces of
Git's architecture--the object database and the index file--and to
provide the reader with everything necessary to understand the rest
of the Git documentation.

The Git object database
-----------------------

Let's start a new project and create a small amount of history:

------------------------------------------------
$ mkdir test-project
$ cd test-project
$ git init
Initialized empty Git repository in .git/
$ echo 'hello world' > file.txt
$ git add .
$ git commit -a -m "initial commit"
[master (root-commit) 54196cc] initial commit
 1 file changed, 1 insertion(+)
 create mode 100644 file.txt
$ echo 'hello world!' >file.txt
$ git commit -a -m "add emphasis"
[master c4d59f3] add emphasis
 1 file changed, 1 insertion(+), 1 deletion(-)
------------------------------------------------

What are the 7 digits of hex that Git responded to the commit with?

We saw in part one of the tutorial that commits have names like this.
It turns out that every object in the Git history is stored under
a 40-digit hex name. That name is the SHA-1 hash of the object's
contents; among other things, this ensures that Git will never store
the same data twice (since identical data is given an identical SHA-1
name), and that the contents of a Git object will never change (since
that would change the object's name as well). The 7-character hex
strings shown here are simply abbreviations of such 40-character
names. An abbreviation can be used wherever the full 40-character
name can be used, so long as it is unambiguous.

The commit objects you created while following the example above
will have different SHA-1 hashes from the ones shown, because a
commit object records the time when it was created and the name of
the person performing the commit.

We can ask Git about this particular object with the `cat-file`
command. Don't copy the 40 hex digits from this example; use those
from your own version. Note that you can shorten the name to only a
few characters to save yourself typing all 40 hex digits:

------------------------------------------------
$ git cat-file -t 54196cc2
commit
$ git cat-file commit 54196cc2
tree 92b8b694ffb1675e5975148e1121810081dbdffe
author J. Bruce Fields <bfields@puzzle.fieldses.org> 1143414668 -0500
committer J. Bruce Fields <bfields@puzzle.fieldses.org> 1143414668 -0500

initial commit
------------------------------------------------

A tree can refer to one or more "blob" objects, each corresponding to
a file. In addition, a tree can also refer to other tree objects,
thus creating a directory hierarchy. You can examine the contents of
any tree using ls-tree (remember that a long enough initial portion
of the SHA-1 will also work):

------------------------------------------------
$ git ls-tree 92b8b694
100644 blob 3b18e512dba79e4c8300dd08aeb37f8e728b8dad    file.txt
------------------------------------------------

Thus we see that this tree has one file in it. The SHA-1 hash is a
reference to that file's data:

------------------------------------------------
$ git cat-file -t 3b18e512
blob
------------------------------------------------

A "blob" is just file data, which we can also examine with cat-file:

------------------------------------------------
$ git cat-file blob 3b18e512
hello world
------------------------------------------------

Note that this is the old file data; the tree named in the first
commit object is thus a snapshot of the directory state that was
recorded by that commit.

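As a cross-check on the naming scheme, the blob name shown by ls-tree
can be recomputed by hand. The following is a minimal sketch, assuming
GNU `sha1sum` is installed and that Git is using its default SHA-1
object format. It relies on Git hashing a blob as the word "blob", a
space, the size in bytes, a NUL byte, and then the contents:

```shell
# Reproduce the blob name for the first version of file.txt by hand.
# A blob is hashed as: "blob <size in bytes>" + NUL byte + contents.
content='hello world'   # echo appended a newline, so 12 bytes in total

# Hash the header and contents ourselves (sha1sum is an assumption here).
by_hand=$(printf 'blob 12\0%s\n' "$content" | sha1sum | cut -d' ' -f1)

# Ask Git to name the same contents without storing anything.
by_git=$(printf '%s\n' "$content" | git hash-object --stdin)

echo "$by_hand"
echo "$by_git"
```

Both commands should print 3b18e512dba79e4c8300dd08aeb37f8e728b8dad,
the name that ls-tree reported for file.txt.
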
All of these objects are stored under their SHA-1 names inside the Git
directory:

------------------------------------------------
$ find .git/objects/
.git/objects/
.git/objects/pack
.git/objects/info
.git/objects/3b
.git/objects/3b/18e512dba79e4c8300dd08aeb37f8e728b8dad
.git/objects/92
.git/objects/92/b8b694ffb1675e5975148e1121810081dbdffe
.git/objects/54
.git/objects/54/196cc2703dc165cbd373a65a4dcf22d50ae7f7
.git/objects/a0
.git/objects/a0/423896973644771497bdc03eb99d5281615b51
.git/objects/d0
.git/objects/d0/492b368b66bdabf2ac1fd8c92b39d3db916e59
.git/objects/c4
.git/objects/c4/d59f390b9cfd4318117afde11d601c1085f241
------------------------------------------------

and the contents of these files are just the compressed data plus a
header identifying their length and their type. The type is either a
blob, a tree, a commit, or a tag.

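The type and length recorded in an object's header can be read back
with cat-file, without decompressing the file by hand. This is only a
sketch: it works in a throwaway repository created with `mktemp` (an
assumption of this example, not part of test-project) so that nothing
in your own project is touched:

```shell
# Create a scratch repository and store one blob in it.
repo=$(mktemp -d)
cd "$repo"
git init -q

# -w writes the blob into .git/objects and prints its name.
blob=$(printf 'hello world\n' | git hash-object -w --stdin)

obj_type=$(git cat-file -t "$blob")   # type field of the header
obj_size=$(git cat-file -s "$blob")   # length field of the header, in bytes

echo "$blob is a $obj_type of $obj_size bytes"

# The object is stored in a file named after its SHA-1 name.
find .git/objects -type f
```

The reported size is that of the uncompressed contents, not of the
file on disk.
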
The simplest commit to find is the HEAD commit, which we can find
from .git/HEAD:

------------------------------------------------
$ cat .git/HEAD
ref: refs/heads/master
------------------------------------------------

As you can see, this tells us which branch we're currently on, and it
tells us this by naming a file under the .git directory, which itself
contains a SHA-1 name referring to a commit object, which we can
examine with cat-file:

------------------------------------------------
$ cat .git/refs/heads/master
c4d59f390b9cfd4318117afde11d601c1085f241
$ git cat-file -t c4d59f39
commit
$ git cat-file commit c4d59f39
tree d0492b368b66bdabf2ac1fd8c92b39d3db916e59
parent 54196cc2703dc165cbd373a65a4dcf22d50ae7f7
author J. Bruce Fields <bfields@puzzle.fieldses.org> 1143418702 -0500
committer J. Bruce Fields <bfields@puzzle.fieldses.org> 1143418702 -0500

add emphasis
------------------------------------------------

The "tree" object here refers to the new state of the tree:

------------------------------------------------
$ git ls-tree d0492b36
100644 blob a0423896973644771497bdc03eb99d5281615b51    file.txt
$ git cat-file blob a0423896
hello world!
------------------------------------------------

and the "parent" object refers to the previous commit:

------------------------------------------------
$ git cat-file commit 54196cc2
tree 92b8b694ffb1675e5975148e1121810081dbdffe
author J. Bruce Fields <bfields@puzzle.fieldses.org> 1143414668 -0500
committer J. Bruce Fields <bfields@puzzle.fieldses.org> 1143414668 -0500

initial commit
------------------------------------------------

The tree object is the tree we examined first, and this commit is
unusual in that it lacks any parent.

Most commits have only one parent, but it is also common for a commit
to have multiple parents. In that case the commit represents a
merge, with the parent references pointing to the heads of the merged
branches.

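To see a commit with more than one parent, we can build a tiny merge
in a scratch repository. This is a sketch under stated assumptions:
the branch name `topic`, the file names, and the author identity are
all made up for the example, and the repository lives in a temporary
directory rather than in test-project:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Helper that commits with a placeholder identity (hypothetical values).
commit() {
    git -c user.name='A U Thor' -c user.email='author@example.com' \
        commit -q "$@"
}

echo base >base.txt
git add base.txt
commit -m "base"
main_branch=$(git symbolic-ref --short HEAD)   # whatever "git init" chose

# Make the two branches diverge by one commit each.
git checkout -q -b topic
echo topic >topic.txt
git add topic.txt
commit -m "topic work"

git checkout -q "$main_branch"
echo main >main.txt
git add main.txt
commit -m "main work"

git -c user.name='A U Thor' -c user.email='author@example.com' \
    merge -q --no-ff -m "merge topic" topic

# Prints the merge commit's name followed by the names of its parents.
parents_line=$(git rev-list --parents -n 1 HEAD)
echo "$parents_line"
```

The line printed at the end should contain three names: the merge
commit itself, the previous head of the current branch, and the head
of `topic`.
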
Besides blobs, trees, and commits, the only remaining type of object
is a "tag", which we won't discuss here; refer to linkgit:git-tag[1]
for details.

So now we know how Git uses the object database to represent a
project's history:

  * "commit" objects refer to "tree" objects representing the
    snapshot of a directory tree at a particular point in the
    history, and refer to "parent" commits to show how they're
    connected into the project history.
  * "tree" objects represent the state of a single directory,
    associating names in that directory with "blob" objects
    containing file data and "tree" objects containing subdirectory
    information.
  * "blob" objects contain file data without any other structure.
  * References to commit objects at the head of each branch are
    stored in files under .git/refs/heads/.
  * The name of the current branch is stored in .git/HEAD.

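Reading .git/HEAD and the files under .git/refs/heads/ directly works
in a fresh repository, but refs can also be stored in packed form, so
it is usually safer to ask Git for the same information. The sketch
below does that with `git symbolic-ref` and `git rev-parse`; the
scratch repository and the author identity are assumptions made for
the example:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
echo 'hello world' >file.txt
git add file.txt
git -c user.name='A U Thor' -c user.email='author@example.com' \
    commit -q -m "initial commit"

branch_ref=$(git symbolic-ref HEAD)        # the ref that HEAD names
head_commit=$(git rev-parse HEAD)          # full name of the HEAD commit
ref_commit=$(git rev-parse "$branch_ref")  # commit the branch ref points at

echo "$branch_ref -> $head_commit"
git cat-file -t "$head_commit"
```

The two rev-parse calls resolve to the same commit, which is the
HEAD-to-branch-to-commit chain summarized in the list above.
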
Note, by the way, that lots of commands take a tree as an argument.
But as we can see above, a tree can be referred to in many different
ways--by the SHA-1 name for that tree, by the name of a commit that
refers to the tree, by the name of a branch whose head refers to that
tree, etc.--and most such commands can accept any of these names.

In command synopses, the word "tree-ish" is sometimes used to
designate such an argument.

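The following sketch illustrates the point with ls-tree, which is
handed the same tree three ways: by the tree's own name, by a commit
name, and by a branch name. The scratch repository and the author
identity are assumptions made for the example:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
echo 'hello world' >file.txt
git add file.txt
git -c user.name='A U Thor' -c user.email='author@example.com' \
    commit -q -m "initial commit"

branch=$(git symbolic-ref --short HEAD)   # current branch name
commit_name=$(git rev-parse HEAD)         # name of the commit object
tree_name=$(git rev-parse 'HEAD^{tree}')  # name of that commit's tree

via_tree=$(git ls-tree "$tree_name")
via_commit=$(git ls-tree "$commit_name")
via_branch=$(git ls-tree "$branch")

echo "$via_tree"
```

All three invocations list the same single entry for file.txt, since
each argument ultimately names the same tree.
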
The index file
--------------

The primary tool we've been using to create commits is
`git commit -a`, which creates a commit including every change you've
made to your working tree. But what if you want to commit changes only
to certain files? Or only certain changes to certain files?

If we look at the way commits are created under the covers, we'll see
that there are more flexible ways of creating commits.

Continuing with our test-project, let's modify file.txt again:

------------------------------------------------
$ echo "hello world, again" >>file.txt
------------------------------------------------

but this time instead of immediately making the commit, let's take an
intermediate step, and ask for diffs along the way to keep track of
what's happening:

------------------------------------------------
$ git diff
--- a/file.txt
+++ b/file.txt
@@ -1 +1,2 @@
 hello world!
+hello world, again
$ git add file.txt
$ git diff
------------------------------------------------

The last diff is empty, but no new commits have been made, and the
head still doesn't contain the new line:

------------------------------------------------
$ git diff HEAD
diff --git a/file.txt b/file.txt
index a042389..513feba 100644
--- a/file.txt
+++ b/file.txt
@@ -1 +1,2 @@
 hello world!
+hello world, again
------------------------------------------------

So 'git diff' is comparing against something other than the head.
The thing that it's comparing against is actually the index file,
which is stored in .git/index in a binary format, but whose contents
we can examine with ls-files:

------------------------------------------------
$ git ls-files --stage
100644 513feba2e53ebbd2532419ded848ba19de88ba00 0       file.txt
$ git cat-file -t 513feba2
blob
$ git cat-file blob 513feba2
hello world!
hello world, again
------------------------------------------------

So what our 'git add' did was store a new blob and then put
a reference to it in the index file. If we modify the file again,
we'll see that the new modifications are reflected in the 'git diff'
output:

------------------------------------------------
$ echo 'again?' >>file.txt
$ git diff
diff --git a/file.txt b/file.txt
index 513feba..ba3da7b 100644
--- a/file.txt
+++ b/file.txt
@@ -1,2 +1,3 @@
 hello world!
 hello world, again
+again?
------------------------------------------------

With the right arguments, 'git diff' can also show us the difference
between the working directory and the last commit, or between the
index and the last commit:

------------------------------------------------
$ git diff HEAD
diff --git a/file.txt b/file.txt
index a042389..ba3da7b 100644
--- a/file.txt
+++ b/file.txt
@@ -1 +1,3 @@
 hello world!
+hello world, again
+again?
$ git diff --cached
diff --git a/file.txt b/file.txt
index a042389..513feba 100644
--- a/file.txt
+++ b/file.txt
@@ -1 +1,2 @@
 hello world!
+hello world, again
------------------------------------------------

At any time, we can create a new commit using 'git commit' (without
the "-a" option), and verify that the state committed only includes the
changes stored in the index file, not the additional change that is
still only in our working tree:

------------------------------------------------
$ git commit -m "repeat"
$ git diff HEAD
diff --git a/file.txt b/file.txt
index 513feba..ba3da7b 100644
--- a/file.txt
+++ b/file.txt
@@ -1,2 +1,3 @@
 hello world!
 hello world, again
+again?
------------------------------------------------

So by default 'git commit' uses the index to create the commit, not
the working tree; the "-a" option to commit tells it to first update
the index with all changes in the working tree.

Finally, it's worth looking at the effect of 'git add' on the index
file:

------------------------------------------------
$ echo "goodbye, world" >closing.txt
$ git add closing.txt
------------------------------------------------

The effect of the 'git add' was to add one entry to the index file:

------------------------------------------------
$ git ls-files --stage
100644 8b9743b20d4b15be3955fc8d5cd2b09cd2336138 0       closing.txt
100644 513feba2e53ebbd2532419ded848ba19de88ba00 0       file.txt
------------------------------------------------

And, as you can see with cat-file, this new entry refers to the
current contents of the file:

------------------------------------------------
$ git cat-file blob 8b9743b2
goodbye, world
------------------------------------------------

The "status" command is a useful way to get a quick summary of the
situation:

------------------------------------------------
$ git status
On branch master
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)

	new file:   closing.txt

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)

	modified:   file.txt

------------------------------------------------

Since the current state of closing.txt is cached in the index file,
it is listed under "Changes to be committed". Since file.txt has
changes in the working directory that aren't reflected in the index,
it is listed under "Changes not staged for commit". At this point,
running "git commit" would create a commit that added closing.txt
(with its new contents), but that didn't modify file.txt.

Also, note that a bare `git diff` shows the changes to file.txt, but
not the addition of closing.txt, because the version of closing.txt
in the index file is identical to the one in the working directory.

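The two halves of the status report can be reproduced with the diff
commands introduced earlier. The sketch below rebuilds a comparable
situation in a scratch repository (the temporary directory and the
author identity are assumptions made for the example) and lists which
files differ between the index and HEAD, and between the working tree
and the index:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
echo 'hello world!' >file.txt
git add file.txt
git -c user.name='A U Thor' -c user.email='author@example.com' \
    commit -q -m "add emphasis"

echo 'again?' >>file.txt               # modified, but not staged
echo 'goodbye, world' >closing.txt
git add closing.txt                    # staged, never committed

staged=$(git diff --cached --name-only)   # index compared with HEAD
unstaged=$(git diff --name-only)          # working tree compared with index

echo "to be committed: $staged"
echo "not staged:      $unstaged"
```

Here closing.txt shows up only in the first list and file.txt only in
the second, matching the two sections printed by "git status".
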
In addition to being the staging area for new commits, the index file
is also populated from the object database when checking out a
branch, and is used to hold the trees involved in a merge operation.
See linkgit:gitcore-tutorial[7] and the relevant man
pages for details.

What next?
----------

At this point you should know everything necessary to read the man
pages for any of the Git commands; one good place to start would be
with the commands mentioned in linkgit:giteveryday[7]. You
should be able to find any unknown jargon in linkgit:gitglossary[7].

The link:user-manual.html[Git User's Manual] provides a more
comprehensive introduction to Git.

linkgit:gitcvs-migration[7] explains how to
import a CVS repository into Git, and shows how to use Git in a
CVS-like way.

For some interesting examples of Git use, see the
link:howto-index.html[howtos].

For Git developers, linkgit:gitcore-tutorial[7] goes
into detail on the lower-level Git mechanisms involved in, for
example, creating a new commit.

SEE ALSO
--------
linkgit:gittutorial[7],
linkgit:gitcvs-migration[7],
linkgit:gitcore-tutorial[7],
linkgit:gitglossary[7],
linkgit:git-help[1],
linkgit:giteveryday[7],
link:user-manual.html[The Git User's Manual]

GIT
---
Part of the linkgit:git[1] suite