Commit | Line | Data |
---|---|---|
8c7fa247 LT |
1 | A short git tutorial |
2 | ==================== | |
3 | May 2005 | |
4 | ||
5 | ||
6 | Introduction | |
7 | ------------ | |
8 | ||
9 | This is trying to be a short tutorial on setting up and using a git | |
10 | archive, mainly because being hands-on and using explicit examples is | |
11 | often the best way of explaining what is going on. | |
12 | ||
13 | In normal life, most people wouldn't use the "core" git programs | |
14 | directly, but rather script around them to make them more palatable. | |
15 | Understanding the core git stuff may help some people get those scripts | |
16 | done, though, and it may also be instructive in helping people | |
17 | understand what it is that the higher-level helper scripts are actually | |
18 | doing. | |
19 | ||
20 | The core git is often called "plumbing", with the prettier user | |
f35ca9ed LT |
21 | interfaces on top of it called "porcelain". You may not want to use the |
22 | plumbing directly very often, but it can be good to know what the | |
23 | plumbing does for when the porcelain isn't flushing... | |
8c7fa247 LT |
24 | |
25 | ||
26 | Creating a git archive | |
27 | ---------------------- | |
28 | ||
29 | Creating a new git archive couldn't be easier: all git archives start | |
30 | out empty, and the only thing you need to do is find yourself a | |
31 | subdirectory that you want to use as a working tree - either an empty | |
32 | one for a totally new project, or an existing working tree that you want | |
33 | to import into git. | |
34 | ||
837eedf4 | 35 | For our first example, we're going to start a totally new archive from |
8c7fa247 LT |
36 | scratch, with no pre-existing files, and we'll call it "git-tutorial". |
37 | To start up, create a subdirectory for it, change into that | |
38 | subdirectory, and initialize the git infrastructure with "git-init-db": | |
39 | ||
40 | mkdir git-tutorial | |
41 | cd git-tutorial | |
42 | git-init-db | |
43 | ||
44 | to which git will reply | |
45 | ||
46 | defaulting to local storage area | |
47 | ||
837eedf4 | 48 | which is just git's way of saying that you haven't been doing anything |
8c7fa247 LT |
49 | strange, and that it will have created a local .git directory setup for |
50 | your new project. You will now have a ".git" directory, and you can | |
51 | inspect that with "ls". For your new empty project, ls should show you | |
52 | three entries: | |
53 | ||
54 | - a symlink called HEAD, pointing to "refs/heads/master" | |
55 | ||
56 | Don't worry about the fact that the file that the HEAD link points to | |
837eedf4 | 57 | doesn't even exist yet - you haven't created the commit that will |
8c7fa247 LT |
58 | start your HEAD development branch yet. |
59 | ||
60 | - a subdirectory called "objects", which will contain all the git SHA1 | |
61 | objects of your project. You should never have any real reason to | |
62 | look at the objects directly, but you might want to know that these | |
63 | objects are what contains all the real _data_ in your repository. | |
64 | ||
65 | - a subdirectory called "refs", which contains references to objects. | |
66 | ||
67 | In particular, the "refs" subdirectory will contain two other | |
68 | subdirectories, named "heads" and "tags" respectively. They do | |
69 | exactly what their names imply: they contain references to any number | |
70 | of different "heads" of development (aka "branches"), and to any | |
71 | "tags" that you have created to name specific versions of your | |
72 | repository. | |
73 | ||
74 | One note: the special "master" head is the default branch, which is | |
75 | why the .git/HEAD file was created as a symlink to it even if it | |
837eedf4 | 76 | doesn't yet exist. Basically, the HEAD link is supposed to always |
8c7fa247 LT |
77 | point to the branch you are working on right now, and you always |
78 | start out expecting to work on the "master" branch. | |
79 | ||
80 | However, this is only a convention, and you can name your branches | |
81 | anything you want, and don't have to ever even _have_ a "master" | |
82 | branch. A number of the git tools will assume that .git/HEAD is | |
83 | valid, though. | |
84 | ||
85 | [ Implementation note: an "object" is identified by its 160-bit SHA1 | |
86 | hash, aka "name", and a reference to an object is always the 40-byte | |
87 | hex representation of that SHA1 name. The files in the "refs" | |
88 | subdirectory are expected to contain these hex references (usually | |
89 | with a final '\n' at the end), and you should thus expect to see a | |
90 | number of 41-byte files containing these references in these refs | |
91 | subdirectories when you actually start populating your tree ] | |
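The byte count in the note is easy to check by hand. This is a minimal sketch that assumes nothing beyond a POSIX shell; the scratch directory and the file name "example" are made up for the illustration, and the hex string is only a sample 40-character name (it happens to be a blob name used later in this tutorial, whereas a real head would name a commit):

```shell
# Hypothetical scratch directory standing in for .git/refs/heads
refs_dir=$(mktemp -d)

# Sample 40-character hex name (any SHA1 name has this length)
name=557db03de997c86a4a028e1ebd3a1ceb225be238

# Write the name followed by a single newline, as the note describes
printf '%s\n' "$name" > "$refs_dir/example"

# Count the bytes: 40 hex digits plus the final '\n'
ref_size=$(wc -c < "$refs_dir/example" | tr -d ' ')
echo "$ref_size"
```

The count printed should be 41, which is the file size the note tells you to expect.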
92 | ||
93 | You have now created your first git archive. Of course, since it's | |
94 | empty, that's not very useful, so let's start populating it with data. | |
95 | ||
96 | ||
97 | Populating a git archive | |
98 | ------------------------ | |
99 | ||
100 | We'll keep this simple and stupid, so we'll start off with populating a | |
101 | few trivial files just to get a feel for it. | |
102 | ||
103 | Start off with just creating any random files that you want to maintain | |
104 | in your git archive. We'll start off with a few bad examples, just to | |
105 | get a feel for how this works: | |
106 | ||
107 | echo "Hello World" > a | |
108 | echo "Silly example" > b | |
109 | ||
110 | you have now created two files in your working directory, but to | |
111 | actually check in your hard work, you will have to go through two steps: | |
112 | ||
113 | - fill in the "cache" aka "index" file with the information about your | |
114 | working directory state | |
115 | ||
116 | - commit that index file as an object. | |
117 | ||
118 | The first step is trivial: when you want to tell git about any changes | |
119 | to your working directory, you use the "git-update-cache" program. That | |
120 | program normally just takes a list of filenames you want to update, but | |
121 | to avoid trivial mistakes, it refuses to add new entries to the cache | |
122 | (or remove existing ones) unless you explicitly tell it that you're | |
123 | adding a new entry with the "--add" flag (or removing an entry with the | |
124 | "--remove" flag). | |
125 | ||
126 | So to populate the index with the two files you just created, you can do | |
127 | ||
128 | git-update-cache --add a b | |
129 | ||
130 | and you have now told git to track those two files. | |
131 | ||
132 | In fact, as you did that, if you now look into your object directory, | |
837eedf4 | 133 | you'll notice that git will have added two new objects to the object |
8c7fa247 LT |
134 | store. If you did exactly the steps above, you should now be able to do |
135 | ||
136 | ls .git/objects/??/* | |
137 | ||
138 | and see two files: | |
139 | ||
140 | .git/objects/55/7db03de997c86a4a028e1ebd3a1ceb225be238 | |
141 | .git/objects/f2/4c74a2e500f5ee1332c86b94199f52b1d1d962 | |
142 | ||
143 | which correspond to the objects with SHA1 names 557db... and f24c7... | |
144 | respectively. | |
145 | ||
146 | If you want to, you can use "git-cat-file" to look at those objects, but | |
147 | you'll have to use the object name, not the filename of the object: | |
148 | ||
149 | git-cat-file -t 557db03de997c86a4a028e1ebd3a1ceb225be238 | |
150 | ||
151 | where the "-t" tells git-cat-file to tell you what the "type" of the | |
152 | object is. Git will tell you that you have a "blob" object (ie just a | |
153 | regular file), and you can see the contents with | |
154 | ||
155 | git-cat-file "blob" 557db03de997c86a4a028e1ebd3a1ceb225be238 | |
156 | ||
157 | which will print out "Hello World". The object 557db... is nothing | |
158 | more than the contents of your file "a". | |
159 | ||
160 | [ Digression: don't confuse that object with the file "a" itself. The | |
81bb573e LT |
161 | object is literally just those specific _contents_ of the file, and |
162 | however much you later change the contents in file "a", the object we | |
163 | just looked at will never change. Objects are immutable. ] | |
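If you are curious where those names come from, you can reproduce them without git at all. The sketch below assumes the object format that current git uses (the SHA1 is taken over a short header, "blob", a space, the size in bytes and a NUL byte, followed by the file contents) and assumes that "sha1sum" from GNU coreutils is installed. The helper "blob_name" is made up for this example and is not a git command:

```shell
# Compute a blob's object name by hand: SHA1 over "blob <size>\0" plus the contents.
# blob_name is a hypothetical helper for this sketch, not part of git.
blob_name() {
    content_file=$1
    size=$(wc -c < "$content_file" | tr -d ' ')
    { printf 'blob %s\0' "$size"; cat "$content_file"; } | sha1sum | cut -d ' ' -f 1
}

# Recreate the two tutorial files in a scratch directory
tmp=$(mktemp -d)
echo "Hello World" > "$tmp/a"
echo "Silly example" > "$tmp/b"

name_a=$(blob_name "$tmp/a")
name_b=$(blob_name "$tmp/b")
echo "$name_a"
echo "$name_b"
```

The two lines printed should match the 557db... and f24c7... names listed above. Since the name depends only on the contents, changing "a" later gives a new name and leaves the old object alone.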
8c7fa247 LT |
164 | |
165 | Anyway, as we mentioned previously, you normally never actually take a | |
166 | look at the objects themselves, and typing long 40-character hex SHA1 | |
167 | names is not something you'd normally want to do. The above digression | |
168 | was just to show that "git-update-cache" did something magical, and | |
169 | actually saved away the contents of your files into the git content | |
170 | store. | |
171 | ||
172 | Updating the cache did something else too: it created a ".git/index" | |
173 | file. This is the index that describes your current working tree, and | |
174 | something you should be very aware of. Again, you normally never worry | |
175 | about the index file itself, but you should be aware of the fact that | |
176 | you have not actually really "checked in" your files into git so far, | |
177 | you've only _told_ git about them. | |
178 | ||
f35ca9ed | 179 | However, since git knows about them, you can now start using some of the |
8c7fa247 LT |
180 | most basic git commands to manipulate the files or look at their status. |
181 | ||
182 | In particular, let's not even check in the two files into git yet, we'll | |
183 | start off by adding another line to "a" first: | |
184 | ||
185 | echo "It's a new day for git" >> a | |
186 | ||
187 | and you can now, since you told git about the previous state of "a", ask | |
188 | git what has changed in the tree compared to your old index, using the | |
189 | "git-diff-files" command: | |
190 | ||
191 | git-diff-files | |
192 | ||
193 | oops. That wasn't very readable. It just spit out its own internal | |
194 | version of a "diff", but that internal version really just tells you | |
195 | that it has noticed that "a" has been modified, and that the old object | |
196 | contents it had have been replaced with something else. | |
197 | ||
198 | To make it readable, we can tell git-diff-files to output the | |
199 | differences as a patch, using the "-p" flag: | |
200 | ||
201 | git-diff-files -p | |
202 | ||
203 | which will spit out | |
204 | ||
205 | diff --git a/a b/a | |
206 | --- a/a | |
207 | +++ b/a | |
208 | @@ -1 +1,2 @@ | |
209 | Hello World | |
210 | +It's a new day for git | |
211 | ||
212 | ie the diff of the change we caused by adding another line to "a". | |
213 | ||
214 | In other words, git-diff-files always shows us the difference between | |
215 | what is recorded in the index, and what is currently in the working | |
216 | tree. That's very useful. | |
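For readers following along with a current git installation, here is the same sequence as a rough sketch. It is an assumption of this example that you have a recent git, where the plumbing is spelled "git update-index" and "git diff-files" rather than the 2005 names used in the text; the scratch directory is made up:

```shell
# Assumes a current git; command names differ from the 2005 ones in the text.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

echo "Hello World" > a
echo "Silly example" > b
git update-index --add a b        # modern spelling of git-update-cache --add

# Change the working tree without telling the index about it
echo "It's a new day for git" >> a

# Index versus working tree, shown as a patch
patch=$(git diff-files -p)
echo "$patch"
```

The patch should contain one added line for "a" and nothing for "b", since only "a" differs from what the index recorded.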
217 | ||
ed616049 LT |
218 | A common shorthand for "git-diff-files -p" is to just write |
219 | ||
220 | git diff | |
221 | ||
222 | which will do the same thing. | |
223 | ||
8c7fa247 LT |
224 | |
225 | Committing git state | |
226 | -------------------- | |
227 | ||
228 | Now, we want to go to the next stage in git, which is to take the files | |
229 | that git knows about in the index, and commit them as a real tree. We do | |
230 | that in two phases: creating a "tree" object, and committing that "tree" | |
231 | object as a "commit" object together with an explanation of what the | |
232 | tree was all about, along with information of how we came to that state. | |
233 | ||
234 | Creating a tree object is trivial, and is done with "git-write-tree". | |
235 | There are no options or other input: git-write-tree will take the | |
236 | current index state, and write an object that describes that whole | |
237 | index. In other words, we're now tying together all the different | |
238 | filenames with their contents (and their permissions), and we're | |
239 | creating the equivalent of a git "directory" object: | |
240 | ||
241 | git-write-tree | |
242 | ||
243 | and this will just output the name of the resulting tree, in this case | |
244 | (if you have done exactly as I've described) it should be | |
245 | ||
246 | 3ede4ed7e895432c0a247f09d71a76db53bd0fa4 | |
247 | ||
248 | which is another incomprehensible object name. Again, if you want to, | |
249 | you can use "git-cat-file -t 3ede4.." to see that this time the object | |
250 | is not a "blob" object, but a "tree" object (you can also use | |
251 | git-cat-file to actually output the raw object contents, but you'll see | |
252 | mainly a binary mess, so that's less interesting). | |
253 | ||
254 | However - normally you'd never use "git-write-tree" on its own, because | |
255 | normally you always commit a tree into a commit object using the | |
256 | "git-commit-tree" command. In fact, it's easier to not actually use | |
257 | git-write-tree on its own at all, but to just pass its result in as an | |
258 | argument to "git-commit-tree". | |
259 | ||
260 | "git-commit-tree" normally takes several arguments - it wants to know | |
261 | what the _parent_ of a commit was, but since this is the first commit | |
262 | ever in this new archive, and it has no parents, we only need to pass in | |
263 | the tree ID. However, git-commit-tree also wants to get a commit message | |
264 | on its standard input, and it will write out the resulting ID for the | |
265 | commit to its standard output. | |
266 | ||
267 | And this is where we start using the .git/HEAD file. The HEAD file is | |
268 | supposed to contain the reference to the top-of-tree, and since that's | |
269 | exactly what git-commit-tree spits out, we can do this all with a simple | |
270 | shell pipeline: | |
271 | ||
272 | echo "Initial commit" | git-commit-tree $(git-write-tree) > .git/HEAD | |
273 | ||
274 | which will say: | |
275 | ||
276 | Committing initial tree 3ede4ed7e895432c0a247f09d71a76db53bd0fa4 | |
277 | ||
278 | just to warn you about the fact that it created a totally new commit | |
279 | that is not related to anything else. Normally you do this only _once_ | |
280 | for a project ever, and all later commits will be parented on top of an | |
281 | earlier commit, and you'll never see this "Committing initial tree" | |
282 | message ever again. | |
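The same two-phase commit can be sketched with a current git. The assumptions here are loud ones: the command names are the modern spellings, the identity values are placeholders, and because current git keeps HEAD as a symbolic reference rather than a symlink, the sketch uses "git update-ref" instead of redirecting output into .git/HEAD:

```shell
# Assumes a current git; not the exact 2005 commands from the text.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
echo "Hello World" > a
echo "Silly example" > b
git update-index --add a b

# Placeholder identity so that commit-tree does not depend on local configuration
export GIT_AUTHOR_NAME="A U Thor" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="A U Thor" GIT_COMMITTER_EMAIL="author@example.com"

# Phase one: write the index out as a tree object
tree=$(git write-tree)

# Phase two: wrap the tree in a commit, with the message on standard input
commit=$(echo "Initial commit" | git commit-tree "$tree")

# Point the branch that HEAD names at the new commit
git update-ref HEAD "$commit"

git cat-file -t "$tree"
git cat-file -t "$commit"
```

The last two commands report the object types, so you can see that one name refers to a tree and the other to a commit.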
283 | ||
ed616049 LT |
284 | Again, normally you'd never actually do this by hand. There is a |
285 | helpful script called "git commit" that will do all of this for you. So | |
286 | you could have just written | |
287 | ||
288 | git commit | |
289 | ||
290 | instead, and it would have done the above magic scripting for you. | |
291 | ||
8c7fa247 LT |
292 | |
293 | Making a change | |
294 | --------------- | |
295 | ||
296 | Remember how we did the "git-update-cache" on file "a" and then we | |
837eedf4 | 297 | changed "a" afterward, and could compare the new state of "a" with the |
8c7fa247 LT |
298 | state we saved in the index file? |
299 | ||
300 | Further, remember how I said that "git-write-tree" writes the contents | |
301 | of the _index_ file to the tree, and thus what we just committed was in | |
302 | fact the _original_ contents of the file "a", not the new ones. We did | |
303 | that on purpose, to show the difference between the index state, and the | |
304 | state in the working directory, and how they don't have to match, even | |
305 | when we commit things. | |
306 | ||
307 | As before, if we do "git-diff-files -p" in our git-tutorial project, | |
308 | we'll still see the same difference we saw last time: the index file | |
309 | hasn't changed by the act of committing anything. However, now that we | |
310 | have committed something, we can also learn to use a new command: | |
311 | "git-diff-cache". | |
312 | ||
313 | Unlike "git-diff-files", which showed the difference between the index | |
314 | file and the working directory, "git-diff-cache" shows the differences | |
a7b20909 LT |
315 | between a committed _tree_ and either the index file or the working |
316 | directory. In other words, git-diff-cache wants a tree to be diffed | |
317 | against, and before we did the commit, we couldn't do that, because we | |
318 | didn't have anything to diff against. | |
8c7fa247 LT |
319 | |
320 | But now we can do | |
321 | ||
322 | git-diff-cache -p HEAD | |
323 | ||
324 | (where "-p" has the same meaning as it did in git-diff-files), and it | |
325 | will show us the same difference, but for a totally different reason. | |
a7b20909 LT |
326 | Now we're comparing the working directory not against the index file, |
327 | but against the tree we just wrote. It just so happens that those two | |
328 | are obviously the same, so we get the same result. | |
329 | ||
ed616049 LT |
330 | Again, because this is a common operation, you can also just shorthand |
331 | it with | |
332 | ||
333 | git diff HEAD | |
334 | ||
335 | which ends up doing the above for you. | |
336 | ||
a7b20909 LT |
337 | In other words, "git-diff-cache" normally compares a tree against the |
338 | working directory, but when given the "--cached" flag, it is told to | |
339 | instead compare against just the index cache contents, and ignore the | |
340 | current working directory state entirely. Since we just wrote the index | |
341 | file to HEAD, doing "git-diff-cache --cached -p HEAD" should thus return | |
342 | an empty set of differences, and that's exactly what it does. | |
343 | ||
344 | [ Digression: "git-diff-cache" really always uses the index for its | |
345 | comparisons, and saying that it compares a tree against the working | |
346 | directory is thus not strictly accurate. In particular, the list of | |
347 | files to compare (the "meta-data") _always_ comes from the index file, | |
348 | regardless of whether the --cached flag is used or not. The --cached | |
349 | flag really only determines whether the file _contents_ to be compared | |
350 | come from the working directory or not. | |
351 | ||
352 | This is not hard to understand, as soon as you realize that git simply | |
353 | never knows (or cares) about files that it is not told about | |
354 | explicitly. Git will never go _looking_ for files to compare, it | |
355 | expects you to tell it what the files are, and that's what the index | |
356 | is there for. ] | |
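To see the effect of the "--cached" flag for yourself, here is a small sketch. It assumes a current git, where "git-diff-cache" is spelled "git diff-index"; the identity values and the scratch directory are placeholders invented for the example:

```shell
# Assumes a current git; "git diff-index" is the modern spelling of git-diff-cache.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
export GIT_AUTHOR_NAME="A U Thor" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="A U Thor" GIT_COMMITTER_EMAIL="author@example.com"

# Commit the original contents of "a"
echo "Hello World" > a
git update-index --add a
git update-ref HEAD "$(echo "Initial commit" | git commit-tree "$(git write-tree)")"

# Change the working file, but leave the index alone
echo "It's a new day for git" >> a

# Tree versus working directory contents
worktree_diff=$(git diff-index -p HEAD)

# Tree versus index contents only
cached_diff=$(git diff-index --cached -p HEAD)

echo "$worktree_diff"
echo "cached diff is ${#cached_diff} characters long"
```

The first diff shows the added line, while the "--cached" one is empty, because the index still holds exactly what was committed.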
8c7fa247 LT |
357 | |
358 | However, our next step is to commit the _change_ we did, and again, to | |
837eedf4 | 359 | understand what's going on, keep in mind the difference between "working |
8c7fa247 LT |
360 | directory contents", "index file" and "committed tree". We have changes |
361 | in the working directory that we want to commit, and we always have to | |
362 | work through the index file, so the first thing we need to do is to | |
363 | update the index cache: | |
364 | ||
365 | git-update-cache a | |
366 | ||
367 | (note how we didn't need the "--add" flag this time, since git knew | |
368 | about the file already). | |
369 | ||
370 | Note what happens to the different git-diff-xxx versions here. After | |
371 | we've updated "a" in the index, "git-diff-files -p" now shows no | |
372 | differences, but "git-diff-cache -p HEAD" still _does_ show that the | |
373 | current state is different from the state we committed. In fact, now | |
374 | "git-diff-cache" shows the same difference whether we use the "--cached" | |
375 | flag or not, since now the index is coherent with the working directory. | |
376 | ||
377 | Now, since we've updated "a" in the index, we can commit the new | |
ed616049 LT |
378 | version. We could do it by writing the tree by hand again, and |
379 | committing the tree (this time we'd have to use the "-p HEAD" flag to | |
380 | tell commit that the HEAD was the _parent_ of the new commit, and that | |
381 | this wasn't an initial commit any more), but you've done that once | |
382 | already, so let's just use the helpful script this time: | |
8c7fa247 | 383 | |
81bb573e | 384 | git commit |
8c7fa247 | 385 | |
ed616049 LT |
386 | which starts an editor for you to write the commit message and tells you |
387 | a bit about what you're doing. | |
388 | ||
8c7fa247 LT |
389 | Write whatever message you want, and all the lines that start with '#' |
390 | will be pruned out, and the rest will be used as the commit message for | |
391 | the change. If you decide you don't want to commit anything after all at | |
392 | this point (you can continue to edit things and update the cache), you | |
393 | can just leave an empty message. Otherwise git-commit-script will commit | |
394 | the change for you. | |
395 | ||
8c7fa247 LT |
396 | You've now made your first real git commit. And if you're interested in |
397 | looking at what git-commit-script really does, feel free to investigate: | |
398 | it's a few very simple shell scripts to generate the helpful (?) commit | |
399 | message headers, and a few one-liners that actually do the commit itself. | |
400 | ||
401 | ||
402 | Checking it out | |
403 | --------------- | |
404 | ||
405 | While creating changes is useful, it's even more useful if you can tell | |
406 | later what changed. The most useful command for this is another of the | |
407 | "diff" family, namely "git-diff-tree". | |
408 | ||
409 | git-diff-tree can be given two arbitrary trees, and it will tell you the | |
410 | differences between them. Perhaps even more commonly, though, you can | |
411 | give it just a single commit object, and it will figure out the parent | |
412 | of that commit itself, and show the difference directly. Thus, to get | |
413 | the same diff that we've already seen several times, we can now do | |
414 | ||
415 | git-diff-tree -p HEAD | |
416 | ||
417 | (again, "-p" means to show the difference as a human-readable patch), | |
418 | and it will show what the last commit (in HEAD) actually changed. | |
419 | ||
420 | More interestingly, you can also give git-diff-tree the "-v" flag, which | |
421 | tells it to also show the commit message and author and date of the | |
422 | commit, and you can tell it to show a whole series of diffs. | |
423 | Alternatively, you can tell it to be "silent", and not show the diffs at | |
424 | all, but just show the actual commit message. | |
425 | ||
426 | In fact, together with the "git-rev-list" program (which generates a | |
427 | list of revisions), git-diff-tree ends up being a veritable fount of | |
428 | changes. A trivial (but very useful) script called "git-whatchanged" is | |
429 | included with git which does exactly this, and shows a log of recent | |
430 | activity. | |
431 | ||
81bb573e | 432 | To see the whole history of our pitiful little git-tutorial project, you |
8c7fa247 LT |
433 | can do |
434 | ||
81bb573e LT |
435 | git log |
436 | ||
437 | which shows just the log messages, or if we want to see the log together | |
cc29f732 | 438 | with the associated patches use the more complex (and much more |
81bb573e LT |
439 | powerful) |
440 | ||
837eedf4 | 441 | git-whatchanged -p --root |
8c7fa247 | 442 | |
81bb573e LT |
443 | and you will see exactly what has changed in the repository over its |
444 | short history. | |
445 | ||
446 | [ Side note: the "--root" flag is a flag to git-diff-tree to tell it to | |
447 | show the initial aka "root" commit too. Normally you'd probably not | |
448 | want to see the initial import diff, but since the tutorial project | |
449 | was started from scratch and is so small, we use it to make the result | |
450 | a bit more interesting ] | |
8c7fa247 | 451 | |
837eedf4 | 452 | With that, you should now be having some inkling of what git does, and |
8c7fa247 LT |
453 | can explore on your own. |
454 | ||
f35ca9ed | 455 | |
cc29f732 | 456 | Copying archives |
f35ca9ed LT |
457 | ---------------- |
458 | ||
cc29f732 | 459 | Git archives are normally totally self-sufficient, and it's worth noting |
f35ca9ed LT |
460 | that unlike CVS, for example, there is no separate notion of |
461 | "repository" and "working tree". A git repository normally _is_ the | |
462 | working tree, with the local git information hidden in the ".git" | |
463 | subdirectory. There is nothing else. What you see is what you got. | |
464 | ||
465 | [ Side note: you can tell git to split the git internal information from | |
466 | the directory that it tracks, but we'll ignore that for now: it's not | |
467 | how normal projects work, and it's really only meant for special uses. | |
468 | So the mental model of "the git information is always tied directly to | |
469 | the working directory that it describes" may not be technically 100% | |
470 | accurate, but it's a good model for all normal use ] | |
471 | ||
472 | This has two implications: | |
473 | ||
474 | - if you grow bored with the tutorial archive you created (or you've | |
475 | made a mistake and want to start all over), you can just do simple | |
476 | ||
477 | rm -rf git-tutorial | |
478 | ||
479 | and it will be gone. There's no external repository, and there's no | |
480 | history outside of the project you created. | |
481 | ||
482 | - if you want to move or duplicate a git archive, you can do so. There | |
e7c1ca42 JH |
483 | is a "git clone" command, but if all you want to do is just to |
484 | create a copy of your archive (with all the full history that | |
485 | went along with it), you can do so with a regular | |
486 | "cp -a git-tutorial new-git-tutorial". | |
f35ca9ed LT |
487 | |
488 | Note that when you've moved or copied a git archive, your git index | |
489 | file (which caches various information, notably some of the "stat" | |
490 | information for the files involved) will likely need to be refreshed. | |
491 | So after you do a "cp -a" to create a new copy, you'll want to do | |
492 | ||
493 | git-update-cache --refresh | |
494 | ||
495 | to make sure that the index file is up-to-date in the new one. | |
496 | ||
497 | Note that the second point is true even across machines. You can | |
498 | duplicate a remote git archive with _any_ regular copy mechanism, be it | |
499 | "scp", "rsync" or "wget". | |
500 | ||
501 | When copying a remote repository, you'll want to at a minimum update the | |
502 | index cache when you do this, and especially with other people's | |
503 | repositories you often want to make sure that the index cache is in some | |
504 | known state (you don't know _what_ they've done and not yet checked in), | |
505 | so usually you'll precede the "git-update-cache" with a | |
506 | ||
ce30a4b6 | 507 | git-read-tree --reset HEAD |
f35ca9ed LT |
508 | git-update-cache --refresh |
509 | ||
ce30a4b6 LT |
510 | which will force a total index re-build from the tree pointed to by HEAD |
511 | (it resets the index contents to HEAD, and then the git-update-cache | |
512 | makes sure to match up all index entries with the checked-out files). | |
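A quick way to convince yourself of what the reset does is to stage something and then throw the staged state away. This is only a sketch under assumptions: it uses a current git (so "git update-index" instead of "git-update-cache"), placeholder identity values, and a scratch directory made up for the example:

```shell
# Assumes a current git; command names are the modern spellings.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
export GIT_AUTHOR_NAME="A U Thor" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="A U Thor" GIT_COMMITTER_EMAIL="author@example.com"

echo "Hello World" > a
git update-index --add a
git update-ref HEAD "$(echo "Initial commit" | git commit-tree "$(git write-tree)")"

# Put the index into a state that was never committed
echo "not yet checked in" >> a
git update-index a
before=$(git diff-index --cached -p HEAD)

# Rebuild the index from HEAD, then match it up with the files on disk.
# The refresh may report files that still differ, so its exit status is ignored.
git read-tree --reset HEAD
git update-index -q --refresh || true

staged=$(git diff-index --cached -p HEAD)
unstaged=$(git diff-files -p)
echo "$unstaged"
```

After the reset the index matches HEAD again, and the change to "a" survives only as a difference in the working directory.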
f35ca9ed | 513 | |
ce30a4b6 LT |
514 | The above can also be written as simply |
515 | ||
516 | git reset | |
517 | ||
518 | and in fact a lot of the common git command combinations can be scripted | |
519 | with the "git xyz" interfaces, and you can learn things by just looking | |
520 | at what the git-*-script scripts do ("git reset" is the above two lines | |
521 | implemented in "git-reset-script", but some things like "git status" and | |
522 | "git commit" are slightly more complex scripts around the basic git | |
523 | commands). | |
524 | ||
525 | NOTE! Many (most?) public remote repositories will not contain any of | |
526 | the checked out files or even an index file, and will _only_ contain the | |
527 | actual core git files. Such a repository usually doesn't even have the | |
f35ca9ed | 528 | ".git" subdirectory, but has all the git files directly in the |
ce30a4b6 | 529 | repository. |
f35ca9ed LT |
530 | |
531 | To create your own local live copy of such a "raw" git repository, you'd | |
cc29f732 | 532 | first create your own subdirectory for the project, and then copy the |
f35ca9ed LT |
533 | raw repository contents into the ".git" directory. For example, to |
534 | create your own copy of the git repository, you'd do the following | |
535 | ||
536 | mkdir my-git | |
537 | cd my-git | |
e7c1ca42 | 538 | rsync -rL rsync://rsync.kernel.org/pub/scm/git/git.git/ .git |
f35ca9ed LT |
539 | |
540 | followed by | |
541 | ||
542 | git-read-tree HEAD | |
543 | ||
544 | to populate the index. However, now you have populated the index, and | |
545 | you have all the git internal files, but you will notice that you don't | |
546 | actually have any of the _working_directory_ files to work on. To get | |
547 | those, you'd check them out with | |
548 | ||
549 | git-checkout-cache -u -a | |
550 | ||
551 | where the "-u" flag means that you want the checkout to keep the index | |
cc29f732 | 552 | up-to-date (so that you don't have to refresh it afterward), and the |
e7c1ca42 | 553 | "-a" flag means "check out all files" (if you have a stale copy or an |
f35ca9ed | 554 | older version of a checked out tree you may also need to add the "-f" |
e7c1ca42 | 555 | flag first, to tell git-checkout-cache to _force_ overwriting of any old |
f35ca9ed LT |
556 | files). |
557 | ||
ed616049 LT |
558 | Again, this can all be simplified with |
559 | ||
e7c1ca42 | 560 | git clone rsync://rsync.kernel.org/pub/scm/git/git.git/ my-git |
ed616049 LT |
561 | cd my-git |
562 | git checkout | |
563 | ||
564 | which will end up doing all of the above for you. | |
565 | ||
cc29f732 | 566 | You have now successfully copied somebody else's (mine) remote |
f35ca9ed LT |
567 | repository, and checked it out. |
568 | ||
ed616049 LT |
569 | |
570 | Creating a new branch | |
571 | --------------------- | |
572 | ||
573 | Branches in git are really nothing more than pointers into the git | |
574 | object space from within the ".git/refs/" subdirectory, and as we | |
575 | already discussed, the HEAD branch is nothing but a symlink to one of | |
576 | these object pointers. | |
577 | ||
578 | You can at any time create a new branch by just picking an arbitrary | |
579 | point in the project history, and just writing the SHA1 name of that | |
580 | object into a file under .git/refs/heads/. You can use any filename you | |
581 | want (and indeed, subdirectories), but the convention is that the | |
582 | "normal" branch is called "master". That's just a convention, though, | |
583 | and nothing enforces it. | |
584 | ||
585 | To show that as an example, let's go back to the git-tutorial archive we | |
586 | used earlier, and create a branch in it. You literally do that by just | |
587 | creating a new SHA1 reference file, and switch to it by just making the | |
588 | HEAD pointer point to it: | |
589 | ||
590 | cat .git/HEAD > .git/refs/heads/mybranch | |
591 | ln -sf refs/heads/mybranch .git/HEAD | |
592 | ||
593 | and you're done. | |
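Because this is nothing but file manipulation, you can try the two commands on a mock directory layout before touching a real archive. Everything in this sketch is an assumption made for illustration: the scratch directory merely imitates the files described in the text, the 40-character name is a placeholder rather than a real commit, and no git command is run:

```shell
# Mock layout only: imitates the files described above, not a real repository.
mock=$(mktemp -d)
cd "$mock" || exit 1
mkdir -p .git/refs/heads

# Pretend "master" already points at some commit (placeholder name)
printf '%s\n' 0123456789abcdef0123456789abcdef01234567 > .git/refs/heads/master
ln -s refs/heads/master .git/HEAD

# The two steps from the text: copy the current head, then repoint HEAD
cat .git/HEAD > .git/refs/heads/mybranch
ln -sf refs/heads/mybranch .git/HEAD

head_target=$(readlink .git/HEAD)
echo "$head_target"
```

HEAD now points at refs/heads/mybranch, and the new head file holds the same name as "master", so both branches start out at the same point in history.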
594 | ||
595 | Now, if you make the decision to start your new branch at some other | |
596 | point in the history than the current HEAD, you usually also want to | |
597 | actually switch the contents of your working directory to that point | |
598 | when you switch the head, and "git checkout" will do that for you: | |
599 | instead of switching the branch by hand with "ln -sf", you can just do | |
600 | ||
601 | git checkout mybranch | |
602 | ||
603 | which will basically "jump" to the branch specified, update your working | |
604 | directory to that state, and also make it become the new default HEAD. | |
605 | ||
606 | You can always just jump back to your original "master" branch by doing | |
607 | ||
608 | git checkout master | |
609 | ||
610 | and if you forget which branch you happen to be on, a simple | |
611 | ||
612 | ls -l .git/HEAD | |
613 | ||
614 | will tell you where it's pointing. | |
615 | ||
616 | ||
617 | Merging two branches | |
618 | -------------------- | |
619 | ||
620 | One of the ideas of having a branch is that you do some (possibly | |
621 | experimental) work in it, and eventually merge it back to the main | |
622 | branch. So assuming you created the above "mybranch" that started out | |
623 | being the same as the original "master" branch, let's make sure we're in | |
624 | that branch, and do some work there. | |
625 | ||
626 | git checkout mybranch | |
627 | echo "Work, work, work" >> a | |
628 | git commit a | |
629 | ||
630 | Here, we just added another line to "a", and we used a shorthand for | |
631 | both doing a "git-update-cache a" and "git commit" by just giving the | |
632 | filename directly to "git commit". | |
633 | ||
634 | Now, to make it a bit more interesting, let's assume that somebody else | |
635 | does some work in the original branch, and simulate that by going back | |
636 | to the master branch, and editing the same file differently there: | |
637 | ||
638 | git checkout master | |
639 | ||
640 | Here, take a moment to look at the contents of "a", and notice how they | |
641 | don't contain the work we just did in "mybranch" - because that work | |
642 | hasn't happened in the "master" branch at all. Then do | |
643 | ||
644 | echo "Play, play, play" >> a | |
645 | echo "Lots of fun" >> b | |
646 | git commit a b | |
647 | ||
648 | since the master branch is obviously in a much better mood. | |
649 | ||
650 | Now, you've got two branches, and you decide that you want to merge the | |
651 | work done. Before we do that, let's introduce a cool graphical tool that | |
652 | helps you view what's going on: | |
653 | ||
654 | gitk --all | |
655 | ||
656 | will show you graphically both of your branches (that's what the "--all" | |
657 | means: normally it will just show you your current HEAD) and their | |
658 | histories. You can also see exactly how they came to be from a common | |
659 | source. | |
660 | ||
661 | Anyway, let's exit gitk (^Q or the File menu), and decide that we want | |
662 | to merge the work we did on the "mybranch" branch into the "master" | |
663 | branch (which is currently our HEAD too). To do that, there's a nice | |
664 | script called "git resolve", which wants to know which branches you want | |
665 | to resolve and what the merge is all about: | |
666 | ||
667 | git resolve HEAD mybranch "Merge work in mybranch" | |
668 | ||
669 | where the third argument is going to be used as the commit message if | |
670 | the merge can be resolved automatically. | |
671 | ||
672 | Now, in this case we've intentionally created a situation where the | |
673 | merge will need to be fixed up by hand, though, so git will do as much | |
674 | of it as it can automatically (which in this case is just merge the "b" | |
675 | file, which had no differences in the "mybranch" branch), and say: | |
676 | ||
677 | Simple merge failed, trying Automatic merge | |
678 | Auto-merging a. | |
679 | merge: warning: conflicts during merge | |
680 | ERROR: Merge conflict in a. | |
681 | fatal: merge program failed | |
682 | Automatic merge failed, fix up by hand | |
683 | ||
684 | which is way too verbose, but it basically tells you that it failed the | |
685 | really trivial merge ("Simple merge") and did an "Automatic merge" | |
686 | instead, but that too failed due to conflicts in "a". | |
687 | ||
688 | Not to worry. It left the (trivial) conflict in "a" in the same form you | |
689 | should already be well used to if you've ever used CVS, so let's just | |
690 | open "a" in our editor (whatever that may be), and fix it up somehow. | |
691 | I'd suggest just making it so that "a" contains all four lines: | |
692 | ||
693 | Hello World | |
694 | It's a new day for git | |
695 | Play, play, play | |
696 | Work, work, work | |
697 | ||
698 | and once you're happy with your manual merge, just do a | |
699 | ||
700 | git commit a | |
701 | ||
702 | which will very loudly warn you that you're now committing a merge | |
703 | (which is correct, so never mind), and you can write a small merge | |
704 | message about your adventures in git-merge-land. | |
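
Before committing a hand-resolved file, it can be worth checking that
no conflict markers were left behind. The sketch below assumes the
usual CVS-style markers at the start of a line; "no_conflict_markers"
is a hypothetical helper, and the sample files and the labels after the
markers are made up for the demonstration. It is only a heuristic: a
file that legitimately contains a line starting with "=======" will be
flagged too.

```shell
# Hypothetical helper: succeed only if the file has no leftover
# CVS-style conflict markers at the start of a line.
no_conflict_markers() {
    ! grep -n -E '^(<<<<<<<|=======|>>>>>>>)' "$1"
}

# Demonstration on throwaway files, not on a real working tree.
work=$(mktemp -d)
printf '%s\n' 'Hello World' '<<<<<<< a' 'Play, play, play' \
    '=======' 'Work, work, work' '>>>>>>> b' > "$work/unresolved"
printf '%s\n' 'Hello World' "It's a new day for git" \
    'Play, play, play' 'Work, work, work' > "$work/resolved"

no_conflict_markers "$work/unresolved" || echo "unresolved: fix by hand"
no_conflict_markers "$work/resolved" && echo "resolved: ok to commit"
```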
705 | ||
706 | After you're done, start up "gitk --all" to see graphically what the | |
707 | history looks like. Notice that "mybranch" still exists, and you can | |
708 | switch to it, and continue to work with it if you want to. The | |
709 | "mybranch" branch will not contain the merge, but next time you merge it | |
710 | from the "master" branch, git will know how you merged it, so you'll not | |
711 | have to do _that_ merge again. | |
712 | ||
713 | ||
714 | Merging external work | |
715 | --------------------- | |
716 | ||
717 | It's usually much more common to merge with somebody else than to | |
718 | merge with your own branches, so it's worth pointing out that git | |
719 | makes that very easy too, and in fact, it's not that different from | |
720 | doing a "git resolve". In fact, a remote merge ends up being nothing | |
721 | more than "fetch the work from a remote repository into a temporary tag" | |
722 | followed by a "git resolve". | |
723 | ||
724 | It's such a common thing to do that it's called "git pull", and you can | |
725 | simply do | |
726 | ||
727 | git pull <remote-repository> | |
728 | ||
729 | and optionally give a branch-name for the remote end as a second | |
730 | argument. | |
731 | ||
732 | [ Todo: fill in real examples ] | |
733 | ||
734 | ||
735 | Tagging a version | |
736 | ----------------- | |
737 | ||
738 | In git, there are two kinds of tags, a "light" one, and a "signed tag". | |
739 | ||
740 | A "light" tag is technically nothing more than a branch, except we put | |
741 | it in the ".git/refs/tags/" subdirectory instead of calling it a "head". | |
742 | So the simplest form of tag involves nothing more than | |
743 | ||
744 | cat .git/HEAD > .git/refs/tags/my-first-tag | |
745 | ||
746 | after which point you can use this symbolic name for that particular | |
747 | state. You can, for example, do | |
748 | ||
749 | git diff my-first-tag | |
750 | ||
751 | to diff your current state against that tag (which at this point will | |
752 | obviously be an empty diff, but if you continue to develop and commit | |
753 | stuff, you can use your tag as an "anchor-point" to see what has changed | |
754 | since you tagged it). | |
755 | ||
756 | A "signed tag" is actually a real git object, and contains not only a | |
757 | pointer to the state you want to tag, but also a small tag name and | |
758 | message, along with a PGP signature that says that yes, you really did | |
759 | that tag. You create these signed tags with | |
760 | ||
761 | git tag <tagname> | |
762 | ||
763 | which will sign the current HEAD (but you can also give it another | |
764 | argument that specifies the thing to tag, i.e. you could have tagged the | |
765 | current "mybranch" point by using "git tag <tagname> mybranch"). | |
766 | ||
767 | You normally only do signed tags for major releases or things | |
768 | like that, while the light-weight tags are useful for any marking you | |
769 | want to do - any time you decide that you want to remember a certain | |
770 | point, just create a private tag for it, and you have a nice symbolic | |
771 | name for the state at that point. | |
772 | ||
e7c1ca42 JH |
773 | |
774 | Publishing your work | |
775 | -------------------- | |
776 | ||
777 | We already talked about using somebody else's work from a remote | |
778 | repository, in the "merging external work" section. It involved | |
779 | fetching the work from a remote repository; but how would _you_ | |
780 | prepare a repository so that other people can fetch from it? | |
781 | ||
782 | Your real work happens in your working directory with your | |
783 | primary repository hanging under it as its ".git" subdirectory. | |
784 | You _could_ make it accessible remotely and ask people to pull | |
785 | from it, but in practice that is not the way things are usually | |
786 | done. A recommended way is to have a public repository, make it | |
787 | reachable by other people, and when the changes you made in your | |
788 | primary working directory are in good shape, update the public | |
789 | repository with it. | |
790 | ||
791 | [ Side note: this public repository could further be mirrored, | |
792 | and that is how kernel.org git repositories are done. ] | |
793 | ||
794 | Publishing the changes from your private repository to your | |
795 | public repository requires you to have write privilege on the | |
796 | machine that hosts your public repository, and it is internally | |
797 | done via an SSH connection. | |
798 | ||
799 | First, you need to create an empty repository to push to on the | |
800 | machine that houses your public repository. This needs to be | |
801 | done only once. | |
802 | ||
803 | Your private repository's GIT directory is usually .git, but | |
804 | often your public repository is named "<projectname>.git". | |
805 | Let's create such a public repository for project "my-git". | |
806 | After logging into the remote machine, create an empty | |
807 | directory: | |
808 | ||
809 | mkdir my-git.git | |
810 | ||
811 | Then, initialize that directory with git-init-db, but this time, | |
812 | since its name is not the usual ".git", we do things a bit | |
813 | differently: | |
814 | ||
815 | GIT_DIR=my-git.git git-init-db | |
816 | ||
817 | Make sure this directory is accessible to the people you want to | |
818 | be able to pull your changes. Also make sure that you have the | |
819 | 'git-receive-pack' program on the $PATH. | |
820 | ||
821 | [ Side note: many installations of sshd do not invoke your | |
822 | shell as the login shell when you directly run programs; what | |
823 | this means is that if your login shell is bash, only .bashrc | |
824 | is read, bypassing .bash_profile. As a workaround, make sure | |
825 | .bashrc sets up $PATH so that the 'git-receive-pack' program can | |
826 | be run. ] | |
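
As a rough illustration of that workaround, a fragment along these
lines could go into .bashrc on the public machine. The directory
"$HOME/bin" is purely an assumption about where 'git-receive-pack' was
installed; substitute whatever applies on your host.

```shell
# Sketch for ~/.bashrc: put a (hypothetical) install directory on
# $PATH unless it is already there.
git_bin="$HOME/bin"    # assumption: where git-receive-pack lives
case ":$PATH:" in
    *":$git_bin:"*) ;;                # already present, do nothing
    *) PATH="$git_bin:$PATH" ;;
esac
export PATH
```

The "case" guard keeps $PATH from growing a duplicate entry every time
the file is read.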
827 | ||
828 | Your 'public repository' is ready to accept your changes. Now, | |
829 | come back to the machine where you have your private repository. From | |
830 | there, run this command: | |
831 | ||
832 | git push <public-host>:/path/to/my-git.git master | |
833 | ||
834 | This synchronizes your public repository to match the named | |
835 | branch head (i.e. refs/heads/master in this case) and objects | |
836 | reachable from it in your current repository. | |
837 | ||
838 | As a real example, this is how I update my public git | |
839 | repository. The kernel.org mirror network takes care of the | |
840 | propagation to other publicly visible machines: | |
841 | ||
842 | git push master.kernel.org:/pub/scm/git/git.git/ | |
843 | ||
844 | ||
ed616049 | 845 | [ to be continued.. cvsimports, pushing and pulling ] |