git-fast-export(1)
==================

NAME
----
git-fast-export - Git data exporter


SYNOPSIS
--------
[verse]
'git fast-export [<options>]' | 'git fast-import'

DESCRIPTION
-----------
This program dumps the given revisions in a form suitable to be piped
into 'git fast-import'.

You can use it as a human-readable bundle replacement (see
linkgit:git-bundle[1]), or as a format that can be edited before being
fed to 'git fast-import' in order to do history rewrites (an ability
relied on by tools like 'git filter-repo').

OPTIONS
-------
--progress=<n>::
	Insert 'progress' statements every <n> objects, to be shown by
	'git fast-import' during import.

--signed-tags=(verbatim|warn|warn-strip|strip|abort)::
	Specify how to handle signed tags. Since any transformation
	after the export can change the tag names (which can also happen
	when excluding revisions), the signatures will not match.
+
When asking to 'abort' (which is the default), this program will die
when encountering a signed tag. With 'strip', the tags will silently
be made unsigned; with 'warn-strip', they will be made unsigned but a
warning will be displayed; with 'verbatim', they will be silently
exported; and with 'warn', they will be exported, but you will see a
warning.

--tag-of-filtered-object=(abort|drop|rewrite)::
	Specify how to handle tags whose tagged object is filtered out.
	Since revisions and files to export can be limited by path,
	tagged objects may be filtered completely.
+
When asking to 'abort' (which is the default), this program will die
when encountering such a tag. With 'drop' it will omit such tags from
the output. With 'rewrite', if the tagged object is a commit, it will
rewrite the tag to tag an ancestor commit (via parent rewriting; see
linkgit:git-rev-list[1]).

-M::
-C::
	Perform move and/or copy detection, as described in the
	linkgit:git-diff[1] manual page, and use it to generate
	rename and copy commands in the output dump.
+
Note that earlier versions of this command did not complain and
produced incorrect results if you gave these options.

--export-marks=<file>::
	Dumps the internal marks table to <file> when complete.
	Marks are written one per line as `:markid SHA-1`. Only marks
	for revisions are dumped; marks for blobs are ignored.
	Backends can use this file to validate imports after they
	have been completed, or to save the marks table across
	incremental runs. As <file> is only opened and truncated
	at completion, the same path can also be safely given to
	--import-marks.
	The file will not be written if no new object has been
	marked/exported.

--import-marks=<file>::
	Before processing any input, load the marks specified in
	<file>. The input file must exist, must be readable, and
	must use the same format as produced by --export-marks.

--mark-tags::
	In addition to labelling blobs and commits with mark ids, also
	label tags. This is useful in conjunction with
	`--export-marks` and `--import-marks`, and is also useful (and
	necessary) for exporting nested tags. It does not hurt
	other cases and would be the default, but many fast-import
	frontends are not prepared to accept tags with mark
	identifiers.
+
Any commits (or tags) that have already been marked will not be
exported again. If the backend uses a similar --import-marks file,
this allows for incremental bidirectional exporting of the repository
by keeping the marks the same across runs.

--fake-missing-tagger::
	Some old repositories have tags without a tagger. The
	fast-import protocol was strict about this and did not allow
	such tags, so fake a tagger to make it possible to fast-import
	the output.

--use-done-feature::
	Start the stream with a 'feature done' stanza, and terminate
	it with a 'done' command.

--no-data::
	Skip output of blob objects and instead refer to blobs via
	their original SHA-1 hash. This is useful when rewriting the
	directory structure or history of a repository without
	touching the contents of individual files. Note that the
	resulting stream can only be used by a repository which
	already contains the necessary objects.

--full-tree::
	This option will cause fast-export to issue a "deleteall"
	directive for each commit followed by a full list of all files
	in the commit (as opposed to just listing the files which are
	different from the commit's first parent).

--anonymize::
	Anonymize the contents of the repository while still retaining
	the shape of the history and stored tree. See the section on
	`ANONYMIZING` below.

--anonymize-map=<from>[:<to>]::
	Convert token `<from>` to `<to>` in the anonymized output. If
	`<to>` is omitted, map `<from>` to itself (i.e., do not
	anonymize it). See the section on `ANONYMIZING` below.

--reference-excluded-parents::
	By default, running a command such as `git fast-export
	master~5..master` will not include the commit master{tilde}5
	and will make master{tilde}4 no longer have master{tilde}5 as
	a parent (though both the old master{tilde}4 and new
	master{tilde}4 will have all the same files). Use
	--reference-excluded-parents to instead have the stream
	refer to commits in the excluded range of history by their
	SHA-1 hash. Note that the resulting stream can only be used by a
	repository which already contains the necessary parent
	commits.

--show-original-ids::
	Add an extra directive to the output for commits and blobs,
	`original-oid <SHA1SUM>`. While such directives will likely be
	ignored by importers such as git-fast-import, they may be useful
	for intermediary filters (e.g. for rewriting commit messages
	which refer to older commits, or for stripping blobs by id).

--reencode=(yes|no|abort)::
	Specify how to handle the `encoding` header in commit objects. When
	asking to 'abort' (which is the default), this program will die
	when encountering such a commit object. With 'yes', the commit
	message will be re-encoded into UTF-8. With 'no', the original
	encoding will be preserved.

--refspec::
	Apply the specified refspec to each ref exported. This option
	can be specified multiple times.

[<git-rev-list-args>...]::
	A list of arguments, acceptable to 'git rev-parse' and
	'git rev-list', that specifies the objects and references
	to export. For example, `master~10..master` causes the
	current master reference to be exported along with all objects
	added since its 10th ancestor commit and (unless the
	--reference-excluded-parents option is specified) all files
	common to master{tilde}9 and master{tilde}10.
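
The marks file written by `--export-marks` (described above) holds one
`:markid SHA-1` pair per line, so ordinary text tools can query it. The
following is a minimal sketch and not part of Git; the function name
`mark_to_oid` and the file name used in the note below are assumptions
made for illustration.

```shell
# Print the object name recorded for mark id $2 in marks file $1.
# Exits non-zero if the marks file has no entry for that mark.
# (mark_to_oid is a hypothetical helper, not a Git command.)
mark_to_oid() {
    awk -v want=":$2" '
        $1 == want { print $2; found = 1; exit }
        END        { exit !found }
    ' "$1"
}
```

Assuming the marks were exported to a file called `marks`, running
`mark_to_oid marks 2` would print the object name stored for mark `:2`.
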

EXAMPLES
--------

-------------------------------------------------------------------
$ git fast-export --all | (cd /empty/repository && git fast-import)
-------------------------------------------------------------------

This will export the whole repository and import it into the existing
empty repository. Except for reencoding commits that are not in
UTF-8, it would be a one-to-one mirror.
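
If the stream is saved to a file instead of being piped directly,
exporting with `--use-done-feature` makes a truncated file easier to
spot, because a complete stream then ends with a 'done' command. A
minimal sketch of such a check follows; `stream_is_complete` is a
hypothetical helper name, and the check assumes nothing was appended to
the file after the export finished.

```shell
# Succeed only if the last line of stream file $1 is the 'done' command
# that --use-done-feature places at the end of a complete stream.
# (stream_is_complete is a hypothetical helper, not a Git command.)
stream_is_complete() {
    [ "$(tail -n 1 "$1")" = "done" ]
}
```

This only looks at the final line, so a file cut off right after a line
of blob data that happens to read `done` would still pass.
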

-----------------------------------------------------
$ git fast-export master~5..master |
	sed "s|refs/heads/master|refs/heads/other|" |
	git fast-import
-----------------------------------------------------

This makes a new branch called 'other' from 'master~5..master'
(i.e. if 'master' has linear history, it will take the last 5 commits).

Note that this assumes that none of the blobs and commit messages
referenced by that revision range contains the string
'refs/heads/master'.
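
One way to narrow that assumption is to rewrite only whole lines that
look like the `commit` and `reset` commands naming the branch. The
sketch below does this with anchored `sed` expressions; `rename_ref` is
a hypothetical helper name, and the approach reduces the risk without
removing it, since a line of blob data or of a commit message that
consists of exactly such a command would still be rewritten.

```shell
# Read a fast-export stream on stdin and write it to stdout, replacing
# ref $1 with ref $2 only on lines that consist of exactly
# "commit <ref>" or "reset <ref>".
# Assumes neither ref contains '|' or characters special to sed.
# (rename_ref is a hypothetical helper, not a Git command.)
rename_ref() {
    LC_ALL=C sed -e "s|^commit $1\$|commit $2|" \
                 -e "s|^reset $1\$|reset $2|"
}
```

It would take the place of the `sed` stage in the pipeline above, for
example as `rename_ref refs/heads/master refs/heads/other`.
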


ANONYMIZING
-----------

If the `--anonymize` option is given, git will attempt to remove all
identifying information from the repository while still retaining enough
of the original tree and history patterns to reproduce some bugs. The
goal is that a git bug which is found on a private repository will
persist in the anonymized repository, and the latter can be shared with
git developers to help solve the bug.

With this option, git will replace all refnames, paths, blob contents,
commit and tag messages, names, and email addresses in the output with
anonymized data. Two instances of the same string will be replaced
equivalently (e.g., two commits with the same author will have the same
anonymized author in the output, but bear no resemblance to the original
author string). The relationship between commits, branches, and tags is
retained, as well as the commit timestamps (but the commit messages and
refnames bear no resemblance to the originals). The relative makeup of
the tree is retained (e.g., if you have a root tree with 10 files and 3
trees, so will the output), but their names and the contents of the
files will be replaced.

If you think you have found a git bug, you can start by exporting an
anonymized stream of the whole repository:

---------------------------------------------------
$ git fast-export --anonymize --all >anon-stream
---------------------------------------------------

Then confirm that the bug persists in a repository created from that
stream (many bugs will not, as they really do depend on the exact
repository contents):

---------------------------------------------------
$ git init anon-repo
$ cd anon-repo
$ git fast-import <../anon-stream
$ ... test your bug ...
---------------------------------------------------

If the anonymized repository shows the bug, it may be worth sharing
`anon-stream` along with a regular bug report. Note that the anonymized
stream compresses very well, so gzipping it is encouraged. If you want
to examine the stream to see that it does not contain any private data,
you can peruse it directly before sending. You may also want to try:

---------------------------------------------------
$ perl -pe 's/\d+/X/g' <anon-stream | sort -u | less
---------------------------------------------------

which shows all of the unique lines (with numbers converted to "X", to
collapse "User 0", "User 1", etc., into "User X"). This produces a much
smaller output, and it is usually easy to quickly confirm that there is
no private data in the stream.
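
As a complementary check, you can search the stream for strings you
already know to be private. The sketch below assumes you keep such
strings (names, hostnames, file names) one per line in a file of your
own; `stream_is_clean` and that word-list file are hypothetical and not
part of Git.

```shell
# Succeed if none of the fixed strings listed one per line in file $2
# occur in stream file $1.  Otherwise print the matching lines with
# their line numbers and fail.  Also fails if grep itself reports an
# error.  Note that an empty line in $2 would match every line.
# (stream_is_clean is a hypothetical helper, not a Git command.)
stream_is_clean() {
    grep -n -F -f "$2" "$1" && return 1
    # grep exits with 1 for "no match" and with 2 or more on error.
    [ $? -eq 1 ]
}
```

A clean result only shows that the listed strings are absent; it does
not replace reading the stream.
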

Reproducing some bugs may require referencing particular commits or
paths, which becomes challenging after refnames and paths have been
anonymized. You can ask for a particular token to be left as-is or
mapped to a new value. For example, if you have a bug which reproduces
with `git rev-list sensitive -- secret.c`, you can run:

---------------------------------------------------
$ git fast-export --anonymize --all \
	--anonymize-map=sensitive:foo \
	--anonymize-map=secret.c:bar.c \
	>stream
---------------------------------------------------

After importing the stream, you can then run `git rev-list foo -- bar.c`
in the anonymized repository.

Note that paths and refnames are split into tokens at slash boundaries.
The command above would anonymize `subdir/secret.c` as something like
`path123/bar.c`; you could then search for `bar.c` in the anonymized
repository to determine the final pathname.

To make referencing the final pathname simpler, you can map each path
component; so if you also anonymize `subdir` to `publicdir`, then the
final pathname would be `publicdir/bar.c`.

LIMITATIONS
-----------

Since 'git fast-import' cannot tag trees, you will not be
able to export the linux.git repository completely, as it contains
a tag referencing a tree instead of a commit.

SEE ALSO
--------
linkgit:git-fast-import[1]

GIT
---
Part of the linkgit:git[1] suite