Documentation/technical/protocol-v2.txt

   1 Git Wire Protocol, Version 2
   2 ============================
   3
   4 This document presents a specification for a version 2 of Git's wire
   5 protocol.  Protocol v2 will improve upon v1 in the following ways:
   6
   7   * Instead of multiple service names, multiple commands will be
   8     supported by a single service
   9   * Easily extendable as capabilities are moved into their own section
  10     of the protocol, no longer being hidden behind a NUL byte and
  11     limited by the size of a pkt-line
  12   * Separate out other information hidden behind NUL bytes (e.g. agent
  13     string as a capability and symrefs can be requested using 'ls-refs')
  14   * Reference advertisement will be omitted unless explicitly requested
  15   * ls-refs command to explicitly request some refs
  16   * Designed with http and stateless-rpc in mind.  With clear flush
  17     semantics the http remote helper can simply act as a proxy
  18
  19 In protocol v2 communication is command oriented.  When first contacting a
   20 server a list of capabilities will be advertised.  Some of these capabilities
  21 will be commands which a client can request be executed.  Once a command
  22 has completed, a client can reuse the connection and request that other
  23 commands be executed.
  24
  25 Packet-Line Framing
  26 -------------------
  27
  28 All communication is done using packet-line framing, just as in v1.  See
  29 `Documentation/technical/pack-protocol.txt` and
  30 `Documentation/technical/protocol-common.txt` for more information.
  31
  32 In protocol v2 these special packets will have the following semantics:
  33
  34   * '0000' Flush Packet (flush-pkt) - indicates the end of a message
  35   * '0001' Delimiter Packet (delim-pkt) - separates sections of a message
  36   * '0002' Response End Packet (response-end-pkt) - indicates the end of a
  37     response for stateless connections
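The framing above can be sketched in a few lines of Python. This is a minimal illustration, not Git's implementation: the names `pkt_line` and `read_pkt` are hypothetical, and the 65520-byte limit is the one given in `protocol-common.txt`.

```python
FLUSH_PKT = b"0000"         # end of a message
DELIM_PKT = b"0001"         # separates sections of a message
RESPONSE_END_PKT = b"0002"  # end of a response on stateless connections

MAX_PKT_LEN = 65520  # limit taken from protocol-common.txt


def pkt_line(payload: bytes) -> bytes:
    """Frame payload as a data pkt-line: 4 hex digits of length, then payload."""
    length = len(payload) + 4  # the length field counts its own 4 bytes
    if length > MAX_PKT_LEN:
        raise ValueError("payload too large for one pkt-line")
    return b"%04x" % length + payload


def read_pkt(buf: bytes, pos: int = 0):
    """Return (kind, payload, next_pos) for the packet starting at pos."""
    length = int(buf[pos:pos + 4], 16)
    special = {0: "flush", 1: "delim", 2: "response-end"}
    if length in special:
        return special[length], b"", pos + 4
    if length < 4:
        raise ValueError("invalid pkt-line length")
    return "data", buf[pos + 4:pos + length], pos + length
```

For example, `pkt_line(b"version 2\n")` yields the `000eversion 2\n` line that opens a capability advertisement.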
  38
  39 Initial Client Request
  40 ----------------------
  41
  42 In general a client can request to speak protocol v2 by sending
  43 `version=2` through the respective side-channel for the transport being
  44 used which inevitably sets `GIT_PROTOCOL`.  More information can be
  45 found in `pack-protocol.txt` and `http-protocol.txt`.  In all cases the
  46 response from the server is the capability advertisement.
  47
  48 Git Transport
  49 ~~~~~~~~~~~~~
  50
  51 When using the git:// transport, you can request to use protocol v2 by
  52 sending "version=2" as an extra parameter:
  53
  54    003egit-upload-pack /project.git\0host=myserver.com\0\0version=2\0
  55
  56 SSH and File Transport
  57 ~~~~~~~~~~~~~~~~~~~~~~
  58
  59 When using either the ssh:// or file:// transport, the GIT_PROTOCOL
  60 environment variable must be set explicitly to include "version=2".
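A client wrapper might prepare the environment as sketched below. This assumes `GIT_PROTOCOL` holds a colon-separated list of `key[=value]` entries, as described in Git's environment variable documentation; the helper name `env_with_protocol_v2` is hypothetical, and the sketch only builds the environment without spawning any process.

```python
import os


def env_with_protocol_v2(base=None) -> dict:
    """Return a copy of the environment with version=2 in GIT_PROTOCOL."""
    env = dict(os.environ if base is None else base)
    entries = [e for e in env.get("GIT_PROTOCOL", "").split(":") if e]
    # Drop any previously requested version, keep unrelated entries.
    entries = [e for e in entries if not e.startswith("version=")]
    entries.append("version=2")
    env["GIT_PROTOCOL"] = ":".join(entries)
    return env
```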
  61
  62 HTTP Transport
  63 ~~~~~~~~~~~~~~
  64
  65 When using the http:// or https:// transport a client makes a "smart"
  66 info/refs request as described in `http-protocol.txt` and requests that
  67 v2 be used by supplying "version=2" in the `Git-Protocol` header.
  68
  69    C: GET $GIT_URL/info/refs?service=git-upload-pack HTTP/1.0
  70    C: Git-Protocol: version=2
  71
  72 A v2 server would reply:
  73
  74    S: 200 OK
  75    S: <Some headers>
  76    S: ...
  77    S:
  78    S: 000eversion 2\n
  79    S: <capability-advertisement>
  80
  81 Subsequent requests are then made directly to the service
  82 `$GIT_URL/git-upload-pack`. (This works the same for git-receive-pack).
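The initial HTTP request can be prepared with Python's standard library as sketched below. The repository URL is a placeholder and the helper name `info_refs_request` is hypothetical; the sketch only constructs the request object and does not send it.

```python
import urllib.request


def info_refs_request(git_url: str, service: str = "git-upload-pack"):
    """Build a smart info/refs request that asks for protocol v2."""
    url = f"{git_url.rstrip('/')}/info/refs?service={service}"
    return urllib.request.Request(url, headers={"Git-Protocol": "version=2"})


# Placeholder URL, used for illustration only.
req = info_refs_request("https://example.com/project.git")
```

Passing `req` to `urllib.request.urlopen` would perform the request; a v2 server's body would then start with the `000eversion 2\n` line shown above.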
  83
  84 Capability Advertisement
  85 ------------------------
  86
  87 A server which decides to communicate (based on a request from a client)
  88 using protocol version 2, notifies the client by sending a version string
  89 in its initial response followed by an advertisement of its capabilities.
  90 Each capability is a key with an optional value.  Clients must ignore all
  91 unknown keys.  Semantics of unknown values are left to the definition of
  92 each key.  Some capabilities will describe commands which can be requested
  93 to be executed by the client.
  94
  95     capability-advertisement = protocol-version
  96                                capability-list
  97                                flush-pkt
  98
  99     protocol-version = PKT-LINE("version 2" LF)
 100     capability-list = *capability
 101     capability = PKT-LINE(key[=value] LF)
 102
 103     key = 1*(ALPHA | DIGIT | "-_")
 104     value = 1*(ALPHA | DIGIT | " -_.,?\/{}[]()<>!@#$%^&*+=:;")
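The grammar above can be read by a small parser such as the following sketch. The function names are hypothetical, and the sample advertisement is made up for illustration; keys without a value map to `None`.

```python
def pkt_line(payload: bytes) -> bytes:
    """Frame payload as a pkt-line (length includes the 4-byte header)."""
    return b"%04x" % (len(payload) + 4) + payload


def parse_capability_advertisement(data: bytes) -> dict:
    """Parse 'version 2', then key[=value] lines, up to the flush-pkt."""
    caps = {}
    pos = 0
    seen_version = False
    while True:
        length = int(data[pos:pos + 4], 16)
        if length == 0:  # flush-pkt ends the advertisement
            break
        if length < 4:
            raise ValueError("unexpected special packet")
        line = data[pos + 4:pos + length].decode().rstrip("\n")
        pos += length
        if not seen_version:
            if line != "version 2":
                raise ValueError("server did not announce version 2")
            seen_version = True
            continue
        key, sep, value = line.partition("=")
        caps[key] = value if sep else None
    if not seen_version:
        raise ValueError("missing protocol-version line")
    return caps


sample = (pkt_line(b"version 2\n") + pkt_line(b"agent=git/1.8.3.1\n")
          + pkt_line(b"ls-refs\n") + pkt_line(b"fetch=shallow filter\n")
          + b"0000")
caps = parse_capability_advertisement(sample)
```

A client would then consult only the keys it knows about, since unknown keys must be ignored.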
 105
 106 Command Request
 107 ---------------
 108
 109 After receiving the capability advertisement, a client can then issue a
 110 request to select the command it wants with any particular capabilities
 111 or arguments.  There is then an optional section where the client can
 112 provide any command specific parameters or queries.  Only a single
 113 command can be requested at a time.
 114
 115     request = empty-request | command-request
 116     empty-request = flush-pkt
 117     command-request = command
 118                       capability-list
 119                       [command-args]
 120                       flush-pkt
 121     command = PKT-LINE("command=" key LF)
 122     command-args = delim-pkt
 123                    *command-specific-arg
 124
 125     command-specific-args are packet line framed arguments defined by
 126     each individual command.
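A request following this grammar could be serialized as in the sketch below (the helper names are hypothetical). The delim-pkt is emitted only when command-specific arguments are present, and an empty request is just a flush-pkt.

```python
FLUSH_PKT = b"0000"
DELIM_PKT = b"0001"


def pkt_line(text: str) -> bytes:
    """Frame one LF-terminated text line as a pkt-line."""
    payload = text.encode() + b"\n"
    return b"%04x" % (len(payload) + 4) + payload


def command_request(command: str, capabilities=(), args=()) -> bytes:
    """Serialize command, capability-list, optional command-args, flush-pkt."""
    out = [pkt_line(f"command={command}")]
    out += [pkt_line(cap) for cap in capabilities]
    if args:
        out.append(DELIM_PKT)
        out += [pkt_line(arg) for arg in args]
    out.append(FLUSH_PKT)
    return b"".join(out)


empty_request = FLUSH_PKT
request = command_request(
    "ls-refs", ["agent=git/1.8.3.1"], ["peel", "ref-prefix refs/heads/"])
```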
 127
 128 The server will then check to ensure that the client's request is
  129 composed of a valid command as well as valid capabilities which were
  130 advertised.  If the request is valid the server will then execute the
  131 command.  A server MUST wait until it has received the client's entire
 132 request before issuing a response.  The format of the response is
 133 determined by the command being executed, but in all cases a flush-pkt
 134 indicates the end of the response.
 135
 136 When a command has finished, and the client has received the entire
 137 response from the server, a client can either request that another
 138 command be executed or can terminate the connection.  A client may
 139 optionally send an empty request consisting of just a flush-pkt to
 140 indicate that no more requests will be made.
 141
 142 Capabilities
 143 ------------
 144
 145 There are two different types of capabilities: normal capabilities,
 146 which can be used to convey information or alter the behavior of a
 147 request, and commands, which are the core actions that a client wants to
 148 perform (fetch, push, etc).
 149
 150 Protocol version 2 is stateless by default.  This means that all commands
 151 must only last a single round and be stateless from the perspective of the
 152 server side, unless the client has requested a capability indicating that
 153 state should be maintained by the server.  Clients MUST NOT require state
 154 management on the server side in order to function correctly.  This
 155 permits simple round-robin load-balancing on the server side, without
 156 needing to worry about state management.
 157
 158 agent
 159 ~~~~~
 160
 161 The server can advertise the `agent` capability with a value `X` (in the
 162 form `agent=X`) to notify the client that the server is running version
 163 `X`.  The client may optionally send its own agent string by including
 164 the `agent` capability with a value `Y` (in the form `agent=Y`) in its
 165 request to the server (but it MUST NOT do so if the server did not
 166 advertise the agent capability). The `X` and `Y` strings may contain any
 167 printable ASCII characters except space (i.e., the byte range 32 < x <
 168 127), and are typically of the form "package/version" (e.g.,
 169 "git/1.8.3.1"). The agent strings are purely informative for statistics
 170 and debugging purposes, and MUST NOT be used to programmatically assume
 171 the presence or absence of particular features.
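The character rule and the "only if advertised" rule can be checked as in this sketch. The function names and the client agent string are hypothetical.

```python
def is_valid_agent(value: str) -> bool:
    """True if every character is printable ASCII other than space."""
    return len(value) > 0 and all(32 < ord(ch) < 127 for ch in value)


def agent_capability(value: str, advertised: dict):
    """Return the 'agent=Y' line, or None if the server did not advertise agent."""
    if "agent" not in advertised:
        return None  # MUST NOT send agent in this case
    if not is_valid_agent(value):
        raise ValueError("agent string contains forbidden characters")
    return f"agent={value}"
```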
 172
 173 ls-refs
 174 ~~~~~~~
 175
 176 `ls-refs` is the command used to request a reference advertisement in v2.
 177 Unlike the current reference advertisement, ls-refs takes in arguments
 178 which can be used to limit the refs sent from the server.
 179
 180 Additional features not supported in the base command will be advertised
 181 as the value of the command in the capability advertisement in the form
 182 of a space separated list of features: "<command>=<feature 1> <feature 2>"
 183
 184 ls-refs takes in the following arguments:
 185
 186     symrefs
  187         In addition to the object pointed to by it, show the underlying ref
  188         pointed to by it when showing a symbolic ref.
 189     peel
 190         Show peeled tags.
 191     ref-prefix <prefix>
 192         When specified, only references having a prefix matching one of
 193         the provided prefixes are displayed.
 194
 195 If the 'unborn' feature is advertised the following argument can be
 196 included in the client's request.
 197
 198     unborn
 199         The server will send information about HEAD even if it is a symref
 200         pointing to an unborn branch in the form "unborn HEAD
 201         symref-target:<target>".
 202
 203 The output of ls-refs is as follows:
 204
 205     output = *ref
 206              flush-pkt
 207     obj-id-or-unborn = (obj-id | "unborn")
 208     ref = PKT-LINE(obj-id-or-unborn SP refname *(SP ref-attribute) LF)
 209     ref-attribute = (symref | peeled)
 210     symref = "symref-target:" symref-target
 211     peeled = "peeled:" obj-id
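One way to decode a single `ref` line of this output is sketched below. The function name and the returned dictionary layout are hypothetical, and ignoring unrecognized attributes is an assumption of this sketch rather than something the grammar states.

```python
def parse_ref_line(line: str) -> dict:
    """Split 'obj-id-or-unborn SP refname *(SP ref-attribute)' into fields."""
    fields = line.rstrip("\n").split(" ")
    if len(fields) < 2:
        raise ValueError("ref line needs an object id and a refname")
    oid, refname, attrs = fields[0], fields[1], fields[2:]
    ref = {
        "oid": None if oid == "unborn" else oid,
        "name": refname,
        "symref-target": None,
        "peeled": None,
    }
    for attr in attrs:
        key, _, value = attr.partition(":")
        if key in ("symref-target", "peeled"):
            ref[key] = value
        # Unrecognized attributes are skipped (an assumption of this sketch).
    return ref
```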
 212
 213 fetch
 214 ~~~~~
 215
 216 `fetch` is the command used to fetch a packfile in v2.  It can be looked
 217 at as a modified version of the v1 fetch where the ref-advertisement is
 218 stripped out (since the `ls-refs` command fills that role) and the
 219 message format is tweaked to eliminate redundancies and permit easy
 220 addition of future extensions.
 221
 222 Additional features not supported in the base command will be advertised
 223 as the value of the command in the capability advertisement in the form
 224 of a space separated list of features: "<command>=<feature 1> <feature 2>"
 225
 226 A `fetch` request can take the following arguments:
 227
 228     want <oid>
 229         Indicates to the server an object which the client wants to
 230         retrieve.  Wants can be anything and are not limited to
 231         advertised objects.
 232
 233     have <oid>
 234         Indicates to the server an object which the client has locally.
 235         This allows the server to make a packfile which only contains
 236         the objects that the client needs. Multiple 'have' lines can be
 237         supplied.
 238
 239     done
 240         Indicates to the server that negotiation should terminate (or
 241         not even begin if performing a clone) and that the server should
 242         use the information supplied in the request to construct the
 243         packfile.
 244
 245     thin-pack
 246         Request that a thin pack be sent, which is a pack with deltas
 247         which reference base objects not contained within the pack (but
 248         are known to exist at the receiving end). This can reduce the
 249         network traffic significantly, but it requires the receiving end
 250         to know how to "thicken" these packs by adding the missing bases
 251         to the pack.
 252
 253     no-progress
 254         Request that progress information that would normally be sent on
 255         side-band channel 2, during the packfile transfer, should not be
 256         sent.  However, the side-band channel 3 is still used for error
 257         responses.
 258
 259     include-tag
 260         Request that annotated tags should be sent if the objects they
 261         point to are being sent.
 262
 263     ofs-delta
 264         Indicate that the client understands PACKv2 with delta referring
 265         to its base by position in pack rather than by an oid.  That is,
 266         they can read OBJ_OFS_DELTA (aka type 6) in a packfile.
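A `fetch` request built from these arguments might look like the following sketch. The helper names are hypothetical, the object id is a dummy value, and the order in which arguments are emitted is a choice made here, not a requirement stated above. The capability list is omitted for brevity.

```python
def pkt_line(text: str) -> bytes:
    """Frame one LF-terminated text line as a pkt-line."""
    payload = text.encode() + b"\n"
    return b"%04x" % (len(payload) + 4) + payload


def fetch_request(wants, haves=(), done=False, options=()) -> bytes:
    """Serialize a fetch command with want/have lines and simple options."""
    args = list(options)
    args += [f"want {oid}" for oid in wants]
    args += [f"have {oid}" for oid in haves]
    if done:
        args.append("done")  # ask the server to stop negotiating
    body = b"".join(pkt_line(arg) for arg in args)
    return pkt_line("command=fetch") + b"0001" + body + b"0000"


dummy_oid = "a" * 40  # placeholder object id
clone_request = fetch_request([dummy_oid], done=True, options=["ofs-delta"])
```

Sending `done` with no `have` lines corresponds to the clone case, where negotiation does not begin at all.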
 267
 268 If the 'shallow' feature is advertised the following arguments can be
  269 included in the client's request as well as the potential addition of the
 270 'shallow-info' section in the server's response as explained below.
 271
 272     shallow <oid>
 273         A client must notify the server of all commits for which it only
 274         has shallow copies (meaning that it doesn't have the parents of
 275         a commit) by supplying a 'shallow <oid>' line for each such
 276         object so that the server is aware of the limitations of the
 277         client's history.  This is so that the server is aware that the
 278         client may not have all objects reachable from such commits.
 279
 280     deepen <depth>
 281         Requests that the fetch/clone should be shallow having a commit
 282         depth of <depth> relative to the remote side.
 283
 284     deepen-relative
 285         Requests that the semantics of the "deepen" command be changed
 286         to indicate that the depth requested is relative to the client's
 287         current shallow boundary, instead of relative to the requested
 288         commits.
 289
 290     deepen-since <timestamp>
 291         Requests that the shallow clone/fetch should be cut at a
 292         specific time, instead of depth.  Internally it's equivalent to
 293         doing "git rev-list --max-age=<timestamp>". Cannot be used with
 294         "deepen".
 295
 296     deepen-not <rev>
 297         Requests that the shallow clone/fetch should be cut at a
 298         specific revision specified by '<rev>', instead of a depth.
  299         Internally it's equivalent to doing "git rev-list --not <rev>".
 300         Cannot be used with "deepen", but can be used with
 301         "deepen-since".
 302
 303 If the 'filter' feature is advertised, the following argument can be
 304 included in the client's request:
 305
 306     filter <filter-spec>
 307         Request that various objects from the packfile be omitted
 308         using one of several filtering techniques. These are intended
 309         for use with partial clone and partial fetch operations. See
 310         `rev-list` for possible "filter-spec" values. When communicating
 311         with other processes, senders SHOULD translate scaled integers
 312         (e.g. "1k") into a fully-expanded form (e.g. "1024") to aid
 313         interoperability with older receivers that may not understand
 314         newly-invented scaling suffixes. However, receivers SHOULD
 315         accept the following suffixes: 'k', 'm', and 'g' for 1024,
 316         1048576, and 1073741824, respectively.
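The suggested expansion of scaled integers can be sketched as follows. The function names are hypothetical, and the sketch only rewrites the `blob:limit=<n>` filter form documented for `rev-list`; other filter specs pass through unchanged.

```python
import re

# Multipliers listed in the text above.
SCALE = {"k": 1024, "m": 1048576, "g": 1073741824}


def expand_scaled(value: str) -> str:
    """Turn '1k' into '1024'; leave anything unrecognized untouched."""
    match = re.fullmatch(r"(\d+)([kmg])", value)
    if not match:
        return value
    number, suffix = match.groups()
    return str(int(number) * SCALE[suffix])


def expand_filter_spec(spec: str) -> str:
    """Expand the size in 'blob:limit=<n>' before sending the filter line."""
    prefix = "blob:limit="
    if spec.startswith(prefix):
        return prefix + expand_scaled(spec[len(prefix):])
    return spec
```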
 317
 318 If the 'ref-in-want' feature is advertised, the following argument can
 319 be included in the client's request as well as the potential addition of
 320 the 'wanted-refs' section in the server's response as explained below.
 321
 322     want-ref <ref>
 323         Indicates to the server that the client wants to retrieve a
 324         particular ref, where <ref> is the full name of a ref on the
 325         server.
 326
 327 If the 'sideband-all' feature is advertised, the following argument can be
 328 included in the client's request:
 329
 330     sideband-all
 331         Instruct the server to send the whole response multiplexed, not just
 332         the packfile section. All non-flush and non-delim PKT-LINE in the
 333         response (not only in the packfile section) will then start with a byte
 334         indicating its sideband (1, 2, or 3), and the server may send "0005\2"
 335         (a PKT-LINE of sideband 2 with no payload) as a keepalive packet.
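As a small illustration of the keepalive described above (helper names are hypothetical), a sideband-2 packet with no payload is five bytes long on the wire, and a client can recognize and discard it:

```python
def pkt_line(payload: bytes) -> bytes:
    """Frame payload as a pkt-line (length includes the 4-byte header)."""
    return b"%04x" % (len(payload) + 4) + payload


KEEPALIVE = pkt_line(b"\x02")  # sideband 2, empty payload


def is_keepalive(payload: bytes) -> bool:
    """True for a packet body that is only the sideband-2 marker byte."""
    return payload == b"\x02"
```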
 336
 337 If the 'packfile-uris' feature is advertised, the following argument
 338 can be included in the client's request as well as the potential
 339 addition of the 'packfile-uris' section in the server's response as
 340 explained below.
 341
 342     packfile-uris <comma-separated list of protocols>
 343         Indicates to the server that the client is willing to receive
 344         URIs of any of the given protocols in place of objects in the
 345         sent packfile. Before performing the connectivity check, the
 346         client should download from all given URIs. Currently, the
 347         protocols supported are "http" and "https".
 348
 349 The response of `fetch` is broken into a number of sections separated by
 350 delimiter packets (0001), with each section beginning with its section
 351 header. Most sections are sent only when the packfile is sent.
 352
  353     output = acknowledgments flush-pkt |
 354              [acknowledgments delim-pkt] [shallow-info delim-pkt]
 355              [wanted-refs delim-pkt] [packfile-uris delim-pkt]
 356              packfile flush-pkt
 357
 358     acknowledgments = PKT-LINE("acknowledgments" LF)
 359                       (nak | *ack)
 360                       (ready)
 361     ready = PKT-LINE("ready" LF)
 362     nak = PKT-LINE("NAK" LF)
 363     ack = PKT-LINE("ACK" SP obj-id LF)
 364
 365     shallow-info = PKT-LINE("shallow-info" LF)
 366                    *PKT-LINE((shallow | unshallow) LF)
 367     shallow = "shallow" SP obj-id
 368     unshallow = "unshallow" SP obj-id
 369
 370     wanted-refs = PKT-LINE("wanted-refs" LF)
 371                   *PKT-LINE(wanted-ref LF)
 372     wanted-ref = obj-id SP refname
 373
 374     packfile-uris = PKT-LINE("packfile-uris" LF) *packfile-uri
 375     packfile-uri = PKT-LINE(40*(HEXDIGIT) SP *%x20-ff LF)
 376
 377     packfile = PKT-LINE("packfile" LF)
 378                *PKT-LINE(%x01-03 *%x00-ff)
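A client could split a `fetch` response into its sections along the lines of this sketch. The function names are hypothetical and the sample response is fabricated for illustration; the pack data in it is not a real packfile.

```python
def pkt_line(payload: bytes) -> bytes:
    """Frame payload as a pkt-line (length includes the 4-byte header)."""
    return b"%04x" % (len(payload) + 4) + payload


def split_sections(data: bytes) -> dict:
    """Map each section header to the raw payloads that follow it."""
    sections = {}
    current = None
    pos = 0
    while True:
        length = int(data[pos:pos + 4], 16)
        if length == 0:      # flush-pkt: end of the response
            break
        if length == 1:      # delim-pkt: the next packet is a section header
            current = None
            pos += 4
            continue
        if length < 4:
            raise ValueError("unexpected special packet")
        payload = data[pos + 4:pos + length]
        pos += length
        if current is None:
            current = payload.decode().rstrip("\n")
            sections[current] = []
        else:
            sections[current].append(payload)
    return sections


oid = b"a" * 40  # placeholder object id
response = (pkt_line(b"acknowledgments\n") + pkt_line(b"ACK " + oid + b"\n")
            + pkt_line(b"ready\n") + b"0001"
            + pkt_line(b"packfile\n") + pkt_line(b"\x01PACKdata") + b"0000")
sections = split_sections(response)
```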
 379
 380     acknowledgments section
 381         * If the client determines that it is finished with negotiations by
 382           sending a "done" line (thus requiring the server to send a packfile),
  383           the acknowledgments section MUST be omitted from the server's
 384           response.
 385
 386         * Always begins with the section header "acknowledgments"
 387
 388         * The server will respond with "NAK" if none of the object ids sent
 389           as have lines were common.
 390
 391         * The server will respond with "ACK obj-id" for all of the
 392           object ids sent as have lines which are common.
 393
 394         * A response cannot have both "ACK" lines as well as a "NAK"
 395           line.
 396
 397         * The server will respond with a "ready" line indicating that
 398           the server has found an acceptable common base and is ready to
 399           make and send a packfile (which will be found in the packfile
 400           section of the same response)
 401
 402         * If the server has found a suitable cut point and has decided
 403           to send a "ready" line, then the server can decide to (as an
 404           optimization) omit any "ACK" lines it would have sent during
 405           its response.  This is because the server will have already
 406           determined the objects it plans to send to the client and no
 407           further negotiation is needed.
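The rules for the acknowledgments section can be captured in a short check like the one below. The function name is hypothetical; it takes the lines that follow the section header and returns the acknowledged object ids together with whether "ready" was seen.

```python
def parse_acknowledgments(lines):
    """Return (acked_oids, ready) from the lines after the section header."""
    acks, saw_nak, ready = [], False, False
    for raw in lines:
        line = raw.rstrip("\n")
        if line == "NAK":
            saw_nak = True
        elif line == "ready":
            ready = True
        elif line.startswith("ACK "):
            acks.append(line[len("ACK "):])
        else:
            raise ValueError(f"unexpected line: {line!r}")
    if saw_nak and acks:
        raise ValueError("response contains both ACK and NAK lines")
    return acks, ready
```

When `ready` is true, the packfile section follows in the same response, and the list of ACKs may be empty because the server is allowed to omit them.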
 408
 409     shallow-info section
 410         * If the client has requested a shallow fetch/clone, a shallow
 411           client requests a fetch or the server is shallow then the
 412           server's response may include a shallow-info section.  The
 413           shallow-info section will be included if (due to one of the
 414           above conditions) the server needs to inform the client of any
  415           shallow boundaries or adjustments to the client's already
 416           existing shallow boundaries.
 417
 418         * Always begins with the section header "shallow-info"
 419
 420         * If a positive depth is requested, the server will compute the
 421           set of commits which are no deeper than the desired depth.
 422
 423         * The server sends a "shallow obj-id" line for each commit whose
 424           parents will not be sent in the following packfile.
 425
 426         * The server sends an "unshallow obj-id" line for each commit
 427           which the client has indicated is shallow, but is no longer
 428           shallow as a result of the fetch (due to its parents being
 429           sent in the following packfile).
 430
 431         * The server MUST NOT send any "unshallow" lines for anything
 432           which the client has not indicated was shallow as a part of
 433           its request.
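On the client side, the shallow-info lines can be applied to the locally recorded shallow set as sketched here. The function name is hypothetical, and rejecting an unexpected "unshallow" line is how this sketch chooses to react to a server that violates the MUST NOT above.

```python
def apply_shallow_info(client_shallow: set, lines) -> set:
    """Return the new shallow set after applying shallow/unshallow lines."""
    result = set(client_shallow)
    for raw in lines:
        kind, _, oid = raw.rstrip("\n").partition(" ")
        if kind == "shallow":
            result.add(oid)
        elif kind == "unshallow":
            if oid not in client_shallow:
                raise ValueError("server unshallowed a commit we never listed")
            result.discard(oid)
        else:
            raise ValueError(f"unexpected line: {raw!r}")
    return result
```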
 434
 435     wanted-refs section
 436         * This section is only included if the client has requested a
 437           ref using a 'want-ref' line and if a packfile section is also
 438           included in the response.
 439
 440         * Always begins with the section header "wanted-refs".
 441
 442         * The server will send a ref listing ("<oid> <refname>") for
 443           each reference requested using 'want-ref' lines.
 444
 445         * The server MUST NOT send any refs which were not requested
 446           using 'want-ref' lines.
 447
 448     packfile-uris section
 449         * This section is only included if the client sent
 450           'packfile-uris' and the server has at least one such URI to
 451           send.
 452
 453         * Always begins with the section header "packfile-uris".
 454
 455         * For each URI the server sends, it sends a hash of the pack's
 456           contents (as output by git index-pack) followed by the URI.
 457
 458         * The hashes are 40 hex characters long. When Git upgrades to a new
 459           hash algorithm, this might need to be updated. (It should match
  460           whatever index-pack outputs after "pack\t" or "keep\t".)
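A `packfile-uri` line can be validated as in the following sketch. The function name and the example URI are hypothetical; the 40-character hash length and the "http"/"https" restriction come from the text above.

```python
import re

PACKFILE_URI_LINE = re.compile(r"([0-9a-fA-F]{40}) (.+)")


def parse_packfile_uri(line: str, allowed=("http", "https")):
    """Return (pack_hash, uri) for one line of the packfile-uris section."""
    match = PACKFILE_URI_LINE.fullmatch(line.rstrip("\n"))
    if not match:
        raise ValueError("expected '<40 hex digits> <uri>'")
    pack_hash, uri = match.groups()
    scheme = uri.split("://", 1)[0]
    if scheme not in allowed:
        raise ValueError(f"protocol not requested by the client: {scheme!r}")
    return pack_hash, uri
```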
 461
 462     packfile section
 463         * This section is only included if the client has sent 'want'
 464           lines in its request and either requested that no more
 465           negotiation be done by sending 'done' or if the server has
 466           decided it has found a sufficient cut point to produce a
 467           packfile.
 468
 469         * Always begins with the section header "packfile"
 470
 471         * The transmission of the packfile begins immediately after the
 472           section header
 473
 474         * The data transfer of the packfile is always multiplexed, using
 475           the same semantics of the 'side-band-64k' capability from
 476           protocol version 1.  This means that each packet, during the
 477           packfile data stream, is made up of a leading 4-byte pkt-line
 478           length (typical of the pkt-line format), followed by a 1-byte
 479           stream code, followed by the actual data.
 480
 481           The stream code can be one of:
 482                 1 - pack data
 483                 2 - progress messages
 484                 3 - fatal error message just before stream aborts
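Demultiplexing the packfile stream by its stream code might be done as sketched below. The function name is hypothetical; it takes the pkt-line payloads that follow the "packfile" header and treats stream code 3 as a fatal error.

```python
def demux_packfile(payloads):
    """Return (pack_bytes, progress_messages) from side-band payloads."""
    pack = bytearray()
    progress = []
    for payload in payloads:
        if not payload:
            raise ValueError("missing stream code")
        band, data = payload[0], payload[1:]
        if band == 1:    # pack data
            pack += data
        elif band == 2:  # progress messages
            progress.append(data.decode(errors="replace"))
        elif band == 3:  # fatal error just before the stream aborts
            raise RuntimeError(data.decode(errors="replace"))
        else:
            raise ValueError(f"unknown stream code {band}")
    return bytes(pack), progress
```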
 485
 486 server-option
 487 ~~~~~~~~~~~~~
 488
 489 If advertised, indicates that any number of server specific options can be
 490 included in a request.  This is done by sending each option as a
 491 "server-option=<option>" capability line in the capability-list section of
 492 a request.
 493
 494 The provided options must not contain a NUL or LF character.
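A client could produce the capability lines for its options as in this sketch. The function name and the option value are hypothetical; the NUL/LF check mirrors the restriction above.

```python
def server_option_lines(options, advertised: dict):
    """Return 'server-option=<option>' lines for the capability-list."""
    if "server-option" not in advertised:
        raise ValueError("server did not advertise server-option")
    for option in options:
        if "\0" in option or "\n" in option:
            raise ValueError("option must not contain NUL or LF")
    return [f"server-option={option}" for option in options]
```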
 495
  496 object-format
  497 ~~~~~~~~~~~~~
 498
 499 The server can advertise the `object-format` capability with a value `X` (in the
 500 form `object-format=X`) to notify the client that the server is able to deal
 501 with objects using hash algorithm X.  If not specified, the server is assumed to
 502 only handle SHA-1.  If the client would like to use a hash algorithm other than
 503 SHA-1, it should specify its object-format string.
 504
 505 session-id=<session id>
 506 ~~~~~~~~~~~~~~~~~~~~~~~
 507
 508 The server may advertise a session ID that can be used to identify this process
 509 across multiple requests. The client may advertise its own session ID back to
 510 the server as well.
 511
 512 Session IDs should be unique to a given process. They must fit within a
 513 packet-line, and must not contain non-printable or whitespace characters. The
 514 current implementation uses trace2 session IDs (see
 515 link:api-trace2.html[api-trace2] for details), but this may change and users of
 516 the session ID should not rely on this fact.