//po4a: entry man manual
////
SPDX-License-Identifier: MIT

Copyright (C) 2008 - 2012 Julian Andres Klode. See hardlink.c for license.
Copyright (C) 2021 Karel Zak <kzak@redhat.com>
////
= hardlink(1)
:doctype: manpage
:man manual: User Commands
:man source: util-linux {release-version}
:page-layout: base
:command: hardlink

== NAME

hardlink - link multiple copies of a file

== SYNOPSIS

*hardlink* [options] [_directory_|_file_]...

== DESCRIPTION

*hardlink* is a tool that replaces copies of a file with either hardlinks
or copy-on-write clones, thus saving space.

*hardlink* first creates a binary tree of file sizes and then compares
the content of files that have the same size. There are two basic content
comparison methods. The *memcmp* method directly reads data blocks from
files and compares them. The other method is based on checksums (like SHA256);
in this case the Linux kernel crypto API calculates a checksum for each data
block, and this checksum is stored in userspace and used for file
comparisons.

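The two steps above can be illustrated with a minimal Python sketch. It is not the *hardlink* implementation: the function names are hypothetical, a dictionary stands in for the binary tree of file sizes, and the checksums are computed in userspace with `hashlib` rather than by the Linux kernel crypto API. The default block sizes are the ones given for *--io-size*.

```python
import hashlib
import os
from collections import defaultdict


def group_by_size(paths):
    """Bucket files by size; only files of the same size can be equal."""
    groups = defaultdict(list)
    for path in paths:
        groups[os.path.getsize(path)].append(path)
    # A bucket holding a single file has nothing to be compared against.
    return {size: names for size, names in groups.items() if len(names) > 1}


def equal_memcmp(path_a, path_b, io_size=8 * 1024):
    """memcmp-style comparison: read both files block by block."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            block_a = fa.read(io_size)
            block_b = fb.read(io_size)
            if block_a != block_b:
                return False
            if not block_a:  # both files ended at the same offset
                return True


def block_checksums(path, io_size=1024 * 1024):
    """Checksum-style comparison input: one SHA256 digest per data block.

    Assumption: hashlib stands in for the kernel crypto API here.
    """
    digests = []
    with open(path, "rb") as f:
        while block := f.read(io_size):
            digests.append(hashlib.sha256(block).digest())
    return digests
```
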
An "intro" buffer (32 bytes) is also cached for each file. This buffer is used
independently of the comparison method and of the requested cache-size and io-size.
The "intro" buffer dramatically reduces operations on file content, as files
very often differ right from the beginning.

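A minimal Python sketch of the "intro" check follows. It assumes the buffer holds the first 32 bytes of each file; the text gives only the size, so the offset is an assumption. The `IntroCache` class and its methods are hypothetical names used for illustration only.

```python
INTRO_SIZE = 32  # size of the cached "intro" buffer, as stated above


def read_intro(path):
    """Return the first INTRO_SIZE bytes of a file (fewer for short files)."""
    with open(path, "rb") as f:
        return f.read(INTRO_SIZE)


class IntroCache:
    """Hypothetical helper: reads each file's intro once and remembers it."""

    def __init__(self):
        self._intros = {}

    def intro(self, path):
        if path not in self._intros:
            self._intros[path] = read_intro(path)
        return self._intros[path]

    def may_be_equal(self, path_a, path_b):
        """False: the files certainly differ. True: a full comparison is needed."""
        return self.intro(path_a) == self.intro(path_b)
```
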
== OPTIONS

include::man-common/help-version.adoc[]

*-c*, *--content*::
Consider only file content, not attributes, when determining whether two files are equal. Same as *-pot*.

*-b*, *--io-size* _size_::
The size of the *read*(2) or *sendfile*(2) buffer used when comparing file contents.
The _size_ argument may be followed by the multiplicative suffixes KiB, MiB,
etc. The "iB" is optional, e.g., "K" has the same meaning as "KiB". The
default is 8KiB for the memcmp method and 1MiB for the other methods. Only the
memcmp method uses process memory for the buffer; the other methods work in a
zero-copy way and the I/O operation is done in the kernel. The size may be altered on the fly
to fit a number of cached content checksums.

*-d*, *--respect-dir*::
Only try to link files with the same directory name. The top-level directory (as specified on the *hardlink* command line) is ignored. For example, *hardlink --respect-dir /foo /bar* will link _/foo/some/file_ with _/bar/some/file_, but not _/bar/other/file_. If combined with *--respect-name*, then entire paths (except the top-level directory) are compared.

*-f*, *--respect-name*::
Only try to link files with the same (base)name. It is strongly recommended to use the long option rather than *-f*, which is interpreted differently by other *hardlink* implementations.

*-i*, *--include* _regex_::
A regular expression to include files. If the option *--exclude* has been given, this option re-includes files which would otherwise be excluded. If the option is used without *--exclude*, only files matched by the pattern are included.

*-m*, *--maximize*::
Among equal files, keep the file with the highest link count.

*-M*, *--minimize*::
Among equal files, keep the file with the lowest link count.

*-n*, *--dry-run*::
Do not act, just print what would happen.

*-o*, *--ignore-owner*::
Link and compare files even if their owner information (user and group) differs. Results may be unpredictable.

*-O*, *--keep-oldest*::
Among equal files, keep the oldest file (least recent modification time). By default, the newest file is kept. If *--maximize* or *--minimize* is specified, the link count has a higher precedence than the time of modification.

*-p*, *--ignore-mode*::
Link and compare files even if their mode is different. Results may be slightly unpredictable.

*-q*, *--quiet*::
Quiet mode, don't print anything.

*-r*, *--cache-size* _size_::
The size of the cache for content checksums. All non-memcmp methods calculate a checksum for each
file content block (see *--io-size*); these checksums are cached for the next comparison. The
size is important for large files or large sets of files of the same size. The default is
10MiB.

*-s*, *--minimum-size* _size_::
The minimum size to consider. By default this is 1, so empty files will not be linked. The _size_ argument may be followed by the multiplicative suffixes KiB (=1024), MiB (=1024*1024), and so on for GiB, TiB, PiB, EiB, ZiB and YiB (the "iB" is optional, e.g., "K" has the same meaning as "KiB").

*-S*, *--maximum-size* _size_::
The maximum size to consider. By default this is 0, which has the special meaning of unlimited. The _size_ argument may be followed by the multiplicative suffixes KiB (=1024), MiB (=1024*1024), and so on for GiB, TiB, PiB, EiB, ZiB and YiB (the "iB" is optional, e.g., "K" has the same meaning as "KiB").

*-t*, *--ignore-time*::
Link and compare files even if their time of modification is different. This is usually a good choice.

*-v*, *--verbose*::
Verbose output, explain to the user what is being done. If specified once, every hardlinked file is displayed. If specified twice, it also shows every comparison.

*-x*, *--exclude* _regex_::
A regular expression which excludes files from being compared and linked.

*-X*, *--respect-xattrs*::
Only try to link files with the same extended attributes.

*-y*, *--method* _name_::
Set the file content comparison method. The currently supported methods are
sha256, sha1, crc32c and memcmp. The default is sha256, or memcmp if the Linux
Crypto API is not available. The checksum-based methods are implemented in a
zero-copy way; in this case file contents are not copied to userspace and all
calculation is done in the kernel.

*--reflink*[=_when_]::
Create copy-on-write clones (aka reflinks) rather than hardlinks. The reflinked files
share only on-disk data, but the file mode and owner can be different. It is recommended
to use it with the *--ignore-owner* and *--ignore-mode* options. This option implies
*--skip-reflinks* to ignore already cloned files.
+
The optional argument _when_ can be *never*, *always*, or *auto*. If the _when_ argument
is omitted, it defaults to *auto*. In this case, *hardlink* checks the filesystem type,
uses reflinks on BTRFS and XFS only, and falls back to hardlinks when creating a reflink is impossible.
The argument *always* disables filesystem type detection and the fallback to hardlinks; in this case,
only reflinks are allowed.

*--skip-reflinks*::
Ignore already cloned files. This option may be used without *--reflink* when creating classic hardlinks.

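The _size_ arguments of *--io-size*, *--minimum-size* and *--maximum-size* can be illustrated with a short Python sketch. It is a simplified stand-in, not the parser used by *hardlink*: `parse_size` and `in_size_range` are hypothetical names, and only the spellings described in this manual page (a plain number, "K" or "KiB", "M" or "MiB", and so on) are accepted.

```python
import re

# Binary multipliers in increasing order: K = KiB = 1024, M = MiB = 1024**2, ...
_SUFFIXES = "KMGTPEZY"
_SIZE_RE = re.compile(r"^(\d+)(?:([KMGTPEZY])(?:iB)?)?$")


def parse_size(text):
    """Convert a string such as '8KiB' or '1M' to a number of bytes."""
    match = _SIZE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unsupported size: {text!r}")
    number, suffix = match.groups()
    if suffix is None:
        return int(number)
    return int(number) * 1024 ** (_SUFFIXES.index(suffix) + 1)


def in_size_range(size, minimum=1, maximum=0):
    """Apply the documented defaults: minimum 1, maximum 0 (unlimited)."""
    return size >= minimum and (maximum == 0 or size <= maximum)
```
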
== ARGUMENTS

*hardlink* takes one or more directories which will be searched for files to be linked.

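The directory search, combined with the *--include* and *--exclude* rules from OPTIONS, can be sketched in Python as follows. The function names are hypothetical. Matching the pattern against the full path, using Python's `re` syntax, and skipping symbolic links are all assumptions of this sketch, not documented behavior of *hardlink*.

```python
import os
import re


def is_selected(path, include=None, exclude=None):
    """Decide whether a path takes part in comparison and linking."""
    if include is not None and re.search(include, path):
        return True  # --include re-includes files hit by --exclude
    if exclude is not None:
        return re.search(exclude, path) is None
    # No --exclude: with --include alone, only matching files are kept.
    return include is None


def candidate_files(roots, include=None, exclude=None):
    """Yield the regular files found below each of the given directories."""
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                if is_selected(path, include, exclude):
                    yield path
```
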
== BUGS

The original *hardlink* implementation uses the option *-f* to force the creation of hardlinks between filesystems. This very rarely usable feature is no longer supported by the current *hardlink*.

*hardlink* assumes that the trees it operates on do not change during operation. If a tree does change, the result is undefined and potentially dangerous. For example, if a regular file is replaced by a device, *hardlink* may start reading from the device. If a component of a path is replaced by a symbolic link or file permissions change, security may be compromised. Do not run *hardlink* on a changing tree or on a tree controlled by another user.

== AUTHOR

There are multiple *hardlink* implementations. The very first implementation is from Jakub Jelinek for the Fedora distribution; this implementation was used in util-linux from v2.34 to v2.36. The current implementation is based on the Debian version from Julian Andres Klode.

include::man-common/bugreports.adoc[]

include::man-common/footer.adoc[]

ifdef::translation[]
include::man-common/translation.adoc[]
endif::[]