.\" Copyright (c) 1998 Andries Brouwer
.\"
.\" SPDX-License-Identifier: GPL-2.0-or-later
.\"
.\" 2003-08-24 fix for / by John Kristoff + joey
.\"
.TH GLOB 7 2020-08-13 "Linux" "Linux Programmer's Manual"
.SH NAME
glob \- globbing pathnames
.SH DESCRIPTION
Long ago, in UNIX\ V6, there was a program
.I /etc/glob
that would expand wildcard patterns.
Soon afterward this became a shell built-in.
.PP
These days there is also a library routine
.BR glob (3)
that will perform this function for a user program.
.PP
The rules are as follows (POSIX.2, 3.13).
.SS Wildcard matching
A string is a wildcard pattern if it contains one of the
characters \(aq?\(aq, \(aq*\(aq, or \(aq[\(aq.
Globbing is the operation
that expands a wildcard pattern into the list of pathnames
matching the pattern.
Matching is defined by:
.PP
A \(aq?\(aq (not between brackets) matches any single character.
.PP
A \(aq*\(aq (not between brackets) matches any string,
including the empty string.
.PP
.B "Character classes"
.PP
An expression "\fI[...]\fP" where the first character after the
leading \(aq[\(aq is not an \(aq!\(aq matches a single character,
namely any of the characters enclosed by the brackets.
The string enclosed by the brackets cannot be empty;
therefore \(aq]\(aq can be allowed between the brackets, provided
that it is the first character.
(Thus, "\fI[][!]\fP" matches the
three characters \(aq[\(aq, \(aq]\(aq, and \(aq!\(aq.)
.PP
.B Ranges
.PP
There is one special convention:
two characters separated by \(aq\-\(aq denote a range.
(Thus, "\fI[A\-Fa\-f0\-9]\fP"
is equivalent to "\fI[ABCDEFabcdef0123456789]\fP".)
One may include \(aq\-\(aq in its literal meaning by making it the
first or last character between the brackets.
(Thus, "\fI[]\-]\fP" matches just the two characters \(aq]\(aq and \(aq\-\(aq,
and "\fI[\-\-0]\fP" matches the
three characters \(aq\-\(aq, \(aq.\(aq, \(aq0\(aq, since \(aq/\(aq
cannot be matched.)
.PP
.B Complementation
.PP
An expression "\fI[!...]\fP" matches a single character, namely
any character that is not matched by the expression obtained
by removing the first \(aq!\(aq from it.
(Thus, "\fI[!]a\-]\fP" matches any
single character except \(aq]\(aq, \(aqa\(aq, and \(aq\-\(aq.)
.PP
One can remove the special meaning of \(aq?\(aq, \(aq*\(aq, and \(aq[\(aq by
preceding them by a backslash, or, in case this is part of
a shell command line, enclosing them in quotes.
Between brackets these characters stand for themselves.
Thus, "\fI[[?*\e]\fP" matches the
four characters \(aq[\(aq, \(aq?\(aq, \(aq*\(aq, and \(aq\e\(aq.
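.PP
As a rough illustration of the matching rules above, the following sketch
checks a few of the example patterns with Python's fnmatch.fnmatchcase().
It is an approximation, not the POSIX matcher:
Python's fnmatch does not treat a backslash as an escape character
and gives \(aq/\(aq no special treatment,
so only rules that both share are exercised.
The names CASES and check_cases are made up for this example.

```python
from fnmatch import fnmatchcase

# (pattern, candidate, expected) triples taken from the rules above.
CASES = [
    ("?", "a", True),             # '?' matches exactly one character
    ("?", "ab", False),
    ("*", "", True),              # '*' also matches the empty string
    ("[][!]", "]", True),         # a leading ']' is an ordinary member
    ("[][!]", "!", True),
    ("[A-Fa-f0-9]", "c", True),   # ranges
    ("[A-Fa-f0-9]", "G", False),
    ("[]-]", "-", True),          # a trailing '-' is literal
    ("[!]a-]", "b", True),        # complement of ']', 'a' and '-'
    ("[!]a-]", "a", False),
]


def check_cases(cases=CASES):
    """Return the cases whose fnmatchcase() result differs from expected."""
    return [(p, s, want) for p, s, want in cases if fnmatchcase(s, p) != want]


if __name__ == "__main__":
    # An empty list means every case behaved as the rules describe.
    print(check_cases())
```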
.SS Pathnames
Globbing is applied on each of the components of a pathname
separately.
A \(aq/\(aq in a pathname cannot be matched by a \(aq?\(aq or \(aq*\(aq
wildcard, or by a range like "\fI[.\-0]\fP".
A range containing an explicit \(aq/\(aq character is syntactically incorrect.
(POSIX requires that syntactically incorrect patterns are left unchanged.)
.PP
If a filename starts with a \(aq.\(aq,
this character must be matched explicitly.
(Thus, \fIrm\ *\fP will not remove .profile, and \fItar\ c\ *\fP will not
archive all your files; \fItar\ c\ .\fP is better.)
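.PP
The two pathname rules (no wildcard crosses a \(aq/\(aq,
and a leading \(aq.\(aq must be matched explicitly)
can be observed with Python's glob module,
which follows them on the author's understanding of its documentation.
The sketch below builds a throwaway directory;
the file names and the helpers list_matches and demo
are invented for the example.

```python
import glob
import os
import tempfile


def list_matches(root, pattern):
    """Expand pattern relative to root; return sorted relative paths."""
    full = os.path.join(glob.escape(root), pattern)
    return sorted(os.path.relpath(p, root) for p in glob.glob(full))


def demo():
    """Create a small tree and expand four patterns against it."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "sub"))
        for name in (".profile", "notes.txt", os.path.join("sub", "inner.txt")):
            with open(os.path.join(root, name), "w"):
                pass
        return {
            "*": list_matches(root, "*"),          # skips .profile
            ".*": list_matches(root, ".*"),        # explicit leading dot
            "*.txt": list_matches(root, "*.txt"),  # does not descend into sub/
            "*/*.txt": list_matches(root, "*/*.txt"),
        }


if __name__ == "__main__":
    for pattern, names in demo().items():
        print(pattern, names)
```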
.SS Empty lists
The nice and simple rule given above: "expand a wildcard pattern
into the list of matching pathnames" was the original UNIX
definition.
It allowed one to have patterns that expand into
an empty list, as in
.PP
.nf
    xv \-wait 0 *.gif *.jpg
.fi
.PP
where perhaps no *.gif files are present (and this is not
an error).
However, POSIX requires that a wildcard pattern is left
unchanged when it is syntactically incorrect, or the list of
matching pathnames is empty.
With
.I bash
one can force the classical behavior using this command:
.PP
.in +4n
.EX
shopt \-s nullglob
.EE
.in
.\" In Bash v1, by setting allow_null_glob_expansion=true
.PP
(Similar problems occur elsewhere.
For example, where old scripts have
.PP
.in +4n
.EX
rm \`find . \-name "*\(ti"\`
.EE
.in
.PP
new scripts require
.PP
.in +4n
.EX
rm \-f nosuchfile \`find . \-name "*\(ti"\`
.EE
.in
.PP
to avoid error messages from
.I rm
called with an empty argument list.)
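.PP
The difference between the classical rule and the POSIX rule
can be sketched in a few lines of Python.
The helper expand below is hypothetical and only mimics what a shell does
with a single word;
its nullglob parameter is named after the bash option
but is not an interface of any real library.

```python
import glob
import os
import tempfile


def expand(pattern, nullglob=False):
    """Expand one word roughly the way a shell would (hypothetical helper)."""
    matches = sorted(glob.glob(pattern))
    if matches or nullglob:
        return matches
    # POSIX rule: a pattern that matches nothing is passed on unchanged.
    return [pattern]


def demo():
    """Expand '*.gif' in an empty directory, with and without nullglob."""
    with tempfile.TemporaryDirectory() as root:
        pattern = os.path.join(glob.escape(root), "*.gif")
        return pattern, expand(pattern), expand(pattern, nullglob=True)


if __name__ == "__main__":
    pattern, posix_result, classic_result = demo()
    print(posix_result == [pattern], classic_result)
```

.PP
With the POSIX rule the command receives the literal pattern as an argument;
with the classical rule it receives nothing,
which is why the
.I rm
example above needs a placeholder argument.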
.SH NOTES
.SS Regular expressions
Note that wildcard patterns are not regular expressions,
although they are a bit similar.
First of all, they match
filenames, rather than text, and secondly, the conventions
are not the same: for example, in a regular expression \(aq*\(aq means zero or
more copies of the preceding thing.
.PP
Now that regular expressions have bracket expressions where
the negation is indicated by a \(aq\(ha\(aq, POSIX has declared the
effect of a wildcard pattern "\fI[\(ha...]\fP" to be undefined.
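.PP
To make the difference concrete, the following sketch feeds the same
two-character string "a*" to a wildcard matcher and to a regular
expression matcher.
Python's fnmatch and re modules are used as stand-ins here;
the sample strings are arbitrary.

```python
import re
from fnmatch import fnmatchcase

SAMPLES = ("a", "abc", "aaa", "")

# As a wildcard pattern, "a*" means: 'a' followed by any string.
glob_hits = [s for s in SAMPLES if fnmatchcase(s, "a*")]

# As a regular expression, "a*" means: zero or more copies of 'a'.
regex_hits = [s for s in SAMPLES if re.fullmatch("a*", s)]

if __name__ == "__main__":
    print(glob_hits)
    print(regex_hits)
```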
.SS Character classes and internationalization
Of course ranges were originally meant to be ASCII ranges,
so that "\fI[\ \-%]\fP" stands for "\fI[\ !"#$%]\fP" and "\fI[a\-z]\fP" stands
for "any lowercase letter".
Some UNIX implementations generalized this so that a range X\-Y
stands for the set of characters with code between the codes for
X and for Y.
However, this requires the user to know the
character coding in use on the local system, and moreover, is
not convenient if the collating sequence for the local alphabet
differs from the ordering of the character codes.
Therefore, POSIX extended the bracket notation greatly,
both for wildcard patterns and for regular expressions.
In the above we saw three types of items that can occur in a bracket
expression: namely (i) the negation, (ii) explicit single characters,
and (iii) ranges.
POSIX specifies ranges in an internationally
more useful way and adds three more types:
.PP
(iii) Ranges X\-Y comprise all characters that fall between X
and Y (inclusive) in the current collating sequence as defined
by the
.B LC_COLLATE
category in the current locale.
.PP
(iv) Named character classes, like
.PP
.nf
[:alnum:]  [:alpha:]  [:blank:]  [:cntrl:]
[:digit:]  [:graph:]  [:lower:]  [:print:]
[:punct:]  [:space:]  [:upper:]  [:xdigit:]
.fi
.PP
so that one can say "\fI[[:lower:]]\fP" instead of "\fI[a\-z]\fP", and have
things work in Denmark, too, where there are three letters past \(aqz\(aq
in the alphabet.
These character classes are defined by the
.B LC_CTYPE
category
in the current locale.
.PP
(v) Collating symbols, like "\fI[.ch.]\fP" or "\fI[.a-acute.]\fP",
where the string between "\fI[.\fP" and "\fI.]\fP" is a collating
element defined for the current locale.
Note that this may
be a multicharacter element.
.PP
(vi) Equivalence class expressions, like "\fI[=a=]\fP",
where the string between "\fI[=\fP" and "\fI=]\fP" is any collating
element from its equivalence class, as defined for the
current locale.
For example, "\fI[[=a=]]\fP" might be equivalent
to "\fI[a\('a\(\`a\(:a\(^a]\fP", that is,
to "\fI[a[.a-acute.][.a-grave.][.a-umlaut.][.a-circumflex.]]\fP".
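.PP
The Danish example can be approximated in Python, with two caveats
stated up front:
Python's fnmatch compares code points and has no named character classes,
and str.islower() consults Unicode properties rather than
.BR LC_CTYPE .
The sketch therefore only shows why a code-based range "[a-z]"
is a poor substitute for "any lowercase letter";
it does not exercise a POSIX "[[:lower:]]" implementation.

```python
from fnmatch import fnmatchcase

# The three letters that follow 'z' in the Danish alphabet
# (ae ligature, o with stroke, a with ring), written as escapes.
DANISH_EXTRA = "\u00e6\u00f8\u00e5"

# A code-point range misses all three letters ...
in_range = [c for c in DANISH_EXTRA if fnmatchcase(c, "[a-z]")]

# ... although each of them is a lowercase letter.
is_lower = [c for c in DANISH_EXTRA if c.islower()]

if __name__ == "__main__":
    print(len(in_range), len(is_lower))
```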
.SH SEE ALSO
.BR sh (1),
.BR fnmatch (3),
.BR glob (3),
.BR locale (7),
.BR regex (7)