.\" Copyright (c) 1998 Andries Brouwer
.\"
.\" This is free documentation; you can redistribute it and/or
.\" modify it under the terms of the GNU General Public License as
.\" published by the Free Software Foundation; either version 2 of
.\" the License, or (at your option) any later version.
.\"
.\" The GNU General Public License's references to "object code"
.\" and "executables" are to be interpreted as the output of any
.\" document formatting or typesetting system, including
.\" intermediate and printed output.
.\"
.\" This manual is distributed in the hope that it will be useful,
.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
.\" GNU General Public License for more details.
.\"
.\" You should have received a copy of the GNU General Public
.\" License along with this manual; if not, write to the Free
.\" Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111,
.\" USA.
.\"
.\" 2003-08-24 fix for / by John Kristoff + joey
.\"
.TH GLOB 7 2003-08-24 "Linux" "Linux Programmer's Manual"
.SH NAME
glob \- Globbing pathnames
.SH DESCRIPTION
Long ago, in UNIX V6, there was a program
.I /etc/glob
that would expand wildcard patterns.
Soon afterward this became a shell built-in.

These days there is also a library routine
.BR glob (3)
that will perform this function for a user program.

The rules are as follows (POSIX.2, 3.13).
.SS "Wildcard Matching"
A string is a wildcard pattern if it contains one of the
characters \(aq?\(aq, \(aq*\(aq or \(aq[\(aq.
Globbing is the operation
that expands a wildcard pattern into the list of pathnames
matching the pattern.
Matching is defined by:

A \(aq?\(aq (not between brackets) matches any single character.

A \(aq*\(aq (not between brackets) matches any string,
including the empty string.
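.PP
The two rules above can be tried out with a minimal sketch such as the following.
It assumes Python's
.I fnmatch.fnmatchcase()
as a stand-in matcher; this is not
.BR fnmatch (3),
and unlike a shell it gives no special treatment to \(aq/\(aq
or to a leading \(aq.\(aq.
.nf

```python
from fnmatch import fnmatchcase

# '?' matches exactly one character.
print(fnmatchcase("a.c", "a.?"))     # one character after the dot
print(fnmatchcase("a.cc", "a.?"))    # two characters, so no match

# '*' matches any string, including the empty string.
print(fnmatchcase("notes.txt", "*.txt"))
print(fnmatchcase(".txt", "*.txt"))  # here '*' matched the empty string
```

.fi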
.PP
.B "Character classes"
.sp
An expression "\fI[...]\fP" where the first character after the
leading \(aq[\(aq is not an \(aq!\(aq matches a single character,
namely any of the characters enclosed by the brackets.
The string enclosed by the brackets cannot be empty;
therefore \(aq]\(aq can be allowed between the brackets, provided
that it is the first character.
(Thus, "\fI[][!]\fP" matches the
three characters \(aq[\(aq, \(aq]\(aq and \(aq!\(aq.)
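.PP
As a quick check of the "\fI[][!]\fP" example, the sketch below lists which
of a few sample characters the pattern accepts.
It again assumes Python's
.I fnmatch.fnmatchcase()
as the matcher, which happens to follow the same
"first \(aq]\(aq is literal" convention.
.nf

```python
from fnmatch import fnmatchcase

# In "[][!]" the ']' directly after the opening '[' is a literal member,
# so the set contains ']', '[' and '!'.
members = [c for c in "[]!ab" if fnmatchcase(c, "[][!]")]
print(members)
```

.fi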
.PP
.B Ranges
.sp
There is one special convention:
two characters separated by \(aq\-\(aq denote a range.
(Thus, "\fI[A\-Fa\-f0\-9]\fP"
is equivalent to "\fI[ABCDEFabcdef0123456789]\fP".)
One may include \(aq\-\(aq in its literal meaning by making it the
first or last character between the brackets.
(Thus, "\fI[]\-]\fP" matches just the two characters \(aq]\(aq and \(aq\-\(aq,
and "\fI[\-\-0]\fP" matches the
three characters \(aq\-\(aq, \(aq.\(aq, \(aq0\(aq, since \(aq/\(aq
cannot be matched.)
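.PP
The range examples can be checked with a small sketch, assuming Python's
.I fnmatch.fnmatchcase()
as the matcher.
Note the one deliberate difference: that function has no pathname rule,
so for "\fI[\-\-0]\fP" it also reports \(aq/\(aq,
which a shell glob would not match.
.nf

```python
import string
from fnmatch import fnmatchcase

# A range is shorthand for listing every character between its endpoints.
hex_digits = [c for c in string.printable if fnmatchcase(c, "[A-Fa-f0-9]")]
print("".join(sorted(hex_digits)))

# A '-' placed first or last is literal: "[]-]" holds only ']' and '-'.
literal_dash = [c for c in "]-a" if fnmatchcase(c, "[]-]")]
print(literal_dash)

# "[--0]" is the range from '-' to '0'. fnmatchcase() is not pathname
# aware, so '/' shows up here although a shell glob would exclude it.
dash_to_zero = [c for c in "-./0" if fnmatchcase(c, "[--0]")]
print(dash_to_zero)
```

.fi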
.PP
.B Complementation
.sp
An expression "\fI[!...]\fP" matches a single character, namely
any character that is not matched by the expression obtained
by removing the first \(aq!\(aq from it.
(Thus, "\fI[!]a\-]\fP" matches any
single character except \(aq]\(aq, \(aqa\(aq and \(aq\-\(aq.)
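.PP
A short sketch of the complementation example, assuming Python's
.I fnmatch.fnmatchcase()
as the matcher:
.nf

```python
from fnmatch import fnmatchcase

# "[!]a-]" rejects ']', 'a' and '-' and accepts any other single character.
results = {ch: fnmatchcase(ch, "[!]a-]") for ch in "]a-bz"}
for ch, matched in results.items():
    print(repr(ch), matched)
```

.fi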

One can remove the special meaning of \(aq?\(aq, \(aq*\(aq and \(aq[\(aq by
preceding them by a backslash, or, in case this is part of
a shell command line, enclosing them in quotes.
Between brackets these characters stand for themselves.
Thus, "\fI[[?*\e]\fP" matches the
four characters \(aq[\(aq, \(aq?\(aq, \(aq*\(aq and \(aq\e\(aq.
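.PP
The sketch below illustrates both points under one stated assumption:
it uses Python's
.I fnmatch
module, which does not implement the backslash escape described above.
To quote special characters outside brackets it relies on
.IR glob.escape() ,
which wraps each of them in brackets instead.
.nf

```python
import glob
from fnmatch import fnmatchcase

# Inside brackets '[', '?', '*' and a backslash are ordinary members.
members = [c for c in "[?*\\x" if fnmatchcase(c, "[[?*\\]")]
print(members)

# Python's fnmatch has no backslash escape; glob.escape() quotes the
# special characters by wrapping each one in brackets.
quoted = glob.escape("what?.txt")
print(quoted)
print(fnmatchcase("what?.txt", quoted), fnmatchcase("whatX.txt", quoted))
```

.fi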
.SS Pathnames
Globbing is applied on each of the components of a pathname
separately.
A \(aq/\(aq in a pathname cannot be matched by a \(aq?\(aq or \(aq*\(aq
wildcard, or by a range like "\fI[.\-0]\fP".
A range cannot contain an
explicit \(aq/\(aq character; this would lead to a syntax error.

If a filename starts with a \(aq.\(aq,
this character must be matched explicitly.
(Thus, \fIrm\ *\fP will not remove .profile, and \fItar\ c\ *\fP will not
archive all your files; \fItar\ c\ .\fP is better.)
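.PP
The component-wise rules can be sketched as follows.
The helper names
.I component_match
and
.I path_match
are hypothetical, and the sketch simply splits the pattern on \(aq/\(aq,
so it does not diagnose a \(aq/\(aq inside brackets.
.nf

```python
from fnmatch import fnmatchcase

def component_match(name, pattern):
    """Match one pathname component; a leading '.' must be explicit."""
    if name.startswith(".") and not pattern.startswith("."):
        return False
    return fnmatchcase(name, pattern)

def path_match(path, pattern):
    """Match component by component, so no wildcard ever spans a '/'."""
    names, patterns = path.split("/"), pattern.split("/")
    return len(names) == len(patterns) and all(
        component_match(n, p) for n, p in zip(names, patterns))

print(path_match("src/main.c", "src/*.c"))    # True
print(path_match("src/lib/a.c", "src/*.c"))   # False: '*' stops at '/'
print(path_match(".profile", "*"))            # False: leading '.'
print(path_match(".profile", ".pro*"))        # True
```

.fi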
.SS "Empty Lists"
The nice and simple rule given above: "expand a wildcard pattern
into the list of matching pathnames" was the original UNIX
definition.
It allowed one to have patterns that expand into
an empty list, as in
.br
.nf
    xv \-wait 0 *.gif *.jpg
.fi
where perhaps no *.gif files are present (and this is not
an error).
However, POSIX requires that a wildcard pattern is left
unchanged when it is syntactically incorrect, or the list of
matching pathnames is empty.
With
.I bash
one can force the classical behavior by setting the
.I nullglob
shell option
.RI ( "shopt \-s nullglob" );
early versions of
.I bash
used the variable
.I allow_null_glob_expansion=true
for this.
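.PP
The difference between the two behaviors can be sketched as follows.
The function
.I expand
and its
.I nullglob
flag are hypothetical names chosen for this illustration;
the list of names stands in for a directory listing.
.nf

```python
from fnmatch import fnmatchcase

def expand(pattern, names, nullglob=False):
    """Expand one pattern against the names found in a directory."""
    matches = sorted(n for n in names if fnmatchcase(n, pattern))
    if matches or nullglob:
        return matches      # classical rule: possibly an empty list
    return [pattern]        # POSIX rule: unmatched pattern stays as is

names = ["a.jpg", "b.jpg", "notes.txt"]
print(expand("*.jpg", names))
print(expand("*.gif", names))
print(expand("*.gif", names, nullglob=True))
```

.fi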

(Similar problems occur elsewhere.
E.g., where old scripts have
.br
.nf
    rm \`find . \-name "*~"\`
.fi
new scripts require
.br
.nf
    rm \-f nosuchfile \`find . \-name "*~"\`
.fi
to avoid error messages from
.I rm
called with an empty argument list.)
.SH NOTES
.SS Regular expressions
Note that wildcard patterns are not regular expressions,
although they are a bit similar.
First of all, they match
filenames, rather than text, and secondly, the conventions
are not the same: for example, in a regular expression \(aq*\(aq means zero or
more copies of the preceding thing.
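.PP
To make the difference concrete, the sketch below reads the same two
characters once as a wildcard pattern and once as a regular expression.
It assumes Python's
.I fnmatch
and
.I re
modules as stand-ins for the two notations.
.nf

```python
import re
from fnmatch import fnmatchcase

# As a wildcard pattern, "a*" is an 'a' followed by any string.
print(fnmatchcase("abc", "a*"))               # True

# As a regular expression, "a*" is zero or more copies of 'a'.
print(re.fullmatch("a*", "abc") is None)      # True: "abc" is not all 'a'
print(re.fullmatch("a*", "aaa") is not None)  # True
```

.fi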

Now that regular expressions have bracket expressions where
the negation is indicated by a \(aq^\(aq, POSIX has declared the
effect of a wildcard pattern "\fI[^...]\fP" to be undefined.
.SS Character classes and Internationalization
Of course ranges were originally meant to be ASCII ranges,
so that "\fI[\ \-%]\fP" stands for "\fI[\ !"#$%]\fP" and "\fI[a\-z]\fP" stands
for "any lowercase letter".
Some UNIX implementations generalized this so that a range X\-Y
stands for the set of characters with code between the codes for
X and for Y.
However, this requires the user to know the
character coding in use on the local system, and moreover, is
not convenient if the collating sequence for the local alphabet
differs from the ordering of the character codes.
Therefore, POSIX extended the bracket notation greatly,
both for wildcard patterns and for regular expressions.
In the above we saw three types of items that can occur in a bracket
expression: namely (i) the negation, (ii) explicit single characters,
and (iii) ranges.
POSIX specifies ranges in an internationally
more useful way and adds three more types:

(iii) Ranges X\-Y comprise all characters that fall between X
and Y (inclusive) in the current collating sequence as defined
by the
.B LC_COLLATE
category in the current locale.

(iv) Named character classes, like
.nf

[:alnum:]  [:alpha:]  [:blank:]  [:cntrl:]
[:digit:]  [:graph:]  [:lower:]  [:print:]
[:punct:]  [:space:]  [:upper:]  [:xdigit:]

.fi
so that one can say "\fI[[:lower:]]\fP" instead of "\fI[a\-z]\fP", and have
things work in Denmark, too, where there are three letters past \(aqz\(aq
in the alphabet.
These character classes are defined by the
.B LC_CTYPE
category
in the current locale.

(v) Collating symbols, like "\fI[.ch.]\fP" or "\fI[.a\-acute.]\fP",
where the string between "\fI[.\fP" and "\fI.]\fP" is a collating
element defined for the current locale.
Note that this may
be a multicharacter element.

(vi) Equivalence class expressions, like "\fI[=a=]\fP",
where the string between "\fI[=\fP" and "\fI=]\fP" is any collating
element from its equivalence class, as defined for the
current locale.
For example, "\fI[[=a=]]\fP" might be equivalent
to "\fI[a\('a\(`a\(:a\(^a]\fP", that is,
to "\fI[a[.a\-acute.][.a\-grave.][.a\-umlaut.][.a\-circumflex.]]\fP".
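.PP
The contrast between code-based ranges and class-based tests can be
sketched as follows.
The helper
.I ascii_range
is a hypothetical name, and Python's
.I str.islower()
is used only as an approximation of "\fI[[:lower:]]\fP":
it is driven by Unicode properties, not by the
.B LC_CTYPE
category of the current locale.
.nf

```python
def ascii_range(first, last):
    """Every character whose code lies between those of the endpoints."""
    return "".join(chr(c) for c in range(ord(first), ord(last) + 1))

print(ascii_range(" ", "%"))   # the characters denoted by "[ -%]"

# The three Danish letters have codes above 'z', so a code-based
# "[a-z]" misses them, while a class test still accepts them.
for ch in "zæøå":
    print(ch, "a" <= ch <= "z", ch.islower())
```

.fi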
.SH "SEE ALSO"
.BR sh (1),
.BR fnmatch (3),
.BR glob (3),
.BR locale (7),
.BR regex (7)