Commit | Line | Data |
---|---|---|
fea681da MK |
1 | .\" Copyright (c) 1998 Andries Brouwer |
2 | .\" | |
3 | .\" This is free documentation; you can redistribute it and/or | |
4 | .\" modify it under the terms of the GNU General Public License as | |
5 | .\" published by the Free Software Foundation; either version 2 of | |
6 | .\" the License, or (at your option) any later version. | |
7 | .\" | |
8 | .\" The GNU General Public License's references to "object code" | |
9 | .\" and "executables" are to be interpreted as the output of any | |
10 | .\" document formatting or typesetting system, including | |
11 | .\" intermediate and printed output. | |
12 | .\" | |
13 | .\" This manual is distributed in the hope that it will be useful, | |
14 | .\" but WITHOUT ANY WARRANTY; without even the implied warranty of | |
15 | .\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | |
16 | .\" GNU General Public License for more details. | |
17 | .\" | |
18 | .\" You should have received a copy of the GNU General Public | |
19 | .\" License along with this manual; if not, write to the Free | |
20 | .\" Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, | |
21 | .\" USA. | |
22 | .\" | |
23 | .\" 2003-08-24 fix for / by John Kristoff + joey | |
24 | .\" | |
69289f8a | 25 | .TH GLOB 7 2003-08-24 "Linux" "Linux Programmer's Manual" |
fea681da MK |
26 | .SH NAME |
27 | glob \- globbing pathnames | |
28 | .SH DESCRIPTION | |
008f1ecc | 29 | Long ago, in UNIX V6, there was a program |
fea681da MK |
30 | .I /etc/glob |
31 | that would expand wildcard patterns. | |
5fab2e7c | 32 | Soon afterward this became a shell built-in. |
fea681da MK |
33 | |
34 | These days there is also a library routine | |
35 | .BR glob (3) | |
36 | that will perform this function for a user program. | |
37 | ||
4dec66f9 | 38 | The rules are as follows (POSIX.2, 3.13). |
1ce284ec | 39 | .SS "Wildcard Matching" |
fea681da | 40 | A string is a wildcard pattern if it contains one of the |
333a424b | 41 | characters \(aq?\(aq, \(aq*\(aq or \(aq[\(aq. |
c13182ef | 42 | Globbing is the operation |
fea681da | 43 | that expands a wildcard pattern into the list of pathnames |
c13182ef MK |
44 | matching the pattern. |
45 | Matching is defined by: | |
fea681da | 46 | |
333a424b | 47 | A \(aq?\(aq (not between brackets) matches any single character. |
fea681da | 48 | |
333a424b | 49 | A \(aq*\(aq (not between brackets) matches any string, |
fea681da | 50 | including the empty string. |
1ce284ec MK |
51 | .PP |
52 | .B "Character classes" | |
53 | .sp | |
333a424b MK |
54 | An expression "\fI[...]\fP" where the first character after the |
55 | leading \(aq[\(aq is not an \(aq!\(aq matches a single character, | |
fea681da MK |
56 | namely any of the characters enclosed by the brackets. |
57 | The string enclosed by the brackets cannot be empty; | |
333a424b | 58 | therefore \(aq]\(aq can be allowed between the brackets, provided |
c13182ef | 59 | that it is the first character. |
333a424b MK |
60 | (Thus, "\fI[][!]\fP" matches the |
61 | three characters \(aq[\(aq, \(aq]\(aq and \(aq!\(aq.) | |
1ce284ec MK |
62 | .PP |
63 | .B Ranges | |
64 | .sp | |
fea681da | 65 | There is one special convention: |
333a424b | 66 | two characters separated by \(aq\-\(aq denote a range. |
c45660d7 MK |
67 | (Thus, "\fI[A\-Fa\-f0\-9]\fP" |
68 | is equivalent to "\fI[ABCDEFabcdef0123456789]\fP".) | |
333a424b | 69 | One may include \(aq\-\(aq in its literal meaning by making it the |
fea681da | 70 | first or last character between the brackets. |
333a424b MK |
71 | (Thus, "\fI[]\-]\fP" matches just the two characters \(aq]\(aq and \(aq\-\(aq, |
72 | and "\fI[\-\-0]\fP" matches the | |
73 | three characters \(aq\-\(aq, \(aq.\(aq, \(aq0\(aq, since \(aq/\(aq | |
fea681da | 74 | cannot be matched.) |
1ce284ec MK |
75 | .PP |
76 | .B Complementation | |
77 | .sp | |
333a424b | 78 | An expression "\fI[!...]\fP" matches a single character, namely |
fea681da | 79 | any character that is not matched by the expression obtained |
333a424b MK |
80 | by removing the first \(aq!\(aq from it. |
81 | (Thus, "\fI[!]a\-]\fP" matches any | |
82 | single character except \(aq]\(aq, \(aqa\(aq and \(aq\-\(aq.) | |
fea681da | 83 | |
333a424b | 84 | One can remove the special meaning of \(aq?\(aq, \(aq*\(aq and \(aq[\(aq by |
fea681da MK |
85 | preceding them with a backslash, or, if this is part of |
86 | a shell command line, enclosing them in quotes. | |
87 | Between brackets these characters stand for themselves. | |
333a424b MK |
88 | Thus, "\fI[[?*\e]\fP" matches the |
89 | four characters \(aq[\(aq, \(aq?\(aq, \(aq*\(aq and \(aq\e\(aq. | |
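The matching rules above can be tried out with a short script. The following is a minimal sketch, not a POSIX implementation: it uses Python's fnmatch module as a stand-in matcher, and the helper name matches() and the CASES table are hypothetical. It assumes only what fnmatch documents, namely the ?, *, [...] and [!...] syntax. Unlike a shell, fnmatch gives no special treatment to '/' or to a leading '.', and it does not treat a backslash as an escape character, so the cases below stick to single filename components.

```python
import fnmatch


def matches(pattern, name):
    """Return True if name matches the wildcard pattern (case-sensitive)."""
    return fnmatch.fnmatchcase(name, pattern)


# (pattern, name, expected result), following the rules in the text
CASES = [
    ("?", "a", True),             # '?' matches exactly one character
    ("?", "ab", False),
    ("*", "", True),              # '*' also matches the empty string
    ("*.c", "glob.c", True),
    ("[][!]", "]", True),         # ']' is literal when it comes first
    ("[][!]", "a", False),
    ("[A-Fa-f0-9]", "c", True),   # ranges
    ("[A-Fa-f0-9]", "g", False),
    ("[]-]", "-", True),          # '-' is literal when it comes last
    ("[!]a-]", "b", True),        # complement of ']', 'a' and '-'
    ("[!]a-]", "a", False),
    ("[[?*\\]", "?", True),       # wildcards are literal between brackets
]

if __name__ == "__main__":
    for pattern, name, expected in CASES:
        result = matches(pattern, name)
        print(f"{pattern!r:16} {name!r:10} {result}  (expected {expected})")
```

Running the script prints one line per case, so any disagreement between the matcher and the rules is easy to spot.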
1ce284ec | 90 | .SS Pathnames |
fea681da | 91 | Globbing is applied to each of the components of a pathname |
c13182ef | 92 | separately. |
333a424b MK |
93 | A \(aq/\(aq in a pathname cannot be matched by a \(aq?\(aq or \(aq*\(aq |
94 | wildcard, or by a range like "\fI[.\-0]\fP". | |
c13182ef | 95 | A range cannot contain an |
333a424b | 96 | explicit \(aq/\(aq character; this would lead to a syntax error. |
fea681da | 97 | |
c45660d7 MK |
98 | If a filename starts with a \(aq.\(aq, |
99 | this character must be matched explicitly. | |
333a424b MK |
100 | (Thus, \fIrm\ *\fP will not remove .profile, and \fItar\ c\ *\fP will not |
101 | archive all your files; \fItar\ c\ .\fP is better.) | |
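The two pathname rules (wildcards do not cross a '/', and a leading '.' must be matched explicitly) can be observed with a small experiment. This is a sketch under stated assumptions: it uses Python's glob module rather than a shell or glob(3), and the helper names list_matches() and demo() and the file names are made up for illustration. Python's glob follows the same two rules for the patterns used here.

```python
import glob
import os
import tempfile


def list_matches(root, pattern):
    """Expand pattern inside root and return sorted paths relative to root."""
    # glob.escape() protects any wildcard characters in the directory name.
    hits = glob.glob(os.path.join(glob.escape(root), pattern))
    return sorted(os.path.relpath(hit, root) for hit in hits)


def demo():
    """Build a small scratch tree and expand a few patterns in it."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "sub"))
        for name in ("a.txt", ".profile", os.path.join("sub", "b.txt")):
            with open(os.path.join(root, name), "w"):
                pass
        return {
            "*": list_matches(root, "*"),              # skips .profile
            ".*": list_matches(root, ".*"),            # explicit leading dot
            "*.txt": list_matches(root, "*.txt"),      # does not enter sub/
            "*/*.txt": list_matches(root, "*/*.txt"),  # one pattern per component
        }


if __name__ == "__main__":
    for pattern, names in demo().items():
        print(pattern, "->", names)
```

In this scratch tree, "*" finds a.txt and sub but not .profile, and "*.txt" does not reach sub/b.txt; the nested file is found only when the '/' is written out in the pattern.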
1ce284ec | 102 | .SS "Empty Lists" |
333a424b | 103 | The nice and simple rule given above: "expand a wildcard pattern |
008f1ecc | 104 | into the list of matching pathnames" was the original UNIX |
c13182ef MK |
105 | definition. |
106 | It allowed one to have patterns that expand into | |
fea681da MK |
107 | an empty list, as in |
108 | .br | |
109 | .nf | |
7295b7ed | 110 | xv \-wait 0 *.gif *.jpg |
fea681da MK |
111 | .fi |
112 | where perhaps no *.gif files are present (and this is not | |
113 | an error). | |
114 | However, POSIX requires that a wildcard pattern is left | |
115 | unchanged when it is syntactically incorrect, or the list of | |
116 | matching pathnames is empty. | |
117 | With | |
118 | .I bash | |
d9bfdb9c | 119 | one can force the classical behavior with the command |
fea681da MK |
120 | .IR "shopt \-s nullglob" . |
121 | ||
c13182ef MK |
122 | (Similar problems occur elsewhere. |
123 | E.g., where old scripts have | |
fea681da MK |
124 | .br |
125 | .nf | |
26868e5b | 126 | rm \`find . \-name "*~"\` |
fea681da MK |
127 | .fi |
128 | new scripts require | |
129 | .br | |
130 | .nf | |
26868e5b | 131 | rm \-f nosuchfile \`find . \-name "*~"\` |
fea681da MK |
132 | .fi |
133 | to avoid error messages from | |
134 | .I rm | |
135 | called with an empty argument list.) | |
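The difference between the two rules for empty lists can be made concrete with a short sketch. It assumes Python's glob module, whose glob.glob() returns an empty list when nothing matches (the classical behavior); the functions expand_classic(), expand_posix() and demo() are hypothetical helpers written for this illustration, not part of any library.

```python
import glob
import os
import tempfile


def expand_classic(pattern):
    """Original UNIX rule: the result may be an empty list."""
    return sorted(glob.glob(pattern))


def expand_posix(pattern):
    """POSIX shell rule: a pattern with no matches is left unchanged."""
    matches = sorted(glob.glob(pattern))
    return matches if matches else [pattern]


def demo():
    """Expand *.gif and *.jpg in a directory that holds only one .jpg file."""
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "photo.jpg"), "w"):
            pass
        gif = os.path.join(glob.escape(root), "*.gif")
        jpg = os.path.join(glob.escape(root), "*.jpg")
        return {
            "classic_gif": expand_classic(gif),
            "posix_gif_unchanged": expand_posix(gif) == [gif],
            "jpg_names": [os.path.basename(p) for p in expand_posix(jpg)],
        }


if __name__ == "__main__":
    print(demo())
```

A caller that passes the result of expand_posix() to another program has to be prepared for a literal pattern such as "*.gif" to appear among the arguments, which is the situation the text describes.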
fea681da MK |
136 | .SH NOTES |
137 | .SS Regular expressions | |
138 | Note that wildcard patterns are not regular expressions, | |
c13182ef MK |
139 | although they are a bit similar. |
140 | First of all, they match | |
fea681da | 141 | filenames, rather than text, and secondly, the conventions |
333a424b | 142 | are not the same: for example, in a regular expression \(aq*\(aq means zero or |
fea681da MK |
143 | more copies of the preceding thing. |
144 | ||
145 | Now that regular expressions have bracket expressions where | |
333a424b MK |
146 | the negation is indicated by a \(aq^\(aq, POSIX has declared the |
147 | effect of a wildcard pattern "\fI[^...]\fP" to be undefined. | |
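The different meaning of the same characters in the two notations can be checked directly. The sketch below assumes Python's fnmatch and re modules as stand-ins for a wildcard matcher and a regular expression matcher; glob_match() and regex_match() are hypothetical helper names.

```python
import fnmatch
import re


def glob_match(pattern, name):
    """Match name against a wildcard pattern (whole name, case-sensitive)."""
    return fnmatch.fnmatchcase(name, pattern)


def regex_match(pattern, text):
    """Match the whole of text against a regular expression."""
    return re.fullmatch(pattern, text) is not None


if __name__ == "__main__":
    # As a wildcard, '*' stands for any string; as a regular expression,
    # it means zero or more copies of the preceding item.
    print(glob_match("a*", "abc"), regex_match("a*", "abc"))
    print(glob_match("a*", "aaa"), regex_match("a*", "aaa"))
    # As a wildcard, '?' stands for one character; as a regular expression,
    # it makes the preceding item optional.
    print(glob_match("a?c", "abc"), regex_match("a?c", "abc"))
    print(glob_match("a?c", "ac"), regex_match("a?c", "ac"))
```

The same pattern text therefore accepts different strings depending on which notation interprets it.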
fea681da MK |
148 | .SS Character classes and Internationalization |
149 | Of course, ranges were originally meant to be ASCII ranges, | |
333a424b | 150 | so that "\fI[\ \-%]\fP" stands for "\fI[\ !"#$%]\fP" and "\fI[a\-z]\fP" stands |
fea681da | 151 | for "any lowercase letter". |
008f1ecc | 152 | Some UNIX implementations generalized this so that a range X\-Y |
fea681da | 153 | stands for the set of characters with code between the codes for |
c13182ef MK |
154 | X and for Y. |
155 | However, this requires the user to know the | |
fea681da MK |
156 | character coding in use on the local system, and moreover, is |
157 | not convenient if the collating sequence for the local alphabet | |
158 | differs from the ordering of the character codes. | |
159 | Therefore, POSIX extended the bracket notation greatly, | |
160 | both for wildcard patterns and for regular expressions. | |
161 | In the above we saw three types of items that can occur in a bracket | |
162 | expression: namely (i) the negation, (ii) explicit single characters, | |
c13182ef MK |
163 | and (iii) ranges. |
164 | POSIX specifies ranges in an internationally | |
fea681da MK |
165 | more useful way and adds three more types: |
166 | ||
4d9b6984 | 167 | (iii) Ranges X\-Y comprise all characters that fall between X |
9fdfa163 | 168 | and Y (inclusive) in the current collating sequence as defined |
097585ed MK |
169 | by the |
170 | .B LC_COLLATE | |
171 | category in the current locale. | |
fea681da MK |
172 | |
173 | (iv) Named character classes, like | |
fea681da | 174 | .nf |
cf0a9ace | 175 | |
fea681da MK |
176 | [:alnum:] [:alpha:] [:blank:] [:cntrl:] |
177 | [:digit:] [:graph:] [:lower:] [:print:] | |
178 | [:punct:] [:space:] [:upper:] [:xdigit:] | |
cf0a9ace | 179 | |
fea681da | 180 | .fi |
333a424b MK |
181 | so that one can say "\fI[[:lower:]]\fP" instead of "\fI[a\-z]\fP", and have |
182 | things work in Denmark, too, where there are three letters past \(aqz\(aq | |
fea681da | 183 | in the alphabet. |
1274071a MK |
184 | These character classes are defined by the |
185 | .B LC_CTYPE | |
186 | category | |
fea681da MK |
187 | in the current locale. |
188 | ||
333a424b MK |
189 | (v) Collating symbols, like "\fI[.ch.]\fP" or "\fI[.a-acute.]\fP", |
190 | where the string between "\fI[.\fP" and "\fI.]\fP" is a collating | |
c13182ef MK |
191 | element defined for the current locale. |
192 | Note that this may | |
ae03dc66 | 193 | be a multicharacter element. |
fea681da | 194 | |
333a424b MK |
195 | (vi) Equivalence class expressions, like "\fI[=a=]\fP", |
196 | where the string between "\fI[=\fP" and "\fI=]\fP" is any collating | |
fea681da | 197 | element from its equivalence class, as defined for the |
c13182ef | 198 | current locale. |
333a424b | 199 | For example, "\fI[[=a=]]\fP" might be equivalent |
4568aa3c | 200 | to "\fI[a\('a\(`a\(:a\(^a]\fP", that is, |
333a424b | 201 | to "\fI[a[.a-acute.][.a-grave.][.a-umlaut.][.a-circumflex.]]\fP". |
fea681da MK |
202 | .SH "SEE ALSO" |
203 | .BR sh (1), | |
204 | .BR fnmatch (3), | |
205 | .BR glob (3), | |
206 | .BR locale (7), | |
207 | .BR regex (7) |