Commit | Line | Data |
---|---|---|
fea681da MK |
1 | .\" Copyright (c) 1998 Andries Brouwer |
2 | .\" | |
3 | .\" This is free documentation; you can redistribute it and/or | |
4 | .\" modify it under the terms of the GNU General Public License as | |
5 | .\" published by the Free Software Foundation; either version 2 of | |
6 | .\" the License, or (at your option) any later version. | |
7 | .\" | |
8 | .\" The GNU General Public License's references to "object code" | |
9 | .\" and "executables" are to be interpreted as the output of any | |
10 | .\" document formatting or typesetting system, including | |
11 | .\" intermediate and printed output. | |
12 | .\" | |
13 | .\" This manual is distributed in the hope that it will be useful, | |
14 | .\" but WITHOUT ANY WARRANTY; without even the implied warranty of | |
15 | .\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | |
16 | .\" GNU General Public License for more details. | |
17 | .\" | |
18 | .\" You should have received a copy of the GNU General Public | |
19 | .\" License along with this manual; if not, write to the Free | |
20 | .\" Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, | |
21 | .\" USA. | |
22 | .\" | |
23 | .\" 2003-08-24 fix for / by John Kristoff + joey | |
24 | .\" | |
69289f8a | 25 | .TH GLOB 7 2003-08-24 "Linux" "Linux Programmer's Manual" |
fea681da MK |
26 | .SH NAME |
27 | glob \- Globbing pathnames | |
28 | .SH DESCRIPTION | |
29 | Long ago, in Unix V6, there was a program | |
30 | .I /etc/glob | |
31 | that would expand wildcard patterns. | |
32 | Soon afterwards this became a shell built-in. | |
33 | ||
34 | These days there is also a library routine | |
35 | .BR glob (3) | |
36 | that will perform this function for a user program. | |
37 | ||
4dec66f9 | 38 | The rules are as follows (POSIX.2, 3.13). |
1ce284ec | 39 | .SS "Wildcard Matching" |
fea681da | 40 | A string is a wildcard pattern if it contains one of the |
c13182ef MK |
41 | characters `?', `*' or `['. |
42 | Globbing is the operation | |
fea681da | 43 | that expands a wildcard pattern into the list of pathnames |
c13182ef MK |
44 | matching the pattern. |
45 | Matching is defined by: | |
fea681da MK |
46 | |
47 | A `?' (not between brackets) matches any single character. | |
48 | ||
49 | A `*' (not between brackets) matches any string, | |
50 | including the empty string. | |
1ce284ec MK |
51 | .PP |
52 | .B "Character classes" | |
53 | .sp | |
fea681da MK |
54 | An expression `[...]' where the first character after the |
55 | leading `[' is not an `!' matches a single character, | |
56 | namely any of the characters enclosed by the brackets. | |
57 | The string enclosed by the brackets cannot be empty; | |
58 | therefore `]' can be allowed between the brackets, provided | |
c13182ef MK |
59 | that it is the first character. |
60 | (Thus, `[][!]' matches the three characters `[', `]' and `!'.) | |
1ce284ec MK |
61 | .PP |
62 | .B Ranges | |
63 | .sp | |
fea681da | 64 | There is one special convention: |
4d9b6984 MK |
65 | two characters separated by `\-' denote a range. |
66 | (Thus, `[A\-Fa\-f0\-9]' is equivalent to `[ABCDEFabcdef0123456789]'.) | |
67 | One may include `\-' in its literal meaning by making it the | |
fea681da | 68 | first or last character between the brackets. |
4d9b6984 MK |
69 | (Thus, `[]\-]' matches just the two characters `]' and `\-', |
70 | and `[\-\-0]' matches the three characters `\-', `.', `0', since `/' | |
fea681da | 71 | cannot be matched.) |
1ce284ec MK |
72 | .PP |
73 | .B Complementation | |
74 | .sp | |
fea681da MK |
75 | An expression `[!...]' matches a single character, namely |
76 | any character that is not matched by the expression obtained | |
77 | by removing the first `!' from it. | |
4d9b6984 | 78 | (Thus, `[!]a\-]' matches any single character except `]', `a' and `\-'.) |
fea681da MK |
79 | |
80 | One can remove the special meaning of `?', `*' and `[' by | |
81 | preceding them by a backslash, or, in case this is part of | |
82 | a shell command line, enclosing them in quotes. | |
83 | Between brackets these characters stand for themselves. | |
84 | Thus, `[[?*\e]' matches the four characters `[', `?', `*' and `\e'. | |
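The matching rules above can be tried out without touching any files, because the shell's case statement uses the same wildcard notation. The following is a minimal sketch, assuming a POSIX sh such as dash or bash with default options; the helper names matches and report are hypothetical and are not part of any standard tool.

```shell
#!/bin/sh
# matches PATTERN STRING: succeed if STRING matches the wildcard PATTERN.
# $1 is left unquoted in the case arm so that it is treated as a
# pattern rather than as a literal string.
matches() {
    case "$2" in
        $1) return 0 ;;
        *)  return 1 ;;
    esac
}

# report PATTERN STRING: print one line describing the result.
report() {
    if matches "$1" "$2"; then
        printf '%-14s matches        "%s"\n' "$1" "$2"
    else
        printf '%-14s does not match "%s"\n' "$1" "$2"
    fi
}

report '?'           'x'    # '?' stands for exactly one character
report '?'           'xy'   # two characters are one too many
report '*'           ''     # '*' also accepts the empty string
report '[][!]'       ']'    # a ']' placed first is an ordinary member
report '[]-]'        '-'    # a '-' placed last is an ordinary member
report '[A-Fa-f0-9]' 'c'    # three ranges in one bracket expression
report '[!]a-]'      'a'    # complemented set: 'a' is excluded
report '[[?*]'       '*'    # between brackets '*' stands for itself
```

Note that case matching treats the string as plain text, so the pathname-specific rules for `/' and a leading `.' do not apply here.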
1ce284ec | 85 | .SS Pathnames |
fea681da | 86 | Globbing is applied to each of the components of a pathname |
c13182ef MK |
87 | separately. |
88 | A `/' in a pathname cannot be matched by a `?' or `*' | |
89 | wildcard, or by a range like `[.\-0]'. | |
90 | A range cannot contain an | |
fea681da MK |
91 | explicit `/' character; this would lead to a syntax error. |
92 | ||
93 | If a filename starts with a `.', this character must be matched explicitly. | |
94 | (Thus, `rm *' will not remove .profile, and `tar c *' will not | |
95 | archive all your files; `tar c .' is better.) | |
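The two pathname rules can be observed in a scratch directory. This is a sketch only, assuming a POSIX sh with default globbing options (for example, bash without dotglob) and an available mktemp utility; the file names are made up for the illustration.

```shell
#!/bin/sh
# Work in a scratch directory so that no real files are touched.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

mkdir sub
touch a.txt b.txt .hidden sub/c.txt

# '*' does not match a leading '.', so .hidden is skipped.
plain=$(echo *)

# The leading '.' has to be matched explicitly.
dotted=$(echo .h*)

# '*' does not match '/', so each pathname component needs its own pattern.
nested=$(echo */*.txt)

printf 'echo *       -> %s\n' "$plain"
printf 'echo .h*     -> %s\n' "$dotted"
printf 'echo */*.txt -> %s\n' "$nested"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

The pattern `.h*' is used rather than `.*' because the latter may also match the `.' and `..' directory entries in some shells.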
1ce284ec | 96 | .SS "Empty Lists" |
fea681da MK |
97 | The nice and simple rule given above: `expand a wildcard pattern |
98 | into the list of matching pathnames' was the original Unix | |
c13182ef MK |
99 | definition. |
100 | It allowed one to have patterns that expand into | |
fea681da MK |
101 | an empty list, as in |
102 | .br | |
103 | .nf | |
7295b7ed | 104 | xv \-wait 0 *.gif *.jpg |
fea681da MK |
105 | .fi |
106 | where perhaps no *.gif files are present (and this is not | |
107 | an error). | |
108 | However, POSIX requires that a wildcard pattern is left | |
109 | unchanged when it is syntactically incorrect, or the list of | |
110 | matching pathnames is empty. | |
111 | With | |
112 | .I bash | |
d9bfdb9c | 113 | one can force the classical behavior by running |
fea681da MK |
114 | .IR "shopt \-s nullglob" . |
115 | ||
c13182ef MK |
116 | (Similar problems occur elsewhere. |
117 | E.g., where old scripts have | |
fea681da MK |
118 | .br |
119 | .nf | |
7295b7ed | 120 | rm `find . \-name "*~"` |
fea681da MK |
121 | .fi |
122 | new scripts require | |
123 | .br | |
124 | .nf | |
7295b7ed | 125 | rm \-f nosuchfile `find . \-name "*~"` |
fea681da MK |
126 | .fi |
127 | to avoid error messages from | |
128 | .I rm | |
129 | called with an empty argument list.) | |
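The POSIX behavior described in this section, and one portable way of coping with it, can be sketched as follows. The example assumes a POSIX sh with default options and an available mktemp utility; the file name photo.jpg and the variable names are hypothetical.

```shell
#!/bin/sh
# Work in a scratch directory that holds one .jpg file and no .gif files.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
touch photo.jpg

# The unmatched pattern *.gif is passed on unchanged instead of vanishing,
# so the argument list has two words, not one.
set -- *.gif *.jpg
first=$1
second=$2
count=$#

# Portable guard: keep only the words that name an existing file.
kept=
for f in *.gif *.jpg; do
    [ -e "$f" ] || continue
    kept="$kept${kept:+ }$f"
done

printf 'argument count: %s\n' "$count"
printf 'first argument: %s\n' "$first"
printf 'after filtering: %s\n' "$kept"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

Testing each word with [ \-e ] costs a loop, but it works in any POSIX shell and does not depend on shell-specific options.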
fea681da MK |
130 | .SH NOTES |
131 | .SS Regular expressions | |
132 | Note that wildcard patterns are not regular expressions, | |
c13182ef MK |
133 | although they are a bit similar. |
134 | First of all, they match | |
fea681da | 135 | filenames, rather than text, and secondly, the conventions |
75b94dc3 | 136 | are not the same: for example, in a regular expression `*' means zero or |
fea681da MK |
137 | more copies of the preceding thing. |
138 | ||
139 | Now that regular expressions have bracket expressions where | |
140 | the negation is indicated by a `^', POSIX has declared the | |
141 | effect of a wildcard pattern `[^...]' to be undefined. | |
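The different meaning of `*' is easy to demonstrate by feeding the same pattern text to a wildcard matcher and to a regular expression matcher. This is a minimal sketch, assuming a POSIX sh and a POSIX grep; the helper names glob_match, regex_match and show are hypothetical.

```shell
#!/bin/sh
# glob_match PATTERN STRING: whole-string match under wildcard rules.
glob_match() {
    case "$2" in
        $1) return 0 ;;
        *)  return 1 ;;
    esac
}

# regex_match PATTERN STRING: whole-line match of an extended regular
# expression (grep -x requires the match to cover the entire line).
regex_match() {
    printf '%s\n' "$2" | grep -E -x -q -e "$1"
}

# show PATTERN STRING: print the verdict of both matchers.
show() {
    g=no; r=no
    glob_match  "$1" "$2" && g=yes
    regex_match "$1" "$2" && r=yes
    printf 'pattern %-4s string "%s": wildcard=%s regex=%s\n' \
        "$1" "$2" "$g" "$r"
}

show 'a*'  'abc'   # wildcard: 'a' then anything; regex: only a run of 'a'
show 'a*'  'aaa'   # both interpretations accept this string
show 'a*'  ''      # regex: zero copies of 'a'; wildcard: needs the 'a'
show 'a.*' 'abc'   # the regex spelling of the wildcard pattern 'a*'
```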
fea681da MK |
142 | .SS Character classes and Internationalization |
143 | Of course ranges were originally meant to be ASCII ranges, | |
4d9b6984 | 144 | so that `[\ \-%]' stands for `[\ !"#$%]' and `[a\-z]' stands |
fea681da | 145 | for "any lowercase letter". |
4d9b6984 | 146 | Some Unix implementations generalized this so that a range X\-Y |
fea681da | 147 | stands for the set of characters with code between the codes for |
c13182ef MK |
148 | X and for Y. |
149 | However, this requires the user to know the | |
fea681da MK |
150 | character coding in use on the local system, and moreover, is |
151 | not convenient if the collating sequence for the local alphabet | |
152 | differs from the ordering of the character codes. | |
153 | Therefore, POSIX extended the bracket notation greatly, | |
154 | both for wildcard patterns and for regular expressions. | |
155 | In the above we saw three types of items that can occur in a bracket | |
156 | expression: namely (i) the negation, (ii) explicit single characters, | |
c13182ef MK |
157 | and (iii) ranges. |
158 | POSIX specifies ranges in an internationally | |
fea681da MK |
159 | more useful way and adds three more types: |
160 | ||
4d9b6984 | 161 | (iii) Ranges X\-Y comprise all characters that fall between X |
9fdfa163 | 162 | and Y (inclusive) in the current collating sequence as defined |
fea681da MK |
163 | by the LC_COLLATE category in the current locale. |
164 | ||
165 | (iv) Named character classes, like | |
fea681da | 166 | .nf |
cf0a9ace | 167 | |
fea681da MK |
168 | [:alnum:] [:alpha:] [:blank:] [:cntrl:] |
169 | [:digit:] [:graph:] [:lower:] [:print:] | |
170 | [:punct:] [:space:] [:upper:] [:xdigit:] | |
cf0a9ace | 171 | |
fea681da | 172 | .fi |
4d9b6984 | 173 | so that one can say `[[:lower:]]' instead of `[a\-z]', and have |
fea681da MK |
174 | things work in Denmark, too, where there are three letters past `z' |
175 | in the alphabet. | |
1274071a MK |
176 | These character classes are defined by the |
177 | .B LC_CTYPE | |
178 | category | |
fea681da MK |
179 | in the current locale. |
180 | ||
181 | (v) Collating symbols, like `[.ch.]' or `[.a-acute.]', | |
182 | where the string between `[.' and `.]' is a collating | |
c13182ef MK |
183 | element defined for the current locale. |
184 | Note that this may | |
fea681da MK |
185 | be a multi-character element. |
186 | ||
187 | (vi) Equivalence class expressions, like `[=a=]', | |
188 | where the string between `[=' and `=]' is any collating | |
189 | element from its equivalence class, as defined for the | |
c13182ef MK |
190 | current locale. |
191 | For example, `[[=a=]]' might be equivalent | |
9044f1b8 MK |
192 | .\" FIXME the accented 'a' characters are not rendering properly |
193 | .\" mtk May 2007 | |
fea681da MK |
194 |