Commit | Line | Data |
---|---|---|
fea681da MK |
1 | .\" Copyright (c) 1998 Andries Brouwer |
2 | .\" | |
3 | .\" This is free documentation; you can redistribute it and/or | |
4 | .\" modify it under the terms of the GNU General Public License as | |
5 | .\" published by the Free Software Foundation; either version 2 of | |
6 | .\" the License, or (at your option) any later version. | |
7 | .\" | |
8 | .\" The GNU General Public License's references to "object code" | |
9 | .\" and "executables" are to be interpreted as the output of any | |
10 | .\" document formatting or typesetting system, including | |
11 | .\" intermediate and printed output. | |
12 | .\" | |
13 | .\" This manual is distributed in the hope that it will be useful, | |
14 | .\" but WITHOUT ANY WARRANTY; without even the implied warranty of | |
15 | .\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | |
16 | .\" GNU General Public License for more details. | |
17 | .\" | |
18 | .\" You should have received a copy of the GNU General Public | |
19 | .\" License along with this manual; if not, write to the Free | |
20 | .\" Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, | |
21 | .\" USA. | |
22 | .\" | |
23 | .\" 2003-08-24 fix for / by John Kristoff + joey | |
24 | .\" | |
25 | .TH GLOB 7 2003-08-24 "Unix" "Linux Programmer's Manual" | |
26 | .SH NAME | |
27 | glob \- Globbing pathnames | |
28 | .SH DESCRIPTION | |
29 | Long ago, in Unix V6, there was a program | |
30 | .I /etc/glob | |
31 | that would expand wildcard patterns. | |
32 | Soon afterwards this became a shell built-in. | |
33 | ||
34 | These days there is also a library routine | |
35 | .BR glob (3) | |
36 | that will perform this function for a user program. | |
37 | ||
4dec66f9 | 38 | The rules are as follows (POSIX.2, 3.13). |
fea681da MK |
39 | .SH "WILDCARD MATCHING" |
40 | A string is a wildcard pattern if it contains one of the | |
41 | characters `?', `*' or `['. Globbing is the operation | |
42 | that expands a wildcard pattern into the list of pathnames | |
43 | matching the pattern. Matching is defined by: | |
44 | ||
45 | A `?' (not between brackets) matches any single character. | |
46 | ||
47 | A `*' (not between brackets) matches any string, | |
48 | including the empty string. | |
49 | ||
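The two rules above can be tried out with Python's fnmatch module, used here only as a convenient stand-in for a shell. This is a sketch under that assumption: fnmatch matches plain strings rather than pathnames, so it lets '*' match '/' and a leading '.', and the names below are made up for illustration.

```python
from fnmatch import fnmatchcase

# fnmatchcase(name, pattern) tests one name against one wildcard pattern,
# without any case folding.
single = fnmatchcase("a.c", "?.c")      # `?' stands for exactly one character
too_long = fnmatchcase("ab.c", "?.c")   # two characters before ".c": no match
empty_ok = fnmatchcase(".c", "*.c")     # `*' may match the empty string
any_run = fnmatchcase("main.c", "*.c")  # ... or any longer string

print(single, too_long, empty_ok, any_run)
```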
50 | .SS "Character classes" | |
51 | An expression `[...]' where the first character after the | |
52 | leading `[' is not an `!' matches a single character, | |
53 | namely any of the characters enclosed by the brackets. | |
54 | The string enclosed by the brackets cannot be empty; | |
55 | therefore `]' can be allowed between the brackets, provided | |
56 | that it is the first character. (Thus, `[][!]' matches the | |
57 | three characters `[', `]' and `!'.) | |
58 | ||
59 | .SS Ranges | |
60 | There is one special convention: | |
4d9b6984 MK |
61 | two characters separated by `\-' denote a range. |
62 | (Thus, `[A\-Fa\-f0\-9]' is equivalent to `[ABCDEFabcdef0123456789]'.) | |
63 | One may include `\-' in its literal meaning by making it the | |
fea681da | 64 | first or last character between the brackets. |
4d9b6984 MK |
65 | (Thus, `[]\-]' matches just the two characters `]' and `\-', |
66 | and `[\-\-0]' matches the three characters `\-', `.', `0', since `/' | |
fea681da MK |
67 | cannot be matched.) |
68 | ||
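The bracket conventions described so far can be checked one character at a time. The following is a minimal sketch that again assumes Python's fnmatch module as a stand-in; the helper `matched` is hypothetical and exists only for this illustration.

```python
from fnmatch import fnmatchcase

def matched(pattern, candidates):
    """Return the characters of candidates matched by a one-character pattern."""
    return "".join(c for c in candidates if fnmatchcase(c, pattern))

# `]' placed first between the brackets is an ordinary member of the set.
set_members = matched("[][!]", "[]!a")

# A range: hexadecimal digits in either case.
hex_digits = matched("[A-Fa-f0-9]", "09afAFgG-")

# `]' placed first and `-' placed last both keep their literal meaning.
literals = matched("[]-]", "]-a")

print(set_members, hex_digits, literals)
```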
69 | .SS Complementation | |
70 | An expression `[!...]' matches a single character, namely | |
71 | any character that is not matched by the expression obtained | |
72 | by removing the first `!' from it. | |
4d9b6984 | 73 | (Thus, `[!]a\-]' matches any single character except `]', `a' and `\-'.) |
fea681da MK |
74 | |
75 | One can remove the special meaning of `?', `*' and `[' by | |
76 | preceding them by a backslash, or, in case this is part of | |
77 | a shell command line, enclosing them in quotes. | |
78 | Between brackets these characters stand for themselves. | |
79 | Thus, `[[?*\e]' matches the four characters `[', `?', `*' and `\e'. | |
80 | ||
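Complementation and quoting can be sketched the same way. Note the assumption: Python's fnmatch module, used here as a stand-in, does not implement backslash quoting, so only the bracket form of quoting is shown.

```python
from fnmatch import fnmatchcase

# Complement: any single character except `]', `a' and `-'.
rejected = [c for c in "]a-" if fnmatchcase(c, "[!]a-]")]
accepted = [c for c in "bz0" if fnmatchcase(c, "[!]a-]")]

# Between brackets `?', `*' and `[' stand for themselves, so the
# one-member class `[*]' matches a literal star and nothing else.
literal_star = fnmatchcase("a*b", "a[*]b")
not_wildcard = fnmatchcase("axb", "a[*]b")

print(rejected, accepted, literal_star, not_wildcard)
```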
81 | .SH PATHNAMES | |
82 | Globbing is applied on each of the components of a pathname | |
83 | separately. A `/' in a pathname cannot be matched by a `?' or `*' | |
4d9b6984 | 84 | wildcard, or by a range like `[.\-0]'. A range containing an |
fea681da MK |
85 | explicit `/' character is syntactically incorrect. | |
86 | ||
87 | If a filename starts with a `.', this character must be matched explicitly. | |
88 | (Thus, `rm *' will not remove .profile, and `tar c *' will not | |
89 | archive all your files; `tar c .' is better.) | |
90 | ||
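The pathname rules can be observed with Python's glob module, which, unlike fnmatch, works on real directory entries. This is a sketch only: the directory layout is invented for the example, and the helper `expand` is hypothetical.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as top:
    # Made-up layout: one visible file, one dot file, one subdirectory.
    os.mkdir(os.path.join(top, "src"))
    for name in ("notes.txt", ".profile", os.path.join("src", "main.c")):
        open(os.path.join(top, name), "w").close()

    def expand(pattern):
        """Expand pattern below top; return sorted pathnames relative to top."""
        hits = glob.glob(os.path.join(glob.escape(top), pattern))
        return sorted(os.path.relpath(h, top) for h in hits)

    star = expand("*")               # skips ".profile", stays in one component
    dot_star = expand(".*")          # the leading `.' is matched explicitly
    per_component = expand("*/*.c")  # each component is matched separately

print(star, dot_star, per_component)
```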
91 | .SH "EMPTY LISTS" | |
92 | The nice and simple rule given above: `expand a wildcard pattern | |
93 | into the list of matching pathnames' was the original Unix | |
94 | definition. It allowed one to have patterns that expand into | |
95 | an empty list, as in | |
96 | .br | |
97 | .nf | |
2bc2f479 | 98 | xv \-wait 0 *.gif *.jpg |
fea681da MK |
99 | .fi |
100 | where perhaps no *.gif files are present (and this is not | |
101 | an error). | |
102 | However, POSIX requires that a wildcard pattern is left | |
103 | unchanged when it is syntactically incorrect, or the list of | |
104 | matching pathnames is empty. | |
105 | With | |
106 | .I bash | |
107 | one can force the classical behaviour with the command | |
108 | .IR "shopt \-s nullglob" . | |
109 | ||
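The difference between the two rules is small enough to sketch in a few lines. The functions `expand_classical` and `expand_posix` are hypothetical names for this illustration; Python's glob module happens to follow the classical rule, and the POSIX rule is emulated on top of it.

```python
import glob
import os
import tempfile

def expand_classical(pattern):
    """Original Unix rule: a pattern without matches expands to nothing."""
    return sorted(glob.glob(pattern))

def expand_posix(pattern):
    """POSIX rule: a pattern without matches is passed through unchanged."""
    return sorted(glob.glob(pattern)) or [pattern]

with tempfile.TemporaryDirectory() as empty_dir:
    # The directory is empty, so no *.gif file can match.
    pattern = os.path.join(glob.escape(empty_dir), "*.gif")
    classical = expand_classical(pattern)
    posix = expand_posix(pattern)

print(classical, posix == [pattern])
```

A program that receives the POSIX result has to be prepared to see the unexpanded pattern as if it were a filename.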
110 | (Similar problems occur elsewhere. E.g., where old scripts have | |
111 | .br | |
112 | .nf | |
2bc2f479 | 113 | rm `find . \-name "*~"` |
fea681da MK |
114 | .fi |
115 | new scripts require | |
116 | .br | |
117 | .nf | |
2bc2f479 | 118 | rm \-f nosuchfile `find . \-name "*~"` |
fea681da MK |
119 | .fi |
120 | to avoid error messages from | |
121 | .I rm | |
122 | called with an empty argument list.) | |
123 | ||
124 | .SH NOTES | |
125 | .SS Regular expressions | |
126 | Note that wildcard patterns are not regular expressions, | |
127 | although they are a bit similar. First of all, they match | |
128 | filenames, rather than text, and secondly, the conventions | |
129 | are not the same: e.g., in a regular expression `*' means zero or | |
130 | more copies of the preceding thing. | |
131 | ||
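The different meaning of '*' is easy to demonstrate by giving the same two-character string to a wildcard matcher and to a regular expression matcher. A minimal sketch, assuming Python's fnmatch and re modules as the two matchers:

```python
import re
from fnmatch import fnmatchcase

# As a wildcard pattern, `a*' means: an `a' followed by any string.
glob_abc = fnmatchcase("abc", "a*")
glob_aaa = fnmatchcase("aaa", "a*")

# As a regular expression, `a*' means: zero or more copies of `a'.
regex_abc = re.fullmatch("a*", "abc") is not None
regex_aaa = re.fullmatch("a*", "aaa") is not None

print(glob_abc, glob_aaa, regex_abc, regex_aaa)
```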
132 | Now that regular expressions have bracket expressions where | |
133 | the negation is indicated by a `^', POSIX has declared the | |
134 | effect of a wildcard pattern `[^...]' to be undefined. | |
135 | ||
136 | .SS Character classes and Internationalization | |
137 | Of course ranges were originally meant to be ASCII ranges, | |
4d9b6984 | 138 | so that `[\ \-%]' stands for `[\ !"#$%]' and `[a\-z]' stands |
fea681da | 139 | for "any lowercase letter". |
4d9b6984 | 140 | Some Unix implementations generalized this so that a range X\-Y |
fea681da MK |
141 | stands for the set of characters with code between the codes for |
142 | X and for Y. However, this requires the user to know the | |
143 | character coding in use on the local system, and moreover, is | |
144 | not convenient if the collating sequence for the local alphabet | |
145 | differs from the ordering of the character codes. | |
146 | Therefore, POSIX extended the bracket notation greatly, | |
147 | both for wildcard patterns and for regular expressions. | |
148 | In the above we saw three types of items that can occur in a bracket | |
149 | expression: namely (i) the negation, (ii) explicit single characters, | |
150 | and (iii) ranges. POSIX specifies ranges in an internationally | |
151 | more useful way and adds three more types: | |
152 | ||
4d9b6984 | 153 | (iii) Ranges X\-Y comprise all characters that fall between X |
9fdfa163 | 154 | and Y (inclusive) in the current collating sequence as defined |
fea681da MK |
155 | by the LC_COLLATE category in the current locale. |
156 | ||
157 | (iv) Named character classes, like | |
158 | .br | |
159 | .nf | |
160 | [:alnum:] [:alpha:] [:blank:] [:cntrl:] | |
161 | [:digit:] [:graph:] [:lower:] [:print:] | |
162 | [:punct:] [:space:] [:upper:] [:xdigit:] | |
163 | .fi | |
4d9b6984 | 164 | so that one can say `[[:lower:]]' instead of `[a\-z]', and have |
fea681da MK |
165 | things work in Denmark, too, where there are three letters past `z' |
166 | in the alphabet. | |
167 | These character classes are defined by the LC_CTYPE category | |
168 | in the current locale. | |
169 | ||
170 | (v) Collating symbols, like `[.ch.]' or `[.a-acute.]', | |
171 | where the string between `[.' and `.]' is a collating | |
172 | element defined for the current locale. Note that this may | |
173 | be a multi-character element. | |
174 | ||
175 | (vi) Equivalence class expressions, like `[=a=]', | |
176 | where the string between `[=' and `=]' is any collating | |
177 | element from its equivalence class, as defined for the | |
178 | current locale. For example, `[[=a=]]' might be equivalent | |
179 |