1 | |
2 | NAME | |
3 | parse_date - parses a date string into a timespec struct. | |
4 | ||
5 | SYNOPSIS | |
6 | #include "timeutils.h" | |
7 | ||
8 | int parse_date(struct timespec *result, char const *p, |
9 | struct timespec const *now) | |
10 | ||
11 | LDADD libcommon.la | |
12 | ||
13 | DESCRIPTION | |
14 | Parse a date/time string, storing the resulting time value into *result. | |
15 | The string itself is pointed to by p. Return 1 if successful. | |
16 | p can be an incomplete or relative time specification; if so, use | |
17 | *now as the basis for the returned time. | |
18 | ||
19 | |
20 | This function is based upon gnulib's parse-datetime.y-dd7a871. | |
21 | ||
22 | Below is a plain text version of the gnulib parse-datetime.texi-dd7a871 manual | |
23 | describing the input strings that are recognized. | |
24 | ||
25 | Any future modifications to the util-linux parser that affect input strings | |
26 | should be noted below. | |
27 | ||
28 | ||
29 | 1 Date input formats | |
30 | ******************** | |
31 | ||
32 | First, a quote: | |
33 | ||
34 | Our units of temporal measurement, from seconds on up to months, | |
35 | are so complicated, asymmetrical and disjunctive so as to make | |
36 | coherent mental reckoning in time all but impossible. Indeed, had | |
37 | some tyrannical god contrived to enslave our minds to time, to | |
38 | make it all but impossible for us to escape subjection to sodden | |
39 | routines and unpleasant surprises, he could hardly have done | |
40 | better than handing down our present system. It is like a set of | |
41 | trapezoidal building blocks, with no vertical or horizontal | |
42 | surfaces, like a language in which the simplest thought demands | |
43 | ornate constructions, useless particles and lengthy | |
44 | circumlocutions. Unlike the more successful patterns of language | |
45 | and science, which enable us to face experience boldly or at least | |
46 | level-headedly, our system of temporal calculation silently and | |
47 | persistently encourages our terror of time. | |
48 | ||
49 | ... It is as though architects had to measure length in feet, | |
50 | width in meters and height in ells; as though basic instruction | |
51 | manuals demanded a knowledge of five different languages. It is | |
52 | no wonder then that we often look into our own immediate past or | |
53 | future, last Tuesday or a week from Sunday, with feelings of | |
54 | helpless confusion. ... | |
55 | ||
56 | --Robert Grudin, `Time and the Art of Living'. | |
57 | ||
58 | This section describes the textual date representations that GNU | |
59 | programs accept. These are the strings you, as a user, can supply as | |
60 | arguments to the various programs. The C interface (via the | |
61 | `parse_datetime' function) is not described here. | |
62 | ||
63 | 1.1 General date syntax | |
64 | ======================= | |
65 | ||
66 | A "date" is a string, possibly empty, containing many items separated | |
67 | by whitespace. The whitespace may be omitted when no ambiguity arises. | |
68 | The empty string means the beginning of today (i.e., midnight). Order | |
69 | of the items is immaterial. A date string may contain many flavors of | |
70 | items: | |
71 | ||
72 | * calendar date items | |
73 | ||
74 | * time of day items | |
75 | ||
76 | * time zone items | |
77 | ||
78 | * combined date and time of day items | |
79 | ||
80 | * day of the week items | |
81 | ||
82 | * relative items | |
83 | ||
84 | * pure numbers. | |
85 | ||
86 | We describe each of these item types in turn, below. | |
87 | ||
88 | A few ordinal numbers may be written out in words in some contexts. | |
89 | This is most useful for specifying day of the week items or relative | |
90 | items (see below). Among the most commonly used ordinal numbers, the | |
91 | word `last' stands for -1, `this' stands for 0, and `first' and `next' | |
92 | both stand for 1. Because the word `second' stands for the unit of | |
93 | time there is no way to write the ordinal number 2, but for convenience | |
94 | `third' stands for 3, `fourth' for 4, `fifth' for 5, `sixth' for 6, | |
95 | `seventh' for 7, `eighth' for 8, `ninth' for 9, `tenth' for 10, | |
96 | `eleventh' for 11 and `twelfth' for 12. | |
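
The ordinal words listed above can be summarized as a lookup table. The following Python fragment is only an illustrative sketch, not code from the parser; the names `ORDINAL_WORDS` and `ordinal_value` are invented for this example.

```python
# Ordinal words and their values, as listed in the text above.
# `second' is deliberately absent: it names the unit of time instead.
ORDINAL_WORDS = {
    "last": -1, "this": 0, "first": 1, "next": 1,
    "third": 3, "fourth": 4, "fifth": 5, "sixth": 6,
    "seventh": 7, "eighth": 8, "ninth": 9, "tenth": 10,
    "eleventh": 11, "twelfth": 12,
}


def ordinal_value(word):
    """Return the value of an ordinal word (case ignored), or None."""
    return ORDINAL_WORDS.get(word.lower())
```

Looking up `second' returns None here, which mirrors the statement that the ordinal number 2 cannot be written as a word.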
97 | ||
98 | When a month is written this way, it is still considered to be | |
99 | written numerically, instead of being "spelled in full"; this changes | |
100 | the allowed strings. | |
101 | ||
102 | In the current implementation, only English is supported for words | |
103 | and abbreviations like `AM', `DST', `EST', `first', `January', | |
104 | `Sunday', `tomorrow', and `year'. | |
105 | ||
106 | The output of the `date' command is not always acceptable as a date | |
107 | string, not only because of the language problem, but also because | |
108 | there is no standard meaning for time zone items like `IST'. When using | |
109 | `date' to generate a date string intended to be parsed later, specify a | |
110 | date format that is independent of language and that does not use time | |
111 | zone items other than `UTC' and `Z'. Here are some ways to do this: | |
112 | ||
113 | $ LC_ALL=C TZ=UTC0 date | |
114 | Mon Mar 1 00:21:42 UTC 2004 | |
115 | $ TZ=UTC0 date +'%Y-%m-%d %H:%M:%SZ' | |
116 | 2004-03-01 00:21:42Z | |
117 | $ date --rfc-3339=ns # --rfc-3339 is a GNU extension. | |
118 | 2004-02-29 16:21:42.692722128-08:00 | |
119 | $ date --rfc-2822 # a GNU extension | |
120 | Sun, 29 Feb 2004 16:21:42 -0800 | |
121 | $ date +'%Y-%m-%d %H:%M:%S %z' # %z is a GNU extension. | |
122 | 2004-02-29 16:21:42 -0800 | |
123 | $ date +'@%s.%N' # %s and %N are GNU extensions. | |
124 | @1078100502.692722128 | |
125 | ||
126 | Alphabetic case is completely ignored in dates. Comments may be | |
127 | introduced between round parentheses, as long as included parentheses | |
128 | are properly nested. Hyphens not followed by a digit are currently | |
129 | ignored. Leading zeros on numbers are ignored. | |
130 | ||
131 | Invalid dates like `2005-02-29' or times like `24:00' are rejected. | |
132 | In the typical case of a host that does not support leap seconds, a | |
133 | time like `23:59:60' is rejected even if it corresponds to a valid leap | |
134 | second. | |
135 | ||
136 | 1.2 Calendar date items | |
137 | ======================= | |
138 | ||
139 | A "calendar date item" specifies a day of the year. It is specified | |
140 | differently, depending on whether the month is specified numerically or | |
141 | literally. All these strings specify the same calendar date: | |
142 | ||
143 | 1972-09-24 # ISO 8601. | |
144 | 72-9-24 # Assume 19xx for 69 through 99, | |
145 | # 20xx for 00 through 68. | |
146 | 72-09-24 # Leading zeros are ignored. | |
147 | 9/24/72 # Common U.S. writing. | |
148 | 24 September 1972 | |
149 | 24 Sept 72 # September has a special abbreviation. | |
150 | 24 Sep 72 # Three-letter abbreviations always allowed. | |
151 | Sep 24, 1972 | |
152 | 24-sep-72 | |
153 | 24sep72 | |
154 | ||
155 | The year can also be omitted. In this case, the last specified year | |
156 | is used, or the current year if none. For example: | |
157 | ||
158 | 9/24 | |
159 | sep 24 | |
160 | ||
161 | Here are the rules. | |
162 | ||
163 | For numeric months, the ISO 8601 format `YEAR-MONTH-DAY' is allowed, | |
164 | where YEAR is any positive number, MONTH is a number between 01 and 12, | |
165 | and DAY is a number between 01 and 31. A leading zero must be present | |
166 | if a number is less than ten. If YEAR is 68 or smaller, then 2000 is | |
167 | added to it; otherwise, if YEAR is less than 100, then 1900 is added to | |
168 | it. The construct `MONTH/DAY/YEAR', popular in the United States, is | |
169 | accepted, as is `MONTH/DAY', which omits the year. | |
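
The two-digit year rule above can be written as a short function. This is a sketch of the rule as stated in the text, with the hypothetical name `expand_year`; the real parser may apply further conditions (for example, on how many digits were written) that are not modeled here.

```python
def expand_year(year):
    """Apply the century rule described above to a parsed YEAR number."""
    if year <= 68:
        return year + 2000   # 00 through 68 -> 20xx
    if year < 100:
        return year + 1900   # 69 through 99 -> 19xx
    return year              # larger values are taken literally
```

So `72-9-24' and `1972-09-24' name the same year, while a year written as `68' lands in 2068.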
170 | ||
171 | Literal months may be spelled out in full: `January', `February', | |
172 | `March', `April', `May', `June', `July', `August', `September', | |
173 | `October', `November' or `December'. Literal months may be abbreviated | |
174 | to their first three letters, possibly followed by an abbreviating dot. | |
175 | It is also permitted to write `Sept' instead of `September'. | |
176 | ||
177 | When months are written literally, the calendar date may be given as | |
178 | any of the following: | |
179 | ||
180 | DAY MONTH YEAR | |
181 | DAY MONTH | |
182 | MONTH DAY YEAR | |
183 | DAY-MONTH-YEAR | |
184 | ||
185 | Or, omitting the year: | |
186 | ||
187 | MONTH DAY | |
188 | ||
189 | 1.3 Time of day items | |
190 | ===================== | |
191 | ||
192 | A "time of day item" in date strings specifies the time on a given day. | |
193 | Here are some examples, all of which represent the same time: | |
194 | ||
195 | 20:02:00.000000 | |
196 | 20:02 | |
197 | 8:02pm | |
198 | 20:02-0500 # In EST (U.S. Eastern Standard Time). | |
199 | ||
200 | More generally, the time of day may be given as | |
201 | `HOUR:MINUTE:SECOND', where HOUR is a number between 0 and 23, MINUTE | |
202 | is a number between 0 and 59, and SECOND is a number between 0 and 59 | |
203 | possibly followed by `.' or `,' and a fraction containing one or more | |
204 | digits. Alternatively, `:SECOND' can be omitted, in which case it is | |
205 | taken to be zero. On the rare hosts that support leap seconds, SECOND | |
206 | may be 60. | |
207 | ||
208 | If the time is followed by `am' or `pm' (or `a.m.' or `p.m.'), HOUR | |
209 | is restricted to run from 1 to 12, and `:MINUTE' may be omitted (taken | |
210 | to be zero). `am' indicates the first half of the day, `pm' indicates | |
211 | the second half of the day. In this notation, 12 is the predecessor of | |
212 | 1: midnight is `12am' while noon is `12pm'. (This is the zero-oriented | |
213 | interpretation of `12am' and `12pm', as opposed to the old tradition | |
214 | derived from Latin which uses `12m' for noon and `12pm' for midnight.) | |
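
The `am'/`pm' convention above, in which 12 is the predecessor of 1, can be sketched as follows. The function name `to_24_hour` is an assumption made for this example and is not part of the parser.

```python
def to_24_hour(hour, meridiem):
    """Convert a 12-hour clock HOUR plus am/pm marker to a 0..23 hour."""
    marker = meridiem.lower().replace(".", "")   # accept `a.m.' and `p.m.'
    if marker not in ("am", "pm"):
        raise ValueError("expected am or pm: %r" % meridiem)
    if not 1 <= hour <= 12:
        raise ValueError("HOUR must run from 1 to 12 with am/pm")
    hour %= 12                                   # 12 behaves like 0
    return hour + 12 if marker == "pm" else hour
```

With this mapping `12am' is hour 0 (midnight) and `12pm' is hour 12 (noon).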
215 | ||
216 | The time may alternatively be followed by a time zone correction, | |
217 | expressed as `SHHMM', where S is `+' or `-', HH is a number of zone | |
218 | hours and MM is a number of zone minutes. The zone minutes term, MM, | |
219 | may be omitted, in which case the one- or two-digit correction is | |
220 | interpreted as a number of hours. You can also separate HH from MM | |
221 | with a colon. When a time zone correction is given this way, it forces | |
222 | interpretation of the time relative to Coordinated Universal Time | |
223 | (UTC), overriding any previous specification for the time zone or the | |
224 | local time zone. For example, `+0530' and `+05:30' both stand for the | |
225 | time zone 5.5 hours ahead of UTC (e.g., India). This is the best way to | |
226 | specify a time zone correction by fractional parts of an hour. The | |
227 | maximum zone correction is 24 hours. | |
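
As an illustration of the `SHHMM' syntax, the sketch below turns a correction into a signed number of minutes ahead of UTC. It is an assumption-laden approximation: the name `zone_correction_minutes` and the regular expression are invented here, and the real grammar may accept or reject some inputs differently.

```python
import re

# Sign, one or two hour digits, then optional minutes with optional colon.
_CORRECTION = re.compile(r"([+-])(\d{1,2})(?::?(\d{2}))?")


def zone_correction_minutes(text):
    """Return the correction in minutes; positive means ahead of UTC."""
    match = _CORRECTION.fullmatch(text)
    if match is None:
        raise ValueError("not a time zone correction: %r" % text)
    sign, hours, minutes = match.groups()
    total = int(hours) * 60 + int(minutes or 0)   # omitted MM counts as 0
    if total > 24 * 60:
        raise ValueError("zone correction exceeds 24 hours")
    return total if sign == "+" else -total
```

Both `+0530' and `+05:30' yield 330 minutes, and `-5' is read as five hours behind UTC.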
228 | ||
229 | Either `am'/`pm' or a time zone correction may be specified, but not | |
230 | both. | |
231 | ||
232 | 1.4 Time zone items | |
233 | =================== | |
234 | ||
235 | A "time zone item" specifies an international time zone, indicated by a | |
236 | small set of letters, e.g., `UTC' or `Z' for Coordinated Universal | |
237 | Time. Any included periods are ignored. By following a | |
238 | non-daylight-saving time zone by the string `DST' in a separate word | |
239 | (that is, separated by some white space), the corresponding daylight | |
240 | saving time zone may be specified. Alternatively, a | |
241 | non-daylight-saving time zone can be followed by a time zone | |
242 | correction, to add the two values. This is normally done only for | |
243 | `UTC'; for example, `UTC+05:30' is equivalent to `+05:30'. | |
244 | ||
245 | Time zone items other than `UTC' and `Z' are obsolescent and are not | |
246 | recommended, because they are ambiguous; for example, `EST' has a | |
247 | different meaning in Australia than in the United States. Instead, | |
248 | it's better to use unambiguous numeric time zone corrections like | |
249 | `-0500', as described in the previous section. | |
250 | ||
251 | If neither a time zone item nor a time zone correction is supplied, | |
252 | timestamps are interpreted using the rules of the default time zone | |
253 | (*note Specifying time zone rules::). | |
254 | ||
255 | 1.5 Combined date and time of day items | |
256 | ======================================= | |
257 | ||
258 | The ISO 8601 date and time of day extended format consists of an ISO | |
259 | 8601 date, a `T' character separator, and an ISO 8601 time of day. | |
260 | This format is also recognized if the `T' is replaced by a space. | |
261 | ||
262 | In this format, the time of day should use 24-hour notation. | |
263 | Fractional seconds are allowed, with either comma or period preceding | |
264 | the fraction. ISO 8601 fractional minutes and hours are not supported. | |
265 | Typically, hosts support nanosecond timestamp resolution; excess | |
266 | precision is silently discarded. | |
267 | ||
268 | Here are some examples: | |
269 | ||
270 | 2012-09-24T20:02:00.052-05:00 | |
271 | 2012-12-31T23:59:59,999999999+11:00 | |
272 | 1970-01-01 00:00Z | |
273 | ||
274 | 1.6 Day of week items | |
275 | ===================== | |
276 | ||
277 | The explicit mention of a day of the week will forward the date (only | |
278 | if necessary) to reach that day of the week in the future. | |
279 | ||
280 | Days of the week may be spelled out in full: `Sunday', `Monday', | |
281 | `Tuesday', `Wednesday', `Thursday', `Friday' or `Saturday'. Days may | |
282 | be abbreviated to their first three letters, optionally followed by a | |
283 | period. The special abbreviations `Tues' for `Tuesday', `Wednes' for | |
284 | `Wednesday' and `Thur' or `Thurs' for `Thursday' are also allowed. | |
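
A minimal sketch of the forwarding rule for a bare day of the week, using the names and abbreviations listed above. The helpers `weekday_index` and `forward_to_weekday` are hypothetical, and the sketch does not cover numbers or `last'/`next' before the day name.

```python
from datetime import date, timedelta

# Index positions match date.weekday(): Monday is 0, Sunday is 6.
_WEEKDAYS = ("monday", "tuesday", "wednesday", "thursday",
             "friday", "saturday", "sunday")
_SPECIAL = {"tues": 1, "wednes": 2, "thur": 3, "thurs": 3}


def weekday_index(name):
    """Map a day name or abbreviation (case ignored) to 0..6."""
    key = name.lower()
    if key.endswith("."):
        key = key[:-1]            # a trailing period is tolerated
    if key in _SPECIAL:
        return _SPECIAL[key]
    for index, full in enumerate(_WEEKDAYS):
        if key == full or key == full[:3]:
            return index
    raise ValueError("unknown day of the week: %r" % name)


def forward_to_weekday(start, name):
    """Move START forward, only if necessary, to the named weekday."""
    delta = (weekday_index(name) - start.weekday()) % 7
    return start + timedelta(days=delta)
```

Starting from Monday 2004-03-01, `Monday' leaves the date unchanged, while `Thurs' moves it to 2004-03-04.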
285 | ||
286 | A number may precede a day of the week item to move forward | |
287 | supplementary weeks. It is best used in expressions like `third | |
288 | monday'. In this context, `last DAY' or `next DAY' is also acceptable; | |
289 | they move one week before or after the day that DAY by itself would | |
290 | represent. | |
291 | ||
292 | A comma following a day of the week item is ignored. | |
293 | ||
294 | 1.7 Relative items in date strings | |
295 | ================================== | |
296 | ||
297 | "Relative items" adjust a date (or the current date if none) forward or | |
298 | backward. The effects of relative items accumulate. Here are some | |
299 | examples: | |
300 | ||
301 | 1 year | |
302 | 1 year ago | |
303 | 3 years | |
304 | 2 days | |
305 | ||
306 | The unit of time displacement may be selected by the string `year' | |
307 | or `month' for moving by whole years or months. These are fuzzy units, | |
308 | as years and months are not all of equal duration. More precise units | |
309 | are `fortnight' which is worth 14 days, `week' worth 7 days, `day' | |
310 | worth 24 hours, `hour' worth 60 minutes, `minute' or `min' worth 60 | |
311 | seconds, and `second' or `sec' worth one second. An `s' suffix on | |
312 | these units is accepted and ignored. | |
313 | ||
314 | The unit of time may be preceded by a multiplier, given as an | |
315 | optionally signed number. Unsigned numbers are taken as positively | |
316 | signed. No number at all implies 1 for a multiplier. Following a | |
317 | relative item by the string `ago' is equivalent to preceding the unit | |
318 | by a multiplier with value -1. | |
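
For the precise units, the multiplier and `ago' rules above amount to simple arithmetic. The sketch below handles a single relative item such as `2 days' or `1 week ago'; the name `relative_seconds` is invented for the example, and the fuzzy units `year' and `month' are left out because they have no fixed length.

```python
_UNIT_SECONDS = {
    "fortnight": 14 * 86400, "week": 7 * 86400, "day": 86400,
    "hour": 3600, "minute": 60, "min": 60, "second": 1, "sec": 1,
}


def relative_seconds(item):
    """Return the displacement in seconds of one relative item."""
    words = item.lower().split()
    sign = 1
    if words and words[-1] == "ago":
        sign = -1                  # `ago' acts as a multiplier of -1
        words.pop()
    if len(words) == 2:
        count, unit = int(words[0]), words[1]   # int() accepts +N and -N
    elif len(words) == 1:
        count, unit = 1, words[0]               # no number implies 1
    else:
        raise ValueError("not a simple relative item: %r" % item)
    if unit not in _UNIT_SECONDS and unit.endswith("s"):
        unit = unit[:-1]           # the `s' suffix is accepted and ignored
    if unit not in _UNIT_SECONDS:
        raise ValueError("unknown or fuzzy unit: %r" % unit)
    return sign * count * _UNIT_SECONDS[unit]
```

Because the effects of relative items accumulate, the displacement of a whole date string would be the sum of such values over its relative items.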
319 | ||
320 | The string `tomorrow' is worth one day in the future (equivalent to | |
321 | `day'), the string `yesterday' is worth one day in the past (equivalent | |
322 | to `day ago'). | |
323 | ||
324 | The strings `now' and `today' are relative items corresponding to a | |
325 | zero-valued time displacement; they work because a zero-valued time | |
326 | displacement represents the current time when not otherwise changed | |
327 | by previous items. They may be used to stress other items, as in | |
328 | `12:00 today'. The string `this' also has the meaning | |
329 | of a zero-valued time displacement, but is preferred in date strings | |
330 | like `this thursday'. | |
331 | ||
332 | When a relative item causes the resulting date to cross a boundary | |
333 | where the clocks were adjusted, typically for daylight saving time, the | |
334 | resulting date and time are adjusted accordingly. | |
335 | ||
336 | The fuzz in units can cause problems with relative items. For | |
337 | example, `2003-07-31 -1 month' might evaluate to 2003-07-01, because | |
338 | 2003-06-31 is an invalid date. To determine the previous month more | |
339 | reliably, you can ask for the month before the 15th of the current | |
340 | month. For example: | |
341 | ||
342 | $ date -R | |
343 | Thu, 31 Jul 2003 13:02:39 -0700 | |
344 | $ date --date='-1 month' +'Last month was %B?' | |
345 | Last month was July? | |
346 | $ date --date="$(date +%Y-%m-15) -1 month" +'Last month was %B!' | |
347 | Last month was June! | |
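
The surprising July result can be reproduced with a small model in which the month is shifted first and an out-of-range day number then spills over into the following month. This is a sketch under that assumption; `add_months` is a hypothetical helper, not the parser's own code.

```python
from datetime import date, timedelta


def add_months(day, months):
    """Shift DAY by MONTHS, letting an overlong day number spill over."""
    total = day.year * 12 + (day.month - 1) + months
    year, month0 = divmod(total, 12)
    first = date(year, month0 + 1, 1)
    # Day 31 of a 30-day month becomes day 1 of the next month.
    return first + timedelta(days=day.day - 1)
```

Under this model, 2003-07-31 minus one month is 2003-07-01, whereas starting from the 15th gives 2003-06-15 as intended.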
348 | ||
349 | Also, take care when manipulating dates around clock changes such as | |
350 | daylight saving leaps. In a few cases these have added or subtracted | |
351 | as much as 24 hours from the clock, so it is often wise to adopt | |
352 | universal time by setting the `TZ' environment variable to `UTC0' | |
353 | before embarking on calendrical calculations. | |
354 | ||
355 | 1.8 Pure numbers in date strings | |
356 | ================================ | |
357 | ||
358 | The precise interpretation of a pure decimal number depends on the | |
359 | context in the date string. | |
360 | ||
361 | If the decimal number is of the form YYYYMMDD and no other calendar | |
362 | date item (*note Calendar date items::) appears before it in the date | |
363 | string, then YYYY is read as the year, MM as the month number and DD as | |
364 | the day of the month, for the specified calendar date. | |
365 | ||
366 | If the decimal number is of the form HHMM and no other time of day | |
367 | item appears before it in the date string, then HH is read as the hour | |
368 | of the day and MM as the minute of the hour, for the specified time of | |
369 | day. MM can also be omitted. | |
370 | ||
371 | If both a calendar date and a time of day appear to the left of a | |
372 | number in the date string, but no relative item, then the number | |
373 | overrides the year. | |
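
The three cases above can be sketched as a single decision function. This is a simplified reading of the text, not the parser's logic: the name `interpret_pure_number`, its flags, and the choice to require exactly eight digits for a date and at most four for a time are assumptions made for illustration.

```python
def interpret_pure_number(digits, seen_date=False, seen_time=False,
                          seen_relative=False):
    """Classify a pure decimal number given what preceded it."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("not a pure decimal number: %r" % digits)
    if seen_date and seen_time and not seen_relative:
        return {"year": int(digits)}            # overrides the year
    if len(digits) == 8 and not seen_date:      # YYYYMMDD
        return {"year": int(digits[:4]),
                "month": int(digits[4:6]),
                "day": int(digits[6:])}
    if len(digits) <= 4 and not seen_time:      # HHMM, or HH alone
        if len(digits) <= 2:
            hour, minute = int(digits), 0
        else:
            hour, minute = divmod(int(digits), 100)
        return {"hour": hour, "minute": minute}
    raise ValueError("no interpretation for %r in this context" % digits)
```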
374 | ||
375 | 1.9 Seconds since the Epoch | |
376 | =========================== | |
377 | ||
378 | If you precede a number with `@', it represents an internal timestamp | |
379 | as a count of seconds. The number can contain an internal decimal | |
380 | point (either `.' or `,'); any excess precision not supported by the | |
381 | internal representation is truncated toward minus infinity. Such a | |
382 | number cannot be combined with any other date item, as it specifies a | |
383 | complete timestamp. | |
384 | ||
385 | Internally, computer times are represented as a count of seconds | |
386 | since an epoch--a well-defined point of time. On GNU and POSIX | |
387 | systems, the epoch is 1970-01-01 00:00:00 UTC, so `@0' represents this | |
388 | time, `@1' represents 1970-01-01 00:00:01 UTC, and so forth. GNU and | |
389 | most other POSIX-compliant systems support such times as an extension | |
390 | to POSIX, using negative counts, so that `@-1' represents 1969-12-31 | |
391 | 23:59:59 UTC. | |
392 | ||
393 | Traditional Unix systems count seconds with 32-bit two's-complement | |
394 | integers and can represent times from 1901-12-13 20:45:52 through | |
395 | 2038-01-19 03:14:07 UTC. More modern systems use 64-bit counts of | |
396 | seconds with nanosecond subcounts, and can represent all the times in | |
397 | the known lifetime of the universe to a resolution of 1 nanosecond. | |
398 | ||
399 | On most hosts, these counts ignore the presence of leap seconds. | |
400 | For example, on most hosts `@915148799' represents 1998-12-31 23:59:59 | |
401 | UTC, `@915148800' represents 1999-01-01 00:00:00 UTC, and there is no | |
402 | way to represent the intervening leap second 1998-12-31 23:59:60 UTC. | |
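
The `@' timestamps quoted in this section can be checked with ordinary calendar arithmetic that, like most hosts, ignores leap seconds. The sketch below uses Python's standard `datetime` module; `EPOCH` and `from_epoch` are names chosen for this example.

```python
from datetime import datetime, timedelta, timezone

# The POSIX epoch, 1970-01-01 00:00:00 UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_epoch(seconds):
    """Return the UTC time that lies SECONDS after the epoch."""
    return EPOCH + timedelta(seconds=seconds)


# Limits of a 32-bit two's-complement count of seconds.
OLDEST_32BIT = from_epoch(-2**31)       # 1901-12-13 20:45:52 UTC
NEWEST_32BIT = from_epoch(2**31 - 1)    # 2038-01-19 03:14:07 UTC
```

Here `from_epoch(915148799)` and `from_epoch(915148800)` are one second apart, so the leap second 1998-12-31 23:59:60 has no count of its own.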
403 | ||
404 | 1.10 Specifying time zone rules | |
405 | =============================== | |
406 | ||
407 | Normally, dates are interpreted using the rules of the current time | |
408 | zone, which in turn are specified by the `TZ' environment variable, or | |
409 | by a system default if `TZ' is not set. To specify a different set of | |
410 | default time zone rules that apply just to one date, start the date | |
411 | with a string of the form `TZ="RULE"'. The two quote characters (`"') | |
412 | must be present in the date, and any quotes or backslashes within RULE | |
413 | must be escaped by a backslash. | |
414 | ||
415 | For example, with the GNU `date' command you can answer the question | |
416 | "What time is it in New York when a Paris clock shows 6:30am on October | |
417 | 31, 2004?" by using a date beginning with `TZ="Europe/Paris"' as shown | |
418 | in the following shell transcript: | |
419 | ||
420 | $ export TZ="America/New_York" | |
421 | $ date --date='TZ="Europe/Paris" 2004-10-31 06:30' | |
422 | Sun Oct 31 01:30:00 EDT 2004 | |
423 | ||
424 | In this example, the `--date' operand begins with its own `TZ' | |
425 | setting, so the rest of that operand is processed according to | |
426 | `Europe/Paris' rules, treating the string `2004-10-31 06:30' as if it | |
427 | were in Paris. However, since the output of the `date' command is | |
428 | processed according to the overall time zone rules, it uses New York | |
429 | time. (Paris was normally six hours ahead of New York in 2004, but | |
430 | this example refers to a brief Halloween period when the gap was five | |
431 | hours.) | |
432 | ||
433 | A `TZ' value is a rule that typically names a location in the `tz' | |
434 | database (http://www.twinsun.com/tz/tz-link.htm). A recent catalog of | |
435 | location names appears in the TWiki Date and Time Gateway | |
436 | (http://twiki.org/cgi-bin/xtra/tzdate). A few non-GNU hosts require a | |
437 | colon before a location name in a `TZ' setting, e.g., | |
438 | `TZ=":America/New_York"'. | |
439 | ||
440 | The `tz' database includes a wide variety of locations ranging from | |
441 | `Arctic/Longyearbyen' to `Antarctica/South_Pole', but if you are at sea | |
442 | and have your own private time zone, or if you are using a non-GNU host | |
443 | that does not support the `tz' database, you may need to use a POSIX | |
444 | rule instead. Simple POSIX rules like `UTC0' specify a time zone | |
445 | without daylight saving time; other rules can specify simple daylight | |
446 | saving regimes. *Note Specifying the Time Zone with `TZ': (libc)TZ | |
447 | Variable. | |
448 | ||
449 | 1.11 Authors of `parse_datetime' | |
450 | ================================ | |
451 | ||
452 | `parse_datetime' started life as `getdate', as originally implemented | |
453 | by Steven M. Bellovin (<smb@research.att.com>) while at the University | |
454 | of North Carolina at Chapel Hill. The code was later tweaked by a | |
455 | couple of people on Usenet, then completely overhauled by Rich $alz | |
456 | (<rsalz@bbn.com>) and Jim Berets (<jberets@bbn.com>) in August, 1990. | |
457 | Various revisions for the GNU system were made by David MacKenzie, Jim | |
458 | Meyering, Paul Eggert and others, including renaming it to `get_date' to | |
459 | avoid a conflict with the alternative POSIX function `getdate', and a | |
460 | later rename to `parse_datetime'. The POSIX function `getdate' can | |
461 | parse more locale-specific dates using `strptime', but relies on an | |
462 | environment variable and external file, and lacks the thread-safety of | |
463 | `parse_datetime'. | |
464 | ||
465 | This chapter was originally produced by François Pinard | |
466 | (<pinard@iro.umontreal.ca>) from the `parse_datetime.y' source code, |
467 | and then edited by K. Berry (<kb@cs.umb.edu>). | |
468 |