'\"
.\" (C) Copyright 1999-2000 David A. Wheeler (dwheeler@dwheeler.com)
.\"
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date. The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein. The author(s) may not
.\" have taken the same level of care in the production of this manual,
.\" which is licensed free of charge, as they might when working
.\" professionally.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\"
.\" Fragments of this document are directly derived from IETF standards.
.\" For those fragments which are directly derived from such standards,
.\" the following notice applies, which is the standard copyright and
.\" rights announcement of The Internet Society:
.\"
.\" Copyright (C) The Internet Society (1998). All Rights Reserved.
.\" This document and translations of it may be copied and furnished to
.\" others, and derivative works that comment on or otherwise explain it
.\" or assist in its implementation may be prepared, copied, published
.\" and distributed, in whole or in part, without restriction of any
.\" kind, provided that the above copyright notice and this paragraph are
.\" included on all such copies and derivative works. However, this
.\" document itself may not be modified in any way, such as by removing
.\" the copyright notice or references to the Internet Society or other
.\" Internet organizations, except as needed for the purpose of
.\" developing Internet standards in which case the procedures for
.\" copyrights defined in the Internet Standards process must be
.\" followed, or as required to translate it into languages other than English.
.\"
.\" Modified Fri Jul 25 23:00:00 1999 by David A. Wheeler (dwheeler@dwheeler.com)
.\" Modified Fri Aug 21 23:00:00 1999 by David A. Wheeler (dwheeler@dwheeler.com)
.\" Modified Tue Mar 14 2000 by David A. Wheeler (dwheeler@dwheeler.com)
.\"
.TH URI 7 2000-03-14 "Linux" "Linux Programmer's Manual"
.SH NAME
uri, url, urn \- uniform resource identifier (URI), including a URL or URN
.SH SYNOPSIS
.nf
.HP 0.2i
URI = [ absoluteURI | relativeURI ] [ "#" fragment ]
.HP
absoluteURI = scheme ":" ( hierarchical_part | opaque_part )
.HP
relativeURI = ( net_path | absolute_path | relative_path ) [ "?" query ]
.HP
scheme = "http" | "ftp" | "gopher" | "mailto" | "news" | "telnet" |
 "file" | "man" | "info" | "whatis" | "ldap" | "wais" | \&...
.HP
hierarchical_part = ( net_path | absolute_path ) [ "?" query ]
.HP
net_path = "//" authority [ absolute_path ]
.HP
absolute_path = "/" path_segments
.HP
relative_path = relative_segment [ absolute_path ]
.fi
.SH DESCRIPTION
.PP
A Uniform Resource Identifier (URI) is a short string of characters
identifying an abstract or physical resource (for example, a web page).
A Uniform Resource Locator (URL) is a URI
that identifies a resource through its primary access
mechanism (e.g., its network "location"), rather than
by name or some other attribute of that resource.
A Uniform Resource Name (URN) is a URI
that must remain globally unique and persistent even when
the resource ceases to exist or becomes unavailable.
.PP
URIs are the standard way to name hypertext link destinations
for tools such as web browsers.
The string "http://www.kernelnotes.org" is a URL (and thus it
is also a URI).
Many people use the term URL loosely as a synonym for URI
(though technically URLs are a subset of URIs).
.PP
URIs can be absolute or relative.
An absolute identifier refers to a resource independent of
context, while a relative
identifier refers to a resource by describing the difference
from the current context.
Within a relative path reference, the complete path segments "." and
".." have special meanings: "the current hierarchy level" and "the
level above this hierarchy level", respectively, just like they do in
Unix-like systems.
A path segment which contains a colon
character can't be used as the first segment of a relative URI path
(e.g., "this:that"), because it would be mistaken for a scheme name;
precede such segments with ./ (e.g., "./this:that").
Note that descendants of MS-DOS (e.g., Microsoft Windows) replace
devicename colons with the vertical bar ("|") in URIs, so "C:" becomes "C|".
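.PP
As a rough illustration of relative references, the following Python sketch
uses the standard urllib.parse module.
The base URI is a made-up placeholder, and the output comments show what
this particular library produces rather than a normative result.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical base URI, used only for illustration.
BASE = "http://a/b/c/d"

# "." and ".." behave like the current and parent hierarchy levels.
same_level = urljoin(BASE, "./g")
parent_level = urljoin(BASE, "../g")
print(same_level)    # http://a/b/c/g
print(parent_level)  # http://a/b/g

# A bare first segment containing a colon is parsed as a scheme name ...
mistaken_scheme = urlsplit("this:that").scheme
print(mistaken_scheme)  # this

# ... so prefixing it with "./" keeps it a relative path.
colon_segment = urljoin(BASE, "./this:that")
print(colon_segment)  # http://a/b/c/this:that
```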
.PP
A fragment identifier, if included, refers to a particular named portion
(fragment) of a resource; text after a \(aq#\(aq identifies the fragment.
A URI beginning with \(aq#\(aq refers to that fragment in the current resource.
.SS Usage
There are many different URI schemes, each with specific
additional rules and meanings, but they are intentionally made to be
as similar as possible.
For example, many URL schemes
permit the authority to have the following format, called here an
.I ip_server
(square brackets show what's optional):
.HP
.IR "ip_server = " [ user " [ : " password " ] @ ] " host " [ : " port ]
.PP
This format allows you to optionally insert a username,
a username plus password, and/or a port number.
The
.I host
is the name of the host computer, either its name as determined by DNS
or an IP address (numbers separated by periods).
Thus the URI
<http://fred:fredpassword@xyz.com:8080/>
logs into a web server on host xyz.com
as fred (using fredpassword) using port 8080.
Avoid including a password in a URI if possible because of the many
security risks of having a password written down.
If the URL supplies a username but no password, and the remote
server requests a password, the program interpreting the URL
should request one from the user.
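.PP
One way to pull the
.I ip_server
pieces apart is sketched below with Python's urllib.parse module,
using the example URI from the text.
This is only an illustration of the format, not a recommendation to
put passwords in URIs.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://fred:fredpassword@xyz.com:8080/")
print(parts.username)  # fred
print(parts.password)  # fredpassword
print(parts.hostname)  # xyz.com
print(parts.port)      # 8080

# With no explicit port, .port is None and the scheme's default applies.
no_port = urlsplit("http://xyz.com/").port
print(no_port)  # None
```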
.PP
Here are some of the most common schemes in use on Unix-like systems
that are understood by many tools.
Note that many tools using URIs also have internal schemes or specialized
schemes; see those tools' documentation for information on those schemes.
.PP
.B "http \- Web (HTTP) server"
.PP
.RI http:// ip_server / path
.br
.RI http:// ip_server / path ? query
.PP
This is a URL accessing a web (HTTP) server.
The default port is 80.
If the path refers to a directory, the web server will choose what
to return; usually, if there is a file named "index.html" or "index.htm",
its content is returned; otherwise, a list of the files in the current
directory (with appropriate links) is generated and returned.
An example is <http://lwn.net>.
.PP
A query can be given in the archaic "isindex" format, consisting of a
word or phrase and not including an equal sign (=).
A query can also be in the longer "GET" format, which has one or more
query entries of the form
.IR key = value
separated by the ampersand character (&).
Note that
.I key
can be repeated, though it's up to the web server
and its application programs to determine if there's any meaning to that.
There is an unfortunate interaction with HTML/XML/SGML and
the GET query format; when such URIs with more than one key
are embedded in SGML/XML documents (including HTML), the ampersand
(&) has to be rewritten as &amp;.
Note that not all queries use this format; larger forms
may be too long to store as a URI, so they use a different
interaction mechanism (called POST) which does
not include the data in the URI.
See the Common Gateway Interface specification at
<http://www.w3.org/CGI> for more information.
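.PP
The "GET" query format and the ampersand rewriting can be sketched in
Python as follows.
The host name and the keys are invented for illustration; how a
repeated key is interpreted remains up to the server.

```python
from html import escape
from urllib.parse import parse_qs, urlsplit

# Hypothetical URI; the host and keys are invented for illustration.
uri = "http://www.example.com/search?key=a&key=b&lang=en"

# Repeated keys are collected into a list per key.
query = parse_qs(urlsplit(uri).query)
print(query)  # {'key': ['a', 'b'], 'lang': ['en']}

# When the URI is embedded in HTML, each "&" is rewritten as "&amp;".
embedded = escape(uri)
print(embedded)
```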
.PP
.B "ftp \- File Transfer Protocol (FTP)"
.PP
.RI ftp:// ip_server / path
.PP
This is a URL accessing a file through the file transfer protocol (FTP).
The default port (for control) is 21.
If no username is included, the username "anonymous" is supplied, and
in that case many clients provide as the password the requestor's
Internet email address.
An example is
<ftp://ftp.is.co.za/rfc/rfc1808.txt>.
.PP
.B "gopher \- Gopher server"
.PP
.RI gopher:// ip_server / "gophertype selector"
.br
.RI gopher:// ip_server / "gophertype selector" %09 search
.br
.RI gopher:// ip_server / "gophertype selector" %09 search %09 gopher+_string
.br
.PP
The default gopher port is 70.
.I gophertype
is a single-character field to denote the
Gopher type of the resource to
which the URL refers.
The entire path may also be empty, in
which case the delimiting "/" is also optional and the gophertype
defaults to "1".
.PP
.I selector
is the Gopher selector string.
In the Gopher protocol,
Gopher selector strings are a sequence of octets which may contain
any octets except 09 hexadecimal (US-ASCII HT or tab), 0A hexadecimal
(US-ASCII character LF), and 0D (US-ASCII character CR).
.PP
.B "mailto \- Email address"
.PP
.RI mailto: email-address
.PP
This is an email address, usually of the form
.IR name @ hostname .
See
.BR mailaddr (7)
for more information on the correct format of an email address.
Note that any % character must be rewritten as %25.
An example is <mailto:dwheeler@dwheeler.com>.
.PP
.B "news \- Newsgroup or News message"
.PP
.RI news: newsgroup-name
.br
.RI news: message-id
.PP
A
.I newsgroup-name
is a period-delimited hierarchical name, such as
"comp.infosystems.www.misc".
If <newsgroup-name> is "*" (as in <news:*>), it is used to refer
to "all available news groups".
An example is <news:comp.lang.ada>.
.PP
A
.I message-id
corresponds to the Message-ID of
.UR http://www.ietf.org/rfc/rfc1036.txt
IETF RFC\ 1036,
.UE
without the enclosing "<"
and ">"; it takes the form
.IR unique @ full_domain_name .
A message identifier may be distinguished from a news group name by the
presence of the "@" character.
.PP
.B "telnet \- Telnet login"
.PP
.RI telnet:// ip_server /
.PP
The Telnet URL scheme is used to designate interactive text services that
may be accessed by the Telnet protocol.
The final "/" character may be omitted.
The default port is 23.
An example is <telnet://melvyl.ucop.edu/>.
.PP
.B "file \- Normal file"
.PP
.RI file:// ip_server / path_segments
.br
.RI file: path_segments
.PP
This represents a file or directory accessible locally.
As a special case,
.I host
can be the string "localhost" or the empty
string; this is interpreted as "the machine from which the URL is
being interpreted".
If the path is to a directory, the viewer should display the
directory's contents with links to each entry;
not all viewers currently do this.
KDE supports generated files through the URL <file:/cgi-bin>.
If the given file isn't found, browser writers may want to try to expand
the filename via filename globbing
(see
.BR glob (7)
and
.BR glob (3)).
.PP
The second format (e.g., <file:/etc/passwd>)
is a correct format for referring to
a local file.
However, older standards did not permit this format,
and some programs don't recognize this as a URI.
A more portable syntax is to use an empty string as the server name,
for example,
<file:///etc/passwd>; this form does the same thing
and is easily recognized by pattern matchers and older programs as a URI.
Note that if you really mean to say "start from the current location," don't
specify the scheme at all; use a relative address like <../test.txt>,
which has the side-effect of being scheme-independent.
An example of this scheme is <file:///etc/passwd>.
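.PP
The two file: spellings can be compared with a short Python sketch.
The helper name path_to_file_uri is hypothetical, and it assumes an
absolute Unix-style path; it simply builds the portable form with an
empty server name.

```python
from urllib.parse import quote, urlsplit

def path_to_file_uri(path):
    """Build file:///... from an absolute Unix-style path (assumed)."""
    if not path.startswith("/"):
        raise ValueError("expected an absolute path")
    return "file://" + quote(path)

portable_uri = path_to_file_uri("/etc/passwd")
print(portable_uri)  # file:///etc/passwd

# Both spellings carry the same path; the portable one has an empty host.
short = urlsplit("file:/etc/passwd")
portable = urlsplit(portable_uri)
print(short.path, portable.path)  # /etc/passwd /etc/passwd
print(repr(portable.netloc))      # ''
```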
.PP
.B "man \- Man page documentation"
.PP
.RI man: command-name
.br
.RI man: command-name ( section )
.PP
This refers to local online manual (man) reference pages.
The command name can optionally be followed by a
parenthesis and section number; see
.BR man (7)
for more information on the meaning of the section numbers.
This URI scheme is unique to Unix-like systems (such as Linux)
and is not currently registered by the IETF.
An example is <man:ls(1)>.
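.PP
A tool that wants to accept man: URIs could split them roughly as
follows.
This is a minimal sketch in Python; the function name parse_man_uri and
the regular expression are illustrative assumptions, not part of any
specification.

```python
import re

# Pattern for the two forms shown above: man:name and man:name(section).
MAN_URI = re.compile(r"man:([^()]+)(?:\(([^()]+)\))?")

def parse_man_uri(uri):
    """Return (command_name, section); section is None when omitted."""
    match = MAN_URI.fullmatch(uri)
    if match is None:
        raise ValueError(f"not a man: URI: {uri!r}")
    return match.group(1), match.group(2)

print(parse_man_uri("man:ls(1)"))  # ('ls', '1')
print(parse_man_uri("man:ls"))     # ('ls', None)
```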
.PP
.B "info \- Info page documentation"
.PP
.RI info: virtual-filename
.br
.RI info: virtual-filename # nodename
.br
.RI info:( virtual-filename )
.br
.RI info:( virtual-filename ) nodename
.PP
This scheme refers to online info reference pages (generated from
texinfo files),
a documentation format used by programs such as the GNU tools.
This URI scheme is unique to Unix-like systems (such as Linux)
and is not currently registered by the IETF.
As of this writing, GNOME and KDE differ in their URI syntax
and do not accept the other's syntax.
The first two formats are the GNOME format; in nodenames all spaces
are written as underscores.
The last two formats are the KDE format;
spaces in nodenames must be written as spaces, even though this
is forbidden by the URI standards.
It's hoped that in the future most tools will understand all of these
formats and will always accept underscores for spaces in nodenames.
In both GNOME and KDE, if the form without the nodename is used, the
nodename is assumed to be "Top".
Examples of the GNOME format are <info:gcc> and <info:gcc#G++_and_GCC>.
Examples of the KDE format are <info:(gcc)> and <info:(gcc)G++ and GCC>.
.PP
.B "whatis \- Documentation search"
.PP
.RI whatis: string
.PP
This scheme searches the database of short (one-line) descriptions of
commands and returns a list of descriptions containing that string.
Only complete word matches are returned.
See
.BR whatis (1).
This URI scheme is unique to Unix-like systems (such as Linux)
and is not currently registered by the IETF.
.PP
.B "ghelp \- GNOME help documentation"
.PP
.RI ghelp: name-of-application
.PP
This loads GNOME help for the given application.
Note that not much documentation currently exists in this format.
.PP
.B "ldap \- Lightweight Directory Access Protocol"
.PP
.RI ldap:// hostport
.br
.RI ldap:// hostport /
.br
.RI ldap:// hostport / dn
.br
.RI ldap:// hostport / dn ? attributes
.br
.RI ldap:// hostport / dn ? attributes ? scope
.br
.RI ldap:// hostport / dn ? attributes ? scope ? filter
.br
.RI ldap:// hostport / dn ? attributes ? scope ? filter ? extensions
.PP
This scheme supports queries to the
Lightweight Directory Access Protocol (LDAP), a protocol for querying
a set of servers for hierarchically organized information
(such as people and computing resources).
More information on the LDAP URL scheme is available in
.UR http://www.ietf.org/rfc/rfc2255.txt
RFC\ 2255.
.UE
The components of this URL are:
.IP hostport 12
the LDAP server to query, written as a hostname optionally followed by
a colon and the port number.
The default LDAP port is TCP port 389.
If empty, the client determines which LDAP server to use.
.IP dn
the LDAP Distinguished Name, which identifies
the base object of the LDAP search (see
.UR http://www.ietf.org/rfc/rfc2253.txt
RFC\ 2253
.UE
section 3).
.IP attributes
a comma-separated list of attributes to be returned;
see RFC\ 2251 section 4.1.5.
If omitted, all attributes should be returned.
.IP scope
specifies the scope of the search, which can be one of
"base" (for a base object search), "one" (for a one-level search),
or "sub" (for a subtree search).
If scope is omitted, "base" is assumed.
.IP filter
specifies the search filter (subset of entries
to return).
If omitted, all entries should be returned.
See
.UR http://www.ietf.org/rfc/rfc2254.txt
RFC\ 2254
.UE
section 4.
.IP extensions
a comma-separated list of type=value
pairs, where the =value portion may be omitted for options not
requiring it.
An extension prefixed with a \(aq!\(aq is critical
(must be supported to be valid); otherwise it is non-critical (optional).
.PP
LDAP queries are easiest to explain by example.
Here's a query that asks ldap.itd.umich.edu for information about
the University of Michigan in the U.S.:
.PP
.nf
ldap://ldap.itd.umich.edu/o=University%20of%20Michigan,c=US
.fi
.PP
To just get its postal address attribute, request:
.PP
.nf
ldap://ldap.itd.umich.edu/o=University%20of%20Michigan,c=US?postalAddress
.fi
.PP
To ask host.com at port 6666 for information about the person
with common name (cn) "Babs Jensen" at University of Michigan, request:
.PP
.nf
ldap://host.com:6666/o=University%20of%20Michigan,c=US??sub?(cn=Babs%20Jensen)
.fi
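.PP
The component layout can be made concrete with a small Python sketch
that splits the example URLs above.
The function name parse_ldap_url and the dictionary layout are
assumptions chosen for illustration; a real LDAP client library would
do considerably more validation.

```python
from urllib.parse import unquote, urlsplit

def parse_ldap_url(url):
    """Split an ldap: URL into the components listed above."""
    parts = urlsplit(url)
    # Everything after the first "?" holds up to four "?"-separated fields.
    fields = (parts.query.split("?") + [""] * 4)[:4]
    attributes, scope, search_filter, extensions = fields
    return {
        "host": parts.hostname,
        "port": parts.port or 389,        # default LDAP port
        "dn": unquote(parts.path.lstrip("/")),
        "attributes": attributes.split(",") if attributes else [],
        "scope": scope or "base",         # "base" is assumed if omitted
        "filter": unquote(search_filter),
        "extensions": extensions.split(",") if extensions else [],
    }

babs = parse_ldap_url(
    "ldap://host.com:6666/o=University%20of%20Michigan,c=US"
    "??sub?(cn=Babs%20Jensen)"
)
print(babs["dn"])      # o=University of Michigan,c=US
print(babs["scope"])   # sub
print(babs["filter"])  # (cn=Babs Jensen)
```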
.PP
.B "wais \- Wide Area Information Servers"
.PP
.RI wais:// hostport / database
.br
.RI wais:// hostport / database ? search
.br
.RI wais:// hostport / database / wtype / wpath
.PP
This scheme designates a WAIS database, search, or document
(see
.UR http://www.ietf.org/rfc/rfc1625.txt
IETF RFC\ 1625
.UE
for more information on WAIS).
Hostport is the hostname, optionally followed by a colon and port number
(the default port number is 210).
.PP
The first form designates a WAIS database for searching.
The second form designates a particular search of the WAIS database
.IR database .
The third form designates a particular document within a WAIS
database to be retrieved.
.I wtype
is the WAIS designation of the type of the object and
.I wpath
is the WAIS document-id.
MK
469.PP
470.B "other schemes"
471.PP
fea681da
MK
472There are many other URI schemes.
473Most tools that accept URIs support a set of internal URIs
474(e.g., Mozilla has the about: scheme for internal information,
475and the GNOME help browser has the toc: scheme for various starting
476locations).
477There are many schemes that have been defined but are not as widely
478used at the current time
479(e.g., prospero).
480The nntp: scheme is deprecated in favor of the news: scheme.
481URNs are to be supported by the urn: scheme, with a hierarchical name space
482(e.g., urn:ietf:... would identify IETF documents); at this time
483URNs are not widely implemented.
484Not all tools support all schemes.
.SS "Character Encoding"
.PP
URIs use a limited number of characters so that they can be
typed in and used in a variety of situations.
.PP
The following characters are reserved, that is, they may appear in a
URI but their use is limited to their reserved purpose
(conflicting data must be escaped before forming the URI):
.IP
 ; / ? : @ & = + $ ,
.PP
Unreserved characters may be included in a URI.
Unreserved characters
include upper and lower case English letters,
decimal digits, and the following
limited set of punctuation marks and symbols:
.IP
 \- _ . ! ~ * ' ( )
.PP
All other characters must be escaped.
An escaped octet is encoded as a character triplet, consisting of the
percent character "%" followed by the two hexadecimal digits
representing the octet code (you can use upper or lower case letters
for the hexadecimal digits).
For example, a blank space must be escaped
as "%20", a tab character as "%09", and the "&" as "%26".
Because the percent "%" character always has the reserved purpose of
being the escape indicator, it must be escaped as "%25".
It is common practice to escape space characters as the plus symbol (+)
in query text; this practice isn't uniformly defined
in the relevant RFCs (which recommend %20 instead) but any tool accepting
URIs with query text should be prepared for them.
A URI is always shown in its "escaped" form.
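.PP
The escaping rules above can be tried out with Python's urllib.parse
module, as in the sketch below.
Note that this library's choice of which characters to leave unescaped
is its own and may differ slightly from the lists given here.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

print(quote(" "))   # %20
print(quote("\t"))  # %09
print(quote("&"))   # %26
print(quote("%"))   # %25

# Query text often uses "+" for a space; decoders should accept both.
print(quote_plus("two words"))      # two+words
print(unquote_plus("two+words"))    # two words
print(unquote_plus("two%20words"))  # two words

# Hexadecimal digits may be upper or lower case.
print(unquote("%7e") == unquote("%7E") == "~")  # True
```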
.PP
Unreserved characters can be escaped without changing the semantics
of the URI, but this should not be done unless the URI is being used
in a context that does not allow the unescaped character to appear.
For example, "%7e" is sometimes used instead of "~" in an HTTP URL
path, but the two are equivalent for an HTTP URL.
.PP
For URIs which must handle characters outside the US ASCII character set,
the HTML 4.01 specification (section B.2) and
IETF RFC\ 2718 (section 2.2.5) recommend the following approach:
.IP 1. 4
translate the character sequences into UTF-8 (IETF RFC\ 2279) \(em see
.BR utf-8 (7)
\(em and then
.IP 2.
use the URI escaping mechanism, that is,
use the %HH encoding for unsafe octets.
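.PP
The two steps can be sketched in Python as follows.
The helper escape_non_ascii is hypothetical and deliberately narrow: it
escapes only octets outside US ASCII and leaves every ASCII character
alone, so it is not a complete URI escaper.

```python
from urllib.parse import quote

def escape_non_ascii(text):
    """Step 1: encode as UTF-8. Step 2: %HH-escape each non-ASCII octet."""
    out = []
    for octet in text.encode("utf-8"):
        if octet < 0x80:
            out.append(chr(octet))  # ASCII passes through unchanged here
        else:
            out.append(f"%{octet:02X}")
    return "".join(out)

escaped = escape_non_ascii("/caf\u00e9")
print(escaped)  # /caf%C3%A9

# The standard library's quote() also encodes as UTF-8 by default.
print(quote("/caf\u00e9") == escaped)  # True
```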
.SS "Writing a URI"
When written, URIs should be placed inside double quotes
(e.g., "http://www.kernelnotes.org"),
enclosed in angle brackets (e.g., <http://lwn.net>),
or placed on a line by themselves.
A warning for those who use double-quotes:
.B never
move extraneous punctuation (such as the period ending a sentence or the
comma in a list)
inside a URI, since this will change the value of the URI.
Instead, use angle brackets, or
switch to a quoting system that never includes extraneous characters
inside quotation marks.
This latter system, called the 'new' or 'logical' quoting system by
"Hart's Rules" and the "Oxford Dictionary for Writers and Editors",
is the preferred practice in Great Britain and among hackers worldwide
(see the
Jargon File's section on Hacker Writing Style,
.IR http://www.fwi.uva.nl/~mes/jargon/h/HackerWritingStyle.html ,
for more information).
Older documents suggested inserting the prefix "URL:"
just before the URI, but this form has never caught on.
.PP
The URI syntax was designed to be unambiguous.
However, as URIs have become commonplace, traditional media
(television, radio, newspapers, billboards, etc.) have increasingly
used abbreviated URI references consisting of
only the authority and path portions of the identified resource
(e.g., <www.w3.org/Addressing>).
Such references are primarily
intended for human interpretation rather than machine, with the
assumption that context-based heuristics are sufficient to complete
the URI (e.g., hostnames beginning with "www" are likely to have
a URI prefix of "http://" and hostnames beginning with "ftp" likely
to have a prefix of "ftp://").
Many client implementations heuristically resolve these references.
Such heuristics may
change over time, particularly when new schemes are introduced.
Since an abbreviated URI has the same syntax as a relative URL path,
abbreviated URI references cannot be used where relative URIs are
permitted, and can only be used when there is no defined base
(such as in dialog boxes).
Don't use abbreviated URIs as hypertext links inside a document;
use the standard format as described here.
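.PP
A heuristic of the kind described above might look like the following
Python sketch.
The function name complete_abbreviated_uri is hypothetical, and falling
back to "http://" for every host that does not begin with "ftp" is an
assumption of this example; real clients use their own, changing rules.

```python
def complete_abbreviated_uri(reference):
    """Guess a scheme for a reference such as www.w3.org/Addressing."""
    if "://" in reference:
        return reference  # already has a scheme
    host = reference.split("/", 1)[0].lower()
    if host.startswith("ftp"):
        return "ftp://" + reference
    # Assumption: use http for "www" hosts and for everything else.
    return "http://" + reference

print(complete_abbreviated_uri("www.w3.org/Addressing"))
print(complete_abbreviated_uri("ftp.is.co.za/rfc/rfc1808.txt"))
```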
.SH "CONFORMING TO"
.PP
.I http://www.ietf.org/rfc/rfc2396.txt
(IETF RFC\ 2396),
.I http://www.w3.org/TR/REC-html40
(HTML 4.0).
.SH NOTES
Any tool accepting URIs (e.g., a web browser) on a Linux system should
be able to handle (directly or indirectly) all of the
schemes described here, including the man: and info: schemes.
Handling them by invoking some other program is
fine and in fact encouraged.
.PP
Technically the fragment isn't part of the URI.
.PP
For information on how to embed URIs (including URLs) in a data format,
see documentation on that format.
HTML uses the format <A HREF="\fIuri\fP">
.I text
</A>.
Texinfo files use the format @uref{\fIuri\fP}.
Man and mdoc have the recently added UR macro, or just include the
URI in the text (viewers should be able to detect :// as part of a URI).
.PP
The GNOME and KDE desktop environments currently vary in the URIs
they accept, in particular in their respective help browsers.
To list man pages, GNOME uses <toc:man> while KDE uses <man:(index)>, and
to list info pages, GNOME uses <toc:info> while KDE uses <info:(dir)>
(the author of this man page prefers the KDE approach here, though a more
regular format would be even better).
In general, KDE uses <file:/cgi-bin/> as a prefix to a set of generated
files.
KDE prefers documentation in HTML, accessed via
<file:/cgi-bin/helpindex>.
GNOME prefers the ghelp scheme to store and find documentation.
Neither browser handles file: references to directories at the time
of this writing, making it difficult to refer to an entire directory with
a browsable URI.
As noted above, these environments differ in how they handle the
info: scheme, probably the most important variation.
It is expected that GNOME and KDE
will converge to common URI formats, and a future
version of this man page will describe the converged result.
Efforts to aid this convergence are encouraged.
.SS Security
.PP
A URI does not in itself pose a security threat.
There is no general guarantee that a URL, which at one time
located a given resource, will continue to do so.
Nor is there any
guarantee that a URL will not locate a different resource at some
later point in time; such a guarantee can only be
obtained from the person(s) controlling that namespace and the
resource in question.
.PP
It is sometimes possible to construct a URL such that an attempt to
perform a seemingly harmless operation, such as the
retrieval of an entity associated with the resource, will in fact
cause a possibly damaging remote operation to occur.
The unsafe URL
is typically constructed by specifying a port number other than that
reserved for the network protocol in question.
The client unwittingly contacts a site that is in fact
running a different protocol.
The content of the URL contains instructions that, when
interpreted according to this other protocol, cause an unexpected
operation.
An example has been the use of a gopher URL to cause an
unintended or impersonating message to be sent via an SMTP server.
.PP
Caution should be used when using any URL that specifies a port
number other than the default for the protocol, especially when it is
a number within the reserved space.
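.PP
A tool could flag such URLs before following them, for instance along
the lines of this Python sketch.
The default ports are the ones listed in this page; the function name
port_warning, the example host, and the treatment of ports below 1024
as "the reserved space" are assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Default ports as listed in this page.
DEFAULT_PORTS = {"http": 80, "ftp": 21, "gopher": 70,
                 "telnet": 23, "ldap": 389, "wais": 210}

def port_warning(url):
    """Return a warning string for a suspicious port, or None."""
    parts = urlsplit(url)
    default = DEFAULT_PORTS.get(parts.scheme)
    if parts.port is None or parts.port == default:
        return None
    if parts.port < 1024:  # assumed meaning of "reserved space"
        return f"port {parts.port} is non-default and in the reserved range"
    return f"port {parts.port} is non-default"

# A gopher URL aimed at the SMTP port, as in the example above.
print(port_warning("gopher://example.com:25/"))
print(port_warning("http://lwn.net"))  # None
```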
.PP
Care should be taken when a URI contains escaped delimiters for a
given protocol (for example, CR and LF characters for telnet
protocols) that these are not unescaped before transmission.
This might violate the protocol, but avoids the potential for such
characters to be used to simulate an extra operation or parameter in
that protocol, which might cause an unexpected and possibly harmful
remote operation to be performed.
.PP
It is clearly unwise to use a URI that contains a password which is
intended to be secret.
In particular, the use of a password within
the "userinfo" component of a URI is strongly discouraged except
in those rare cases where the "password" parameter is intended to be public.
.SH BUGS
.PP
Documentation may be placed in a variety of locations, so there
currently isn't a good URI scheme for general online documentation
in arbitrary formats.
References of the form
<file:///usr/doc/ZZZ> don't work because different distributions and
local installation requirements may place the files in different
directories
(it may be in /usr/doc, or /usr/local/doc, or /usr/share,
or somewhere else).
Also, the directory ZZZ usually changes when a version changes
(though filename globbing could partially overcome this).
Finally, using the file: scheme doesn't easily support people
who dynamically load documentation from the Internet (instead of
loading the files onto a local file system).
A future URI scheme may be added (e.g., "userdoc:") to permit
programs to include cross-references to more detailed documentation
without having to know the exact location of that documentation.
Alternatively, a future version of the file-system specification may
specify file locations sufficiently so that the file: scheme will
be able to locate documentation.
.PP
Many programs and file formats don't include a way to incorporate
or implement links using URIs.
.PP
Many programs can't handle all of these different URI formats; there
should be a standard mechanism to load an arbitrary URI that automatically
detects the user's environment (e.g., text or graphics,
desktop environment, local user preferences, and currently executing
tools) and invokes the right tool for any URI.
.\" .SH AUTHOR
.\" David A. Wheeler (dwheeler@dwheeler.com) wrote this man page.
.SH "SEE ALSO"
.BR lynx (1),
.BR man2html (1),
.BR mailaddr (7),
.BR utf-8 (7),
.UR http://www.ietf.org/rfc/rfc2255.txt
IETF RFC\ 2255
.UE