.\" (C) Copyright 1999-2000 David A. Wheeler (dwheeler@dwheeler.com)
.\"
.\" %%%LICENSE_START(VERBATIM)
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date. The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein. The author(s) may not
.\" have taken the same level of care in the production of this manual,
.\" which is licensed free of charge, as they might when working
.\" professionally.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\" %%%LICENSE_END
.\"
.\" Fragments of this document are directly derived from IETF standards.
.\" For those fragments which are directly derived from such standards,
.\" the following notice applies, which is the standard copyright and
.\" rights announcement of The Internet Society:
.\"
.\" Copyright (C) The Internet Society (1998). All Rights Reserved.
.\" This document and translations of it may be copied and furnished to
.\" others, and derivative works that comment on or otherwise explain it
.\" or assist in its implementation may be prepared, copied, published
.\" and distributed, in whole or in part, without restriction of any
.\" kind, provided that the above copyright notice and this paragraph are
.\" included on all such copies and derivative works. However, this
.\" document itself may not be modified in any way, such as by removing
.\" the copyright notice or references to the Internet Society or other
.\" Internet organizations, except as needed for the purpose of
.\" developing Internet standards in which case the procedures for
.\" copyrights defined in the Internet Standards process must be
.\" followed, or as required to translate it into languages other than English.
.\"
.\" Modified Fri Jul 25 23:00:00 1999 by David A. Wheeler (dwheeler@dwheeler.com)
.\" Modified Fri Aug 21 23:00:00 1999 by David A. Wheeler (dwheeler@dwheeler.com)
.\" Modified Tue Mar 14 2000 by David A. Wheeler (dwheeler@dwheeler.com)
.\"
.TH URI 7 2014-03-18 "Linux" "Linux Programmer's Manual"
.SH NAME
uri, url, urn \- uniform resource identifier (URI), including a URL or URN
.SH SYNOPSIS
.nf
.HP 0.2i
URI = [ absoluteURI | relativeURI ] [ "#" fragment ]
.HP
absoluteURI = scheme ":" ( hierarchical_part | opaque_part )
.HP
relativeURI = ( net_path | absolute_path | relative_path ) [ "?" query ]
.HP
scheme = "http" | "ftp" | "gopher" | "mailto" | "news" | "telnet" |
 "file" | "man" | "info" | "whatis" | "ldap" | "wais" | \&...
.HP
hierarchical_part = ( net_path | absolute_path ) [ "?" query ]
.HP
net_path = "//" authority [ absolute_path ]
.HP
absolute_path = "/" path_segments
.HP
relative_path = relative_segment [ absolute_path ]
.fi
.SH DESCRIPTION
.PP
A Uniform Resource Identifier (URI) is a short string of characters
identifying an abstract or physical resource (for example, a web page).
A Uniform Resource Locator (URL) is a URI
that identifies a resource through its primary access
mechanism (e.g., its network "location"), rather than
by name or some other attribute of that resource.
A Uniform Resource Name (URN) is a URI
that must remain globally unique and persistent even when
the resource ceases to exist or becomes unavailable.
.PP
URIs are the standard way to name hypertext link destinations
for tools such as web browsers.
The string "http://www.kernelnotes.org" is a URL (and thus it
is also a URI).
Many people use the term URL loosely as a synonym for URI
(though technically URLs are a subset of URIs).
.PP
URIs can be absolute or relative.
An absolute identifier refers to a resource independent of
context, while a relative
identifier refers to a resource by describing the difference
from the current context.
Within a relative path reference, the complete path segments "." and
".." have special meanings: "the current hierarchy level" and "the
level above this hierarchy level", respectively, just like they do in
UNIX-like systems.
A path segment which contains a colon
character can't be used as the first segment of a relative URI path
(e.g., "this:that"), because it would be mistaken for a scheme name;
precede such segments with ./ (e.g., "./this:that").
Note that descendants of MS-DOS (e.g., Microsoft Windows) replace
devicename colons with the vertical bar ("|") in URIs, so "C:" becomes "C|".
.PP
A fragment identifier, if included, refers to a particular named portion
(fragment) of a resource; text after a \(aq#\(aq identifies the fragment.
A URI beginning with \(aq#\(aq refers to that fragment in the current resource.
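.PP
As a rough illustration of the rules above, the following sketch uses
Python's standard urllib.parse module to resolve relative references
against a base and to split off a fragment.
The base URI and the names BASE, same_level, one_up, mistaken and
protected are assumptions made up for this example; they are not part
of any URI standard.

```python
from urllib.parse import urldefrag, urljoin, urlsplit

# Hypothetical base URI, chosen only for illustration.
BASE = "http://example.com/a/b/c"

# "." and ".." segments are resolved relative to the base path.
same_level = urljoin(BASE, "./d")
one_up = urljoin(BASE, "../d")

# A first path segment containing a colon is taken for a scheme name ...
mistaken = urlsplit("this:that")
# ... unless the reference is written with a leading "./".
protected = urlsplit("./this:that")

# Separate the resource part from the fragment identifier.
resource, fragment = urldefrag("http://example.com/doc.html#intro")

print(same_level, one_up, mistaken.scheme, protected.path, fragment)
```

Here urljoin() plays the role of "the current context"; other
libraries may resolve edge cases slightly differently.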
.SS Usage
There are many different URI schemes, each with specific
additional rules and meanings, but they are intentionally made to be
as similar as possible.
For example, many URL schemes
permit the authority to have the following format, called here an
.I ip_server
(square brackets show what's optional):
.HP
.IR "ip_server = " [ user " [ : " password " ] @ ] " host " [ : " port ]
.PP
This format allows you to optionally insert a username,
a username plus password, and/or a port number.
The
.I host
is the name of the host computer, either its name as determined by DNS
or an IP address (numbers separated by periods).
Thus the URI
<http://fred:fredpassword@example.com:8080/>
logs into a web server on host example.com
as fred (using fredpassword) using port 8080.
Avoid including a password in a URI if possible because of the many
security risks of having a password written down.
If the URL supplies a username but no password, and the remote
server requests a password, the program interpreting the URL
should request one from the user.
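.PP
The ip_server components can be picked apart with a general-purpose
URI parser.
The sketch below assumes Python's urllib.parse and reuses the example
URI from the text; the variable names are illustrative only.

```python
from urllib.parse import urlsplit

# The example URI from the text: user, password, host and port all present.
parts = urlsplit("http://fred:fredpassword@example.com:8080/")
user = parts.username
password = parts.password
host = parts.hostname
port = parts.port

# Optional parts that are omitted are reported as None.
bare = urlsplit("http://example.com/")

print(user, password, host, port, bare.username, bare.port)
```

A program that finds a username but no password (password is None)
would be the one expected to prompt the user, as described above.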
.PP
Here are some of the most common schemes in use on UNIX-like systems
that are understood by many tools.
Note that many tools using URIs also have internal schemes or specialized
schemes; see those tools' documentation for information on those schemes.
.PP
.B "http \- Web (HTTP) server"
.PP
.RI http:// ip_server / path
.br
.RI http:// ip_server / path ? query
.PP
This is a URL accessing a web (HTTP) server.
The default port is 80.
If the path refers to a directory, the web server will choose what
to return; usually if there is a file named "index.html" or "index.htm"
its content is returned; otherwise, a list of the files in the current
directory (with appropriate links) is generated and returned.
An example is <http://lwn.net>.
.PP
A query can be given in the archaic "isindex" format, consisting of a
word or phrase and not including an equal sign (=).
A query can also be in the longer "GET" format, which has one or more
query entries of the form
.IR key = value
separated by the ampersand character (&).
Note that
.I key
can appear more than once, though it's up to the web server
and its application programs to determine if there's any meaning to that.
There is an unfortunate interaction with HTML/XML/SGML and
the GET query format; when such URIs with more than one key
are embedded in SGML/XML documents (including HTML), the ampersand
(&) has to be rewritten as &amp;.
Note that not all queries use this format; larger forms
may be too long to store as a URI, so they use a different
interaction mechanism (called POST) which does
not include the data in the URI.
See the Common Gateway Interface specification at
.UR http://www.w3.org\:/CGI
.UE
for more information.
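.PP
To make the "GET" query format concrete, here is a small sketch using
Python's urllib.parse and html modules.
The URI, its keys (tag, lang) and the variable names are invented for
the example; how a real server treats a repeated key is up to that server.

```python
from html import escape
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical URI with a repeated key ("tag") and a second key ("lang").
uri = "http://example.com/search?tag=linux&tag=man&lang=en"

# Each key maps to the list of all values given for it.
query = parse_qs(urlsplit(uri).query)

# Rebuild the query text from the parsed form.
rebuilt = urlencode(query, doseq=True)

# When the URI is embedded in HTML/XML, "&" has to be written as "&amp;".
href = escape(uri)

print(query)
print(rebuilt)
print(href)
```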
.PP
.B "ftp \- File Transfer Protocol (FTP)"
.PP
.RI ftp:// ip_server / path
.PP
This is a URL accessing a file through the file transfer protocol (FTP).
The default port (for control) is 21.
If no username is included, the username "anonymous" is supplied, and
in that case many clients provide as the password the requestor's
Internet email address.
An example is
<ftp://ftp.is.co.za/rfc/rfc1808.txt>.
.PP
.B "gopher \- Gopher server"
.PP
.RI gopher:// ip_server / "gophertype selector"
.br
.RI gopher:// ip_server / "gophertype selector" %09 search
.br
.RI gopher:// ip_server / "gophertype selector" %09 search %09 gopher+_string
.br
.PP
The default gopher port is 70.
.I gophertype
is a single-character field to denote the
Gopher type of the resource to
which the URL refers.
The entire path may also be empty, in
which case the delimiting "/" is also optional and the gophertype
defaults to "1".
.PP
.I selector
is the Gopher selector string.
In the Gopher protocol,
Gopher selector strings are a sequence of octets which may contain
any octets except 09 hexadecimal (US-ASCII HT or tab), 0A hexadecimal
(US-ASCII character LF), and 0D (US-ASCII character CR).
.PP
.B "mailto \- Email address"
.PP
.RI mailto: email-address
.PP
This is an email address, usually of the form
.IR name @ hostname .
See
.BR mailaddr (7)
for more information on the correct format of an email address.
Note that any % character must be rewritten as %25.
An example is <mailto:dwheeler@dwheeler.com>.
.PP
.B "news \- Newsgroup or News message"
.PP
.RI news: newsgroup-name
.br
.RI news: message-id
.PP
A
.I newsgroup-name
is a period-delimited hierarchical name, such as
"comp.infosystems.www.misc".
If <newsgroup-name> is "*" (as in <news:*>), it is used to refer
to "all available news groups".
An example is <news:comp.lang.ada>.
.PP
A
.I message-id
corresponds to the Message-ID of
.UR http://www.ietf.org\:/rfc\:/rfc1036.txt
IETF RFC\ 1036,
.UE
without the enclosing "<"
and ">"; it takes the form
.IR unique @ full_domain_name .
A message identifier may be distinguished from a news group name by the
presence of the "@" character.
.PP
.B "telnet \- Telnet login"
.PP
.RI telnet:// ip_server /
.PP
The Telnet URL scheme is used to designate interactive text services that
may be accessed by the Telnet protocol.
The final "/" character may be omitted.
The default port is 23.
An example is <telnet://melvyl.ucop.edu/>.
.PP
.B "file \- Normal file"
.PP
.RI file:// ip_server / path_segments
.br
.RI file: path_segments
.PP
This represents a file or directory accessible locally.
As a special case,
.I ip_server
can be the string "localhost" or the empty
string; this is interpreted as "the machine from which the URL is
being interpreted".
If the path is to a directory, the viewer should display the
directory's contents with links to each entry;
not all viewers currently do this.
KDE supports generated files through the URL <file:/cgi-bin>.
If the given file isn't found, browser writers may want to try to expand
the filename via filename globbing
(see
.BR glob (7)
and
.BR glob (3)).
.PP
The second format (e.g., <file:/etc/passwd>)
is a correct format for referring to
a local file.
However, older standards did not permit this format,
and some programs don't recognize this as a URI.
A more portable syntax is to use an empty string as the server name,
for example,
<file:///etc/passwd>; this form does the same thing
and is easily recognized by pattern matchers and older programs as a URI.
Note that if you really mean to say "start from the current location," don't
specify the scheme at all; use a relative address like <../test.txt>,
which has the side-effect of being scheme-independent.
An example of this scheme is <file:///etc/passwd>.
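.PP
The equivalence of the two file: forms can be checked with a parser.
The following sketch assumes Python's urllib.parse; the helper
path_to_file_uri() is a hypothetical name introduced here, and it only
handles absolute paths.

```python
from urllib.parse import quote, urlsplit

def path_to_file_uri(path):
    """Hypothetical helper: build the portable form with an empty server name.

    Assumes `path` is an absolute path starting with "/".
    """
    return "file://" + quote(path)

# Both spellings yield the same path and an empty server name.
short = urlsplit("file:/etc/passwd")
full = urlsplit("file:///etc/passwd")
local = urlsplit("file://localhost/etc/passwd")

print(path_to_file_uri("/etc/passwd"))
print(short.path, full.path, local.hostname)
```

Characters that are not allowed in a URI, such as a space in a
filename, come out percent-escaped from quote().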
.PP
.B "man \- Man page documentation"
.PP
.RI man: command-name
.br
.RI man: command-name ( section )
.PP
This refers to local online manual (man) reference pages.
The command name can optionally be followed by a
section number in parentheses; see
.BR man (7)
for more information on the meaning of the section numbers.
This URI scheme is unique to UNIX-like systems (such as Linux)
and is not currently registered by the IETF.
An example is <man:ls(1)>.
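.PP
Because the man: scheme is not registered, there is no reference
parser for it.
The sketch below is one possible reading of the two forms shown above,
written in Python; the pattern MAN_URI and the function parse_man_uri()
are hypothetical names, and the pattern is an assumption rather than a
specification.

```python
import re

# Assumed shape: "man:" + command name + optional "(section)".
MAN_URI = re.compile(r"^man:(?P<name>[^()]+)(?:\((?P<section>[^()]+)\))?$")

def parse_man_uri(uri):
    """Return (command_name, section); section is None when omitted."""
    match = MAN_URI.match(uri)
    if match is None:
        raise ValueError(f"not a man: URI: {uri!r}")
    return match.group("name"), match.group("section")

print(parse_man_uri("man:ls(1)"))
print(parse_man_uri("man:ls"))
```

A tool could then hand the name and section to a man page viewer
instead of trying to render the page itself.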
.PP
.B "info \- Info page documentation"
.PP
.RI info: virtual-filename
.br
.RI info: virtual-filename # nodename
.br
.RI info:( virtual-filename )
.br
.RI info:( virtual-filename ) nodename
.PP
This scheme refers to online info reference pages (generated from
texinfo files),
a documentation format used by programs such as the GNU tools.
This URI scheme is unique to UNIX-like systems (such as Linux)
and is not currently registered by the IETF.
As of this writing, GNOME and KDE differ in their URI syntax
and neither accepts the other's syntax.
The first two formats are the GNOME format; in nodenames all spaces
are written as underscores.
The last two formats are the KDE format;
spaces in nodenames must be written as spaces, even though this
is forbidden by the URI standards.
It's hoped that in the future most tools will understand all of these
formats and will always accept underscores for spaces in nodenames.
In both GNOME and KDE, if the form without the nodename is used the
nodename is assumed to be "Top".
Examples of the GNOME format are <info:gcc> and <info:gcc#G++_and_GCC>.
Examples of the KDE format are <info:(gcc)> and <info:(gcc)G++ and GCC>.
.PP
.B "whatis \- Documentation search"
.PP
.RI whatis: string
.PP
This scheme searches the database of short (one-line) descriptions of
commands and returns a list of descriptions containing that string.
Only complete word matches are returned.
See
.BR whatis (1).
This URI scheme is unique to UNIX-like systems (such as Linux)
and is not currently registered by the IETF.
.PP
.B "ghelp \- GNOME help documentation"
.PP
.RI ghelp: name-of-application
.PP
This loads GNOME help for the given application.
Note that not much documentation currently exists in this format.
.PP
.B "ldap \- Lightweight Directory Access Protocol"
.PP
.RI ldap:// hostport
.br
.RI ldap:// hostport /
.br
.RI ldap:// hostport / dn
.br
.RI ldap:// hostport / dn ? attributes
.br
.RI ldap:// hostport / dn ? attributes ? scope
.br
.RI ldap:// hostport / dn ? attributes ? scope ? filter
.br
.RI ldap:// hostport / dn ? attributes ? scope ? filter ? extensions
.PP
This scheme supports queries to the
Lightweight Directory Access Protocol (LDAP), a protocol for querying
a set of servers for hierarchically organized information
(such as people and computing resources).
See
.UR http://www.ietf.org\:/rfc\:/rfc2255.txt
RFC\ 2255
.UE
for more information on the LDAP URL scheme.
The components of this URL are:
.IP hostport 12
the LDAP server to query, written as a hostname optionally followed by
a colon and the port number.
The default LDAP port is TCP port 389.
If empty, the client determines which LDAP server to use.
.IP dn
the LDAP Distinguished Name, which identifies
the base object of the LDAP search (see
.UR http://www.ietf.org\:/rfc\:/rfc2253.txt
RFC\ 2253
.UE
section 3).
.IP attributes
a comma-separated list of attributes to be returned;
see RFC\ 2251 section 4.1.5.
If omitted, all attributes should be returned.
.IP scope
specifies the scope of the search, which can be one of
"base" (for a base object search), "one" (for a one-level search),
or "sub" (for a subtree search).
If scope is omitted, "base" is assumed.
.IP filter
specifies the search filter (subset of entries
to return).
If omitted, all entries should be returned.
See
.UR http://www.ietf.org\:/rfc\:/rfc2254.txt
RFC\ 2254
.UE
section 4.
.IP extensions
a comma-separated list of type=value
pairs, where the =value portion may be omitted for options not
requiring it.
An extension prefixed with a \(aq!\(aq is critical
(must be supported to be valid); otherwise it is noncritical (optional).
.PP
LDAP queries are easiest to explain by example.
Here's a query that asks ldap.itd.umich.edu for information about
the University of Michigan in the U.S.:
.PP
.nf
ldap://ldap.itd.umich.edu/o=University%20of%20Michigan,c=US
.fi
.PP
To just get its postal address attribute, request:
.PP
.nf
ldap://ldap.itd.umich.edu/o=University%20of%20Michigan,c=US?postalAddress
.fi
.PP
To ask host.com at port 6666 for information about the person
with common name (cn) "Babs Jensen" at University of Michigan, request:
.PP
.nf
ldap://host.com:6666/o=University%20of%20Michigan,c=US??sub?(cn=Babs%20Jensen)
.fi
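.PP
The component layout described above can be sketched as a small
parser.
This is a minimal illustration in Python using urllib.parse, not a
complete implementation of the LDAP URL RFC: the function name
parse_ldap_url() and the dictionary keys are assumptions, and the
defaults applied (port 389, scope "base") are the ones stated in this page.

```python
from urllib.parse import unquote, urlsplit

def parse_ldap_url(url):
    """Hypothetical helper: split an ldap: URL into the components listed above."""
    parts = urlsplit(url)
    if parts.scheme != "ldap":
        raise ValueError("not an ldap: URL")
    # After the dn, the fields are separated by "?":
    # attributes ? scope ? filter ? extensions
    fields = parts.query.split("?") if parts.query else []
    fields += [""] * (4 - len(fields))
    attributes, scope, search_filter, extensions = fields[:4]
    return {
        "host": parts.hostname,
        "port": parts.port or 389,          # default LDAP port
        "dn": unquote(parts.path.lstrip("/")),
        "attributes": attributes.split(",") if attributes else [],
        "scope": scope or "base",           # "base" is assumed when omitted
        "filter": unquote(search_filter),
        "extensions": extensions.split(",") if extensions else [],
    }

babs = parse_ldap_url(
    "ldap://host.com:6666/o=University%20of%20Michigan,c=US??sub?(cn=Babs%20Jensen)"
)
postal = parse_ldap_url(
    "ldap://ldap.itd.umich.edu/o=University%20of%20Michigan,c=US?postalAddress"
)
print(babs)
print(postal)
```

Note how the empty attributes field in the third example ("??sub")
falls out as an empty list, meaning that all attributes are requested.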
.PP
.B "wais \- Wide Area Information Servers"
.PP
.RI wais:// hostport / database
.br
.RI wais:// hostport / database ? search
.br
.RI wais:// hostport / database / wtype / wpath
.PP
This scheme designates a WAIS database, search, or document
(see
.UR http://www.ietf.org\:/rfc\:/rfc1625.txt
IETF RFC\ 1625
.UE
for more information on WAIS).
Hostport is the hostname, optionally followed by a colon and port number
(the default port number is 210).
.PP
The first form designates a WAIS database for searching.
The second form designates a particular search of the WAIS database
.IR database .
The third form designates a particular document within a WAIS
database to be retrieved.
.I wtype
is the WAIS designation of the type of the object and
.I wpath
is the WAIS document-id.
.PP
.B "other schemes"
.PP
There are many other URI schemes.
Most tools that accept URIs support a set of internal URIs
(e.g., Mozilla has the about: scheme for internal information,
and the GNOME help browser has the toc: scheme for various starting
locations).
There are many schemes that have been defined but are not as widely
used at the current time
(e.g., prospero).
The nntp: scheme is deprecated in favor of the news: scheme.
URNs are to be supported by the urn: scheme, with a hierarchical name space
(e.g., urn:ietf:... would identify IETF documents); at this time
URNs are not widely implemented.
Not all tools support all schemes.
.SS Character encoding
.PP
URIs use a limited number of characters so that they can be
typed in and used in a variety of situations.
.PP
The following characters are reserved, that is, they may appear in a
URI but their use is limited to their reserved purpose
(conflicting data must be escaped before forming the URI):
.IP
 ; / ? : @ & = + $ ,
.PP
Unreserved characters may be included in a URI.
Unreserved characters
include uppercase and lowercase English letters,
decimal digits, and the following
limited set of punctuation marks and symbols:
.IP
 \- _ . ! ~ * ' ( )
.PP
All other characters must be escaped.
An escaped octet is encoded as a character triplet, consisting of the
percent character "%" followed by the two hexadecimal digits
representing the octet code (you can use uppercase or lowercase letters
for the hexadecimal digits).
For example, a blank space must be escaped
as "%20", a tab character as "%09", and the "&" as "%26".
Because the percent "%" character always has the reserved purpose of
being the escape indicator, it must be escaped as "%25".
It is common practice to escape space characters as the plus symbol (+)
in query text; this practice isn't uniformly defined
in the relevant RFCs (which recommend %20 instead) but any tool accepting
URIs with query text should be prepared for it.
A URI is always shown in its "escaped" form.
.PP
Unreserved characters can be escaped without changing the semantics
of the URI, but this should not be done unless the URI is being used
in a context that does not allow the unescaped character to appear.
For example, "%7e" is sometimes used instead of "~" in an HTTP URL
path, but the two are equivalent for an HTTP URL.
.PP
For URIs which must handle characters outside the US ASCII character set,
the HTML 4.01 specification (section B.2) and
IETF RFC\ 2718 (section 2.2.5) recommend the following approach:
.IP 1. 4
translate the character sequences into UTF-8 (IETF RFC\ 2279)\(emsee
.BR utf-8 (7)\(emand
then
.IP 2.
use the URI escaping mechanism, that is,
use the %HH encoding for unsafe octets.
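.PP
The escaping rules above, including the two-step approach for
non-ASCII characters, can be tried out with Python's urllib.parse.
This is only a sketch: the variable names are illustrative, and
Python's quote() follows a later revision of the URI standard, so its
exact set of unescaped punctuation may differ slightly from the list in
this page.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# The examples from the text.
space = quote(" ")
tab = quote("\t")
ampersand = quote("&")
percent = quote("%")

# The "+" convention for spaces in query text, and its inverse.
plus_form = quote_plus("man pages")
from_plus = unquote_plus("man+pages")

# An escaped unreserved character decodes to the same character.
tilde = unquote("%7e")

# Non-ASCII: encode as UTF-8 first, then %HH-escape each octet.
e_acute = quote("\u00e9")

print(space, tab, ampersand, percent, plus_form, from_plus, tilde, e_acute)
```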
.SS Writing a URI
When written, URIs should be placed inside double quotes
(e.g., "http://www.kernelnotes.org"),
enclosed in angle brackets (e.g., <http://lwn.net>),
or placed on a line by themselves.
A warning for those who use double-quotes:
.B never
move extraneous punctuation (such as the period ending a sentence or the
comma in a list)
inside a URI, since this will change the value of the URI.
Instead, use angle brackets, or
switch to a quoting system that never includes extraneous characters
inside quotation marks.
This latter system, called the 'new' or 'logical' quoting system by
"Hart's Rules" and the "Oxford Dictionary for Writers and Editors",
is preferred practice in Great Britain and among hackers worldwide
(see the
Jargon File's section on Hacker Writing Style,
.UR http://www.fwi.uva.nl\:/~mes\:/jargon\:/h\:/HackerWritingStyle.html
.UE ,
for more information).
Older documents suggested inserting the prefix "URL:"
just before the URI, but this form has never caught on.
.PP
The URI syntax was designed to be unambiguous.
However, as URIs have become commonplace, traditional media
(television, radio, newspapers, billboards, etc.) have increasingly
used abbreviated URI references consisting of
only the authority and path portions of the identified resource
(e.g., <www.w3.org/Addressing>).
Such references are primarily
intended for human interpretation rather than machine, with the
assumption that context-based heuristics are sufficient to complete
the URI (e.g., hostnames beginning with "www" are likely to have
a URI prefix of "http://" and hostnames beginning with "ftp" likely
to have a prefix of "ftp://").
Many client implementations heuristically resolve these references.
Such heuristics may
change over time, particularly when new schemes are introduced.
Since an abbreviated URI has the same syntax as a relative URL path,
abbreviated URI references cannot be used where relative URIs are
permitted, and can be used only when there is no defined base
(such as in dialog boxes).
Don't use abbreviated URIs as hypertext links inside a document;
use the standard format as described here.
.SH CONFORMING TO
.PP
.UR http://www.ietf.org\:/rfc\:/rfc2396.txt
(IETF RFC\ 2396)
.UE ,
.UR http://www.w3.org\:/TR\:/REC-html40
(HTML 4.0)
.UE .
.SH NOTES
Any tool accepting URIs (e.g., a web browser) on a Linux system should
be able to handle (directly or indirectly) all of the
schemes described here, including the man: and info: schemes.
Handling them by invoking some other program is
fine and in fact encouraged.
.PP
Technically the fragment isn't part of the URI.
.PP
For information on how to embed URIs (including URLs) in a data format,
see documentation on that format.
HTML uses the format <A HREF="\fIuri\fP">
.I text
</A>.
Texinfo files use the format @uref{\fIuri\fP}.
Man and mdoc have the recently added UR macro; alternatively, just include the
URI in the text (viewers should be able to detect :// as part of a URI).
.PP
The GNOME and KDE desktop environments currently vary in the URIs
they accept, in particular in their respective help browsers.
To list man pages, GNOME uses <toc:man> while KDE uses <man:(index)>, and
to list info pages, GNOME uses <toc:info> while KDE uses <info:(dir)>
(the author of this man page prefers the KDE approach here, though a more
regular format would be even better).
In general, KDE uses <file:/cgi-bin/> as a prefix to a set of generated
files.
KDE prefers documentation in HTML, accessed via
<file:/cgi-bin/helpindex>.
GNOME prefers the ghelp scheme to store and find documentation.
Neither browser handles file: references to directories at the time
of this writing, making it difficult to refer to an entire directory with
a browsable URI.
As noted above, these environments differ in how they handle the
info: scheme, probably the most important variation.
It is expected that GNOME and KDE
will converge to common URI formats, and a future
version of this man page will describe the converged result.
Efforts to aid this convergence are encouraged.
.SS Security
.PP
A URI does not in itself pose a security threat.
There is no general guarantee that a URL, which at one time
located a given resource, will continue to do so.
Nor is there any
guarantee that a URL will not locate a different resource at some
later point in time; such a guarantee can be
obtained only from the person(s) controlling that namespace and the
resource in question.
.PP
It is sometimes possible to construct a URL such that an attempt to
perform a seemingly harmless operation, such as the
retrieval of an entity associated with the resource, will in fact
cause a possibly damaging remote operation to occur.
The unsafe URL
is typically constructed by specifying a port number other than that
reserved for the network protocol in question.
The client unwittingly contacts a site that is in fact
running a different protocol.
The content of the URL contains instructions that, when
interpreted according to this other protocol, cause an unexpected
operation.
An example has been the use of a gopher URL to cause an
unintended or impersonating message to be sent via an SMTP server.
.PP
Caution should be used when using any URL that specifies a port
number other than the default for the protocol, especially when it is
a number within the reserved space.
.PP
Care should be taken when a URI contains escaped delimiters for a
given protocol (for example, CR and LF characters for telnet
protocols) that these are not unescaped before transmission.
This might violate the protocol, but avoids the potential for such
characters to be used to simulate an extra operation or parameter in
that protocol, which might lead to an unexpected and possibly harmful
remote operation being performed.
.PP
It is clearly unwise to use a URI that contains a password which is
intended to be secret.
In particular, the use of a password within
the "userinfo" component of a URI is strongly recommended against except
in those rare cases where the "password" parameter is intended to be public.
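.PP
Two of the cautions in the Security discussion (non-default ports and
embedded passwords) lend themselves to a simple pre-flight check.
The sketch below is one possible approach in Python, not an
established API: the table DEFAULT_PORTS only lists the defaults given
in this page, and the function names has_unexpected_port() and
carries_password() are hypothetical.

```python
from urllib.parse import urlsplit

# Default ports as stated in this page for each scheme.
DEFAULT_PORTS = {
    "http": 80, "ftp": 21, "gopher": 70,
    "telnet": 23, "ldap": 389, "wais": 210,
}

def has_unexpected_port(uri):
    """True if the URI names a port other than the scheme's default."""
    parts = urlsplit(uri)
    default = DEFAULT_PORTS.get(parts.scheme)
    return parts.port is not None and default is not None and parts.port != default

def carries_password(uri):
    """True if the userinfo component includes a password."""
    return urlsplit(uri).password is not None

# A gopher URL aimed at port 25 is the kind of case described above.
print(has_unexpected_port("gopher://example.com:25/"))
print(carries_password("http://fred:fredpassword@example.com:8080/"))
```

A positive result is only a hint that the URI deserves a closer look;
many legitimate services run on non-default ports.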
.SH BUGS
.PP
Documentation may be placed in a variety of locations, so there
currently isn't a good URI scheme for general online documentation
in arbitrary formats.
References of the form
<file:///usr/doc/ZZZ> don't work because different distributions and
local installation requirements may place the files in different
directories
(they may be in /usr/doc, or /usr/local/doc, or /usr/share,
or somewhere else).
Also, the directory ZZZ usually changes when a version changes
(though filename globbing could partially overcome this).
Finally, using the file: scheme doesn't easily support people
who dynamically load documentation from the Internet (instead of
loading the files onto a local filesystem).
A future URI scheme may be added (e.g., "userdoc:") to permit
programs to include cross-references to more detailed documentation
without having to know the exact location of that documentation.
Alternatively, a future version of the filesystem specification may
specify file locations sufficiently so that the file: scheme will
be able to locate documentation.
.PP
Many programs and file formats don't include a way to incorporate
or implement links using URIs.
.PP
Many programs can't handle all of these different URI formats; there
should be a standard mechanism to load an arbitrary URI that automatically
detects the user's environment (e.g., text or graphics,
desktop environment, local user preferences, and currently executing
tools) and invokes the right tool for any URI.
.\" .SH AUTHOR
.\" David A. Wheeler (dwheeler@dwheeler.com) wrote this man page.
.SH SEE ALSO
.BR lynx (1),
.BR man2html (1),
.BR mailaddr (7),
.BR utf-8 (7)

.UR http://www.ietf.org\:/rfc\:/rfc2255.txt
IETF RFC\ 2255
.UE