From: Daniel Stenberg
Date: Tue, 7 Mar 2023 10:01:15 +0000 (+0100)
Subject: docs: extend the URL API descriptions
X-Git-Tag: curl-8_0_0~86
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=2a31086f390e6cbccf2b037efee26c457ee2f2e2;p=thirdparty%2Fcurl.git

docs: extend the URL API descriptions

Closes #10701
---

diff --git a/docs/libcurl/curl_url_get.3 b/docs/libcurl/curl_url_get.3
index 01afefc0fa..ca10821498 100644
--- a/docs/libcurl/curl_url_get.3
+++ b/docs/libcurl/curl_url_get.3
@@ -41,6 +41,10 @@ The \fIwhat\fP argument should be the particular part to extract (see list
 below) and \fIpart\fP points to a 'char *' to get updated to point to a newly
 allocated string with the contents.
 
+The URL API has no particular maximum length for URL fields. In the real
+world, excessively long fields in URLs will cause problems even if this API
+accepts them. This function can return very large ones.
+
 The \fIflags\fP argument is a bitmask with individual features.
 
 The returned part pointer must be freed with \fIcurl_free(3)\fP after use.
@@ -91,26 +95,38 @@ anything outside the ASCII range.
 .IP CURLUPART_URL
 When asked to return the full URL, \fIcurl_url_get(3)\fP will return a
 normalized and possibly cleaned up version of what was previously parsed.
+
+We advise using the \fICURLU_PUNYCODE\fP option to get the URL as "normalized"
+as possible since IDN allows host names to be written in many different ways
+that still end up the same punycode version.
 .IP CURLUPART_SCHEME
 Scheme cannot be URL decoded on get.
 .IP CURLUPART_USER
 .IP CURLUPART_PASSWORD
 .IP CURLUPART_OPTIONS
+The options field is an optional field that might follow the password in the
+userinfo part. It is only recognized/used when parsing URLs for the following
+schemes: pop3, smtp and imap. The URL API still allows users to set and get
+this field independently of scheme when not parsing full URLs.
 .IP CURLUPART_HOST
 The host name.
 If it is an IPv6 numeric address, the zone id will not be part of it but is
 provided separately in \fICURLUPART_ZONEID\fP. IPv6 numerical addresses are
 returned within brackets ([]).
+
+IPv6 names are normalized when set, which should make them as short as
+possible while maintaining correct syntax.
 .IP CURLUPART_ZONEID
 If the host name is a numeric IPv6 address, this field might also be set.
 .IP CURLUPART_PORT
-Port cannot be URL decoded on get.
+A port cannot be URL decoded on get. This number is returned in a string just
+like all other parts. That string is guaranteed to hold a valid port number in
+ASCII using base 10.
 .IP CURLUPART_PATH
-\fIpart\fP will be '/' even if no path is supplied in the URL.
+The \fIpart\fP will be '/' even if no path is supplied in the URL. A URL path
+always starts with a slash.
 .IP CURLUPART_QUERY
-The initial question mark that denotes the beginning of the query part is
-a delimiter only.
-It is not part of the query contents.
-
+The initial question mark that denotes the beginning of the query part is a
+delimiter only. It is not part of the query contents.
 A not-present query will lead \fIpart\fP to be set to NULL.
 A zero-length query will lead \fIpart\fP to be set to a zero-length string.
 
@@ -118,6 +134,8 @@ A zero-length query will lead \fIpart\fP to be set to a zero-length string.
 The query part will also get pluses converted to space when asked to URL
 decode on get with the CURLU_URLDECODE bit.
 .IP CURLUPART_FRAGMENT
+The initial hash sign that denotes the beginning of the fragment is a
+delimiter only. It is not part of the fragment contents.
 .SH EXAMPLE
 .nf
 CURLUcode rc;
diff --git a/docs/libcurl/curl_url_set.3 b/docs/libcurl/curl_url_set.3
index 13ddc3abce..36d4a44e93 100644
--- a/docs/libcurl/curl_url_set.3
+++ b/docs/libcurl/curl_url_set.3
@@ -54,6 +54,10 @@ does not know about, the \fBCURLU_NON_SUPPORT_SCHEME\fP flags bit must be set.
 Otherwise, this function returns \fICURLUE_UNSUPPORTED_SCHEME\fP on URL
 schemes it does not recognize.
 
+This function call has no particular maximum length for any provided input
+string. In the real world, excessively long fields in URLs will cause problems
+even if this API accepts them.
+
 The \fIflags\fP argument is a bitmask with independent features.
 .SH PARTS
 .IP CURLUPART_URL
@@ -66,16 +70,25 @@ will be replaced with the information of the newly set URL.
 Pass a pointer to a null-terminated string to the \fIurl\fP parameter. The
 string must point to a correctly formatted "RFC 3986+" URL or be a NULL
 pointer.
+
+Unless \fICURLU_NO_AUTHORITY\fP is set, a blank host name is not allowed in
+the URL.
 .IP CURLUPART_SCHEME
 Scheme cannot be URL decoded on set. libcurl only accepts setting schemes up
 to 40 bytes long.
 .IP CURLUPART_USER
 .IP CURLUPART_PASSWORD
 .IP CURLUPART_OPTIONS
+The options field is an optional field that might follow the password in the
+userinfo part. It is only recognized/used when parsing URLs for the following
+schemes: pop3, smtp and imap. This function however allows users to
+independently set this field at will.
 .IP CURLUPART_HOST
 The host name. If it is IDNA the string must then be encoded as your locale
 says or UTF-8 (when WinIDN is used). If it is a bracketed IPv6 numeric address
 it may contain a zone id (or you can use CURLUPART_ZONEID).
+
+Unless \fICURLU_NO_AUTHORITY\fP is set, a blank host name cannot be set.
 .IP CURLUPART_ZONEID
 If the host name is a numeric IPv6 address, this field can also be set.
 .IP CURLUPART_PORT