From: Ezio Melotti
Date: Thu, 14 Apr 2011 04:50:25 +0000 (+0300)
Subject: #11840: Merge with 3.1.
X-Git-Tag: v3.2.1b1~129
X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=c1f0577b54869325a15cd787ddfb79479653de15;p=thirdparty%2FPython%2Fcpython.git

#11840: Merge with 3.1.
---

c1f0577b54869325a15cd787ddfb79479653de15
diff --cc Doc/c-api/unicode.rst
index 6783f650907c,a91f258e5870..f48eb73bfd1b
--- a/Doc/c-api/unicode.rst
+++ b/Doc/c-api/unicode.rst
@@@ -314,45 -305,20 +314,45 @@@ APIs
     An unrecognized format character causes all the rest of the format string to be
     copied as-is to the result string, and any extra arguments discarded.
  
 +   .. note::
 +
 +      The `"%lld"` and `"%llu"` format specifiers are only available
 +      when :const:`HAVE_LONG_LONG` is defined.
 +
 +   .. versionchanged:: 3.2
 +      Support for ``"%lld"`` and ``"%llu"`` added.
  
 -.. cfunction:: PyObject* PyUnicode_FromFormatV(const char *format, va_list vargs)
  
 -   Identical to :func:`PyUnicode_FromFormat` except that it takes exactly two
 +.. c:function:: PyObject* PyUnicode_FromFormatV(const char *format, va_list vargs)
 +
 +   Identical to :c:func:`PyUnicode_FromFormat` except that it takes exactly two
     arguments.
  
 +.. c:function:: PyObject* PyUnicode_TransformDecimalToASCII(Py_UNICODE *s, Py_ssize_t size)
 +
 +   Create a Unicode object by replacing all decimal digits in
-    :c:type:`Py_UNICODE` buffer of the given size by ASCII digits 0--9
++   :c:type:`Py_UNICODE` buffer of the given *size* by ASCII digits 0--9
 +   according to their decimal value. Return *NULL* if an exception
 +   occurs.
  
 -.. cfunction:: Py_UNICODE* PyUnicode_AsUnicode(PyObject *unicode)
  
 -   Return a read-only pointer to the Unicode object's internal :ctype:`Py_UNICODE`
 +.. c:function:: Py_UNICODE* PyUnicode_AsUnicode(PyObject *unicode)
 +
 +   Return a read-only pointer to the Unicode object's internal :c:type:`Py_UNICODE`
     buffer, *NULL* if *unicode* is not a Unicode object.
  
  
 -.. cfunction:: Py_ssize_t PyUnicode_GetSize(PyObject *unicode)
 +.. c:function:: Py_UNICODE* PyUnicode_AsUnicodeCopy(PyObject *unicode)
 +
-    Create a copy of a unicode string ending with a nul character. Return *NULL*
++   Create a copy of a Unicode string ending with a nul character. Return *NULL*
 +   and raise a :exc:`MemoryError` exception on memory allocation failure,
 +   otherwise return a new allocated buffer (use :c:func:`PyMem_Free` to free the
 +   buffer).
 +
 +   .. versionadded:: 3.2
 +
 +
 +.. c:function:: Py_ssize_t PyUnicode_GetSize(PyObject *unicode)
  
     Return the length of the Unicode object.
  
@@@ -458,12 -390,12 +458,12 @@@ used, passing :c:func:`PyUnicode_FSDeco
  wchar_t Support
  """""""""""""""
  
- wchar_t support for platforms which support it:
 -:ctype:`wchar_t` support for platforms which support it:
++:c:type:`wchar_t` support for platforms which support it:
  
 -.. cfunction:: PyObject* PyUnicode_FromWideChar(const wchar_t *w, Py_ssize_t size)
 +.. c:function:: PyObject* PyUnicode_FromWideChar(const wchar_t *w, Py_ssize_t size)
  
-    Create a Unicode object from the :c:type:`wchar_t` buffer *w* of the given size.
-    Passing -1 as the size indicates that the function must itself compute the length,
 -   Create a Unicode object from the :ctype:`wchar_t` buffer *w* of the given *size*.
++   Create a Unicode object from the :c:type:`wchar_t` buffer *w* of the given *size*.
+    Passing -1 as the *size* indicates that the function must itself compute the length,
     using wcslen. Return *NULL* on failure.
  
  
@@@ -507,9 -425,9 +507,9 @@@ constructor
  
  Setting encoding to *NULL* causes the default encoding to be used
  which is ASCII. The file system calls should use
 -:cfunc:`PyUnicode_FSConverter` for encoding file names. This uses the
 -variable :cdata:`Py_FileSystemDefaultEncoding` internally. This
 +:c:func:`PyUnicode_FSConverter` for encoding file names. This uses the
 +variable :c:data:`Py_FileSystemDefaultEncoding` internally. This
- variable should be treated as read-only: On some systems, it will be a
+ variable should be treated as read-only: on some systems, it will be a
  pointer to a static string, on others, it will change at run-time
  (such as when the application invokes setlocale).
  
@@@ -536,9 -454,9 +536,9 @@@ These are the generic codec APIs
     the codec.
  
  
 -.. cfunction:: PyObject* PyUnicode_Encode(const Py_UNICODE *s, Py_ssize_t size, const char *encoding, const char *errors)
 +.. c:function:: PyObject* PyUnicode_Encode(const Py_UNICODE *s, Py_ssize_t size, const char *encoding, const char *errors)
  
-    Encode the :c:type:`Py_UNICODE` buffer of the given size and return a Python
 -   Encode the :ctype:`Py_UNICODE` buffer *s* of the given *size* and return a Python
++   Encode the :c:type:`Py_UNICODE` buffer *s* of the given *size* and return a Python
     bytes object. *encoding* and *errors* have the same meaning as the
     parameters of the same name in the Unicode :meth:`encode` method. The codec
     to be used is looked up using the Python codec registry. Return *NULL* if an
@@@ -574,9 -492,9 +574,9 @@@ These are the UTF-8 codec APIs
     that have been decoded will be stored in *consumed*.
  
  
 -.. cfunction:: PyObject* PyUnicode_EncodeUTF8(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
 +.. c:function:: PyObject* PyUnicode_EncodeUTF8(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
  
-    Encode the :c:type:`Py_UNICODE` buffer of the given size using UTF-8 and
 -   Encode the :ctype:`Py_UNICODE` buffer *s* of the given *size* using UTF-8 and
++   Encode the :c:type:`Py_UNICODE` buffer *s* of the given *size* using UTF-8 and
     return a Python bytes object. Return *NULL* if an exception was raised by
     the codec.
  
@@@ -594,9 -512,9 +594,9 @@@ UTF-32 Codec
  These are the UTF-32 codec APIs:
  
  
 -.. cfunction:: PyObject* PyUnicode_DecodeUTF32(const char *s, Py_ssize_t size, const char *errors, int *byteorder)
 +.. c:function:: PyObject* PyUnicode_DecodeUTF32(const char *s, Py_ssize_t size, const char *errors, int *byteorder)
  
-    Decode *length* bytes from a UTF-32 encoded buffer string and return the
+    Decode *size* bytes from a UTF-32 encoded buffer string and return the
     corresponding Unicode object. *errors* (if non-*NULL*) defines the error
     handling. It defaults to "strict".
  
@@@ -662,9 -580,9 +662,9 @@@ UTF-16 Codec
  These are the UTF-16 codec APIs:
  
  
 -.. cfunction:: PyObject* PyUnicode_DecodeUTF16(const char *s, Py_ssize_t size, const char *errors, int *byteorder)
 +.. c:function:: PyObject* PyUnicode_DecodeUTF16(const char *s, Py_ssize_t size, const char *errors, int *byteorder)
  
-    Decode *length* bytes from a UTF-16 encoded buffer string and return the
+    Decode *size* bytes from a UTF-16 encoded buffer string and return the
     corresponding Unicode object. *errors* (if non-*NULL*) defines the error
     handling. It defaults to "strict".
  
@@@ -768,9 -686,9 +768,9 @@@ These are the "Unicode Escape" codec AP
     string *s*. Return *NULL* if an exception was raised by the codec.
  
  
 -.. cfunction:: PyObject* PyUnicode_EncodeUnicodeEscape(const Py_UNICODE *s, Py_ssize_t size)
 +.. c:function:: PyObject* PyUnicode_EncodeUnicodeEscape(const Py_UNICODE *s, Py_ssize_t size)
  
-    Encode the :c:type:`Py_UNICODE` buffer of the given size using Unicode-Escape and
 -   Encode the :ctype:`Py_UNICODE` buffer of the given size using Unicode-Escape and
++   Encode the :c:type:`Py_UNICODE` buffer of the given *size* using Unicode-Escape and
     return a Python string object. Return *NULL* if an exception was raised by the
     codec.
  
@@@ -794,9 -712,9 +794,9 @@@ These are the "Raw Unicode Escape" code
     encoded string *s*. Return *NULL* if an exception was raised by the codec.
  
  
 -.. cfunction:: PyObject* PyUnicode_EncodeRawUnicodeEscape(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
 +.. c:function:: PyObject* PyUnicode_EncodeRawUnicodeEscape(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
  
-    Encode the :c:type:`Py_UNICODE` buffer of the given size using Raw-Unicode-Escape
 -   Encode the :ctype:`Py_UNICODE` buffer of the given *size* using Raw-Unicode-Escape
++   Encode the :c:type:`Py_UNICODE` buffer of the given *size* using Raw-Unicode-Escape
     and return a Python string object. Return *NULL* if an exception was raised by
     the codec.
  
@@@ -821,9 -739,9 +821,9 @@@ ordinals and only these are accepted b
     *s*. Return *NULL* if an exception was raised by the codec.
  
  
 -.. cfunction:: PyObject* PyUnicode_EncodeLatin1(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
 +.. c:function:: PyObject* PyUnicode_EncodeLatin1(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
  
-    Encode the :c:type:`Py_UNICODE` buffer of the given size using Latin-1 and
 -   Encode the :ctype:`Py_UNICODE` buffer of the given *size* using Latin-1 and
++   Encode the :c:type:`Py_UNICODE` buffer of the given *size* using Latin-1 and
     return a Python bytes object. Return *NULL* if an exception was raised by
     the codec.
  
@@@ -848,9 -766,9 +848,9 @@@ codes generate errors
     *s*. Return *NULL* if an exception was raised by the codec.
  
  
 -.. cfunction:: PyObject* PyUnicode_EncodeASCII(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
 +.. c:function:: PyObject* PyUnicode_EncodeASCII(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
  
-    Encode the :c:type:`Py_UNICODE` buffer of the given size using ASCII and
 -   Encode the :ctype:`Py_UNICODE` buffer of the given *size* using ASCII and
++   Encode the :c:type:`Py_UNICODE` buffer of the given *size* using ASCII and
     return a Python bytes object. Return *NULL* if an exception was raised by
     the codec.
  
@@@ -888,8 -804,9 +886,9 @@@ meaning that its ordinal value will be
  resp. Because of this, mappings only need to contain those mappings which map
  characters to different code points.
  
+ These are the mapping codec APIs:
  
 -.. cfunction:: PyObject* PyUnicode_DecodeCharmap(const char *s, Py_ssize_t size, PyObject *mapping, const char *errors)
 +.. c:function:: PyObject* PyUnicode_DecodeCharmap(const char *s, Py_ssize_t size, PyObject *mapping, const char *errors)
  
     Create a Unicode object by decoding *size* bytes of the encoded string *s* using
     the given *mapping* object. Return *NULL* if an exception was raised by the
@@@ -899,9 -816,9 +898,9 @@@
     treated as "undefined mapping".
  
  
 -.. cfunction:: PyObject* PyUnicode_EncodeCharmap(const Py_UNICODE *s, Py_ssize_t size, PyObject *mapping, const char *errors)
 +.. c:function:: PyObject* PyUnicode_EncodeCharmap(const Py_UNICODE *s, Py_ssize_t size, PyObject *mapping, const char *errors)
  
-    Encode the :c:type:`Py_UNICODE` buffer of the given size using the given
 -   Encode the :ctype:`Py_UNICODE` buffer of the given *size* using the given
++   Encode the :c:type:`Py_UNICODE` buffer of the given *size* using the given
     *mapping* object and return a Python string object. Return *NULL* if an
     exception was raised by the codec.
  
@@@ -915,9 -832,9 +914,9 @@@
  The following codec API is special in that maps Unicode to Unicode.
  
  
 -.. cfunction:: PyObject* PyUnicode_TranslateCharmap(const Py_UNICODE *s, Py_ssize_t size, PyObject *table, const char *errors)
 +.. c:function:: PyObject* PyUnicode_TranslateCharmap(const Py_UNICODE *s, Py_ssize_t size, PyObject *table, const char *errors)
  
-    Translate a :c:type:`Py_UNICODE` buffer of the given length by applying a
 -   Translate a :ctype:`Py_UNICODE` buffer of the given *size* by applying a
++   Translate a :c:type:`Py_UNICODE` buffer of the given *size* by applying a
     character mapping *table* to it and return the resulting Unicode object. Return
     *NULL* when an exception was raised by the codec.
  
@@@ -929,17 -846,16 +928,16 @@@
     :exc:`LookupError`) are left untouched and are copied as-is.
  
  
- These are the MBCS codec APIs. They are currently only available on Windows and
- use the Win32 MBCS converters to implement the conversions. Note that MBCS (or
- DBCS) is a class of encodings, not just one. The target encoding is defined by
- the user settings on the machine running the codec.
- 
 +
  MBCS codecs for Windows
  """""""""""""""""""""""
  
+ These are the MBCS codec APIs. They are currently only available on Windows and
+ use the Win32 MBCS converters to implement the conversions. Note that MBCS (or
+ DBCS) is a class of encodings, not just one. The target encoding is defined by
+ the user settings on the machine running the codec.
  
 -
 -.. cfunction:: PyObject* PyUnicode_DecodeMBCS(const char *s, Py_ssize_t size, const char *errors)
 +.. c:function:: PyObject* PyUnicode_DecodeMBCS(const char *s, Py_ssize_t size, const char *errors)
  
     Create a Unicode object by decoding *size* bytes of the MBCS encoded string *s*.
     Return *NULL* if an exception was raised by the codec.
@@@ -953,9 -869,9 +951,9 @@@
     in *consumed*.
  
  
 -.. cfunction:: PyObject* PyUnicode_EncodeMBCS(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
 +.. c:function:: PyObject* PyUnicode_EncodeMBCS(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
  
-    Encode the :c:type:`Py_UNICODE` buffer of the given size using MBCS and return
 -   Encode the :ctype:`Py_UNICODE` buffer of the given *size* using MBCS and return
++   Encode the :c:type:`Py_UNICODE` buffer of the given *size* using MBCS and return
     a Python bytes object. Return *NULL* if an exception was raised by the codec.
  
  
@@@ -988,9 -904,9 +986,9 @@@ They all return *NULL* or ``-1`` if an
     Concat two strings giving a new Unicode string.
  
  
 -.. cfunction:: PyObject* PyUnicode_Split(PyObject *s, PyObject *sep, Py_ssize_t maxsplit)
 +.. c:function:: PyObject* PyUnicode_Split(PyObject *s, PyObject *sep, Py_ssize_t maxsplit)
  
-    Split a string giving a list of Unicode strings. If sep is *NULL*, splitting
+    Split a string giving a list of Unicode strings. If *sep* is *NULL*, splitting
     will be done at all whitespace substrings. Otherwise, splits occur at the given
     separator. At most *maxsplit* splits will be done. If negative, no limit is
     set. Separators are not included in the resulting list.
@@@ -1019,22 -935,22 +1017,22 @@@
     use the default error handling.
  
  
 -.. cfunction:: PyObject* PyUnicode_Join(PyObject *separator, PyObject *seq)
 +.. c:function:: PyObject* PyUnicode_Join(PyObject *separator, PyObject *seq)
  
-    Join a sequence of strings using the given separator and return the resulting
+    Join a sequence of strings using the given *separator* and return the resulting
     Unicode string.
  
  
 -.. cfunction:: int PyUnicode_Tailmatch(PyObject *str, PyObject *substr, Py_ssize_t start, Py_ssize_t end, int direction)
 +.. c:function:: int PyUnicode_Tailmatch(PyObject *str, PyObject *substr, Py_ssize_t start, Py_ssize_t end, int direction)
  
-    Return 1 if *substr* matches *str*[*start*:*end*] at the given tail end
+    Return 1 if *substr* matches ``str[start:end]`` at the given tail end
     (*direction* == -1 means to do a prefix match, *direction* == 1 a suffix match),
     0 otherwise. Return ``-1`` if an error occurred.
  
  
 -.. cfunction:: Py_ssize_t PyUnicode_Find(PyObject *str, PyObject *substr, Py_ssize_t start, Py_ssize_t end, int direction)
 +.. c:function:: Py_ssize_t PyUnicode_Find(PyObject *str, PyObject *substr, Py_ssize_t start, Py_ssize_t end, int direction)
  
-    Return the first position of *substr* in *str*[*start*:*end*] using the given
+    Return the first position of *substr* in ``str[start:end]`` using the given
     *direction* (*direction* == 1 means to do a forward search, *direction* == -1 a
     backward search). The return value is the index of the first match; a value of
     ``-1`` indicates that no match was found, and ``-2`` indicates that an error