If *locale* is given and not ``None``, :func:`setlocale` modifies the locale
setting for the *category*. The available categories are listed in the data
- description below. *locale* may be a string, or an iterable of two strings
- (language code and encoding). If it's an iterable, it's converted to a locale
- name using the locale aliasing engine. An empty string specifies the user's
+ description below. *locale* may be a :ref:`string <locale_name>` or a pair
+ of language code and encoding. An empty string specifies the user's
default settings. If the modification of the locale fails, the exception
:exc:`Error` is raised. If successful, the new locale setting is returned.
+ If *locale* is a pair, it is converted to a locale name using
+ the locale aliasing engine.
+ The language code has the same format as a :ref:`locale name <locale_name>`,
+ but without encoding and ``@``-modifier.
+ The language code and encoding can be ``None``.
+
If *locale* is omitted or ``None``, the current setting for *category* is
returned.
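
As a rough illustration of the query, set, and failure paths described above, the following sketch may help. The helper name ``try_setlocale`` and the locale name ``"xx_NOT_A_LOCALE"`` are made up for this example. Some C libraries accept arbitrary names, so the rejection is not guaranteed on every platform.

```python
import locale


def try_setlocale(category, name):
    """Hypothetical helper: return the new setting, or None on locale.Error."""
    try:
        return locale.setlocale(category, name)
    except locale.Error:
        return None


# Calling setlocale() without a locale argument only queries the setting.
saved = locale.setlocale(locale.LC_CTYPE)
try:
    # On success, the new locale setting is returned as a string.
    new_setting = try_setlocale(locale.LC_CTYPE, "C")
    current = locale.setlocale(locale.LC_CTYPE)
    # A name the platform does not support normally raises locale.Error,
    # which the helper turns into None (assumption: this name is unsupported).
    rejected = try_setlocale(locale.LC_CTYPE, "xx_NOT_A_LOCALE")
finally:
    # Restore whatever was in effect before the experiment.
    locale.setlocale(locale.LC_CTYPE, saved)
```

Saving and restoring the setting keeps the example from leaking a locale change into the rest of the program.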
``'LANG'``. The GNU gettext search path contains ``'LC_ALL'``,
``'LC_CTYPE'``, ``'LANG'`` and ``'LANGUAGE'``, in that order.
- Except for the code ``'C'``, the language code corresponds to :rfc:`1766`.
- *language code* and *encoding* may be ``None`` if their values cannot be
+ The language code has the same format as a :ref:`locale name <locale_name>`,
+ but without encoding and ``@``-modifier.
+ The language code and encoding may be ``None`` if their values cannot be
determined.
+ The "C" locale is represented as ``(None, None)``.
.. deprecated-removed:: 3.11 3.15
.. function:: getlocale(category=LC_CTYPE)
- Returns the current setting for the given locale category as sequence containing
- *language code*, *encoding*. *category* may be one of the :const:`!LC_\*` values
- except :const:`LC_ALL`. It defaults to :const:`LC_CTYPE`.
+ Returns the current setting for the given locale category as a tuple containing
+ the language code and encoding. *category* may be one of the :const:`!LC_\*`
+ values except :const:`LC_ALL`. It defaults to :const:`LC_CTYPE`.
- Except for the code ``'C'``, the language code corresponds to :rfc:`1766`.
- *language code* and *encoding* may be ``None`` if their values cannot be
+ The language code has the same format as a :ref:`locale name <locale_name>`,
+ but without encoding and ``@``-modifier.
+ The language code and encoding may be ``None`` if their values cannot be
determined.
+ The "C" locale is represented as ``(None, None)``.
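
A minimal sketch of the ``(None, None)`` case, assuming the ``"C"`` locale can be selected for :const:`LC_CTYPE` (it is supported on all platforms). The variable names are illustrative only.

```python
import locale

# Remember the current setting so it can be restored afterwards.
saved = locale.setlocale(locale.LC_CTYPE)
try:
    locale.setlocale(locale.LC_CTYPE, "C")
    # In the "C" locale, neither a language code nor an encoding is reported.
    c_setting = locale.getlocale(locale.LC_CTYPE)
finally:
    locale.setlocale(locale.LC_CTYPE, saved)

language, encoding = c_setting  # both are None here
```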
.. function:: getpreferredencoding(do_setlocale=True)
part of a character class such as letter or whitespace.
+.. _locale_name:
+
+Locale names
+------------
+
+The format of the locale name is platform dependent, and the set of supported
+locales can depend on the system configuration.
+
+On POSIX platforms, it usually has the format [1]_:
+
+.. productionlist:: locale_name
+ : language ["_" territory] ["." charset] ["@" modifier]
+
+where *language* is a two- or three-letter language code from `ISO 639`_,
+*territory* is a two-letter country or region code from `ISO 3166`_,
+*charset* is a locale encoding, and *modifier* is a script name,
+a language subtag, a sort order identifier, or other locale modifier
+(for example, "latin", "valencia", "stroke" and "euro").
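
To make the POSIX grammar above concrete, here is a small sketch that splits a name into its four components. ``split_posix_locale_name`` is a hypothetical helper, not part of the :mod:`locale` module. It only follows the production shown above, so it does not check that the platform supports the locale and does not treat special names such as ``"C"``.

```python
import re

# Hypothetical pattern mirroring:
#   language ["_" territory] ["." charset] ["@" modifier]
_POSIX_LOCALE_NAME = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"       # two- or three-letter language code
    r"(?:_(?P<territory>[A-Za-z]{2}))?"  # optional two-letter territory
    r"(?:\.(?P<charset>[^@]+))?"         # optional charset, up to any "@"
    r"(?:@(?P<modifier>.+))?"            # optional modifier
)


def split_posix_locale_name(name):
    """Return a dict of the name's components; absent parts are None."""
    match = _POSIX_LOCALE_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not in the usual POSIX format: {name!r}")
    return match.groupdict()


parts = split_posix_locale_name("ca_ES.UTF-8@valencia")
```

For the name used here, ``parts["language"]`` is ``"ca"`` and ``parts["modifier"]`` is ``"valencia"``. A name without optional parts, such as ``"de"``, yields ``None`` for the territory, charset and modifier.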
+
+On Windows, several formats are supported [2]_ [3]_.
+The first is a subset of `IETF BCP 47`_ tags:
+
+.. productionlist:: locale_name
+ : language ["-" script] ["-" territory] ["." charset]
+ : language ["-" script] "-" territory "-" modifier
+
+where *language* and *territory* have the same meaning as on POSIX,
+*script* is a four-letter script code from `ISO 15924`_,
+and *modifier* is a language subtag, a sort order identifier
+or custom modifier (for example, "valencia", "stroke" or "x-python").
+Both hyphen (``'-'``) and underscore (``'_'``) separators are supported.
+Only UTF-8 encoding is allowed for BCP 47 tags.
+
+Windows also supports locale names in the format:
+
+.. productionlist:: locale_name
+ : language ["_" territory] ["." charset]
+
+where *language* and *territory* are full names, such as "English" and
+"United States", and *charset* is either a code page number (for example, "1252")
+or UTF-8.
+Only the underscore separator is supported in this format.
+
+The "C" locale is supported on all platforms.
+
+.. _ISO 639: https://www.iso.org/iso-639-language-code
+.. _ISO 3166: https://www.iso.org/iso-3166-country-codes.html
+.. _IETF BCP 47: https://www.rfc-editor.org/info/bcp47
+.. _ISO 15924: https://www.unicode.org/iso15924/
+
+.. [1] `IEEE Std 1003.1-2024; 8.2 Internationalization Variables <https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap08.html#tag_08_02>`_
+.. [2] `UCRT Locale names, Languages, and Country/Region strings <https://learn.microsoft.com/en-us/cpp/c-runtime-library/locale-names-languages-and-country-region-strings>`_
+.. [3] `Locale Names <https://learn.microsoft.com/en-us/windows/win32/intl/locale-names>`_
+
+
.. _embedding-locale:
For extension writers and programs that embed Python