gh-116666: Add "token" glossary term (GH-130888)
Add glossary entry for `token` and link to it.
Avoid talking about tokens in the SyntaxError intro (errors.rst); at this point
tokenization is too much of a technical detail. (Even to an advanced reader,
the fact that a *single* token is highlighted isn't too relevant. Also, we don't
need to guarantee that it's a single token.)
(cherry picked from commit 30d52058493e07fd1d3efea960482f4001bd2f86)
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
thread removes *key* from *mapping* after the test, but before the lookup.
This issue can be solved with locks or by using the EAFP approach.
+ lexical analyzer
+
+ Formal name for the *tokenizer*; see :term:`token`.
+
list
A built-in Python :term:`sequence`. Despite its name it is more akin
to an array in other languages than to a linked list since access to
See also :term:`binary file` for a file object able to read and write
:term:`bytes-like objects <bytes-like object>`.
+ token
+
+ A small unit of source code, generated by the
+ :ref:`lexical analyzer <lexical>` (also called the *tokenizer*).
+ Names, numbers, strings, operators,
+ newlines and similar are represented by tokens.
+
+ The :mod:`tokenize` module exposes Python's lexical analyzer.
+ The :mod:`token` module contains information on the various types
+ of tokens.
+
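The new glossary entry points at the :mod:`tokenize` and :mod:`token` modules; a minimal sketch of what "breaking source into tokens" means in practice (the exact token stream may vary slightly between Python versions):

```python
import io
import token
import tokenize

# Tokenize one line of source and print each token's type and text.
source = "answer = 42\n"
for tok in tokenize.generate_tokens(io.StringIO(source).readline):
    print(token.tok_name[tok.type], repr(tok.string))
```

Names, numbers, and operators each come out as their own token (``NAME``, ``NUMBER``, ``OP``), matching the categories the glossary entry lists.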
triple-quoted string
A string which is bound by three instances of either a quotation mark
(") or an apostrophe ('). While they don't provide any functionality
.. index:: lexical analysis, parser, token
A Python program is read by a *parser*. Input to the parser is a stream of
-*tokens*, generated by the *lexical analyzer*. This chapter describes how the
-lexical analyzer breaks a file into tokens.
+:term:`tokens <token>`, generated by the *lexical analyzer* (also known as
+the *tokenizer*).
+This chapter describes how the lexical analyzer breaks a file into tokens.
Python reads program text as Unicode code points; the encoding of a source file
can be given by an encoding declaration and defaults to UTF-8, see :pep:`3120`
SyntaxError: invalid syntax
The parser repeats the offending line and displays little arrows pointing
-at the token in the line where the error was detected. The error may be
-caused by the absence of a token *before* the indicated token. In the
-example, the error is detected at the function :func:`print`, since a colon
-(``':'``) is missing before it. File name and line number are printed so you
-know where to look in case the input came from a script.
+at the place where the error was detected. Note that this is not always the
+place that needs to be fixed. In the example, the error is detected at the
+function :func:`print`, since a colon (``':'``) is missing just before it.
+
+The file name (``<stdin>`` in our example) and line number are printed so you
+know where to look in case the input came from a file.
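The behaviour described above can be reproduced programmatically; this sketch uses :func:`compile` to trigger the tutorial's missing-colon error and inspect where it was detected (the filename ``<stdin>`` here is just a label passed to ``compile``):

```python
# The missing colon before print() raises a SyntaxError; the exception
# object records the file name, line, and offset where it was detected.
try:
    compile("while True print('Hello world')", "<stdin>", "exec")
except SyntaxError as exc:
    print(exc.filename, exc.lineno, exc.offset)
```

As the rewritten paragraph notes, the reported position is where the error was *detected*, which is not always the place that needs to be fixed.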
.. _tut-exceptions:
This facility is an enormous step forward compared to earlier versions of the
interpreter; however, some wishes are left: It would be nice if the proper
-indentation were suggested on continuation lines (the parser knows if an indent
-token is required next). The completion mechanism might use the interpreter's
-symbol table. A command to check (or even suggest) matching parentheses,
-quotes, etc., would also be useful.
+indentation were suggested on continuation lines (the parser knows if an
+:data:`~token.INDENT` token is required next). The completion mechanism might
+use the interpreter's symbol table. A command to check (or even suggest)
+matching parentheses, quotes, etc., would also be useful.
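For readers unfamiliar with :data:`~token.INDENT`, a small sketch showing that the tokenizer does emit such a token for an indented suite:

```python
import io
import token
import tokenize

# An indented block produces an INDENT token in the token stream.
src = "if x:\n    y = 1\n"
types = [t.type for t in tokenize.generate_tokens(io.StringIO(src).readline)]
print(token.INDENT in types)
```

This is the token the paragraph refers to: after the colon and newline, the parser knows an indent is required next.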
One alternative enhanced interactive interpreter that has been around for quite
some time is IPython_, which features tab completion, object exploration and