Backport various tutorial fixups (with permission from the RM).

author Raymond Hettinger <python@rcn.com>

Sun, 7 Dec 2003 11:15:16 +0000 (11:15 +0000)

committer Raymond Hettinger <python@rcn.com>

Sun, 7 Dec 2003 11:15:16 +0000 (11:15 +0000)
diff --git a/Doc/tut/tut.tex b/Doc/tut/tut.tex
index 6d7e7f43fd5a4bb1482f15560d4c940919e08ec6..77822c44bbef4820e48a9d7ee286e8f6bd06199a 100644
--- a/Doc/tut/tut.tex
+++ b/Doc/tut/tut.tex
@@ -2,9 +2,6 @@
  \usepackage[T1]{fontenc}
  
  % Things to do:
-% Add a section on file I/O
-% Write a chapter entitled ``Some Useful Modules''
-%  --re, math+cmath
  % Should really move the Python startup file info to an appendix
  
  \title{Python Tutorial}
@@ -331,19 +328,19 @@ With that declaration, all characters in the source file will be treated as
  possible to directly write Unicode string literals in the selected
  encoding.  The list of possible encodings can be found in the
  \citetitle[../lib/lib.html]{Python Library Reference}, in the section
-on \module{codecs}.
+on \ulink{\module{codecs}}{../lib/module-codecs.html}.
  
-If your editor supports saving files as \code{UTF-8} with an UTF-8
-signature (aka BOM -- Byte Order Mark), you can use that instead of an
+If your editor supports saving files as \code{UTF-8} with a UTF-8
+\emph{byte order mark} (aka BOM), you can use that instead of an
  encoding declaration. IDLE supports this capability if
  \code{Options/General/Default Source Encoding/UTF-8} is set. Notice
  that this signature is not understood in older Python releases (2.2
  and earlier), and also not understood by the operating system for
-\code{\#!} files. 
+\code{\#!} files.
  
  By using UTF-8 (either through the signature or an encoding
  declaration), characters of most languages in the world can be used
-simultaneously in string literals and comments. Using non-ASCII
+simultaneously in string literals and comments. Using non-\ASCII
  characters in identifiers is not supported. To display all these
  characters properly, your editor must recognize that the file is
  UTF-8, and it must use a font that supports all the characters in the
@@ -659,15 +656,14 @@ the first line above could also have been written \samp{word = 'Help'
  expressions:
  
  \begin{verbatim}
->>> import string
  >>> 'str' 'ing'                   #  <-  This is ok
  'string'
->>> string.strip('str') + 'ing'   #  <-  This is ok
+>>> 'str'.strip() + 'ing'   #  <-  This is ok
  'string'
->>> string.strip('str') 'ing'     #  <-  This is invalid
+>>> 'str'.strip() 'ing'     #  <-  This is invalid
    File "<stdin>", line 1, in ?
-    string.strip('str') 'ing'
-                            ^
+    'str'.strip() 'ing'
+                      ^
  SyntaxError: invalid syntax
  \end{verbatim}
  
@@ -809,6 +805,21 @@ The built-in function \function{len()} returns the length of a string:
  \end{verbatim}
  
  
+\begin{seealso}
+  \seetitle[../lib/typesseq.html]{Sequence Types}%
+           {Strings, and the Unicode strings described in the next
+            section, are examples of \emph{sequence types}, and
+            support the common operations supported by such types.}
+  \seetitle[../lib/string-methods.html]{String Methods}%
+           {Both strings and Unicode strings support a large number of
+            methods for basic transformations and searching.}
+  \seetitle[../lib/typesseq-strings.html]{String Formatting Operations}%
+           {The formatting operations invoked when strings and Unicode
+            strings are the left operand of the \code{\%} operator are
+            described in more detail here.}
+\end{seealso}
+
+
  \subsection{Unicode Strings \label{unicodeStrings}}
  \sectionauthor{Marc-Andre Lemburg}{mal@lemburg.com}
  
@@ -881,7 +892,7 @@ the more well known encodings which these codecs can convert are
  \emph{Latin-1}, \emph{ASCII}, \emph{UTF-8}, and \emph{UTF-16}.
  The latter two are variable-length encodings that store each Unicode
  character in one or more bytes. The default encoding is
-normally set to ASCII, which passes through characters in the range
+normally set to \ASCII, which passes through characters in the range
  0 to 127 and rejects any other characters with an error.
  When a Unicode string is printed, written to a file, or converted
  with \function{str()}, conversion takes place using this default encoding.
@@ -1518,7 +1529,7 @@ TypeError: function() got multiple values for keyword argument 'a'
  \end{verbatim}
  
  When a final formal parameter of the form \code{**\var{name}} is
-present, it receives a dictionary containing all keyword arguments
+present, it receives a \ulink{dictionary}{../lib/typesmapping.html} containing all keyword arguments
  whose keyword doesn't correspond to a formal parameter.  This may be
  combined with a formal parameter of the form
  \code{*\var{name}} (described in the next subsection) which receives a
@@ -1838,19 +1849,14 @@ cubes:
  More than one sequence may be passed; the function must then have as
  many arguments as there are sequences and is called with the
  corresponding item from each sequence (or \code{None} if some sequence
-is shorter than another).  If \code{None} is passed for the function,
-a function returning its argument(s) is substituted.
-
-Combining these two special cases, we see that
-\samp{map(None, \var{list1}, \var{list2})} is a convenient way of
-turning a pair of lists into a list of pairs.  For example:
+is shorter than another).  For example:
  
  \begin{verbatim}
  >>> seq = range(8)
->>> def square(x): return x*x
+>>> def add(x, y): return x+y
  ...
->>> map(None, seq, map(square, seq))
-[(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36), (7, 49)]
+>>> map(add, seq, seq)
+[0, 2, 4, 6, 8, 10, 12, 14]
  \end{verbatim}
  
  \samp{reduce(\var{func}, \var{sequence})} returns a single value
@@ -1973,9 +1979,10 @@ another value is assigned to it).  We'll find other uses for
  
  We saw that lists and strings have many common properties, such as
  indexing and slicing operations.  They are two examples of
-\emph{sequence} data types.  Since Python is an evolving language,
-other sequence data types may be added.  There is also another
-standard sequence data type: the \emph{tuple}.
+\ulink{\emph{sequence} data types}{../lib/typesseq.html}.  Since
+Python is an evolving language, other sequence data types may be
+added.  There is also another standard sequence data type: the
+\emph{tuple}.
  
  A tuple consists of a number of values separated by commas, for
  instance:
@@ -2045,7 +2052,8 @@ always creates a tuple, and unpacking works for any sequence.
  
  \section{Dictionaries \label{dictionaries}}
  
-Another useful data type built into Python is the \emph{dictionary}.
+Another useful data type built into Python is the
+\ulink{\emph{dictionary}}{../lib/typesmapping.html}.
  Dictionaries are sometimes found in other languages as ``associative
  memories'' or ``associative arrays''.  Unlike sequences, which are
  indexed by a range of numbers, dictionaries are indexed by \emph{keys},
@@ -2073,11 +2081,11 @@ If you store using a key that is already in use, the old value
  associated with that key is forgotten.  It is an error to extract a
  value using a non-existent key.
  
-The \code{keys()} method of a dictionary object returns a list of all
+The \method{keys()} method of a dictionary object returns a list of all
  the keys used in the dictionary, in random order (if you want it
-sorted, just apply the \code{sort()} method to the list of keys).  To
+sorted, just apply the \method{sort()} method to the list of keys).  To
  check whether a single key is in the dictionary, use the
-\code{has_key()} method of the dictionary.
+\method{has_key()} method of the dictionary.
  
  Here is a small example using a dictionary:
  
@@ -2151,6 +2159,21 @@ What is your quest?  It is the holy grail.
  What is your favorite color?  It is blue.
  \end{verbatim}
  
+To loop over a sequence in reverse, first specify the sequence
+in a forward direction and then call the \function{reversed()}
+function.
+
+\begin{verbatim}
+>>> for i in reversed(xrange(1,10,2)):
+...     print i
+...
+9
+7
+5
+3
+1
+\end{verbatim}
+
  
  \section{More on Conditions \label{conditions}}
  
@@ -2388,7 +2411,7 @@ script being run is on the search path, it is important that the
  script not have the same name as a standard module, or Python will
  attempt to load the script as a module when that module is imported.
  This will generally be an error.  See section~\ref{standardModules},
-``Standard Modules.'' for more information.
+``Standard Modules,'' for more information.
  
  
  \subsection{``Compiled'' Python files}
@@ -2454,9 +2477,10 @@ library of Python code in a form that is moderately hard to reverse
  engineer.
  
  \item
-The module \module{compileall}\refstmodindex{compileall} can create
-\file{.pyc} files (or \file{.pyo} files when \programopt{-O} is used) for
-all modules in a directory.
+The module \ulink{\module{compileall}}{../lib/module-compileall.html}%
+{} \refstmodindex{compileall} can create \file{.pyc} files (or
+\file{.pyo} files when \programopt{-O} is used) for all modules in a
+directory.
  
  \end{itemize}
  
@@ -2473,7 +2497,8 @@ system calls.  The set of such modules is a configuration option which
  also depends on the underlying platform  For example,
  the \module{amoeba} module is only provided on systems that somehow
  support Amoeba primitives.  One particular module deserves some
-attention: \module{sys}\refstmodindex{sys}, which is built into every
+attention: \ulink{\module{sys}}{../lib/module-sys.html}%
+\refstmodindex{sys}, which is built into every 
  Python interpreter.  The variables \code{sys.ps1} and
  \code{sys.ps2} define the strings used as primary and secondary
  prompts:
@@ -2756,14 +2781,15 @@ submodules with the same name from different packages.
  \subsection{Intra-package References}
  
  The submodules often need to refer to each other.  For example, the
-\module{surround} module might use the \module{echo} module.  In fact, such references
-are so common that the \code{import} statement first looks in the
+\module{surround} module might use the \module{echo} module.  In fact,
+such references
+are so common that the \keyword{import} statement first looks in the
  containing package before looking in the standard module search path.
  Thus, the surround module can simply use \code{import echo} or
  \code{from echo import echofilter}.  If the imported module is not
  found in the current package (the package of which the current module
-is a submodule), the \code{import} statement looks for a top-level module
-with the given name.
+is a submodule), the \keyword{import} statement looks for a top-level
+module with the given name.
  
  When packages are structured into subpackages (as with the
  \module{Sound} package in the example), there's no shortcut to refer
@@ -2773,15 +2799,6 @@ must be used.  For example, if the module
  in the \module{Sound.Effects} package, it can use \code{from
  Sound.Effects import echo}.
  
-%(One could design a notation to refer to parent packages, similar to
-%the use of ".." to refer to the parent directory in \UNIX{} and Windows
-%filesystems.  In fact, the \module{ni} module, which was the
-%ancestor of this package system, supported this using \code{__} for
-%the package containing the current module,
-%\code{__.__} for the parent package, and so on.  This feature was dropped
-%because of its awkwardness; since most packages will have a relative
-%shallow substructure, this is no big loss.)
-
  \subsection{Packages in Multiple Directories}
  
  Packages support one more special attribute, \member{__path__}.  This
@@ -2873,11 +2890,10 @@ The value of x is 32.5, and y is 40000...
  Here are two ways to write a table of squares and cubes:
  
  \begin{verbatim}
->>> import string
  >>> for x in range(1, 11):
-...     print string.rjust(repr(x), 2), string.rjust(repr(x*x), 3),
+...     print repr(x).rjust(2), repr(x*x).rjust(3),
  ...     # Note trailing comma on previous line
-...     print string.rjust(repr(x*x*x), 4)
+...     print repr(x*x*x).rjust(4)
  ...
   1   1    1
   2   4    8
@@ -2907,28 +2923,27 @@ Here are two ways to write a table of squares and cubes:
  (Note that one space between each column was added by the way
  \keyword{print} works: it always adds spaces between its arguments.)
  
-This example demonstrates the function \function{string.rjust()},
+This example demonstrates the \method{rjust()} method of string objects,
  which right-justifies a string in a field of a given width by padding
-it with spaces on the left.  There are similar functions
-\function{string.ljust()} and \function{string.center()}.  These
-functions do not write anything, they just return a new string.  If
+it with spaces on the left.  There are similar methods
+\method{ljust()} and \method{center()}.  These
+methods do not write anything, they just return a new string.  If
  the input string is too long, they don't truncate it, but return it
  unchanged; this will mess up your column lay-out but that's usually
  better than the alternative, which would be lying about a value.  (If
  you really want truncation you can always add a slice operation, as in
-\samp{string.ljust(x,~n)[0:n]}.)
+\samp{x.ljust(~n)[:n]}.)
  
-There is another function, \function{string.zfill()}, which pads a
+There is another method, \method{zfill()}, which pads a
  numeric string on the left with zeros.  It understands about plus and
  minus signs:
  
  \begin{verbatim}
->>> import string
->>> string.zfill('12', 5)
+>>> '12'.zfill(5)
  '00012'
->>> string.zfill('-3.14', 7)
+>>> '-3.14'.zfill(7)
  '-003.14'
->>> string.zfill('3.14159265359', 5)
+>>> '3.14159265359'.zfill(5)
  '3.14159265359'
  \end{verbatim}
  
@@ -3111,14 +3126,15 @@ objects.
  Strings can easily be written to and read from a file. Numbers take a
  bit more effort, since the \method{read()} method only returns
  strings, which will have to be passed to a function like
-\function{string.atoi()}, which takes a string like \code{'123'} and
+\function{int()}, which takes a string like \code{'123'} and
  returns its numeric value 123.  However, when you want to save more
  complex data types like lists, dictionaries, or class instances,
  things get a lot more complicated.
  
  Rather than have users be constantly writing and debugging code to
  save complicated data types, Python provides a standard module called
-\module{pickle}.  This is an amazing module that can take almost
+\ulink{\module{pickle}}{../lib/module-pickle.html}.  This is an
+amazing module that can take almost
  any Python object (even some forms of Python code!), and convert it to
  a string representation; this process is called \dfn{pickling}.  
  Reconstructing the object from the string representation is called
@@ -3143,12 +3159,15 @@ x = pickle.load(f)
  
  (There are other variants of this, used when pickling many objects or
  when you don't want to write the pickled data to a file; consult the
-complete documentation for \module{pickle} in the Library Reference.)
-
-\module{pickle} is the standard way to make Python objects which can
-be stored and reused by other programs or by a future invocation of
-the same program; the technical term for this is a
-\dfn{persistent} object.  Because \module{pickle} is so widely used,
+complete documentation for
+\ulink{\module{pickle}}{../lib/module-pickle.html} in the
+\citetitle[../lib/]{Python Library Reference}.)
+
+\ulink{\module{pickle}}{../lib/module-pickle.html} is the standard way
+to make Python objects which can be stored and reused by other
+programs or by a future invocation of the same program; the technical
+term for this is a \dfn{persistent} object.  Because
+\ulink{\module{pickle}}{../lib/module-pickle.html} is so widely used,
  many authors who write Python extensions take care to ensure that new
  data types such as matrices can be properly pickled and unpickled.
  
@@ -3294,12 +3313,12 @@ error message and then re-raise the exception (allowing a caller to
  handle the exception as well):
  
  \begin{verbatim}
-import string, sys
+import sys
  
  try:
      f = open('myfile.txt')
      s = f.readline()
-    i = int(string.strip(s))
+    i = int(s.strip())
  except IOError, (errno, strerror):
      print "I/O error(%s): %s" % (errno, strerror)
  except ValueError:
@@ -4338,7 +4357,7 @@ operating system:
  
  \begin{verbatim}
  >>> import os
->>> os.system('copy /data/mydata.fil /backup/mydata.fil')
+>>> os.system('time 0:02')
  0
  >>> os.getcwd()      # Return the current working directory
  'C:\\Python24'
@@ -4425,9 +4444,7 @@ The most direct way to terminate a script is to use \samp{sys.exit()}.
  
  The \ulink{\module{re}}{../lib/module-re.html}
  module provides regular expression tools for advanced string processing.
-When only simple capabilities are needed, string methods are preferred
-because they are easier to read and debug.  However, for more
-sophisticated applications, regular expressions can provide succinct,
+For complex matching and manipulation, regular expressions offer succinct,
  optimized solutions:
  
  \begin{verbatim}
@@ -4438,6 +4455,13 @@ optimized solutions:
  'cat in the hat'
  \end{verbatim}
  
+When only simple capabilities are needed, string methods are preferred
+because they are easier to read and debug:
+
+\begin{verbatim}
+>>> 'tea for too'.replace('too', 'two')
+'tea for two'
+\end{verbatim}
  
  \section{Mathematics\label{mathematics}}
  
@@ -4676,8 +4700,8 @@ than the main site, depending on your geographical location.  A more
  informal site is \url{http://starship.python.net/}, which contains a
  bunch of Python-related personal home pages; many people have
  downloadable software there. Many more user-created Python modules
-can be found in a third-party repository at
-\url{http://www.vex.net/parnassus}.
+can be found in the \ulink{Python Package
+Index}{http://www.python.org/pypi} (PyPI).
  
  For Python-related questions and problem reports, you can post to the
  newsgroup \newsgroup{comp.lang.python}, or send them to the mailing
@@ -4690,8 +4714,7 @@ up to several hundred),
  % days = 116.9 msgs / day and steadily increasing.
  asking (and answering) questions, suggesting new features, and
  announcing new modules.  Before posting, be sure to check the list of
-Frequently Asked Questions (also called the FAQ), at
-\url{http://www.python.org/doc/FAQ.html}, or look for it in the
+\ulink{Frequently Asked Questions}{http://www.python.org/doc/faq/} (also called the FAQ), or look for it in the
  \file{Misc/} directory of the Python source distribution.  Mailing
  list archives are available at \url{http://www.python.org/pipermail/}.
  The FAQ answers many of the questions that come up again and again,
@@ -4789,7 +4812,8 @@ Tab: complete
  \end{verbatim}
  
  in your \file{\~{}/.inputrc}.  (Of course, this makes it harder to
-type indented continuation lines.)
+type indented continuation lines if you're accustomed to using
+\kbd{Tab} for that purpose.)
  
  Automatic completion of variable and module names is optionally
  available.  To enable it in the interpreter's interactive mode, add
@@ -4818,7 +4842,8 @@ this deletes the names it creates once they are no longer needed; this
  is done since the startup file is executed in the same namespace as
  the interactive commands, and removing the names avoids creating side
  effects in the interactive environments.  You may find it convenient
-to keep some of the imported modules, such as \module{os}, which turn
+to keep some of the imported modules, such as
+\ulink{\module{os}}{../lib/module-os.html}, which turn
  out to be needed in most sessions with the interpreter.
  
  \begin{verbatim}
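
Reviewer note (not part of the commit): several hunks above replace calls to `string` module functions with the equivalent string methods. The sketch below collects those method forms in one place. It is illustrative only and makes two assumptions: it uses Python 3 syntax rather than the Python 2 `print` statement shown in the tutorial, and the helper name `table_row` is hypothetical and does not appear in the tutorial.

```python
# Illustrative sketch only. Python 3 syntax is an assumption made here;
# the tutorial text in the diff targets Python 2.x.

def table_row(x):
    """Hypothetical helper: one row of the squares-and-cubes table,
    built with str.rjust() instead of string.rjust()."""
    return ' '.join([repr(x).rjust(2),
                     repr(x * x).rjust(3),
                     repr(x * x * x).rjust(4)])

joined = 'str'.strip() + 'ing'        # was: string.strip('str') + 'ing'
padded = '-3.14'.zfill(7)             # was: string.zfill('-3.14', 7)
truncated = 'abcdefgh'.ljust(5)[:5]   # ljust() never truncates; the slice does
number = int(' 123\n'.strip())        # was: int(string.strip(s))
replaced = 'tea for too'.replace('too', 'two')

print(joined)
print(padded)
print(truncated)
print(number)
print(replaced)
print(table_row(10))
```

The method forms need no `import string`, which is why the diff also drops those import lines from the examples.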